# Estimation - Science topic

Explore the latest questions and answers in Estimation, and find Estimation experts.
Questions related to Estimation
• asked a question related to Estimation
Question
If it can be made linear, then linear regression could be done.
Should I use nonlinear estimation instead?
One more time: you don't need to linearize anything; it's the regression coefficients that matter. Please read the Kutner book; it's a free download from the z-library. David Booth
• asked a question related to Estimation
Question
Say one has some devices to measure the temperature of a room. The devices don't provide me with an accurate measurement. Some overshoot the actual value of the reading, others underestimate it. Using this set of inaccurate readings, is it possible for me to obtain a reading having high accuracy?
If you have used the instruments before and can make assumptions about their performance (even assumptions as far-fetched as the errors being unbiased and random), then you can take lots of samples. If you believe the errors are random and unbiased, then, as Christian Geiser says, you can assume that the estimates improve in accuracy as n increases. You need to provide more information about what you can assume in order for people to properly address your question.
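A minimal sketch of the averaging idea, assuming (as the answer says one must be willing to) that each reading's error is independent and zero-mean. The true temperature, the noise spread and the function name are hypothetical and only serve the illustration.

```python
import random
import statistics

def average_reading(readings):
    """Return the mean of the readings and the standard error of that mean."""
    mean = statistics.fmean(readings)
    # With independent, unbiased errors the standard error shrinks like 1/sqrt(n).
    sem = statistics.stdev(readings) / len(readings) ** 0.5
    return mean, sem

rng = random.Random(0)   # fixed seed so the sketch is repeatable
TRUE_TEMP = 21.0         # hypothetical true room temperature (deg C)
NOISE_SD = 1.5           # hypothetical spread of the device errors (deg C)

for n in (5, 50, 500):
    readings = [rng.gauss(TRUE_TEMP, NOISE_SD) for _ in range(n)]
    mean, sem = average_reading(readings)
    print(f"n={n:4d}  mean={mean:.2f}  standard error={sem:.2f}")
```

If some devices are systematically biased, averaging alone will not remove that bias; it would have to be calibrated out first.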
• asked a question related to Estimation
Question
Estimation of the number of acceptor molecules surrounding a given donor in a Förster resonance energy transfer (FRET) system
Everything Saravanan said is correct of course. But I don't think it answers the question. I admit that I do not see a simple way to estimate the number of acceptors within FRET distance of a donor.
• asked a question related to Estimation
Question
Which commands are used for dynamic panel logit/probit model estimation in Stata?
xtprobit — Random-effects and population-averaged probit models
Description
xtprobit fits random-effects and population-averaged probit models for a binary dependent variable. The probability of a positive outcome is assumed to be determined by the standard normal cumulative distribution function.
Quick start
Random-effects probit model of y as a function of x1, x2, and indicators for levels of categorical variable a using xtset data
xtprobit y x1 x2 i.a
Population-averaged model with robust standard errors
xtprobit y x1 x2 i.a, pa vce(robust)
As above, but specify an autoregressive correlation structure of order 1
xtprobit y x1 x2 i.a, pa vce(robust) corr(ar 1)
Random-effects model with cluster–robust standard errors for panels nested within cvar
xtprobit y x1 x2 i.a, vce(cluster cvar)
Statistics > Longitudinal/panel data > Binary outcomes > Probit regression (RE, PA)
• asked a question related to Estimation
Question
How to Estimate the Total flavonoid content from endophytic fungal extract?
Not more than 40 °C
• asked a question related to Estimation
Question
What is the best dynamic panel model when T>N ?
For a long-T panel, one option is to use panel data techniques that account for possible cross-section dependence. See for instance: https://www.sciencedirect.com/science/article/abs/pii/S0304407620301020.
Hope this helps. Thank you.
• asked a question related to Estimation
Question
Hi,
Does anybody know how to extract the slopes' effect size of classes in Latent Class Growth Analysis using Mplus?
Thanks
Could you elaborate a little bit on what you mean by effect size in this context? Do you mean the estimate of the slope factor mean in a particular class? The mean should be part of the parameter estimate output. From that, you could compute a standardized effect size measure by hand (using the estimates of the slope factor mean and variance).
• asked a question related to Estimation
Question
Greetings!
I am conducting a series of CFAs in R using the 'lavaan' package. I am interested in estimating the correlations between the factors while taking my measurement model into account, instead of going back to the raw data and summing the items representing each factor. In the lavaan output, I can find only the covariances. I can turn the covariances into correlations by dividing them by the product of the standard deviations, but due to the number of CFAs and the number of factors, I am wondering if there is a more streamlined way to do that in the lavaan syntax (or in another SEM package in R).
Any help would be greatly appreciated!
(* I checked the lavaan commands. I did not find anything, but I am fairly new to the package and I might have missed it. Furthermore, the only option in 'lavaan.Plot' is to plot covariances.)
(** Thought: I used the variance standardization method in one example and the marker method in another. With the variance standardization method, the "estimate" column and the "std.all" column of my covariance table were identical. With the marker method they were not. I'm thinking that the standardized covariances in lavaan should be the correlations and that they are identifiable using the marker method only. Or not?)
Christian Geiser Yes, they do give the same std.all solution. I forgot that the "estimate" column in the variance standardization method is supposed to be the same as the std.all column that is shared by both methods, since the variance is standardized, and I got mixed up with that. Thank you for the fast reply!
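For reference, the "divide by the product of the standard deviations" step mentioned in the question can be sketched as follows. This is plain Python rather than lavaan, and the covariance matrix is made up for illustration; it only shows the arithmetic behind a standardized covariance.

```python
import math

def cov_to_cor(cov):
    """Convert a covariance matrix (list of lists) into a correlation matrix."""
    k = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(k)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]

# Hypothetical latent covariance matrix for three factors (made-up numbers).
factor_cov = [
    [4.0, 1.2, 0.8],
    [1.2, 1.0, 0.3],
    [0.8, 0.3, 0.25],
]

for row in cov_to_cor(factor_cov):
    print([round(r, 3) for r in row])
```

When the factor variances are fixed to 1, the standard deviations are all 1 and the covariances already are correlations, which is why the two columns coincide in that parameterization.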
• asked a question related to Estimation
Question
I need help regarding the estimation of soil moisture regimes using remote sensing. Does the EIPC model fit best, or is there something else that can be considered?
Please refer to the following paper that presents a comprehensive review of the progress in remote sensing as well as field methods for soil moisture studies.
Regards
• asked a question related to Estimation
Question
I am working on a research topic that employs estimation techniques. I am trying to apply an algorithm in my work to estimate system poles. I wrote an m-file and tried to apply this technique to a simple transfer function to estimate its roots. Any suggestions about estimation techniques?
Ahmed Abdulsalam Thank you, Ahmed
• asked a question related to Estimation
Question
Hi, I am looking for a way to quantify available potassium, but the suggested protocol requires a flame photometer, which I don't have.
Please suggest a good method for estimating the available potassium of my soil samples without using a flame photometer. Please also suggest a good method for estimating calcium and magnesium.
Thank You
Best wishes,
Sabri
• asked a question related to Estimation
Question
Dear Researcher,
I have the following MATLAB code that generates an FMCW signal. However, I have two basic problems with the code, and I would appreciate it if you could help/guide me to resolve them:
1. As I understand it, this code generates an FMCW signal for only one target, since the dimension of sig is 1 x N, whereas it must be L x N (L is the number of targets).
2. The dechirped signal at the receiver, which is analog, has to be converted to digital in my algorithm.
Note: I want to apply this type of signal (FMCW) to a direction of arrival (DOA) estimation algorithm.
Again, I highly appreciate your time and consideration in helping me overcome these uncertainties.
%%CODES for generating FMCW signal%%%%%%%%
% Compute hardware parameters from specified long-range requirements
fc = 77e9; % Center frequency (Hz)
c = physconst('LightSpeed'); % Speed of light in air (m/s)
lambda = freq2wavelen(fc,c); % Wavelength (m)
% Set the chirp duration to be 5 times the max range requirement
rangeMax = 100; % Maximum range (m)
% In general, for an FMCW radar system, the "sweep time" should be at least five to six times the round trip time
tm = 5*range2time(rangeMax,c); % Chirp duration (s)=Symbol duration (Tsym)
% Determine the waveform bandwidth from the required range resolution
rangeRes = 1; % Desired range resolution (m)
bw = rangeres2bw(rangeRes,c); % Corresponding bandwidth (Hz)
% Set the sampling rate to satisfy both the range and velocity requirements for the radar
sweepSlope = bw/tm; % FMCW sweep slope (Hz/s)
fbeatMax = range2beat(rangeMax,sweepSlope,c); % Maximum beat frequency (Hz)
vMax = 230*1000/3600; % Maximum Velocity of cars (m/s)
fdopMax = speed2dop(2*vMax,lambda); % Maximum Doppler shift (Hz)
fifMax = fbeatMax+fdopMax; % Maximum received IF (Hz)
fs = max(2*fifMax,bw); % Sampling rate (Hz)
% Configure the FMCW waveform using the waveform parameters derived from the long-range requirements
waveform = phased.FMCWWaveform('SweepTime',tm,'SweepBandwidth',bw,...
'SampleRate',fs,'NumSweeps',2,'SweepDirection','Up');
% if strcmp(waveform.SweepDirection,'Down')
% sweepSlope = -sweepSlope;
% end
N=tm*fs; % Number of fast-time samples
Nsweep = 192; % Number of slow-time samples
sigTx = waveform();
for i=1:K
sigRx=A*sigTgt';
sigRx=awgn(sigRx,SNR); % awgn already returns the input signal with noise added
% Dechirp and convert to digital
% DesigRx=dechirp(sigRx,sigREF);
DechirpedSignal= sigTgt .* conj(sigRx);
end
My suggestion would be to first understand your code before asking questions. In my opinion, you did not ask questions; you gave two observations. Regarding your observation 2, what do you mean by analog? In a computer/MATLAB everything is digital. You can model analog-to-digital conversion in MATLAB, but the signals will still be digital. Try to understand the code and then pose questions.
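To make the "1 x N versus L x N" point concrete, here is a minimal Python/NumPy sketch (not the poster's MATLAB code) of already-sampled dechirped signals for L stationary point targets. It reuses the range requirements from the post; Doppler and antenna arrays are ignored, and the function names, target ranges and SNR are hypothetical.

```python
import numpy as np

C = 299_792_458.0                  # speed of light (m/s)
range_max, range_res = 100.0, 1.0  # same requirements as in the MATLAB post
tm = 5 * (2 * range_max / C)       # sweep time: 5x the maximum round-trip time
bw = C / (2 * range_res)           # bandwidth for the desired range resolution
slope = bw / tm                    # sweep slope (Hz/s)
fs = bw                            # sampling rate (assumption: fs equals the bandwidth)
n_fast = int(round(tm * fs))       # fast-time samples per sweep

def dechirped_targets(ranges_m, snr_db=20.0, seed=0):
    """Return (per_target, received): an L x N matrix and its noisy sum."""
    t = np.arange(n_fast) / fs
    # One beat frequency per target: f_b = 2 * R * slope / c.
    f_beat = 2.0 * np.asarray(ranges_m, dtype=float) * slope / C
    per_target = np.exp(2j * np.pi * f_beat[:, None] * t[None, :])  # shape (L, N)
    rng = np.random.default_rng(seed)
    sigma = 10 ** (-snr_db / 20) / np.sqrt(2)
    noise = sigma * (rng.standard_normal(n_fast) + 1j * rng.standard_normal(n_fast))
    # A single receiver sees the sum over targets plus noise.
    return per_target, per_target.sum(axis=0) + noise

def estimate_ranges(received, n_targets):
    """Pick the strongest FFT bins and map their beat frequencies back to range."""
    spectrum = np.abs(np.fft.fft(received))[: n_fast // 2]
    bins = np.sort(np.argsort(spectrum)[-n_targets:])
    f_beat = bins * fs / n_fast
    return C * f_beat / (2 * slope)

per_target, rx = dechirped_targets([20.0, 55.0])
print(per_target.shape, estimate_ranges(rx, 2))
```

The peak picking here is deliberately naive: it works because the hypothetical targets sit exactly on FFT bins, and a real DOA or range processor would need windowing and proper peak detection.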
• asked a question related to Estimation
Question
(I also posted this on SO https://stackoverflow.com/q/71531275/16505198 and SE https://stats.stackexchange.com/q/568112/340994 but haven't received any answer so far. So here's another chance :-) However, the code snippets might be more readable there...)
Hello,
I am estimating an ordinal logistic regression under the assumption of proportional odds with the ordinal::clm() function. As a RE see this model from the "housing" dataset (MASS::housing):

clm(Sat~Type*Cont + Freq, data = housing, link = "probit") %>% S
formula: Sat ~ Type * Cont + Freq
data: housing
Coefficients:
                       Estimate Std. Error z value Pr(>|z|)
TypeApartment          -0.14387    0.54335  -0.265    0.791
TypeAtrium              0.20043    0.55593   0.361    0.718
TypeTerrace             0.18246    0.55120   0.331    0.741
ContHigh                0.05598    0.53598   0.104    0.917
Freq                    0.01360    0.01116   1.219    0.223
TypeApartment:ContHigh -0.25287    0.78178  -0.323    0.746
TypeAtrium:ContHigh    -0.17201    0.76610  -0.225    0.822
TypeTerrace:ContHigh   -0.18917    0.76667  -0.247    0.805
Threshold coefficients:
            Estimate Std. Error z value
Low|Medium   -0.1130     0.4645  -0.243
Medium|High   0.7590     0.4693   1.617

To test whether the main effect and the interaction term are (simultaneously!) significant, I used the glht function, which tests the hypothesis (bold for matrices or vectors) $\boldsymbol{K} \boldsymbol{\beta} = \boldsymbol{m}$.
So if I'd like to test whether living in an apartment (main effect) **plus** the interaction of living in an apartment and having high contact is significantly different from zero, it would be $(0; 0; 1; 0; 0; 0; 0; 1; 0; 0) \cdot \boldsymbol{\beta} = 0$ (treating the two thresholds as intercepts and thus as the first two estimates).
Is it right to test:

glht(mod, linfct = c("TypeApartment +TypeApartment:ContHigh =0")) %>% summary()
Simultaneous Tests for General Linear Hypotheses
Fit: clm(formula = Sat ~ Type * Cont + Freq, data = housing, link = "probit")
Linear Hypotheses:
                                            Estimate Std. Error z value Pr(>|z|)
TypeApartment + TypeApartment:ContHigh == 0  -0.3967     0.6270  -0.633    0.527
(Adjusted p values reported -- single-step method)

or do I have to use:

glht(mod, linfct = c("TypeApartment= 0", "TypeApartment:ContHigh =0")) %>% summary()
Simultaneous Tests for General Linear Hypotheses
Fit: clm(formula = Sat ~ Type * Cont + Freq, data = housing, link = "probit")
Linear Hypotheses:
                            Estimate Std. Error z value Pr(>|z|)
TypeApartment == 0           -0.1439     0.5434  -0.265    0.946
TypeApartment:ContHigh == 0  -0.2529     0.7818  -0.323    0.921
(Adjusted p values reported -- single-step method)

Thanks a lot in advance. I hope I posed the question correctly and understandably :-) If you have other options for testing whether a main effect and an interaction term are significant, go ahead and tell me (and the others).
Thanks, Luise
Luise Novikov Apologies, my first answer was actually incorrect - I have removed it to avoid misleading anyone.
If you pass both arguments separately, the function will return the results for partial tests i.e. multiplicity-adjusted p-values for each hypothesis under the assumption that the coefficients being tested are simultaneously zero. What you are looking for is a global test for the second option. The first option is incorrect because the sum of the coefficients can be zero without either coefficient being zero. Thus, hypothesis tests assuming only that the sum is zero will have an inappropriate additional degree of freedom and the p-values will be incorrect as a result.
What you are looking for is a global test under the assumption that both coefficients are zero, which you can obtain using car::linearHypothesis. For your data:
mod <- ordinal::clm(Sat~Type*Cont + Freq, data = MASS::housing, link = "probit")
car::linearHypothesis(mod, c("TypeApartment = 0", "TypeApartment:ContHigh = 0"))
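The difference between the one-degree-of-freedom test of the sum and the two-degree-of-freedom joint test can be sketched by hand. The Python below is only an illustration of the arithmetic, not output from glht or car::linearHypothesis. It uses the estimates and standard errors posted in the question and backs out the covariance of the two coefficients from the posted standard error of their sum, so the joint result is only as accurate as those rounded numbers.

```python
import math

# Estimates and standard errors copied from the posted clm and glht output.
b = (-0.14387, -0.25287)   # TypeApartment, TypeApartment:ContHigh
se = (0.54335, 0.78178)
se_sum = 0.6270            # standard error of the sum, from the first glht call

# Back out the covariance from Var(b1 + b2) = Var(b1) + Var(b2) + 2 Cov(b1, b2).
v1, v2 = se[0] ** 2, se[1] ** 2
cov12 = (se_sum ** 2 - v1 - v2) / 2

def sum_test(b, v1, v2, cov12):
    """One-df z test of H0: b1 + b2 = 0."""
    z = (b[0] + b[1]) / math.sqrt(v1 + v2 + 2 * cov12)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return z, p

def joint_wald_test(b, v1, v2, cov12):
    """Two-df Wald test of H0: b1 = 0 and b2 = 0."""
    det = v1 * v2 - cov12 ** 2
    # b' V^{-1} b written out for a 2 x 2 covariance matrix V.
    w = (b[0] ** 2 * v2 - 2 * b[0] * b[1] * cov12 + b[1] ** 2 * v1) / det
    p = math.exp(-w / 2)                   # chi-square survival function for 2 df
    return w, p

print("sum test (z, p):  ", sum_test(b, v1, v2, cov12))
print("joint test (W, p):", joint_wald_test(b, v1, v2, cov12))
```

The sum test reproduces the z value from the first glht call, while the joint test asks the different question of whether both coefficients are zero at once.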
• asked a question related to Estimation
Question
I want to know which weight is better to estimate carbon. Dry weight or wet weight?
And what is the conversion formula?
Regards,
Shafagat
• asked a question related to Estimation
Question
(E1U) (E2G) (E2G) (B2U) (A1G) (E1U) (E1U) (E2G)
(E2G) (B2U)
Requested convergence on RMS density matrix=1.00D-08 within 128 cycles.
Requested convergence on MAX density matrix=1.00D-06.
Requested convergence on energy=1.00D-06.
No special actions if energy rises.
SCF Done: E(RB3LYP) = -13319.3349271 A.U. after 1 cycles
Convg = 0.2232D-08 -V/T = 2.0097
Range of M.O.s used for correlation: 1 5424
NBasis= 5424 NAE= 1116 NBE= 1116 NFC= 0 NFV= 0
NROrb= 5424 NOA= 1116 NOB= 1116 NVA= 4308 NVB= 4308
PrsmSu: requested number of processors reduced to: 4 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.I
.
.
.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
Symmetrizing basis deriv contribution to polar:
IMax=3 JMax=2 DiffMx= 0.00D+00
G2DrvN: will do 1 centers at a time, making 529 passes doing MaxLOS=2.
Estimated number of processors is: 3
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
.
.
.
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
Erroneous write. Write 898609344 instead of 2097152000.
fd = 4
orig len = 3177921600 left = 3177921600
g_write
Hi, can you please tell me how you solved this problem?
I am facing the same problem now, so it would be helpful if you could share your input regarding this error. Mostafa Yousefzadeh Borzehandani
• asked a question related to Estimation
Question
I asked the same respondents about variables X and Y for two different product types. That is, respondents first answered the items for Product1 and then for Product2. In AMOS, I have two different models (Product1 and Product2). We proposed that the effect of X on Y is stronger for Product1. We have two standardized estimates, and the first beta is greater than the second one. Is this finding sufficient to confirm the hypothesis? Or do I still need a chi-square difference test or something like it to establish statistical significance? If not, what should I do to test the statistical difference between these two betas?
Check the p-value.
• asked a question related to Estimation
Question
A March 10, 2020 article in Annals of Internal Medicine, The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application, estimates a median incubation period of 5.1 days "(95% CI, 4.5 to 5.8 days), and 97.5% of those who develop symptoms will do so within 11.5 days (CI, 8.2 to 15.6 days) of infection."
Are there any updated estimates or more recent reports?
Have a look at this:
Thanks!
• asked a question related to Estimation
Question
I am using panel data and want to estimate the impact of regulation on firms' innovation through DID and PSM-DID approaches. I am able to calculate DID but not PSM-DID for panel data. Could anyone share their experience?
• asked a question related to Estimation
Question
Good evening everyone,
I am using time-series data in an ARDL model with 1 dependent, 1 independent, 1 control and 2 dummy variables with interaction, and I am running the analysis in EViews 10. However, I don't understand where to place the control variable: in the list of dynamic regressors or in the list of fixed regressors of the ARDL estimation equation. Please guide me with your knowledge and experience. I will be very thankful to you.
Can we talk? I would like to discuss my paper. I have a few queries regarding it. (@Mohamed-Mourad Lafifi)
• asked a question related to Estimation
Question
I want to carry out phytochemical and morphological studies on Iranian willow (Salix) species, but I need the trees to be the same age. Is there a way to measure the age of willow trees without cutting them down?
With willow, you may not be able to hit the center of the tree every time with an increment borer (as mentioned by Dr. Mohl). Since Salix sprouts back from the roots if cut (coppice), an increment core may sometimes contain two or more stem centers that grew together over time. As I remember, if you are especially worried about the health of the tree, return the increment core to the tree, or apply the spray that horticulturalists and tree surgeons use on cut surfaces. The annual rings can be counted visually, or with a hand lens if the rings are close together. Typically, taking increment cores should not damage or kill the tree. If you don’t have any sealant, candle wax would probably do to cover the hole and discourage insects, excess moisture or disease entry.
• asked a question related to Estimation
Question
To generate a PCA from lcWGS SNPs, one may use ANGSD to generate genotype likelihoods and then use these as input to generate a covariance matrix using PCAngsd.
The covariance matrix generated by PCAngsd is a n x n matrix where n is the number of samples and p is the number of SNPs (variables). According to the [PCAngsd tutorial](http://www.popgen.dk/software/index.php/PCAngsdTutorial#Estimating_Individual_Allele_Frequencies), the principal components (i.e. the samples plotted in the space defined by the eigenvectors) can be generated directly from this covariance matrix by eigendecomposition.
This is in contrast to the 'usual' way that PCA is done (via a covariance matrix), where a p x p (not n x n) covariance matrix C is generated from a centered n x p data matrix X. Eigendecomposition of C, then generates the eigenvectors and eigenvalues. The transformed values of X into the space defined by the eigenvectors (i.e. the principal components) can then be generated through a linear transformation of X with the eigenvectors (e.g. see [this](https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca) excellent example).
The difference between the two methods appears to lie in the covariance matrix. With the PCAngsd method, the covariance matrix is n x n as opposed to the 'usual' p x p matrix.
So what is the difference between these two covariance matrices, and what is generated by the eigendecomposition of an n x n matrix? Is it really the sample principal components, or something else?
The eigenvectors with the largest eigenvalues correspond to the directions of greatest variation in the dataset; they are obtained by calculating the eigenvalues and eigenvectors of the covariance matrix. This is the most important component.
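The relationship between the two matrices can be checked numerically. The sketch below uses random data in place of genotypes and plain centering in place of PCAngsd's own covariance estimate, which is computed from genotype likelihoods with its own normalization, so it only illustrates the linear algebra: the eigenvectors of the n x n matrix, scaled by the square roots of their eigenvalues, reproduce the sample scores obtained the 'usual' way, up to sign.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 40                     # hypothetical: 6 samples, 40 SNPs
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)              # center each variable (column)

# 'Usual' route: p x p covariance matrix, then project X onto its eigenvectors.
C_pp = X.T @ X / (n - 1)
vals_p, vecs_p = np.linalg.eigh(C_pp)
order_p = np.argsort(vals_p)[::-1]
scores_usual = X @ vecs_p[:, order_p[:2]]   # first two principal components

# Sample-space route: n x n matrix, eigenvectors scaled by sqrt(eigenvalue).
C_nn = X @ X.T / (n - 1)
vals_n, vecs_n = np.linalg.eigh(C_nn)
top = np.argsort(vals_n)[::-1][:2]
scores_dual = vecs_n[:, top] * np.sqrt(vals_n[top] * (n - 1))

# Compare the two sets of scores, ignoring the arbitrary sign of each component.
print(np.allclose(np.abs(scores_usual), np.abs(scores_dual)))
```

This follows from the singular value decomposition X = U S V^T: the p x p route gives V and the scores X V = U S, while the n x n route gives U directly.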
• asked a question related to Estimation
Question
I am currently working as a Master's student on several publications that methodically rely on latent profile analyses. In the context of these LPAs, I have repeatedly encountered the problem that calculating the BLRT with Mplus (TECH14) is both time-consuming and often unsuccessful. In this specific case, an LPA with 30 Likert-scaled items (1-5) of the "Big Five" (OCEAN model) with a sample size of N = 738 was conducted. I would be interested to know which approach you prefer in your research work.
Q1: Do you increase the number of bootstraps and LRT starts as the number of profiles increases, or do you exclude the BLRT test when you encounter error messages and instead refer to the Loglik value, LMR test, and the BIC fit indicator?
So far I have tried the following settings for TECH14 according to the recommendations of the forum entries by Muthen & Muthen:
LRTSTARTS 0 0 100 20 / LRTSTARTS 0 0 500 50.
Both of these options will result in unsuccessful bootstrap draws if more than three profiles are calculated for the model.
Q2: Do you treat your Likert scaled items as interval scaled variables and use the MLR Estimator or do you treat your indicator items as ordinal variables and use the WLSMV Estimator?
In my case, declaring the items as categorical with the WLSMV estimator already leads, with two profiles, to a "non-positive definite first-order derivative product matrix".
There seem to be conflicting opinions here. Brown (2006) writes "The WLSMV is a robust estimator which does not assume normally distributed variables and provides the best option for modeling categorical or ordered data".
On the other hand, Bengt O. Muthén (2006):
The most important aspect is how strong floor and/or ceiling effects you have. If they are strong, you may want to use a categorical approach.
Q3: Would any of you be willing to cross-check my syntax for comparing distal outcomes with the BCH approach? (See appendix)
Philipp Schulz
References:
Brown, T. (2006). Confirmatory factor analysis for applied research. New York: Guilford.
Philipp Schulz I agree with David Eugene Booth. Since your items are ordinal (ordered categorical), you would typically want to use classical latent class analysis (LCA), not latent profile analysis (LPA). LPA is designed for continuous (metrical, interval) indicators, whereas LCA is for categorical (binary, ordinal) indicators. The problems that you encountered with the BLRT may be related to the large number of items (30 items is a lot for both LCA and LPA). This does not necessarily mean that using fewer items is better (in principle, it is good to have many indicators, as long as they are good class indicators), but it may explain the problems in successfully conducting the bootstrap. The BIC is a simple and good alternative in my experience. And yes, as the number of classes / profiles goes up, you should definitely increase the number of starts for both the target (estimated) class model and the bootstrap. You should also check that the best loglikelihood value for each model can be replicated at least a few times so as to minimize the risk of local likelihood maxima (potentially invalid solutions and/or invalid bootstrap results).
With regard to ML vs. WLSMV, you may be confusing latent class/latent profile analysis with confirmatory factor analysis (CFA) and structural equation modeling (SEM). Both LPA and LCA typically make use of maximum likelihood (ML) estimation, not WLSMV. WLSMV is used for ordinal (ordered categorical) variables used as indicators of continuous latent variables ("factors") in CFA and SEM.
• asked a question related to Estimation
Question
This study was carried out to determine the level of contamination of sheep meat, liver and kidney by heavy metals (copper, lead, zinc, cadmium and cobalt) in different areas of al Sulaimanyah Governorate, in comparison with internationally permitted levels. For this purpose, three types of sample (meat, liver and kidney) were taken in three different districts of al Sulaimanyah Governorate: Said Sadiq, Dokan district and Sulaimanyah city center. The samples were collected during October and November 2020. The three-way interaction of the factors significantly affected the amount of copper in the different sheep tissues, so the amount varied with the tissue, the place and the time of sampling. The highest level of copper was recorded in liver tissue in Dokan district during November, while the lowest level of copper was recorded in meat tissue in Said Sadiq district during November. The three-way interaction of the study factors also affected the level of zinc in the different sheep tissues, where the amount varied with the tissue, the place and the time of sampling. The highest level of zinc was recorded in kidney tissue in Sulaimanyah city center during October, while the lowest amount of zinc was recorded in liver tissue in Said Sadiq during October. The three-way interaction of the study factors significantly affected the amount of cadmium, which varied with the tissue, the place and the time of sampling. The highest level of cadmium was recorded in meat tissue in Sulaimanyah city center during October, while the lowest amount of cadmium was recorded in liver tissue in Said Sadiq district during October. The three-way interaction did not significantly affect the amounts of lead and cobalt in the different sheep tissues.
Interesting research!
Did you reach a conclusion about which sources caused the heavy-metal contamination of the meat?
• asked a question related to Estimation
Question
Are there any alternative techniques for ethanol estimation other than HPLC and GC?
Dear Yerra Kanakaraju thank you for posting this interesting technical question on RG. In order to give you a qualified answer, it would be helpful to know a few more details about the conditions / environment in which you want to determine the ethanol content (e.g. in blood, alcoholic beverages, biodiesel etc.). In general, NMR spectroscopy is a suitable method for determining ethanol in mixtures. For some potentially useful information please have a look at the following article which might help you in your analysis:
Determination of Alcohol Content in Alcoholic Beverages Using 45 MHz Benchtop NMR Spectrometer
This method does not require fancy and expensive equipment. The good thing about this paper is that it is freely accessible as public full text on RG. Thus you can download it as pdf file.
Good luck with your work and best wishes!
• asked a question related to Estimation
Question
As far as I know, there are four steps for recommending the N fertilizer rates based on NDVI readings and grain yield at N rich strip and farmer practice plot.
The steps are as follows:
1- Estimating Response Index (RI) RI=Ypn/Ypo
2- Estimating Ypo (yield at farmer practice strip)
3- Estimating Ypn (yield at N rich strip) based on RI, i.e., Ypn=Ypo*RI
4- N fertilizer rate recommendation using the formula:
NFRate=(N uptake at Ypn - N uptake at Ypo)/NUE
If the above-mentioned steps are correct, I would like to know which independent variable I should use to estimate Ypo: NDVIo (NDVI at farmer practice) or INSEY (in-season estimated yield based on NDVI, i.e., NDVI divided by the days from planting to sensing)?
Thanks a million for the links.
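As a rough illustration of steps 3 and 4 of the question, the arithmetic could be sketched as below. The function name and every number are hypothetical, and the sketch assumes that N uptake is simply yield multiplied by a grain N fraction, which the post does not state.

```python
def n_fertilizer_rate(ypo, ri, grain_n_fraction, nue):
    """N rate from the listed steps; all inputs are supplied by the user."""
    ypn = ypo * ri                               # step 3: Ypn = Ypo * RI
    # Assumption: N uptake = yield * grain N fraction, so the uptake gap is:
    uptake_gap = (ypn - ypo) * grain_n_fraction
    return max(uptake_gap, 0.0) / nue            # step 4, never negative

# Hypothetical inputs: 3000 kg/ha at farmer practice, RI = 1.3,
# 2% N in grain and a nitrogen use efficiency of 50%.
rate = n_fertilizer_rate(ypo=3000.0, ri=1.3, grain_n_fraction=0.02, nue=0.5)
print(f"Recommended N rate: {rate:.1f} kg N/ha")
```

Whichever predictor is chosen for Ypo (NDVIo or INSEY), it only enters this calculation through the ypo argument.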
• asked a question related to Estimation
Question
Example
Dominant Biceps Strength = 70%
Non-dominant Biceps Strength = 50%
For the dominant biceps, as follows:
1kg/ 8 times of Rep/ 30s/ 5 Sets/ 3m/ 3 sessions/week. Strengthening exercises
It would be uncommon to prescribe a lower relative load to the weaker muscle. Why don't you do 1-RM testing on each arm and then prescribe the same training load (% 1RM) for each arm based on its own 1-RM? The weaker arm will ultimately have a lower absolute load, but the same relative load.
The National Strength and Conditioning Association uses the following RepMax table:
1 repetition max = 100% of 1RM
2 repetition max = 95% of 1RM
4 repetition max = 90% of 1RM
6 repetition max = 85% of 1RM
8 repetition max = 80% of 1RM
This would be a good starting point.... but remember that these are repetition max estimates and therefore training should be slightly below that for repeated sets. Also, it looks like you are just doing as many repetitions as possible during each 30 sec work bout, if I am reading it correctly. You may want to go on the lower end of the relative load, but 50-70% seems like a reasonable range. It depends on how many repetitions you want them to complete.... there is no prescribed load for your proposed protocol. Try it out on a few people and see where their repetition ranges fall and let that guide your decision!
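The "same relative load, different absolute load" idea can be sketched with the table quoted above. The 1RM values for the two arms are hypothetical, and the helper names are made up for this illustration.

```python
# Repetition-max table quoted above: repetitions -> fraction of 1RM.
REP_MAX_FRACTION = {1: 1.00, 2: 0.95, 4: 0.90, 6: 0.85, 8: 0.80}

def estimate_1rm(load_kg, reps):
    """Estimate 1RM from a load lifted for a tabulated number of repetitions."""
    return load_kg / REP_MAX_FRACTION[reps]

def training_load(one_rm_kg, relative_load):
    """Absolute load for a given relative load (e.g. 0.6 means 60% of 1RM)."""
    return one_rm_kg * relative_load

# Hypothetical 1RM values for each arm: same relative load, different absolute load.
for arm, one_rm in (("dominant", 20.0), ("non-dominant", 15.0)):
    print(arm, round(training_load(one_rm, 0.60), 1), "kg")
```

The table only covers the listed repetition counts, so estimate_1rm raises a KeyError for anything else rather than guessing.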
Good luck!
• asked a question related to Estimation
Question
I am in search of methods for the quantitative and qualitative estimation of cellulase in a food sample (biological sample).
• asked a question related to Estimation
Question
Are there any references for Estimated Average Requirement (for minerals and vitamins) of infants less than 6 months?
• asked a question related to Estimation
Question
Hi everyone
I'm looking for a quick and reliable way to estimate my missing climatological data. My data are daily and span more than 40 years. They include minimum and maximum temperature, precipitation, sunshine hours, relative humidity and wind speed. My main problem is the sunshine hours data, which have a lot of gaps. These gaps are scattered throughout the time series; sometimes they cover several months and even a few years. I work with 18 stations. Because my data are daily, the number of missing values is high, so I need to estimate the missing data before starting work. Your comments and experiences can be very helpful.
Thank you so much for advising me.
It is in French
• asked a question related to Estimation
Question
Logistic regression: type of distribution estimation
What is the connection between estimation and distribution in machine learning?
The logistic (sigmoid) distribution is one of the essential parts of statistics. The specific question here concerns its machine learning application: https://machinelearningmastery.com/logistic-regression-for-machine-learning/
• asked a question related to Estimation
Question
Hello dear
There is significant variation between the official statistics of FAO, NRCS, and HWSD on the amount of global SOC in the 0-100 cm depth of soils around the world. All of them have studied the same land areas with almost the same methods, and they have even shared databases with each other. For example, one of the main sources HWSD uses to calculate global SOC comes from FAO. However, there is a considerable difference between them:
total global SOC over 0-100 cm depth:
HWSD: 2469 Pg
NRCS: 1399 Pg
FAO: 1459 Pg
references:
1- Hiederer, R. and M. Köchy (2011) Global Soil Organic Carbon Estimates and the Harmonized World Soil Database. EUR 25225 EN. Publications Office of the European Union. 79 pp.
2- Hiederer, R. Kochy, M. 2012. Global Soil Organic Carbon Estimates and the Harmonized World Soil Database. EUR Scientific and Technical Research series, ISSN 1831-9424 (online), ISSN 1018-5593 (print), ISBN 978-92-79-23108-7, doi:10.2788/1326.
3- Köchy, M. Hiederer, R. Freibauer, A. 2015. Global distribution of soil organic carbon – Part 1: Masses and frequency distributions of SOC stocks for the tropics, permafrost regions, wetlands, and the world. SOIL, 1: 351–365. http://www.soil-journal.net/1/351/2015/
Dear Bajgai
Thanks. Over the past three weeks, more than 500 researchers and experts have read this key question, but apparently there is no clear answer for this tremendous gap between the official global statistics. Needless to say, such official global statistics have been treated as the cornerstone of global environmental agreements, such as the Paris Climate Change Agreement. I would like you to share this question with all of your followers to see whether we can find a possible answer or whether we have to wait longer.
Sincerely
• asked a question related to Estimation
Question
I would like to bring together those interested in estimating, using different approaches, the true underlying number of SARS-CoV-2 infected individuals. This is important since this number gives us an idea of how many undetected individuals are spreading the infection and causing deaths amongst the elderly and individuals with preexisting health conditions.
• asked a question related to Estimation
Question
Hi, I am looking to estimate a willingness to pay for a choice model, but for the given alternatives compared to a base rather than for the product attributes variables. I have run a mixed logit model and have coefficients for the attribute variables followed by coefficients for case-specific variables with each different option.
Hi Louie! You can find what you need on Kenneth Train's website. In my opinion, he is the main reference on this topic. I leave you this link to his book; I especially recommend chapter six.
All the best in your work!
• asked a question related to Estimation
Question
Can anyone suggest which optimization / estimation techniques are good for solar photovoltaic cells?
There are three main parameters to maximize in order to maximize the power conversion efficiency (PCE) of the solar cell:
PCE = (Isc · Voc · FF) / Pin, where Pin is the incident solar power under AM1.5.
So, as you see, the three factors are the short-circuit current Isc,
the open-circuit voltage Voc and the fill factor FF.
One has to adjust the physical and technological parameters of the solar cell to achieve the maximum of the three quantities.
One has to make the absorber thickness d at least equal to the absorption depth at the longest wavelength of the radiation, so d >= 1/alpha(lambda_max),
where alpha is the absorption coefficient.
The thickness d must at the same time be less than the diffusion length L of the minority carriers, so d <= L.
One has to minimize the reverse saturation current Is of the dark current to achieve the highest Voc, since:
Voc = n · Vt · ln(Isc/Is)
Is is minimized by reducing the injection across the junction and by increasing the minority carrier lifetime throughout the device.
In order to increase the fill factor, one has to reduce the series resistance Rs and increase the shunt resistance Rsh.
In this way one gets the highest efficiency.
Best wishes
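As a rough numerical illustration of the two relations in this answer (PCE from Isc, Voc and FF, and Voc from Isc and Is), here is a minimal Python sketch. The cell values (1 cm^2 area, Isc = 35 mA, Is = 1 pA, FF = 0.80, n = 1) and the function names are hypothetical and chosen only for illustration; they are not taken from the answer.

```python
import math

def open_circuit_voltage(i_sc, i_s, n=1.0, v_t=0.02585):
    """Voc = n * Vt * ln(Isc / Is); v_t is the thermal voltage in volts near 300 K."""
    return n * v_t * math.log(i_sc / i_s)

def power_conversion_efficiency(i_sc, v_oc, ff, p_in):
    """PCE = Isc * Voc * FF / Pin, with Pin the incident power in watts."""
    return i_sc * v_oc * ff / p_in

# Hypothetical 1 cm^2 cell under AM1.5 (about 0.1 W incident on 1 cm^2).
i_sc = 0.035   # short-circuit current, A (assumed)
i_s = 1e-12    # reverse saturation current, A (assumed)
v_oc = open_circuit_voltage(i_sc, i_s)
pce = power_conversion_efficiency(i_sc, v_oc, ff=0.80, p_in=0.1)
print(f"Voc = {v_oc:.3f} V, PCE = {pce:.1%}")

# A lower saturation current raises Voc, as stated in the answer.
v_oc_lower_is = open_circuit_voltage(i_sc, i_s / 10.0)
```

Because Voc depends on Is only logarithmically, each tenfold reduction of Is adds roughly n · Vt · ln(10), about 60 mV at room temperature for n = 1.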
• asked a question related to Estimation
Question
In irrigation science, the net depth of irrigation (NDI) is estimated from the equation
NDI=RZD*WHC*PD%
RZD: root zone depth, mm
WHC: water holding capacity, mm water/cm soil profile
PD%: percentage of depletion %
In general, PD is limited to between 40% and 60%.
If there is any other acceptable ratio of depletion in any irrigation system, please inform me.
Best Regards
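For readers who want to try the NDI equation numerically, here is a small Python sketch. It assumes that the root zone depth is converted from mm to cm before multiplying, so that it matches a WHC expressed in mm of water per cm of soil; that unit conversion is my reading of the definitions above, not something the question states. The example values are hypothetical.

```python
def net_depth_of_irrigation(rzd_mm, whc_mm_per_cm, pd_percent):
    """NDI (mm) = RZD * WHC * PD%, with RZD converted from mm to cm (assumed units)."""
    if not 0.0 <= pd_percent <= 100.0:
        raise ValueError("pd_percent must be between 0 and 100")
    rzd_cm = rzd_mm / 10.0
    return rzd_cm * whc_mm_per_cm * pd_percent / 100.0

# Hypothetical case: 600 mm root zone, 1.5 mm water per cm of soil, 50% depletion.
ndi = net_depth_of_irrigation(600.0, 1.5, 50.0)
print(f"NDI = {ndi:.1f} mm")
```

With these inputs the available water in the root zone is 90 mm, so a 50% depletion corresponds to a net irrigation depth of 45 mm.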
Dear Isam Abdulhameed many thanks for asking this very interesting technical question. In addition to the relevant literature references suggested by Mohamed-Mourad Lafifi please also have a look at the following potentially useful link:
CHAPTER 6: Irrigation scheduling
This book chapter is freely available as public full text on the internet (please see the attached pdf file).
Please also see the following irrigation depth calculator:
Good luck with your work and best wishes!
• asked a question related to Estimation
Question
I am a bit wary about asking this question because of my ignorance, and I am also not too sure whether this question might bring out some emotional response.
Assume I want to have an estimate on the average R-squared and variability from literature for specific models y~x. I found 31 literature sources.
The questions are twofold :
1.) Can I shift the simulated values of an ABC-rejection algorithm so that they behave as if they indeed came from my target (see the first 4 figures)?
The parameter in this case is the draw from the prior, which deviates from the target and is then shifted so that it fits.
2.) I applied 4 methods in this case: ABC-rejection (flat prior, not really preferred), Bayesian bootstrap, classical bootstrap and a one-sided t-test (lower 4 figures). From all methods I extracted the 2.5-97.5% intervals. Given the information below, is it reasonable to go for the Bayesian bootstrap in this case?
As sometimes suggested on RG and hidden deeply in some articles, the intervals of the different methods converge and are more or less similar. However, I do have another, smaller dataset which is also skewed. So I would personally prefer the Bayesian bootstrap, as it smooths things out and extreme exactness does not matter too much to me in this case. Based on these results, my coarse guesstimate of the average variability would range from ~20-30% (to me it seems potato - potaato for either method, disregarding the philosophical meaning). I would also like to use the individual estimates returned by each bootstrap, and technically they are not normally distributed (Beta-ish), although this does not seem to matter much in this pragmatic case.
I agree with David, use mean
• asked a question related to Estimation
Question
Hi,
Artemis gives an error bar on the EXAFS fit, but it assumes the value to be zero at certain R values.
In the case of data with some experimental noise, the error bar needs to be corrected, weighted by the square root of the reduced chi-squared value, taking into account the experimental noise for each R-space spectrum from 15 to 25 Å, as described in
How do I calculate the uncertainties in the coordination number when EXAFS is fitted using Artemis?
Thanks, Gerhard Martens it was indeed insightful
I have one more case where relatively good data (up to k=14) gives the following fitting (see attached).
It still shows a huge error bar. Can this be solved somehow?
Thanking you,
• asked a question related to Estimation
Question
Is the following calculation of standardized residuals correct?
Volatility= Estimated from MSGARCH package in R
Residuals= abs(returns)-volatility;
standardized residuals=Residuals/volatility;
what is YourObject?
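For comparison with the formula in the question: in GARCH-type models, standardized residuals are usually taken as the mean-corrected return divided by the conditional volatility, rather than abs(returns) minus volatility. The Python sketch below shows that convention only. It does not use the MSGARCH package; the function name, the zero-mean default and the sample numbers are assumptions for illustration.

```python
def standardized_residuals(returns, volatility, mean=0.0):
    """z_t = (r_t - mean) / sigma_t for each time step t."""
    if len(returns) != len(volatility):
        raise ValueError("returns and volatility must have the same length")
    if any(s <= 0.0 for s in volatility):
        raise ValueError("volatility must be strictly positive")
    return [(r - mean) / s for r, s in zip(returns, volatility)]

# Hypothetical daily returns and fitted conditional volatilities.
returns = [0.02, -0.01, 0.003]
volatility = [0.01, 0.02, 0.015]
z = standardized_residuals(returns, volatility)
print(z)
```

If the volatility model is adequate, these values should look like draws from a distribution with mean close to 0 and variance close to 1, which is what the usual diagnostic tests check.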
• asked a question related to Estimation
Question
I am running K estimator for a gene family. I found that the K estimator is only working for certain gene combinations. It failed to estimate Ka and Ks values for some gene combinations. Is it possible that the other combinations are distantly related?
You may try the software KaKs calculator.
• asked a question related to Estimation
Question
I would like to measure the air pollution caused due to stone mines in the surrounding areas. So for this, what is the best method to identify sampling points.
It depends on the number of mining areas. If there are few, there is no need for sampling, but if there are many, you can use Fisher's method for sampling. https://www.geopoll.com/blog/sample-size-research/
• asked a question related to Estimation
Question
Dear all,
The complex I am trying to simulate has protein (10 chains), DNA, and RNA. I have run the simulation for 2 µs so far, but the biological event I want to observe requires a very long MD simulation. I decided to accelerate the sampling using the AWH method. I am not familiar enough with the AWH method or other methods such as umbrella sampling. I have watched the two webinars on AWH, but I am still not sure about the mdp options. Attached is the mdp I used and the error I got!
I believe that I need to choose a reference atom for pulling, but on what basis?
pull = yes ; The reaction coordinate (RC) is defined using pull coordinates.
pull-ngroups = 12 ; The number of atom groups needed to define the pull coordinate.
pull-ncoords = 7 ; Number of pull coordinates.
pull-nstxout = 1000 ; Step interval to output the coordinate values to the pullx.xvg.
pull-nstfout = 0 ; Step interval to output the applied force (skip here).
pull-group1-name = Protein_chain_A ; Name of pull group 1 corresponding to an entry in an index file.
pull-group2-name = Protein_chain_B ; Same, but for group 2.
pull-group3-name = Protein_chain_C
pull-group4-name = Protein_chain_D
pull-group5-name = Protein_chain_E
pull-group6-name = Protein_chain_F
pull-group7-name = Protein_chain_G
pull-group8-name = Protein_chain_H
pull-group9-name = Protein_chain_I
pull-group10-name = Protein_chain_J
pull-group11-name = RNA
pull-group12-name = DNA
pull-group1-pbcatom = 0
pull-group2-pbcatom = 0
pull-group3-pbcatom = 0
pull-group4-pbcatom = 0
pull-group5-pbcatom = 0
pull-group6-pbcatom = 0
pull-group7-pbcatom = 0
pull-group8-pbcatom = 0
pull-group9-pbcatom = 0
pull-group10-pbcatom = 0
pull-group11-pbcatom = 0
pull-group12-pbcatom = 0
pull-coord1-groups = 1 2 ; Which groups define coordinate 1? Here, groups 1 and 2.
pull-coord2-groups = 3 4
pull-coord3-groups = 5 6
pull-coord4-groups = 7 8
pull-coord5-groups = 9 10
pull-coord6-groups = 11 12
pull-coord7-groups = 11 12
pull-coord1-geometry = distance ; How is the coordinate defined? Here by the COM distance.
pull-coord1-type = external-potential ; Apply the bias using an external module.
pull-coord1-potential-provider = AWH ; The external module is called AWH!
awh = yes ; AWH on.
awh-nstout = 50000 ; Step interval for writing awh*.xvg files.
awh-nbias = 1 ; One bias, could have multiple.
awh1-ndim = 1 ; Dimensionality of the RC, each dimension per pull coordinate. pull-coord1-groups
awh1-dim1-coord-index = 1 ; Map RC dimension to pull coordinate index (here 1 -> 1)
awh1-dim1-start = 0.25 ; Sampling interval min value (nm)
awh1-dim1-end = 0.70 ; Sampling interval max value (nm)
awh1-dim1-force-constant = 128000 ; Force constant of the harmonic potential (kJ/(mol*nm^2))
awh1-dim1-diffusion = 5e-5 ; Estimate of the diffusion (nm^2/ps), used to initial update size, how quickly the system moves
awh1-error-init = 5 ; Estimate of the error of diffusion , used to set initial update size
awh-share-multisim = yes ; Share bias across simulations
awh1-share-group = 1 ; Non-zero share group index
ERROR 12 [file AWH3.mdp]: When the maximum distance from a pull group reference atom to other atoms in the group is larger than 0.5 times half the box size a centrally placed atom should be chosen as pbcatom. Pull group 12 is larger than that and does not have a specific atom selected as reference atom.
Any advice you could give would be much appreciated
Thank you so much!
Amnah
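The grompp error asks for "a centrally placed atom" to be set as pbcatom for the large pull groups. One possible way to pick such an atom is to take the atom closest to the geometric centre of the group. The Python sketch below illustrates only that selection step. It assumes you have already extracted the atom numbers and the coordinates of one pull group from a structure in which the molecule is whole (not split across periodic boundaries); the function name and the toy coordinates are hypothetical, and the atom numbering convention expected by pull-groupN-pbcatom should be checked in the GROMACS mdp documentation.

```python
def central_atom_id(atom_ids, coords):
    """Return the id of the atom closest to the geometric centre of the group."""
    if not coords or len(atom_ids) != len(coords):
        raise ValueError("atom_ids and coords must be non-empty and of equal length")
    n = len(coords)
    centre = [sum(c[k] for c in coords) / n for k in range(3)]

    def squared_distance(c):
        return sum((c[k] - centre[k]) ** 2 for k in range(3))

    best = min(range(n), key=lambda i: squared_distance(coords[i]))
    return atom_ids[best]

# Toy group of three atoms along x (coordinates in nm, values are made up).
atom_ids = [101, 102, 103]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.1, 0.0, 0.0)]
pbc_atom = central_atom_id(atom_ids, coords)
print(pbc_atom)
```

The returned number would then replace the 0 in the corresponding pull-groupN-pbcatom line, for group 12 (DNA) in the error above and for any other group that triggers the same check.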
AWH calculates the free energy along an order parameter of the system. Free energy barriers are overcome by adaptively tuning a bias potential along the order parameter such that the biased distribution along the parameter converges toward a chosen target distribution. The fundamental equation governing the tuning is: log(target) = bias - free energy, where the bias and free energy are initially unknown. Typically the target distribution is simply chosen uniform, such that the bias completely flattens the free energy landscape.
Regards,
Shafagat
• asked a question related to Estimation
Question
I want to compare the efficiencies of different machine learning techniques for Structural reliability and surrogate modelling. Although my problem is specific, I think there should be well-known criteria for that. Unfortunately, I have not found any in the literature!
One simple idea is to multiply accuracy by the number of calls or time, but it is really not a very proper criterion most of the time.
How can we define good criteria? Is it a good idea to go with a weighted multiplication based on a specific objective? Or is there any well-known method for making this comparison?
I appreciate your help with this challenge!
Kind Regards
• asked a question related to Estimation
Question
I am implementing an unscented kalman filter for parameter estimation and I found its performances strongly related to the initialization of the parameter to estimate.
In particular, I don't understand why if I initialize the parameter below the reference value (the actual value to estimate) I get good performances, but if the parameter initialization is above the reference value, performances are very low and the estimation does not converge but continues increasing.
In other words, it seems that the estimation can only make the parameter increase. Is there any mathematical reason behind this?
Dear Mr. Magnani,
I agree with Janez Podobnik. I assume you are using the Joint Unscented Kalman Filter for this purpose. In this case, the parameter is treated as a state, and all the initial conditions and the noise and measurement covariances will affect the estimation of the parameter. In 2019, I proposed a different approach for this purpose and I am attaching the article link below:
The main advantage of this approach is you can determine weights for the measurements (a Jacobian approach) for parameter estimation. If one of the measurements significantly affects the parameter estimation, then, you can increase the weight of that measurement. The only drawback of this approach is that the estimated parameter must be linear in terms of measurements. The system itself can be a nonlinear system, but parameter(s) must be linear in terms of measurements. For frequency estimation, we have modified this approach with my colleagues (link below). In this case, the parameter is no longer linear in terms of measurements. However, we haven't tested this approach for other nonlinear systems in which the parameter(s) is nonlinear in terms of measurements.
You may try this approach and it can be useful for your aim.
Best regards,
Altan
• asked a question related to Estimation
Question
I am trying to estimate nitrite from human serum using Griess reagent from Sigma. Sodium nitrite standards are used. Estimation of nitrite is done at 540 nm. Serum was processed in the following different ways before estimation.
1) Serum deproteinized with 92 mM Zinc Sulphate.
2) Serum deproteinized with 40% ethanol.
3) Serum deproteinized and reduced with Vanadium Chloride III.
I'm unable to measure nitrite from processed and unprocessed serum samples as suggested in literature. Kindly suggest your views.
Dear Nandi
Although there are various methods for the determination of NOx, the simplicity, rapidity, and low cost of the Griess assay have made this method more popular than others. Deproteinization is a necessary step when measuring NOx concentration in serum, mostly because of the turbidity resulting from protein precipitation in an acidic environment. I consider acetonitrile an adequate method for obtaining a good extraction sample: acetonitrile is mixed in a 1:1 ratio with serum, vortexed for 1 min and centrifuged at 10000 × g for 10 min at 4 °C, and the supernatant is used for NOx determination.
• asked a question related to Estimation
Question
The Nyquist-Shannon theorem provides an upper bound for the sampling period when designing a Kalman filter. Leaving apart the computational cost, are there any other reasons, e.g., noise-related issues, to set a lower bound for the sampling period? And, if so, is there an optimal value between these bounds?
More samples are generally better until such point as the difference in the real signal between samples is smaller than the quantization or other noise. At that point, especially with quantization, it may be a point of diminishing returns.
The other thing that nobody mentions is that faster sampling means less real-time processing time. In many systems, it's not really an issue, as the time constants of the physical system are so slow as to never challenge the processing. In others, say high-speed flexible mechatronic systems, the required sample rates may challenge the number of processing cycles available to complete the task.
Generally, the best bet is to return to the physical system's time constants and (if possible) sample 20-100x as fast as them.
• asked a question related to Estimation
Question
I want to estimate TOA and CFO of LTE signal. TOA is estimated for oversampled signal. after TOA estimation, the CFO is estimated. Please let me know, why the estimated CFO is wrong?
What is the approach used to achieve estimation?
• asked a question related to Estimation
Question
I have designed the mathematical model of the plant with a nonlinear hysteresis function f(x1), and it has been validated using simulation. Now I want to design a nonlinear observer to estimate the speed (x2). Note that I have also modeled the nonlinear function in the model.
My state space model of the plant is
x1_dot = x2
x2_dot = q*x1 + c*x2 + f(x1) + u
Please suggest suitable observer to estimate the angular speed x2.
x1 is the angular position of the plant.
A High Gain Observer (HGO) is a good technique for estimating the states of a nonlinear system, and it also preserves the separation principle. I think an HGO would be the better choice.
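To make the suggestion concrete, here is a minimal Python sketch of a standard high-gain observer for the model in the question, simulated with forward Euler. It assumes that x1 is the measured output and that the observer uses the same model as the plant. The values of q and c, the smooth tanh term standing in for the hysteresis function f(x1), the input u(t), and the gains a1, a2 and eps are all hypothetical placeholders, not values from the question.

```python
import math

def simulate_hgo(q=-4.0, c=-0.5, eps=0.01, dt=1e-4, t_end=2.0):
    """Euler simulation of the plant and a high-gain observer; returns (x2, x2_hat)."""
    f = lambda x1: -0.5 * math.tanh(x1)   # placeholder for the hysteresis term f(x1)
    u = lambda t: math.sin(2.0 * t)       # hypothetical known input
    a1, a2 = 2.0, 1.0                     # s^2 + a1*s + a2 must be Hurwitz
    x1, x2 = 1.0, 0.0                     # true state (x2 is not measured)
    x1h, x2h = 0.0, 0.0                   # observer state
    for k in range(int(round(t_end / dt))):
        t = k * dt
        err = x1 - x1h                    # output error, y = x1 is measured
        dx1 = x2
        dx2 = q * x1 + c * x2 + f(x1) + u(t)
        dx1h = x2h + (a1 / eps) * err
        dx2h = q * x1h + c * x2h + f(x1h) + u(t) + (a2 / eps ** 2) * err
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
        x1h, x2h = x1h + dt * dx1h, x2h + dt * dx2h
    return x2, x2h

x2_true, x2_est = simulate_hgo()
print(f"x2 = {x2_true:.4f}, estimate = {x2_est:.4f}")
```

A smaller eps makes the estimate converge faster but increases the initial peaking of the estimate and the sensitivity to measurement noise, so in practice eps is a trade-off, and a true hysteresis model would replace the tanh placeholder.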
• asked a question related to Estimation
Question
If anyone knows about the estimation of population parameters for panel data, could you kindly recommend some literature or web links?
• asked a question related to Estimation
Question
What is the difference between Maximum Likelihood Sequence Estimation and Maximum Likelihood Estimation? Which one is a better choice in case of channel non-linearities? And why and how oversampling helps in this?
The Maximum Likelihood Estimation (MLE) is a method of estimating the parameters of a specific model. It selects the set of values of the model parameters that maximizes the likelihood function. Intuitively, this maximizes the "agreement" of the selected model with the observed data.
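As a toy illustration of this definition, the Python sketch below picks the success probability of a Bernoulli model that maximizes the log-likelihood of some made-up binary data by a simple grid search. The data, the grid resolution and the function names are assumptions for illustration; the example is unrelated to any specific channel model.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-likelihood of i.i.d. 0/1 observations under success probability p."""
    return sum(math.log(p) if x == 1 else math.log(1.0 - p) for x in data)

def mle_grid(data, steps=999):
    """Return the grid value of p in (0, 1) that maximizes the log-likelihood."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # hypothetical observations, 7 ones out of 10
p_hat = mle_grid(data)
print(p_hat)
```

For this model the maximizer is known in closed form (the sample mean), so the grid search should land on the proportion of ones in the data.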
• asked a question related to Estimation
Question
Hello,
using the Cholesky decomposition in the UKF introduces the possibility that the UKF fails if the covariance matrix P is not positive definite.
Is this an unavoidable fact? Or is there any method to bypass that problem completely?
I know there are some computationally more stable algorithms, like the Square Root UKF, but even they can fail.
Can I say that the Cholesky decomposition fails only for bad estimates during my filtering, when even an EKF would fail/diverge?
I want to understand whether the UKF is superior to the EKF not only in terms of accuracy, but also in terms of stability/robustness.
Best regards,
Max
If I understand your question correctly, it concerns not the initial covariance matrix rather the updated covariance matrix you get at the end of each Kalman iteration.
If such a condition arises you may use Higham's method to find an approximate positive-definite covariance matrix.
Reference:
Computing a nearest symmetric positive semidefinite matrix - ScienceDirect
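Note that the sketch below is not Higham's method. It is a simpler and cruder fallback that is often used in practice: symmetrize the covariance matrix and, if the Cholesky factorization still fails, add a small multiple of the identity to the diagonal and retry. The function names, the starting jitter of 1e-9 and the factor of 10 between attempts are arbitrary assumptions.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def robust_cholesky(p, jitter=1e-9, max_tries=10):
    """Symmetrize p, then retry the factorization with growing diagonal jitter."""
    n = len(p)
    sym = [[0.5 * (p[i][j] + p[j][i]) for j in range(n)] for i in range(n)]
    added = 0.0
    for _ in range(max_tries):
        trial = [[sym[i][j] + (added if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
        try:
            return cholesky(trial)
        except ValueError:
            added = jitter if added == 0.0 else added * 10.0
    raise ValueError("could not make the matrix positive definite")

# Singular (only positive semidefinite) covariance: plain Cholesky fails here.
factor = robust_cholesky([[1.0, 1.0], [1.0, 1.0]])
print(factor)
```

The jitter changes the covariance slightly, so it should stay small relative to the diagonal of P; if large values are needed, that usually points to a modelling or tuning problem rather than a purely numerical one.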
• asked a question related to Estimation
Question
In Bayesian inference, we have to choose a prior distribution for the parameter in order to find the Bayes estimate, and this choice depends on our belief and experience.
I would like to know what steps or rules we should follow when choosing a prior distribution for a parameter. Please help me with this so that I can proceed.
Some other short articles that will be extremely helpful to you are
1. 3 Basics of Bayesian Statistics - CMU Statistics (http://www.stat.cmu.edu/~brian/463-663/week09/Chapter%2003.pdf)
2. Chapter 12 Bayesian Inference - CMU Statistics (http://www.stat.cmu.edu/~larry/=sml/Bayes.pdf)
3. Bayesian analysis - MIT OpenCourseWare (https://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/lecture-notes/MIT15_097S12_lec15.pdf)
• asked a question related to Estimation
Question
I am currently working on a project where I am testing a Pairs Trading Strategy based on Cointegration. In this strategy I have 460 possible stock pairs to choose from every day over a time frame of 3 years. I am using daily Cointegration test and get trade signals based off of that to open or close a trade. According to this strategy I am holding my trades open until either the take profit condition (revert to mean of spread) or stop loss condition (spread exceeding [mean + 3* standard deviation]) holds. This means that some trades might be open a couple of days, others might be open for weeks or even months.
My question is now: How can I calculate the returns of my overall strategy?
I know how to calculate the returns per trade but when aggregating returns over a certain time period or over all traded pairs I have problems.
Let's say I am trying to calculate returns over 1 year. I could take the average of all the trade returns or calculate sum(profits per trade of each pair)/sum(invested or committed capital per trade), both of these would only give me some average return values.
Most of my single trades are profitable but in the end I am trying to show how profitable my whole trading strategy is, so I would like to compare it to some benchmark, but right now I don't really know how to do that.
One idea I had was to estimate the average daily return of my trading strategy by:
1. Estimating the daily return per trade: (return per trade) / (number of days the trade was open)
2. Taking the average of all the daily returns per trade
Then finally I would compare it to the average daily return of an index over the same time frame.
Does this make any sense or what would be a more appropriate approach?
I think the idea is good, it makes sense to calculate the average daily return on your trading strategy by:
1. Estimation of daily return on a transaction: (return on a transaction) / (number of days during which the trade was opened)
2. Taking the average of all daily income per transaction
Then compare it with the average daily return on the index for the same period of time. In this case, the index is the base or reference, and you estimate the deviation from it. You can also test a hypothesis about the differences and whether they are statistically significant.
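The two steps described above can be sketched in a few lines of Python. The trade list is made up, and the calculation deliberately follows the simple recipe discussed here (return divided by days open, then a plain average), so it ignores compounding and the fact that overlapping trades tie up capital at the same time.

```python
def average_daily_return(trades):
    """trades: list of (total_return, days_open); returns the mean daily return per trade."""
    daily = [total / days for total, days in trades if days > 0]
    if not daily:
        raise ValueError("no trades with a positive holding period")
    return sum(daily) / len(daily)

# Hypothetical trades: (return over the whole trade, number of days open).
trades = [(0.04, 8), (0.015, 3), (-0.03, 20)]
strategy_daily = average_daily_return(trades)

# Hypothetical average daily return of the benchmark index over the same period.
index_daily = 0.0004
excess_daily = strategy_daily - index_daily
print(f"strategy: {strategy_daily:.5f}, excess over index: {excess_daily:.5f}")
```

Because every trade gets the same weight regardless of its size or duration, a capital-weighted or time-weighted variant may be more appropriate if position sizes differ a lot.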
• asked a question related to Estimation
Question
Can anyone offer sources that offer error (or uncertainty) bands to be applied to the annual energy production (AEP) of wind turbines that are calculated from mean annual wind speeds?
AEPs are reported to carry considerable errors with up to 15% uncertainty on power curve determination and up to 20% on wind resources [1], so if anyone has a source on how to treat the compound uncertainty please can you advise?
[1] Frandsen, S. et al. Accuracy of Estimation of Energy Production from Wind Power Plants.
The terrain complexity, local roughness, the existence of obstacles and the distance of each turbine from the meteorological towers are among the factors that determine the magnitude of uncertainties. The range of uncertainty can be very wide, but a typical range is 3% - 6%.
Best regards
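On the question of how to treat the compound uncertainty: a common convention in uncertainty analysis, when the individual relative uncertainties can be treated as independent, is to combine them as the root of the sum of squares. The Python sketch below applies that convention to the 15% and 20% figures quoted in the question. Whether the power curve and wind resource uncertainties are really independent, and whether both are already expressed in terms of energy, are assumptions that the cited source would need to confirm.

```python
import math

def combined_relative_uncertainty(components):
    """Root-sum-square combination of independent relative uncertainties."""
    return math.sqrt(sum(u * u for u in components))

# Figures quoted in the question: 15% (power curve) and 20% (wind resource).
total = combined_relative_uncertainty([0.15, 0.20])
print(f"combined uncertainty: {total:.0%}")

# Hypothetical AEP estimate with a simple +/- band.
aep_mwh = 5000.0
band = (aep_mwh * (1.0 - total), aep_mwh * (1.0 + total))
```

If the components are correlated, the root-sum-square rule underestimates the total and a linear sum of the components would be the conservative upper bound.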
• asked a question related to Estimation
Question
I've got 5 years of Direct Normal Irradiance data that I downloaded from https://re.jrc.ec.europa.eu/. I ordered the data by year and by 24-hour period because I need the DNI curve for the maximum in summer, the minimum in winter and the average in spring and autumn. The problem is that, because not all days are sunny, the data is noisy and I can't use simple "max" and "min" algorithms. Which algorithms do you recommend to filter and correct the data? To produce this graph, I used LabVIEW to order the data.
• asked a question related to Estimation
Question
I'm using the lme4 package [lmer() function] to estimate several averaged models, and I want to plot their estimated coefficients. I found the document "Plotting Estimates (Fixed Effects) of Regression Models" by Daniel Lüdecke, which explains how to plot estimates. It works with averaged models, but it uses the Conditional Average values instead of the Full Average values.
Model Script:
library(lme4)
options(na.action = "na.omit")
PA_model_clima1_Om_ST <- lmer(O.matt ~ mes_N + Temperatura_Ar_PM_ST + RH_PM_ST + Vento_V_PM_ST + Evapotranspiracao_PM_ST + Preci_total_PM_ST + (1|ID), data=Abund)
library(MuMIn)
options(na.action = "na.fail")
PA_clima1_Om_ST<-dredge(PA_model_clima1_Om_ST)
sort.PA_clima1_Om_ST <- PA_clima1_Om_ST[order(PA_clima1_Om_ST$AICc),]
top.models_PA_clima1_Om_ST <- get.models(sort.PA_clima1_Om_ST, subset = delta < 2)
model.sel(top.models_PA_clima1_Om_ST)
Avg_PA_clima1_Om_ST <- model.avg(top.models_PA_clima1_Om_ST, fit = TRUE)
summary(Avg_PA_clima1_Om_ST)
Plot script:
library(sjPlot)
library(sjlabelled)
library(sjmisc)
library(ggplot2)
data(efc)
theme_set(theme_sjplot())
plot_model(Avg_PA_clima1_Om_ST, type="est", vline.color="black", sort.est = TRUE, show.values = TRUE, value.offset = .3, title= "O. mattogrossae")
The plot it creates uses the Conditional Average values instead of the Full Average values. How can I plot the estimates of averaged models using the Full Average values?
Thanks for your time and help
Hello, can anyone help me to interpret the GLM results about estimated value, intercept, coefficient, AICc, delta AICc, loglik, weight etc? What do the Null deviance and Residual deviance represent?
• asked a question related to Estimation
Question
Dear community,
I am facing some issues and scientific questionings regarding the dose-response analysis using the drc package from R.
Context :
I want to know if two strains have different responses to three drugs. To do so, I am using the drc package from R. I then determine the EC50 for each strain and each drug. I later plot my EC50 values and use the estimate and standard error to determine whether the EC50 is statistically different between strains. For each strain and drug I have four technical replicates, and I will have three biological replicates. Visually, the model produced by the package matches my experimental data. However, I am looking for a statistical approach to determine whether the model given by drc is not too far from my experimental data. How can I know whether I can be confident in the model?
My approach :
I am using mselect() to determine which drc model describes my data most accurately. However, I do not know how to interpret the results. I read that the higher the logLik is, the better the model describes the data provided. But do you know whether a threshold exists?
For example I have from the mselect() :
> mselect(KCl96WT.LL.4, list(LL.3(), LL.5(), W1.3(), W1.4(), W2.4(), baro5()), linreg = TRUE)
       logLik     IC         Lack of fit  Res var
LL.3   101.90101  -195.8020  0            0.0003878212
LL.3   101.90101  -195.8020  0            0.0003878212
W1.3   101.53204  -195.0641  0            0.0003950424
W2.4   102.48671  -194.9734  0            0.0003870905
LL.5   103.05880  -194.1176  0            0.0003869226
W1.4   101.52267  -193.0453  0            0.0004062060
Cubic  101.42931  -192.8586  NA           0.0004081066
baro5  101.98930  -191.9786  0            0.0004081766
Lin     96.45402  -186.9080  NA           0.0004958264
I also used the glht() element and coeftest(KCl96WT.LL.4, vcov= sandwich). But I am facing the same issue.
> coeftest(KCl96WT.LL.4, vcov= sandwich)
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
b:(Intercept) 4.3179185 0.6187043 6.979 3.024e-08 ***
d:(Intercept) 0.0907908 0.0080186 11.323 1.397e-13 ***
e:(Intercept) 0.9809981 0.0686580 14.288 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Do you know what approach could indicate if I can be statistically confident regarding my model? Can I be mathematically confident in the EC50 given by the package?
Thanks for your time! I am looking forward to discover new ways to be more critical regarding my data analysis. If you have any questions or comment regarding my approach, feel free to ask me !
• asked a question related to Estimation
Question
I am trying to conduct a CFA analysis using RStudio. However, instead of giving me all the fit indices I am supposed to get, even with the summary function I don't get the GFI | AGFI | NFI | NNFI | CFI | RMSEA. Can anyone please help me with this issue?
Estimator ML
Optimization method NLMINB
Number of free parameters 25
Number of observations 275
Model Test User Model:
Test statistic 228.937
Degrees of freedom 41
P-value (Chi-square) 0.000
By default, lavaan will always fix the factor loading of the first indicator to 1. In order to fix a parameter in a lavaan formula, you need to pre-multiply the corresponding variable in the formula by a numerical value. This is called the pre-multiplication mechanism.
# fit the model
fit <- cfa(HS.model, data=HolzingerSwineford1939)
# display summary output
summary(fit, fit.measures=TRUE)
• asked a question related to Estimation
Question
Hi everyone
I want to know which techniques of Machine Learning/Deep Learning I can use for more accurate SOC estimation of Lithium Ion batteries.
Thanks and Regards
It depends on the size of the dataset you have. There are works using simple machine learning models, such as GPR, SVM, etc. to estimate the SOC. When the dataset is large and covers a lot of scenarios, deep learning models may have a better performance. Have a look here:
• asked a question related to Estimation
Question
How might one go about estimating how recently an individual was infected with a given pathogen given antigen and antibody titers relevant to said pathogen?
Sounds like an exam question. Go back to your text book and read about the time course of the immune response. Then the answer should be obvious.
• asked a question related to Estimation
Question
What do you think about estimating the level of automation? How should one carry out the process of estimating the level of automation, autonomy and intellectualization?
It is important to get a quantitative result.
1. Estimate the technical level of the equipment, software;
2. Taxonomy processes / business processes and decide on the level of automation;
3. Estimate the completeness of automation by calculating the proportion of controlled signals from the norm;
4. The share of computer time from the total time of the operation.
Etc.
• asked a question related to Estimation
Question
Is it possible to estimate the shape, scale, and location parameters of a generalized extreme value (GEV) distribution if I just know the mean, variance, and median of a given data set (i.e., no raw data available, just its descriptive statistics)?
Christopher
You can use the maximum likelihood estimator, probability-weighted moments, or other methods. The choice depends on many features inherent to the data. But first try MLE.
• asked a question related to Estimation
Question
Currently, I'm doing my thesis using Google Earth Engine to calculate the days since the cessation of storms from the CHIRPS Daily Precipitation Image Collection. The code flow is as follows:
Conceptually, it works like this. The first date of the image collection is set as day zero for the cessation of storms. Moving to the image of the next date, the code examines the precipitation values. If the precipitation is zero, the day count since the cessation of storms is increased by one day, and it keeps increasing until the precipitation on a following day becomes non-zero. In that case, the day count since the cessation of storms starts again from zero.
The result I've obtained is an image collection in which all the elements have calculated bands with a value of zero, which should not be the case. I would like some suggestions on this problem, please.
Reference paper: Estimation of snow water equivalent from MODIS albedo for a non-instrumented watershed in Eastern Himalayan Region
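The counting rule described in the question can be written down for a single pixel in plain Python, which may help when checking the Earth Engine result against a few hand-picked locations. This is only the per-pixel logic on a list of daily values, not Earth Engine code; the function name and the sample series are made up. In Earth Engine the same logic would have to carry the previous day's counter forward from image to image.

```python
def days_since_storm(daily_precip):
    """Counter is 0 on the first day and on wet days, else previous counter + 1."""
    counts = []
    counter = 0
    for day, precip in enumerate(daily_precip):
        if day == 0 or precip > 0:
            counter = 0          # first date, or a storm resets the count
        else:
            counter += 1         # another dry day since the last storm
        counts.append(counter)
    return counts

# Hypothetical daily precipitation (mm) for one pixel.
series = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 2.0, 0.0]
counts = days_since_storm(series)
print(counts)
```

A result that is zero everywhere, as reported above, is what this rule would give if the counter from the previous day were never passed on to the next image, so that is one place to look.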
Very interesting questions.
• asked a question related to Estimation
Question
Dear all,
I would like to estimate prediction accuracy using EBVs (Estimated Breeding Values) computed from pedigree-based and genomic-based models. In the models used to estimate those EBVs, I fitted a number of fixed effects (e.g. age, batch, tank,...). I wonder whether re-fitting those fixed effects as predictors in the cross-validation will lead to overfitting. If I use no predictors, how can I do cross-validation between the 2 sets of EBVs? Any suggestions?
Thanks and regards,
Vu
My problem seems a little bit different, since I have used other software to predict the y values, and it cannot be incorporated into the R/Python environment. Regression models in the usual R cross-validation packages do not incorporate the relationship/genomic matrix of individuals that I have to use for genetic evaluation. I can still split the data by hand into reference and validation sets (5 folds, for example) and estimate the accuracy based on the correlation between actual y and predicted y in the validation data set, but I cannot perform, for example, 3 replicates, as that will give the same y values. Kind regards, Vu.
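If the prediction itself has to stay in external software, the splitting and the accuracy calculation can still be scripted separately. The Python sketch below only does those two parts: it draws fold assignments from a random seed (so that different seeds give different replicates of the 5-fold split) and computes the Pearson correlation between observed and predicted values. The function names, the number of animals and the toy values are assumptions; the predicted values for each validation fold would have to come from the genetic evaluation software with that fold's phenotypes masked.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kfold_indices(n, k, seed):
    """Randomly assign n record indices to k validation folds; seed sets the replicate."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]

# Fold assignments for 20 hypothetical animals; use another seed for another replicate.
folds = kfold_indices(n=20, k=5, seed=1)

# Toy check of the accuracy measure on made-up observed and predicted values.
accuracy = pearson([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
print(folds[0], round(accuracy, 3))
```

Each replicate then requires re-running the external prediction once per fold with the new split, which is what makes the replicates differ from one another.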
• asked a question related to Estimation
Question
I have been reading this paper on how to analyze linear relationships using a repeated measure ANOVA: .
I was wondering, though: once you establish a linear relationship across your categorical conditions (A, B, C, D), how can you check whether the differences between conditions (A vs. B vs. C vs. D) are also significant?
I have been using pairwise t-tests (A vs. B; B vs. C; C vs. D), but is there a better test to look at this?
Just for completeness, I have been using "ols" from "statsmodels" to check for the linear relationship, and "pairwise_ttests" from "pingouin" to run post-hoc tests in Python.
There are several contrast schemes (Helmert, Deviation, Difference, etc.), each of which tests a different set of comparisons. "Repeated" may be what you are looking for: it tests A vs. B, B vs. C, and C vs. D and nothing else. Just look it up.
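As a rough illustration of what "Repeated" contrasts compare, the sketch below takes hypothetical repeated-measures scores (one row per subject, columns A to D) and computes the mean difference and paired t statistic for each adjacent pair only. The data and the function name `repeated_contrasts` are made up for the example, and this is not the output of any particular statistics package.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data: one row per subject, columns are conditions A, B, C, D
scores = [
    [1.0, 2.0, 3.5, 4.0],
    [2.0, 3.5, 4.0, 5.5],
    [1.5, 2.0, 3.0, 4.5],
    [2.5, 3.5, 5.0, 5.5],
    [1.0, 2.5, 3.0, 4.0],
]
labels = ["A", "B", "C", "D"]


def repeated_contrasts(rows, labels):
    """Compare each condition with the next one only (A vs B, B vs C, C vs D)."""
    results = []
    for j in range(len(labels) - 1):
        diffs = [row[j + 1] - row[j] for row in rows]   # within-subject differences
        n = len(diffs)
        t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))  # paired t statistic
        results.append((f"{labels[j]} vs {labels[j + 1]}", mean(diffs), t_stat, n - 1))
    return results


results = repeated_contrasts(scores, labels)
for name, mean_diff, t_stat, df in results:
    print(name, round(mean_diff, 3), round(t_stat, 3), "df =", df)
```

Each t statistic would be referred to a t distribution with n - 1 degrees of freedom, and with three comparisons a multiple-comparison correction is usually worth considering.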
• asked a question related to Estimation
Question
Is it possible to back-calculate/estimate the amount/dosage of a molecule consumed by looking at ante- or post- mortem toxicology blood levels? If you don't know, can you suggest a contact that might know? (I appreciate there will be issues around blood redistribution and site of sampling, etc.)
Pharmacokinetic equations can be inverted, so you can solve for any one unknown parameter as long as the others are known.
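As one illustration of inverting such an equation, a one-compartment model with first-order elimination can be rearranged for the dose. This is a textbook simplification rather than a forensic method. It assumes a single dose, complete distribution, and known values of the bioavailability $F$, the volume of distribution $V_d$, the elimination rate constant $k$ and the time since dosing $t$, and it ignores the redistribution and sampling-site issues mentioned in the question.

```latex
C(t) = \frac{F \, D}{V_d} \, e^{-k t}
\quad\Longrightarrow\quad
D = \frac{C(t) \, V_d}{F} \, e^{k t}
```

Here $C(t)$ is the measured blood concentration and $D$ is the dose being estimated. Uncertainty in any of the assumed parameters carries over directly into the estimate of $D$.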
• asked a question related to Estimation
Question
I have created three logistic models, models 4, 1 and 2, and calculated AICc values for each. Model 4, with two covariates (location and camera), and model 1, with a single covariate (location), have approximately equivalent AICc values (they differ by less than 2 points). In this case one should choose the model with the fewest parameters, which is model 1 with only location included. However, to make things more confusing, the likelihood ratio tests for model 4 vs. 1 and model 4 vs. 2 suggest that having location and camera in the same model is better than having just location or just camera. This contradicts the AICc values. So which model would you choose? I provide an example below. Thanks in advance.
> # location as a covariate on abundance
> m1 <- occuRN(~1 ~location, final)
> m1
Call:
occuRN(formula = ~1 ~ location, data = final)
Abundance:
            Estimate    SE     z  P(>|z|)
(Intercept)     2.01 0.704  2.86 4.24e-03
location2      -2.19 0.547 -4.02 5.94e-05
Detection:
 Estimate    SE     z P(>|z|)
    -2.32 0.756 -3.07 0.00215
AIC: 162.7214
> # camera as a covariate on detection
> m2 <- occuRN(~camera ~1, final)
> m2
Call:
occuRN(formula = ~camera ~ 1, data = final)
Abundance:
 Estimate    SE    z P(>|z|)
    0.682 0.371 1.84  0.0657
Detection:
            Estimate    SE      z  P(>|z|)
(Intercept)   -2.589 0.763 -3.392 0.000694
camera2        1.007 0.774  1.301 0.193247
camera3        2.007 0.785  2.557 0.010555
camera4        0.639 0.803  0.796 0.425864
AIC: 178.696
> # camera as a covariate on detection, location as covariate on abundance
> m4 <- occuRN(~camera ~location, final)
> m4
Call:
occuRN(formula = ~camera ~ location, data = final)
Abundance:
            Estimate    SE     z  P(>|z|)
(Intercept)     2.71 0.319  8.49 2.06e-17
location2      -2.25 0.509 -4.41 1.03e-05
Detection:
            Estimate    SE      z  P(>|z|)
(Intercept)   -4.050 0.616 -6.571 5.00e-11
camera2        1.030 0.620  1.660 9.69e-02
camera3        1.776 0.613  2.897 3.76e-03
camera4        0.592 0.642  0.922 3.57e-01
AIC: 157.2511
> model_list<-list(null,m4,m2,m1)
> model_names<-c("modelnull","model4","model2","model1")
> modelsel<-aictab(model_list, model_names, second.ord=T)
> modelsel
Model selection based on AICc:
            K   AICc Delta_AICc AICcWt Cum.Wt     LL
model4      6 163.25       0.00   0.61   0.61 -72.63
model1      3 164.13       0.88   0.39   1.00 -78.36
modelnull   2 181.06      17.81   0.00   1.00 -88.20
model2      5 182.70      19.44   0.00   1.00 -84.35
I have two criteria: the AICc and the SIC. Supposedly the model with the lowest AICc or SIC value is chosen, but which criterion is better?
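To put the AICc comparison and the likelihood ratio test for model 4 vs. model 1 side by side, both can be recomputed from the log-likelihoods (LL) and parameter counts (K) in the table above. This is a minimal sketch: the sample size of 21 is an assumption inferred from the gap between the reported AIC and AICc values, not a number stated in the question, and the chi-square tail formula used here holds only for 3 degrees of freedom.

```python
from math import erfc, exp, pi, sqrt


def aicc(log_lik, k, n):
    """AIC with the small-sample correction term 2k(k+1)/(n-k-1)."""
    aic = 2 * k - 2 * log_lik
    return aic + 2 * k * (k + 1) / (n - k - 1)


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


N_SITES = 21                 # assumption: inferred from AICc - AIC in the output
ll_m4, k_m4 = -72.63, 6      # camera on detection, location on abundance
ll_m1, k_m1 = -78.36, 3      # location on abundance only

aicc_m4 = aicc(ll_m4, k_m4, N_SITES)
aicc_m1 = aicc(ll_m1, k_m1, N_SITES)

# Likelihood ratio test: model 1 is nested in model 4
lrt_stat = 2 * (ll_m4 - ll_m1)
lrt_df = k_m4 - k_m1         # 3 extra parameters (camera2, camera3, camera4)
p_value = chi2_sf_3df(lrt_stat)

print(round(aicc_m4, 2), round(aicc_m1, 2), round(lrt_stat, 2), lrt_df, p_value)
```

With these inputs the uncorrected AIC values differ by about 5.5 in favour of model 4, and the small-sample penalty on the three extra parameters narrows the gap to under 1 AICc unit. Both criteria still rank model 4 first, so the "within 2 units" rule of thumb and the likelihood ratio test are answering slightly different questions rather than strictly contradicting each other.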
• asked a question related to Estimation
Question
Dear All
Can you help me clarify the basic differences among the following commands in Stata: xtmg, xtpmg, xtcce, xtdcce and xtdcce2? When should each be used, and which method should we choose to estimate MG, AMG, PMG, CCE, etc.?
• asked a question related to Estimation
Question
A few recent research articles have shown that rain attenuation prediction models, even the ITU-R model, overestimate attenuation for short links. Most such experimental links in the literature are shorter than 2 km, or are even building-to-building links of about 200 m. Many researchers have identified the issue and proposed solutions that correct the effective path length. But I think the issue has not yet been addressed technically.
Why does such a limiting path length exist, below which existing models show a discrepancy in predicting attenuation? Thank you.