Science topic

Estimation - Science topic

Explore the latest questions and answers in Estimation, and find Estimation experts.
Questions related to Estimation
  • asked a question related to Estimation
Question
5 answers
(E1U) (E2G) (E2G) (B2U) (A1G) (E1U) (E1U) (E2G)
(E2G) (B2U)
Requested convergence on RMS density matrix=1.00D-08 within 128 cycles.
Requested convergence on MAX density matrix=1.00D-06.
Requested convergence on energy=1.00D-06.
No special actions if energy rises.
SCF Done: E(RB3LYP) = -13319.3349271 A.U. after 1 cycles
Convg = 0.2232D-08 -V/T = 2.0097
Range of M.O.s used for correlation: 1 5424
NBasis= 5424 NAE= 1116 NBE= 1116 NFC= 0 NFV= 0
NROrb= 5424 NOA= 1116 NOB= 1116 NVA= 4308 NVB= 4308
PrsmSu: requested number of processors reduced to: 4 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
.
.
.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
Symmetrizing basis deriv contribution to polar:
IMax=3 JMax=2 DiffMx= 0.00D+00
G2DrvN: will do 1 centers at a time, making 529 passes doing MaxLOS=2.
Estimated number of processors is: 3
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
.
.
.
Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
Erroneous write. Write 898609344 instead of 2097152000.
fd = 4
orig len = 3177921600 left = 3177921600
g_write
Relevant answer
Answer
Thank you Sumit Naskar for the information.
  • asked a question related to Estimation
Question
2 answers
In a paper downloaded from ResearchGate, there is a statement about independent estimators.
In my opinion, the two estimators are not independent.
See the attached file.
What do colleagues think?
Thank you
MS
Relevant answer
Answer
AFTER another month..................
AGAIN No answer...
Reasons?
  1. It is not important for practitioners ...
  2. It is not interesting for Scholars ...
  3. Nobody knows what to answer ...
  4. Other ...
  • asked a question related to Estimation
Question
6 answers
My latest attempts:
1)(More general)
2)(More specific)
  • asked a question related to Estimation
Question
5 answers
I have been studying a particular set of issues in methodology, and looking to see how various texts have addressed this. I have a number of sampling books, but only a few published since 2010, with the latest being Yves Tillé, Sampling and Estimation from Finite Populations, 2020, Wiley.
In my early days of survey sampling, William Cochran's Sampling Techniques, 3rd ed, 1977, Wiley, was popular. I would like to know which books are most popularly used today to teach survey sampling (sampling from finite populations).
I posted almost exactly the same message as above to the American Statistical Association's ASA Connect and received a few recommendations, notably Sampling: Design and Analysis, Sharon Lohr, whose 3rd ed, 2022, is published by CRC Press. Also of note was Sampling Theory and Practice, Wu and Thompson, 2020, Springer.
Any other recommendations would also be appreciated. 
Thank you  -  Jim Knaub
Relevant answer
Answer
Here are some recommended ones:
1. "Sampling Techniques" by William G. Cochran. This classic book covers a wide range of sampling methods with practical examples. It's comprehensive and delves into both theory and application, making it valuable for students and professionals.
2. "Survey Sampling" by Leslie Kish. This is another foundational text, known for its detailed treatment of survey sampling design and estimation methods. Kish's book is especially useful for those interested in practical survey applications.
3. "Model Assisted Survey Sampling" by Carl-Erik Särndal, Bengt Swensson, and Jan Wretman. This book introduces model-assisted methods for survey sampling, which blend traditional design-based methods with model-based techniques. It's ideal for more advanced readers interested in complex survey designs.
4. "Sampling of Populations: Methods and Applications" by Paul S. Levy and Stanley Lemeshow. This text is widely used in academia and provides thorough explanations of different sampling methods with a focus on real-world applications. It also includes case studies and practical exercises, making it helpful for hands-on learners.
5. "Introduction to Survey Sampling" by Graham Kalton. This introductory book offers a concise and accessible overview of survey sampling methods. It's well-suited for beginners who need a straightforward introduction to key concepts.
6. "Designing Surveys: A Guide to Decisions and Procedures" by Johnny Blair, Ronald F. Czaja, and Edward A. Blair. This book focuses on the practical aspects of designing and conducting surveys, with particular emphasis on decision-making and procedural choices in the survey process.
  • asked a question related to Estimation
Question
1 answer
What are the environmental implications of this substitution?
Relevant answer
Answer
To estimate the reduction in greenhouse gas (GHG) emissions from replacing chemical fertilizers with biocompost in a typical wheat and rice cropping system, several factors must be considered, including the emissions associated with the production and use of chemical fertilizers as well as the emissions generated by the use of biocompost.
1. Emissions from Chemical Fertilizers
Chemical fertilizers, especially nitrogen-based ones, are responsible for significant emissions of nitrous oxide (N₂O), a potent greenhouse gas. According to the literature, between 1.0 and 1.5 kg of N₂O can be emitted for each kilogram of nitrogen applied.
Example calculation:
  • Suppose a wheat or rice crop uses 100 kg of nitrogen per hectare.
  • N₂O emissions from chemical fertilizers: Emissions = 100 kg N × 1.25 kg N₂O/kg N = 125 kg N₂O/ha
2. Emissions from Biocompost
The use of biocompost generally results in lower N₂O emissions compared with chemical fertilizers. This is because biocompost improves soil health and nutrient retention capacity, which can reduce the need for additional nitrogen applications.
Emission estimate:
  • The use of biocompost is estimated to reduce N₂O emissions by 30-50% compared with chemical fertilizers.
  • Taking an average reduction of 40%: Emissions with biocompost = 125 kg N₂O/ha × (1 − 0.40) = 75 kg N₂O/ha
3. Calculating the Emission Reduction
The emission reduction from replacing chemical fertilizers with biocompost would be:
Emission reduction = Emissions with chemical fertilizers − Emissions with biocompost = 125 kg N₂O/ha − 75 kg N₂O/ha = 50 kg N₂O/ha
4. Conversion to CO₂ Equivalent
To get a clearer idea of the impact, the N₂O emissions can be converted to CO₂ equivalent. The global warming potential (GWP) of N₂O is approximately 298 times that of CO₂.
CO₂ equivalent = 50 kg N₂O/ha × 298 kg CO₂e/kg N₂O = 14,900 kg CO₂e/ha
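As an illustration only, here is a minimal R sketch that reproduces the arithmetic above; the emission factor, the 40% reduction, and the GWP of 298 are the assumptions stated in this answer, not independently verified values.
```
# Reproduces the worked example above; all input figures are the answer's assumptions.
n_applied       <- 100    # kg N applied per hectare
ef_chemical     <- 1.25   # kg N2O emitted per kg N (assumed factor)
reduction_frac  <- 0.40   # assumed emission reduction with biocompost
gwp_n2o         <- 298    # global warming potential of N2O relative to CO2

emis_chemical   <- n_applied * ef_chemical               # 125 kg N2O/ha
emis_biocompost <- emis_chemical * (1 - reduction_frac)  # 75 kg N2O/ha
reduction_n2o   <- emis_chemical - emis_biocompost       # 50 kg N2O/ha
reduction_co2e  <- reduction_n2o * gwp_n2o               # 14,900 kg CO2e/ha
reduction_co2e
```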
  • asked a question related to Estimation
Question
1 answer
I want to estimate surface heat fluxes using MyLake, but I don't have all the initial values in the model parameters section and other sections. Is there a way around this?
Relevant answer
Answer
I think you could let ChatGPT or other AI tools make some sample data.
  • asked a question related to Estimation
Question
5 answers
Hi everyone
I want to know which Machine Learning/Deep Learning techniques I can use for more accurate SOC (state of charge) estimation of lithium-ion batteries.
Thanks and Regards
Relevant answer
Answer
Basically, any type of machine learning used for time-series data can be used to estimate battery SOC. Among RNN-based approaches, LSTM and GRU (and sometimes Bi-GRU) are mainly used. If you are thinking of conventional machine learning, you may consider GPR, XGBoost, or ARIMA.
Nowadays, hybrid machine learning models are gaining popularity, where two or three different models are combined, either in series or in parallel. In most cases, people combine CNN- and RNN-based models to do that. For more details you may read my recent preprint https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4855442
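To make the tabular, non-RNN route concrete, here is a hypothetical R sketch that frames SOC estimation as supervised regression on lagged voltage/current/temperature features with the xgboost package; the simulated data, column names, and lag choices are illustrative assumptions, not taken from the preprint above.
```
library(xgboost)

# Simulated stand-in for lab cycling data with a reference SOC label; replace
# with your own measurements (e.g. SOC from Coulomb counting).
set.seed(1)
n   <- 2000
soc <- seq(1, 0, length.out = n)                          # discharge from 100% to 0%
dat <- data.frame(voltage = 3.0 + 1.2 * soc + rnorm(n, sd = 0.01),
                  current = -2 + rnorm(n, sd = 0.05),
                  temp    = 25 + rnorm(n, sd = 0.5),
                  soc     = soc)

# Simple lagged features so the model sees recent history
lag1 <- function(x) c(NA, head(x, -1))
dat$voltage_lag1 <- lag1(dat$voltage)
dat$current_lag1 <- lag1(dat$current)
dat <- na.omit(dat)

X <- as.matrix(dat[, c("voltage", "current", "temp", "voltage_lag1", "current_lag1")])
y <- dat$soc

fit <- xgboost(data = X, label = y, nrounds = 200,
               objective = "reg:squarederror", verbose = 0)
dat$soc_pred <- predict(fit, X)   # in practice, evaluate on held-out cycles
```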
  • asked a question related to Estimation
Question
1 answer
Dear all
Does anyone know how to implement a minimum distance estimator in Stata? I have the parameters and the var/cov matrix from an unrestricted model and would like to estimate the following function:
Br = argmin (Br − Bu)' IM (Br − Bu), subject to fi(xi, Br) ≥ 0
with IM the inverse of the var/cov matrix, Bu the betas from the unrestricted model, Br the betas from the restricted model, and fi the derivative of the dependent variable with respect to each of the regressors (this is a translog production function, hence these derivatives do not correspond to the first-order coefficient of the corresponding regressor but depend also on the cross effects, but that is not really relevant). How can I do this in Stata? Please lend me a hand. This is the last thing I need to finish and publish the first article of my PhD.
I posted on the Stata forum but haven't received a response yet, and this is really urgent for me.
Please.
Thank you all,  Juan
Relevant answer
Answer
Hi did you ever find the answer? If so, care to share? Thanks!
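This is not a Stata answer, but for anyone landing here later: when the constraints fi(xi, Br) ≥ 0 are linear in the restricted coefficients (as the monotonicity conditions of a translog evaluated at the data points are), the problem above is a quadratic program. A minimal R sketch with the quadprog package, where Bu, V, and the constraint matrix C are toy stand-ins for the quantities exported from the unrestricted estimation:
```
library(quadprog)

# Toy stand-ins: Bu (unrestricted coefficients), V (their var/cov matrix),
# and C, a matrix such that C %*% Br >= 0 encodes the linear constraints.
Bu <- c(0.4, -0.2, 0.1)
V  <- diag(c(0.04, 0.09, 0.01))
C  <- rbind(c(1, 1, 0),
            c(0, 1, 1))

IM <- solve(V)                      # inverse of the var/cov matrix

Dmat <- 2 * IM                      # solve.QP minimizes 0.5*b'Db - d'b,
dvec <- 2 * as.vector(IM %*% Bu)    # so D = 2*IM and d = 2*IM*Bu reproduce (Br-Bu)'IM(Br-Bu)
Amat <- t(C)                        # columns of Amat are the individual constraints
bvec <- rep(0, nrow(C))

fit <- solve.QP(Dmat, dvec, Amat, bvec, meq = 0)
Br  <- fit$solution                 # restricted (minimum-distance) estimates
Br
```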
  • asked a question related to Estimation
Question
3 answers
Hi, I am stuck with my Amos Analysis and need help. I am unable to get any path coefficients or standardized estimates for my model. The model runs without any hassle but output file is almost empty. Every path coefficient is marked as 'Unidentified'. Seek help, please look into the attached diagram and suggest the possible reasons. Thank You.
Relevant answer
Answer
Yes, I found that the model needs at least three additional constraints.
  • asked a question related to Estimation
Question
7 answers
I am looking to estimate the diameter (nm) of a variety of double stranded plasmids (pUC19, pMAL pIII, pKLAC2, etc.) when they are natively supercoiled and when they are relaxed.
If someone could point me towards a formula it would be much appreciated! Thanks. 
Relevant answer
Answer
Calculating the diameter of plasmids typically involves determining the length of the plasmid DNA molecule. Plasmids are circular, double-stranded DNA molecules, and their size is commonly expressed in terms of base pairs (bp). Each base pair corresponds to approximately 0.34 nanometers (nm) of linear distance along the DNA molecule's length.
Here's how you can calculate the diameter of a plasmid:
  1. Determine the size of the plasmid: The size of the plasmid is usually provided in terms of base pairs (bp). For example, if a plasmid is 5,000 base pairs long, its length would be 5,000 bp.
  2. Convert base pairs to linear length: Multiply the number of base pairs by the length of each base pair, which is approximately 0.34 nm. This gives you the linear length of the plasmid DNA in nanometers. Linear length (nm) = Number of base pairs × 0.34 nm/bp
  3. Calculate the diameter: Since the plasmid is circular, its diameter can be calculated using the formula for the circumference of a circle. Diameter (nm) = Linear length (nm) / π
Here's a step-by-step example: Let's say you have a plasmid with 3,000 base pairs.
  1. Determine the linear length: Linear length = 3,000 bp × 0.34 nm/bp = 1,020 nm
  2. Calculate the diameter: Diameter = 1,020 nm / π ≈ 324.7 nm
So, the estimated diameter of the plasmid is approximately 324.7 nanometers.
It's important to note that this calculation provides an approximation of the plasmid's diameter based on its linear length. In reality, the plasmid molecule is not a perfect circle, and its shape and size can be influenced by factors such as supercoiling and protein binding. Additionally, experimental techniques like electron microscopy can provide more accurate measurements of plasmid size and shape.
Take a look at this protocol list; it could assist in understanding and solving the problem.
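A minimal R helper implementing the circle approximation described above (0.34 nm per base pair). Supercoiled plasmids are far more compact, so treat this as an idealized estimate for the relaxed, circular form.
```
# Idealized diameter of a relaxed, perfectly circular plasmid.
plasmid_diameter_nm <- function(n_bp, nm_per_bp = 0.34) {
  contour_length <- n_bp * nm_per_bp  # total DNA length in nm (circumference)
  contour_length / pi                 # diameter of the equivalent circle
}

plasmid_diameter_nm(3000)   # ~324.7 nm
plasmid_diameter_nm(2686)   # e.g. pUC19 (2,686 bp) -> ~291 nm
```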
  • asked a question related to Estimation
Question
2 answers
Semi-Supervised Learning in Statistics
Relevant answer
Answer
You can deploy a logistic regression algorithm for this task.
  • asked a question related to Estimation
Question
1 answer
DOI 10.1109/ACA57612.2023.10346958
Relevant answer
Answer
I'd be glad to assist you with understanding the losses in switched reluctance motors (SRMs) and the research article you mentioned. However, I'm unable to access the full content of the article due to restrictions. To provide the most accurate and relevant information, I'll need more context about the specific losses you're interested in and any key findings or methodologies discussed in the article.
Here are some general points to consider regarding SRM losses:
Types of Losses in SRMs:
  • Copper losses: Caused by the resistance of the stator and rotor windings, increasing with current squared (I^2). Minimized by using larger conductors, optimizing winding design, and reducing operating temperature.
  • Iron losses: Comprise hysteresis losses (due to magnetization reversal in the iron core) and eddy current losses (induced currents within the core). Minimized by using high-grade electrical steel, proper lamination thickness, and core shaping to reduce eddy current paths.
  • Mechanical losses: Include windage friction, bearing friction, and cogging torque (torque pulsations due to rotor position). Reduced by using low-friction bearings, optimized air gaps, and minimizing cogging through rotor pole shaping.
  • Switching losses: Occur during commutation (turning on/off of phases) due to high voltage and current transients. Minimized by using advanced gate drivers, optimizing switching times, and employing soft-switching techniques.
Factors Affecting Losses:
  • Operating speed and torque: Higher speeds and torques typically lead to higher losses due to increased current and magnetic interaction.
  • Temperature: Iron losses rise significantly with temperature, while copper losses increase moderately. Proper thermal management is crucial.
  • Design parameters: Motor geometry, material selection, and winding design all impact losses. Optimizing these factors is essential for efficiency.
Estimation Methods:
  • Analytical models: Based on simplifying assumptions and empirical formulas, they provide quick estimates but may not be highly accurate for complex designs.
  • Finite element analysis (FEA): Provides detailed simulations of electromagnetic and thermal behavior, offering more accurate loss predictions but requiring significant computational resources.
  • Experimental measurements: Direct measurement of motor losses using specialized equipment is the most accurate method but can be expensive and time-consuming.
  • asked a question related to Estimation
Question
3 answers
Hello
I want to estimate LST from Landsat 8 Collection 2 Level 2 data.
What are the steps and the formula?
with regards
Relevant answer
Answer
ArcGIS Pro, ENVI, or MATLAB.
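For what it is worth, Landsat 8/9 Collection 2 Level-2 products already include a surface temperature band (ST_B10) that only needs rescaling. A minimal R sketch with the terra package, using the scale factor and offset documented by USGS for Collection 2 Level-2 (verify them for your specific product; the file path is a placeholder for your own band):
```
library(terra)

st_b10 <- rast("ST_B10.TIF")                 # path to your Collection 2 Level-2 ST_B10 band

lst_kelvin  <- st_b10 * 0.00341802 + 149.0   # USGS C2 L2 scale factor and offset
lst_celsius <- lst_kelvin - 273.15

plot(lst_celsius)
```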
  • asked a question related to Estimation
Question
3 answers
I am looking for articles related to the above-mentioned title.
Your assistance will be highly appreciated.
Relevant answer
  • asked a question related to Estimation
Question
2 answers
Estimation of the Bubble Point Pressure of Multicomponent Reservoir Hydrocarbon Fluids
Relevant answer
Answer
Vasquez and Beggs empirical correlations have proven the most reliable for me in my comparison with PVT measurements from black and volatile oils.
Here is a link to get you started...
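As an illustration only, a sketch of the Vasquez and Beggs (1980) bubble-point form in R. The coefficients below are the values commonly tabulated for that correlation; please check them against the original paper or a PVT reference before any engineering use.
```
# Bubble-point pressure (psia) from the Vasquez-Beggs solution-GOR correlation,
# inverted for Pb. Inputs: Rs (scf/STB), gas specific gravity (air = 1),
# stock-tank oil API gravity, and temperature in degrees F.
vasquez_beggs_pb <- function(Rs, gamma_g, api, temp_F) {
  if (api <= 30) { C1 <- 0.0362; C2 <- 1.0937; C3 <- 25.7240 }
  else           { C1 <- 0.0178; C2 <- 1.1870; C3 <- 23.9310 }
  (Rs / (C1 * gamma_g * exp(C3 * api / (temp_F + 460))))^(1 / C2)
}

vasquez_beggs_pb(Rs = 500, gamma_g = 0.75, api = 35, temp_F = 180)
```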
  • asked a question related to Estimation
Question
5 answers
Greetings,
I am currently in the process of conducting a Confirmatory Factor Analysis (CFA) on a dataset consisting of 658 observations, using a 4-point Likert scale. As I delve into this analysis, I have encountered an interesting dilemma related to the choice of estimation method.
Upon examining my data, I observed a slight negative kurtosis of approximately -0.0492 and a slight negative skewness of approximately -0.243 (please refer to the attached file for details). Considering these properties, I initially leaned towards utilizing the Diagonally Weighted Least Squares (DWLS) estimation method, as existing literature suggests that it takes into account the non-normal distribution of observed variables and is less sensitive to outliers.
However, to my surprise, when I applied the Unweighted Least Squares (ULS) estimation method, it yielded significantly better fit indices for all three factor solutions I am testing. In fact, it even produced a solution that seemed to align with the feedback provided by the respondents. In contrast, DWLS showed no acceptable fit for this specific solution, leaving me to question whether the assumptions of ULS are being violated.
In my quest for guidance, I came across a paper authored by Forero et al. (2009; DOI: 10.1080/10705510903203573), which suggests that if ULS provides a better fit, it may be a valid choice. However, I remain uncertain about the potential violations of assumptions associated with ULS.
I would greatly appreciate your insights, opinions, and suggestions regarding this predicament, as well as any relevant literature or references that can shed light on the suitability of ULS in this context.
Thank you in advance for your valuable contributions to this discussion.
Best regards, Matyas
Relevant answer
Answer
Thank you for your question. I have searched the web for information about the Diagonally Weighted Least Squares (DWLS) and Unweighted Least Squares (ULS) estimators, and I have found some relevant sources that may help you with your decision.
One of the factors that you should consider when choosing between DWLS and ULS is the sample size. According to Forero et al. (2009) [1], DWLS tends to perform better than ULS when the sample size is small (less than 200), but ULS tends to perform better than DWLS when the sample size is large (more than 1000). Since your sample size is 658, it falls in the intermediate range, where both methods may provide similar results.
Another factor that you should consider is the degree of non-normality of your data. According to Finney and DiStefano (2006) [2], DWLS is more robust to non-normality than ULS, especially when the data are highly skewed or kurtotic. However, ULS may be more efficient than DWLS when the data are moderately non-normal or close to normal. Since your data have slight negative skewness and kurtosis, it may not be a serious violation of the ULS assumptions.
A third factor that you should consider is the model fit and parameter estimates. According to Forero et al. (2009) [1], both methods provide accurate and similar results overall, but ULS tends to provide more accurate and less variable parameter estimates, as well as more precise standard errors and better coverage rates. However, DWLS has higher convergence rates than ULS, which means that it is less likely to encounter numerical problems or estimation errors.
Based on these factors, it seems that both DWLS and ULS are reasonable choices for your data and model, but ULS may have some advantages over DWLS in terms of efficiency and accuracy. However, you should also check the sensitivity of your results to different estimation methods, and compare them with other criteria such as theoretical plausibility, parsimony, and interpretability.
I hope this answer helps you with your analysis. If you need more information, you can refer to the sources that I have cited below.
[1] Factor analysis with ordinal indicators: A Monte Carlo study comparing DWLS and ULS estimation, by Carlos G. Forero, Alberto Maydeu-Olivares & David Gallardo-Pujol, in British Journal of Mathematical and Statistical Psychology (2009)
[2] Non-normal and categorical data in structural equation modeling, by Sara J. Finney & Christine DiStefano, in Structural equation modeling: A second course (2006)
Good luck
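If it helps, the two estimators can also be compared side by side in R with lavaan, treating the 4-point items as ordered. This is only a minimal sketch with simulated placeholder items; "WLSMV" is lavaan's DWLS with robust corrections and "ULSMV" the corresponding robust ULS variant.
```
library(lavaan)

# Simulated 4-point items standing in for the real questionnaire data
set.seed(123)
f <- rnorm(658)
mydata <- as.data.frame(lapply(1:5, function(i)
  cut(0.7 * f + rnorm(658), breaks = c(-Inf, -1, 0, 1, Inf), labels = FALSE)))
names(mydata) <- paste0("x", 1:5)

model <- 'f1 =~ x1 + x2 + x3 + x4 + x5'
items <- paste0("x", 1:5)

fit_dwls <- cfa(model, data = mydata, ordered = items, estimator = "WLSMV")
fit_uls  <- cfa(model, data = mydata, ordered = items, estimator = "ULSMV")

round(rbind(DWLS = fitMeasures(fit_dwls, c("cfi", "tli", "rmsea", "srmr")),
            ULS  = fitMeasures(fit_uls,  c("cfi", "tli", "rmsea", "srmr"))), 3)
```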
  • asked a question related to Estimation
Question
6 answers
I'm conducting an IRT analysis on 60 MCQs to estimate the item parameters (b, a, & c) as well as Standard Error of the Estimate (SEE). However, some items showed large positive threshold values of b-parameter (i.e. 547.98, 600.38, 423.47), and also, large negative threshold values of c-parameter (i.e. -201.59, -155.78). Consequently, the SEE values of these items also displayed similar values.
Relevant answer
Answer
If you don't see anything odd (like nearly everyone getting an item correct, or near-infinite log-odds ratios for a bivariate table), look up Heywood cases. But in general, don't start with a complex model without looking at the data, or make friends with your data, as I say here: . Anything that you find in your IRT models, you should be able to show why it arises from the log-odds ratio table.
  • asked a question related to Estimation
Question
4 answers
I have read several papers on the same but I haven't yet succeeded in finding one with a script I can use on my data
Relevant answer
Answer
A complete code in R is the best option right now. I would appreciate that. Learning Python is on my agenda too but I'm yet to allocate time for it. Thanks for the video.
  • asked a question related to Estimation
Question
2 answers
Which commands are used for dynamic panel logit/probit model estimation in Stata?
Relevant answer
Answer
Dynamic panel logit/probit models are used to estimate the probability of an event occurring over time. These models are used when the dependent variable is binary and the independent variables are continuous or categorical.
  • asked a question related to Estimation
Question
3 answers
Estimation of the number of acceptor molecules surrounding a given donor in the (Forster resonance energy transfer) FRET system
Relevant answer
Answer
There is a user-friendly program called FRET-Calc, which offers a straightforward interface and is available for free. This program allows users to easily obtain a range of FRET parameters.
  • asked a question related to Estimation
Question
1 answer
Estimate the potential economic benefits of implementing geospatial technologies in precision agriculture for crop yield optimization and soil health management.
Relevant answer
Answer
Implementing geospatial technologies in precision agriculture for crop yield optimization and soil health can bring significant economic benefits. These include increased crop yields, resource efficiency, improved soil health, reduced losses, data-driven decision-making, and improved access to markets and certifications.
  • asked a question related to Estimation
Question
1 answer
Estimate the potential impact of widespread soil carbon stabilization on global carbon dioxide emissions.
Relevant answer
Answer
While estimating the precise impact of widespread soil carbon stabilization is complex, studies have shown that it has the potential to offset a significant portion of global carbon dioxide emissions. By sequestering carbon in the soil, it can contribute to achieving climate change mitigation targets and help limit global temperature rise.
  • asked a question related to Estimation
Question
1 answer
Hello All,
I just have one statistical question about running generalised estimating equations (GEE). The variables I'm working with are not normally distributed, and I'm not sure what type of family and link functions would be appropriate for them. I would appreciate it if you have any sources or guidelines that can help me understand them better. Thanks
Relevant answer
Answer
Certainly! When working with variables that are not normally distributed in the context of generalized estimating equations (GEE), it's important to select appropriate families and link functions that suit the distributional characteristics of your data. Here are some guidelines and resources that can help you understand and choose suitable options:
1. General Guidelines: Depending on the nature of your data, you can consider different families and link functions. Here are some common combinations:
- For continuous data with a symmetric distribution, the Gaussian family with the identity link function is often appropriate.
- For binary or dichotomous data, the binomial family with the logit or probit link function is commonly used.
- For count data, the Poisson or negative binomial family with the log link function is frequently employed.
- For ordinal data, the cumulative logit or proportional odds model can be applied.
2. Statistical Literature: Several statistical textbooks and research papers provide detailed information on the selection of appropriate families and link functions for GEE models. Some recommended sources include:
- "Generalized Estimating Equations" by Hardin and Hilbe.
- "Generalized Linear Models for Categorical and Continuous Limited Dependent Variables" by Long and Freese.
- "Generalized Linear Models" by McCullagh and Nelder.
- "Analysis of Longitudinal Data" by Diggle, Heagerty, Liang, and Zeger.
3. Software Documentation: If you are using statistical software packages like R, SAS, or Stata to fit GEE models, their documentation often provides guidance on the available families and link functions. For example, the documentation for the "geepack" package in R or the "PROC GENMOD" procedure in SAS can be helpful in understanding the options and making appropriate choices.
4. Consultation and Collaboration: If you have access to a statistician or a research collaborator with expertise in GEE modeling or generalized linear models (GLMs), it would be beneficial to consult with them. They can provide personalized guidance based on the specific characteristics of your data and research objectives.
Remember that the choice of the family and link functions in GEE depends on the distributional assumptions of your data and the specific research question you are addressing. It's important to evaluate the goodness of fit and interpret the estimated coefficients accordingly.
The family and link functions are:
When selecting appropriate family and link functions for Generalized Estimating Equations (GEE), it is essential to consider the distributional characteristics of your data. Here are some guidelines to help you make suitable choices:
1. Gaussian Family with Identity Link:
- Appropriate for continuous data with a symmetric distribution, assuming the response variable follows a normal distribution.
- It is often used when the response variable is measured on an interval or ratio scale and shows a relatively symmetric distribution.
2. Binomial Family with Logit or Probit Link:
- Suitable for binary or dichotomous data where the response variable takes only two possible outcomes.
- The logit or probit link functions are commonly used to model the relationship between the linear predictor and the probability of success.
- The logit function is the default choice and is often preferred due to its simplicity and interpretability.
3. Poisson Family with Log Link:
- Appropriate for count data where the response variable represents the number of occurrences within a given interval.
- The log link function is typically used to ensure the estimated rates remain positive.
- If there is evidence of overdispersion (variance exceeds the mean), the negative binomial family with log link can be considered as an alternative.
4. Gamma Family with Inverse Link:
- Suitable for continuous data with a positively skewed distribution, such as durations or response times.
- The inverse link function is often used to model the reciprocal of the response variable.
- The gamma distribution assumes that the response variable follows a gamma distribution with a specific shape and rate parameter.
5. Other Families and Link Functions:
- There are additional families and link functions available for specific data types and distributions. For example, the ordinal family with a logit or probit link can handle ordered categorical responses.
- Different software packages may offer additional options, such as the quasi-likelihood family for accommodating data with non-standard distributions.
It is important to note that the choice of family and link functions should align with the characteristics and assumptions of your data. Consider the scale, distributional properties, and nature of the response variable when making your selection.
Additionally, consulting statistical textbooks, research papers, and software documentation specific to your chosen statistical package can provide further guidance and examples in selecting appropriate families and link functions for GEE models.
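As a concrete illustration of the choices above, here is a minimal R sketch with the geepack package; the simulated data, variable names, and the Poisson/log choice are placeholders, so substitute the family, link, and working correlation structure that match your own outcome.
```
library(geepack)

# Simulated long-format data: 100 subjects, 3 visits each, a count outcome
set.seed(1)
n   <- 100
dat <- data.frame(subject_id = rep(1:n, each = 3),
                  visit      = rep(0:2, times = n),
                  treatment  = rep(rbinom(n, 1, 0.5), each = 3))
dat$y <- rpois(nrow(dat), lambda = exp(0.5 + 0.3 * dat$treatment + 0.1 * dat$visit))

# Poisson family with log link for the count outcome; exchangeable working correlation
fit <- geeglm(y ~ treatment + visit, id = subject_id, data = dat,
              family = poisson(link = "log"), corstr = "exchangeable")
summary(fit)   # robust (sandwich) standard errors are reported by default
```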
  • asked a question related to Estimation
Question
6 answers
In 2007 I did an Internet search for others using cutoff sampling, and found a number of examples, noted at the first link below. However, it was not clear that many used regressor data to estimate model-based variance. Even if a cutoff sample has nearly complete 'coverage' for a given attribute, it is best to estimate the remainder and have some measure of accuracy. Coverage could change. (Some definitions are found at the second link.)
Please provide any examples of work in this area that may be of interest to researchers. 
Relevant answer
Answer
I would like to restart this question.
I have noted a few papers on cutoff or quasi-cutoff sampling other than the many I have written, but in general, I do not think those others have had much application. Further, it may be common to ignore the part of the finite population which is not covered, and to only consider the coverage, but I do not see that as satisfactory, so I would like to concentrate on those doing inference. I found one such paper by Guadarrama, Molina, and Tillé which I will mention later below.
Following is a tutorial I wrote on quasi-cutoff (multiple-item survey) sampling with ratio modeling for inference, which can be highly useful for repeated official establishment surveys:
"Application of Efficient Sampling with Prediction for Skewed Data," JSM 2022: 
This is what I did for the US Energy Information Administration (EIA) where I led application of this methodology to various establishment surveys which still produce perhaps tens of thousands of aggregate inferences or more each year from monthly and/or weekly quasi-cutoff sample surveys. This also helped in data editing where data collected in the wrong units or provided to the EIA from the wrong files often showed early in the data processing. Various members of the energy data user community have eagerly consumed this information and analyzed it for many years. (You might find the addenda nonfiction short stories to be amusing.)
There is a section in the above paper on an article by Guadarrama, Molina, and Tillé(2020) in Survey Methodology, "Small area estimation methods under cut-off sampling," which might be of interest, where they found that regression modeling appears to perform better than calibration, looking at small domains, for cutoff sampling. Their article, which I recommend in general, is referenced and linked in my paper.
There are researchers looking into inference from nonprobability sampling cases which are not so well-behaved as what I did for the EIA, where multiple covariates may be needed for pseudo-weights, or for modeling, or both. (See Valliant, R.(2019)*.) But when many covariates are needed for modeling, I think the chances of a good result are greatly diminished. (For multiple regression, from an article I wrote, one might not see heteroscedasticity that should theoretically appear, which I attribute to the difficulty in forming a good predicted-y 'formula'. For pseudo-inclusion probabilities, if many covariates are needed, I suspect it may be hard to do this well either, but perhaps that may be more hopeful. However, in Brewer, K.R.W.(2013)**, he noted an early case where failure using what appears to be an early version of that helped convince people that probability sampling was a must.)
At any rate, there is research on inference from nonprobability sampling which would generally be far less accurate than what I led development for at the EIA.
So, the US Energy Information Administration makes a great deal of use of quasi-cutoff sampling with prediction, and I believe other agencies could make good use of this too, but in all my many years of experience and study/exploration, I have not seen much evidence of such applications elsewhere. If you do, please respond to this discussion.
Thank you - Jim Knaub
..........
*Valliant, R.(2019), "Comparing Alternatives for Estimation from Nonprobability Samples," Journal of Survey Statistics and Methodology, Volume 8, Issue 2, April 2020, Pages 231–263, preprint at 
**Brewer, K.R.W.(2013), "Three controversies in the history of survey sampling," Survey Methodology, Dec 2013 -  Ken Brewer - Waksberg Award article: 
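For readers unfamiliar with the prediction approach described above, here is a generic, minimal R sketch of ratio-model prediction for the unsampled remainder of a finite population. This is a textbook-style illustration under an assumed ratio model with variance proportional to x, not the EIA's production methodology; x is an auxiliary/regressor variable known for every unit, and the toy data are made up.
```
# Classical ratio-model prediction of a finite-population total:
# y is observed only for the (quasi-cutoff) sample; the auxiliary variable x
# (e.g. a previous census value) is known for every unit.
ratio_predict_total <- function(y_s, x_s, x_nonsample) {
  b <- sum(y_s) / sum(x_s)            # ratio slope under the model y = b*x + error
  sum(y_s) + b * sum(x_nonsample)     # observed part plus predicted remainder
}

# Toy illustration with a skewed population and only the largest units sampled
set.seed(1)
x <- rgamma(200, shape = 2, scale = 50)
y <- 1.8 * x + rnorm(200, sd = sqrt(x))   # heteroscedastic errors, as in establishment data
s <- x > quantile(x, 0.7)                 # quasi-cutoff sample: the largest 30% of units
ratio_predict_total(y[s], x[s], x[!s])
sum(y)                                    # actual total, for comparison
```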
  • asked a question related to Estimation
Question
18 answers
The Seemingly Unrelated Regression Models
Relevant answer
Answer
Bruce Weaver, he also states that SPSS has a procedure for SUR with correlated errors. Given the specifics that Samuel Oluwaseun Adeyemo's post includes, either he is working with a system beyond what https://www.ibm.com/support/pages/seemingly-unrelated-regression refers to that we don't know about, or there is some other explanation for how he came up with this information. It is a shame that in his response above he did not say where he got this information. Since he didn't answer on the other thread, the using-a-new-version explanation seems unlikely.
  • asked a question related to Estimation
Question
4 answers
Are the mass fraction and the volume fraction the same thing? I calculated the percent volume fraction from XRD data with the help of the following formula (ratios of integrated intensities over all identified phases):
(CuTl-1223) % = [ΣI(CuTl-1223) / ΣI(all phases)] × 100
(CuTl-1234) % = [ΣI(CuTl-1234) / ΣI(all phases)] × 100
(CuTl-1212) % = [ΣI(CuTl-1212) / ΣI(all phases)] × 100 ……….. (1)
(Unknown impurity) % = [ΣI(impurity) / ΣI(all phases)] × 100
Relevant answer
Answer
Dear friend Yaseen Muhammad
Rietveld analysis is a powerful method for determining the crystal structure and phase composition of materials from X-ray diffraction (XRD) data. However, it cannot directly provide the mass fractions of individual phases.
To estimate the mass fractions, additional information is needed, namely the densities of the individual phases. The mass fraction of a phase is proportional to its volume fraction multiplied by its density, normalized over all phases. The volume fraction of a phase can be calculated from the Rietveld analysis by dividing the integrated intensity of the diffraction peaks from that phase by the total integrated intensity of all phases in the sample.
Once the volume fractions of the individual phases are determined, the mass fractions can be estimated using the following formula:
Mass fraction of phase i = (Volume fraction_i × Density_i) / Σ_j (Volume fraction_j × Density_j)
The densities of the individual phases can be obtained from the literature or by experimental measurements such as Archimedes' principle or pycnometry.
It is important to note that the estimated mass fractions obtained from XRD data may not be very accurate due to various factors such as preferred orientation, peak overlap, and other instrumental and experimental factors.
References:
1. Cullity, B. D., & Stock, S. R. (2001). Elements of X-ray diffraction (3rd ed.). Prentice-Hall.
2. Young, R. A. (1995). The Rietveld method. Oxford University Press.
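A minimal R sketch of the volume-to-mass-fraction conversion described above; the volume fractions and densities below are placeholder numbers, not measured values.
```
# Volume fractions from the XRD/Rietveld analysis (should sum to 1)
vol_frac <- c(CuTl_1223 = 0.55, CuTl_1234 = 0.25, CuTl_1212 = 0.15, impurity = 0.05)

# Theoretical densities of each phase in g/cm^3 (placeholder values; take them
# from the refined structures or the literature)
density  <- c(CuTl_1223 = 7.1,  CuTl_1234 = 7.0,  CuTl_1212 = 6.9,  impurity = 6.0)

mass_frac <- vol_frac * density / sum(vol_frac * density)
round(100 * mass_frac, 1)   # mass fractions in percent
```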
  • asked a question related to Estimation
Question
2 answers
How does Least Squares Estimation work
Relevant answer
  • asked a question related to Estimation
Question
9 answers
Hi,
I conducted an intervention study with two groups (treatment group = 1, control group = 0). I have three time points (baseline = T0, post-intervention = T1, follow-up = T2). My outcome variables are quality of life and anxiety level (both measured on continuous scales). As my outcome variables didn't follow a normal distribution, I am conducting GEE. I would like to know whether I need to adjust for the baseline values of the outcome variables. If yes, how should I interpret the output tables? If anyone has an example of a similar study, I would be grateful to read it. I appreciate your support.
Relevant answer
Answer
Just a note: if your goal is to report the findings to regulatory agencies (FDA, EMA), it's commonly advised to do that, please find the EMA guideline linked below. But not because of importance of the baseline differences (it's an RCT, so even if present, differences are due to chance and must be ignored), but to "filter out" these potential fake differences from the treatment and improve inference. The adjustment applies for both analysing post-values and change from baseline*.
----
* Just a warning about modelling the change-from-baseline in RCTs. It poses a serious problem if there was actual "virtual" (regardless of the randomization) difference at baseline and also the between-arm difference in post-values, of similar magnitude. In this case the slopes connecting pre-post in both arms may be close-to-parallel suggesting "no difference in differences", EVEN if the post-values DO indicate a statistically significant (and hopefully also clinical) difference. This is called the Lord's paradox and it's a sound reason to use change scores in observational studies and post-values in RCTs. Change scores = the parallelism-of-slopes approach actually "looks" at the baseline differences (which makes a lot of sense in non-RCTs and not in RCTs), while the post-values approach ignores the baseline difference and looks only at the follow-up post-treatment findings (makes a lot of sense in RCTs and not non-RCTs). It's up to you which approach is the best for you.
  • asked a question related to Estimation
Question
5 answers
I am working on a research topic that employs estimation techniques. I am trying to apply an algorithm in my work to estimate system poles. I wrote an m-file and tried to apply this technique to a simple transfer function to estimate its roots. Any suggestions about estimation techniques?
Relevant answer
Answer
There are many estimation techniques that can be used to estimate system poles. Here are a few popular ones:
  1. Least Squares Method: This method involves fitting a model to the data in a way that minimizes the sum of the squares of the errors. This can be used to estimate system parameters such as poles and zeros.
  2. Maximum Likelihood Method: This method involves finding the parameter values that maximize the likelihood of the observed data. This can be used to estimate system parameters such as poles and zeros. (See reference [1-3])
  3. Prony's Method: This method involves fitting an exponential function to the data using the method of least squares. The method can be used to estimate system poles and can be useful when the system poles are well-separated.
  4. Eigenvector Method: This method involves calculating the eigenvectors of the system and using them to estimate the system poles. This can be useful when the system is large and complex.
  5. System Identification Method: This method involves using a set of input and output data to estimate the system parameters. The method can be used to estimate system poles as well as other parameters such as gains and time delays.
To apply an algorithm to estimate system poles, you can start with a simple transfer function and apply the algorithm to estimate the poles. You can then compare the estimated poles with the known poles of the transfer function to evaluate the accuracy of the algorithm. It may also be useful to test the algorithm on more complex systems to see how well it performs.
[1] Bazzi, Ahmad, Dirk TM Slock, and Lisa Meilhac. "Efficient maximum likelihood joint estimation of angles and times of arrival of multiple paths." 2015 IEEE Globecom Workshops (GC Wkshps). IEEE, 2015.
[2] Bazzi, Ahmad, Dirk TM Slock, and Lisa Meilhac. "On a mutual coupling agnostic maximum likelihood angle of arrival estimator by alternating projection." 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP). IEEE, 2016.
[3] Bazzi, Ahmad, Dirk TM Slock, and Lisa Meilhac. "On Maximum Likelihood Angle of Arrival Estimation Using Orthogonal Projections." 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018.
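To complement point 3, here is a minimal R sketch of a Prony-style (linear-prediction, least-squares) estimate of discrete-time poles from a measured or simulated response; the model order p is an assumption the user must choose, and the toy system at the end is only a sanity check.
```
# Estimate the poles of a discrete-time system from a response sequence
# by linear prediction (a least-squares, Prony-style fit).
estimate_poles <- function(h, p) {
  n <- length(h)
  # Linear-prediction model: h[k] ~ a1*h[k-1] + ... + ap*h[k-p]
  Y <- h[(p + 1):n]
  X <- sapply(1:p, function(i) h[(p + 1 - i):(n - i)])
  a <- qr.solve(X, Y)            # least-squares AR coefficients
  polyroot(c(-rev(a), 1))        # roots of z^p - a1*z^(p-1) - ... - ap
}

# Toy check: a 2nd-order system with known poles 0.9*exp(+/- 0.3i)
true_poles <- 0.9 * exp(c(1i, -1i) * 0.3)
h <- Re(true_poles[1]^(0:80) + true_poles[2]^(0:80))   # impulse-response-like sequence
estimate_poles(h, p = 2)         # should be close to the true poles
```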
  • asked a question related to Estimation
Question
2 answers
It is Microtus arvalis.
Estimated area: 1 hectare.
Is there a formula to calculate the galleries?
Relevant answer
Answer
Look at the link; it may be useful.
Regards,
Shafagat
  • asked a question related to Estimation
Question
2 answers
Hi, everyone.
I am currently testing a second order latent growth curve.
One of the models showed poor fit and I was testing ways to find what the
optimal trajectory may be.
The fit looked like this for the linear solution.
RMSEA (Root Mean Square Error Of Approximation)
Estimate 0.068
90 Percent C.I. 0.061 0.075
Probability RMSEA <= .05 0.000
CFI/TLI
CFI 0.863
TLI 0.839
SRMR (Standardized Root Mean Square Residual)
Value 0.114
Then, I eliminated one item from the model because 1) It showed non-significant loadings in
another study 2) there was some evidence in the literature that suggested its weakness.
(The scale consists of 5 items)
The GOF after eliminating that particular item was this.
RMSEA (Root Mean Square Error Of Approximation)
Estimate 0.037
90 Percent C.I. 0.022 0.049
Probability RMSEA <= .05 0.968
CFI/TLI
CFI 0.965
TLI 0.956
SRMR (Standardized Root Mean Square Residual)
Value 0.087
Although I did have some rationale to eliminate the item, my first intention of deleting the
item was more of a hunch.
I wanted to know what may be behind the large improvement in fit from eliminating a single item from one model. The item works fine when I run an EFA and CFA but shows inconsistent p-values when I run more complex models with it (e.g. bifactor SEM, ESEM).
The item also works fine for the same model (second-order growth model) with a different sample (one that is a year younger than the sample showing poor fit).
Also, if you can point to me to some resources to understand this it would be most appreciated.
Relevant answer
Answer
Hello, and thank you so much for responding to such an ill-described question!
I will definitely go through the articles you referred me to, some of which
I already possess. I might have skipped the answers in them.
Thanks again!
  • asked a question related to Estimation
Question
1 answer
Hello
I am working on an article using a difference-in-differences model. I have 5 variables, and one of my independent variables is a dummy.
Estimation by the difference-in-differences method eliminates this variable. I want to use the interaction approach. How can I use this effect so that it is not dropped?
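No answer was posted here, but for readers facing the same issue: a time-invariant dummy is absorbed by the unit fixed effects in a difference-in-differences setup, so only its interaction with the post-treatment indicator can be estimated. A minimal R sketch with the plm package; the simulated panel and variable names are placeholders.
```
library(plm)

# Simulated firm-year panel: 'regulated' is time-invariant, 'post' switches on in 2018
set.seed(42)
dat <- expand.grid(firm = 1:50, year = 2015:2020)
dat$regulated  <- as.integer(dat$firm <= 25)
dat$post       <- as.integer(dat$year >= 2018)
dat$innovation <- 0.5 * dat$regulated * dat$post + rnorm(nrow(dat))

# The level of 'regulated' is absorbed by the firm fixed effects; only the
# interaction with 'post' (the DiD term) remains estimable.
did_fit <- plm(innovation ~ regulated:post, data = dat,
               index = c("firm", "year"), model = "within", effect = "twoways")
summary(did_fit)
```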
  • asked a question related to Estimation
Question
5 answers
Dear All,
We are trying to build LSTM network for programming language syntax errors fixing. Here is a sample paper that describes our purpose "Source Code Assessment and Classification Based on Estimated Error Probability Using Attentive LSTM Language Model and Its Application in Programming Education". Are there anyone who can provide us a github link about sample deep learning implementation (in python) about that?
Regards.
Relevant answer
  • asked a question related to Estimation
Question
1 answer
I am trying to do umbrella sampling of 2 AU molecules, but whenever I try to run the md_pull.mdp file it gives an error:
ERROR 1 [file md_pull.mdp]:
When the maximum distance from a pull group reference atom to other atoms
in the group is larger than 0.5 times half the box size a centrally
placed atom should be chosen as pbcatom. Pull group 1 is larger than that
and does not have a specific atom selected as reference atom.
ERROR 2 [file md_pull.mdp]:
When the maximum distance from a pull group reference atom to other atoms
in the group is larger than 0.5 times half the box size a centrally
placed atom should be chosen as pbcatom. Pull group 2 is larger than that
and does not have a specific atom selected as reference atom.
Pull group natoms pbc atom distance at start reference at t=0
1 4221 2111
2 4221 6332 0.488 nm 0.488 nm
Estimate for the relative computational load of the PME mesh part: 0.10
NOTE 3 [file md_pull.mdp]:
This run will generate roughly 5408 Mb of data
  • asked a question related to Estimation
Question
5 answers
Hello Respected
I am trying to measure total chlorophyll, but I am unable to calculate its amount.
I am following the papers by Lichtenthaler and Wellburn (1983) and Arnon (1949), but those equations are not clear to me.
Please help me with how to use the equations in the mentioned papers to estimate chlorophyll.
Regards
Relevant answer
Answer
Dear Uzzal
Have a Good Day!
Will you please share with us the reference that mentions the 0.2 g weight of leaves, because the paper you have cited does not contain that information?
Regards
Harsh
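For reference, here is a minimal R sketch of the commonly cited Arnon (1949) equations for an 80% acetone extract, where A645 and A663 are absorbances at 645 and 663 nm, V is the extract volume in mL, and W is the tissue weight in g; please verify the coefficients against the original papers before using them.
```
# Chlorophyll concentrations in the extract (mg per litre), Arnon (1949), 80% acetone
chlorophyll_arnon <- function(A663, A645, V_ml, W_g) {
  chl_a     <- 12.7 * A663 - 2.69 * A645
  chl_b     <- 22.9 * A645 - 4.68 * A663
  chl_total <- 20.2 * A645 + 8.02 * A663
  # Convert from mg/L of extract to mg per gram of tissue
  to_mg_per_g <- function(c_mg_L) c_mg_L * V_ml / (1000 * W_g)
  c(chl_a = to_mg_per_g(chl_a),
    chl_b = to_mg_per_g(chl_b),
    total = to_mg_per_g(chl_total))
}

# Example: 10 mL extract from 0.2 g of leaf tissue (illustrative absorbances)
chlorophyll_arnon(A663 = 0.65, A645 = 0.32, V_ml = 10, W_g = 0.2)
```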
  • asked a question related to Estimation
Question
4 answers
When is the appropriate time to estimate the incidence of any medical problem, and how can it be differentiated from prevalence so that the figures are more reliable and realistic?
Relevant answer
Answer
Please note that the incidence is usually (but not necessarily) calculated for acute communicable diseases, and the prevalence is usually calculated for chronic non-communicable diseases. You asked about the best time to calculate the incidence: of course, it should be at the end of the period. E.g., the incidence of COVID-19 during the year 2021 is the number of cases of COVID-19 that developed during 2021 (numerator) among people who were free from COVID-19 on Jan 1, 2021 (denominator). If you calculate the point prevalence of COVID-19 now, it would be very low, but if you calculate the period prevalence (e.g. during the previous year) it would be high.
  • asked a question related to Estimation
Question
7 answers
Is this correct for the calculation of standardized residuals?
Volatility= Estimated from MSGARCH package in R
Residuals= abs(returns)-volatility;
standardized residuals=Residuals/volatility;
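No answer was posted here, but for comparison, the usual convention in the GARCH literature is to standardize the raw (demeaned) returns, not their absolute values, by the conditional volatility. A minimal R sketch, with toy objects standing in for your return series and fitted volatility, and no claim about what the MSGARCH package itself returns:
```
# 'returns' would be your return series and 'sigma_t' the conditional
# volatility extracted from the fitted (MS)GARCH model.
set.seed(1)
returns <- rnorm(500, sd = 0.01)
sigma_t <- rep(0.01, 500)

z <- (returns - mean(returns)) / sigma_t   # standardized residuals
c(mean = mean(z), variance = var(z))       # should be roughly 0 and 1
```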
  • asked a question related to Estimation
Question
5 answers
Can it be made linear, so that linear regression could be done?
Or should I estimate using nonlinear methods?
Relevant answer
Answer
One more time: you don't need to linearize anything; it's the regression coefficients that matter. Please read the Kutner book; it's a free download from the z-library. David Booth
  • asked a question related to Estimation
Question
9 answers
Say one has some devices to measure the temperature of a room. The devices don't provide me with an accurate measurement. Some overshoot the actual value of the reading, others underestimate it. Using this set of inaccurate readings, is it possible for me to obtain a reading having high accuracy?
Relevant answer
Answer
If you have used the instruments before and can make assumptions about their performance (even assumptions as far-fetched as the errors being unbiased and random), then you can take many readings; if you believe the errors really are random and unbiased, then, as Christian Geiser says, you can assume the estimates improve in accuracy as n increases. You need to provide more information about what you can assume in order for people to properly address your question.
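As a small illustration of that idea, a minimal R sketch that combines several noisy readings into one estimate via inverse-variance weighting, assuming each device's error is unbiased with a known (or previously estimated) standard deviation; all numbers are made up.
```
readings <- c(21.8, 22.6, 22.1, 23.0)   # one reading per device (degrees C)
sds      <- c(0.5,  1.0,  0.3,  1.5)    # assumed error SD of each device

w        <- 1 / sds^2                   # inverse-variance weights
estimate <- sum(w * readings) / sum(w)  # combined estimate
se       <- sqrt(1 / sum(w))            # its standard error under these assumptions
c(estimate = estimate, se = se)
```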
  • asked a question related to Estimation
Question
2 answers
How to Estimate the Total flavonoid content from endophytic fungal extract?
Relevant answer
Answer
Not more than 40 °C.
  • asked a question related to Estimation
Question
1 answer
What is the best dynamic panel model when T>N ?
Relevant answer
Answer
For the long T panel, a possible option is the panel data techniques that account for possible cross-section dependence. See for instance: https://www.sciencedirect.com/science/article/abs/pii/S0304407620301020.
Hope this helps. Thank you.
  • asked a question related to Estimation
Question
1 answer
Hi,
Does anybody know how to extract the slopes' effect size of classes in Latent Class Growth Analysis using Mplus?
Thanks
Relevant answer
Answer
Could you elaborate a little bit on what you mean by effect size in this context? Do you mean the estimate of the slope factor mean in a particular class? The mean should be part of the parameter estimate output. From that, you could compute a standardized effect size measure by hand (using the estimates of the slope factor mean and variance).
  • asked a question related to Estimation
Question
2 answers
Greetings!
I am conducting a series of CFAs in R using the 'lavaan' package. I am interested in estimating the correlations between the factors taking my measurement model into account, instead of going back to the raw data and summing the items representing each factor. In the lavaan output, I can find the covariances only. I can turn the covariances into correlations by dividing them by the product of the standard deviations, but due to the number of CFAs and the number of factors, I am wondering if there is a more streamlined way in the lavaan syntax to do that (or in another SEM package in R).
Any help would be greatly appreciated!
(* I checked the lavaan commands. I did not find anything, but I am fairly new to the package and I might have missed it. Furthermore, the only option in 'lavaan.Plot' is to plot covariances.)
(** Thought: I used the variance standardization method in one example and the marker method in another. In the variance standardization method, the "estimate" column and the "std.all" column of my covariance table were identical. In the marker method they were not. I'm thinking that standardized covariances in lavaan should be the correlations, and they are identifiable using a marker method only. Or not?)
Relevant answer
Answer
Christian Geiser Yes, they do give the same std.all solution. I forgot that the "estimate" column in the variance-standardized method is supposed to be the same as the std.all column that is shared in both methods, since the variance is standardized, and I got mixed up with that. Thank you for the fast reply!
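For the streamlined extraction asked about above, here is a minimal sketch of two lavaan helpers that return the model-implied factor correlations directly; the built-in Holzinger-Swineford data and three-factor model are only a stand-in for your own syntax and data.
```
library(lavaan)

# Example with the built-in Holzinger-Swineford data; substitute your own model/data
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit <- cfa(model, data = HolzingerSwineford1939)

# Model-implied correlation matrix of the latent variables
lavInspect(fit, "cor.lv")

# Or pull the standardized (std.all) factor covariances, which are the correlations
ps <- standardizedSolution(fit)
subset(ps, op == "~~" & lhs != rhs & lhs %in% c("visual", "textual", "speed"))
```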
  • asked a question related to Estimation
Question
3 answers
Need help regarding: Estimation of Soil Moisture Regimes using Remote Sensing. Does the EIPC model fit best, or is there something else that can be considered?
Relevant answer
Answer
Please refer to the following paper that presents a comprehensive review of the progress in remote sensing as well as field methods for soil moisture studies.
Regards
  • asked a question related to Estimation
Question
20 answers
Hi, I am looking for available potassium quantification, but the suggested protocol requires a flame photometer, which I don't have.
Please suggest a good method for estimating available potassium in my soil samples without using a flame photometer. Also, please suggest a good method for estimating calcium and magnesium.
Thank You
Relevant answer
Answer
I think the protocol found in this article (attached) may actually help you in your research.
Best wishes,
Sabri
  • asked a question related to Estimation
Question
1 answer
Dear Researcher,
I have the following MATLAB code that generates an FMCW signal. However, I have two basic problems with the code; I would appreciate it if you could help/guide me to resolve them:
1. Based on my understanding, this code generates an FMCW signal for one target, since the dimension of the signal is 1 x N, whereas it must be L x N (L is the number of targets).
2. The dechirped signal at the receiver, which is analog, has to be converted to digital in my algorithm.
Note: I want to apply this type of signal (FMCW) to a Direction of Arrival (DOA) estimation algorithm.
Again, I highly appreciate your time and consideration in helping me overcome these uncertainties.
%%CODES for generating FMCW signal%%%%%%%%
% Compute hardware parameters from specified long-range requirements
fc = 77e9; % Center frequency (Hz)
c = physconst('LightSpeed'); % Speed of light in air (m/s)
lambda = freq2wavelen(fc,c); % Wavelength (m)
% Set the chirp duration to be 5 times the max range requirement
rangeMax = 100; % Maximum range (m)
% In general, for an FMCW radar system, the "sweep time" should be at least five to six times the round trip time
tm = 5*range2time(rangeMax,c); % Chirp duration (s)=Symbol duration (Tsym)
% Determine the waveform bandwidth from the required range resolution
rangeRes = 1; % Desired range resolution (m)
bw = rangeres2bw(rangeRes,c); % Corresponding bandwidth (Hz)
% Set the sampling rate to satisfy both the range and velocity requirements for the radar
sweepSlope = bw/tm; % FMCW sweep slope (Hz/s)
fbeatMax = range2beat(rangeMax,sweepSlope,c); % Maximum beat frequency (Hz)
vMax = 230*1000/3600; % Maximum Velocity of cars (m/s)
fdopMax = speed2dop(2*vMax,lambda); % Maximum Doppler shift (Hz)
fifMax = fbeatMax+fdopMax; % Maximum received IF (Hz)
fs = max(2*fifMax,bw); % Sampling rate (Hz)
% Configure the FMCW waveform using the waveform parameters derived from the long-range requirements
waveform = phased.FMCWWaveform('SweepTime',tm,'SweepBandwidth',bw,...
'SampleRate',fs,'NumSweeps',2,'SweepDirection','Up');
% if strcmp(waveform.SweepDirection,'Down')
% sweepSlope = -sweepSlope;
% end
N=tm*fs; % Number of fast-time samples
Nsweep = 192; % Number of slow-time samples
sigTx = waveform();
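% NOTE: the variables K (number of targets), M (number of array elements),
% d (element spacing in wavelengths), AOA_Degree, SNR, sigTgt and sigREF are
% not defined in this snippet; they are assumed to be set elsewhere before the loop.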
for i=1:K
doas_rad=AOA_Degree*pi/180;
A=exp(-1i*2*pi*d*(0:M-1)'*sin([doas_rad(:).']));
sigRx=A*sigTgt';
sigRx = awgn(sigRx,SNR); % awgn already returns the signal plus noise; adding sigRx again would double the signal
% Dechirp and convert it to digital
% DesigRx=dechirp(sigRx,sigREF);
DechirpedSignal= sigTgt .* conj(sigRx);
end
Relevant answer
Answer
My suggestion would be to first understand your code before asking questions. In my opinion, you did not ask questions; you gave two observations. Related to your observation 2: what do you mean by analog? In a computer/MATLAB everything is digital. You can model analog-to-digital conversion in MATLAB, but the signals will still be digital. Try to understand the code and then pose questions.
  • asked a question related to Estimation
Question
13 answers
####(I also posted this on SO https://stackoverflow.com/q/71531275/16505198 and SE https://stats.stackexchange.com/q/568112/340994 but didn't receive any answer until now. So here's another chance :-) However the code snippets might be more readable there... )#####
Hello,
I am estimating an ordinal logistic regression under the assumption of proportional odds with the ordinal::clm() function. As a reproducible example, see this model for the "housing" dataset (`MASS::housing`):
```
clm(Sat~Type*Cont + Freq, data = housing, link = "probit") %>% summary()
formula: Sat ~ Type * Cont + Freq
data: housing
Coefficients:
Estimate Std. Error z value Pr(>|z|)
TypeApartment -0.14387 0.54335 -0.265 0.791
TypeAtrium 0.20043 0.55593 0.361 0.718
TypeTerrace 0.18246 0.55120 0.331 0.741
ContHigh 0.05598 0.53598 0.104 0.917
Freq 0.01360 0.01116 1.219 0.223
TypeApartment:ContHigh -0.25287 0.78178 -0.323 0.746
TypeAtrium:ContHigh -0.17201 0.76610 -0.225 0.822
TypeTerrace:ContHigh -0.18917 0.76667 -0.247 0.805
Threshold coefficients:
Estimate Std. Error z value
Low|Medium -0.1130 0.4645 -0.243
Medium|High 0.7590 0.4693 1.617
```
If I want to test whether the main effect and the interaction term are (simultaneously!) significant, I used the glht function, where I test the hypothesis that (bold for matrices or vectors) $\boldsymbol{K} \boldsymbol{\beta} = \boldsymbol{m}$.
So if I'd like to test whether living in an apartment (main effect) **plus** the interaction of living in an apartment and having high contact is significantly different from zero, it would be $(0, 0, 1, 0, 0, 0, 0, 1, 0, 0)\cdot \boldsymbol{\beta} = 0$. (Assuming the two thresholds as intercepts and thus the first two estimates.)
Is it right to test:
```
glht(mod, linfct = c("TypeApartment +TypeApartment:ContHigh =0")) %>% summary()
Simultaneous Tests for General Linear Hypotheses
Fit: clm(formula = Sat ~ Type * Cont + Freq, data = housing, link = "probit")
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
TypeApartment + TypeApartment:ContHigh == 0 -0.3967 0.6270 -0.633 0.527
(Adjusted p values reported -- single-step method)
```
or do I have to use:
```
glht(mod, linfct = c("TypeApartment= 0", "TypeApartment:ContHigh =0")) %>% summary()
Simultaneous Tests for General Linear Hypotheses
Fit: clm(formula = Sat ~ Type * Cont + Freq, data = housing, link = "probit")
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
TypeApartment == 0 -0.1439 0.5434 -0.265 0.946
TypeApartment:ContHigh == 0 -0.2529 0.7818 -0.323 0.921
(Adjusted p values reported -- single-step method)
```
Thanks a lot in advance I hope I posed the question right and understandable :-) If you have other options to test if a main effect and an interaction term are significant go ahead and tell me (and the others).
Thanks, Luise
Relevant answer
Answer
Luise Novikov Apologies, my first answer was actually incorrect - I have removed it to avoid misleading anyone.
If you pass both arguments separately, the function will return the results for partial tests, i.e., multiplicity-adjusted p-values for each hypothesis under the assumption that the coefficients being tested are simultaneously zero. What you are looking for is a global test, which corresponds to the second option. The first option is incorrect because the sum of the coefficients can be zero without either coefficient being zero; hypothesis tests assuming only that the sum is zero therefore have an inappropriate additional degree of freedom, and the p-values will be incorrect as a result.
What you are looking for is a global test under the assumption that both coefficients are zero, which you can obtain using car::linearHypothesis. For your data:
mod <- ordinal::clm(Sat~Type*Cont + Freq, data = MASS::housing, link = "probit")
car::linearHypothesis(mod, c("TypeApartment = 0", "TypeApartment:ContHigh = 0"))
  • asked a question related to Estimation
Question
7 answers
I want to know which weight is better to estimate carbon. Dry weight or wet weight?
And what is the conversion formula?
Relevant answer
Answer
Have a look at the link; it may be useful.
Regards,
Shafagat
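As a rough rule of thumb (not from the thread above): carbon content is almost always expressed on a dry-weight basis, since moisture varies widely. A minimal sketch of the usual conversion, where the carbon fraction of 0.47 is a commonly used default and the moisture content is an assumed placeholder:
```
# Rough sketch only; 0.47 is a commonly used default carbon fraction of dry biomass
# (the exact fraction varies by species and tissue), and the moisture content is made up.
wet_weight <- 120                            # g
moisture   <- 0.60                           # fraction of wet weight that is water (assumed)
dry_weight <- wet_weight * (1 - moisture)    # 48 g
carbon     <- 0.47 * dry_weight              # ~22.6 g of carbon
carbon
```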
  • asked a question related to Estimation
Question
5 answers
I asked the same respondents about variables X and Y for two different product types; that is, respondents first answered the items for Product1 and then for Product2. In AMOS, I have two different models (Product1 and Product2). We proposed that the effect of X on Y is stronger for Product1. We have two standardized estimates, and the first beta is greater than the second one. Is this finding sufficient to confirm the hypothesis? Or do I still need a chi-square difference test or something like it to establish statistical significance? If not, what should I do to test the statistical difference between these two betas?
Thanks ahead
Relevant answer
Answer
Check the p-value.
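A common way to test whether the two betas actually differ is a multi-group model estimated with and without an equality constraint on the path, compared by a chi-square difference test. A minimal sketch in R with lavaan (the question used AMOS; the data frame `dat`, the grouping variable `product`, and the variables X and Y are hypothetical names):
```
# Hedged sketch, assuming the data are stacked in long format with a 'product' grouping variable.
library(lavaan)
model_free  <- 'Y ~ c(b1, b2) * X'   # slope allowed to differ between the two products
model_equal <- 'Y ~ c(b, b) * X'     # slope constrained to be equal across products
fit_free  <- sem(model_free,  data = dat, group = "product")
fit_equal <- sem(model_equal, data = dat, group = "product")
lavTestLRT(fit_free, fit_equal)      # chi-square difference test of b1 == b2
```
A significant difference test, together with the expected ordering of the two estimates, supports the hypothesis; comparing the two p-values alone does not.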
  • asked a question related to Estimation
Question
3 answers
An article from March 10, 2020 in Annals of Internal Medicine, The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application, estimates a median incubation period of 5.1 days "(95% CI, 4.5 to 5.8 days), and 97.5% of those who develop symptoms will do so within 11.5 days (CI, 8.2 to 15.6 days) of infection."
Are there any updated estimates or more recent reports?
Relevant answer
Answer
  • asked a question related to Estimation
Question
2 answers
I am using panel data and want to estimate the impact of regulation on firms' innovation through DID and PSM-DID approaches. I am able to calculate DID but not PSM-DID for panel data. Can anyone share their experience?
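A minimal sketch of one common PSM-DID workflow in R, under the assumption of a firm-level panel; the data frame `firm_panel`, the variables `treated`, `post`, `innovation`, the matching covariates, and the pre-treatment year are all hypothetical names, not from the question:
```
# Hedged sketch: match on pre-treatment characteristics, then run DID on the matched sample.
library(MatchIt)
library(fixest)

# 1) propensity-score matching of treated and control firms in a pre-treatment year
m <- matchit(treated ~ size + age + leverage,
             data = subset(firm_panel, year == 2010),
             method = "nearest", ratio = 1)
matched_ids <- match.data(m)$firm_id

# 2) DID on the matched panel with firm and year fixed effects, clustered by firm
did_fit <- feols(innovation ~ treated:post | firm_id + year,
                 data = subset(firm_panel, firm_id %in% matched_ids),
                 cluster = ~firm_id)
summary(did_fit)
```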
  • asked a question related to Estimation
Question
6 answers
Good evening everyone,
I am using time-series data in an ARDL model in which I have 1 dependent, 1 independent, 1 control and 2 dummy variables with an interaction, analysed in EViews 10. However, I do not understand whether the control variable should be placed in the list of dynamic or of fixed regressors in the ARDL estimation equation. Please guide me with your knowledge and experience. I will be very thankful to you.
Relevant answer
Answer
Can we talk? I would like to discuss my paper; I have a few queries about it. (@Mohamed-Mourad Lafifi)
  • asked a question related to Estimation
Question
4 answers
I want to carry out phytochemical and morphological studies on Iranian willow (Salix) species, but the trees need to be of the same age. Is there a way to measure the age of willow trees without cutting them down?
Relevant answer
Answer
Willow is a tree whose centre you may not always be able to hit with an increment borer (as mentioned by Dr. Mohl). Since Salix, if cut, sprouts back from the roots (coppice), an increment core may sometimes contain two or more stem centres that grew together over time. As I remember, if you are especially worried about the health of the tree, return the increment core to the tree, or apply the spray that horticulturalists and tree surgeons use on cut surfaces. The annual rings can be counted visually, or with a hand lens if the rings are close together. Typically, increment cores should not damage or kill the tree. If you don't have any sealant, candle wax would probably do to cover the hole and discourage insects, excess moisture or disease entry.
  • asked a question related to Estimation
Question
4 answers
To generate a PCA from lcWGS SNPs, one may use ANGSD to generate genotype likelihoods and then use these as input to generate a covariance matrix using PCAngsd.
The covariance matrix generated by PCAngsd is a n x n matrix where n is the number of samples and p is the number of SNPs (variables). According to the [PCAngsd tutorial](http://www.popgen.dk/software/index.php/PCAngsdTutorial#Estimating_Individual_Allele_Frequencies), the principal components (i.e. the samples plotted in the space defined by the eigenvectors) can be generated directly from this covariance matrix by eigendecomposition.
This is in contrast to the 'usual' way that PCA is done (via a covariance matrix), where a p x p (not n x n) covariance matrix C is generated from a centered n x p data matrix X. Eigendecomposition of C, then generates the eigenvectors and eigenvalues. The transformed values of X into the space defined by the eigenvectors (i.e. the principal components) can then be generated through a linear transformation of X with the eigenvectors (e.g. see [this](https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca) excellent example).
The difference between the two methods appears to lie in the covariance matrix. With the PCAngsd method, the covariance matrix is n x n as opposed to the 'usual' p x p matrix.
So what is the difference between these two covariance matrices, and what is generated by the eigendecomposition of an n x n matrix? Is it really the sample principal components, or something else?
Relevant answer
Answer
By computing the eigenvalues and eigenvectors of the covariance matrix, the eigenvectors with the largest eigenvalues correspond to the directions of greatest variance in the dataset; these are the most important components.
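To make the n x n versus p x p point concrete, here is a small R sketch (random data, not genotype likelihoods) showing that eigendecomposition of the n x n matrix X Xᵀ/(n-1) yields the same sample PC scores, up to sign, as the usual p x p covariance route; this is why an n x n covariance matrix of individuals can be decomposed directly into sample principal components:
```
set.seed(1)
n <- 10; p <- 200
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)  # centered n x p data

## usual route: p x p covariance, scores = X %*% eigenvectors
C_pp      <- crossprod(X) / (n - 1)                 # p x p covariance matrix
scores_pp <- X %*% eigen(C_pp)$vectors[, 1:3]

## n x n route (as in PCAngsd-style covariance matrices): eigenvectors ARE the scores, up to scale
G_nn      <- tcrossprod(X) / (n - 1)                # n x n matrix
e_nn      <- eigen(G_nn)
scores_nn <- e_nn$vectors[, 1:3] %*% diag(sqrt(e_nn$values[1:3] * (n - 1)))

round(abs(cor(scores_pp, scores_nn)), 3)            # corresponding columns agree up to sign
```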
  • asked a question related to Estimation
Question
4 answers
Dear Ladies and Gentlemen,
I am currently working as a Master's student on several publications that rely methodologically on latent profile analyses. In the context of these LPAs, I have repeatedly encountered the problem that calculating the BLRT test with the help of Mplus (TECH14) is both time-consuming and often unsuccessful. In this specific case, an LPA with 30 Likert-scaled items (1-5) of the "Big Five" (OCEAN model) with a sample size of N = 738 was conducted. I would be interested to know which approach you prefer in your research work.
Q1: Do you increase the number of bootstraps and LRT starts as the number of profiles increases, or do you exclude the BLRT test when you encounter error messages and instead refer to the Loglik value, LMR test, and the BIC fit indicator?
So far I have tried the following settings for TECH14 according to the recommendations of the forum entries by Muthen & Muthen:
LRTSTARTS 0 0 100 20 / LRTSTARTS 0 0 500 50.
Both of these options will result in unsuccessful bootstrap draws if more than three profiles are calculated for the model.
Q2: Do you treat your Likert scaled items as interval scaled variables and use the MLR Estimator or do you treat your indicator items as ordinal variables and use the WLSMV Estimator?
In my case, attributing the items as categorical with the WLSMV Estimator leads already with two profiles to a "non-positive definite first-order derivative product matrix".
There seem to be conflicting opinions here. Brown (2006) writes "The WLSMV is a robust estimator which does not assume normally distributed variables and provides the best option for modeling categorical or ordered data".
On the other hand, Bengt.O. Muthen (2006) :
The most important aspect is how strong floor and/or ceiling effects you have. If they are strong, you may want to use a categorical approach.
Q3: Would any of you be willing to cross-check my syntax for comparing distal outcomes with the BCH approach? (See appendix)
Thanks in advance for your help.
Philipp Schulz
References:
Brown, T. (2006). Confirmatory factor analysis for applied research. New York: Guildford.
Relevant answer
Answer
Philipp Schulz I agree with David Eugene Booth . Since your items are ordinal (ordered categorical), you would typically want to use classical latent class analysis (LCA), not latent profile analysis (LPA). LPA is designed for continuous (metrical, interval) indicators, whereas LCA is for categorical (binary, ordinal) indicators. The problems that you encountered with the BLRT may be related to the large number of items (30 items is a lot for both LCA and LPA). This does not necessarily mean that using fewer items is better (in principle, it is good to have many indicators, as long as they are good class indicators), but it may explain the problems in successfully conducting the bootstrap. The BIC is a simple and good alternative in my experience. And yes, as the number of classes / profiles goes up, you should definitely increase the number of starts for both the target (estimated) class model and the bootstrap. You should also check that the best loglikelihood value for each model can be replicated at least a few times so as to minimize the risk of local likelihood maxima (potentially invalid solutions and/or invalid bootstrap results).
With regard to ML vs. WLSMV, you may be confusing latent class/latent profile analysis with confirmatory factor analysis (CFA) and structural equation modeling (SEM). Both LPA and LCA typically make use of maximum likelihood (ML) estimation, not WLSMV. WLSMV is used for ordinal (ordered categorical) variables used as indicators of continuous latent variables ("factors") in CFA and SEM.
  • asked a question related to Estimation
Question
5 answers
Are there any alternative techniques for ethanol estimation other than HPLC and GC?
Relevant answer
Answer
Dear Yerra Kanakaraju thank you for posting this interesting technical question on RG. In order to give you a qualified answer, it would be helpful to know a few more details about the conditions / environment in which you want to determine the ethanol content (e.g. in blood, alcoholic beverages, biodiesel etc.). In general, NMR spectroscopy is a suitable method for determining ethanol in mixtures. For some potentially useful information please have a look at the following article which might help you in your analysis:
Determination of Alcohol Content in Alcoholic Beverages Using 45 MHz Benchtop NMR Spectrometer
This method does not require fancy and expensive equipment. The good thing about this paper is that it is freely accessible as public full text on RG. Thus you can download it as pdf file.
Good luck with your work and best wishes!
  • asked a question related to Estimation
Question
6 answers
This study was carried out in order to determine the level of contamination of sheep meat, liver and kidney by heavy metals (copper, lead, zinc, cadmium and cobalt) in different areas of al Sulaimanyah Governorate, in comparison with internationally permitted levels. For this purpose, three samples (meat, liver and kidney) were taken in three different
districts of al Sulaimanyah Governorate: Said Sadiq district, Dokan district and Sulaimanyah city centre. The samples were collected during October and November 2020. The three-way interaction of the study factors significantly affected the amount of copper in the different sheep tissues; the amount varied with tissue, location and sampling time. The highest level of copper was recorded in liver tissue in Dokan district during November, while the lowest level was recorded in meat tissue in Said Sadiq district during November. The three-way interaction also affected the level of zinc in the different sheep tissues, which again varied by tissue, location and sampling time. The highest level of zinc was recorded in kidney tissue in Sulaimanyah city centre during October, while the lowest was recorded in liver tissue in Said Sadiq during October. The three-way interaction likewise significantly affected the amount of cadmium, which varied by tissue, location and sampling time. The highest level of cadmium was recorded in meat tissue in Sulaimanyah city centre during October, while the lowest was recorded in liver tissue in Said Sadiq district during October. The three-way interaction did not significantly affect the amounts of lead and cobalt in the different sheep tissues.
Relevant answer
Answer
Very important and useful research.
  • asked a question related to Estimation
Question
2 answers
As far as I know, there are four steps for recommending the N fertilizer rates based on NDVI readings and grain yield at N rich strip and farmer practice plot.
The steps are as follows:
1- Estimating Response Index (RI) RI=Ypn/Ypo
2- Estimating Ypo (yield at farmer practice strip)
3- Estimating Ypn (yield at N rich strip) based on RI, i.e., Ypn=Ypo*RI
4- N fertilizer rate recommendation using the formula:
NFRate=(N uptake at Ypn - N uptake at Ypo)/NUE
If the above-mentioned steps are correct, I would like to know which variable I should use as the independent variable to estimate Ypo: NDVIo (NDVI at the farmer practice strip) or INSEY (in-season estimated yield based on NDVI, i.e., NDVI divided by the number of days from planting to sensing)?
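For concreteness, a minimal numeric sketch of step 4 only, where N uptake is approximated as yield times grain N concentration; every number below is hypothetical, not from a calibration:
```
# All values are placeholders for illustration of the formula in step 4.
Ypo     <- 4.0    # t/ha, predicted yield at the farmer practice strip
RI      <- 1.3    # response index from the N-rich strip
Ypn     <- Ypo * RI
grain_N <- 0.02   # grain N concentration, 2 % (assumed)
NUE     <- 0.5    # fertilizer N use efficiency (assumed)
N_rate  <- (Ypn * grain_N - Ypo * grain_N) * 1000 / NUE   # kg N/ha
N_rate                                                     # 48 kg N/ha with these numbers
```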
Relevant answer
Answer
Thanks a million for the links.
  • asked a question related to Estimation
Question
4 answers
Example
Dominant Biceps Strength = 70%
Non-dominate Biceps Strength = 50%
Estimated loading prescription
for dominant Biceps as following :
1kg/ 8 times of Rep/ 30s/ 5 Sets/ 3m/ 3 sessions/week. Strengthening exercises
Kindly find out the loading prescription of non-dominate Biceps and the reason for choosing this loading ?
Relevant answer
Answer
It would be uncommon to prescribe a lower relative load to the weaker muscle. Why don't you do 1-RM testing on each arm and then prescribe the same training load (% 1RM) for each arm based on its own 1-RM... the weaker arm will ultimately have a lower absolute load, but the same relative load...
The National Strength and Conditioning Association uses the following RepMax table:
1 repetition max = 100% of 1RM
2 repetition max = 95% of 1RM
4 repetition max = 90% of 1RM
6 repetition max = 85% of 1RM
8 repetition max = 80% of 1RM
This would be a good starting point.... but remember that these are repetition max estimates and therefore training should be slightly below that for repeated sets. Also, it looks like you are just doing as many repetitions as possible during each 30 sec work bout, if I am reading it correctly. You may want to go on the lower end of the relative load, but 50-70% seems like a reasonable range. It depends on how many repetitions you want them to complete.... there is no prescribed load for your proposed protocol. Try it out on a few people and see where their repetition ranges fall and let that guide your decision!
Good luck!
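As a small numeric illustration of the "same relative load, different absolute load" idea from the answer above (the 1-RM values are hypothetical):
```
one_rm  <- c(dominant = 20, non_dominant = 14)  # kg, 1-RM measured separately per arm (made up)
pct_1rm <- 0.80                                 # roughly the 8-RM zone from the table above
round(one_rm * pct_1rm, 1)                      # prescribed absolute load for each arm, kg
```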
  • asked a question related to Estimation
Question
3 answers
I am in search of methods for the quantitative and qualitative estimation of cellulase in a food (biological) sample.
  • asked a question related to Estimation
Question
3 answers
Are there any references for Estimated Average Requirement (for minerals and vitamins) of infants less than 6 months?
Relevant answer
Answer
You can check this: https://www.nrv.gov.au/introduction for more information
  • asked a question related to Estimation
Question
23 answers
Hi everyone
I'm looking for a quick and reliable way to estimate my missing climatological data. My data are daily and span more than 40 years. They include minimum and maximum temperature, precipitation, sunshine hours, relative humidity and wind speed. My main problem is the sunshine-hours data, which have many gaps. These gaps are scattered throughout the time series; sometimes they span several months or even a few years. I work with 18 stations. Given that my data are daily, the number of missing values is high, so I need to estimate the missing data before starting work. Your comments and experiences would be very helpful.
Thank you so much for advising me.
Relevant answer
Answer
It is in French
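A minimal R sketch of one simple gap-filling approach that is often tried first: regression on a neighbouring station for long gaps, and temporal interpolation for short ones. The data frame `clim` and its column names are hypothetical, and real work would of course validate the fitted relationship per variable and season:
```
# Hedged sketch, assuming daily records from station A (with gaps) and a nearby station B.
library(zoo)

fit <- lm(sunshine_A ~ sunshine_B, data = clim, na.action = na.exclude)
clim$sunshine_A_filled <- ifelse(is.na(clim$sunshine_A),
                                 predict(fit, newdata = clim),  # regression estimate for gaps
                                 clim$sunshine_A)               # keep observed values

# short gaps in smoother variables could instead be interpolated in time:
clim$tmax_filled <- zoo::na.approx(clim$tmax, na.rm = FALSE)
```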
  • asked a question related to Estimation
Question
5 answers
Logistic regression as a type of distribution estimation:
what is the connection between estimation and distribution in machine learning?
Relevant answer
Answer
The logistic (sigmoid) distribution is one of the essential parts of statistics. For the specific machine-learning application, see https://machinelearningmastery.com/logistic-regression-for-machine-learning/
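The connection can be made concrete in a few lines: logistic regression is maximum-likelihood estimation of a Bernoulli distribution whose success probability is a logistic function of the predictors. A small simulated sketch (nothing here comes from the thread):
```
set.seed(42)
x <- rnorm(200)
p <- plogis(-0.5 + 1.2 * x)            # true P(y = 1 | x), a logistic function of x
y <- rbinom(200, size = 1, prob = p)   # Bernoulli outcomes drawn from that distribution

fit <- glm(y ~ x, family = binomial)   # fits the coefficients by maximising the Bernoulli likelihood
coef(fit)                              # estimates close to the true (-0.5, 1.2)
logLik(fit)                            # the maximised log-likelihood
```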
  • asked a question related to Estimation
Question
1 answer
I would like to bring together those interested in estimating, using different approaches, the true underlying number of SARS-CoV-2-infected individuals. This is important since this number gives us an idea of the undetected individuals spreading the infection and causing deaths among the elderly and individuals with preexisting health conditions.
Relevant answer
  • asked a question related to Estimation
Question
4 answers
Hi, I am looking to estimate willingness to pay for a choice model, but for the given alternatives compared to a base alternative rather than for the product-attribute variables. I have run a mixed logit model and have coefficients for the attribute variables as well as coefficients for case-specific variables for each option.
Relevant answer
Answer
Hi Louie! You can find what you need on Kenneth Train's website; in my opinion, he is the main reference on this topic. I leave you this link to his book; I especially recommend chapter six.
All the best in your work!
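For what it's worth, a common point estimate of WTP for an alternative (rather than an attribute) divides its alternative-specific constant by the negative of the cost coefficient. A tiny sketch with made-up coefficients; in a mixed logit with random coefficients, the ratio needs more care (simulation over the coefficient distributions), which is exactly what Train's book covers:
```
# All numbers hypothetical; this is only the simple point-estimate version of the ratio.
b_cost  <- -0.08              # cost coefficient
b_alt   <-  0.45              # alternative-specific constant relative to the base alternative
wtp_alt <- -b_alt / b_cost    # willingness to pay for that alternative, in money units
wtp_alt                       # about 5.6 with these numbers
```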
  • asked a question related to Estimation
Question
12 answers
Can anyone suggest which optimization/estimation techniques are good for solar photovoltaic cells?
Relevant answer
You have three main parameters to maximize in order to maximize the power conversion efficiency (PCE) of the solar cell.
It is so that PCE = Isc · Voc · FF / incident solar power (AM1.5).
So, as you see, the three factors are the short-circuit current Isc,
the open-circuit voltage Voc and the fill factor FF.
One has to adjust the physical and technological parameters of the solar cell to achieve the maximum of these three quantities.
One has to make the absorber thickness at least equal to the absorption depth of the longest radiation wavelength, so d >= 1/alpha(lambda_max),
where alpha is the absorption coefficient.
The thickness d must at the same time be made less than the diffusion length of the minority carriers L; so, d <= L.
One has to minimize the reverse saturation current Is of the dark current to achieve the highest Voc, such that:
Voc = n Vt ln(Isc/Is)
Is is minimized by reducing the injection across the junction and by increasing the minority carrier lifetime throughout the device.
In order to increase the fill factor one has to reduce the series resistance Rs and increase the shunt resistance Rsh.
In this way one gets the highest efficiency.
For more information about building the solar cell to achieve the highest efficiency please refer to the book chapter:
Best wishes
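A small numeric illustration of the efficiency formula above (all values are hypothetical; current is expressed as a density so the cell area cancels):
```
Jsc <- 35e-3    # A/cm^2, short-circuit current density (assumed)
Voc <- 0.65     # V, open-circuit voltage (assumed)
FF  <- 0.80     # fill factor (assumed)
Pin <- 100e-3   # W/cm^2, incident AM1.5 power
PCE <- Jsc * Voc * FF / Pin
PCE * 100       # power conversion efficiency in percent, about 18 % here
```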
  • asked a question related to Estimation
Question
3 answers
In irrigation science, the net depth of irrigation (NDI) is estimated from the equation
NDI=RZD*WHC*PD%
RZD: root zone depth, mm
WHC: water holding capacity, mm of water per cm of soil profile
PD%: percentage of depletion, %
In general, PD is limited to between 40 and 60%.
If there is any other acceptable ratio of depletion in any irrigation system, please inform me.
Best Regards
Relevant answer
Answer
Dear Isam Mohammed Abdulhameed many thanks for asking this very interesting technical question. In addition to the relevant literature references suggested by Mohamed-Mourad Lafifi please also have a look at the following potentially useful link:
CHAPTER 6: Irrigation scheduling
This book chapter is freely available as public full text on the internet (please see the attached pdf file).
Please also see the following irrigation depth calculator:
Good luck with your work and best wishes!
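For concreteness, a small worked example of the formula in the question; the numbers are hypothetical, and the root zone depth is taken in cm here so that mm of water per cm of soil times cm gives mm:
```
RZD <- 60     # root zone depth, cm (assumed)
WHC <- 1.5    # water holding capacity, mm of water per cm of soil (assumed)
PD  <- 0.5    # allowable depletion, 50 %
NDI <- RZD * WHC * PD   # net depth of irrigation, mm
NDI                      # 45 mm with these numbers
```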
  • asked a question related to Estimation
Question
4 answers
I am a bit wary of asking this question given my ignorance, and I am also not too sure whether it might bring out some emotional response.
Assume I want to have an estimate on the average R-squared and variability from literature for specific models y~x. I found 31 literature sources.
The questions are twofold :
1.) Can I shift the simulated draws of an ABC-rejection algorithm and act as if they indeed come from my target (see the first 4 figures)?
The parameter in this case is drawn from a prior that deviates from the target and is then shifted so that it fits.
2.) I applied 4 methods in this case: ABC-rejection (a flat prior, not really preferred), Bayesian bootstrap, classical bootstrap and a one-sided t-test (lower 4 figures). From all methods I extracted the 2.5-97.5% intervals. Given the information below, is it reasonable to go for the Bayesian bootstrap in this case?
As sometimes suggested on RG, and hidden deeply in some articles, the intervals from the different methods converge and are more-or-less-ish similar. However, I do have another, smaller dataset which is also skewed, so I would personally prefer the Bayesian bootstrap as it smooths things out, and extreme exactness does not matter too much to me here (to me it seems pot`a`to - pot`aa`to for either method, disregarding the philosophical meaning). Based on these results my coarse guesstimate of the average variability would range from ~20-30%. I also would like to use the individual estimates returned by each bootstrap, and technically they are not normally distributed (Beta-ish), although this does not seem to matter much in this pragmatic case.
Thank you in advance.
Relevant answer
Answer
I agree with David: use the mean.
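Since the Bayesian bootstrap came up, here is a minimal sketch of how it can be run for the mean of a set of study-level R-squared values; the 31 values below are simulated placeholders, not the actual literature values:
```
set.seed(1)
r2 <- rbeta(31, 5, 15)                      # 31 simulated study-level R-squared values
B  <- 10000
bb <- replicate(B, {
  w <- rexp(length(r2)); w <- w / sum(w)    # Dirichlet(1,...,1) weights on the studies
  sum(w * r2)                               # weighted mean under those weights
})
quantile(bb, c(0.025, 0.975))               # Bayesian-bootstrap interval for the mean R-squared
```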
  • asked a question related to Estimation
Question
7 answers
Hi,
Even though Artemis gives an error bar on the EXAFS fitting, it assumes the value to be zero at certain R values.
In the case of data with some experimental noise, the error bar needs to be corrected, weighted by the square root of the reduced chi-squared value, taking into account the experimental noise for each R-space spectrum from 15 to 25 Å, as described in
How do calculate the uncertainties in the coordination number when EXAFS is fitted using Artemis?
Thanks in advance
Relevant answer
Answer
Thanks, Gerhard Martens, it was indeed insightful.
I have one more case where relatively good data (up to k = 14) gives the following fit (see attached).
It still shows a huge error bar; can that be resolved somehow?
Thanking you,
  • asked a question related to Estimation
Question
2 answers
I am running K estimator for a gene family. I found that the K estimator is only working for certain gene combinations. It failed to estimate Ka and Ks values for some gene combinations. Is it possible that the other combinations are distantly related?
Relevant answer
Answer
You may try the software KaKs calculator.
  • asked a question related to Estimation
Question
4 answers
I would like to measure the air pollution caused by stone mines in the surrounding areas. What is the best method to identify sampling points for this?
Relevant answer
Answer
It depends on the number of mining areas. If they are few, there is no need for sampling, but if there are many, then you can use Fisher's method for sampling. https://www.geopoll.com/blog/sample-size-research/
  • asked a question related to Estimation
Question
1 answer
Dear all,
The complex I am trying to simulate has protein ( 10 chains), DNA, and RNA I ran the simulation for 2 us so far but the biological event I want to observe requires a very long MD simulation. I decided to accelerate the sampling using the AWH method. I am not familiar enough with the AWH method or other methods such as umbrella sampling. I have watched the two webinars on AWH but I am still not sure about the mdp options. Attached is the mdp I used and the error I got!
I believe that I need to choose a reference atom for pulling, but on what basis?
pull = yes ; The reaction coordinate (RC) is defined using pull coordinates.
pull-ngroups = 12 ; The number of atom groups needed to define the pull coordinate.
pull-ncoords = 7 ; Number of pull coordinates.
pull-nstxout = 1000 ; Step interval to output the coordinate values to pullx.xvg.
pull-nstfout = 0 ; Step interval to output the applied force (skip here).
pull-group1-name = Protein_chain_A ; Name of pull group 1, corresponding to an entry in an index file.
pull-group2-name = Protein_chain_B ; Same, but for group 2.
pull-group3-name = Protein_chain_C
pull-group4-name = Protein_chain_D
pull-group5-name = Protein_chain_E
pull-group6-name = Protein_chain_F
pull-group7-name = Protein_chain_G
pull-group8-name = Protein_chain_H
pull-group9-name = Protein_chain_I
pull-group10-name = Protein_chain_J
pull-group11-name = RNA
pull-group12-name = DNA
pull-group1-pbcatom = 0
pull-group2-pbcatom = 0
pull-group3-pbcatom = 0
pull-group4-pbcatom = 0
pull-group5-pbcatom = 0
pull-group6-pbcatom = 0
pull-group7-pbcatom = 0
pull-group8-pbcatom = 0
pull-group9-pbcatom = 0
pull-group10-pbcatom = 0
pull-group11-pbcatom = 0
pull-group12-pbcatom = 0
pull-coord1-groups = 1 2 ; Which groups define coordinate 1? Here, groups 1 and 2.
pull-coord2-groups = 3 4
pull-coord3-groups = 5 6
pull-coord4-groups = 7 8
pull-coord5-groups = 9 10
pull-coord6-groups = 11 12
pull-coord7-groups = 11 12
pull-coord1-geometry = distance ; How is the coordinate defined? Here by the COM distance.
pull-coord1-type = external-potential ; Apply the bias using an external module.
pull-coord1-potential-provider = AWH ; The external module is called AWH!
awh = yes ; AWH on.
awh-nstout = 50000 ; Step interval for writing awh*.xvg files.
awh-nbias = 1 ; One bias, could have multiple.
awh1-ndim = 1 ; Dimensionality of the RC, one dimension per pull coordinate.
awh1-dim1-coord-index = 1 ; Map RC dimension to pull coordinate index (here 1 -> 1).
awh1-dim1-start = 0.25 ; Sampling interval min value (nm).
awh1-dim1-end = 0.70 ; Sampling interval max value (nm).
awh1-dim1-force-constant = 128000 ; Force constant of the harmonic potential (kJ/(mol*nm^2)).
awh1-dim1-diffusion = 5e-5 ; Estimate of the diffusion (nm^2/ps), used to set the initial update size, i.e. how quickly the system moves.
awh1-error-init = 5 ; Estimate of the initial error, used to set the initial update size.
awh-share-multisim = yes ; Share bias across simulations.
awh1-share-group = 1 ; Non-zero share group index.
ERROR 12 [file AWH3.mdp]: When the maximum distance from a pull group reference atom to other atoms in the group is larger than 0.5 times half the box size a centrally placed atom should be chosen as pbcatom. Pull group 12 is larger than that and does not have a specific atom selected as reference atom.
Any advice you could give would be much appreciated
Thank you so much!
Amnah
Relevant answer
Answer
AWH calculates the free energy along an order parameter of the system. Free energy barriers are overcome by adaptively tuning a bias potential along the order parameter such that the biased distribution along the parameter converges toward a chosen target distribution. The fundamental equation governing the tuning is: log(target) = bias - free energy, where the bias and free energy are initially unknown. Typically the target distribution is simply chosen uniform, such that the bias completely flattens the free energy landscape.
Regards,
Shafagat
  • asked a question related to Estimation
Question
6 answers
I want to compare the efficiencies of different machine learning techniques for Structural reliability and surrogate modelling. Although my problem is specific, I think there should be well-known criteria for that. Unfortunately, I have not found any in the literature!
One simple idea is to multiply accuracy by the number of calls or time, but it is really not a very proper criterion most of the time.
How can we define good criteria? is it a good idea to go with a weighted multiplication based on a specific objective? or is there any well-known method for making this comparison?
I appreciate your help with this challenge!
Relevant answer
Answer
You're very welcome, Sajad Saraygord Afshari
Kind Regards
  • asked a question related to Estimation
Question
4 answers
I am implementing an unscented Kalman filter for parameter estimation and I found its performance strongly dependent on the initialization of the parameter to estimate.
In particular, I don't understand why, if I initialize the parameter below the reference value (the actual value to estimate), I get good performance, but if the parameter is initialized above the reference value, performance is very poor and the estimate does not converge but keeps increasing.
In other words, it seems that the estimation can only make the parameter increase. Is there any mathematical reason behind this?
Relevant answer
Answer
Dear Mr. Magnani,
I agree with Janez Podobnik. I assume you are using the Joint Unscented Kalman Filter for this purpose. In this case, the parameter is considered as a state and all the initial conditions, noise and measurement covariances will affect the estimation of the parameter. In 2019, I have proposed a different approach for this purpose and I am attaching the article link below:
The main advantage of this approach is you can determine weights for the measurements (a Jacobian approach) for parameter estimation. If one of the measurements significantly affects the parameter estimation, then, you can increase the weight of that measurement. The only drawback of this approach is that the estimated parameter must be linear in terms of measurements. The system itself can be a nonlinear system, but parameter(s) must be linear in terms of measurements. For frequency estimation, we have modified this approach with my colleagues (link below). In this case, the parameter is no longer linear in terms of measurements. However, we haven't tested this approach for other nonlinear systems in which the parameter(s) is nonlinear in terms of measurements.
You may try this approach and it can be useful for your aim.
Best regards,
Altan
  • asked a question related to Estimation
Question
2 answers
I am trying to estimate nitrite from human serum using Griess reagent from Sigma. Sodium nitrite standards are used. Estimation of nitrite is done at 540 nm. Serum was processed in the following different ways before estimation.
1) Serum deproteinized with 92 mM Zinc Sulphate.
2) Serum deproteinized with 40% ethanol.
3) Serum deproteinized and reduced with Vanadium Chloride III.
I'm unable to measure nitrite from processed and unprocessed serum samples as suggested in literature. Kindly suggest your views.
Relevant answer
Answer
Dear Nandi
Although there are various methods for the determination of NOx, the simplicity, rapidity, and cheapness of the Griess assay have made this method more popular than others. Deproteinization is a necessary step when measuring its concentration in serum, mostly because of the turbidity resulting from protein precipitation in an acidic environment. I consider acetonitrile an adequate option for obtaining a good sample extract: acetonitrile is mixed 1:1 with serum, vortexed for 1 min and centrifuged at 10000 × g for 10 min at 4 °C, and the supernatant is used for NOx determination.
  • asked a question related to Estimation
Question
13 answers
The Nyquist-Shannon theorem provides an upper bound for the sampling period when designing a Kalman filter. Leaving apart the computational cost, are there any other reasons, e.g., noise-related issues, to set a lower bound for the sampling period? And, if so, is there an optimal value between these bounds?
Relevant answer
Answer
More samples are generally better until such point as the difference in the real signal between samples is smaller than the quantization or other noise. At that point, especially with quantization, it may be a point of diminishing returns.
The other thing that nobody mentions is that faster sampling means less real-time processing time. In many systems, it's not really an issue as the time constants of the physical system are so slow as to never challenge the processing. In others, say high-speed flexible mechatronic systems, the required sample rates may challenge the number of processing cycles available to complete the task.
Generally, the best bet is to return to the physical system's time constants and (if possible) sample 20-100x as fast as them.
  • asked a question related to Estimation
Question
4 answers
I want to estimate the TOA and CFO of an LTE signal. The TOA is estimated from the oversampled signal; after TOA estimation, the CFO is estimated. Please let me know why the estimated CFO is wrong.
Relevant answer
Answer
What is the approach used to achieve estimation?
  • asked a question related to Estimation
Question
12 answers
I have designed the mathematical model of the plant with a nonlinear hysteresis function f(x1), and it is validated using simulation. Now I want to design a nonlinear observer to estimate the speed (x2). Note that I have also modeled the nonlinear function in the model.
My state space model of the plant is
x1_dot = x2
x2_dot = q*x1 + c*x2 + f(x1) + u
Please suggest suitable observer to estimate the angular speed x2.
x1 is the angular position of the plant.
Relevant answer
Answer
The high-gain observer (HGO) is a good technique for estimating the states of nonlinear systems, and it also satisfies the separation principle. I think an HGO would be better here.
  • asked a question related to Estimation
Question
3 answers
If anyone knows about the estimation of population parameters for panel data, kindly recommend some literature or web links.
Relevant answer
Answer
  • asked a question related to Estimation
Question
4 answers
What is the difference between Maximum Likelihood Sequence Estimation and Maximum Likelihood Estimation? Which one is a better choice in the case of channel non-linearities? And why and how does oversampling help with this?
Relevant answer
Answer
The Maximum Likelihood Estimation (MLE) is a method of estimating the parameters of a specific model. It selects the set of values of the model parameters that maximizes the likelihood function. Intuitively, this maximizes the "agreement" of the selected model with the observed data.
  • asked a question related to Estimation
Question
8 answers
Hello,
using the Cholesky decomposition in the UKF introduces the possibility that the UKF fails if the covariance matrix P is not positive definite.
Is this an unavoidable fact? Or is there any method to completely bypass that problem?
I know there are some computationally more stable algorithms, like the square-root UKF, but even they can fail.
Can I say that the problem of the Cholesky decomposition failing occurs only for bad estimates during my filtering, i.e., when even an EKF would fail/diverge?
I want to understand whether the UKF is advantageous over the EKF not only in terms of accuracy, but also in terms of stability/robustness.
Best regards,
Max
Relevant answer
Answer
If I understand your question correctly, it concerns not the initial covariance matrix but rather the updated covariance matrix you get at the end of each Kalman iteration.
If such a condition arises, you may use Higham's method to find an approximate positive-definite covariance matrix.
Reference:
Computing a nearest symmetric positive semidefinite matrix - ScienceDirect
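A small R illustration of the repair step mentioned above (a Higham-type nearest positive-definite projection, as implemented in Matrix::nearPD); the covariance matrix here is made up:
```
library(Matrix)
P <- matrix(c(1.0, 0.9, 0.7,
              0.9, 1.0, 0.3,
              0.7, 0.3, 1.0), 3, 3)
P[1, 1] <- 0.5                              # perturb so P is no longer positive definite
P_fixed <- as.matrix(nearPD(P, corr = FALSE)$mat)  # nearest positive (semi)definite matrix
chol(P_fixed)                               # the Cholesky factorization now succeeds
```
In a filter this would be applied to the updated covariance just before the sigma-point (Cholesky) step, at the cost of a small, controlled distortion of P.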
  • asked a question related to Estimation
Question
4 answers
In Bayesian inference, we have to choose a prior distribution for the parameter in order to find the Bayes estimate, and this choice depends on our beliefs and experience.
I would like to know what steps or rules we should follow when choosing a prior distribution for a parameter. Please help me with this so that I can proceed.
Relevant answer
Answer
Thanks, Sabahaldin Abdulqader Hussain, for explaining this. I understood that we can look at the properties of a parameter of a distribution in terms of its support, and so on. But what happens when we work with real data and, in that case, we do not have any prior information?
  • asked a question related to Estimation
Question
1 answer
I am currently working on a project where I am testing a Pairs Trading Strategy based on Cointegration. In this strategy I have 460 possible stock pairs to choose from every day over a time frame of 3 years. I am using daily Cointegration test and get trade signals based off of that to open or close a trade. According to this strategy I am holding my trades open until either the take profit condition (revert to mean of spread) or stop loss condition (spread exceeding [mean + 3* standard deviation]) holds. This means that some trades might be open a couple of days, others might be open for weeks or even months.
My question is now: How can i calculate the returns of my overall strategy?
I know how to calculate the returns per trade but when aggregating returns over a certain time period or over all traded pairs I have problems.
Let's say I am trying to calculate returns over 1 year. I could take the average of all the trade returns or calculate sum(profits per trade of each pair)/sum(invested or committed capital per trade), both of these would only give me some average return values.
Most of my single trades are profitable but in the end I am trying to show how profitable my whole trading strategy is, so I would like to compare it to some benchmark, but right now I don't really know how to do that.
One idea I had, was to possibly estimate the average daily return of my trading strategy by:
  1. Estimating daily return per trade: (return of trade)/(number of days that trade was open)
  2. Taking the average of all the daily returns per trade
Then finally I would compare it to the average daily return of an index over the same time frame.
Does this make any sense or what would be a more appropriate approach?
Relevant answer
Answer
I think the idea is good; it makes sense to calculate the average daily return of your trading strategy by:
1. Estimating the daily return per trade: (return of the trade) / (number of days during which the trade was open);
2. Taking the average of all daily returns per trade.
Then compare this with the average daily return of the index over the same period of time. In this case, the index is the base or reference, and you estimate the deviation from it. You can also test a hypothesis about the differences and whether they are statistically significant.
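A tiny sketch of that averaging idea; the trade returns and holding periods below are made up:
```
trades <- data.frame(ret       = c(0.021, -0.008, 0.015, 0.032),  # return per closed trade
                     days_open = c(5, 12, 3, 40))                  # holding period in days
daily_ret_per_trade <- trades$ret / trades$days_open
mean(daily_ret_per_trade)   # average daily return of the strategy
# this can then be compared with the benchmark index's average daily return over the
# same window, or the difference tested formally (e.g. a t-test on the daily differences)
```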
  • asked a question related to Estimation
Question
3 answers
Can anyone offer sources that offer error (or uncertainty) bands to be applied to the annual energy production (AEP) of wind turbines that are calculated from mean annual wind speeds?
AEPs are reported to carry considerable errors with up to 15% uncertainty on power curve determination and up to 20% on wind resources [1], so if anyone has a source on how to treat the compound uncertainty please can you advise?
[1] Frandsen, S. et. al. Accuracy of Estimation of Energy Production from Wind Power Plants.
Relevant answer
Answer
Dear Mohammad Royapoor,
The terrain complexity, local roughness, the existence of obstacles and the distance of each turbine from the meteorological towers are among the factors that determine the magnitude of uncertainties. The range of uncertainty can be very wide, but a typical range is 3% - 6%.
For more details and information about this subject, I suggest you to see links and attached files on topic.
Best regards
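If the two uncertainty sources are treated as independent relative errors, a common first approximation combines them in quadrature; a minimal sketch using the upper bounds quoted in the question (a full treatment would propagate the wind-speed uncertainty through the nonlinear power curve rather than assume independence):
```
u_power <- 0.15                         # relative uncertainty from the power curve
u_wind  <- 0.20                         # relative uncertainty from the wind resource
u_aep   <- sqrt(u_power^2 + u_wind^2)   # combined relative uncertainty on AEP
u_aep                                   # 0.25, i.e. roughly 25 %
```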
  • asked a question related to Estimation
Question
5 answers
I've got this 5-year Direct Normal Irradiance data set I downloaded from https://re.jrc.ec.europa.eu/. I ordered it by year and by every 24 hours because I need the DNI curve at its maximum in summer, its minimum in winter and the average in spring and autumn. The thing is that, because not all days are sunny, the data are noisy and I can't use simple "max" and "min" algorithms. Which algorithms do you recommend to filter and correct the data? To achieve this graph, I used LabVIEW to order the data.
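One option sometimes used instead of the raw max/min, sketched here in R under the assumption that the data were exported into a data frame (`dni_data` and its columns are hypothetical names): take high and low quantiles per time-of-day and season, which suppress cloudy-day noise while approximating the clear-sky envelope.
```
library(dplyr)
curves <- dni_data %>%                      # expects columns: season, hour, dni
  group_by(season, hour) %>%
  summarise(envelope_high = quantile(dni, 0.95),   # near-clear-sky (summer max) curve
            envelope_low  = quantile(dni, 0.05),   # low-irradiance (winter min) curve
            typical       = mean(dni),             # average curve for spring/autumn
            .groups = "drop")
```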
  • asked a question related to Estimation
Question
2 answers
I'm using the lme4 package [lmer() function] to estimate several average models, and I want to plot their estimated coefficients. I found the document "Plotting Estimates (Fixed Effects) of Regression Models, by Daniel Lüdecke", which explains how to plot estimates, and it works with average models, but it uses the conditional-average values instead of the full-average values.
Model script:
library(lme4)
options(na.action = "na.omit")
PA_model_clima1_Om_ST <- lmer(O.matt ~ mes_N + Temperatura_Ar_PM_ST + RH_PM_ST + Vento_V_PM_ST + Evapotranspiracao_PM_ST + Preci_total_PM_ST + (1|ID), data=Abund)
library(MuMIn)
options(na.action = "na.fail")
PA_clima1_Om_ST <- dredge(PA_model_clima1_Om_ST)
sort.PA_clima1_Om_ST <- PA_clima1_Om_ST[order(PA_clima1_Om_ST$AICc),]
top.models_PA_clima1_Om_ST <- get.models(sort.PA_clima1_Om_ST, subset = delta < 2)
model.sel(top.models_PA_clima1_Om_ST)
Avg_PA_clima1_Om_ST <- model.avg(top.models_PA_clima1_Om_ST, fit = TRUE)
summary(Avg_PA_clima1_Om_ST)
Plot script:
library(sjPlot)
library(sjlabelled)
library(sjmisc)
library(ggplot2)
data(efc)
theme_set(theme_sjplot())
plot_model(Avg_PA_clima1_Om_ST, type="est", vline.color="black", sort.est = TRUE, show.values = TRUE, value.offset = .3, title= "O. mattogrossae")
The plot it creates uses the conditional-average values instead of the full-average values. How can I plot estimates of average models using the full-average values?
Thanks for your time and help
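One workaround, assuming the full-average coefficient table is exposed by MuMIn as summary(...)$coefmat.full (this accessor and its column order may differ between MuMIn versions), is to build the plot by hand rather than through plot_model():
```
library(ggplot2)
full_tab <- as.data.frame(summary(Avg_PA_clima1_Om_ST)$coefmat.full)
full_tab$term <- rownames(full_tab)
full_tab$est  <- full_tab[[1]]   # assumed: first column is the full-average estimate
full_tab$se   <- full_tab[[2]]   # assumed: second column is its standard error

ggplot(full_tab, aes(x = est, y = reorder(factor(term), est))) +
  geom_point() +
  geom_errorbarh(aes(xmin = est - 1.96 * se, xmax = est + 1.96 * se), height = 0.2) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  labs(x = "Estimate (full average)", y = NULL, title = "O. mattogrossae")
```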
Relevant answer
Answer
Hello, can anyone help me to interpret the GLM results about estimated value, intercept, coefficient, AICc, delta AICc, loglik, weight etc? What do the Null deviance and Residual deviance represent?
  • asked a question related to Estimation
Question
1 answer
Dear community,
I am facing some issues and scientific questionings regarding the dose-response analysis using the drc package from R.
Context :
I want to know if two strains have different responses to three drugs. To do so, I am using the dcr package from R. I then determine the EC50 for each strain regarding each drugs. I later plot my EC50 and use the estimate and standard error to determine if the EC50 is statistically different between strains. For each strain and drug I have four technical replicates and I will have three biological replicates. Visually, the model produced by the package matches my experimental data. However, I am looking for a statistical approach to determine if the model given by drc is not too far from my experimental data. How to know if I can be confident in the model ?
My approach :
I am using mselect() to determine which drc model is the most appropriate for my data. However, I do not know how to interpret the results. I read that the higher the logLik is, the better the model describes the data provided. But do you know whether a threshold exists?
For example I have from the mselect() :
> mselect(KCl96WT.LL.4, list(LL.3(), LL.5(), W1.3(), W1.4(), W2.4(), baro5()), linreg = TRUE)
logLik IC Lack of fit Res var
LL.3 101.90101 -195.8020 0 0.0003878212
LL.3 101.90101 -195.8020 0 0.0003878212
W1.3 101.53204 -195.0641 0 0.0003950424
W2.4 102.48671 -194.9734 0 0.0003870905
LL.5 103.05880 -194.1176 0 0.0003869226
W1.4 101.52267 -193.0453 0 0.0004062060
Cubic 101.42931 -192.8586 NA 0.0004081066
baro5 101.98930 -191.9786 0 0.0004081766
Lin 96.45402 -186.9080 NA 0.0004958264
Quad 96.64263 -185.2853 NA 0.0005044474
I also used the glht() element and coeftest(KCl96WT.LL.4, vcov= sandwich). But I am facing the same issue.
> coeftest(KCl96WT.LL.4, vcov= sandwich)
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
b:(Intercept) 4.3179185 0.6187043 6.979 3.024e-08 ***
d:(Intercept) 0.0907908 0.0080186 11.323 1.397e-13 ***
e:(Intercept) 0.9809981 0.0686580 14.288 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Do you know what approach could indicate if I can be statistically confident regarding my model? Can I be mathematically confident in the EC50 given by the package?
Thanks for your time! I am looking forward to discover new ways to be more critical regarding my data analysis. If you have any questions or comment regarding my approach, feel free to ask me !
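Two checks that are often used with drc fits, sketched here under the assumption that the fitted object is KCl96WT.LL.4 (both functions are from the drc package; there is no universal logLik threshold, so an absolute goodness-of-fit test is usually preferred):
```
library(drc)
# 1) lack-of-fit test comparing the LL.4 curve against a saturated one-way ANOVA;
#    a non-significant p-value suggests the dose-response model is not significantly
#    worse than the most flexible model for these data
modelFit(KCl96WT.LL.4)

# 2) EC50 with a delta-method confidence interval, to judge how precisely it is estimated
ED(KCl96WT.LL.4, 50, interval = "delta")
```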
  • asked a question related to Estimation
Question
1 answer
How might one go about estimating how recently an individual was infected with a given pathogen given antigen and antibody titers relevant to said pathogen?
Relevant answer
Answer
Sounds like an exam question. Go back to your text book and read about the time course of the immune response. Then the answer should be obvious.
  • asked a question related to Estimation
Question
5 answers
What do you think about estimating the level of automation? How should one carry out the process of estimating the level of automation, autonomy and intellectualization?
It is important to get a quantitative result.
1. Estimate the technical level of the equipment, software;
2. Classify (taxonomize) the processes / business processes and decide on the level of automation;
3. Estimate the completeness of automation by calculating the proportion of controlled signals from the norm;
4. The share of computer time from the total time of the operation.
Etc.
  • asked a question related to Estimation
Question
4 answers
Is it possible to estimate the shape, scale, and location parameters of a generalized extreme distribution (gev) if I just know the mean, variance, and median of a given data set (i.e., no raw data available - - just its descriptive statistics)?
Relevant answer
Answer
Christopher
You can use the maximum likelihood estimator, probability-weighted moments, or other methods. The choice depends on many aspects of the data. But first, try MLE.
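Since only the mean, variance and median are available (no raw data), one rough option is moment/quantile matching: solve for (mu, sigma, xi) so that the theoretical GEV mean, variance and median reproduce the reported values. A hedged R sketch, using the standard closed-form expressions (valid for xi != 0 and xi < 0.5) and hypothetical summary statistics:
```
gev_stats <- function(par) {
  mu <- par[1]; sigma <- par[2]; xi <- par[3]
  g1 <- gamma(1 - xi); g2 <- gamma(1 - 2 * xi)
  c(mean   = mu + sigma * (g1 - 1) / xi,
    var    = sigma^2 * (g2 - g1^2) / xi^2,
    median = mu + sigma * (log(2)^(-xi) - 1) / xi)
}
target <- c(mean = 100, var = 400, median = 95)   # the reported summary statistics (made up here)
obj <- function(par) sum(((gev_stats(par) - target) / target)^2)   # scaled squared mismatch
optim(c(95, 15, 0.1), obj, method = "L-BFGS-B",
      lower = c(-Inf, 1e-3, -0.45), upper = c(Inf, Inf, 0.45))$par  # rough (mu, sigma, xi)
```
With only three summary statistics there is no redundancy, so the fit cannot be checked internally; the result should be treated as indicative only.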
  • asked a question related to Estimation
Question
1 answer
Currently, I'm doing thesis using google earth engine to calculate the days of cessation of storms to CHIRPS Daily Precipitation Image Collection. The code flow is as follows:
Conceptually, it is like this. The first date of the image collection is set as day zero for the cessation of storms. For the image on the next date, the precipitation values are examined. If the precipitation is zero, the count of days since cessation of storms is incremented by one day, and it keeps incrementing until the precipitation on a following day becomes non-zero. In that case, the count of days since cessation of storms is reset to zero.
The result I've obtained is an image collection in which all the elements have calculated bands with a value of zero, which shouldn't be the case. I would like some suggestions on this problem, please.
Reference paper: Estimation of snow water equivalent from MODIS albedo for a non-instrumented watershed in Eastern Himalayan Region
Relevant answer
Answer
Very interesting questions.
  • asked a question related to Estimation
Question
2 answers
Dear all,
I would like to estimate prediction accuracy using EBVs (estimated breeding values) computed from pedigree-based and genomic-based models. In the models used to estimate those EBVs, I fitted a number of fixed effects (e.g. age, batch, tank, ...). I wonder whether re-fitting those fixed effects as predictors in the cross-validation will lead to overfitting? And if I use no predictors, how can I do cross-validation between the two sets of EBVs? Any suggestions?
Thanks and regards,
Vu
Relevant answer
Answer
Dear Mohamed-Mourad Lafifi Thanks for your suggestions.
My problem seems a little bit different, since I have used other software to predict the y values, and it cannot be incorporated into the R/Python environment. Regression models in the usual R cross-validation packages do not incorporate the relationship/genomic matrix of individuals that I have to use for genetic evaluation. I can still split the data by hand into reference and validation sets (5 folds, for example) and estimate the accuracy based on the correlation between the actual y and the predicted y in the validation set, but I cannot perform, for example, 3 replicates, as they would give the same y values. Kind regards, Vu.
  • asked a question related to Estimation
Question
3 answers
I have been reading this paper on how to analyze linear relationships using a repeated measure ANOVA: .
I was wondering, though: once you establish a linear relationship across the levels of your categorical variable (A, B, C, D), how can you check whether the differences across conditions A vs. B vs. C vs. D are also significant?
I have been using pairwise t-tests (A vs. B; B vs. C; C vs. D), but is there a better test to look at this?
Just for completeness, I have been using "ols" from "statsmodels" to check for the linear relationship, and "pairwise_ttests" from "pingouin" to run post-hoc tests in Python.
Relevant answer
Answer
There are contrast schemes (like Helmert, deviation, difference, etc.), each of which tests a different set of comparisons. Maybe "repeated" contrasts are what you are looking for: they test A vs. B, B vs. C, and C vs. D and nothing else. Just look it up.
  • asked a question related to Estimation
Question
9 answers
Is it possible to back-calculate/estimate the amount/dosage of a molecule consumed by looking at ante- or post- mortem toxicology blood levels? If you don't know, can you suggest a contact that might know? (I appreciate there will be issues around blood redistribution and site of sampling, etc.)
Relevant answer
Answer
Pharmacokinetic equations can be inverted, so you can calculate any unknown parameter from the others.
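A minimal one-compartment illustration of that inversion; every value below is hypothetical, and, as the question itself notes, post-mortem redistribution, sampling site and unknown timing make such back-calculations far less reliable in practice:
```
C_measured <- 2.0    # mg/L, measured blood concentration at time t (made up)
t          <- 6      # h since ingestion (assumed known)
t_half     <- 4      # h, elimination half-life (assumed literature value)
Vd         <- 50     # L, apparent volume of distribution (assumed)

k    <- log(2) / t_half
C0   <- C_measured * exp(k * t)   # back-extrapolated concentration at t = 0
dose <- C0 * Vd                   # mg, rough single-dose estimate under IV-bolus-like assumptions
dose
```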
  • asked a question related to Estimation
Question
5 answers
I have created three logistic models, models 4, 1 and 2, and calculated AICc values for each. Both model 4, with 2 covariates (location and camera), and model 1, with a single covariate (location), have approximately equivalent AICc values (less than 2 points apart). In this case one should choose the model with the fewest parameters, which is model 1 with only location included. However, to make things more confusing, the likelihood ratio tests for model 4 vs. 1 and model 4 vs. 2 suggest that having location and camera in the same model is better than having just location or just camera. This contradicts the AICc values. So which model would you choose? I provide an example below. Thanks in advance.
> # location as a covariate on abundance
> m1 <- occuRN(~1 ~location, final)
> m1
Call:
occuRN(formula = ~1 ~ location, data = final)
Abundance:
Estimate SE z P(>|z|)
(Intercept) 2.01 0.704 2.86 4.24e-03
location2  -2.19 0.547 -4.02 5.94e-05
Detection:
Estimate SE z P(>|z|)
-2.32 0.756 -3.07 0.00215
AIC: 162.7214
> # camera as a covariate on detection
> m2 <- occuRN(~camera ~1, final)
> m2
Call:
occuRN(formula = ~camera ~ 1, data = final)
Abundance:
Estimate SE z P(>|z|)
0.682 0.371 1.84 0.0657
Detection:
Estimate SE z P(>|z|)
(Intercept) -2.589 0.763 -3.392 0.000694
camera2 1.007 0.774 1.301 0.193247
camera3 2.007 0.785 2.557 0.010555
camera4 0.639 0.803 0.796 0.425864
AIC: 178.696
# camera as a covariate on detection, location as covariate on abundance
> m4 <- occuRN(~camera ~location, final)
> m4
Call:
occuRN(formula = ~camera ~ location, data = final)
Abundance:
Estimate SE z P(>|z|)
(Intercept) 2.71 0.319 8.49 2.06e-17
location2  -2.25 0.509 -4.41 1.03e-05
Detection:
Estimate SE z P(>|z|)
(Intercept) -4.050 0.616 -6.571 5.00e-11
camera2 1.030 0.620 1.660 9.69e-02
camera3 1.776 0.613 2.897 3.76e-03
camera4 0.592 0.642 0.922 3.57e-01
AIC: 157.2511
> model_list<-list(null,m4,m2,m1)
> model_names<-c("modelnull","model4","model2","model1")
> modelsel<-aictab(model_list, model_names, second.ord=T)
> modelsel
Model selection based on AICc:
            K    AICc  Delta_AICc  AICcWt  Cum.Wt      LL
model4      6  163.25        0.00    0.61    0.61  -72.63
model1      3  164.13        0.88    0.39    1.00  -78.36
modelnull   2  181.06       17.81    0.00    1.00  -88.20
model2      5  182.70       19.44    0.00    1.00  -84.35
Relevant answer
Answer
I have two criteria: the AICC and the SIC. Supposedly the one with the lowest AICC or SIC value is chosen, but which is better?
  • asked a question related to Estimation
Question
2 answers
Dear All
Can you help me clarify the basic differences among the following commands in Stata: xtmg, xtpmg, xtcce, xtdcce, and xtdcce2? When and which method should we choose to estimate MG, AMG, PMG, CCE, etc.?
Relevant answer
Answer
Thanks for your help Prof. Mohamed-Mourad Lafifi
  • asked a question related to Estimation
Question
1 answer
Estimation of Protein Concentration by Spectrophotometry
Relevant answer
Answer
I would prefer buffers such as PBS, TBS, PBS-T, and TBS-T for protein estimation, and would have my proteins diluted in the same buffers. And try not to make the dilution foamy.
Thank you.
Varsha
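For the direct UV (A280) route, the underlying arithmetic is just the Beer-Lambert law; a tiny sketch where the extinction coefficient and molecular weight are placeholders to be replaced with the values for the specific protein:
```
A280    <- 0.62     # measured absorbance at 280 nm (made up)
epsilon <- 43824    # M^-1 cm^-1, molar extinction coefficient (assumed; use your protein's value)
path    <- 1        # cm, cuvette path length
MW      <- 66000    # g/mol (assumed)

conc_M     <- A280 / (epsilon * path)   # molar concentration from Beer-Lambert
conc_mg_ml <- conc_M * MW               # converted to mg/mL
conc_mg_ml                              # about 0.93 mg/mL with these numbers
```
Dye-binding assays (Bradford, BCA) instead rely on a standard curve, but the concentration is still read off a fitted absorbance-concentration relationship in the same spirit.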
  • asked a question related to Estimation
Question
6 answers
I'm estimating a model using fixed effects in Stata, and when I use xtreg the time-invariant independent variables drop out.  But when I use individual dummies for fixed effects, the time-invariant dummies don't drop out.  I've double-checked the data and the time-invariant variables are truly time-invariant.  I don't understand what's going on here.
Relevant answer
Answer
@Ahmed Shah Mobariz: If a time invariant variable is your variable of interest, you can change the cross-sectional unit across which fixed effects is applied. For instance, if you are currently working on firm-level data and firm is your cross-sectional unit, you can apply industry fixed effects (instead of firm-fixed effects) by using the regress command and including industry dummies (i.Industry). Further, you may use other estimators as well like Fama-Macbeth (see the command asreg) or between effects estimator. However, both these estimators do not consider time and cross-sectional variation in data simultaneously and hence do not capture the essence of a panel data set.
  • asked a question related to Estimation
Question
13 answers
How can lactic acid be identified and estimated from a microbial fermentation broth using simple methods other than HPLC and GC?
If possible, methods such as:
1. Colorimetric or spectroscopic methods
and so on.
Relevant answer
Answer
8/31/20
Dear Vignesh,
Another way to quantitate lactic acid in a fermentation is by measuring the amount of NH4OH required to maintain the fermentation at a pre-determined pH value. See the attached document. We did this when we conducted a fermentation workshop several years ago. We fermented glucose w/ Lactobacillus casei (homo-fermentative) and monitored glucose consumption and lactic acid production over time. Lactic acid was determined using an automated enzyme-based method (YSI....very expen$ive instrument!!) and by titrating the culture fluid w/ NH4OH to maintain the pH (either 4.5 or 5.0) to allow the cells to continue growing. We compared moles Lactic Acid (LA) determined enzymatically (YSI) vs. NH4OH titration. The attached graph shows that the correspondence of LA production by the 2 methods was very good. The NH4OH measurement method is not very elegant, but it works. If may work for you assuming that the microbe you are using produces ONLY lactic acid, and not other acids. We could do this w/ L. casei as it produces only LA. It won't work if your microbe is a MIXED ACID fermenter.
I hope this information helps you.
Bill Colonna Dept. Food Science & Human Nutrition, Iowa State University, Ames, Iowa, USA wcolonna@iastate.edu
  • asked a question related to Estimation
Question
4 answers
Is there a method for estimating the OD of an E. coli culture without using a spectrophotometer? [I need to estimate an OD600 = 0.4-0.8 for electrocompetent E. coli]
Relevant answer
Answer
You can use a set of McFarland turbidity standards, which are commercially available.