
# Estimation - Science topic

Explore the latest questions and answers in Estimation, and find Estimation experts.

## Questions related to Estimation

Can the model be made linear, so that linear regression could be done?

Or should I estimate using nonlinear estimators?

Say one has some devices to measure the temperature of a room. The devices don't provide me with an accurate measurement. Some overshoot the actual value of the reading, others underestimate it. Using this set of inaccurate readings, is it possible for me to obtain a reading having high accuracy?
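If the devices can be treated as unbiased with roughly known noise levels, the classical answer is the inverse-variance weighted mean, whose variance is smaller than that of any single device (note that averaging reduces random error, not a shared systematic bias). A minimal sketch, with made-up noise levels:

```python
import random

random.seed(0)
true_temp = 21.0

# Hypothetical devices: zero-mean noise with known standard deviations
sigmas = [0.5, 1.0, 0.25]
readings = [true_temp + random.gauss(0, s) for s in sigmas]

# Inverse-variance weighted mean: the minimum-variance unbiased combination
weights = [1 / s**2 for s in sigmas]
estimate = sum(w * r for w, r in zip(weights, readings)) / sum(weights)

# The variance of the combined estimate beats every single device
combined_var = 1 / sum(weights)
assert combined_var < min(s**2 for s in sigmas)
```

With more devices or repeated readings, the combined variance keeps shrinking as the sum of the weights grows.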

Estimation of the number of acceptor molecules surrounding a given donor in the (Forster resonance energy transfer) FRET system

Which commands are used for dynamic panel logit/probit model estimation in Stata?

How to estimate the total flavonoid content from an endophytic fungal extract?

What is the best dynamic panel model when T>N ?

Hi,

Does anybody know how to extract the slopes' effect size of classes in Latent Class Growth Analysis using Mplus?

Thanks

Greetings!

I am conducting a series of CFAs in R using the 'lavaan' package. I am interested in estimating the correlations between the factors taking my measurement model into account, instead of going back to the raw data and summing the items representing each factor. In the lavaan output, I can find the covariances only. I can turn the covariances into correlations by dividing them by the product of the standard deviations, but due to the number of CFAs and the number of factors, I am wondering if there is a more streamlined way to do that in the lavaan syntax (or in another SEM package in R).

Any help would be greatly appreciated!

(* I checked the lavaan commands. I did not find something but I am fairly new to the package and I might have missed it. Furthermore, the only option in 'lavaan.Plot' is to plot covariances)

(**Thought: I used the variance standardization method in one example and the marker method in another. With the variance standardization method, the "estimate" column and the "std.all" column of my covariance table were identical; with the marker method they were not. I'm thinking that standardized covariances in lavaan should be the correlations, and that they are identifiable this way only under the marker method. Or not?)
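For what it's worth, lavaan's `standardizedSolution(fit)` (or `summary(fit, standardized = TRUE)`) reports `std.all` estimates, and for latent (co)variances the `std.all` value is the factor correlation. The underlying conversion is just cov_ij / (sd_i · sd_j); a minimal sketch of that arithmetic with a made-up covariance matrix:

```python
import numpy as np

# Hypothetical factor covariance matrix (as reported under "Covariances")
cov = np.array([
    [1.00, 0.40, 0.30],
    [0.40, 0.64, 0.20],
    [0.30, 0.20, 0.25],
])

# corr_ij = cov_ij / (sd_i * sd_j), with sd_i taken from the diagonal
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)

assert np.allclose(np.diag(corr), 1.0)   # unit diagonal after standardizing
```

Here `corr[0, 1]` comes out as 0.40 / (1.0 × 0.8) = 0.5, i.e. the factor correlation implied by the covariance.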

Need help regarding: Estimation of Soil Moisture Regimes using Remote Sensing. Either EIPC Model fits best? or there is something else that can be considered.

I am working on a research topic that employs estimation techniques.
I am trying to apply an algorithm to estimate system poles. I wrote an m-file and tried the technique on a simple transfer function to estimate its roots. Any suggestions about estimation techniques?
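As one concrete and deliberately simple technique, the poles of a linear system can be estimated from its impulse response by least-squares linear prediction (a Prony-style fit). This sketch uses a made-up second-order discrete-time system, not the asker's transfer function:

```python
import numpy as np

# Hypothetical system with true poles 0.5 and -0.3: its impulse response
# obeys h[n] = a1*h[n-1] + a2*h[n-2], where z^2 - a1*z - a2 = (z-0.5)(z+0.3)
a1, a2 = 0.2, 0.15
h = np.zeros(50)
h[0] = 1.0
h[1] = a1 * h[0]
for n in range(2, len(h)):
    h[n] = a1 * h[n - 1] + a2 * h[n - 2]

# Least-squares linear prediction: fit rows [h[n-1], h[n-2]] -> h[n]
H = np.column_stack([h[1:-1], h[:-2]])
y = h[2:]
coef, *_ = np.linalg.lstsq(H, y, rcond=None)

# Estimated poles = roots of the fitted characteristic polynomial
poles = np.sort(np.roots([1.0, -coef[0], -coef[1]]).real)
assert np.allclose(poles, [-0.3, 0.5])
```

With noisy data the same least-squares fit still works but the roots become approximate; subspace methods (e.g. matrix-pencil) are then the usual next step.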

Hi, I am looking for an available-potassium quantification method, but the suggested protocol requires a flame photometer, which I don't have.

Please suggest a good method for estimating available potassium in my soil samples without using a flame photometer. Also, please suggest a good method for estimating calcium and magnesium.

Thank You

Dear Researcher,

I have the following MATLAB code that generates an FMCW signal. However, I have two basic problems with the code, and I would appreciate your help/guidance in resolving them:

1. Based on my understanding, this code generates an FMCW signal for one target, as the dimension of sig is 1 x N; it must be L x N (L is the number of targets).

2. The dechirped signal at the receiver, which is analog, has to be converted to digital in my algorithm.

Note: I want to apply this type of signal (FMCW) to a Direction of Arrival (DOA) estimation algorithm.

Again, I highly appreciate your time and consideration in helping me overcome these uncertainties.

```
%% Code for generating an FMCW signal

% Compute hardware parameters from specified long-range requirements
fc = 77e9;                      % Center frequency (Hz)
c = physconst('LightSpeed');    % Speed of light in air (m/s)
lambda = freq2wavelen(fc,c);    % Wavelength (m)

% Set the chirp duration to be 5 times the max range requirement.
% In general, for an FMCW radar system, the sweep time should be at
% least five to six times the round-trip time.
rangeMax = 100;                 % Maximum range (m)
tm = 5*range2time(rangeMax,c);  % Chirp duration (s) = symbol duration (Tsym)

% Determine the waveform bandwidth from the required range resolution
rangeRes = 1;                   % Desired range resolution (m)
bw = rangeres2bw(rangeRes,c);   % Corresponding bandwidth (Hz)

% Set the sampling rate to satisfy both the range and velocity requirements
sweepSlope = bw/tm;                           % FMCW sweep slope (Hz/s)
fbeatMax = range2beat(rangeMax,sweepSlope,c); % Maximum beat frequency (Hz)
vMax = 230*1000/3600;                         % Maximum velocity of cars (m/s)
fdopMax = speed2dop(2*vMax,lambda);           % Maximum Doppler shift (Hz)
fifMax = fbeatMax + fdopMax;                  % Maximum received IF (Hz)
fs = max(2*fifMax,bw);                        % Sampling rate (Hz)

% Configure the FMCW waveform using the derived parameters
waveform = phased.FMCWWaveform('SweepTime',tm,'SweepBandwidth',bw,...
    'SampleRate',fs,'NumSweeps',2,'SweepDirection','Up');
% if strcmp(waveform.SweepDirection,'Down')
%     sweepSlope = -sweepSlope;
% end

N = tm*fs;      % Number of fast-time samples
Nsweep = 192;   % Number of slow-time samples
sigTx = waveform();

% Note: K, M, d, AOA_Degree, SNR and sigTgt are used below but are not
% defined in this snippet.
for i = 1:K
    doas_rad = AOA_Degree*pi/180;
    A = exp(-1i*2*pi*d*(0:M-1)'*sin(doas_rad(:).'));
    sigRx = A*sigTgt';
    sigRx = awgn(sigRx,SNR);    % awgn already returns signal + noise
    % Dechirp and convert to digital
    % DesigRx = dechirp(sigRx,sigREF);
    DechirpedSignal = sigTgt .* conj(sigRx);
end
```
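For reference, the dechirp-and-FFT step that the code comments describe can be sketched in a few lines. All parameters here are made up and this is not the MATLAB toolbox implementation: for a point target, the dechirped (IF) signal is a tone at beat frequency f_b = slope · 2R/c, so an FFT peak gives the range.

```python
import numpy as np

# All parameters are hypothetical, chosen only for illustration
c = 3e8                           # speed of light (m/s)
bw, tm = 150e6, 10e-6             # sweep bandwidth (Hz), chirp duration (s)
slope = bw / tm                   # sweep slope (Hz/s)
fs = 20e6                         # IF sampling rate after dechirping (Hz)
R = 60.0                          # true target range (m)

# Dechirping mixes TX with RX; for a point target the result is a tone
# at the beat frequency f_b = slope * 2R/c (here 6 MHz, below fs/2)
t = np.arange(0, tm, 1 / fs)
f_beat = slope * 2 * R / c
dechirped = np.exp(1j * 2 * np.pi * f_beat * t)

# Range estimate from the FFT peak of the dechirped signal
n_fft = 4096
spec = np.abs(np.fft.fft(dechirped, n_fft))
k = np.argmax(spec[: n_fft // 2])
R_hat = (k * fs / n_fft) * c / (2 * slope)
assert abs(R_hat - R) < c / (2 * bw)   # within one range-resolution cell (1 m)
```

For L targets, the dechirped row becomes a sum of L such tones, and stacking one row per target (or per receive antenna, for DOA) gives the L x N matrix the question asks about.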

(I also posted this on SO https://stackoverflow.com/q/71531275/16505198 and SE https://stats.stackexchange.com/q/568112/340994 but haven't received an answer yet, so here's another chance :-) The code snippets might be more readable there.)

Hello,

I am estimating an ordinal logistic regression under the proportional-odds assumption with the ordinal::clm() function. As a reproducible example, see this model from the "housing" dataset (`MASS::housing`):

```
clm(Sat ~ Type*Cont + Freq, data = housing, link = "probit") %>% summary()

formula: Sat ~ Type * Cont + Freq
data:    housing

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
TypeApartment           -0.14387    0.54335  -0.265    0.791
TypeAtrium               0.20043    0.55593   0.361    0.718
TypeTerrace              0.18246    0.55120   0.331    0.741
ContHigh                 0.05598    0.53598   0.104    0.917
Freq                     0.01360    0.01116   1.219    0.223
TypeApartment:ContHigh  -0.25287    0.78178  -0.323    0.746
TypeAtrium:ContHigh     -0.17201    0.76610  -0.225    0.822
TypeTerrace:ContHigh    -0.18917    0.76667  -0.247    0.805

Threshold coefficients:
            Estimate Std. Error z value
Low|Medium   -0.1130     0.4645  -0.243
Medium|High   0.7590     0.4693   1.617
```

If I want to test whether the main effect and the interaction term are (simultaneously!) significant, I use the glht function, where I test the hypothesis (bold for matrices or vectors) $\boldsymbol{K}\boldsymbol{\beta} = \boldsymbol{m}$.

So if I'd like to test whether living in an apartment (main effect) **plus** the interaction of living in an apartment and having high contact is significantly different from zero, it would be $(0, 0, 1, 0, 0, 0, 0, 1, 0, 0) \cdot \boldsymbol{\beta} = 0$ (treating the two thresholds as intercepts and thus as the first two estimates).

Is it right to test:

```
glht(mod, linfct = c("TypeApartment + TypeApartment:ContHigh = 0")) %>% summary()

Simultaneous Tests for General Linear Hypotheses

Fit: clm(formula = Sat ~ Type * Cont + Freq, data = housing, link = "probit")

Linear Hypotheses:
                                            Estimate Std. Error z value Pr(>|z|)
TypeApartment + TypeApartment:ContHigh == 0  -0.3967     0.6270  -0.633    0.527

(Adjusted p values reported -- single-step method)
```

or do I have to use:

```
glht(mod, linfct = c("TypeApartment = 0", "TypeApartment:ContHigh = 0")) %>% summary()

Simultaneous Tests for General Linear Hypotheses

Fit: clm(formula = Sat ~ Type * Cont + Freq, data = housing, link = "probit")

Linear Hypotheses:
                             Estimate Std. Error z value Pr(>|z|)
TypeApartment == 0            -0.1439     0.5434  -0.265    0.946
TypeApartment:ContHigh == 0   -0.2529     0.7818  -0.323    0.921

(Adjusted p values reported -- single-step method)
```

Thanks a lot in advance. I hope I posed the question clearly and understandably :-) If you have other options for testing whether a main effect and an interaction term are jointly significant, go ahead and tell me (and the others).

Thanks, Luise

I want to know which weight is better for estimating carbon: dry weight or wet weight?

And what is the conversion formula?
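One widely used convention (e.g., the IPCC default carbon fraction for woody biomass) takes carbon as roughly 47% of dry weight, so a wet weight must first be converted to dry weight via the moisture content. A hedged sketch, where the 0.47 carbon fraction and the example numbers are assumptions, not a universal constant:

```python
def carbon_from_wet_weight(wet_weight_g, moisture_fraction, carbon_fraction=0.47):
    """Estimate carbon (g) from wet biomass.

    carbon_fraction=0.47 follows the common IPCC default for woody biomass;
    values of roughly 0.45-0.50 appear in the literature, and the right
    fraction depends on the tissue being measured.
    """
    dry_weight = wet_weight_g * (1 - moisture_fraction)
    return dry_weight * carbon_fraction

# 100 g wet sample at 60% moisture -> 40 g dry matter -> 18.8 g carbon
assert abs(carbon_from_wet_weight(100, 0.60) - 18.8) < 1e-9
```

This is why dry weight is usually the preferred basis: it removes the highly variable moisture term from the conversion.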

```
 (E1U) (E2G) (E2G) (B2U) (A1G) (E1U) (E1U) (E2G)
 (E2G) (B2U)
 Requested convergence on RMS density matrix=1.00D-08 within 128 cycles.
 Requested convergence on MAX density matrix=1.00D-06.
 Requested convergence on energy=1.00D-06.
 No special actions if energy rises.
 SCF Done:  E(RB3LYP) = -13319.3349271  A.U. after 1 cycles
            Convg = 0.2232D-08   -V/T = 2.0097
 Range of M.O.s used for correlation: 1 5424
 NBasis= 5424 NAE= 1116 NBE= 1116 NFC= 0 NFV= 0
 NROrb= 5424 NOA= 1116 NOB= 1116 NVA= 4308 NVB= 4308
 PrsmSu: requested number of processors reduced to: 4 ShMem 1 Linda.
 PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
 ...
 PrsmSu: requested number of processors reduced to: 1 ShMem 1 Linda.
 Symmetrizing basis deriv contribution to polar:
 IMax=3 JMax=2 DiffMx= 0.00D+00
 G2DrvN: will do 1 centers at a time, making 529 passes doing MaxLOS=2.
 Estimated number of processors is: 3
 Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
 CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
 ...
 Calling FoFCou, ICntrl= 3107 FMM=T I1Cent= 0 AccDes= 0.00D+00.
 CoulSu: requested number of processors reduced to: 4 ShMem 1 Linda.
 Erroneous write. Write 898609344 instead of 2097152000.
 fd = 4
 orig len = 3177921600 left = 3177921600
 g_write
```

I asked the same respondents about variables X and Y for two different product types; respondents first answered the items for Product 1, then for Product 2. In AMOS, I have two different models (Product 1, Product 2). We proposed that the effect of X on Y is stronger for Product 1. We have two standardized estimates, and the first beta is greater than the second. Is this finding sufficient to confirm the hypothesis, or do I still need a chi-square difference test or something similar to establish statistical significance? If not, what should I do to test the statistical difference between these two betas?

Thanks ahead

An article from March 10, 2020 in Annals of Internal Medicine,

**The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application**, estimates a median incubation period of 5.1 days "(95% CI, 4.5 to 5.8 days), and 97.5% of those who develop symptoms will do so within 11.5 days (CI, 8.2 to 15.6 days) of infection." Are there any updated estimates or more recent reports?

I am using panel data and want to estimate the impact of regulation on firms' innovation through DID and PSM-DID approaches. I am able to calculate DID but not PSM-DID for panel data. Can anyone with experience please help?

Good evening everyone,

I am using time-series data in an ARDL model with 1 dependent, 1 independent, 1 control, and 2 dummy variables with an interaction, analyzed in EViews 10. But I don't understand where to place the control variable in the list of dynamic or fixed regressors of the ARDL estimation equation. Please guide me with your knowledge and experience; I will be very thankful to you.

I want to carry out phytochemical and morphological studies on Iranian willow (*Salix*) species, but the trees need to be the same age. Is there a way to measure the age of willow trees without cutting them down?

To generate a PCA from lcWGS SNPs, one may use ANGSD to generate genotype likelihoods and then use these as input to generate a covariance matrix using PCAngsd.

The covariance matrix generated by PCAngsd is an n x n matrix, where n is the number of samples and p is the number of SNPs (variables). According to the [PCAngsd tutorial](http://www.popgen.dk/software/index.php/PCAngsdTutorial#Estimating_Individual_Allele_Frequencies), the principal components (i.e. the samples plotted in the space defined by the eigenvectors) can be generated directly from this covariance matrix by eigendecomposition.

This is in contrast to the 'usual' way that PCA is done, where a p x p (*not* n x n) covariance matrix **C** is generated from a centered n x p data matrix **X**. Eigendecomposition of **C** then generates the eigenvectors and eigenvalues, and the transformed values of **X** in the space defined by the eigenvectors (i.e. the principal components) can be generated through a linear transformation of **X** with the eigenvectors (e.g. see [this](https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca) excellent example). The difference between the two methods appears to lie in the covariance matrix: with the PCAngsd method, the covariance matrix is n x n as opposed to the 'usual' p x p matrix.

So what is the difference between these two covariance matrices, and what is generated by the eigendecomposition of an n x n matrix? Is it really the sample principal components, or something else?
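The two decompositions are two views of the same SVD: the nonzero eigenvalues of the n x n (sample-by-sample, Gram-type) matrix and the p x p covariance matrix coincide, and the eigenvectors of the n x n matrix, suitably rescaled, are exactly the sample principal component scores. A small numerical check with simulated (made-up) data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 50                      # n samples, p variables (p >> n, as with SNPs)
X = rng.normal(size=(n, p))
X -= X.mean(axis=0)                # center each variable (column)

C_pp = (X.T @ X) / (n - 1)         # 'usual' p x p covariance of the variables
G_nn = (X @ X.T) / (n - 1)         # n x n sample-by-sample matrix

# The nonzero eigenvalues of the two matrices coincide (n-1 of them,
# since centering removes one dimension)
ev_pp = np.sort(np.linalg.eigvalsh(C_pp))[::-1][: n - 1]
ev_nn = np.sort(np.linalg.eigvalsh(G_nn))[::-1][: n - 1]
assert np.allclose(ev_pp, ev_nn)

# Eigenvectors of the n x n matrix, rescaled, ARE the PC scores:
w, V = np.linalg.eigh(G_nn)
scores_nn = V[:, ::-1] * np.sqrt(np.maximum(w[::-1], 0) * (n - 1))

# 'Usual' PCs: project X onto the eigenvectors of the p x p covariance
w2, U = np.linalg.eigh(C_pp)
scores_pp = X @ U[:, ::-1]

# Each pair of score columns is identical up to an arbitrary sign flip
for k in range(n - 1):
    a, b = scores_nn[:, k], scores_pp[:, k]
    assert np.allclose(abs(a @ b), np.linalg.norm(a) * np.linalg.norm(b))
```

So the eigendecomposition of the n x n matrix really does give the sample principal components directly, which is why PCAngsd can skip forming the p x p matrix entirely (a large saving when p is millions of SNPs).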

Dear Ladies and Gentlemen,

I am currently working as a Master's student on several publications that rely methodologically on latent profile analyses. In the context of these LPAs, I have repeatedly encountered the problem that calculating the BLRT test with Mplus (TECH14) is both time-consuming and often unsuccessful. In this specific case, an LPA with 30 Likert-scaled items (1-5) of the "Big Five" (OCEAN model) with a sample size of N = 738 was conducted. I would be interested to know which approach you prefer in your research work.

Q1: Do you increase the number of bootstraps and LRT starts as the number of profiles increases, or do you exclude the BLRT test when you encounter error messages and instead refer to the Loglik value, LMR test, and the BIC fit indicator?

So far I have tried the following settings for TECH14 according to the recommendations of the forum entries by Muthen & Muthen:

LRTSTARTS 0 0 100 20 / LRTSTARTS 0 0 500 50.

Both of these options will result in unsuccessful bootstrap draws if more than three profiles are calculated for the model.

Q2: Do you treat your Likert scaled items as interval scaled variables and use the MLR Estimator or do you treat your indicator items as ordinal variables and use the WLSMV Estimator?

In my case, attributing the items as categorical with the WLSMV Estimator leads already with two profiles to a "non-positive definite first-order derivative product matrix".

There seem to be conflicting opinions here. Brown (2006) writes "The WLSMV is a robust estimator which does not assume normally distributed variables and provides the best option for modeling categorical or ordered data".

On the other hand, Bengt.O. Muthen (2006) :

The most important aspect is how strong floor and/or ceiling effects you have. If they are strong, you may want to use a categorical approach.

Q3: Would any of you be willing to cross-check my syntax for comparing distal outcomes with the BCH approach? (See appendix)

Thanks in advance for your help.

Philipp Schulz

References:

Brown, T. (2006). Confirmatory factor analysis for applied research. New York: Guildford.

Bengt. O. Muthen (2006): http://www.statmodel.com/discussion/messages/13/1236.html?1145023154

This study was carried out to determine the level of contamination of sheep meat, liver, and kidney by heavy metals (copper, lead, zinc, cadmium, and cobalt) in different areas of al Sulaimanyah Governorate, in comparison with internationally allowed levels. For this purpose, samples of meat, liver, and kidney were taken in three

districts of al Sulaimanyah Governorate: Said Sadiq, Dokan, and Sulaimanyah city center. The samples were collected during October and November 2020. The three-way interaction of factors significantly affected the amount of copper in the different sheep tissues; the amount varied with tissue, place, and time of sampling. The highest level of copper was recorded in liver tissue in Dokan district during November, while the lowest level was recorded in meat tissue in Said Sadiq district during November. The interaction also affected the level of zinc, which varied with tissue, place, and time of sampling: the highest level of zinc was recorded in kidney tissue in Sulaimanyah city center during October, while the lowest was recorded in liver tissue in Said Sadiq during October. The interaction likewise significantly affected the amount of cadmium: the highest level was recorded in meat tissue in Sulaimanyah city center during October, and the lowest in liver tissue in Said Sadiq district during October. The interaction did not significantly affect the amounts of lead and cobalt in the different sheep tissues.

Are there any alternative techniques for ethanol estimation other than HPLC and GC?

As far as I know, there are four steps for recommending N fertilizer rates based on NDVI readings and grain yield at the N-rich strip and the farmer-practice plot.

The steps are as follows:

1- Estimating Response Index (RI) RI=Ypn/Ypo

2- Estimating Ypo (yield at farmer practice strip)

3- Estimating Ypn (yield at N rich strip) based on RI, i.e., Ypn=Ypo*RI

4- N fertilizer rate recommendation using the formula:

NFRate=(N uptake at Ypn - N uptake at Ypo)/NUE

If the above-mentioned steps are correct, I would like to know which variable I should use as the independent variable to estimate Ypo: NDVIo (NDVI at farmer practice) or INSEY (in-season estimated yield based on NDVI, i.e., NDVI divided by days from planting to sensing)?
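Taking the four steps above at face value, the arithmetic can be sketched as follows; the function name and example numbers are hypothetical, and N uptake is assumed proportional to yield:

```python
def n_fertilizer_rate(ypo, ri, n_uptake_per_ton, nue):
    """Sketch of the four steps above (hypothetical helper, not a published API).

    ypo: predicted yield of the farmer-practice strip (t/ha)
    ri:  response index RI = Ypn/Ypo derived from the NDVI readings
    n_uptake_per_ton: N removed per ton of grain (kg N/t)
    nue: N use efficiency (fraction of applied N recovered by the crop)
    """
    ypn = ypo * ri                       # step 3: yield of the N-rich strip
    uptake_n = ypn * n_uptake_per_ton    # N uptake at Ypn
    uptake_0 = ypo * n_uptake_per_ton    # N uptake at Ypo
    return (uptake_n - uptake_0) / nue   # step 4: recommended rate (kg N/ha)

# Example: Ypo = 4 t/ha, RI = 1.25, 25 kg N per ton of grain, NUE = 0.5
assert n_fertilizer_rate(4.0, 1.25, 25.0, 0.5) == 50.0
```

Written this way, the rate depends only on Ypo, RI, and the two agronomic constants, which makes clear why the choice of predictor for Ypo (NDVIo vs INSEY) is the step that matters.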

Example

Dominant Biceps Strength = 70%

Non-dominant Biceps Strength = 50%

Estimated loading prescription

for the dominant biceps is as follows:

1kg/ 8 times of Rep/ 30s/ 5 Sets/ 3m/ 3 sessions/week. Strengthening exercises

Kindly determine the loading prescription for the non-dominant biceps, and the reason for choosing this loading.

I am in search of methods for the quantitative and qualitative estimation of cellulase in a food (biological) sample.

Are there any references for Estimated Average Requirement (for minerals and vitamins) of infants less than 6 months?

Hi everyone

I'm looking for a quick and reliable way to estimate my missing climatological data. My data is daily and spans more than 40 years. The variables include minimum and maximum temperature, precipitation, sunshine hours, relative humidity, and wind speed. My main problem is the sunshine-hours data, which has many gaps. These gaps are scattered through the time series; sometimes they span several months or even a few years. The number of stations I work on is 18. Given that my data is daily, the number of missing values is high, so I need to estimate the missing data before starting work. Your comments and experiences would be very helpful.

Thank you so much for advising me.
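For short gaps, simple interpolation within the series is a common first step; a minimal sketch with a made-up sunshine-hours series. (Long, multi-month gaps are usually better reconstructed from correlated neighbouring stations, e.g. normal-ratio or regression methods, rather than interpolated in time.)

```python
import numpy as np

# Hypothetical daily sunshine-hours record with missing values (NaN)
days = np.arange(10.0)
hours = np.array([8.0, np.nan, np.nan, 9.5, 10.0, np.nan, 7.0, 8.0, np.nan, 9.0])

# Linear interpolation over the valid points fills the short gaps
ok = ~np.isnan(hours)
filled = np.interp(days, days[ok], hours[ok])

assert not np.isnan(filled).any()
assert abs(filled[1] - 8.5) < 1e-9   # one third of the way from 8.0 to 9.5
```

The same pattern extends to each of the 18 stations; for the year-long gaps, regressing the target station on its best-correlated neighbour and predicting the missing stretch is the usual fallback.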

Logistic regression as a type of distribution estimation:

What is the connection between estimation and distribution in machine learning?

Hello dear

There is significant variation among the official statistics of FAO, NRCS, and HWSD on the amount of global SOC over 0-100 cm depth of soils around the world. All of them studied the same land areas with almost the same methods, and they even shared databases with each other; for example, one of the main resources HWSD uses to calculate global SOC comes from FAO. However, there is a considerable difference between them:

total global SOC over 0-100 cm depth:

HWSD: 2469 Pg

NRCS: 1399 Pg

FAO: 1459 Pg

references:

1- Hiederer, R. and Köchy, M. (2011). Global Soil Organic Carbon Estimates and the Harmonized World Soil Database. EUR 25225 EN. Publications Office of the European Union. 79 pp.

2- Hiederer, R. and Köchy, M. (2012). Global Soil Organic Carbon Estimates and the Harmonized World Soil Database. EUR Scientific and Technical Research series – ISSN 1831-9424 (online), ISSN 1018-5593 (print), ISBN 978-92-79-23108-7, doi:10.2788/1326.

3- Köchy, M., Hiederer, R., and Freibauer, A. (2015). Global distribution of soil organic carbon – Part 1: Masses and frequency distributions of SOC stocks for the tropics, permafrost regions, wetlands, and the world. SOIL, 1: 351–365. http://www.soil-journal.net/1/351/2015/

I would like to bring together those interested in estimating, using different approaches, the true underlying number of SARS-CoV-2-infected individuals. This is important since this number gives us an idea of the undetected individuals spreading the infection and causing deaths among the elderly and individuals with preexisting health conditions.

Hi, I am looking to estimate a willingness to pay for a choice model, but for the given alternatives compared to a base rather than for the product attributes variables. I have run a mixed logit model and have coefficients for the attribute variables followed by coefficients for case-specific variables with each different option.

Can anyone suggest which optimization/estimation techniques are good for solar photovoltaic cells?

In irrigation science, the net depth of irrigation (NDI) is estimated from the equation

NDI=RZD*WHC*PD%

RZD: root zone depth, cm

WHC: water holding capacity, mm water/cm soil profile

PD%: percentage of depletion %

In general, PD is limited to 40-60%.

If there is any other acceptable depletion ratio in any irrigation system, please let me know.
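The equation above works out as in this small sketch (hypothetical numbers; note that with WHC expressed in mm of water per cm of soil, RZD must be in cm for NDI to come out in mm):

```python
def net_depth_irrigation(rzd_cm, whc_mm_per_cm, depletion):
    """NDI = RZD * WHC * PD, per the equation above.

    rzd_cm:        root zone depth (cm)
    whc_mm_per_cm: water holding capacity (mm water per cm of soil)
    depletion:     allowable depletion as a fraction (0.40-0.60 is typical)
    """
    return rzd_cm * whc_mm_per_cm * depletion

# Example: 60 cm root zone, 1.8 mm/cm WHC, 50% depletion -> 54 mm
assert net_depth_irrigation(60, 1.8, 0.50) == 54.0
```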

Best Regards

I am a bit wary about asking this question given my ignorance, and I am also not too sure whether it might provoke an emotional response.

Assume I want an estimate of the average R-squared and its variability from the literature for specific models y~x. I found 31 literature sources.

The questions are twofold :

1.) Can I shift the simulated draws of an ABC-rejection algorithm, acting as if they indeed come from my target (see the first 4 figures)?

The parameter in this case is the draw from the prior deviating from the target, which is then shifted so that it fits.

2.) I applied 4 methods in this case: ABC-rejection (with a flat prior, not really preferred), Bayesian bootstrap, classical bootstrap, and a one-sided t-test (lower 4 figures). From all methods I extracted the 2.5-97.5% intervals. Given the information below, is it reasonable to go for the Bayesian bootstrap in this case?

As sometimes suggested on RG, and hidden deeply in some articles, the intervals from the different methods converge and are more-or-less similar. However, I have another, smaller dataset which is also skewed, so I would personally prefer the Bayesian bootstrap, as it smooths things out and extreme exactness does not matter too much to me here. Based on these results, my coarse guesstimate of the average variability would range from ~20-30% (to me it seems pot`a`to - pot`aa`to for either method, disregarding the philosophical meaning). I would also like to use the individual estimates returned by each bootstrap; technically they are not normally distributed (Beta-ish), although this does not seem to matter much in this pragmatic case.

Thank you in advance.

Hi,

Even though Artemis gives an error bar on the EXAFS fitting but it assumes the value to be zero at certain R values.

In the case of data with some experimental noise, the error bar needs to be corrected, weighted by the square root of the reduced chi-squared value, taking into account the experimental noise for each R-space spectrum from 15 to 25 Å, as described in

How do I calculate the uncertainties in the coordination number when EXAFS is fitted using Artemis?

Thanks in advance

For the calculation of standardized residuals, is this correct?

Volatility = estimated from the MSGARCH package in R

Residuals = abs(returns) - volatility

Standardized residuals = Residuals / volatility
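For comparison, the conventional GARCH definition standardizes the (mean-adjusted) return by the conditional volatility, z_t = (r_t − μ_t)/σ_t, rather than subtracting the volatility from |r_t|. A minimal sketch with simulated data (the constant-volatility series here is an assumption for illustration, not MSGARCH output):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical returns and conditional volatilities (stand-ins for what a
# fitted MSGARCH model would supply)
sigma = np.full(1000, 0.02)
returns = sigma * rng.standard_normal(1000)

# Conventional definition: z_t = (r_t - mu_t) / sigma_t
z = (returns - returns.mean()) / sigma

# If the volatility model fits, z should be roughly mean-zero, unit-variance
assert abs(z.mean()) < 0.1
assert abs(z.std() - 1.0) < 0.1
```

Diagnostics (Ljung-Box on z and z², QQ plots) are then run on z to judge the fit.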

I am running a Ka/Ks estimator for a gene family. I found that the estimator only works for certain gene combinations; it failed to estimate Ka and Ks values for some combinations. Is it possible that those combinations are distantly related?

I would like to measure the air pollution caused due to stone mines in the surrounding areas. So for this, what is the best method to identify sampling points.

Dear all,

The complex I am trying to simulate has protein (10 chains), DNA, and RNA.
I have run the simulation for 2 µs so far, but the biological event I want to observe requires a very long MD simulation, so I decided to accelerate the sampling using the AWH method. I am not familiar enough with the AWH method or with other methods such as umbrella sampling. I have watched the two webinars on AWH, but I am still not sure about the mdp options.
Attached are the mdp options I used and the error I got.

I believe I need to choose a reference atom for pulling, but on what basis?

pull = yes ; The reaction coordinate (RC) is defined using pull coordinates.
pull-ngroups = 12 ; The number of atom groups needed to define the pull coordinate.
pull-ncoords = 7 ; Number of pull coordinates.
pull-nstxout = 1000 ; Step interval to output the coordinate values to the pullx.xvg.
pull-nstfout = 0 ; Step interval to output the applied force (skip here).

pull-group1-name = Protein_chain_A ; Name of pull group 1 corresponding to an entry in an index file.
pull-group2-name = Protein_chain_B ; Same, but for group 2.
pull-group3-name = Protein_chain_C
pull-group4-name = Protein_chain_D
pull-group5-name = Protein_chain_E
pull-group6-name = Protein_chain_F
pull-group7-name = Protein_chain_G
pull-group8-name = Protein_chain_H
pull-group9-name = Protein_chain_I
pull-group10-name = Protein_chain_J
pull-group11-name = RNA
pull-group12-name = DNA

pull-group1-pbcatom = 0
pull-group2-pbcatom = 0
pull-group3-pbcatom = 0
pull-group4-pbcatom = 0
pull-group5-pbcatom = 0
pull-group6-pbcatom = 0
pull-group7-pbcatom = 0
pull-group8-pbcatom = 0
pull-group9-pbcatom = 0
pull-group10-pbcatom = 0
pull-group11-pbcatom = 0
pull-group12-pbcatom = 0

pull-coord1-groups = 1 2 ; Which groups define coordinate 1? Here, groups 1 and 2.
pull-coord2-groups = 3 4
pull-coord3-groups = 5 6
pull-coord4-groups = 7 8
pull-coord5-groups = 9 10
pull-coord6-groups = 11 12
pull-coord7-groups = 11 12

pull-coord1-geometry = distance ; How is the coordinate defined? Here by the COM distance.
pull-coord1-type = external-potential ; Apply the bias using an external module.
pull-coord1-potential-provider = AWH ; The external module is called AWH!

awh = yes ; AWH on.
awh-nstout = 50000 ; Step interval for writing awh*.xvg files.
awh-nbias = 1 ; One bias, could have multiple.
awh1-ndim = 1 ; Dimensionality of the RC, each dimension per pull coordinate. pull-coord1-groups
awh1-dim1-coord-index = 1 ; Map RC dimension to pull coordinate index (here 1–>1)
awh1-dim1-start = 0.25 ; Sampling interval min value (nm)
awh1-dim1-end = 0.70 ; Sampling interval max value (nm)
awh1-dim1-force-constant = 128000 ; Force constant of the harmonic potential (kJ/(mol*nm^2))
awh1-dim1-diffusion = 5e-5 ; Estimate of the diffusion (nm^2/ps), used to set the initial update size (how quickly the system moves)
awh1-error-init = 5 ; Initial error estimate, used to set the initial update size
awh-share-multisim = yes ; Share bias across simulations
awh1-share-group = 1 ; Non-zero share group index

ERROR 12 [file AWH3.mdp]:
When the maximum distance from a pull group reference atom to other atoms
in the group is larger than 0.5 times half the box size a centrally
placed atom should be chosen as pbcatom. Pull group 12 is larger than
that and does not have a specific atom selected as reference atom.

Any advice you could give would be much appreciated

Thank you so much!

Amnah

I want to compare the efficiency of different machine learning techniques for structural reliability and surrogate modelling. Although my problem is specific, I think there should be well-known criteria for this; unfortunately, I have not found any in the literature!

One simple idea is to multiply accuracy by the number of model calls or by runtime, but that is usually not a proper criterion.

How can we define good criteria? Is it a good idea to go with a weighted combination based on a specific objective? Or is there any well-known method for making this comparison?

I appreciate your help with this challenge!

I am implementing an unscented Kalman filter for parameter estimation, and I found its performance strongly related to the initialization of the parameter to be estimated.

In particular, I don't understand why, if I initialize the parameter below the reference value (the actual value to estimate), I get good performance, but if the initialization is above the reference value, performance is very poor and the estimate does not converge but keeps increasing.

In other words, it seems that the estimation can only make the parameter increase. Is there any mathematical reason behind this?

I am trying to estimate nitrite from human serum using Griess reagent from Sigma. Sodium nitrite standards are used. Estimation of nitrite is done at 540 nm. Serum was processed in the following different ways before estimation.

1) Serum deproteinized with 92 mM Zinc Sulphate.

2) Serum deproteinized with 40% ethanol.

3) Serum deproteinized and reduced with Vanadium Chloride III.

I'm unable to measure nitrite from processed and unprocessed serum samples as suggested in literature. Kindly suggest your views.

The Nyquist-Shannon theorem provides an upper bound for the sampling period when designing a Kalman filter. Leaving apart the computational cost, are there any other reasons, e.g., noise-related issues, to set a lower bound for the sampling period? And, if so, is there an optimal value between these bounds?

I want to estimate the TOA and CFO of an LTE signal. The TOA is estimated from the oversampled signal; after TOA estimation, the CFO is estimated. Please let me know why the estimated CFO is wrong.

I have designed a mathematical model of the plant with a nonlinear hysteresis function f(x1), validated in simulation. Now I want to design a nonlinear observer to estimate the speed (x2). Note that I have also modeled the nonlinear function in the model.

My state space model of the plant is

x1_dot = x2

x2_dot = q*x1 + c*x2 + f(x1) + u

Please suggest a suitable observer to estimate the angular speed x2.

x1 is the angular position of the plant.
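As a hedged illustration (not a recommendation for this specific plant): since f(x1) is modeled and x1 is measured, a Luenberger-style observer that copies the model and injects the output error can estimate x2. All parameters below are made up, and the Euler integration is only for demonstration:

```python
import numpy as np

# Plant: x1_dot = x2, x2_dot = q*x1 + c*x2 + f(x1) + u, measured output y = x1
q, c = -2.0, -0.5
f = lambda x1: 0.1 * np.tanh(x1)   # hypothetical hysteresis-like nonlinearity
u = lambda t: np.sin(t)            # hypothetical input

L1, L2 = 20.0, 100.0               # observer gains (chosen for fast error decay)
dt, T = 1e-3, 10.0
x = np.array([0.5, -0.2])          # true state
xh = np.zeros(2)                   # observer state, deliberately wrong initially

for k in range(int(T / dt)):
    t = k * dt
    y = x[0]                       # measurement
    # plant (Euler step)
    x = x + dt * np.array([x[1], q * x[0] + c * x[1] + f(x[0]) + u(t)])
    # observer: copy of the model plus output-error injection
    e = y - xh[0]
    xh = xh + dt * np.array([xh[1] + L1 * e,
                             q * xh[0] + c * xh[1] + f(xh[0]) + u(t) + L2 * e])

assert abs(xh[1] - x[1]) < 1e-2    # speed estimate has converged
```

Because f here is Lipschitz with a small constant, the linear output injection dominates the error dynamics; for a stronger or discontinuous hysteresis, a sliding-mode or high-gain observer would be the usual alternatives.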

If anyone knows about the estimation of population parameters for panel data, could you kindly recommend some literature or web links?

What is the difference between Maximum Likelihood Sequence Estimation and Maximum Likelihood Estimation? Which one is a better choice in the case of channel non-linearities? And why, and how does oversampling help here?

Hello,

using the Cholesky decomposition in the UKF introduces the possibility that the UKF fails if the covariance matrix P is not positive definite.

Is this a irrevocable fact? Or is there any method to

**completely** bypass that problem? I know there are some computationally more stable algorithms, like the square-root UKF, but even they can fail.

Can I say that the problem of the Cholesky decomposition failing occurs only for bad estimates during filtering, when even an EKF would fail/diverge?

I want to understand whether the UKF is advantageous over the EKF not only in terms of accuracy, but also in terms of stability/robustness.

Best regards,

Max
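One pragmatic workaround (a mitigation, not a complete bypass) is to regularize P before factorization: add a small diagonal jitter and grow it geometrically until the factorization succeeds. A dependency-free sketch:

```python
def cholesky(a):
    """Plain Cholesky; raises ValueError on a non-positive pivot."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix not positive definite")
                l[i][j] = d ** 0.5
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def safe_cholesky(a, jitter=1e-12, max_tries=40):
    """Retry Cholesky with geometrically growing diagonal jitter."""
    n = len(a)
    for _ in range(max_tries):
        try:
            return cholesky([[a[i][j] + (jitter if i == j else 0.0)
                              for j in range(n)] for i in range(n)])
        except ValueError:
            jitter *= 10.0
    raise ValueError("could not regularize covariance")

# a covariance that has drifted to singular (rank 1)
p = [[1.0, 1.0], [1.0, 1.0]]
l = safe_cholesky(p)           # succeeds once enough jitter is added
```

This keeps the filter running but masks the underlying conditioning problem. Alternatives include symmetrizing P each step ((P + Pᵀ)/2), propagating the square root directly (SR-UKF), or using an eigenvalue/SVD-based matrix square root, which exists for any positive semi-definite matrix.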

In Bayesian inference, we have to choose a prior distribution for the parameter to find the Bayes estimate, and this choice depends on our beliefs and experience.

**I would like to know what steps or rules we should follow when choosing a prior distribution for a parameter.** Please help me with this so that I can proceed.
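There is no single rule, but when a conjugate family fits the likelihood, the prior's hyperparameters can be read as pseudo-observations, which makes the elicitation concrete. A Beta-Binomial sketch (the numbers are illustrative):

```python
# Conjugate Beta(a, b) prior for a Bernoulli success probability p:
# after k successes in n trials the posterior is Beta(a + k, b + n - k),
# so a and b act as "pseudo-successes" and "pseudo-failures" observed
# before the data -- a concrete way to encode prior experience.
def beta_binomial_update(a, b, k, n):
    return a + k, b + (n - k)

a0, b0 = 2.0, 2.0                 # weakly informative prior, mean 0.5
a1, b1 = beta_binomial_update(a0, b0, k=7, n=10)
posterior_mean = a1 / (a1 + b1)   # 9/14, about 0.643
```

When little is known, weakly informative or reference priors are the usual choice, and a sensitivity analysis (re-running the inference under a few plausible priors) is good practice regardless.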

I am currently working on a project in which I am testing a pairs-trading strategy based on cointegration. In this strategy I have 460 possible stock pairs to choose from every day, over a time frame of 3 years. I run a daily cointegration test and get trade signals based on it to open or close a trade. According to this strategy, I hold my trades open until either the take-profit condition (reversion to the mean of the spread) or the stop-loss condition (spread exceeding [mean + 3 * standard deviation]) holds. This means that some trades might be open for a couple of days, others for weeks or even months.

My question is now:

**How can I calculate the returns of my overall strategy?** I know how to calculate the returns per trade, but I have problems when aggregating returns over a certain time period or over all traded pairs.

Let's say I am trying to calculate returns over one year. I could take the average of all the trade returns, or calculate

*sum(profits per trade of each pair) / sum(invested or committed capital per trade)*; both of these would only give me some average return values. Most of my single trades are profitable, but in the end I am trying to show how profitable my whole trading strategy is, so I would like to compare it to some benchmark; right now I don't really know how to do that.

One idea I had, was to possibly estimate the average daily return of my trading strategy by:

- Estimating daily return per trade: (return of trade)/(number of days that trade was open)
- Taking the average of all the daily returns per trade

Then finally I would compare it to the average daily return of an index over the same time frame.

Does this make any sense or what would be a more appropriate approach?
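Your idea can be made concrete; here is a sketch with made-up trades comparing the per-trade averaging you describe against a capital-day-weighted daily return, which is usually easier to benchmark against an index:

```python
# Hypothetical trades: (total return, days open, capital committed).
trades = [
    (0.040, 12, 10_000.0),
    (0.015,  3, 10_000.0),
    (-0.020, 30, 10_000.0),
]

# Idea from the question: spread each trade's return evenly over its
# open days, then average across trades.
avg_daily = sum(r / d for r, d, _ in trades) / len(trades)

# Alternative: total profit divided by total capital-days at risk,
# i.e. a daily return weighted by how much capital was exposed.
profit = sum(r * cap for r, _, cap in trades)
capital_days = sum(d * cap for _, d, cap in trades)
daily_on_capital = profit / capital_days
```

Note that `avg_daily` weights a 3-day trade as heavily as a 30-day one, while `daily_on_capital` weights by exposure; compounding whichever daily figure you choose over the trading days in a year gives an annualized return that is directly comparable to a benchmark index.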

Can anyone offer sources that provide error (or uncertainty) bands to be applied to the annual energy production (AEP) of wind turbines calculated from mean annual wind speeds?

AEPs are reported to carry considerable errors, with up to 15% uncertainty on the power curve determination and up to 20% on the wind resource [1], so if anyone has a source on how to treat the compound uncertainty, please advise.
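Assuming the two error sources are independent, a first-order (GUM-style) combination adds the relative uncertainties in quadrature; with the bounds quoted above:

```python
import math

# Independent relative uncertainties combined in quadrature
# (upper-bound figures cited above).
u_power_curve = 0.15    # 15 % on power curve determination
u_wind_resource = 0.20  # 20 % on the wind resource
u_aep = math.sqrt(u_power_curve ** 2 + u_wind_resource ** 2)
# u_aep = 0.25, i.e. ~25 % combined relative uncertainty on AEP
```

If the two sources are correlated, the quadrature sum underestimates the compound uncertainty, and a covariance term has to be added.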

[1] Frandsen, S. et al., *Accuracy of Estimation of Energy Production from Wind Power Plants.*

I've got 5 years of Direct Normal Irradiance (DNI) data downloaded from https://re.jrc.ec.europa.eu/. I ordered it by year and by 24-hour period because I need the DNI curve at its maximum in summer, its minimum in winter, and the average in spring and autumn. The problem is that, because not all days are sunny, the data is noisy and I can't use simple "max" and "min" algorithms. Which algorithms do you recommend to filter and correct the data? To produce the graph, I used LabVIEW to order the data.
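For the envelope extraction, a moving-percentile filter is a simple dependency-free option: a high percentile tracks the clear-sky upper envelope despite cloudy-day dropouts, while the median gives a robust "typical" curve. A sketch on synthetic data (the seasonal curve and dropout rate are made up):

```python
import math
import random

def rolling_percentile(values, window, q):
    """Centred moving percentile: q = 1.0 tracks the upper (clear-sky)
    envelope, q = 0.5 is a robust median smoother."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = sorted(values[max(0, i - half):i + half + 1])
        out.append(chunk[int(q * (len(chunk) - 1))])
    return out

# synthetic daily-max DNI: a seasonal curve with random cloudy-day dropouts
random.seed(1)
series = [900.0 * math.sin(math.pi * d / 365.0) *
          (0.2 if random.random() < 0.3 else 1.0) for d in range(365)]
envelope = rolling_percentile(series, window=15, q=1.0)   # upper envelope
typical = rolling_percentile(series, window=15, q=0.5)    # robust average
```

A high percentile such as q = 0.9 is often preferable to the strict maximum, since it is less sensitive to single anomalous spikes in the record.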

I'm using the lme4 package [lmer() function] to estimate several average models, and I want to plot their estimated coefficients. I found the document "Plotting Estimates (Fixed Effects) of Regression Models" by Daniel Lüdecke, which explains how to plot estimates, and it works with average models, but it uses conditional-average values instead of full-average values.

Model Script:

library(lme4)

options(na.action = "na.omit")

PA_model_clima1_Om_ST <- lmer(O.matt ~ mes_N + Temperatura_Ar_PM_ST + RH_PM_ST + Vento_V_PM_ST + Evapotranspiracao_PM_ST + Preci_total_PM_ST + (1|ID), data=Abund)

library(MuMIn)

options(na.action = "na.fail")

PA_clima1_Om_ST<-dredge(PA_model_clima1_Om_ST)

sort.PA_clima1_Om_ST<- PA_clima1_Om_ST[order(PA_clima1_Om_ST$AICc),]
top.models_PA_clima1_Om_ST<-get.models(sort.PA_clima1_Om_ST, subset = delta < 2)

model.sel(top.models_PA_clima1_Om_ST)
Avg_PA_clima1_Om_ST<-model.avg(top.models_PA_clima1_Om_ST, fit = TRUE)
summary(Avg_PA_clima1_Om_ST)

Plot script:

library(sjPlot)
library(sjlabelled)
library(sjmisc)
library(ggplot2)
data(efc)
theme_set(theme_sjplot())
plot_model(Avg_PA_clima1_Om_ST, type="est", vline.color="black", sort.est = TRUE, show.values = TRUE, value.offset = .3, title= "O. mattogrossae")

The plot it creates uses the conditional-average values instead of the full-average values. How can I plot estimates of average models using full-average values?

Thanks for your time and help

Dear community,

I am facing some issues and scientific questionings regarding the dose-response analysis using the drc package from R.

Context :

I want to know whether two strains respond differently to three drugs. To do so, I am using the drc package in R. I then determine the EC50 of each strain for each drug. I later plot the EC50s and use the estimate and standard error to determine whether the EC50 differs statistically between strains. For each strain and drug I have four technical replicates, and I will have three biological replicates. Visually, the model produced by the package matches my experimental data. However,

**I am looking for a statistical approach to determine whether the model given by drc is not too far from my experimental data. How do I know whether I can be confident in the model?**

My approach:

I am using mselect() to determine which drc model describes my data best. However, I do not know how to interpret the results. I read that the higher the logLik is, the better the model describes the data, but do you know whether a threshold exists?

For example I have from the mselect() :

> mselect(KCl96WT.LL.4, list(LL.3(), LL.5(), W1.3(), W1.4(), W2.4(), baro5()), linreg = TRUE)

logLik IC Lack of fit Res var

LL.3 101.90101 -195.8020 0 0.0003878212

W1.3 101.53204 -195.0641 0 0.0003950424

W2.4 102.48671 -194.9734 0 0.0003870905

LL.5 103.05880 -194.1176 0 0.0003869226

W1.4 101.52267 -193.0453 0 0.0004062060

Cubic 101.42931 -192.8586 NA 0.0004081066

baro5 101.98930 -191.9786 0 0.0004081766

Lin 96.45402 -186.9080 NA 0.0004958264

Quad 96.64263 -185.2853 NA 0.0005044474

I also used the glht() function and coeftest(KCl96WT.LL.4, vcov = sandwich), but I am facing the same issue.

> coeftest(KCl96WT.LL.4, vcov= sandwich)

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)

b:(Intercept) 4.3179185 0.6187043 6.979 3.024e-08 ***

d:(Intercept) 0.0907908 0.0080186 11.323 1.397e-13 ***

e:(Intercept) 0.9809981 0.0686580 14.288 < 2.2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Do you know what approach could indicate if I can be statistically confident regarding my model? Can I be mathematically confident in the EC50 given by the package?

Thanks for your time! I am looking forward to discovering new ways to be more critical of my data analysis. If you have any questions or comments regarding my approach, feel free to ask!

I am trying to conduct a CFA using RStudio. However, instead of giving me all the fit indices I am supposed to get, even with the summary function I do not get the GFI | AGFI | NFI | NNFI | CFI | RMSEA. Can anyone please help me with this issue?

Estimator ML

Optimization method NLMINB

Number of free parameters 25

Number of observations 275

Model Test User Model:

Test statistic 228.937

Degrees of freedom 41

P-value (Chi-square) 0.000

Hi everyone

I want to know which machine learning / deep learning techniques I can use for more accurate state-of-charge (SOC) estimation of lithium-ion batteries.

Thanks and Regards
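Whatever model family you pick (feed-forward networks, LSTMs/GRUs on voltage-current-temperature sequences, and Gaussian-process or tree-based regressors are all common in the SOC literature), it is usually trained against, benchmarked against, or used to correct a coulomb-counting baseline, so that baseline is worth having first. A sketch with made-up cell parameters:

```python
def coulomb_count(soc0, currents_a, dt_s, capacity_ah):
    """Baseline SOC by current integration (discharge current > 0).
    ML models are typically benchmarked against or used to correct
    this estimate, since it drifts with current-sensor bias and
    capacity fade."""
    soc, trace = soc0, []
    for i_a in currents_a:
        soc -= i_a * dt_s / (capacity_ah * 3600.0)
        trace.append(soc)
    return trace

# 1C discharge of a (hypothetical) 2 Ah cell for 30 minutes: SOC 1.0 -> 0.5
trace = coulomb_count(soc0=1.0, currents_a=[2.0] * 1800, dt_s=1.0,
                      capacity_ah=2.0)
```

A learned model's value lies exactly in removing the drift terms this baseline ignores, which is why input features beyond current (terminal voltage, temperature, recent history) matter.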

How might one go about estimating how recently an individual was infected with a given pathogen given antigen and antibody titers relevant to said pathogen?

What do you think about estimating the level of automation? How should one carry out the process of estimating the level of automation, autonomy, and intellectualization?

It is important to get a quantitative result.

1. Estimate the technical level of the equipment and software;

2. Build a taxonomy of processes/business processes and decide on the level of automation;

3. Estimate the completeness of automation by calculating the proportion of controlled signals relative to the norm;

4. Estimate the share of computer time in the total time of the operation.

Etc.
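For a quantitative result, item 3 above can be computed directly as a ratio per process and then aggregated; the process names and signal counts below are purely illustrative:

```python
# Completeness of automation per process: share of signals actually
# monitored/controlled out of those the norm requires (made-up data).
processes = {
    "boiler control": {"controlled": 45, "required": 50},
    "reporting":      {"controlled": 12, "required": 40},
}
completeness = {name: p["controlled"] / p["required"]
                for name, p in processes.items()}

# simple aggregate: signal-weighted overall level of automation
overall = (sum(p["controlled"] for p in processes.values()) /
           sum(p["required"] for p in processes.values()))
```

The same ratio structure works for item 4 (computer time over total operation time), and the per-process scores can be weighted by process criticality instead of signal count if that better reflects the system.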

Is it possible to estimate the shape, scale, and location parameters of a generalized extreme value (GEV) distribution if I only know the mean, variance, and median of a given data set (i.e., no raw data available, just its descriptive statistics)?
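In principle yes, by moment matching, provided the shape satisfies xi < 1/2 (finite variance). Using the closed-form GEV mean, variance, and median, the problem reduces to a one-dimensional root-find on the shape parameter. A dependency-free bisection sketch; note the bracket must contain a sign change of the median residual and the solution need not be unique, so a robust solver (e.g. scipy.optimize.brentq) plus a sanity check is preferable in practice:

```python
import math

def gev_from_summary(mean, var, median, lo=-0.45, hi=0.45):
    """Match a GEV(shape xi, loc mu, scale sigma) to a sample mean,
    variance and median.  Valid for xi < 0.5, xi != 0; the bracket
    [lo, hi] must contain a sign change of the median residual."""
    def residual(xi):
        if abs(xi) < 1e-6:          # dodge the removable xi = 0 singularity
            xi = 1e-6
        g1, g2 = math.gamma(1 - xi), math.gamma(1 - 2 * xi)
        sigma = abs(xi) * math.sqrt(var / (g2 - g1 ** 2))
        mu = mean - sigma * (g1 - 1) / xi
        return mu + sigma * (math.log(2) ** (-xi) - 1) / xi - median

    a, fa = lo, residual(lo)
    b = hi
    for _ in range(200):            # plain bisection on the shape
        m = 0.5 * (a + b)
        fm = residual(m)
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
    xi = 0.5 * (a + b)
    g1, g2 = math.gamma(1 - xi), math.gamma(1 - 2 * xi)
    sigma = abs(xi) * math.sqrt(var / (g2 - g1 ** 2))
    mu = mean - sigma * (g1 - 1) / xi
    return xi, mu, sigma

# round trip: summaries generated from GEV(xi=0.1, mu=0, sigma=1)
g1, g2 = math.gamma(0.9), math.gamma(0.8)
m_ = (g1 - 1) / 0.1
v_ = (g2 - g1 ** 2) / 0.01
q_ = (math.log(2) ** -0.1 - 1) / 0.1
xi, mu, sigma = gev_from_summary(m_, v_, q_)
```

Keep in mind that three summary statistics pin down exactly three parameters with no redundancy, so the fit inherits all the sampling error of those summaries and cannot be validated against the data itself.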

Currently, I'm doing my thesis using Google Earth Engine, calculating the days since cessation of storms from the CHIRPS daily precipitation image collection. The code flow is as follows:

Conceptually, it works like this. The first date of the image collection is set as day zero of the cessation of storms. For the image on the next date, the precipitation value is examined. If the precipitation is zero, the cessation-of-storms counter is incremented by one day, and it keeps incrementing until the precipitation on a subsequent day becomes non-zero; at that point, the counter resets to zero.

The result I've obtained is an image collection in which all the elements have calculated bands whose values are all zero, which shouldn't be the case. I would appreciate some suggestions on this problem, please.

Reference paper: Estimation of snow water equivalent from MODIS albedo for a non-instrumented watershed in Eastern Himalayan Region
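Per pixel, the quantity described is a running counter that increments on dry days and resets on rain. In GEE that state must be carried through ImageCollection.iterate() (a plain map() cannot pass information between images), and one possible cause of all-zero outputs is the accumulator being rebuilt from scratch at each step rather than updated. The intended per-pixel logic, in plain Python for clarity:

```python
def days_since_rain(precip_mm):
    """Running count of consecutive zero-precipitation days; resets to
    zero whenever rain is observed (the per-pixel logic the GEE
    iterate should implement)."""
    out, count = [], 0
    for p in precip_mm:
        count = count + 1 if p == 0 else 0
        out.append(count)
    return out

print(days_since_rain([0, 0, 5, 0, 0, 0, 2]))   # [1, 2, 0, 1, 2, 3, 0]
```

In GEE terms, `count` is the band of the accumulator image: the iterated function should multiply the previous counter by a zero-precipitation mask and add the mask, so the counter carries over between images instead of restarting.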

Dear all,

I would like to estimate prediction accuracy using EBVs (estimated breeding values) computed from pedigree-based and genomic-based models. In the models used to estimate those EBVs, I have fitted a number of fixed effects (e.g. age, batch, tank, ...). I wonder whether re-fitting those fixed effects as predictors in the cross-validation will lead to overfitting. If there are no predictors, how can I do cross-validation between the two sets of EBVs? Any suggestions?

Thanks and regards,

Vu

I have been reading this paper on how to analyze linear relationships using a repeated-measures ANOVA.

I was wondering, though: once you establish a linear relationship across your categorical variables (A, B, C, D), how can you check whether the differences across conditions A vs. B vs. C vs. D are also significant?

I have been using pairwise t-tests (A vs. B; B vs. C; C vs. D), but is there a better test to look at this?

Just for completeness, I have been using "ols" from "statsmodels" to check for the linear relationship, and "pairwise_ttests" from "pingouin" to run post-hoc tests in Python.
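Besides pairwise t-tests (which only compare conditions two at a time), a within-subject linear-trend contrast tests the linearity itself with a single test, which typically has more power. A dependency-free sketch for four equally spaced conditions; the data below are made up:

```python
import math

# Within-subject linear-trend test: orthogonal polynomial weights
# (-3, -1, 1, 3) for four equally spaced conditions, then a one-sample
# t statistic on the per-subject contrast scores (df = n - 1).
def linear_trend_t(data):
    w = (-3, -1, 1, 3)
    scores = [sum(wi * x for wi, x in zip(w, row)) for row in data]
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    return mean / (sd / math.sqrt(n))

# made-up data: data[subject][condition], a clearly linear increase
data = [
    [1.0, 2.1, 2.9, 4.2],
    [0.8, 1.9, 3.1, 3.8],
    [1.2, 2.0, 3.0, 4.1],
]
t_linear = linear_trend_t(data)   # large t => reliable linear trend
```

The same contrast can be fitted in your existing stack (statsmodels or pingouin); if you keep the pairwise t-tests for the adjacent comparisons, remember to correct them for multiple comparisons.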

Is it possible to back-calculate/estimate the amount/dosage of a molecule consumed by looking at ante- or post- mortem toxicology blood levels? If you don't know, can you suggest a contact that might know? (I appreciate there will be issues around blood redistribution and site of sampling, etc.)

I have created three logistic models (models 4, 1, and 2) and calculated AICc values for each. Model 4, with two covariates (location and camera), and model 1, with a single covariate (location), have approximately equivalent AICc values (within 2 points). In this case one should choose the model with the fewest parameters, i.e., model 1 with only location included. However, to make things more confusing, the likelihood-ratio tests for model 4 vs. 1 and model 4 vs. 2 suggest that having location and camera in the same model is better than having just location or just camera. This contradicts the AICc values. So which model would you choose? I provide an example below. Thanks in advance.

> # location as a covariate on abundance

> m1 <- occuRN(~1 ~location, final)

> m1

Call:

occuRN(formula = ~1 ~ location, data = final)

Abundance:

Estimate SE z P(>|z|)

(Intercept) 2.01 0.704 2.86 4.24e-03

location2 -2.19 0.547 -4.02 5.94e-05

Detection:

Estimate SE z P(>|z|)

-2.32 0.756 -3.07 0.00215

AIC: 162.7214

> # camera as a covariate on detection

> m2 <- occuRN(~camera ~1, final)

> m2

Call:

occuRN(formula = ~camera ~ 1, data = final)

Abundance:

Estimate SE z P(>|z|)

0.682 0.371 1.84 0.0657

Detection:

Estimate SE z P(>|z|)

(Intercept) -2.589 0.763 -3.392 0.000694

camera2 1.007 0.774 1.301 0.193247

camera3 2.007 0.785 2.557 0.010555

camera4 0.639 0.803 0.796 0.425864

AIC: 178.696

# camera as a covariate on detection, location as covariate on abundance

> m4 <- occuRN(~camera ~location, final)

> m4

Call:

occuRN(formula = ~camera ~ location, data = final)

Abundance:

Estimate SE z P(>|z|)

(Intercept) 2.71 0.319 8.49 2.06e-17

location2 -2.25 0.509 -4.41 1.03e-05

Detection:

Estimate SE z P(>|z|)

(Intercept) -4.050 0.616 -6.571 5.00e-11

camera2 1.030 0.620 1.660 9.69e-02

camera3 1.776 0.613 2.897 3.76e-03

camera4 0.592 0.642 0.922 3.57e-01

AIC: 157.2511

> model_list<-list(null,m4,m2,m1)

> model_names<-c("modelnull","model4","model2","model1")

> modelsel<-aictab(model_list, model_names, second.ord=T)

> modelsel

Model selection based on AICc:

K AICc Delta_AICc AICcWt Cum.Wt LL

model4 6 163.25 0.00 0.61 0.61 -72.63

model1 3 164.13 0.88 0.39 1.00 -78.36

modelnull 2 181.06 17.81 0.00 1.00 -88.20

model2 5 182.70 19.44 0.00 1.00 -84.35
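For reference, the AICc bookkeeping behind the table can be reproduced directly from the log-likelihoods. In the sketch below, n = 21 is an assumption back-solved from the AIC/AICc gap in the output above (substitute your actual number of sites); models within roughly 2 AICc units are conventionally treated as equally supported, which is why AICc parsimony and the LRT can point in different directions:

```python
import math

def aicc(loglik, k, n):
    """Small-sample corrected AIC."""
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

def akaike_weights(score_list):
    """Relative support for each model given its AICc score."""
    best = min(score_list)
    rel = [math.exp(-0.5 * (s - best)) for s in score_list]
    tot = sum(rel)
    return [r / tot for r in rel]

# (log-likelihood, K) taken from the output above; n is hypothetical.
models = {"model4": (-72.63, 6), "model1": (-78.36, 3)}
n = 21
scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
weights = akaike_weights(list(scores.values()))
```

With weights of roughly 0.61 vs. 0.39, neither model is decisively better; reporting model-averaged estimates, or model 1 on parsimony grounds while noting the LRT result, are both defensible.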

Dear All,

Can you help me clarify the basic differences among the following commands in Stata: xtmg, xtpmg, xtcce, xtdcce, and xtdcce2? When and which method should we choose to estimate MG, AMG, PMG, CCE, etc.?

Several recent research articles have shown that rain attenuation prediction models, even the ITU-R model, overestimate attenuation for short links. Most such experimental links reported in the literature are shorter than 2 km, or are even building-to-building links of about 200 m. Many researchers have identified the issue and proposed solutions by correcting the effective path length, but I think the issue has not yet been addressed technically.

Why does such a limiting value of path length exist, for which existing models show a discrepancy in predicting attenuation? Thank you.

Estimation of Protein Concentration by Spectrophotometry