Maximum Likelihood - Science topic
Explore the latest questions and answers in Maximum Likelihood, and find Maximum Likelihood experts.
Questions related to Maximum Likelihood
I am currently replicating a study in which the dependent variable describes whether a household belongs to a certain category. Therefore, for each household the variable either takes the value 0 or the value 1 for each category. In the study that I am replicating the maximisation of the log-likelihood function yields one vector of regression coefficients, where each independent variable has got one regression coefficient. So there is one vector of regression coefficients for ALL households, independent of which category the households belong to. Now I am wondering how this is achieved, since (as I understand) a multinomial logistic regression for n categories yields n-1 regression coefficients per variable as there is always one reference category.
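One piece of context that may account for this (an assumption on my part, not stated in the question): with only two categories the multinomial logit collapses to the ordinary binary logit, because the reference category's coefficients are fixed at zero and a single coefficient vector \beta remains. The log-likelihood being maximised is then

\ell(\beta) \;=\; \sum_{i=1}^{N}\Big[\, y_i\, x_i'\beta \;-\; \log\!\big(1 + e^{x_i'\beta}\big)\Big],

which indeed yields one vector of regression coefficients for all households; with K > 2 categories one would obtain K - 1 such vectors.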
Could you please explain these two classification algorithms in detail, and their similarities and differences when they are applied in remote sensing image classification? Thanks a lot.
While finding maximum likelihood estimates, how does one obtain a minimum-bias estimate?
Hello everyone,
As the title suggests, I am trying to figure out how to compute a reduced correlation matrix in R. I am running an Exploratory Factor Analysis using Maximum Likelihood as my extraction method, and am first creating a scree plot as one method to help me determine how many factors to extract. I read in Fabrigar and Wegener's (2012) Exploratory Factor Analysis, from their Understanding Statistics collection, that using a reduced correlation matrix when creating a scree plot for EFA is preferable compared to the unreduced correlation matrix. Any help is appreciated!
Thanks,
Alex
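For reference, one common convention (an assumption here, since Fabrigar and Wegener also allow other communality estimates): the reduced correlation matrix is the sample correlation matrix R with its unit diagonal replaced by initial communality estimates, typically the squared multiple correlations,

R_{\mathrm{reduced}} \;=\; R - I + \operatorname{diag}\!\big(h_1^2, \dots, h_p^2\big), \qquad h_i^2 \;=\; 1 - \frac{1}{\big(R^{-1}\big)_{ii}},

and the scree plot is then built from the eigenvalues of R_reduced rather than of R.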
I’m in need of materials on exponential regression model estimation using maximum likelihood estimation or other methods
I have constructed phylogenetic trees of a 16S rDNA sequence using the neighbour-joining, maximum likelihood, and maximum parsimony methods in MEGA, with bootstrap values. I want to combine all three phylogenetic trees into a single phylogenetic tree. Which program is suitable for this, and how can I construct it? The attached tree (image) is an example.
Hello, I would like to create 100 maximum parsimony trees through the command line and am looking for the best software to do this. I have 25,000 tips for the trees, so it would not be possible to use ML methods. The software does not have to run 100 of them in one step; I would be happy to use just one script that makes one maximum parsimony tree and run this script 100 times using a workflow management tool like Snakemake. Also, the software has to take a multiple sequence alignment as the input file. I wanted to use TNT, but I cannot use my MSA FASTA file with TNT. Thank you in advance.
Hi everyone.
I have a question about finding a cost function for a problem. I will ask the question in a simplified form first, then I will ask the main question. I'll be grateful if you could possibly help me with finding the answer to any or both of the questions.
1- What methods are there for finding the optimal weights for a cost function?
2- Suppose you want to find the optimal weights for a problem that you can't measure the output (e.g., death). In other words, you know the contributing factors to death but you don't know the weights and you don't know the output because you can't really test or simulate death. How can we find the optimal (or sub-optimal) weights of that cost function?
I know it's a strange question, but it has so many applications if you think of it.
Best wishes
Experiencing slight differences in the trees obtained from BEAST and maximum likelihood...
Hi there,
I'm an undergrad Psychology student, I was taught in my statistics class that it is recommended to use pairwise deletion when dealing with missing data to not reduce your sample size and statistical power. I have been doing reading on Little's Missing Completely at Random analysis (MCAR) as well as imputation techniques like multiple imputation, maximum likelihood and full information maximum likelihood to deal with missing data instead of deletion.
My question is: if a researcher were to obtain a significant p-value on the MCAR test and then use pairwise deletion, would this mean that the sampling is not random, but instead depends on whether a participant responds to a question or not? If so, does this then eliminate the generalizability of the research?
Hi, I have questions about HLM analysis.
I got a result saying 'Iterations stopped due to small change in likelihood function'.
First of all, is this an error? I think it needs to keep iterating until it converges. How can I make it keep computing without this message? (The type of likelihood was restricted maximum likelihood, so I tried full maximum likelihood, but I got the same message.) Can I fix this by setting a higher '% change to stop iterating' in the iteration control settings?
I am working with a distribution in which the support of x depends on the scale parameter. When I obtain the Fisher information of the MLE, it exists and yields a constant value. So, in order to find the asymptotic variance of the parameter, can I take the inverse of the Fisher information matrix even though the Cramér-Rao regularity conditions are violated, and will the asymptotic normality of the MLE still hold?
Please suggest how I can proceed to find a confidence interval for the parameter.
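For context, the standard result being invoked here (valid only under the usual regularity conditions, which, as noted, can fail when the support depends on the parameter) is the asymptotic normality of the MLE,

\sqrt{n}\,\big(\hat{\theta} - \theta\big) \;\xrightarrow{d}\; N\!\big(0,\; I(\theta)^{-1}\big),

which yields the Wald interval \hat{\theta} \pm z_{\alpha/2}\sqrt{I(\hat{\theta})^{-1}/n}; whether this carries over to a non-regular model is a separate question.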
When I learn about the meta-analytic structural equation modeling using TSSEM method, I find some different opinions regarding missing data:
In Jak, S. (2015), Meta-Analytic Structural Equation Modelling (Springer International Publishing), the author indicated that 'Similar to the GLS approach, selection matrices are needed to indicate which study included which correlation coefficients. Note however, that in TSSEM, the selection matrices filter out missing variables as opposed to missing correlations in the GLS-approach, and is thus less flexible in handling missing correlation coefficients'.
While in Cheung, M. W.-L. (2021, January 22), Meta-Analytic Structural Equation Modeling, Oxford Research Encyclopedia of Business and Management (retrieved 28 Jan. 2021, from https://oxfordre.com/business/view/10.1093/acrefore/9780190224851.001.0001/acrefore-9780190224851-e-225), the author indicated that 'Instead of using the GLS as in Becker’s approach, the TSSEM approach uses FIML estimation. FIML is unbiased and efficient in handling missing data (correlation coefficients in MASEM) …'.
Based on what I described above, I feel confused about what kind of missing data TSSEM can handle: missing variables, missing correlation coefficients, or both (handled by different mechanisms)?
My understanding is that the two authors describe the handling of missing data in TSSEM from different angles: Dr. Jak emphasizes using selection matrices to filter out missing variables, while Dr. Cheung emphasizes using FIML to handle missing correlation coefficients. But I am not sure whether my understanding is right, so I sincerely invite you to answer my question. Thank you!
Zhenwei Dai
2021.8.28
Good day, I am interested in understanding a research paper that used the maximum likelihood of a generalized linear mixed model to predict the missing data from participants who did not complete an intervention in a randomized control trial.
I am not very familiar with statistical functions, and would like to understand this concept in simple terms to explain to my peers. Any information is appreciated.
Hello, I have non-normally distributed data and wonder whether I should use and report the standard maximum likelihood model fit indices such as CFI, TLI, etc., or whether I cannot use them due to non-normality. I'm also unsure whether I proceeded correctly in the SEM procedure with regard to non-normality.
This is the order I proceeded:
Model: UTAUT2, n=120, Software: AMOS 23, SPSS 23
1. Exploration
1.1 Eye-Inspection of data in SPSS for outliers (straight-liners)
1.2 KMO & Bartlett's test -> both suggest the data are suitable for FA
2. CFA:
2.1 Construct reliability and validity -> good after modification
2.2 Model Fit Indices (based on MLE) -> acceptable
2.3 normality test -> items are not normally distributed
2.4 Bollen-Stine Bootstrap -> p not significant, which means model should NOT be rejected
2.5 Bootstrap -> Bias-corrected percentile Method -> all items load significantly
3. Structural Model:
3.1 Bollen-Stine -> p not significant -> model is not rejected
3.2 Model Fit Indices (based on MLE) -> acceptable
3.3 Bootstrap -> Make assumptions
I'd be very thankful for any help and recommendations.
I need packages for estimating a VAR model using MLE.
ML estimation is one of the methods for estimating autoregressive parameters in univariate and multivariate time series.
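A minimal sketch in Python (assuming the statsmodels package is acceptable; R has the vars package and Stata has the var command): for a Gaussian VAR, the equation-by-equation OLS estimates returned by VAR.fit coincide with the conditional MLE of the coefficients.

import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Simulate a bivariate VAR(1) and re-estimate its coefficients
rng = np.random.default_rng(0)
n, A = 500, np.array([[0.5, 0.1], [0.2, 0.4]])
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = A @ y[t - 1] + 0.1 * rng.standard_normal(2)

res = VAR(pd.DataFrame(y, columns=["y1", "y2"])).fit(1)
print(res.params)  # intercepts and lag-1 coefficient estimates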
Hello All,
Wooldridge's Introductory Econometrics (5th ed.) states that "Because maximum likelihood estimation is based on the distribution of y given x, the heteroskedasticity in Var(y|x) is automatically accounted for."
Does this hold also for bias-corrected or penalized maximum likelihood estimation under Firth logistic regression?
Please advise.
Thanks in advance!
My result is different from some papers'; can you share your result?
Thanks so much.
As far as I know, the simulated maximum log-likelihood should increase with the number R of random draws.
I want to run Poisson Pseudo-Maximum Likelihood (PPML) in a panel data framework, as my dependent variable has many zeroes. However, my challenge is that all the literature I have read on PPML seems to apply it only to gravity-model-type estimations. Is it possible to run a PPML using panel data for a non-gravity type of model? If it is possible, what is the Stata command to use?
I have run ordered probit model for a latent factor using gsem function in Stata 16. This is because I have Likert scale (1-5) answers to my observable variables. The problem is I cannot obtain fit statistics for the model. I have read that it is not possible for gsem in Stata with some exceptions (e.g. latent class analysis). Is that the case?
Would an alternative be to run a SEM with maximum likelihood? I guess that has been done in some studies using Likert scale answers, and I can then get gof easily. But I am hesitant as it does not seem to be technically correct..
Thank you for any suggestions!
Hi all,
I'm looking for pieces of software that could compute Maximum Likelihood and Parsimony on mixed-ploidy SSR data (2n though 6n). I can devise sequences, but the crucial points are (1) handling more than 1n data; (2) ability to assess phylogeny among the mixed-ploidy dataset. Thank you for your suggestions!
-Marcin
I tried to build trees with Bayesian inference and a maximum likelihood algorithm using the same sequences, but I got different results. I reconstructed 5 homologous sequences with no polymorphic sites. In maximum likelihood there are no branch lengths among the 5 samples, while in Bayesian inference I do get branch lengths among the 5 homologous sequences. Why does this happen? Thank you.
I was reading my class work, which mentions various algorithms for pharmacophore mapping, and wanted to learn more about the maximum likelihood algorithm. What exactly are the advantages and limitations of using this algorithm? What is the best algorithm for pharmacophore mapping?
What is the difference between maximum likelihood sequence estimation and maximum likelihood estimation? Which one is a better choice in the case of channel non-linearities? And why and how does oversampling help with this?
I am trying to use R to create scatterplots of my data. I am using the effect_plot function (partial.residuals = TRUE) so that I can control for other variables in my plot, but I cannot figure out a way to estimate my missing data using full information maximum likelihood (FIML). Is there a way to do this in R?
If we have a NOMA uplink of, let's say, 2 users, and the base station uses joint maximum likelihood detection (JMLD) instead of successive interference cancellation (SIC), then the key performance metric is the BER (bit error rate).
How do we calculate the rate of the first user? With SIC, the user decoded first has its individual rate reduced by interference from the user decoded afterwards. If JMLD is used instead of SIC, should we then drop the interference caused by the 2nd user to the one decoded first?
Is this conceptually right? What is the relation of BER to achievable rate? Is it possible to achieve the desired rate but still have bit errors?
Rate = Bandwidth * log2(1 + SINR)
For clarity:
1) First user is the one being decoded first and has good channel gain and hence is strong user.
2) 2nd user is the one being decoded afterwards.
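For comparison, a textbook sketch of the uplink rates when SIC is used (B is the bandwidth, p_k and g_k the transmit power and channel gain of user k, and N_0 the noise power; these symbols are assumptions, not taken from the question):

R_1 = B \log_2\!\left(1 + \frac{p_1 g_1}{p_2 g_2 + N_0}\right), \qquad R_2 = B \log_2\!\left(1 + \frac{p_2 g_2}{N_0}\right).

With joint (ML) detection, the rate pair is instead constrained only by the multiple-access region, in particular the sum-rate bound R_1 + R_2 \le B \log_2\!\left(1 + \frac{p_1 g_1 + p_2 g_2}{N_0}\right).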
I want to construct a maximum likelihood tree from a gene sequence of an isolated species and full-length sequences of the same gene from other species obtained from GenBank, but I am unable to decide which model I should choose in RAxML: WAG or JTT.
Your suggestions will benefit me greatly.
Hi all,
I am running a mixed model and have noticed that my data violates the normality assumption. I am running this in SPSS and therefore the estimators available to me are maximum likelihood or restricted maximum likelihood. I was wondering if these estimates are robust for non-normal data? If so, is there a preference for one over the other? Currently my SPSS has defaulted to restricted maximum likelihood.
Thank you in advance for any help on this!
M
I have a nonlinear Wiener model, x_k = x_0 + θ_k*(t_k)^b + σ*B(t_k). I have a sample of m RMS values collected over m time intervals.
To obtain these parameters (θ_k, b, σ, B(t_k)), can I take partial derivatives of the PDF and solve for their values?
REF: https://ieeexplore.ieee.org/document/7574329 - Section II.B
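A sketch of the usual likelihood construction for this kind of degradation model (assuming a single drift parameter \theta rather than a separate \theta_k per observation, and independent Brownian increments): the increments \Delta x_k = x_k - x_{k-1} are independent Gaussians,

\Delta x_k \;\sim\; N\!\left(\theta\,(t_k^{\,b} - t_{k-1}^{\,b}),\; \sigma^2\,(t_k - t_{k-1})\right),

so the log-likelihood is the sum of the corresponding normal log-densities, and setting its partial derivatives to zero gives the MLEs (\theta and \sigma^2 in closed form given b, with b found numerically). B(t_k) itself is the random Brownian path, not a parameter to be estimated.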
or principal axis factoring? And which rotation method is best with maximum likelihood or principal axis factoring in order to be able to confirm the scale model in a confirmatory factor analysis? (The number of participants is between 400 and 500, and the research area is social sciences/education.)
Hi everybody!
I'm performing EFA on a database of 400 observations containing 39 variables that I'm trying to group. I'm using maximum likelihood and applying a varimax rotation.
I have eliminated all the variables with communalities < 0.4. I know this can be a bit "relaxed", but overall I don't have communalities that are very high (0.67 is the highest, and only for two variables). I have then dropped the variables with loadings < 0.4 and eliminated the variables that cross-load (usually with loadings just above .4).
After performing all these steps, I have 3 clearly defined factors with 19 variables in total (F1: 8 variables, F2: 7 variables, F3: 4 variables). Is it acceptable to drop that many variables?
Thanks in advance!
Mauricio
Hi, I want to get three years of hourly wind speed data at 50 meters height for specific locations in Turkey. Unfortunately, I couldn't find any good websites for this. Do you know any online sources for this?
One of the seven variables in my path model is binary, with yes/no answers, and this variable is exogenous. What estimation method should I use in Mplus? Is maximum likelihood good to go with?
Thanks for clearing this up for me.
We are trying to map land cover classes on a watershed. We have selected training sites (during a field campaign in early 2017) and extracted their spectral profiles based on a Landsat 8 image acquired at the time of field surveying.
In order to assess land cover changes, we wanted to map the same cover classes for a previous year. Since our training sites might not be relevant, we wanted to perform supervised classification using endmember spectra instead of ROIs. When importing those spectra into ENVI's Endmember Collection toolbox, it appears that only the Spectral Angle Mapper and Spectral Information Divergence classifiers can be used. Common algorithms such as Maximum Likelihood or Mahalanobis distance fail, returning the following error message:
Problem: the selected algorithm requires that the collected endmember spectra all contain an associated covariance. ENVI is unable to continue because some of the endmembers collected do not have their covariance.
Could anyone help here? Is our method actually relevant? How can we perform supervised classification using maximum likelihood / Mahalanobis distance classifiers on older satellite images?
My problem relates to fitting an intensity decay with an exponential function via the maximum likelihood method for Poisson-distributed data. Since I need to extract the lifetime of my intensity decay, I need to fit it with an exponential function, and since I know the data follow a Poisson distribution, I am using the MLM with that distribution to achieve my goal. I have seen similar problems, but for normal distributions, not for my specific case.
Below is some simulated data (an exponential function) whose parameters I want to estimate using the MLM for a Poisson distribution. It literally does nothing.
Can anyone help? Many thanks.
from scipy import stats
import numpy as np
from scipy.optimize import minimize
import pylab as py

# Simulated decay data
xdata = np.linspace(0, 25, 1000)
ydata = 10 * np.exp(-xdata / 2.5) + 1

def negative_log_likelihood(params):
    A, B, C = params
    yPred = A * np.exp(-xdata / B) + C
    # Negative log-likelihood of the data under a Poisson model with mean yPred
    LL = -np.sum(stats.poisson.logpmf(ydata, yPred))
    return LL

initParams = [5, 1, 0]
results = minimize(negative_log_likelihood, initParams, method='Nelder-Mead')
print(results.x)

estParams = results.x
yOut = estParams[0] * np.exp(-xdata / estParams[1]) + estParams[2]

py.clf()
py.semilogy(xdata, ydata, 'b-')
py.semilogy(xdata, yOut)
py.show()
I am fitting a first-order CFA model with four factors and 23 variables, of which 11 are categorical and 22 are continuous. I do not know which estimator (ML, MLR, MLM, WLSMV?) I should use. Can somebody please help me?
Thank you very much in advance.
I am running multilevel models in R (two-level and three level models) for my thesis. However, I have two problems:
1. Missing data
Literature advises use of Multiple Imputation (MI) or Full-information maximum likelihood (FIML). I do not know how to carry out these processes in r or stata taking into account multilevel modeling. I am looking for practical videos or articles that can help me run either of these processes. I would like to have something running from imputation to analysis process.
2. Many variables
I have many explanatory variables, i.e. dummies (e.g. gender), discrete and continuous variables. I am looking for a procedure to choose the variables for the regressions. I read an article that said PCA only works for continuous variables, and the process should also take multilevel modeling into account. Are there recommendations for practical videos or articles?
Lastly, which one should be conducted first: sorting out the missing data or choosing the variables (via PCA or another process)?
Dear all,
I am running a path model with a dummy variable (Gender) as an independent variable (there are more independent variables). The dependent variables are continuous.
Maximum Likelihood estimation assumes multivariate normality. But this assumption is violated using an independent dummy variable (or dichotomous variable).
I am looking for studies which have investigated the bias of an independent dummy variable on the maximum likelihood estimation when I use only continuous dependent variables.
Does anybody know such studies or some guiding rules?
[Due to sample design I can not use maximum likelihood parameter estimates with standard errors and a chi-square test statistic (when applicable) that are robust to non-normality and non-independence of observations.]
Thank you very much in advance.
Kind regards
Rico
Hello
I'm working on a project in which I want to detect forest and deforestation areas. I need some information about maximum likelihood and support vector machine (SVM) classifiers for image classification.
Hi,
Just-Pope yield function: y = f(x; a) + h(z; P)e (mean effect + variance effect).
I would like to know how to estimate this model through an MLE procedure.
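A minimal sketch of one way to set this up, under assumptions not stated in the question: a linear mean function f(x; a) = Xa, an exponential variance function h(z; P)^2 = exp(ZP) (P is called b in the code), and e ~ N(0, 1), so that y given x and z is normal with mean Xa and variance exp(Zb). The MLE then minimises the heteroskedastic normal negative log-likelihood:

import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, y, X, Z):
    k = X.shape[1]
    a, b = params[:k], params[k:]
    mu = X @ a                # mean effect f(x; a)
    var = np.exp(Z @ b)       # variance effect h(z; b)^2, kept positive
    return 0.5 * np.sum(np.log(2 * np.pi * var) + (y - mu) ** 2 / var)

# Example on simulated data
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
Z = X.copy()
a_true, b_true = np.array([2.0, 0.5]), np.array([-1.0, 0.2])
y = X @ a_true + np.sqrt(np.exp(Z @ b_true)) * rng.standard_normal(n)

res = minimize(neg_loglik, np.zeros(X.shape[1] + Z.shape[1]),
               args=(y, X, Z), method="BFGS")
print(res.x)  # estimates of (a, b)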
The statsmodels package has a class for estimating AR(p) processes, but this class cannot handle exogenous inputs (ARX models).
I used the ARMA class, which can estimate ARMAX(p,q) processes, and set q=0 (the number of MA coefficients) in order to estimate an ARX model.
I expected my ARX model to fit better than the AR model, but this is not the case. I suspect this is because ARMA uses maximum likelihood, which is an approximate solver. How can I fit an ARX model with exact least squares in Python?
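One way to get an exact least-squares fit is to build the lagged regressor matrix by hand and solve it by OLS; a minimal numpy sketch (the single exogenous input and current-value-only lag structure are illustrative assumptions):

import numpy as np

def fit_arx_ols(y, u, p):
    # Regressors: intercept, lagged outputs y[t-1..t-p], and the current input u[t]
    rows = [np.concatenate(([1.0], y[t - p:t][::-1], [u[t]])) for t in range(p, len(y))]
    X = np.array(rows)
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef  # [intercept, a1..ap, b0]

# Example on simulated ARX(2) data
rng = np.random.default_rng(1)
n = 300
u = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.2 * y[t - 2] + 0.8 * u[t] + 0.1 * rng.standard_normal()
print(fit_arx_ols(y, u, p=2))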
Hi, my model results in a low critical ratio for most of the coefficient paths.
Can it be due to the sample size? N=157; I used maximum likelihood as the estimator.
Goodness of fit tests are all good.
Is there any way to solve this issue?
I am currently analyzing my data for my thesis research, and an issue has come up that we do not know how to resolve. I have two time points, and I am conducting various path analyses in R with lavaan installed.
The issue I am having is with accounting for missing data. Since this is a longitudinal study, only 66% of participants completed both time points. I know that the default estimator with the lavaan package is maximum likelihood. However, this estimator removes cases that did not complete both time points from my analyses. Therefore, I resorted to use full information maximum likelihood estimator. However, when I use the function (missing = “ML”), my results come out strange. For instance, my r-square results are abnormally high, and my parameter estimates become very different from the default estimator and are very large as well. We think something isn’t right with my missing data function.
Does anyone know what might be causing this issue? Is there a better way to account for missing data in my longitudinal design with the lavaan package in R?
Your help and advice is greatly appreciated, as I feel like I am hitting a wall right now. Thank you!
I have a small data set consisting of 16 sequences, 919 bp each. I am trying to determine the phylogenetic relationships among the individuals. I am wondering whether the neighbor-joining method, maximum likelihood, or Bayesian analysis works best for small data sets like mine. Thank you.
The iterative proportional fitting procedure (IPFP) is an iterative algorithm for estimating expected cell values [M_ijk] of a contingency table such that the marginal conditions are met.
For iterative computation of the M_ijk using MLE for the model, other algorithms exist; the most common are for 2D tables, while I am interested in the 3D case, assuming the model of independence, which is equivalent to the model that all odds ratios equal one. I need references with examples showing the manual calculation, and showing how to do more advanced ones in R.
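For illustration only (not the requested references or R code), a minimal numpy sketch of IPF on a 3-way table under the mutual-independence model, i.e. fitting the three one-way margins:

import numpy as np

def ipf_3d(observed, tol=1e-10, max_iter=1000):
    m = np.ones_like(observed, dtype=float)
    margins = [observed.sum(axis=(1, 2)),   # margin over i
               observed.sum(axis=(0, 2)),   # margin over j
               observed.sum(axis=(0, 1))]   # margin over k
    for _ in range(max_iter):
        m_old = m.copy()
        m *= (margins[0] / m.sum(axis=(1, 2)))[:, None, None]
        m *= (margins[1] / m.sum(axis=(0, 2)))[None, :, None]
        m *= (margins[2] / m.sum(axis=(0, 1)))[None, None, :]
        if np.max(np.abs(m - m_old)) < tol:
            break
    return m

obs = np.array([[[10, 5], [8, 7]], [[6, 9], [4, 11]]], dtype=float)
print(ipf_3d(obs))  # MLE of the expected counts under mutual independence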
Hi Guys,
I would highly appreciate some help/input regarding Bollen-Stine bootstrapping in AMOS. Given that my data violated the multivariate normality assumption, I opted to run Bollen-Stine bootstrapping in AMOS (as MLR is not available in this software). However, when I compare the model fit indices for the bootstrapped model and the "normal" ML results, all the model fit indices are identical.
I am not sure what I am doing wrong, but clearly there should be a difference between the two; otherwise, what is the purpose of it?
Really looking forward to some input.
Thanks,
Jacky
Hi, I have a problem with a confirmatory analysis. I have 40 items, of which 10 were strong, but the maximum likelihood analysis tells me that I must retain 4 factors for these 10 items, and thus I am asked to repeat some items in more than one factor. My question is: is this possible? Is it methodologically acceptable?
Can someone familiar with the spedeSTEM software please help me?
I cannot access the web-based version; when I try it, a "this page cannot be displayed" error appears.
So I downloaded the command-line version. I tried it on Ubuntu (with Python installed), but when I run the script it says: /usr/bin/env: python -0 -t -W all: no such file or directory.
Can someone suggest me any idea to solve this?
Dear experts, I badly need help to solve a problem with Landsat image classification of a coastal region of Bangladesh. The problem concerns supervised classification of a coastal district. The land cover spectral reflectance values of the pixels are quite complex: the pixel values of built-up areas and barren land are very close, so the training sample pixels selected for urban areas and for barren land often conflict in the output. There are many unwanted areas in the classified image, e.g. many spurious built-up areas, or barren land occupying built-up areas. I tried more and fewer training inputs for the urban area several times, but the results are not appropriate. Can you please suggest any way to solve the problem?
What are the methods to solve the blind source separation problem?
This is related to entropy.
This is related to independent components analysis.
This is related to maximum likelihood estimation.
But how can higher-order moments be used?
I'm trying to create a map, where I want to show a specific object. Is it possible to classify only the sample area I put in the map without doing maximum likelihood classification?
I used MEGA7 to determine the best substitution model and the SM with the lowest BIC value is the LG substitution model. However, I can't find LG when I want to make a maximum likelihood tree of my proteins. Can anyone help me?
I used the maximum likelihood method to draw the tree; I don't know why the bootstrap values for the same bacterial species are low (1-29), as shown in the attachment (bootstrap consensus tree), and the numbers between the same species (original tree) are also low, as you can see in the attachment. I used the MUSCLE algorithm in MEGA to align my 16S rRNA sequences.
I also tried to use the Gblocks program, but the bootstrap values are still the same.
I appreciate your help.
When calculating interferometric coherence, why can't you do so on a pixel-by-pixel basis? I know the equation for estimating coherence = (|S1∙S2*|)/√(S1∙S1*∙S2∙S2*), where S1 and S2 are the two single-look complexes. And I know this calculation uses a maximum likelihood estimator, but why do you need to specify an estimation window, and why can't the estimation window size be 1?
Thank you.
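A small numpy/scipy sketch of the windowed (multilook) coherence estimator may make the issue concrete (window size and variable names are illustrative): with a 1x1 window the numerator and denominator reduce to the same single-pixel magnitude, so the estimate is identically 1 and carries no information.

import numpy as np
from scipy.ndimage import uniform_filter

def coherence(S1, S2, win=5):
    # Spatially average the interferogram and the intensities over a win x win window
    cross = S1 * np.conj(S2)
    num = uniform_filter(cross.real, win) + 1j * uniform_filter(cross.imag, win)
    den = np.sqrt(uniform_filter(np.abs(S1) ** 2, win) * uniform_filter(np.abs(S2) ** 2, win))
    return np.abs(num) / den

rng = np.random.default_rng(0)
S1 = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))
S2 = 0.7 * S1 + 0.3 * (rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64)))
print(coherence(S1, S2, win=1).mean())  # exactly 1 everywhere
print(coherence(S1, S2, win=7).mean())  # a meaningful estimate below 1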
I have a sample size of 215, with 10 disease-positive cases, and I have about 20 covariates that I want to examine. Can I do univariable logistic regression for each, and then include those with a p-value of more than 0.1 in a Firth logistic regression?
How can I interpret the results in SPSS?
Conf. Interval Type: Wald
Conf. Interval Level (%): 95
Estimation Method: Firth penalized maximum likelihood
Output Dataset: --NA--
Likelihood Ratio Test: 38.0566
Degrees of Freedom: 11
Significance: 7.65335733629025e-05
Number of Complete Cases: 176
Cases with Missing Data: 39
Number of Iterations: 26
Convergence Status: Converged
Last Log Likelihood Change: 6.3948846218409e-14
Maximum Last Beta Change: 5.36667170504665e-06
What do the significance, the last log-likelihood change, and the maximum last beta change mean?
I have a complex log-likelihood and I want to estimate its parameters. I have tried everything I could; the best explanation I got for the optim function's behaviour is that I did not use good initial values. Kindly advise, please; this is slowing down my work.
Hello,
I am currently looking at the results of confirmatory factor analyses (CFA) that were conducted by another person. There are a few analysis and model choices that don’t seem quite right to me. I would greatly appreciate if anyone with enough experience with CFA could let me know what they think of the following points:
1) Is maximum-likelihood (ML) estimation ever an acceptable method to use in CFA if variables are ordinal (e.g., Likert scales) or nominal?
2) Is ML estimation ever an acceptable method to use in CFA if data are not normally distributed?
3) Is it acceptable to keep an item with a factor loading > 1 (Heywood case) if the model's fit indices and parameter estimates are otherwise acceptable?
Thank you vey much!
I can find some articles describing R code for drawing a nomogram after logistic regression or Cox regression. But if, in my logistic regression model, penalized maximum likelihood has to be used to resolve the failure of the maximum likelihood estimation to converge, is it still possible to draw a nomogram?
If the answer is yes, how to write R codes for this condition?
Thanks for any reply.
Joan
I have estimated a tobit regression model with one dependent variable and 14 independent variables. The number of observations is 450 out of which 41 are left censored while all others are uncensored. I am using a primary survey data relating to farm households. The stata output provides a pseudo-R2 value of 3.75. Kindly let me know if it is acceptable.
Let X1,...,Xn iid random variables with distribution F(p), where p is some parameter. Due to some reasons, p is not observable directly (in the sense that there is no way to confirm whether p is static or dynamic). The challenge is to estimate the quantile of p to a given level, without assuming a particular distribution of p.
It seems that the bootstrap is the only choice. So, under the assumption that X1,...,Xn are not time-specific, the bootstrap might not be a bad choice. However, the estimate depends heavily on the number of bootstrap draws, which results in unreliable quantile estimates.
However, assuming that p can be estimated by maximum likelihood method as well as any conditions necessary to ensure a consistent ML estimate, then we know that the ML estimate follows a normal distribution.
So is it appropriate to use the quantile of the ML estimate (using normal distribution) to estimate the quantile of p?
Additionally, can someone comment on the following:
I am generating 1D data using a squared exponential kernel. If I use the data to learn the hyperparameters using a maximum likelihood approach, then what are the conditions under which I will get the same hyperparameters as I used for generating the data?
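A minimal sketch of that experiment (assuming scikit-learn is acceptable): draw 1-D data from a squared-exponential (RBF) kernel and re-fit its hyperparameters by maximising the log marginal likelihood. Recovery of the generating hyperparameters tends to require enough points, an input range that is wide and dense relative to the true length-scale, and a correctly specified (here, negligible) noise level; with few points or a misspecified noise term the likelihood surface can be flat or multimodal and different hyperparameters are returned.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 200).reshape(-1, 1)

true_kernel = ConstantKernel(2.0) * RBF(length_scale=1.5)
K = true_kernel(X) + 1e-8 * np.eye(len(X))        # jitter for numerical stability
y = rng.multivariate_normal(np.zeros(len(X)), K)  # one sample path from the GP

gpr = GaussianProcessRegressor(kernel=ConstantKernel(1.0) * RBF(1.0),
                               alpha=1e-8, n_restarts_optimizer=5)
gpr.fit(X, y)
print(gpr.kernel_)  # compare fitted variance and length-scale with the true ones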
I am estimating a multiple regression model (with one 2-way interaction) using maximum likelihood estimation (MLE). Due to substantial missingness on some important covariates (30-60% missing; n=19000), I estimated the multiple regression model using two missing data treatments (listwise deletion, multiple imputation). These methods, however, produced different results; for example, the interaction was significant when using multiple imputation, but not with listwise deletion.
Is there a method/test to evaluate which approach (listwise deletion or multiple imputation) is more trustworthy? In papers I've read/reviewed, people often seem to find concordance between their model coefficients when using listwise deletion and multiple imputation.
Also, for those interested, these models were estimated in Mplus, and I implemented a multiple imputation based on bayesian analysis to generate imputed datasets followed by maximum likelihood estimation.
Thanks much,
Dan
Is there any particular software for partitioning sequences? For example, the CO1 sequences were partitioned by codon position, whereas the EF1α sequences were partitioned into introns and exons. Please help me with this query.
For given samples X1,...,Xn, which follow a distribution F with parameters p1,...,pm, one may use the maximum likelihood method to estimate p1,...,pm. Technically, the method is just an optimization problem in an m-dimensional space.
In some cases, it is irritating to obtain different estimated values of p1,...,pm than the value obtained when estimating p1 alone (assuming this parameter does not depend on p2,...,pm). E.g., the joint estimators (\mu, \sigma) may differ from \mu estimated alone. Since \mu is reserved for the expected value, which one is "correct": the joint \mu or the single \mu?
1) If the joint one is correct, then any single estimator should be doubted.
2) If the single one is correct, then the joint model does not have any value, so one should always choose the univariate model.
In some cases, the joint \mu does not make any sense: the interpretation of \mu as an expected value is somehow no longer valid in a joint model.
I am currently developing a new flexible discrete distribution with two parameters. However, I am unable to obtain unique estimates using MLE or the method of moments: R returns multiple solutions with the same log-likelihood value, and the method of moments is also a dead end. Why does this happen? By the way, the distribution is legitimate, with probabilities summing to one.
Hi all,
It is easy to define an outgroup when reconstructing a maximum likelihood (ML) tree using IQ-TREE. Currently, I'd like to specify several outgroups in IQ-TREE, but I can't find any option to do this. I also tried specifying a comma-separated list of outgroup taxon names after the parameter -o (-o outgroup1,outgroup2,outgroup3…). It still does not work. Does anyone know how to do that? Many thanks.
All the best,
Fangluan
Intuitively, maximum likelihood inference on high-frequency data should be slow because of the large data set size. I was wondering if anyone has experience with slow inference; if so, I could develop optimization algorithms to speed up the inference.
I tried this with Yacine Aït-Sahalia's work on estimating diffusion models, using his code, which (unfortunately!) is pretty fast even for large data sets. If anyone knows of a large, slow, high-frequency financial econometric problem, do let me know.
Hi, can someone please explain: if the EM algorithm is initialised with K-means, does one need to compute the complete-data maximum likelihood estimate (MLE) or the maximum a posteriori (MAP) estimate?
In my understanding, setting priors leads the EM algorithm to maximise the MAP objective. So, by initialising with K-means, are we setting priors?
Thank you.
I am new to RStudio, and I would like your help and guidance to make a phylogenetic tree with maximum likelihood. Thank you.
I want to hear about experiences using traditional or relatively simple capture-recapture models to investigate effects of time-varying individual covariates (i.e. body size) on survival. An easy trick is to make the covariate categorical instead of continuous and examine effects of (size) states on survival (using multistate or multievent models), but if you want to investigate continuous size-survival relationships and how they change over time, what options do you have without needing to shift to more complex modelling tools?
Hello, I want to construct a 16S rRNA gene phylogenetic tree. I have retrieved similar sequences from NCBI and aligned them with ClustalX. I have installed PAUP* 4 on my PC. What should I do now to construct a tree with bootstrap values, and how do I save that tree?