# Comparative Modeling - Science topic

Explore the latest questions and answers in Comparative Modeling, and find Comparative Modeling experts.

## Questions related to Comparative Modeling

My team and I are developing a statistical test to compare models and observations in a model-independent way. We have found some 2D and 3D examples of models, coming from either simulations or theory, together with their respective observables, but 1D scenarios are somewhat harder to find. If you know of a well-known example, or have one from your own research, we would be happy to include it in our paper.

Ideally, the example would have a set of 1D observables (e.g., mass, period, spin) and a sample simulated from the model under examination (or a specific parametric form for that model, so that it is straightforward to sample from it).
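For context on what a model-independent 1D comparison can look like, the classical example is the two-sample Kolmogorov–Smirnov distance between an observed sample and a sample drawn from the model. A minimal, illustrative sketch (not our actual test):

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        f_a = bisect.bisect_right(a, x) / len(a)
        f_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d
```

Identical samples give 0, fully separated samples give 1, so the statistic is directly interpretable regardless of the model that produced the sample.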

Feel free to contact me directly if you want to know more about the project or about how we would use your example!

Greetings everyone and thank you in advance for your response to my questions.

I am currently learning my way through using AMOS for my multiple regression analysis, so I am humbly asking for some guidance, insights, and opinions from the fellow experts here.

I have two models in which I want to test the roles of social media addiction, social comparison (two types: upward comparison and downward comparison), and experience in using skin lightening products on skin tone satisfaction. Based on these constructs and literature reviews, I have built two models: the proposed model and the alternative model. The models are as in the attached image.

My questions are:

1) How can I compare models if I used multigroup analysis to test for the moderation effect of social media addiction in the proposed model, i.e., a chi-square difference test? I converted social media addiction into two groups: addicted and not addicted to social media. If I were to compare AIC and BIC values with the alternative model for model selection, which AIC and BIC values should I use: those from n = all participants, or those from each of the "addicted" and "not addicted" groups? Are there any step-by-step guidelines I can follow to properly compare a multigroup analysis model?

2) Based on my models, am I doing it right if I compute the AIC and BIC values for each hypothesis in each respective model (i.e., H1 is considered as the R1 model, H2 as the R2 model, etc.), or should I just compute the AIC and BIC values for the two models as they are (testing all hypotheses simultaneously, hence computing the AIC and BIC of each model as a whole)?
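For reference, part of the "which n" question comes down to what enters the formulas themselves: AIC depends only on the log-likelihood and the parameter count, while BIC also depends explicitly on the sample size n. The textbook definitions as a minimal sketch (not AMOS's internals):

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln(L),
    where k is the number of estimated parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L);
    note the explicit dependence on the sample size n."""
    return k * math.log(n) - 2 * log_lik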

I apologize if the questions are lengthy. I tried to explain as much as I could because I cannot find any literature to help me with this uncertainty of mine. Your responses are very much appreciated!

Hey everybody!

I'm implementing a Bayesian negative binomial model in Stata 17. Because of some collinearity and convergence issues, I needed to put my variables in different blocks in the modeling process. However, it is a bit confusing to choose the optimal number of blocks (and the exact set of variables in each) for a model. Do you have any ideas about this?

Apart from that, what criteria do you suggest (DIC, acceptance rate, efficiency, variable significance, etc.) for comparing models developed using various numbers of blocks?

I appreciate any help you can provide in advance.

I am researching trading strategies, particularly involving time series analysis, but also machine learning, on data sets that range as far back as 1950 and would like to compare the accuracy of various models, up to ~25 at a time.

My for loops in R for an ARIMA + GARCH model of S&P 500 values from 1950 to today take about 10 hours to evaluate all models with p and q each ranging from 0 to 5.

My specs are an Intel i5-8600K, 16 GB of DDR4 RAM, and an AMD R9 Fury, but I am not using any parallel processing.
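For what it's worth, each (p, q) fit is independent, so the grid parallelizes trivially (in R, e.g., via the parallel package's parLapply). A hedged Python sketch of the pattern, where fit_one is a dummy stand-in for the real ARIMA+GARCH estimation:

```python
from itertools import product
from multiprocessing import Pool

def fit_one(order):
    """Stand-in for a single ARIMA+GARCH fit at order (p, q).

    A real implementation would estimate the model and return its
    information criterion; this dummy score just keeps the sketch
    self-contained and runnable.
    """
    p, q = order
    return (p, q), (p - 2) ** 2 + (q - 1) ** 2

def grid_search(p_max=5, q_max=5):
    """Fit every (p, q) order on all available cores and return the
    order with the lowest score."""
    grid = list(product(range(p_max + 1), range(q_max + 1)))
    with Pool() as pool:
        results = pool.map(fit_one, grid)
    return min(results, key=lambda r: r[1])
```

With 36 independent fits and a 6-core CPU, wall time should drop roughly in proportion to the core count, assuming each fit is CPU-bound.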

Thank you.

My hypothesis is that the functional diversity of fish (response variable) increases up to an optimum level along a gradient of habitat structural complexity (predictor variable) but decreases after that (which I have noticed on graphical inspection). I am therefore really interested in this hump-shaped relationship.

To test this relationship, I will run a beta regression. In R, I have found two ways to include a second-order term in the model:

**I(x^2)** and **poly(x,2)**. The first one does not include the lower-order term, but the *poly* function does. According to Cohen, Cohen, West, & Aiken (2003), for the higher-order terms to have meaning, all lower-order terms must be included, since higher-order terms reflect the specific level of curvature they represent only if all lower-order terms are partialed out.

First, I would like to know if this is a consensus and if I really cannot use only the second order term as a predictor.

Second, if I use a likelihood ratio test to compare models (e.g., only the first-order term *vs.* the first- and second-order terms) and the result is not significant, how can I choose the best model?

Hi all! I know that Pagel's λ, which can be estimated using gls() in R, is not appropriate for binary data, but I wonder if I can still use gls() to compare models with different correlation structures (Brownian, OU, etc.). None of the packages I have found that are appropriate for binary models produce AIC values I can use to compare the models. Cheers!

We have a survey with N = 7000 responses to a set of Likert-style questions, demographics, and similar. We separated it into training and test samples, and are now doing exploratory analyses on the training data.

I was expecting to generate a list of hypotheses as in other social psychological work, e.g., "A will correlate positively with B". We could do that and verify them in the holdout sample. However, in this kind of test I don't know how to evaluate the fit of the prediction: r > 0 is not very informative.

We could run models such as regressions with certain factors and then compare model fit with or without sets of predictors; that could be evaluated by comparing fit indices. But most of the predictions I imagined weren't models like these.

So the question is: can you recommend a way to specify hypotheses about correlations? Should we be using an equivalence test to test against a certain effect size? Is there any way to do this in aggregate for a big table of correlations?
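One concrete option for the equivalence idea: instead of testing r > 0, test each correlation against a smallest effect size of interest using the Fisher z transform, whose standard error depends only on n, so the same check applies to every cell of a correlation table. A hedged sketch (the 0.1 bound is a placeholder, not a recommendation):

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation r from a
    sample of size n, via the Fisher z transform: atanh(r) is roughly
    normal with standard error 1/sqrt(n - 3)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

def exceeds_bound(r, n, bound=0.1):
    """True if the whole CI lies above the smallest effect of
    interest, i.e. the correlation is not just nonzero but
    meaningfully large."""
    lo, _ = fisher_ci(r, n)
    return lo > bound
```

Applied across a big table of correlations, this turns "r > 0" into "r is credibly larger than the bound", which is a falsifiable prediction on the holdout sample.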

Hello,

I am pretty new to this CFD topic.

I have already read plenty of articles and checked tutorials on the subject.

I am about to design an underwater towed body (close to the surface; a part is surface piercing), and I want to get information on the resistance of the hull. For this I was looking for comparable models. The best I found was a torpedo with a wetted surface very close to my body's. I set up a simulation and now I have results, but I am not really sure that everything I set up was right. So I set up a simulation for the torpedo as well (same settings), but the average drag is twice as much as reported in the paper. To specify my questions:

1. There are plenty of peaks in the drag curve. Shouldn't the curve become "flat" if everything is set up correctly?

2. I am not sure about the boundary settings, which I based on the boat tutorial.

I added some pictures. If you need more information, just let me know.

Thanks for your help

I am doing a mediation analysis where life satisfaction is a mediator between tourism impact and quality of life, and I want to compare this model across two regions. Do I have to check the mediation in both regions separately, or is there another way? Kindly suggest one.

I am estimating the relationships between Economic Growth (GDP), Public Debt and Private Debt through a PVAR model in which my panel data consists of 20 countries across 22 years.

First of all, how can I know what is the optimal lag length I should be using for such an analysis?

Then, using the IPS test for stationarity, two of my variables turn out to have unit roots in the panels, which is why I am thinking of taking first differences of both to use in the main PVAR model. Do you think this is correct?

Given that, does this mean I should be worried about cointegration? If yes, what do you advise me to do?

Finally, when analyzing my estimations after different changes, I cannot compare the models since there is no R-squared. Do you think there is a specific test I can run to compare models?

Thanks a lot!

My research area is entrepreneurship. I have a new research proposal on how entrepreneurs can finance their projects through crowdfunding. A literature search revealed that the U.S.A. is already adopting this novel finance model. Therefore, I need to collaborate with researcher(s) from the U.S.A., Europe, or other international organizations for research grants/aid, especially in the area of data collection, since I am comparing the model in developed and developing countries.

Dear partners, I am trying to model a high-rise RC building with shear walls (20 stories) in SAP2000, and I need to model it with the wide column method. I have read a lot about this, but when I compare the model using the wide column method with the same model using shell walls, the results are very different.

For this reason, I need help with the steps or definitions of the properties and sections of the elements (wide column and rigid beam), or your comments on this topic.

Attached is an image of the plan.

Thanks for your attention.

When I run the glm function in R and try to select the best prediction equation, the following message appears and I can't go further. Could anyone please help me with how to escape this problem?

Warning messages:

1: In dpois(y, mu, log = TRUE) : non-integer x = 14.610000

2: In dpois(y, mu, log = TRUE) : non-integer x = 14.390000

3: In dpois(y, mu, log = TRUE) : non-integer x = 14.420000

4: In dpois(y, mu, log = TRUE) : non-integer x = 14.790000.

Also, when comparing models using AIC, the result turns into infinity, and I can't understand why.
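For context on where the warning and the infinite AIC come from: the Poisson likelihood is only defined for integer counts, and R's dpois returns 0 (with exactly this warning) at non-integer y, so the log-likelihood, and hence the AIC, becomes infinite. A sketch of the mechanism (not R's actual implementation):

```python
import math

def poisson_logpmf(y, mu):
    """Poisson log-probability mass at y with mean mu.

    Mirrors R's behaviour: at non-integer y the mass is taken as 0,
    so its log is -inf, which is what makes the model's AIC infinite.
    """
    if y < 0 or y != int(y):
        return float("-inf")  # dpois(y, mu) == 0 with a warning in R
    return y * math.log(mu) - mu - math.lgamma(y + 1)
```

A common fix is to model true integer counts, or to switch to a family appropriate for a continuous positive response (e.g., Gamma), or to quasi-Poisson if only the mean-variance relationship matters (noting that quasi-likelihoods do not provide an AIC).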

Hi,

To all stats experts, is there a way to compare if a model fits better in one population compared to another? I'm thinking, maybe by comparing the model fit indices?

Your insight would be really helpful, thanks!

I have a cooperation proposal for researchers interested in long-term research on C2C, B2C, and B2B co-creation processes based on transaction costs. To begin, I can share my questionnaire and database related to transaction costs and the co-created value of a social media business model. I used SEM analysis to test the thesis about the dependence between transaction cost factors (behavioral uncertainty, opportunism, asset specificity, etc.) and comparative transaction costs in the choice of different types and forms of value co-creation. In this model I compare factors that cause costs and benefits for users of a social media app. We could publish the results together.

I have three dependent variables and 10 predictors, and I am analyzing the data with multivariate regression. However, I need to compare the model and the contribution of each predictor with other groups. Any ideas on how to proceed?

Hello, I have longitudinal data (30 measurements) from 30 subjects. These subjects are divided into three groups (a, b, c).

My question is about how I should build the LME; this is one possible approach:

I could start with the null model (M1 = response ~ time)

and then include an additive fixed effect for the groups, resulting in (M2 = response ~ time + groups), and compare both. Then, include an interaction term (M3 = response ~ time * groups)

and again compare.

Then, adding a random effect for the intercept would result in (M4 = response ~ time*groups, random = ~1|Subject), and finally the full model, with random effects for both intercept and slope (M5 = response ~ time*groups, random = ~Time|Subject).

On the other hand, I could start by including the random effects from zero (M1). Is there a correct approach to this problem?

On the comparison part:

I am comparing models that differ in their fixed effects through Wald t-tests (anova(mn)). With this result I check the individual significance of a fixed effect instead of comparing two or more models directly.

Whereas when the fixed effects are the same but the changes occur in the random effects, I use anova(m1, m2, ..., mn) to pick the best model.

Is this also the correct approach?

My coauthors and I are finishing up a project in which we look at ways model data are compared with observational data. We found specific conceptual disconnects in using r2 and various goodness-of-fit methods for this. As a matter of academic diligence, I would like to ask what other researchers use for comparing model output with observational data. r2 is probably the most popular because it is so easy to implement. If r2 were not available to you, what would you use? A preprint of our article is available at

Hello, I have created a SWAT model in a basin where flow data do not exist. There are few water quality measurements: fewer than 10 measurements each for 2013 and 2016 only, with the 2013 data from July to October and the 2016 data from September to December. Am I even able to calibrate and validate this model? How can I compare SWAT outputs with my water quality nutrient data? Much guidance needed, thanks.

I am currently using a dynamic in vitro BBB model for my research, and I would like to compare models using primary rat brain endothelial cells (which I am isolating myself) and an immortalised rat brain endothelial cell line. I would like to use RBE4 in particular as they are the most well-characterised, and I do not want to use a human cell line as I would like to be able to directly compare the results with my primary rat cells. However, I cannot find a source of these cells within the UK, any help in sourcing some would be appreciated.

A likelihood ratio test comparing a reduced model with a full model that differs by one fixed factor results in a chi-square distribution with zero degrees of freedom.

```sas
/* reduced model */
proc mixed method = ml;
  class block gen;
  model rtwt = / ddfm = kr;
  random block gen;
run;

/* full model */
proc mixed method = ml;
  class block gen;
  model rtwt = prop_hav / ddfm = kr;
  random block gen;
run;
```

There are three covariance parameters in the reduced model: the block variance, the genotype variance, and the residual variance. The full model, which includes prop_hav as a covariate, has the same covariance parameters. The difference in their -2 log-likelihoods then has zero degrees of freedom under the chi-square distribution. Could anyone please guide me on how to compare these models to ascertain whether the full model is significantly different from the reduced model?
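One note that may resolve the puzzle: under ML, the degrees of freedom of the likelihood ratio test equal the difference in the *total* number of estimated parameters, fixed effects plus covariance parameters, not the covariance parameters alone. The full model adds one fixed effect (prop_hav), so the test has 1 df. A minimal sketch of the computation, with the -2 log-likelihood values taken from each PROC MIXED "Fit Statistics" table:

```python
import math

def lrt_1df(minus2ll_reduced, minus2ll_full):
    """Likelihood ratio test for one extra (fixed-effect) parameter.

    The statistic is the drop in -2 log-likelihood; for df = 1 the
    chi-square survival function has the closed form erfc(sqrt(x/2)).
    """
    stat = minus2ll_reduced - minus2ll_full
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

The example values below are purely illustrative, not from the models above.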

I want to compare the results of an FEM model and experiments on a tunnel lining with theory or any standard norm.

Thank you

When I was first exposed to methods of model selection for ARIMA-class models, I was told that visual inspection of the ACF and PACF (when the data are stationary) is satisfactory.

However, compared with model selection using AIC or BIC, we will sometimes end up with a different model.

Attached is a plot of the time series data I am using, along with its ACF and PACF. Visual inspection would lead us to conclude that the appropriate model is an AR(1).

However, based on the model selected by AIC (given by the R command auto.arima), the appropriate model is an ARMA(2,2).

In terms of model selection, which method is preferred: visual inspection of the ACF and PACF, or the use of a selected information criterion?

I need to compare models for predicting DBH and height from stump diameter at 30 cm. I am looking for allometric models for tropical and subtropical tree species. My objective is to predict stem/bole biomass from stump diameter at 30 cm. I have species-wise allometric equations for volume and biomass for a particular region, but these equations are based on DBH and height, not stump diameter.

Is there any alternative way to reach my objective?

I need suggestions...

I want to take a 10-12 residue loop from an E. coli enzyme polypeptide which confers a particular allosteric mechanism (substrate specificity) and apply it to a non-homologous mouse catalytic enzyme, in order to generate a novel enzyme whose hybrid loop I can then refine and adjust, to see whether it produces a similar allosteric selectivity, depending on the loop orientation, in response to different substrates binding at the active site.

Now, my question is this; what freely available computational resources exist that can do this?

I'm vaguely familiar with a number of programs and servers, but Rosetta match/design/dock is not user-friendly and overly complex, with no clear indication it can accomplish what I want specifically. I'm aware it offers loop building, matching, and modelling, but I cannot find evidence of anyone else doing this with loops in the literature, and there is no clear answer as to whether it can place a loop at a specific location/orientation. The existing literature seems to only ever use comparative modelling, which is no use here since there is no homology; we are trying to create a novel function. There is also no clear GUI, so I cannot observe or experiment with placement.

PyMOL's build function, Chimera's loop builder, and PeptideBuilder all seem to suggest that their builders and minimizers can do what I need, but there is never a clear "add this loop to this protein at this specific location" function; it's always too indirect.

I'm fairly new to computational biochemistry. Can someone please point me in the right direction? How can I simply and clearly cut a loop from one protein, attach it to another computationally, and then assess whether it modulates enzyme specificity?

Is there a service or program that does this? Or is my best bet using the likes of I-TASSER, zdock or Rosetta?

Thanks in advance, I'm probably missing something alarmingly obvious, but I've been going round in circles and it's giving me such a headache!

Hello, everyone:

Recently, I was trying to build the 3D structure of a protein which has no crystal structure, and I have some questions.

First, what are the lowest thresholds of sequence coverage and identity between the target sequence and the template for comparative modeling?

Second, can ab initio prediction generate reliable results for a protein with no significant sequence identity to known templates?

Third, using the popular software I-TASSER, the result parameters are as follows: the C-score is -0.78, the TM-score is 0.61±0.14, and the expected RMSD is 9.4±4.6. The manual says that a result with a C-score greater than -1.5 is reliable. However, I find that the sequence coverage and identity between my target sequence and the best template are 12% and 31%, respectively. Can anyone give some evaluation of the prediction? Can I use it in a published article?

Finally, can anyone suggest a pipeline or classical articles for the prediction of 3D structure?

Thanks very much for any reply!

Where can I access topside sounder data (e.g., Alouette I/II, ISIS I/II, etc.)? I would like some actual electron density profiles to compare with models. The higher the orbital altitude, the better.

Best regards,

Alex

I request help in updating myself on the latest developments in DEA. Are there any comparable models developed beyond DEA? SFA and DEA are productivity assessment tools. Kindly suggest any developments in the area of productivity measurement.

I want to make a comparison between my proposed model and an existing model based on 3 different performance errors: MSE, RMSE, and MAPE. Also, I have 4 different medical data sets to be trained for classification purposes. My question is: should I use the same number of iterations with the 3 different performance errors for each of the data sets? Or what is the best way to compare the models in terms of performance error?

When plotting the graph, what is the best way to plot the curve: performance error vs. number of iterations, or something else?
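For what it's worth, the three error measures are cheap to compute side by side for every model and data set, which makes the "same iterations, same metrics" comparison straightforward. The textbook definitions as a sketch (not tied to any particular toolbox):

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error (same units as the data)."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)
```

Plotting each metric against the iteration number, with one curve per model, is the conventional presentation.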

Thanks

Respected All,

I have modeled a structure. When I assess the quality of the modeled structure after energy minimization with a Ramachandran plot, some residues outside the binding pocket lie in the outlier region. What are the possible reasons for these residues lying in the outlier region? How can we determine the reason behind this error in the model?

Best regards,

I am interested in comparing model fit among PROC MIXED repeated measures models for 8 different outcomes. Fit statistics (AIC, AICC, etc.) were used to select the best-fitting model for each of the 8 outcomes; however, I would like to be able to compare the best-fit models for the outcomes to each other. For example, if I wanted to compare how much variation in weight was accounted for by the best-fit model for weight with how much variation in height was accounted for by the best-fit model for height, what metric would I use? I have found some reports and publications that use covariance parameter estimates to compute the intraclass correlation coefficient (ICC) to compare models employing different hierarchical structures for the same outcome, and I wondered whether the ICC could be used in my situation as well.

Why has the measured reflectance peak amplitude decreased compared with the modeled spectra?

I modeled the structures of two protein domains by comparative modelling using MODELLER. I want to link or combine them into one dimer protein based on an available template. I need your experience in this case, since I am stuck here. I await your proposed procedure and MODELLER script.

Yours sincerely

I used the "comparative modeling" algorithm of the online tool ROBETTA (http://robetta.bakerlab.org/) to model a protein structure. The tool always generates five models, about which the ROBETTA FAQ states: "In the case of homology modeling predictions, the models are ordered by the likelihood of the alignment clusters. Each alignment cluster represents a particular topology."

Thus, if I understand correctly, model 1 is more likely to be the "true structure" than model 2, model 2 is more likely than model 3, and so on. I am wondering whether this is quantifiable; in other words, I want to know how much more likely model 1 is than model 2. Is it possible to get "some numbers" for this?

I have a second problem. I generated a mutant of the protein that I am trying to model, and this mutation (a single point mutation) causes a complete loss of function in vivo. When I run the comparative modeling algorithm on ROBETTA for the wildtype protein, I get a structure that looks very plausible (for a lot of reasons), and it is the model ranking second ("model 2"). If I use the mutant with the loss-of-function mutation under the same conditions in the same algorithm, I also get this structure (or something very similar). However, this structure is now not model 2 but model 4; it seems to rank lower (i.e., to be "less likely"?). Can I interpret these data as meaning that the loss-of-function mutation makes it harder to acquire this specific fold (or can I at least say the data hint in this direction)? And if yes, would it be possible to quantify this (i.e., to say how much the mutation interferes with folding)?

Hello,

I am modeling a tsunami in FEM software. To reduce the calculation time, I built it as a 2D model (assuming the plane strain condition, per 1 mm width).

To compare the model results with experimental results, do I have to consider any factors? (e.g., multiplying the software results by the experimental model width, etc.)

Hi,

I'm using a Lagrangian dispersion model to study the contribution of urban traffic to concentrations of NOx. We need to compare model results with measurements, but of course the model does not include chemistry and cannot simulate photochemical NOx. On the other hand, we do have a regional-scale model that can treat the full chemistry, and we have access to data from several monitoring stations.

I'm currently studying the method described in Saravanan A., et al., "A Method for Estimating Urban Background Concentrations in Support of Hybrid Air Pollution Modeling for Environmental Health Studies", Int. J. Environ. Res. Public Health, 2014, but I can't seem to find a practical example of the method's application.

Can someone suggest an alternative method with a reference in which we can see it actually applied to some cases?

thanks

Felicita

According to ASME standards, validation of a physical/mathematical model has to be done by comparing model predictions with the "real world", i.e., with experimental results.

Often, however, no such results are available, at least not with the necessary accuracy and error analysis when it comes to turbulent flow.

Shouldn't we, as an alternative, take high-quality DNS results instead and still call it "validation of a model"?

It is known that Wald tests and likelihood ratio tests typically yield very similar results, especially as the sample size increases.

When conducting a trend test for a continuous variable in CLR, should I report the p-value for trend from the likelihood ratio test comparing the model with the continuous variable to the one without it, or report the p-value for the continuous odds ratio (Wald test)?

I'm using "out-of-sample maximum likelihood" to assess model fit for several candidate prediction models generated in SAS 9.3 using PROC GLIMMIX or PROC GENMOD (binary distribution). "Out-of-sample maximum likelihood" is calculated as ∑(measured × log(predicted) + (1 − measured) × log(1 − predicted)). Has anyone used this method to compare similar models, or can you recommend other statistics for comparing models and assessing model fit? I'm seeking alternative methods or references for the application of "out-of-sample maximum likelihood" to assess model fit.

The AUROC is not necessarily equal to the model's accuracy. Sometimes the AUC is higher than the accuracy, and vice versa. What does that tell us? Is there any link to ranking quality or confidence in the model's decisions?
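The quoted out-of-sample log-likelihood formula, directly as code (a sketch; predicted probabilities must lie strictly between 0 and 1):

```python
import math

def oos_log_likelihood(measured, predicted):
    """Out-of-sample Bernoulli log-likelihood:
    sum of y*log(p) + (1 - y)*log(1 - p) over held-out cases.
    Always negative; closer to zero means better out-of-sample fit.
    """
    return sum(y * math.log(p) + (1 - y) * math.log(1 - p)
               for y, p in zip(measured, predicted))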

If I am trying to estimate two competing models using maximum likelihood estimation and one of them fails to converge, can I assume that the model that converged is a better fit?

Applied researchers often use nonlinear transformations on variables that are not normally distributed prior to modeling. I would like to show that other techniques result in superior models, but comparative model fit statistics such as AIC and BIC cannot be used across a nonlinear transformation of the response (Burnham & Anderson, 2002). Are there other statistical methods for comparing such models?

I took a protein domain sequence from the UniProt database. My current aim is to design a structural model of this domain using a domain homologue with 25% sequence identity. Will it be a good choice to serve the purpose?