- Bruce Weaver added an answer: What can I do while doing two-group multivariate analyses with unequal group sizes?
I want to compare performances of 9 patients with those of 45 healthy matched controls on a series of cognitive tests. Most of the related studies have used ANOVA, even when the comparisons have been between two groups on only one dependent variable. But, all the studies have had relatively equal group sizes (8-8, 9-9, 9-10, 11-12, etc.). What can I do with the large difference between the group sizes while doing one-way ANOVA, mixed ANOVA (group[patient, normal] as between subject variable, subtests/conditions as within-subject variable, and a dependent variable [e.g., number of correct responses]), or ANCOVA (baseline tests as covariates)? Meanwhile, Levene's Test shows no deviations from homogeneity of variance (homoscedasticity) for most of the variables (not for all), also, the p values (for differences between the groups) are very low.
I wonder what would happen if Fagerland & Sandvik's simulations were performed using Swedish or Finnish computers. Would the results change? :-)
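For readers working in R, the Welch approach handles exactly this kind of imbalance; here is a minimal sketch with simulated data (the means, SDs, and group labels below are illustrative, not from the study):

```r
# Sketch: comparing two groups of very different sizes (9 vs 45) without
# assuming equal variances. All numbers here are simulated for illustration.
set.seed(2)
patients <- rnorm(9,  mean = 40, sd = 8)   # hypothetical test scores
controls <- rnorm(45, mean = 52, sd = 6)

# Welch's t-test (R's default): no equal-variance assumption, so the
# 9-vs-45 imbalance is absorbed into the degrees of freedom.
t.test(patients, controls)

# The same comparison phrased as a one-way Welch ANOVA
score <- c(patients, controls)
group <- factor(rep(c("patient", "control"), c(9, 45)))
oneway.test(score ~ group)   # var.equal = FALSE by default
```

With only two groups the two calls give the same p-value; the ANOVA form generalizes if more groups are added.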
- Rainer Duesing added an answer: What is the best way to analyze a 2x4 within-subjects design, with a metric between-subjects variable?
I have a 2 (valence: pos vs neg) x 4 (stimulus type) design. Both variables were assessed within each subject. Additionally, I have a metric trait variable. A simple way would be to do a median split and incorporate it as a between-subjects variable in a 2x2x4 ANOVA, but I am aware of the problems with median splits (although sometimes exaggerated, see DeCoster, Iselin, & Gallucci, 2009).
I tried to run it as an ANCOVA, including the trait variable as a metric variable, but I am not really satisfied with the interpretability of this option.
Do you have any other ideas for how to analyze the data while keeping the trait variable metric?
I use SPSS and MATLAB.
Yes, I have the Tabachnick & Fidell book (great book) at hand but have ignored this chapter for too long, I suspect. I also have West, Welch, and Galecki - Linear Mixed Models. Roh, I thought that if you already used R, you could give a quick introduction ;-)
I think we both attended the same Mplus workshop. Maybe I'll ask Timo
- Prasanth Sasidharan added an answer: What is the difference between ANCOVA and repeated-measures ANCOVA?
Could you please guide me on ANCOVA and repeated-measures ANCOVA? What is the difference between the two? Please explain using an example.
Hi all, thanks for the help.
I need a bit more clarity on ANCOVA and repeated-measures analysis of covariance (not ANOVA). Please advise.
- Amber Muhinyi added an answer: Can anyone advise on how I can conduct a power analysis for ANCOVA using G*Power or another method?
G*Power requires that the numerator df be specified - can anyone advise how to determine these in order to calculate the required sample size? Thank you
Thank you Joan - this looks excellent.
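On the numerator df: for the group effect in a one-way ANCOVA it is simply k - 1 (number of groups minus one); covariates change only the denominator (error) df, N - k - c. If G*Power is unavailable, base R's power.anova.test can give a rough answer, treating the ANCOVA as a one-way ANOVA on covariate-adjusted scores; the variance figures below are assumptions for illustration, not estimates from any real study:

```r
# Rough sample-size sketch for a two-group ANCOVA, approximated as a
# one-way ANOVA on covariate-adjusted scores. between.var and within.var
# are assumed values, not estimates from any real study.
k <- 2                                    # two groups -> numerator df = k - 1 = 1
res <- power.anova.test(groups      = k,
                        between.var = 1,  # assumed variance of the group means
                        within.var  = 8,  # assumed covariate-adjusted error variance
                        sig.level   = 0.05,
                        power       = 0.80)
ceiling(res$n)                            # required n per group under these assumptions
```

A good covariate shrinks within.var, which is how ANCOVA buys power relative to a plain ANOVA.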
- Massimiliano Grassi added an answer: Is the wild bootstrap valid in a "borderline" case t-test?
I have to compare the mean levels of a continuous variable y (ranging 1-20), subdividing my sample into two groups according to a dichotomous variable (i.e., gender). Sample sizes of the two groups are unequal (m=20/f=80).
It happened that in the male (smaller) group all subjects have y=1, while in the female group scores range across all possible values, although the distribution is not normal and highly skewed towards the lower scores. Because of this, the male subgroup has no variance in y while the female group does.
Considering these issues, I thought to use bootstrap on t-test to make a more reliable mean comparison between the two groups:
I first applied bootstrap, stratified for gender, with (welch) t-test.
I then tried a wild bootstrap approach with the same t-test. As a matter of fact, I can consider this a special case of regression with a single dichotomous predictor, with extreme heteroscedasticity and non-normality of residuals, and in regression with heteroscedasticity the wild bootstrap approach is usually recommended.
What relevantly differs between the two bootstrap strategies is that with the wild approach I'm bootstrapping residuals from one group onto subjects of the other group. Results are also very different: the wild bootstrap provides much smaller p-values/CIs than the stratified bootstrap with the Welch correction.
My questions are:
1. Is a bootstrapped t-test valid in this situation? And if so,
2. Which of the two bootstrap approaches is the most correct in this situation?
3. In case I add some covariates (i.e., ANCOVA with one dichotomous predictor and two continuous predictors), what is again the most correct bootstrap strategy?
Many thanks for your help!
Thank you all and sorry for the late reply (notifications went to spam folder...)!
I totally agree with you all, and as you guessed, my worry is that reviewers may object to not providing a formal test of the group difference. I think that, beyond the explanation, I can add a bootstrapped CI of the female mean to show that it does not include 1; that seems to me an easy-to-accept compromise.
Actually I have a couple more y's to test with the same model (showing heteroskedasticity and non-normality of residuals, even though non-linearity seems not to be the issue) and I think I will go for a wild bootstrap for these.
And thank you all for the general reminders that are truly valid for every analysis!
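The compromise described above (a bootstrapped CI for the female mean, shown to exclude the constant male value of 1) can be sketched in base R; the skewed scores below are simulated stand-ins for the real data:

```r
# Percentile bootstrap CI for the female mean. The male group is constant
# at y = 1, so the question is whether the CI for the female mean excludes 1.
set.seed(4)
y_f <- pmin(round(rexp(80, rate = 0.5)) + 1, 20)   # simulated skewed scores in 1..20

boot_means <- replicate(5000, mean(sample(y_f, replace = TRUE)))
ci <- quantile(boot_means, c(0.025, 0.975))        # 95% percentile CI
ci
ci[1] > 1   # if TRUE, the CI excludes the constant male score of 1
```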
- Jochen Wilhelm added an answer: Can anybody help on repeated-measures ANCOVA?
Is it possible to do a repeated-measures ANCOVA with only pre and post values, using the pre value as a covariate, and a time vs treatment interaction? Please help.
Just to unravel the common concept:
The example in Gustavo's link proposes to calculate the repeated-measures-ANOVA as
summary(aov(price ~ store + Error(subject/store), data=groceries2))
The very same result is obtained from the anova of the linear model
anova( lm(price ~ subject + store , data=groceries2) )
which is of the form I proposed above.
The difference is that the aov-function gives SumSq values between subjects within stores, whereas the lm-function gives the SumSq between subjects over all stores. If this is not the interesting point, both ways are equivalent, since they calculate the same result for the effect of "store", the lm with the advantage that the actual differences between the stores can be more directly inferred.
Both methods also work for unbalanced designs. However, aov throws a warning that the Error() model is singular (deleting some entries causes some stores to have no measurements for some subjects, so "Error(subject/store)" cannot be estimated there; the "problem" can be "solved" by again allowing the between-subject error to be calculated over all stores: "Error(subject)")
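Since groceries2 itself is not shown in the thread, here is a minimal simulated stand-in demonstrating the equivalence described above (structure assumed: one price per subject-store cell):

```r
# Simulated stand-in for the 'groceries2' data (subjects buying the same
# items at several stores); the real data are not shown in the thread.
set.seed(1)
groceries2 <- expand.grid(subject = factor(1:5), store = factor(1:4))
groceries2$price <- 10 + as.numeric(groceries2$store) + rnorm(20, sd = 0.5)

# Repeated-measures ANOVA via aov() with an Error() stratum
fit_aov <- aov(price ~ store + Error(subject/store), data = groceries2)
summary(fit_aov)

# Fixed-effects formulation: subject entered as a blocking factor.
# The F statistic for 'store' is the same in both outputs.
fit_lm <- anova(lm(price ~ subject + store, data = groceries2))
fit_lm
```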
- Meg Barber added an answer: How can one account for confounding variables in data analysis?
I am hoping somebody can help me. I am currently writing my protocol for my research module, however the data analysis section is baffling me!
My study is an RCT with two treatment arms, and I am wondering which is the best method for accounting for/controlling potential confounding covariates. I have considered performing a within-group ANCOVA, and then further statistical analysis based on the F-test significance (e.g., if significant, create subsets and analyse with ANOVA or multiple Student t-tests for continuous endpoints); if the F-test finds no significant difference between covariates, just analyse as originally intended (e.g., Student t-test for continuous and chi-square for categorical endpoints). I have also considered a Bonferroni correction, but I am unsure whether this would be suitable.
Apologies for rambling- my mind is swimming! Many thanks for any advice to be offered!
That's brilliant, thank you for your help!
- Any suggestion about using ANCOVA with repeated measures?
My consulting adviser said that we can't use covariance method when there are more than 2 time points. But I'm not sure about it again!
What's your idea about that?
Unfortunately no, not yet.
- Marina Menez added an answer: How should I analyze results from a study that involved pre- and post-test measurements and had one experimental and one control group?
Should I go for ANCOVA (use “group” as fixed factor, and pre-test as covariate)? Or should I go for Split-plot ANOVA (use “group” as between-subjects factor and measurement occasion as within-subjects factor) and look for significant interaction?
(Estimated marginal) mean differences in ANCOVA could also go in a different direction than I expect.
I agree, but given that you have already controlled for possible pre-existing differences between your groups, you are in a better position to establish (a) that the effect was significant and (b) that the direction of the difference is the one your test indicates. You can't assert that in the mixed ANOVA (well, not in a direct way; you would have to run other tests, like Tukey's, etc.).
Try to get Judd and McClelland's book... it has all this stuff, presented in a very friendly way... the other good reference is Maxwell and Delaney's book, Designing Experiments and Analyzing Data, but it's a little more technical.
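The two options can be put side by side in R; everything below is simulated, with names chosen purely for illustration:

```r
# Pre/post scores for a two-arm study (simulated; a 5-point treatment
# effect is built in so the two analyses have something to find).
set.seed(42)
n <- 30
group <- factor(rep(c("control", "treatment"), each = n))
pre   <- rnorm(2 * n, mean = 50, sd = 10)
post  <- pre + ifelse(group == "treatment", 5, 0) + rnorm(2 * n, sd = 5)

# Option 1: ANCOVA -- post score with pre score as covariate
fit_ancova <- lm(post ~ pre + group)
summary(fit_ancova)

# Option 2: split-plot ANOVA -- the group x time interaction asks the
# same question ("did the groups change differently over time?")
d <- data.frame(id    = factor(rep(1:(2 * n), 2)),
                group = rep(group, 2),
                time  = factor(rep(c("pre", "post"), each = 2 * n)),
                score = c(pre, post))
fit_split <- aov(score ~ group * time + Error(id), data = d)
summary(fit_split)
```

With randomized groups the ANCOVA is usually the more powerful of the two.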
- Haris Memisevic added an answer: When should I perform ANCOVA?
I was wondering about when is it appropriate to use covariate in the analysis of data. Some cases are straightforward such as when you have pre – post intervention scores in homogeneous groups and use pre intervention scores as a covariate.
But what to do in the case of comparing, for example, motor skills in people with intellectual disability and those without ID? Does it make any sense to include IQ as a covariate when we know that these two groups differ in IQ scores? Any thoughts on this?
Reading the comments, I finally realized the right question. I guess you should not use (or at least should be very careful about using) a covariate when the groups naturally differ on the covariate (i.e., were not randomly assigned to groups). Thank you for clarifying this concept.
- Moath Awawdeh added an answer: Can anyone recommend a good book which covers one-way analysis of covariance (ANOCOVA or ANCOVA) with applications and interpretation of tables?
I am using ANOCOVA for both of its goals (regression and grouping) using Matlab programming (aoctool). I used Schewart's lecture notes and Huck's book (Reading Statistics and Research) as references, but I need a deeper treatment that presents this analysis technique in more detail with more engineering applications. Unfortunately, I don't use SPSS!
Thank you all!
- Mehdi Hedayati added an answer: ANCOVA, ANOVA, or paired t-test?
I am puzzled about which test to use when testing whether two sample outcomes differ statistically significantly. Should I use the t-test, ANCOVA, or ANOVA? In what cases should I prefer ANOVA or ANCOVA over the t-test? I almost always use the t-test, and also tell my students to do so. Who can help me out?
Dear Ronan, thank you for your description. It is an interesting point; I will search again and check the assumptions of the t-test. In texts and popular websites such as Wikipedia, referring to papers, we can find the assumptions below for the t-test:
In a specific type of t-test, these conditions are consequences of the population being studied, and of the way in which the data are sampled. For example, in the t-test comparing the means of two independent samples, the following assumptions should be met:
Each of the two populations being compared should follow a normal distribution. This can be tested using a normality test, such as the Shapiro–Wilk or Kolmogorov–Smirnov test, or it can be assessed graphically using a normal quantile plot.
If using Student's original definition of the t-test, the two populations being compared should have the same variance (testable using F-test, Levene's test, Bartlett's test, or the Brown–Forsythe test; or assessable graphically using a Q–Q plot). If the sample sizes in the two groups being compared are equal, Student's original t-test is highly robust to the presence of unequal variances. Welch's t-test is insensitive to equality of the variances regardless of whether the sample sizes are similar.
The data used to carry out the test should be sampled independently from the two populations being compared. This is in general not testable from the data, but if the data are known to be dependently sampled (i.e., if they were sampled in clusters), then the classical t-tests discussed here may give misleading results.
Anyway, it was an interesting question whether the normality assumption concerns the data or the errors.
On the other hand, I think that if the data are not normally distributed, we can still use the t-test if the sample size is large enough.
I will follow and check it again.
- Mikhail Saltychev added an answer: How do I use ANCOVA for meta-analysis?
I am trying to conduct a meta-analysis using post-test values from several studies with two independent groups (cases/controls). There are pre-/post-test means and SDs. What I'm looking for are tips on how to employ ANCOVA to adjust for differences in baseline data. What software should I use? I'm quite familiar with CMA and MIX. I'm not sure, but I think that CMA converts pre-/post-values into a change difference. I understand that that works too, but in this particular case I'd like to try ANCOVA. Please do not suggest R; I just don't want to spend too much time learning a new language. Unfortunately, I also do not have Stata at my disposal. If it is too complex, then I'll just stick with the change difference. Sorry if I couldn't express myself more clearly.
Thank you Tom. I think meta-regression is not what the Cochrane Handbook means when talking about ANCOVA. I did a little searching on the Internet and, hopefully, found the answer. It seems that you have to calculate (manually) an effect size for each included study using ANCOVA, and then put them into the meta-analysis (any software; CMA is just fine). There is a presentation on the Cochrane site by Jo McKenzie.
Andrew, a propensity score is hardly of use in meta-analysis, except maybe for IPD. PS needs huge numbers. I did my PhD using propensity scores on a sample of 50,000. I believe there is no way you can use a propensity score in a meta-analysis of aggregate scores.
- David MacKinnon added an answer: How do I perform a mediation analysis using pretest and post-test scores?
I have a between-subjects manipulation that has 2 levels (X). I want to see if this manipulation affects a performance score, Y. I also want to see a variable, which is measured before Y, would mediate the X on Y effect. So this is the classic mediation problem: X --> M --> Y
I measured both M and Y before and after the manipulation, so I have M1, M2, and Y1, Y2. My research makes more sense in terms of whether the change in M (i.e., M2-M1) explains the change in Y (i.e., Y2-Y1). A friend suggested that I do the mediation with the difference scores:
X --> (M2-M1) --> (Y2-Y1)
I'm also considering controlling the M1 score and Y1 score in building the mediation model, which should work like an ANCOVA when M1 and Y1 don't interact with X in influencing Y2.
Appreciate your help if you could let me know which way is a better approach in this question.
PS: I also considered multilevel and SEM approaches, but they are not quite feasible with my small sample size. That's why I want to stick with the regression approach.
See Chapter 8 in MacKinnon (2008, Introduction to Statistical Mediation Analysis) for options with two waves of data and also options for more than two waves of data. There are several options with the pretest-posttest design, including difference scores, residualized change scores, and analysis of covariance. Residualized change and ANCOVA generally give very similar results except when there are substantial pre-test differences. The choice depends on the pattern of change over time that you would expect in each group if no intervention were delivered. If the groups were randomized, differences between groups are likely due to random error, so regression to the mean would be expected over time; thus ANCOVA and residualized change would be reasonable. If the groups are not randomized, a difference score method may be more appropriate, as it allows pretest differences to be maintained over time. The ANCOVA model is the most general, and it can be estimated with regression or SEM. The SEM approach is more general, as it can allow for more complicated models, such as models with latent variables and extensions to more waves of data. These issues are described in MacKinnon (2008) as mentioned above. The two-wave ANCOVA model and multi-wave longitudinal mediation models are described in MacKinnon (1994; NIDA monograph) and applied to the mediation analysis of a steroid prevention program in MacKinnon et al. (2001; Prevention Science). I believe that both of these papers are on ResearchGate. If you can't get them on ResearchGate, please contact me and I will send them to you.
MacKinnon, D. P. (1994). Analysis of mediating variables in prevention and intervention research. In A. Cazares & L. A. Beatty (Eds.), Scientific methods for prevention/intervention research (NIDA Research Monograph Series 139, DHHS Pub 94-3631, pp. 127-153). Washington, DC: U. S. Department of Health and Human Services.
MacKinnon, D. P. (2008). Introduction to statistical mediation analysis. Mahwah, NJ: Erlbaum.
MacKinnon, D. P., Goldberg, L., Clarke, G. N., Elliot, D. L., Cheong, J., Lapin, A., et al. (2001). Mediating mechanisms in a program to reduce intentions to use anabolic steroids and improve exercise self-efficacy and dietary behavior. Prevention Science, 2, 15-28.
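The difference-score option can be sketched with base R regressions; the data and effect sizes below are simulated purely for illustration:

```r
# Difference-score mediation: X -> (M2 - M1) -> (Y2 - Y1).
# Simulated two-wave data with a built-in indirect effect.
set.seed(7)
n  <- 100
X  <- rbinom(n, 1, 0.5)                  # randomized manipulation, 2 levels
M1 <- rnorm(n); M2 <- M1 + 0.5 * X + rnorm(n)
Y1 <- rnorm(n); Y2 <- Y1 + 0.6 * (M2 - M1) + rnorm(n)
dM <- M2 - M1
dY <- Y2 - Y1

a  <- coef(lm(dM ~ X))["X"]              # path a: X -> change in M
b  <- coef(lm(dY ~ dM + X))["dM"]        # path b: change in M -> change in Y, given X
ab <- a * b                              # indirect (mediated) effect

# ANCOVA-style alternative: keep the baselines as covariates instead
fit_ancova <- lm(Y2 ~ M2 + X + M1 + Y1)
```

In practice the indirect effect ab would be tested with a bootstrap CI rather than read off directly.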
- Robert J Miller added an answer: How can I compare linear relationships?
We have measured simple relationships between the size of different species and their weight (biomass). They are linear regressions (sometimes semilog). The type of question I would like to answer is, for example, if I have 4 species of snail, each with a separate linear equation representing the relationship between size and biomass, are those relationships significantly different, or does one relationship suffice for all my snail species?
I'm not sure how to do this comparison. Each species often has a different sample size. I thought about fitting the separate relationships, then pooling all samples and comparing the resulting slope to the species-wise relationships with t-tests. However, I'm worried that species with more samples will bias the results. I could randomly remove samples from those species to equalize them. Another alternative might be to do an ANCOVA with species entered as a dummy variable and look for interactions with species as a test of parallelism. Does that seem reasonable? It seems like a good idea to me because it will also test whether the intercepts are the same.
Excellent points Timothy, thanks for your input.
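The ANCOVA-with-interaction idea translates directly into a nested-model comparison in R; the species names, sizes, and slopes below are simulated, not real measurements:

```r
# Test of parallelism for size-biomass lines across four snail species.
# One species is given a steeper slope so the test has something to detect.
set.seed(3)
n       <- 40
species <- factor(rep(c("sp1", "sp2", "sp3", "sp4"), each = n))
size    <- runif(4 * n, 5, 25)
slope   <- c(1.0, 1.0, 1.0, 1.6)[as.numeric(species)]
biomass <- 2 + slope * size + rnorm(4 * n, sd = 2)

full   <- lm(biomass ~ size * species)   # separate slopes and intercepts
common <- lm(biomass ~ size + species)   # common slope, separate intercepts
anova(common, full)                      # significant F -> slopes differ
```

Unequal sample sizes are handled automatically by the regression framework, so there is no need to discard data to equalize the species.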
- Bruce E Oddson added an answer: Does anyone have suggestions for reporting a robust ANCOVA?
I'm following the example in Andy Field's R book, where he suggests that after failing the test for homogeneity of regression slopes, one might do a robust ANCOVA a la Wilcox (2005). I'm able to run the tests no problem, and interpreting them is also not an issue, but for output of the following nature (see below), does anyone know of a standard way to report this data?
I think a way to start, at least, will be to report the standard ANCOVA up to the point where the interaction is significant, and then say that robust procedures were followed; how to report these, though, is a bit beyond me.
ancova(covGrp1, dvGrp1, covGrp2, dvGrp2)
 "NOTE: Confidence intervals are adjusted to control the probability"
 "of at least one Type I error."
 "But p-values are not"
X n1 n2 DIF TEST se ci.low ci.hi p.value crit.val
[1,] 10.30 20 12 -22.166667 2.7863062 7.955575 -47.42320 3.089867 0.0213100575 3.174696
[2,] 11.30 28 17 -19.184343 2.7536447 6.966891 -39.98396 1.615273 0.0167914292 2.985495
[3,] 12.45 32 23 -20.350000 3.9162704 5.196270 -35.02758 -5.672423 0.0008787346 2.824637
[4,] 14.00 27 34 -8.314171 1.4638404 5.679698 -23.71193 7.083583 0.1524122220 2.711016
[5,] 16.10 14 17 3.431818 0.3796813 9.038682 -22.28197 29.145604 0.7085490133 2.844860
ancboot(covGrp1, dvGrp1, covGrp2, dvGrp2,tr = .2, nboot=2000)
 "Note: confidence intervals are adjusted to control FWE"
 "But p-values are not adjusted to control FWE"
 "Taking bootstrap samples. Please wait."
X n1 n2 DIF TEST ci.low ci.hi p.value
[1,] 10.30 20 12 -22.166667 -2.7863062 -47.00379 2.670459 0.0355
[2,] 11.30 28 17 -19.184343 -2.7536447 -40.93482 2.566135 0.0185
[3,] 12.45 32 23 -20.350000 -3.9162704 -36.57264 -4.127360 0.0015
[4,] 14.00 27 34 -8.314171 -1.4638404 -26.04606 9.417719 0.1525
[5,] 16.10 14 17 3.431818 0.3796813 -24.78674 31.650380 0.6980
If you are going to be fair (depends how you look at it) to other robust techniques, then I would say you report it simply, as a regular ANCOVA. You state which package and assumptions you used. You give the p-values and associated CIs for each statistic of interest. Although the additional information provided by the procedure is potentially helpful, nobody asks for it when (often incorrectly) "standard" procedures are used.
- Scott Taylor Barrett added an answer: How do you conduct a mixed-factors ANCOVA with a time-dependent covariate in R?
This might be a long shot, but I thought I'd give ResearchGate a chance to prove itself.
I'm having trouble trying to conduct an ANCOVA in R when one of my variables is a time-dependent covariate. For simplicity's sake, let's say I have variables Y, A, B, and X, where Y is my dependent variable, A is a between-subjects factor with two levels, B is a within-subjects factor with 6 levels, and X is a continuous variable I want to add as a covariate, measured at all levels of A and B.
Any help on how to conduct this in R using either lm(), lme(), aov(), ezANOVA(), or something similar would be very helpful.
Mixed-factors as in a between subjects factor crossed with a within-subjects factor (http://en.wikipedia.org/wiki/Mixed-design_analysis_of_variance).
X is a covariate that is measured within subjects (like the DV) at each condition of B (the within-subjects factor). X is time-dependent (i.e., time-varying) in that it is not stable across repeated measurements.
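One route is a linear mixed model, where an observation-level covariate is unproblematic; here is a sketch with nlme (which ships with R), using the question's variable names on simulated data with arbitrary effect sizes:

```r
# Mixed model for Y ~ A (between, 2 levels) x B (within, 6 levels) with a
# time-varying covariate X. Data are simulated; effect sizes are arbitrary.
library(nlme)

set.seed(11)
n_subj <- 20
d <- expand.grid(id = factor(1:n_subj), B = factor(1:6))
d$A <- factor(ifelse(as.numeric(d$id) <= n_subj / 2, "a1", "a2"))
d$X <- rnorm(nrow(d))                         # measured at every level of B
d$Y <- 2 + 0.5 * d$X + (d$A == "a2") + rnorm(nrow(d))

# A random intercept per subject carries the within-subject dependence;
# X enters at the observation level, so it may vary freely over B.
fit <- lme(Y ~ A * B + X, random = ~ 1 | id, data = d)
anova(fit)
```

An equivalent lme4 formulation would be lmer(Y ~ A * B + X + (1 | id), data = d).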
- Dasapta Erwin Irawan added an answer: How can we determine the significance of independent variables in logistic regression?
I am trying to do Logistic Regression in R.
My data set contains more than 50 variables. Some of them are factors (qualitative variables) and others are quantitative. I would like to get the significance of the variables from their p-values.
So far I have come to know that I can do an ANCOVA test to calculate p-values for the factors. ANCOVA combines features of both ANOVA and regression: it augments the ANOVA model with one or more additional quantitative variables, called covariates, which are related to the response variable.
How can I calculate the p-value of the quantitative variables? Or, if I am wrong about ANCOVA, what other possibilities are available?
Any suggestion or help will be appreciated.
Thank you Oliver.
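In R this usually comes down to glm() plus a likelihood-ratio test per term, which covers factors and quantitative predictors alike; the data below are simulated and the names illustrative:

```r
# Logistic regression with one quantitative predictor and one factor.
set.seed(5)
n <- 200
d <- data.frame(x1 = rnorm(n),
                f1 = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
eta <- -0.5 + 1.2 * d$x1 + ifelse(d$f1 == "c", 0.8, 0)
d$y <- rbinom(n, 1, plogis(eta))

fit <- glm(y ~ x1 + f1, family = binomial, data = d)
summary(fit)$coefficients   # Wald z-tests: one p-value per coefficient
drop1(fit, test = "LRT")    # likelihood-ratio test: one p-value per term,
                            # the natural choice for multi-level factors
```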
- Javier Miguelena added an answer: ANCOVA or repeated measures ANOVA?
My experiment was mainly a split-plot design with water treatment as the main factor. Measurements were taken in 6 harvests from Dec 2012 to May 2013. In Dec, all plots were under irrigation; from Jan to March, half the plots were under irrigation and the other half had water withheld; from Apr to May, all plots were under irrigation. In each harvest, plants were defoliated after the other measurements were taken. I do not know whether I should do an ANCOVA with the measurements from the first harvest as covariates, or a repeated measures ANOVA.
Interesting problem. REML means restricted maximum likelihood; it refers to a way of partitioning variation when you are including random effects. You absolutely need to include random effects in your model. The way I understand it, you are asking if there is an effect of changing the irrigation treatment. I would use a model that includes month (since measurements taken the same month are not fully independent due to climate) and plot (since plots might have different "personalities" that show up as you repeat the measurements) as random effects.
The fixed effects should be treatment (continuous vs interrupted irrigation) and treatment time (before vs after the irrigation change), as well as an interaction between the two. The interaction term tests the hypothesis that one of the treatments changes its response more than the other after the date when irrigation was interrupted. Your prediction, I think, is that the plots in the "interrupted irrigation" treatment will change more than the control.
- When can I use ANCOVA?
I know that we can consider pre-intervention scores as a covariate if we want to control for initial differences. My question is: should these differences be statistically significant or not? I mean, when there is a difference between pre-test scores, but it isn't statistically significant, can I still use the pre-test scores as a covariate?
I have one more question.
My consulting adviser said we can't use covariance method when there are more than 2 time points. But I'm not sure about that again!
What's your idea about that?
- What is the exact name of my test in spss?
I have a question about the name of a specific test in spss.
I have 4 separated groups (4 different interventions), and I measured the dependent variable over 4 time points. I considered my intervention groups as "between subjects factor" and the time points as "within subjects factor". In this case the suitable test would be "mixed ANOVA with repeated measures", right?
Now if I have a covariate factor, what would be the NAME of the test? "mixed ANCOVA with repeated measures"? or what?
I have one more question.
My consulting adviser said we can't use covariance method when there are more than 2 time points. But I'm not sure about that again!
What's your idea about that?
- Bruce Weaver added an answer: Can one use multiple logistic regression to estimate a possible confounding effect?
Multiple logistic regression to estimate possible confounder effect?
We revealed that a protein A has a significantly higher concentration in patients than in controls, but there might be a potential confounding variable B, which is also significantly different between controls and patients. I'd like to assess how important the effect of variable B on the concentration of protein A is. Is it OK to compare a simple logistic regression with the diagnosis (0=controls, 1=patients) as the dependent variable and the concentration of protein A as the independent variable against a multiple logistic regression with variable B added among the independent variables?
Jurah, regarding your 3rd point, note that for ordinary least squares (OLS) models, it is the ~errors~ (not the outcome variable) that are assumed to be normally distributed. For further discussion and commentary, click the link below. HTH.
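The comparison the question describes (refit with and without B, then look at the coefficient for A) is easy to sketch with glm(); the data below are simulated, with A and B correlated by construction:

```r
# Does adjusting for B change the estimated effect of protein A?
set.seed(9)
n <- 150
B <- rnorm(n)                                # potential confounder
A <- 0.6 * B + rnorm(n)                      # A correlated with B
y <- rbinom(n, 1, plogis(-0.2 + 0.8 * A))    # diagnosis driven by A here

fit_simple   <- glm(y ~ A,     family = binomial)
fit_adjusted <- glm(y ~ A + B, family = binomial)

# How far does the coefficient (log odds ratio) for A move?
c(unadjusted = coef(fit_simple)["A"], adjusted = coef(fit_adjusted)["A"])

# Likelihood-ratio test: does B add anything once A is in the model?
anova(fit_simple, fit_adjusted, test = "Chisq")
```

A large shift in the coefficient for A between the two fits is the usual informal signal of confounding.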
- Emir Veledar added an answer: How can you compare two groups that are initially distinct at two moments in time?
ANCOVA or the difference between pre and post? Some authors use ANCOVA with the variable measured at the initial time as a covariate, and others employ the difference between pre and post analyzed via t-test.
Your question is formulated in a very vague way.
I copy paste question from the top:
"How can you compare two groups that are initially distinct in two moments of time? "
So if we have two groups and they are initially distinct, it means that they are distinct at baseline, so the comparison at baseline is already done.
If they are "initially distinct in two moments of time" then it means that we compared them in 2 moments and we found them distinct.
Can you re-ask your question?
- Edwin A. Locke added an answer: What does it mean if a covariate turns the effect of an IV on a DV from significant to insignificant in an ANCOVA?
How should I interpret this? Does it mean it is a moderator? Or does it mean that there are just no effects of the IV (experimental manipulation)? Thanks!
It means the IV effect was confounded, e.g., one group had more ability than another. The control variable could be a mediator or a moderator or simply a stronger main effect than your IV. You would need more studies to figure this out.