Does anyone here know how to perform a post-hoc test?

I desperately need some help with the statistical analysis for my undergraduate research study. What is the best post-hoc test for comparing two groups, an experimental group and a positive control group?

When you just compare two groups, why apply a post-hoc test? You can just go for a simple t-test (when the distributions in both groups are normal) or a non-parametric test if needed (I believe it's the Mann-Whitney U test, off the top of my head).
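For the two-group case described above, a minimal sketch in R could look like this. The data are simulated purely for illustration (they are not from the study), and the variable names are made up.

```r
# Made-up illustrative data: two independent groups of 10 observations each.
set.seed(1)
control <- rnorm(10, mean = 0)
treated <- rnorm(10, mean = 1)

# Student's t-test, assuming normality and equal variances in both groups.
tt <- t.test(treated, control, var.equal = TRUE)

# Mann-Whitney U test (R calls it the Wilcoxon rank-sum test).
mw <- wilcox.test(treated, control)

print(tt$p.value)
print(mw$p.value)
```

With only two groups there is a single comparison, so no multiplicity correction is involved.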

If you compare two groups (treated vs. control), you will most likely use 2-way ANOVA testing, given a normal distribution for your data. As a piece of advice: the best software available in terms of handling etc. is GraphPad Prism. You can download a trial version of it and check all their tutorials. It is really self-explanatory. You choose the groups/design you used, or you can even go with their examples and fill in your own data. Also google for Excel sheets that do simple ANOVA or t-tests. You can download them and fill in your data.

Finally, GraphPad has powerful online tools which you can use for free to start with......

Myra Ponce · Nueva Ecija University of Science and Technology

Thank you for your help... but I have already done the ANOVA test. Unfortunately there is a significant difference between the groups, and I need to perform a post-hoc test in order to determine why there is a significant difference between the two groups. My adviser said that I have to perform a post-hoc test, and I don't know what type of post-hoc test to perform or how to do it.

1-way ANOVA and Student's T-test are the same thing for two groups.

To calm down your adviser, you can apply the Bonferroni correction: divide your alpha level by the number of comparisons. Here there is 1 comparison (two groups...), and alpha/1 = alpha, so you do not change anything, but your adviser will be happy with the presence of a Bonferroni correction ;)

More seriously, as all the others said, post-hoc tests are only required if you have more than two groups.

Just to be sure: can you confirm you have just two groups including the control group? I am not completely sure when reading your question, but this may be due to my bad English.

>Matthias: with two groups, treated or not, this is a 1-way ANOVA, unless you have an additional factor to take into account (like sex), but this means in some sense at least four groups (treated/male, treated/female, not-treated/male, not-treated/female; crossed 2-way ANOVA), or you use the same subjects with and without treatment (nested 2-way ANOVA, equivalent to the paired-sample t-test if there are only two treatments).

Oct 5, 2012

Myra Ponce · Nueva Ecija University of Science and Technology

I have only two groups, an experimental group and a positive control group... but the experimental group is divided into three treatments. The 1-way ANOVA test failed to give a not significant result, which is why my adviser required me to perform a post-hoc test.

The experimental group consists of three different concentrations; for each concentration I have three replicates, and the positive control also has three replicates. I used Student's t-test to compare each set of replicates of the experimental group with the replicates of the positive control group, and this gave significant differences. The same thing happened with the 1-way ANOVA test between the experimental and positive control groups.

Strange formulation: "failed to give a not significant result". Usually one seeks a significant result, but it may happen that one fails to get it.

You describe a beautiful example of a seemingly very smart experimental setup that is about to be abused to answer a question it was not designed to answer. The problem here is the testing (the experiment is actually fine). The test requires a well-stated null hypothesis. This is not clearly defined in your case. You seem to have several null hypotheses; some of them seem to be "nested" (dependent), others seem to be "parallel" (independent). Considering the multitude of different tests you would like to do, one needs to control the expected false-positive rate (this is what tests are usually done for). For a series of independent tests, the chance of "at least one false positive" can be controlled by the Bonferroni correction, or, slightly more efficiently, by the Bonferroni-Holm correction. I recommend using the pooled standard errors here.
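As a rough illustration of the two corrections mentioned above, here is a short R sketch. The raw p-values and the simulated measurements are hypothetical, chosen only to show the mechanics; `pairwise.t.test` with `pool.sd = TRUE` is one way to get tests based on the pooled standard deviation.

```r
# Hypothetical raw p-values from three comparisons (not from the study).
p_raw <- c(0.010, 0.020, 0.030)

p_bonf <- p.adjust(p_raw, method = "bonferroni")  # each p multiplied by 3
p_holm <- p.adjust(p_raw, method = "holm")        # step-down, never larger than Bonferroni

print(rbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm))

# Simulated data: four groups of three replicates (illustration only).
set.seed(5)
g <- factor(rep(c("control", "c10", "c100", "c1000"), each = 3))
y <- rnorm(12, mean = rep(c(10, 9, 7, 4), each = 3))

# All pairwise t-tests with a pooled standard deviation and Holm adjustment.
pw <- pairwise.t.test(y, g, p.adjust.method = "holm", pool.sd = TRUE)
print(pw$p.value)
```

Holm is never less powerful than Bonferroni and controls the same family-wise error rate, which is why it is usually the better default of the two.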

The ANOVA asks a different question (are *all* groups sampled from the same population?). It is an often-used makeshift to help control the false-positive rate when several (independent!) tests have to be performed. It is not clear exactly how post-hoc tests control the rate for the pair-wise comparisons after a significant ANOVA. There are more liberal and more stringent post-hoc tests, and there is no general rule on which one to use. The main feature of these post-hoc tests is that they also use the pooled standard error (and make some fine-tuning of the degrees of freedom).

I find it counter-productive to do the kind of tests you aim to do. It is most important to report the effect sizes (e.g. the group differences) together with the precision of the estimation (best to provide confidence intervals). Any interpretation should be based on these estimated sizes and precisions, not on some difficult-to-interpret p-values.

However, I'm afraid this won't be an option for you. So somehow you must produce some p-values to keep your boss satisfied. The groups with different concentrations are not independent, so ANOVA is essentially out, anyway. My suggestion (for one drug, tested in several concentrations):

1) Compare the group treated with the highest concentration with the controls. If this is non-significant, then stop and report that you were not able to find a treatment effect. If it is significant (and only then!), proceed:

2) Compare the group treated with the second-highest concentration with the controls. The rest is as in 1).

3) Third-highest concentration vs. controls... as above.

If 1) and 2) are significant but not 3), you can say that you found a treatment effect for concentrations greater than or equal to the second-highest concentration.

If you have several different drugs, do all this for each drug, but multiply the p-values of the individual tests by the number of drugs analyzed (Bonferroni correction) before you compare them with the level of significance.
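The stepwise procedure above could be sketched in R roughly as follows. The data are simulated, and `fixed_sequence`, the column names and the group labels are hypothetical helpers introduced only for this illustration; the sketch uses plain two-sample t-tests rather than pooled standard errors.

```r
# Simulated data, for illustration only: one control and three concentrations,
# three replicates each.
set.seed(42)
dat <- data.frame(
  group = rep(c("control", "10", "100", "1000"), each = 3),
  y     = c(rnorm(3, 10), rnorm(3, 10), rnorm(3, 8), rnorm(3, 4)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: test the concentrations from highest to lowest against
# the control and stop at the first non-significant comparison.
fixed_sequence <- function(dat, concs, control = "control", alpha = 0.05) {
  result <- data.frame(conc = character(0), p = numeric(0),
                       stringsAsFactors = FALSE)
  for (conc in concs) {
    p <- t.test(dat$y[dat$group == conc],
                dat$y[dat$group == control],
                var.equal = TRUE)$p.value
    result <- rbind(result,
                    data.frame(conc = conc, p = p, stringsAsFactors = FALSE))
    if (p >= alpha) break  # lower concentrations are not tested any more
  }
  result
}

res <- fixed_sequence(dat, concs = c("1000", "100", "10"))
print(res)
```

With several drugs, the p-values in `res$p` would be multiplied by the number of drugs (capped at 1) before comparing them with alpha.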

A few additional remarks on the very detailed answer of Jochen:

1) In statistical language, you have three experimental groups, one for each concentration --- unless you use different concentrations on the same objects (animals; plants; patients... whatever it is). This point is very important: if you use the same animals for the three different concentrations, you cannot use 1-way ANOVA. If animals are different, you can use it « safely ».

2) If there are different increasing concentrations, instead of doing several Student's tests of each group against the "positive" control, you may try a linear regression of your response on the concentration (or its log, as is often more practical); in this case, you just test whether the slope is non-zero (you don't even have to worry too much about the linearity, provided you expect a monotonic effect of the concentration). That is a single test instead of multiple ones: no problem of "post-hoc", "multiplicity correction" and so on...

3) For control of errors in all pair-wise comparisons after a significant ANOVA, the two extreme approaches are Fisher's (protected) Least Significant Difference (LSD), which is more powerful but has the weakest control of the type I error, and Tukey's Honestly Significant Difference (HSD), which is less powerful but has an optimal control of the type I error.
However, in your situation of comparing every group to the control, you may consider Dunnett's test, which is optimal for that situation.
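To make remark 3) concrete, here is a hedged sketch of Tukey's HSD and Dunnett's test in R. The data are simulated for illustration only and the group labels are made up; Dunnett's test here relies on the `multcomp` package, so that part only runs if the package is installed.

```r
# Simulated data: control plus three concentrations, three replicates each.
set.seed(3)
dat <- data.frame(
  group = factor(rep(c("control", "c10", "c100", "c1000"), each = 3),
                 levels = c("control", "c10", "c100", "c1000")),
  y     = c(rnorm(3, 10), rnorm(3, 9), rnorm(3, 7), rnorm(3, 4))
)

fit <- aov(y ~ group, data = dat)
print(summary(fit))

# Tukey's HSD: all 6 pairwise comparisons between the 4 groups.
tuk <- TukeyHSD(fit)
print(tuk)

# Dunnett's test: each concentration versus the control only (3 comparisons).
# The first factor level ("control") is used as the reference group.
if (requireNamespace("multcomp", quietly = TRUE)) {
  dun <- multcomp::glht(fit, linfct = multcomp::mcp(group = "Dunnett"))
  print(summary(dun))
}
```

Because Dunnett's test makes fewer comparisons than Tukey's HSD, it is generally more powerful when only the comparisons against the control are of interest.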

In fact, you have to precisely define the comparisons needed in your situation to perform the most suited test.

Of course, we are ready to help you with that if you can provide more details on your experiment and your question (data are not needed, at least if everything goes well ;)).

Emmanuel, I don't think that using the same drug in different concentrations safely allows an ANOVA to be performed. The drug effect is clearly not independent between the groups. But you mentioned a good point: since there should be dose-dependency, this dependency can be directly tested, e.g. by a regression. When the effect sizes are not of interest, a rank correlation can be used instead.

PS: I thought Scheffe's test is more conservative than Tukey's test? Tukey is a good compromise between Scheffe and LSD.

PPS: I really wonder why people bother with post-hoc tests when there are straightforward adjustments to control family-wise error rates or false-discovery rates. It would be great if someone could give me some explanation.
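The dose-dependency test mentioned here could be sketched in R as below. The response values are simulated (only the 10/100/1000 ppm design is taken from the thread), and an increasing trend is assumed for the one-sided test.

```r
set.seed(7)
conc <- rep(c(10, 100, 1000), each = 3)             # ppm, three replicates each
y    <- 2 + 1.5 * log10(conc) + rnorm(9, sd = 0.5)  # simulated response

# Linear regression of the response on log10(concentration).
fit     <- lm(y ~ log10(conc))
coefs   <- summary(fit)$coefficients
slope_t <- coefs["log10(conc)", "t value"]

# Two-sided p-value from summary(), and a one-sided p-value for an increasing trend.
p_two_sided <- coefs["log10(conc)", "Pr(>|t|)"]
p_one_sided <- pt(slope_t, df = fit$df.residual, lower.tail = FALSE)

# Rank-based alternative when effect sizes are not of interest.
# exact = FALSE because the concentrations are tied within each dose.
rk <- cor.test(conc, y, method = "spearman", exact = FALSE)

print(c(two_sided = p_two_sided, one_sided = p_one_sided, spearman = rk$p.value))
```

Either way this is a single test for the whole dose range, so no multiplicity correction is needed.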

Jochen, the application condition for ANOVA is not exactly that the data are independent, but that the _residuals_ are independent. If the drug has an effect, one can indeed expect that animals with higher doses will have higher observed values [or lower, depending on the drug effect], but that is handled by the Y=f(X) part of the model. Independence of residuals only means that the measurement/animal effect cannot be predicted from any other value of the sample. This can also be seen by noting that the dependence you mention only holds if you know the X-value (the dose given), not unconditionally. This is why I think ANOVA can be « safely » used if there are not two measures on the same animal. Note the quotes, however, since any other source of correlation may exist (same operator, same experimental measuring apparatus), as well as problems with normality & homoscedasticity... Best is always to check all of this on the residuals...
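Checking the residuals, as suggested above, might look like this in R. The data are simulated for illustration, and with only 12 values these checks have very little power, so they are a sanity check rather than a proof.

```r
# Simulated data: four groups of three replicates (illustration only).
set.seed(11)
dat <- data.frame(
  group = factor(rep(c("control", "c10", "c100", "c1000"), each = 3)),
  y     = rnorm(12, mean = rep(c(10, 9, 7, 4), each = 3))
)

fit <- aov(y ~ group, data = dat)
res <- residuals(fit)

# Normality of the residuals.
sw <- shapiro.test(res)

# Homogeneity of variances across groups.
bt <- bartlett.test(y ~ group, data = dat)

print(c(shapiro = sw$p.value, bartlett = bt$p.value))
```

Plotting the residuals against the fitted values (`plot(fit, which = 1)`) is often more informative than the formal tests with so few observations.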
Scheffe's test is more conservative because it deals with a test of "all possible linear combinations of the means" and not only "all pairwise comparisons" like Tukey's HSD and Fisher's (P)LSD. As a matter of fact, it is a kind of extreme answer to your last question: to make the equivalent of Scheffe's test with some kind of Bonferroni correction, you would have to divide the Type I error by +\infty -> it would be 0, hence H0 would never be rejected.
Post-hoc tests are in fact, in this context (and in my opinion), the best way to handle the FWER (family-wise error rate) because they explicitly include correlations between the various tests, hence leading to an exact & optimal (?) correction of the individual tests, whereas procedures like Bonferroni, Holm and so on are more or less based on independence and are more conservative and less powerful. As an extreme case, imagine you make exactly the same comparison between two groups ten times, each time only changing the scale (for instance, comparing the same two distances in m, in mm, in km...), and a single p is 0.01 (obviously, the same for all). Automatic application of the Bonferroni correction will divide the alpha by 10 (ten tests...), hence will say "no significant difference", but a well-conducted post-hoc test should detect that the tests are linearly correlated, hence will not correct the alpha and will still say the comparisons are significant.
A small example in R:
x <- factor( rep( c( "a", "b" ), each = 10 ) )
y <- c( rnorm( 10 ), rnorm( 10 ) + 0.9 )
# y :
# [1] -0.70804836 0.42911448 -0.99932377 1.50128463 -0.20749743 0.08346059
# [7] -1.33795236 -0.80789652 0.47931451 -0.24948593 0.94070948 0.82732299
#[13] 1.79875929 1.47437724 0.73565928 0.99655882 -0.70082870 1.64889175
#[19] 1.22837745 1.30479729
t.test( y ~ x, var.equal = TRUE )
# p = 0.002655
# Two tests: Bonferroni method (p.adjust defaults to Holm, so the method is named explicitly)
p.adjust( c( t.test( y ~ x, var.equal = TRUE )$p.value, t.test( (2 * y) ~ x, var.equal = TRUE )$p.value ), method = "bonferroni" )
# [1] 0.005310643 0.005310643
library( "multcomp" )
# Two tests: contrasts with the multivariate Student's t distribution, "optimal"
summary( glht( lm( y ~...

(The end of the comment seems to be missing? I rewrote it here just in case:)

summary( glht( lm( y ~ x ), linfct = rbind( c( 0, 1 ), c( 0, 10 ) ) ) )
# Simultaneous Tests for General Linear Hypotheses
# Fit: lm(formula = y ~ x)
#
# Linear Hypotheses:
#        Estimate Std. Error t value Pr(>|t|)
# 1 == 0   1.2072     0.3466   3.483  0.00266 **
# 2 == 0  12.0717     3.4659   3.483  0.00266 **

So suited post-hoc tests are more powerful, especially in cases of highly correlated tests. However, to use them in the best conditions, one needs to think carefully about exactly which comparisons are of interest for the question asked, but is this really a drawback? Sadly, except in the most common cases (Tukey, Dunnett and a few others), they are not straightforward to use in statistical packages and need a good knowledge of the underlying theory of the linear model...

Oct 6, 2012

Myra Ponce · Nueva Ecija University of Science and Technology

Thank you for your kind help....

It's just a simple cytotoxicity and antioxidant activity screening of one isolate. I have 2 groups, one experimental and one positive control group. The experimental group was subjected to 3 treatments with different concentrations: 10 ppm, 100 ppm and 1000 ppm. Each of these concentrations has three replicates, and I compared them to a positive control. Based on my study, I had already predicted that there would be a significant difference between the two groups, but my adviser wants me to perform a post-hoc test in order to determine the difference between the two groups.

Is there a possibility that there will be a significant factor in terms of the time intervals?

Thank you very much.... =')

Oct 7, 2012

Myra Ponce · Nueva Ecija University of Science and Technology

I have attached the results of my study... can you help me reach a definite conclusion supported by these results? Thank you =)

> Myra: I took a glance at your file. I think it is very difficult to help you without more precise details. In fact, when reading and guessing what it is about, it seems ANOVA is not the right method to use for your data, but I have probably misunderstood your problem and the kind of data. Below are a few questions...
In the first table, you wrote "number of dead units" (shrimps), and measure 10 in the positive control group, whatever the replicate is. How is this "ten" obtained? If it means that 10 of 10 shrimps in the experiment are dead at the end, and only 2 or 3 of 10 in the other groups, I am not sure simple ANOVA is the most suited tool...
It seems to be a kind of toxicology study, with the positive control being a known molecule causing death of the shrimps and the experimental groups receiving the unknown molecule and causing less death. Is this true? What is the aim: showing toxicity or lack of toxicity of your molecule? Depending on the question, the tests to answer it will be very different: the first one would suggest one-sided tests of each group vs. the control (one-sided Dunnett's test, assuming ANOVA is correct), whereas the second suggests a kind of equivalence test (but a negative control would be useful)...

Oct 7, 2012

Myra Ponce · Nueva Ecija University of Science and Technology

yes it's just a simple toxicology test...

The ten indicates the total number of dead shrimps in the positive control group, at whatever concentration. The main objective of my paper is to determine the cytotoxicity level of phytol; I found that it is not cytotoxic, and my result supports the LC50 value, as does the significant result of the experimental versus the standard toxic substance. However, my adviser wants me to determine that significant difference...

Assuming that the ANOVA is correct, I don't know how to perform Dunnett's test... I hope you could help me out.

How many shrimps were present in total in each individual experiment? I find it strange that in the three replicates of the positive control group you have exactly the same number of dead shrimps... unless you had 10 shrimps at the beginning and all of them died...
But if your variable is the number of dead shrimps among 10 (or any other fixed, known number of shrimps), you are interested in the change of the death probability with the phytol concentration, and the tool is logistic regression, not ANOVA.
If not, ANOVA will also be difficult with a group with zero variance...
Lethality is obviously higher in the positive control group than with any concentration of phytol, so phytol seems less toxic than your control; however, only the LC50 can give a definite answer. But finding a (reliable) LC50 with only 3 doses is quite a challenge!
Could you please provide a precise description of the experiment? Otherwise helping you is not really possible, and the risk of giving you incorrect answers is high (remember for instance the "two groups" problem, which let us think that post-hoc tests were useless, when in fact you have more than two groups and post-hoc tests can be considered...).
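If the variable really is "number dead out of 10", the logistic regression mentioned above could be sketched as follows. The counts are entirely hypothetical (they are not the study's data), 10 shrimps per replicate is an assumption, and the LC50 computed from three doses should be read with the caution expressed above.

```r
# Hypothetical counts, for illustration only: 10 shrimps per replicate,
# three replicates per phytol concentration.
dat <- data.frame(
  conc  = rep(c(10, 100, 1000), each = 3),  # ppm
  dead  = c(1, 2, 1, 3, 2, 4, 5, 6, 4),
  total = 10
)

# Binomial (logistic) regression of the death probability on log10(concentration).
fit <- glm(cbind(dead, total - dead) ~ log10(conc),
           family = binomial, data = dat)
print(summary(fit)$coefficients)

# LC50: the concentration at which the fitted death probability equals 0.5,
# i.e. where intercept + slope * log10(conc) = 0.
b    <- coef(fit)
lc50 <- 10^(-b[["(Intercept)"]] / b[["log10(conc)"]])
print(lc50)
```

A positive control in which every shrimp dies in every replicate carries no information about the dose-response slope, so it is left out of this fit.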

Oct 8, 2012

Myra Ponce · Nueva Ecija University of Science and Technology

Thank you for your help....

Oct 9, 2012

Myra Ponce · Nueva Ecija University of Science and Technology

My adviser wants me to perform a DMRT (Duncan's multiple range test) on these results, but I don't think that it is the right test for them. I hope that you can help me out... thank you.

1) Duncan's test is probably not suited to this problem, since it makes all comparisons and tries to build some kind of classes of groups, which does not seem to be your question. Dunnett's test to compare all concentrations to the control would be more suited, but you will face the problem of the absence of variability in your control group.

2) Once again: if you do not give more details on your experimental setup, answers may be completely unsuited to your data! Please explain how you carry out the experiment to count the number of dead shrimps. And do you have a negative control to compare the effect of the product with the effect of nothing? Since it seems to be less toxic than your control, a negative control would be important...

This thread has been written just assuming that it is OK to do a post-hoc test. Whether or not it is OK to do a post-hoc test is the subject of some debate among statisticians. Your advisor, however, is advising that you do one. I advise that you do some research on the net so that you can include the debate as part of your paper: it will be a lot stronger for it. If I remember right, Steve Simon ("Professor Mean") has published on the net about this: hopefully this will give you somewhere to start.

Matthias Totzeck · Heinrich-Heine-Universität Düsseldorf

Link to the free GraphPad online tools:

http://www.graphpad.com/quickcalcs/

Hope this helps
