National Agricultural Research Center - NARC
Question
Asked 5 February 2023
What statistical test should be used when we have the median and its confidence interval?
Hello Statisticians,
I have a dataset containing the median and the upper and lower bounds of its 95% confidence interval for two or more groups. The standard error of the median could also be calculated with the formula SE = (upper bound − lower bound) / (2 × 1.96).
Now, my question is how to find whether the two groups are significantly different.
Thanks,
Most recent answer
To determine whether two groups are significantly different using confidence intervals, you can check whether their intervals overlap or not. If the intervals do not overlap, this suggests that the two groups are significantly different. If the intervals overlap, then we cannot conclude that the two groups are significantly different.
Another way to approach this is to calculate a confidence interval for the difference between the two groups. To do this, you can subtract the median of one group from the median of the other group and calculate the standard error of the difference using the formula SE = sqrt(SE1^2 + SE2^2), where SE1 and SE2 are the standard errors of the medians for each group, calculated using the upper and lower bounds of the 95% confidence interval. Then, you can calculate the 95% confidence interval for the difference using the formula:
CI = (difference − 1.96 × SE, difference + 1.96 × SE)
If the confidence interval for the difference does not include zero, this suggests that the two groups are significantly different.
It is worth noting that the method described above assumes that the data are normally distributed and that the standard errors of the medians are calculated correctly. If the data are not normally distributed, or if the standard errors of the medians are not accurate, this method may not be appropriate, and other methods may be needed, such as a non-parametric test like the Mann-Whitney U test (also known as the Wilcoxon rank-sum test). Note that such a test requires the raw data, not just the medians and their intervals.
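As a rough illustration of the calculation described in this answer, here is a minimal Python sketch. It assumes each reported interval is a symmetric normal-theory interval (median ± 1.96 × SE), which may well not hold for medians. The function names and the example medians and bounds are hypothetical, not values from the question.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96


def se_from_ci(lower, upper, z=Z95):
    """Back out a standard error, assuming the CI is estimate +/- z * SE."""
    return (upper - lower) / (2 * z)


def compare_medians(med1, ci1, med2, ci2):
    """Normal-approximation CI and two-sided p-value for med1 - med2."""
    se1 = se_from_ci(*ci1)
    se2 = se_from_ci(*ci2)
    diff = med1 - med2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)  # independent groups assumed
    ci_diff = (diff - Z95 * se_diff, diff + Z95 * se_diff)
    z = diff / se_diff
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, ci_diff, p_value


# Hypothetical summary statistics: (median, (lower, upper)) for two groups
diff, ci_diff, p_value = compare_medians(12.0, (10.0, 14.0), 17.0, (15.0, 19.0))
print(diff, ci_diff, p_value)
```

With these made-up inputs the interval for the difference excludes zero. Whether that conclusion can be trusted depends entirely on the normality assumption stated above.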
1 Recommendation
Popular answers (1)
Royal College of Surgeons in Ireland
There are many problems here.
The first is that there are several ways of calculating a confidence interval for the median. If the CI is calculated using a rank-based method, then the usual 1.96 rule does not apply, or at least not in the way you might hope.
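To make the first point concrete, here is a sketch of one common rank-based approach: a distribution-free interval built from order statistics and the Binomial(n, 0.5) distribution. This is only one of the several methods alluded to, and nothing in the question says which method produced the reported intervals. The function name and the sample values are hypothetical.

```python
from math import comb


def median_ci_order_stats(values, conf=0.95):
    """Distribution-free CI for the median from order statistics.

    Returns (lower, upper, achieved_coverage). The interval is
    (x_(j), x_(n-j+1)) with the largest j such that
    P(Binomial(n, 0.5) <= j - 1) <= alpha / 2.
    """
    x = sorted(values)
    n = len(x)
    alpha = 1 - conf
    cum = 0.0  # running value of P(B <= j - 1)
    j = 0
    for k in range(n // 2):
        p_k = comb(n, k) / 2 ** n
        if cum + p_k > alpha / 2:
            break
        cum += p_k
        j = k + 1
    if j == 0:
        raise ValueError("sample too small for the requested confidence level")
    return x[j - 1], x[n - j], 1 - 2 * cum


# Hypothetical right-skewed sample of 13 observations (median is 5)
sample = [1, 2, 2, 3, 3, 4, 5, 6, 8, 11, 15, 22, 40]
lower, upper, coverage = median_ci_order_stats(sample)
print(lower, upper, round(coverage, 4))
```

For this sample the interval runs from 2 to 15 around a median of 5, so it is clearly not symmetric, and its achieved coverage is about 97.8% rather than exactly 95%. Dividing such an interval's width by 2 × 1.96 would therefore not recover a meaningful standard error.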
Second, if the confidence intervals overlap, the medians may still be different. It's true that if CIs don't overlap then the groups are different, but the reverse is not necessarily true.
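The asymmetry between the two rules can be sketched numerically. The sketch below assumes symmetric normal-theory intervals purely for illustration (the first point above explains why that may not apply to medians); the function names and numbers are hypothetical.

```python
import math

Z = 1.96  # normal critical value for a two-sided 95% interval


def intervals_overlap(m1, se1, m2, se2, z=Z):
    """True if the two intervals m +/- z * se share at least one point."""
    lo1, hi1 = m1 - z * se1, m1 + z * se1
    lo2, hi2 = m2 - z * se2, m2 + z * se2
    return lo1 <= hi2 and lo2 <= hi1


def difference_significant(m1, se1, m2, se2, z=Z):
    """True if the CI for m1 - m2 excludes zero (independent groups)."""
    return abs(m1 - m2) > z * math.sqrt(se1 ** 2 + se2 ** 2)


# Hypothetical estimates 3 units apart, each with standard error 1
overlap = intervals_overlap(10.0, 1.0, 13.0, 1.0)
signif = difference_significant(10.0, 1.0, 13.0, 1.0)
print(overlap, signif)
```

Here the individual intervals overlap, yet the interval for the difference excludes zero. The reason is that non-overlap requires a gap larger than 1.96 × (SE1 + SE2), while significance only requires a gap larger than 1.96 × sqrt(SE1^2 + SE2^2), which is never the larger of the two thresholds.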
Third, there are two ways of comparing medians:
a) the median values of each group, and
b) the median difference between an observation in one group and an observation in the other.
These are not the same thing (unlike the case for means).
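A small sketch of the distinction, using only the Python standard library. The two samples are made up to exaggerate the gap, and the function names are hypothetical. Quantity (b) is computed here as the median of all pairwise differences between the groups.

```python
from statistics import median


def difference_of_medians(x, y):
    """(a) Median of group y minus median of group x."""
    return median(y) - median(x)


def median_of_differences(x, y):
    """(b) Median of all pairwise differences b - a, a from x and b from y."""
    return median(b - a for a in x for b in y)


# Hypothetical samples chosen so that the two summaries disagree strongly
x = [1, 2, 3, 4, 100]
y = [2, 3, 50, 60, 70]

print(difference_of_medians(x, y))   # 47
print(median_of_differences(x, y))   # 2
```

For means the two calculations would agree, since the mean of all pairwise differences equals the difference of the means.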
Fourth, the median test, the Wilcoxon Mann-Whitney test and the KW test are not tests of equality of medians. They are interpretable as tests for equality of medians only under some unlikely assumptions about the distributions of the groups.
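One way to see this for the Wilcoxon Mann-Whitney test: its statistic is driven by how often an observation from one group exceeds an observation from the other, and that proportion can be far from 0.5 even when the sample medians are identical. The sketch below uses made-up data and a hypothetical function name, and it computes only the proportion, not the test itself.

```python
from statistics import median


def prob_superiority(x, y):
    """Proportion of pairs with x > y, counting ties as one half.

    This equals the Mann-Whitney U statistic for x divided by
    len(x) * len(y).
    """
    wins = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    return wins / (len(x) * len(y))


# Hypothetical samples with the same median (4) but different shapes
x = [1, 2, 3, 4, 5, 6, 7]
y = [3.1, 3.2, 3.3, 4, 10, 11, 12]

print(median(x), median(y))
print(round(prob_superiority(x, y), 3))
```

Both medians are 4, but only about 32% of the pairs have the x value larger. With samples this small nothing is being claimed about significance; the point is only which quantity the test responds to.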
Without the data, you are pretty much guessing, and having to make a lot of assumptions.
3 Recommendations
All Answers (10)
Kent State University
If zero is in the confidence interval at the 95% confidence level, there is no statistically significant difference between the two groups. The p-value depends on whether the alternative hypothesis is directional (one-sided) or not. For details, see the material on hypothesis testing in a good introduction to statistics. Best wishes, David Booth
Rutgers, The State University of New Jersey
I assume from the wording of the question that you don't have the raw data, but have the medians and 95% confidence intervals.
You can use a decision rule under which two groups are declared significantly different if their confidence intervals don't overlap.
This won't necessarily give the same result as a traditional hypothesis test, but it's a valid way to make a decision.
In the two-sample case, this approach will in general be more conservative than a traditional hypothesis test.
The following is an example in R code for the two sample case, where the 95% confidence intervals overlap slightly, but Mood's median test comes out with a p-value below 0.05. But the difference in conclusions, if you try different data sets, is small.
(In the example here, A is (1, 2, 3, ... 24) and B is (9, 10, 11, ... 32). I get confidence intervals that are just slightly overlapping, and a p-value for Mood's median test of about 0.02. You can run the code at https://rdrr.io/snippets/ without installing R.)
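# Two example samples, combined into one data frame with a grouping factor
A = 1:24
B = 9:32
Y = c(A, B)
Group = factor(c(rep("A", length(A)), rep("B", length(B))))
Data = data.frame(Group, Y)
# Median and percentile bootstrap confidence interval for each group
library(rcompanion)
groupwiseMedian(Y ~ Group, data=Data, bca=FALSE, percentile=TRUE)
# Mood's median test for the two groups
library(coin)
median_test(Y ~ Group, data = Data)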
2 Recommendations
Lakehead University
David Eugene Booth, judging from your response, I wonder if you understood Damini Jaiswal to be saying she has a 95% CI for the difference between two (independent) medians. I understood her to mean that for each of 2 (or more) groups, she has a median and 95% CI. Judging by Sal Mangiafico's reply, that's how he understood it too.
Damini Jaiswal, although it is about differences in means, not medians, this short CMAJ article reinforces Sal's comment about overlapping CIs.
2 Recommendations
North-Caucasus Federal University
Hi, Damini Jaiswal
To compare the medians of a dependent variable across several samples (defined by the independent, or exogenous, variable), one should use the Kruskal-Wallis ANOVA and/or the Median test on the raw data. Afterwards, multiple comparisons of mean ranks for all groups may be useful (personally I prefer to use the STATISTICA 12.0 program, but other programs may do it better).
Best regards.
Rutgers, The State University of New Jersey
Maksim Sokolovskii, a couple of notes:
1) Kruskal-Wallis in general isn't a test of medians, although in some cases it can be interpreted this way.
2) What is the "Median test"?
North-Caucasus Federal University
As I found in the electronic manual for the STATISTICA program:
"The Median test is a "crude" version of the Kruskal-Wallis ANOVA in that it frames the computation in terms of a contingency table. Specifically, STATISTICA will simply count the number of cases in each sample that fall above or below the common median, and compute the Chi-square value for the resulting 2 x k samples contingency table. Under the null hypothesis (all samples come from populations with identical medians), we expect approximately 50% of all cases in each sample to fall above (or below) the common median. The Median test is particularly useful when the scale contains artificial limits, and many cases fall at either extreme of the scale ("off the scale"). In this case, the Median test is in fact the only appropriate method for comparing samples."
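The procedure this manual excerpt describes can be sketched in a few lines. The version below is limited to two groups so that the chi-square p-value (1 degree of freedom) can be computed with the standard library alone. It applies no continuity correction and counts values equal to the common median as "not above"; both are assumptions of this sketch, not something the excerpt specifies. The function name is hypothetical, and the data are the A and B samples posted earlier in the thread.

```python
from math import erfc, sqrt
from statistics import median


def median_test_two_groups(a, b):
    """Contingency-table median test for two samples.

    Returns (table, chi2, p_value), where table rows are
    [above common median, not above] and columns are the groups.
    """
    a, b = list(a), list(b)
    grand = median(a + b)
    table = [
        [sum(v > grand for v in g) for g in (a, b)],
        [sum(v <= grand for v in g) for g in (a, b)],
    ]
    n = len(a) + len(b)
    row_totals = [sum(row) for row in table]
    col_totals = [len(a), len(b)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = erfc(sqrt(chi2 / 2))  # chi-square upper tail with 1 df
    return table, chi2, p_value


A = range(1, 25)   # 1, 2, ..., 24
B = range(9, 33)   # 9, 10, ..., 32
table, chi2, p_value = median_test_two_groups(A, B)
print(table, round(chi2, 3), round(p_value, 4))
```

For these samples the p-value comes out at about 0.02. Software implementations may differ slightly depending on how they handle ties and which approximation they use.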
Rutgers, The State University of New Jersey
Maksim Sokolovskii, I assume this is the manual you've cited: https://docs.tibco.com/pub/stat/14.0.0/doc/html/UsersGuide/GUID-07945880-3420-4AB2-8E33-58A8C117A336.html
"The Median test is a "crude" version of the Kruskal-Wallis ANOVA...". Oy. I don't know who writes this stuff.
2 Recommendations
University of Nevada, Las Vegas
Maksim Sokolovskii, here is the handout I gave one of my classes a few weeks ago to show that the Kruskal-Wallis test doesn't do this (and is intransitive). I am in Las Vegas, hence the gambling story. I had just reloaded the color printer ink, so I put in lots of color for printing! The handout uses @Sal's package.
2 Recommendations