Question
Asked 5 February 2023

What statistical test should be used when we have a median and its confidence interval?

Hello Statisticians,
I have a dataset where I have the median and the upper and lower bounds of the 95% confidence interval for each of two or more groups. The standard error of the median could even be calculated from the formula SE = (upper bound − lower bound)/(2 × 1.96).
Now, my question is how to determine whether the two groups are significantly different.
Thanks,

All Answers (10)

David Eugene Booth
Kent State University
If zero is in the confidence interval at the 95% confidence level, there is no significant difference between the two groups. The p-value depends on whether the alternative hypothesis is directional (one-sided) or not. See a good introduction to statistics, in particular the material on hypothesis testing. Best wishes, David Booth
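As a rough numerical sketch of this in R: the difference and standard error below are hypothetical values, and the normal approximation implied by the SE formula in the question is assumed.
# Hypothetical difference in medians and its standard error
diff = 3.0
se.diff = 1.4
z = diff / se.diff     # approximate z statistic
2 * pnorm(-abs(z))     # two-sided p-value under a normal approximation
pnorm(-abs(z))         # one-sided p-value, if the direction was specified in advance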
Sal Mangiafico
Rutgers, The State University of New Jersey
I assume from the wording of the question that you don't have the raw data, but have the medians and 95% confidence intervals.
You can use a decision rule where, if the confidence intervals for two groups don't overlap, you conclude that the two groups are significantly different.
This won't necessarily give the same result as a traditional hypothesis test, but it's a valid way to make a decision.
For a two-sample comparison, this approach will in general be more conservative than a traditional hypothesis test.
The following is an example in R code for the two-sample case, where the 95% confidence intervals overlap slightly but Mood's median test comes out with a p-value below 0.05. If you try different data sets, though, the difference in conclusions between the two approaches is small.
(In the example here, A is (1, 2, 3, ... 24) and B is (9, 10, 11, ... 32). I get confidence intervals that are just slightly overlapping, and a p-value for Mood's median test of about 0.02. You can run the code at https://rdrr.io/snippets/ without installing R.)
# Two hypothetical samples: A = 1, 2, ..., 24 and B = 9, 10, ..., 32
A = 1:24
B = 9:32
# Stack the values and build a grouping factor
Y = c(A, B)
Group = factor(c(rep("A", length(A)), rep("B", length(B))))
Data = data.frame(Group, Y)
# Group medians with percentile bootstrap 95% confidence intervals
library(rcompanion)
groupwiseMedian(Y ~ Group, data=Data, bca=FALSE, percentile=TRUE)
# Mood's median test
library(coin)
median_test(Y ~ Group, data = Data)
2 Recommendations
Bruce Weaver
Lakehead University
David Eugene Booth, judging from your response, I wonder if you understood Damini Jaiswal to be saying she has a 95% CI for the difference between two (independent) medians. I understood her to mean that for each of 2 (or more) groups, she has a median and 95% CI. Judging by Sal Mangiafico's reply, that's how he understood it too.
Damini Jaiswal, although it is about differences in means, not medians, this short CMAJ article reinforces Sal's comment about overlapping CIs.
2 Recommendations
Maksim Sokolovskii
North-Caucasus Federal University
To compare the medians of several samples (a dependent variable across the levels of an independent or exogenous variable), one can use Kruskal-Wallis ANOVA and/or the Median test on the raw data. After that, multiple comparisons of mean ranks for all groups may be useful (personally I prefer the STATISTICA 12.0 program, but other programs may do it better).
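As a rough illustration, here is a minimal base-R sketch of a Kruskal-Wallis test followed by pairwise comparisons; the data are made up, and the pairwise Wilcoxon tests with a Holm correction are only a common stand-in for STATISTICA's multiple comparisons of mean ranks, not the same procedure.
# Hypothetical data: three small groups of made-up values
x = c(12, 15, 14, 10, 18, 22, 25, 21, 19, 24, 30, 28, 33, 27, 31)
g = factor(rep(c("G1", "G2", "G3"), each = 5))
# Kruskal-Wallis rank sum test across the three groups
kruskal.test(x ~ g)
# Pairwise comparisons with a Holm adjustment for multiple testing
pairwise.wilcox.test(x, g, p.adjust.method = "holm")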
Best regards.
Sal Mangiafico
Rutgers, The State University of New Jersey
Maksim Sokolovskii, a couple of notes:
1) Kruskal-Wallis in general isn't a test of medians, although in some cases it can be interpreted this way.
2) What is the "Median test"?
Maksim Sokolovskii
North-Caucasus Federal University
As I found in an electronic manual for the STATISTICA program:
"The Median test is a "crude" version of the Kruskal-Wallis ANOVA in that it frames the computation in terms of a contingency table. Specifically, STATISTICA will simply count the number of cases in each sample that fall above or below the common median, and compute the Chi-square value for the resulting 2 x k samples contingency table. Under the null hypothesis (all samples come from populations with identical medians), we expect approximately 50% of all cases in each sample to fall above (or below) the common median. The Median test is particularly useful when the scale contains artificial limits, and many cases fall at either extreme of the scale ("off the scale"). In this case, the Median test is in fact the only appropriate method for comparing samples."
Ronán Michael Conroy
Royal College of Surgeons in Ireland
There are many problems here.
The first is that there are several ways of calculating a confidence interval for the median. If the CI is calculated using a rank-based method, then the usual 1.96 rule does not apply, or at least not in the way you might hope.
Second, if the confidence intervals overlap, the medians may still be different. It's true that if the CIs don't overlap then the groups are different, but the reverse is not necessarily true.
Third, there are two ways of comparing medians:
a) the difference between the median values of each group, and
b) the median difference between an observation in one group and an observation in the other.
These are not the same thing (unlike the case for the means).
Fourth, the median test, the Wilcoxon-Mann-Whitney test and the Kruskal-Wallis test are not tests of equality of medians. They are interpretable as tests for equality of medians only under some unlikely assumptions about the distributions of the groups.
Without the data, you are pretty much guessing, and having to make a lot of assumptions.
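To illustrate the third point, here is a small R sketch with two made-up samples, showing that the difference of the group medians and the median of the pairwise differences need not agree:
# Two hypothetical samples (made-up values)
a = c(1, 2, 3, 10, 11)
b = c(4, 5, 6, 7, 8)
# (a) difference of the two group medians
median(a) - median(b)       # gives -3 here
# (b) median of all pairwise differences (a Hodges-Lehmann-type estimate)
median(outer(a, b, "-"))    # gives -2 here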
3 Recommendations
Sal Mangiafico
Rutgers, The State University of New Jersey
"The Median test is a "crude" version of the Kruskal-Wallis ANOVA...". Oy. I don't know who writes this stuff.
2 Recommendations
Daniel Wright
University of Nevada, Las Vegas
Maksim Sokolovskii , here is the handout I gave one of my classes a few weeks ago to show that the Kruskal-Wallis test doesn't do this (and is intransitive). I am in Las Vegas, hence the gambling story. Just re-loaded the color printer ink, so put in lots of color for printing! Using @Sal's package in this handout!
2 Recommendations
Ahmad Al Khraisat
National Agricultural Research Center - NARC
To determine whether two groups are significantly different using confidence intervals, you can check whether their intervals overlap or not. If the intervals do not overlap, this suggests that the two groups are significantly different. If the intervals overlap, then we cannot conclude that the two groups are significantly different.
Another way to approach this is to calculate a confidence interval for the difference between the two groups. To do this, you can subtract the median of one group from the median of the other group and calculate the standard error of the difference using the formula SE = sqrt(SE1^2 + SE2^2), where SE1 and SE2 are the standard errors of the medians for each group, calculated using the upper and lower bounds of the 95% confidence interval. Then, you can calculate the 95% confidence interval for the difference using the formula:
CI = (difference − 1.96×SE, difference + 1.96×SE)
If the confidence interval for the difference does not include zero, this suggests that the two groups are significantly different.
It is worth noting that the method described above assumes that the sampling distributions of the medians are approximately normal and that the standard errors of the medians derived from the confidence intervals are accurate. If these assumptions do not hold, this method may not be appropriate, and other methods may need to be used, such as a non-parametric test like the Mann-Whitney U test (also known as the Wilcoxon rank-sum test), which requires the raw data.
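As a rough illustration, here is a minimal R sketch of the calculation described above, with made-up medians and 95% confidence limits standing in for the real summary values:
# Hypothetical summary statistics (replace with your own values)
median1 = 12.0; lower1 = 10.0; upper1 = 14.5   # group 1: median and 95% CI
median2 =  9.0; lower2 =  7.5; upper2 = 10.5   # group 2: median and 95% CI
# Approximate standard errors back-calculated from the CI widths
se1 = (upper1 - lower1) / (2 * 1.96)
se2 = (upper2 - lower2) / (2 * 1.96)
# Difference in medians, its standard error, and an approximate 95% CI
diff = median1 - median2
se.diff = sqrt(se1^2 + se2^2)
c(diff - 1.96 * se.diff, diff + 1.96 * se.diff)   # excludes zero for these values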
1 Recommendation

Similar questions and discussions

Which stat analysis for repeated measures categorical ordinal variables using SPSS?
Question
3 answers
  • Sharon CopseySharon Copsey
Research Questions - predicting DVs will affect IVs.
3 predictor variables (IVs) - all categorical and ordinal. One could be continuous with lots of transforming, the other two are Likert Scale data.
4 outcome variables (DVs) - all categorical and ordinal Likert scale.
All variables are measured across 3 timepoints before, during and after.
The normal distribution was skewed, which I'm attempting to rectify by amending minimal outliers to the mean value. Is there a maximum number that is acceptable to alter by this method? Or minimum to create a normal distribution?
If the above is acceptable, is one-way ANOVA for the one IV that can be continuous the best option? Or best to run Chi-Square for each DV and IV and reduce categories into sets of 2 e.g. depressed/not depressed & healthy/not healthy? However, Andy Field book advises not to use Chi on repeated measures as each person, item, or entity should contribute to only one cell and my DV would be in two.
If a normal distribution is not acceptable as outlined in my previous paragraph, the Friedman test would measure one variable at the 3 timepoints, but not against each other, can I run Friedman first then any suggestions on how to see differences at timepoints?
Any suggestions wonderfully experienced academics, as I'm getting different suggestions from my supervisor and data support groups verses Andy Field's comment above?
Much appreciated
