Statistical algorithms in Review Manager 5
Jonathan J Deeks and Julian PT Higgins
on behalf of the Statistical Methods Group
of The Cochrane Collaboration
August 2010
Data structure
Consider a meta-analysis of k studies. When the studies have a dichotomous (binary) outcome the
results of each study can be presented in a 2×2 table (Table 1) giving the numbers of participants
who do or do not experience the event in each of the two groups (here called experimental (or 1)
and control (or 2)).
Table 1: Binary data

Study i         Event      No event    Total
Experimental    $a_i$      $b_i$       $n_{1i}$
Control         $c_i$      $d_i$       $n_{2i}$

If the outcome is a continuous measure, the number of participants in each of the two groups, their
mean response and the standard deviation of their responses are required to perform meta-analysis
(Table 2).
Table 2: Continuous data

Study i         Group size    Mean response    Standard deviation
Experimental    $n_{1i}$      $m_{1i}$         $sd_{1i}$
Control         $n_{2i}$      $m_{2i}$         $sd_{2i}$

If the outcome is analysed by comparing observed with expected values (for example using the
Peto method or a log-rank approach for time-to-event data), then ‘O – E’ statistics and their
variances are required to perform the meta-analysis. Group sizes may also be entered by the review
author, but are not involved in the analysis.
Table 3: O minus E and variance

Study    O minus E    Variance of (O minus E)    Group size (experimental)    Group size (control)
i        $Z_i$        $V_i$                      $n_{1i}$                     $n_{2i}$

For other outcomes a generic approach can be used, the user directly specifying the values of the
intervention effect estimate and its standard error for each study (the standard error may be
calculable from a confidence interval). ‘Ratio’ measures of effect (e.g. odds ratio, risk ratio,
hazard ratio, ratio of means) will normally be expressed on a log scale; ‘difference’ measures of
effect (e.g. risk difference, difference in means) will normally be expressed on their natural scale.
Group sizes can optionally be entered by the review author, but are not involved in the analysis.
Table 4: Generic data

Study    Estimate of effect    Standard error of estimate         Group size (experimental)    Group size (control)
i        $\hat{\theta}_i$      $\mathrm{SE}\{\hat{\theta}_i\}$    $n_{1i}$                     $n_{2i}$

Formulae for individual studies
Individual study estimates: dichotomous outcomes
Peto odds ratio
For study $i$ denote the cell counts as in Table 1, with $n_{1i} = a_i + b_i$, $n_{2i} = c_i + d_i$, and let
$N_i = n_{1i} + n_{2i}$. For the Peto method, the individual odds ratios are given by

$$OR_{Peto,i} = \exp\left\{\frac{Z_i}{V_i}\right\}.$$

The logarithm of the odds ratio has standard error

$$\mathrm{SE}\{\ln(OR_{Peto,i})\} = \frac{1}{\sqrt{V_i}},$$

where $Z_i$ is the ‘O – E’ statistic:

$$Z_i = a_i - \mathrm{E}[a_i],$$

with

$$\mathrm{E}[a_i] = \frac{n_{1i}(a_i + c_i)}{N_i}$$

(the expected number of events in the experimental intervention group), and

$$V_i = \frac{n_{1i}\, n_{2i}\, (a_i + c_i)(b_i + d_i)}{N_i^2 (N_i - 1)}$$

(the hypergeometric variance of $a_i$).
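The Peto formulae for a single study can be sketched in Python as follows. This is an illustrative sketch only, not RevMan's own code; the function name `peto_odds_ratio` and its return layout are assumptions made for this example.

```python
import math

def peto_odds_ratio(a, b, c, d):
    """Peto odds ratio for one 2x2 table (hypothetical helper, not RevMan code).

    Returns (OR_Peto_i, SE of ln OR_Peto_i, Z_i, V_i).
    """
    n1, n2 = a + b, c + d
    N = n1 + n2
    expected = n1 * (a + c) / N          # E[a_i]
    z = a - expected                     # 'O - E' statistic Z_i
    # Hypergeometric variance V_i of a_i
    v = n1 * n2 * (a + c) * (b + d) / (N**2 * (N - 1))
    return math.exp(z / v), 1 / math.sqrt(v), z, v
```

For example, `peto_odds_ratio(10, 90, 20, 80)` has 15 expected events in the experimental group, so $Z_i = -5$.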
Odds ratio
For methods other than the Peto method, the odds ratio for each study is given by

$$OR_i = \frac{a_i d_i}{b_i c_i},$$

the standard error of the log odds ratio being

$$\mathrm{SE}\{\ln(OR_i)\} = \sqrt{\frac{1}{a_i} + \frac{1}{b_i} + \frac{1}{c_i} + \frac{1}{d_i}}.$$

Risk ratio
The risk ratio for each study is given by

$$RR_i = \frac{a_i / n_{1i}}{c_i / n_{2i}},$$

the standard error of the log risk ratio being

$$\mathrm{SE}\{\ln(RR_i)\} = \sqrt{\frac{1}{a_i} + \frac{1}{c_i} - \frac{1}{n_{1i}} - \frac{1}{n_{2i}}}.$$

Risk difference
The risk difference for each study is given by

$$RD_i = \frac{a_i}{n_{1i}} - \frac{c_i}{n_{2i}},$$

with standard error

$$\mathrm{SE}\{RD_i\} = \sqrt{\frac{a_i b_i}{n_{1i}^3} + \frac{c_i d_i}{n_{2i}^3}}.$$

Empty cells
Where zeros cause problems with computation of effects or standard errors, 0.5 is added to all cells
($a_i$, $b_i$, $c_i$, $d_i$) for that study, except when $a_i = c_i = 0$ or $b_i = d_i = 0$, when the relative effect
measures $OR_i$ and $RR_i$ are undefined.
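The dichotomous formulae and the empty-cell rule can be illustrated with a short Python sketch. It rests on assumptions that the text does not settle: "zeros cause problems" is read here as "any cell is zero", the group totals are recomputed from the corrected cells, and the risk difference is computed from the raw counts without any correction. The function names are hypothetical.

```python
import math

def odds_ratio_and_risk_ratio(a, b, c, d):
    """Return {'OR': (est, SE of ln est), 'RR': (...)} or None if undefined."""
    if (a == 0 and c == 0) or (b == 0 and d == 0):
        return None  # OR_i and RR_i are undefined for this study
    if 0 in (a, b, c, d):
        # Assumption: any zero cell triggers the 0.5 correction on all cells
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    n1, n2 = a + b, c + d  # assumption: totals follow the corrected cells
    odds_ratio = (a * d) / (b * c)
    se_ln_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    risk_ratio = (a / n1) / (c / n2)
    se_ln_rr = math.sqrt(1 / a + 1 / c - 1 / n1 - 1 / n2)
    return {"OR": (odds_ratio, se_ln_or), "RR": (risk_ratio, se_ln_rr)}

def risk_difference(a, b, c, d):
    """Risk difference and its standard error from raw cell counts."""
    n1, n2 = a + b, c + d
    rd = a / n1 - c / n2
    se = math.sqrt(a * b / n1**3 + c * d / n2**3)
    return rd, se
```

With 10/100 events in the experimental group and 20/100 in the control group, this gives an odds ratio of 4/9, a risk ratio of 0.5 and a risk difference of -0.1 with standard error 0.05.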
Individual study estimates: continuous outcomes
Denote the number of participants, mean and standard deviation as in Table 2, and let
$N_i = n_{1i} + n_{2i}$ and

$$s_i = \sqrt{\frac{(n_{1i} - 1)\, sd_{1i}^2 + (n_{2i} - 1)\, sd_{2i}^2}{N_i - 2}}$$

be the pooled standard deviation across the two groups.
Difference in means (mean difference)
The difference in means (referred to as mean difference) is given by

$$MD_i = m_{1i} - m_{2i},$$

with standard error

$$\mathrm{SE}\{MD_i\} = \sqrt{\frac{sd_{1i}^2}{n_{1i}} + \frac{sd_{2i}^2}{n_{2i}}}.$$

Standardized difference in means (standardized mean difference)
There are several popular formulations of the standardized mean difference. The one implemented
in RevMan is Hedges’ adjusted g, which is very similar to Cohen's d, but includes an adjustment
for small sample bias

$$SMD_i = \frac{m_{1i} - m_{2i}}{s_i}\left(1 - \frac{3}{4N_i - 9}\right),$$

with standard error

$$\mathrm{SE}\{SMD_i\} = \sqrt{\frac{N_i}{n_{1i}\, n_{2i}} + \frac{SMD_i^2}{2(N_i - 3.94)}}.$$

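A minimal Python sketch of the two continuous-outcome estimates follows. The function names and argument order are assumptions made for illustration; the arithmetic follows the formulae for the mean difference and Hedges' adjusted g given in this section.

```python
import math

def mean_difference(n1, m1, sd1, n2, m2, sd2):
    """Difference in means and its standard error."""
    md = m1 - m2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return md, se

def hedges_adjusted_g(n1, m1, sd1, n2, m2, sd2):
    """Standardized mean difference (Hedges' adjusted g) and its SE."""
    N = n1 + n2
    # Pooled standard deviation s_i across the two groups
    s = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (N - 2))
    smd = (m1 - m2) / s * (1 - 3 / (4 * N - 9))   # small-sample adjustment
    se = math.sqrt(N / (n1 * n2) + smd**2 / (2 * (N - 3.94)))
    return smd, se
```

For two groups of 50 with means 10 and 8 and a common standard deviation of 4, the mean difference is 2 (standard error 0.8), and the unadjusted value 0.5 is shrunk by the factor $1 - 3/391$.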
Individual study estimates: O – E and variance
For study $i$ the effect estimate is given by

$$\hat{\theta}_i = \frac{Z_i}{V_i},$$

with standard error

$$\mathrm{SE}\{\hat{\theta}_i\} = \frac{1}{\sqrt{V_i}}.$$

The effect estimate is either of a log odds ratio or a log hazard ratio, depending on how the
observed and expected values were derived.
Individual study estimates: Generic method
As the user directly enters the intervention effect estimates and their standard errors no further
processing is needed. All types of intervention effects are eligible for this method, but it might be
most useful when intervention effects have been calculated in a way which makes special
consideration of design (e.g. cluster randomized and cross-over trials), are adjusted for other effects
(adjusted effects from non-randomized studies) or are not covered by existing methods (e.g. ratios
of means, relative event rates).
Meta-analysis methods
All summations are over i, from 1 to the number of studies, unless otherwise specified.
Mantel-Haenszel methods for combining results across studies
Odds ratio
The Mantel-Haenszel summary log odds ratio is given by
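
$$\ln(OR_{MH}) = \ln\left(\frac{\sum w_{MH,i}\, OR_i}{\sum w_{MH,i}}\right), \qquad (1)$$

and the Mantel-Haenszel summary odds ratio by

$$OR_{MH} = \frac{\sum w_{MH,i}\, OR_i}{\sum w_{MH,i}},$$

where each study’s odds ratio is given weight

$$w_{MH,i} = \frac{b_i c_i}{N_i}.$$

The summary log odds ratio has standard error given by

$$\mathrm{SE}\{\ln(OR_{MH})\} = \sqrt{\frac{1}{2}\left(\frac{E}{R^2} + \frac{F + G}{RS} + \frac{H}{S^2}\right)}, \qquad (2)$$

where

$$R = \sum \frac{a_i d_i}{N_i}; \qquad S = \sum \frac{b_i c_i}{N_i};$$

$$E = \sum \frac{(a_i + d_i)\, a_i d_i}{N_i^2}; \qquad F = \sum \frac{(a_i + d_i)\, b_i c_i}{N_i^2};$$

$$G = \sum \frac{(b_i + c_i)\, a_i d_i}{N_i^2}; \qquad H = \sum \frac{(b_i + c_i)\, b_i c_i}{N_i^2}.$$
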
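Equations (1) and (2) can be sketched in Python as below. The function name `mantel_haenszel_or` is hypothetical and the code is an illustration rather than RevMan's implementation. It uses the identity $\sum w_{MH,i}\, OR_i = \sum a_i d_i / N_i = R$, so the summary odds ratio reduces to $R/S$.

```python
import math

def mantel_haenszel_or(tables):
    """tables: iterable of (a, b, c, d). Returns (OR_MH, SE of ln OR_MH)."""
    R = S = E = F = G = H = 0.0
    for a, b, c, d in tables:
        N = a + b + c + d
        R += a * d / N
        S += b * c / N
        E += (a + d) * a * d / N**2
        F += (a + d) * b * c / N**2
        G += (b + c) * a * d / N**2
        H += (b + c) * b * c / N**2
    or_mh = R / S  # sum(w_i * OR_i) / sum(w_i) with w_i = b_i c_i / N_i
    se_ln_or = math.sqrt(0.5 * (E / R**2 + (F + G) / (R * S) + H / S**2))
    return or_mh, se_ln_or
```

With a single study the result agrees with the individual-study odds ratio and its standard error, which gives a convenient check.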
Risk ratio
The Mantel-Haenszel summary log risk ratio is given by

$$\ln(RR_{MH}) = \ln\left(\frac{\sum w_{MH,i}\, RR_i}{\sum w_{MH,i}}\right), \qquad (3)$$

and the Mantel-Haenszel summary risk ratio by

$$RR_{MH} = \frac{\sum w_{MH,i}\, RR_i}{\sum w_{MH,i}},$$

where each study’s risk ratio is given weight

$$w_{MH,i} = \frac{c_i (a_i + b_i)}{N_i}.$$

The summary log risk ratio has standard error given by

$$\mathrm{SE}\{\ln(RR_{MH})\} = \sqrt{\frac{P}{RS}}, \qquad (4)$$

where

$$P = \sum \frac{n_{1i}\, n_{2i}\, (a_i + c_i) - a_i c_i N_i}{N_i^2}; \qquad R = \sum \frac{a_i n_{2i}}{N_i}; \qquad S = \sum \frac{c_i n_{1i}}{N_i}.$$

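A Python sketch of equations (3) and (4) follows, under the same caveats as any illustration: the function name is an assumption and this is not RevMan's code. Because $w_{MH,i}\, RR_i = a_i n_{2i} / N_i$, the summary risk ratio equals $R/S$.

```python
import math

def mantel_haenszel_rr(tables):
    """tables: iterable of (a, b, c, d). Returns (RR_MH, SE of ln RR_MH)."""
    P = R = S = 0.0
    for a, b, c, d in tables:
        n1, n2 = a + b, c + d
        N = n1 + n2
        P += (n1 * n2 * (a + c) - a * c * N) / N**2
        R += a * n2 / N
        S += c * n1 / N   # also the Mantel-Haenszel weight w_MH,i
    return R / S, math.sqrt(P / (R * S))
```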
Risk difference
The Mantel-Haenszel summary risk difference is given by
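
$$RD_{MH} = \frac{\sum w_{MH,i}\, RD_i}{\sum w_{MH,i}}, \qquad (5)$$

where each study’s risk difference is given weight

$$w_{MH,i} = \frac{n_{1i}\, n_{2i}}{N_i}.$$

The summary risk difference has standard error given by

$$\mathrm{SE}\{RD_{MH}\} = \sqrt{\frac{J}{K^2}}, \qquad (6)$$

where

$$J = \sum \frac{a_i b_i n_{2i}^3 + c_i d_i n_{1i}^3}{n_{1i}\, n_{2i}\, N_i^2}; \qquad K = \sum \frac{n_{1i}\, n_{2i}}{N_i}.$$
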
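Equations (5) and (6) translate into a few lines of Python. As before this is a sketch with a hypothetical function name, not the RevMan source.

```python
import math

def mantel_haenszel_rd(tables):
    """tables: iterable of (a, b, c, d). Returns (RD_MH, SE of RD_MH)."""
    weighted_sum = J = K = 0.0
    for a, b, c, d in tables:
        n1, n2 = a + b, c + d
        N = n1 + n2
        w = n1 * n2 / N                       # Mantel-Haenszel weight
        weighted_sum += w * (a / n1 - c / n2)  # w_i * RD_i
        J += (a * b * n2**3 + c * d * n1**3) / (n1 * n2 * N**2)
        K += w
    return weighted_sum / K, math.sqrt(J) / K  # sqrt(J / K^2)
```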
Test for heterogeneity
The heterogeneity test statistic is given by

$$Q_{MH} = \sum w_i \left(\hat{\theta}_i - \hat{\theta}_{MH}\right)^2,$$

where $\hat{\theta}$ represents the log odds ratio, log risk ratio or risk difference and the $w_i$ are the weights
calculated as $1/\mathrm{SE}\{\hat{\theta}_i\}^2$ rather than the weights used for the Mantel-Haenszel meta-analyses.
Under the null hypothesis that there are no differences in intervention effect among studies this
follows a chi-squared distribution with $k - 1$ degrees of freedom (where $k$ is the number of studies
contributing to the meta-analysis).
The statistic $I^2$ is calculated as

$$I^2 = \max\left\{100\% \times \frac{Q_{MH} - (k - 1)}{Q_{MH}},\ 0\right\}.$$

This measures the extent of inconsistency among the studies’ results, and is interpreted as
approximately the proportion of total variation in study estimates that is due to heterogeneity rather
than sampling error.
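The heterogeneity statistic and $I^2$ can be sketched in Python as follows. The point to notice is that the weights are inverse-variance weights even though the estimates are centred on the Mantel-Haenszel summary. The function name is hypothetical, and returning $I^2 = 0$ when $Q = 0$ is an assumption added to avoid division by zero.

```python
def heterogeneity(estimates, standard_errors, pooled):
    """Return (Q, I2 in percent) around a given pooled estimate.

    estimates/pooled are on the analysis scale (log scale for ratios).
    """
    weights = [1 / se**2 for se in standard_errors]  # inverse-variance weights
    q = sum(w * (t - pooled) ** 2 for w, t in zip(weights, estimates))
    k = len(estimates)
    # Assumption: define I2 as 0 when Q is 0 (the formula is 0/0 there)
    i2 = max(100 * (q - (k - 1)) / q, 0.0) if q > 0 else 0.0
    return q, i2
```

For two estimates 0 and 2, each with standard error 0.5, and a pooled value of 1, this gives $Q = 8$ on 1 degree of freedom and $I^2 = 87.5\%$.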
Inverse-variance methods for combining results across studies
Inverse-variance methods are used to pool log odds ratios, log risk ratios and risk differences as one
of the analysis options for binary data, to pool all mean differences and standardized mean
differences for continuous data, and also for combining intervention effect estimates in the generic
method. In the general formula the intervention effect estimate is denoted by $\hat{\theta}_i$, which is the
study’s log odds ratio, log risk ratio, risk difference, mean difference or standardized mean
difference, or the estimate of intervention effect in the generic method. The individual effect sizes
are weighted according to the reciprocal of their variance (calculated as the square of the standard
error given in the individual study section above) giving

$$w_i = \frac{1}{\mathrm{SE}\{\hat{\theta}_i\}^2}.$$

These are combined to give a summary estimate

$$\hat{\theta}_{IV} = \frac{\sum w_i \hat{\theta}_i}{\sum w_i}, \qquad (7)$$

with

$$\mathrm{SE}\{\hat{\theta}_{IV}\} = \frac{1}{\sqrt{\sum w_i}}. \qquad (8)$$

The heterogeneity statistic is given by a similar formula as for the Mantel-Haenszel method:

$$Q_{IV} = \sum w_i \left(\hat{\theta}_i - \hat{\theta}_{IV}\right)^2.$$

Under the null hypothesis that there are no differences in intervention effect among studies this
follows a chi-squared distribution with $k - 1$ degrees of freedom (where $k$ is the number of studies
contributing to the meta-analysis). $I^2$ is calculated as

$$I^2 = \max\left\{100\% \times \frac{Q_{IV} - (k - 1)}{Q_{IV}},\ 0\right\}.$$

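The inverse-variance method is compact enough to sketch in one Python function covering equations (7) and (8) together with $Q_{IV}$ and $I^2$. The name `inverse_variance` and the tuple it returns are assumptions for this example; treating $I^2$ as 0 when $Q_{IV} = 0$ is also an assumption.

```python
import math

def inverse_variance(estimates, standard_errors):
    """Fixed-effect inverse-variance pooling.

    Returns (pooled estimate, its SE, Q_IV, I2 in percent).
    """
    w = [1 / se**2 for se in standard_errors]
    sum_w = sum(w)
    pooled = sum(wi * t for wi, t in zip(w, estimates)) / sum_w
    se_pooled = 1 / math.sqrt(sum_w)
    q = sum(wi * (t - pooled) ** 2 for wi, t in zip(w, estimates))
    k = len(estimates)
    i2 = max(100 * (q - (k - 1)) / q, 0.0) if q > 0 else 0.0
    return pooled, se_pooled, q, i2
```

For ratio measures the inputs are log odds ratios or log risk ratios, and the pooled value is exponentiated only for presentation.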
Peto's method for combining results across studies
The Peto summary log odds ratio is given by

$$\ln(OR_{Peto}) = \frac{\sum V_i \ln(OR_{Peto,i})}{\sum V_i}, \qquad (9)$$

and the summary odds ratio by

$$OR_{Peto} = \exp\left\{\frac{\sum V_i \ln(OR_{Peto,i})}{\sum V_i}\right\},$$

where the odds ratio $OR_{Peto,i}$ is calculated using the approximate method described in the individual
study section, and $V_i$ are the hypergeometric variances.
The log odds ratio has standard error

$$\mathrm{SE}\{\ln(OR_{Peto})\} = \frac{1}{\sqrt{\sum V_i}}. \qquad (10)$$

The heterogeneity statistic is given by

$$Q_{Peto} = \sum V_i \left\{\ln(OR_{Peto,i})^2 - \ln(OR_{Peto})^2\right\}.$$

Under the null hypothesis that there are no differences in intervention effect among studies this
follows a chi-squared distribution with $k - 1$ degrees of freedom (where $k$ is the number of studies
contributing to the meta-analysis). $I^2$ is calculated as

$$I^2 = \max\left\{100\% \times \frac{Q_{Peto} - (k - 1)}{Q_{Peto}},\ 0\right\}.$$

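A sketch of the Peto summary in Python, starting from the 2×2 tables, is given below. Since $V_i \ln(OR_{Peto,i}) = Z_i$, the numerator of equation (9) is simply $\sum Z_i$. The function name is hypothetical, and setting $I^2$ to 0 when $Q_{Peto}$ is not positive (which can happen through rounding) is an assumption.

```python
import math

def peto_pooled(tables):
    """tables: iterable of (a, b, c, d).

    Returns (OR_Peto, SE of ln OR_Peto, Q_Peto, I2 in percent).
    """
    zs, vs = [], []
    for a, b, c, d in tables:
        n1, n2 = a + b, c + d
        N = n1 + n2
        zs.append(a - n1 * (a + c) / N)                          # Z_i
        vs.append(n1 * n2 * (a + c) * (b + d) / (N**2 * (N - 1)))  # V_i
    sum_v = sum(vs)
    ln_or = sum(zs) / sum_v            # equation (9), using V_i * ln(OR_i) = Z_i
    se = 1 / math.sqrt(sum_v)          # equation (10)
    q = sum(v * ((z / v) ** 2 - ln_or**2) for z, v in zip(zs, vs))
    k = len(zs)
    i2 = max(100 * (q - (k - 1)) / q, 0.0) if q > 0 else 0.0
    return math.exp(ln_or), se, q, i2
```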
O – E and variance method for combining studies
This is an implementation of the Peto method, which allows its application to time-to-event data as
well as binary data. The summary effect estimate is given by

$$\hat{\theta} = \frac{\sum V_i \hat{\theta}_i}{\sum V_i}, \qquad (11)$$

where the estimate, $\hat{\theta}_i$, from study $i$ is calculated from $Z_i$ and $V_i$ as for individual studies. The
summary effect is either a log odds ratio or a log hazard ratio (the user should specify which). The
effect estimate (on a non-log scale) is given by

$$\text{effect estimate} = \exp\left\{\frac{\sum V_i \hat{\theta}_i}{\sum V_i}\right\},$$

and is either an odds ratio or a hazard ratio.
The effect estimate (on the log scale) has standard error

$$\mathrm{SE}\{\hat{\theta}\} = \frac{1}{\sqrt{\sum V_i}}. \qquad (12)$$

The heterogeneity statistic is given by

$$Q_{Peto} = \sum V_i \left(\hat{\theta}_i^2 - \hat{\theta}^2\right).$$

Under the null hypothesis that there are no differences in intervention effect among studies this
follows a chi-squared distribution with $k - 1$ degrees of freedom (where $k$ is the number of studies
contributing to the meta-analysis). $I^2$ is calculated as

$$I^2 = \max\left\{100\% \times \frac{Q_{Peto} - (k - 1)}{Q_{Peto}},\ 0\right\}.$$

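When the inputs are the ‘O – E’ statistics and variances themselves (for example from log-rank analyses), equations (11) and (12) can be sketched as below. The function name and the `(Z, V)` pair layout are assumptions made for this illustration.

```python
import math

def o_minus_e_pooled(z_v_pairs):
    """z_v_pairs: iterable of (Z_i, V_i).

    Returns (theta on log scale, exp(theta), SE of theta, Q_Peto, I2 percent).
    """
    pairs = list(z_v_pairs)
    sum_v = sum(v for _, v in pairs)
    theta = sum(z for z, _ in pairs) / sum_v   # V_i * theta_i = Z_i
    se = 1 / math.sqrt(sum_v)
    q = sum(v * ((z / v) ** 2 - theta**2) for z, v in pairs)
    k = len(pairs)
    # Assumption: report I2 as 0 when Q is not positive
    i2 = max(100 * (q - (k - 1)) / q, 0.0) if q > 0 else 0.0
    return theta, math.exp(theta), se, q, i2
```

Whether `exp(theta)` is read as an odds ratio or a hazard ratio depends on how the observed and expected values were derived.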
DerSimonian and Laird random-effects models
Under the random-effects model, the assumption of a common intervention effect is relaxed, and
the effect sizes are assumed to have a distribution

$$\theta_i \sim N(\theta, \tau^2).$$

The estimate of $\tau^2$ is given by

$$\hat{\tau}^2 = \max\left\{\frac{Q - (k - 1)}{\sum w_i - \left(\sum w_i^2\right) / \sum w_i},\ 0\right\},$$

where the $w_i$ are the inverse-variance weights, calculated as

$$w_i = \frac{1}{\mathrm{SE}\{\hat{\theta}_i\}^2},$$

for log odds ratio, log risk ratio, risk difference, mean difference, standardized mean difference, or
for the intervention effect in the generic method, as appropriate.
For continuous data and for the generic method, $Q$ is $Q_{IV}$. For binary data, either $Q_{IV}$ or $Q_{MH}$ may
be taken. Both are implemented in RevMan 5 (and this is the only difference between random-effects
methods under ‘Mantel-Haenszel’ and ‘inverse-variance’ options). Again, for odds ratios,
risk ratios and other ratio effects, the effect size is taken on the natural logarithmic scale.
Each study’s effect size is given weight

$$w'_i = \frac{1}{\mathrm{SE}\{\hat{\theta}_i\}^2 + \hat{\tau}^2}.$$

The summary effect size is given by

$$\hat{\theta}_{DL} = \frac{\sum w'_i \hat{\theta}_i}{\sum w'_i}, \qquad (13)$$

and

$$\mathrm{SE}\{\hat{\theta}_{DL}\} = \frac{1}{\sqrt{\sum w'_i}}. \qquad (14)$$

Note that in the case where the heterogeneity statistic $Q$ is less than or equal to its degrees of
freedom $(k - 1)$, the estimate of the between study variation, $\hat{\tau}^2$, is zero, and the weights coincide
with those given by the inverse-variance method.
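The DerSimonian and Laird calculation can be sketched in Python as follows. The function name and the optional `q` argument are assumptions for illustration: when `q` is omitted the sketch uses $Q_{IV}$, and a caller could pass $Q_{MH}$ instead to mimic the ‘Mantel-Haenszel’ random-effects option. At least two studies are needed, since the denominator of $\hat{\tau}^2$ is zero for a single study.

```python
import math

def dersimonian_laird(estimates, standard_errors, q=None):
    """Random-effects pooling. Returns (theta_DL, SE of theta_DL, tau2)."""
    w = [1 / se**2 for se in standard_errors]   # inverse-variance weights
    sum_w = sum(w)
    if q is None:
        # Default to Q_IV, centred on the fixed-effect IV estimate
        fixed = sum(wi * t for wi, t in zip(w, estimates)) / sum_w
        q = sum(wi * (t - fixed) ** 2 for wi, t in zip(w, estimates))
    k = len(estimates)
    denominator = sum_w - sum(wi**2 for wi in w) / sum_w  # zero when k == 1
    tau2 = max((q - (k - 1)) / denominator, 0.0)
    w_star = [1 / (se**2 + tau2) for se in standard_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * t for wi, t in zip(w_star, estimates)) / sum_w_star
    return pooled, 1 / math.sqrt(sum_w_star), tau2
```

For two estimates 0 and 2 with standard errors 0.5, $Q = 8$, the denominator is 4 and $\hat{\tau}^2 = 1.75$, so each study receives weight $1/(0.25 + 1.75) = 0.5$ and the summary standard error is 1.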
Confidence intervals
The $100(1 - \alpha)\%$ confidence interval for $\hat{\theta}$ is given by

$$\hat{\theta} - \mathrm{SE}\{\hat{\theta}\}\, \Phi(1 - \alpha/2) \quad \text{to} \quad \hat{\theta} + \mathrm{SE}\{\hat{\theta}\}\, \Phi(1 - \alpha/2),$$

where $\hat{\theta}$ is the log odds ratio, log risk ratio, risk difference, mean difference, standardized mean
difference or generic intervention effect estimate, and $\Phi$ is the standard normal deviate. For log
odds ratios, log risk ratios and generic intervention effects entered on the log scale (and identified
as such by the review author), the point estimate and confidence interval limits are exponentiated
for presentation.
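As an illustration, the interval can be computed with Python's standard library. This sketch interprets the standard normal deviate $\Phi(1 - \alpha/2)$ as the quantile of the standard normal distribution at $1 - \alpha/2$ (about 1.96 for $\alpha = 0.05$); the function name and the `log_scale` flag are assumptions for this example.

```python
import math
from statistics import NormalDist

def confidence_interval(theta, se, alpha=0.05, log_scale=False):
    """Return (estimate, lower, upper); exponentiated if log_scale is True."""
    deviate = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile
    lower, upper = theta - deviate * se, theta + deviate * se
    if log_scale:
        # Ratio measures are analysed on the log scale and
        # exponentiated only for presentation
        return math.exp(theta), math.exp(lower), math.exp(upper)
    return theta, lower, upper
```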
Test statistics
Test for presence of an overall intervention effect
In all cases, the test statistic is given by

$$Z = \frac{\hat{\theta}}{\mathrm{SE}\{\hat{\theta}\}},$$

where the odds ratio, risk ratio and other ratio measures are again considered on the log scale.
Under the null hypothesis that there is no overall effect of intervention this follows a standard
normal distribution.
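A minimal Python sketch of the test follows. The text defines only the statistic and its null distribution, so the two-sided P value computed here is an assumption about how the statistic is used, and the function name is hypothetical.

```python
from statistics import NormalDist

def overall_effect_test(theta, se):
    """Return (Z, two-sided P value) for a summary estimate on its analysis scale."""
    z = theta / se
    # Assumption: a two-sided test against the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```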
Test for comparison of subgroups
The test is valid for all methods. It is based on the notion of performing a test for heterogeneity
across subgroups rather than across studies. Let $\hat{\theta}_j$ be the summary effect size for subgroup $j$, with
standard error $\mathrm{SE}\{\hat{\theta}_j\}$. The summary effect size may be based on either a fixed-effect or a
random-effects meta-analysis. For fixed-effect meta-analyses, these numbers correspond to
equations (1) and (2); (3) and (4); (5) and (6); (7) and (8); (9) and (10); or (11) and (12) above, each
applied within each subgroup. For random-effects meta-analyses, these numbers correspond to
equations (13) and (14), each applied within each subgroup. Note that for ratio measures, all
computations here are performed on the log scale.
First we compute a weight for each subgroup:

$$w_j = \frac{1}{\mathrm{SE}\{\hat{\theta}_j\}^2},$$

then we perform a (fixed-effect) meta-analysis of the summary effect sizes across subgroups:

$$\hat{\theta}_{tot} = \frac{\sum w_j \hat{\theta}_j}{\sum w_j}.$$

The test statistic for differences across subgroups is given by

$$Q_{int} = \sum w_j \left(\hat{\theta}_j - \hat{\theta}_{tot}\right)^2.$$

Under the null hypothesis that there are no differences in intervention effect across subgroups this
follows a chi-squared distribution with $S - 1$ degrees of freedom (where $S$ is the number of
subgroups with summary effect sizes).
$I^2$ for differences across subgroups is calculated as

$$I^2 = \max\left\{100\% \times \frac{Q_{int} - (S - 1)}{Q_{int}},\ 0\right\}.$$

This measures the extent of inconsistency across the subgroups’ results, and is interpreted as
approximately the proportion of total variation in subgroup estimates that is due to genuine
variation across subgroups rather than sampling error.
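The subgroup test can be sketched in Python by treating each subgroup summary as if it were a single study. The function name and input layout are assumptions for illustration, and $I^2$ is set to 0 when $Q_{int} = 0$ to avoid division by zero.

```python
def subgroup_test(subgroup_summaries):
    """subgroup_summaries: list of (theta_j, SE_j), one per subgroup.

    Returns (Q_int, degrees of freedom, I2 in percent).
    """
    w = [1 / se**2 for _, se in subgroup_summaries]
    thetas = [theta for theta, _ in subgroup_summaries]
    theta_tot = sum(wj * t for wj, t in zip(w, thetas)) / sum(w)
    q_int = sum(wj * (t - theta_tot) ** 2 for wj, t in zip(w, thetas))
    df = len(thetas) - 1                      # S - 1
    i2 = max(100 * (q_int - df) / q_int, 0.0) if q_int > 0 else 0.0
    return q_int, df, i2
```

The inputs may come from any of the fixed-effect or random-effects summaries, provided ratio measures are supplied on the log scale.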
Note. An alternative formulation for fixed-effect meta-analyses (inverse variance and Peto methods
only) is as follows. The $Q$ statistic defined by either $Q_{IV}$ or $Q_{Peto}$ is calculated separately for each of
the $S$ subgroups and for the totality of studies, yielding statistics $Q_1$, …, $Q_S$ and $Q_{tot}$. The test
statistic is given by

$$Q_{int} = Q_{tot} - \sum_{j=1}^{S} Q_j.$$

This is identical to the test statistic given above, in these specific situations.
Bibliography
Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. Introduction to Meta-analysis. John Wiley
& Sons, 2009.
Breslow NE, Day NE. Combination of results from a series of 2x2 tables; control of confounding.
In: Statistical Methods in Cancer Research, Volume 1: The analysis of case-control data. IARC
Scientific Publications No. 32. Lyon: International Agency for Research on Cancer, 1980.
Deeks JJ, Altman DG, Bradburn MJ. Statistical methods for examining heterogeneity and
combining results from several studies in a meta-analysis. In: Egger M, Davey Smith G, Altman
DG. Systematic Reviews in Health Care: meta-analysis in context. BMJ Publications (in press).
DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials 1986; 7: 177-188.
Greenland S, Robins J. Estimation of a common effect parameter from sparse follow-up data.
Biometrics 1985; 41: 55-68.
Greenland S, Salvan A. Bias in the one-step method for pooling study results. Statistics in Medicine
1990; 9: 247-252.
Hedges LV, Olkin I. Statistical Methods for Meta-analysis. San Diego: Academic Press, 1985.
Chapter 5.
Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analysis.
BMJ 2003; 327: 557-560.
Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of
disease. Journal of the National Cancer Institute 1959; 22: 719-748.
Robins J, Greenland S, Breslow NE. A general estimator for the variance of the Mantel-Haenszel
odds ratio. American Journal of Epidemiology 1986; 124: 719-723.
Rosenthal R. Parametric measures of effect size. In: Cooper H, Hedges LV (eds.). The Handbook
of Research Synthesis. New York: Russell Sage Foundation, 1994.
Sinclair JC, Bracken MB. Effective Care of the Newborn Infant. Oxford: Oxford University Press,
1992. Chapter 2.
Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial
infarction: an overview of the randomized trials. Progress in Cardiovascular Diseases 1985; 27: 335-371.