The Stata Journal (yyyy) vv, Number ii, pp. 1–11
metaan: random effects meta-analysis
Evangelos Kontopantelis
National Primary Care
Research & Development Centre
University of Manchester
Manchester, UK
e.kontopantelis@manchester.ac.uk
David Reeves
Health Sciences Primary Care
Research Group
University of Manchester
Manchester, UK
david.reeves@manchester.ac.uk
Abstract. This article describes a new meta-analysis command, metaan, which can
be used to perform fixed- or random-effects meta-analysis, offering a wide choice
of available models: maximum likelihood, profile likelihood, restricted maximum
likelihood and a permutation method, besides the standard DerSimonian and Laird
approach. The command reports a variety of heterogeneity measures including
Cochran’s Q, $I^2$, $H^2_M$ and the between-study variance estimate $\hat{\tau}^2$. A forest plot
and a graph of the maximum likelihood function can also be generated.
Keywords: st0001, metaan, meta-analysis, random-effect(s), effect size(s), maxi-
mum likelihood, profile likelihood, restricted maximum likelihood, REML, permu-
tation(s) method, forest plot
1 Introduction
Meta-analysis is a statistical methodology that combines or integrates the results of
several independent clinical trials, or studies in general, considered by the analyst to
be ‘combinable’ (Huque 1988). Usually, this is a two-stage process: in the first stage
the appropriate summary statistic for each study is estimated, then at the second stage
these are combined into a weighted average. Methods also exist for combining and
meta-analysing data across studies at the individual patient level (IPD methods). An
IPD analysis provides advantages such as standardization (of marker values, outcome
definitions etc), follow-up information updating, detailed data-checking, subgroup anal-
yses and the ability to include participant-level covariates (Stewart and Clarke 1995;
Lambert et al. 2002). However, individual observations are rarely available; addition-
ally, if the main interest is in mean effects then the two-stage and the IPD approaches
can provide equivalent results (Olkin and Sampson 1998).
This paper concerns itself with the second stage of the two-stage approach to meta-
analysis. At this stage, researchers can select between two main approaches, the fixed-
or the random-effects model, in their effort to combine the study-level summary esti-
mates and calculate an overall average effect. The fixed-effect model is simpler and
assumes the true effect to be the same (homogeneous) across studies. However, homo-
geneity has been found to be the exception rather than the rule and some degree of
true effect variability between studies is to be expected (Thompson and Pocock 1991).
This between-study heterogeneity stems from differences in populations, interventions,
outcomes or follow-up times (clinical heterogeneity), or differences in trial design and
quality (methodological heterogeneity) (Higgins and Green 2008; Thompson 1994). The
most common approach to modelling the between-study variance is the method proposed
by DerSimonian and Laird (1986), which is widely used in generic and specialist meta-
analysis statistical packages alike. In Stata the DerSimonian and Laird (DL) model is
used in the most popular meta-analysis commands, the recently updated metan and
the older but still useful meta (Harris et al. 2008). However, the between-study vari-
ance component can be estimated using more advanced iterative (and computationally
expensive) techniques: maximum likelihood, profile likelihood and restricted maximum
likelihood (Hardy and Thompson 1996; Thompson and Sharp 1999). Alternatively, the
estimate can be obtained using non-parametric approaches, such as the ‘permutations’
method proposed by Follmann and Proschan (1999).
We have implemented these methods in metaan, which performs the second stage
of a two-stage meta-analysis, offering alternatives to the DerSimonian-Laird random-
effects model. The command requires the study effect estimates and standard errors as
input. We have also created metaeff (not discussed in the present paper), a command
which provides support in the first stage of the two-stage process and which complements
metaan. The metaeff command calculates the effect size (standardised mean difference)
and its standard error from the input parameters supplied by the user, for each study,
using one of the methods described in the Cochrane Handbook for Systematic Reviews of
Interventions (Higgins and Green 2006). For more details type ssc describe metaeff
in Stata, or see Kontopantelis and Reeves (2009).
The metaan command does not offer the plethora of options metan does for inputting
various types of binary or continuous data. Other useful features in metan (and not
available in metaan) include: stratified meta-analysis, user-input study weights, vaccine
efficacy calculations, Mantel-Haenszel fixed-effect method, L’Abbe and funnel plots.
The REML model, assumed to be the best method to fit a random-effects meta-analysis
model even though this assumption has not been thoroughly investigated (Thompson
and Sharp 1999), has recently been coded in the updated meta-regression command
metareg (Harbord and Higgins 2008) and the new multivariate random-effects meta-
analysis command mvmeta (White 2009). However, the output and options provided by
metaan can be more useful in the univariate meta-analysis context.
2 The metaan command
2.1 Syntax
    metaan varname1 varname2 [if] [in], {fe | dl | ml | reml | pl | pe}
        [varc label(varname) forest forestw(#) plplot(string)]
where
varname1 the study effect sizes.
varname2 the study effect variation, with standard error used as default.
2.2 Options
fe Fixed-effect (FE) model that assumes there is no heterogeneity between the studies.
The model assumes that within-study variances may differ, but that there is homo-
geneity of effect size across studies. Often the homogeneity assumption is unlikely
and variation in the true effect across studies is to be expected. Therefore, caution
is required when using this model. Reported heterogeneity measures are estimated
using the dl model.
dl DerSimonian-Laird (DL), the most commonly used random-effects model. Models
heterogeneity between the studies, i.e. assumes that the true effect can be different
for each study. The method assumes that the individual study true effects are
distributed with a variance $\tau^2$ around an ‘overall’ true effect, but makes no
assumptions about the form of the distribution of either the within- or between-study
effects. Reported heterogeneity measures are estimated using the dl model.
ml Maximum-likelihood (ML) random-effects model. Makes the additional assump-
tion (necessary to derive the log-likelihood function, and also true for reml and pl
below) that both the within-study and between-study effects have Normal distribu-
tions. The log-likelihood function is solved iteratively to produce an estimate of the
between-study variance. However, the method does not always converge while in
some cases the between-study variance estimate is negative and set to zero (in which
case the model is reduced to the fe model). Estimates are reported as missing in
the event of non-convergence. Reported heterogeneity measures are estimated using
the ml model.
reml Restricted maximum-likelihood (REML) random-effects model. Similar method
to ml and using the same assumptions. The log-likelihood function is maximized
iteratively to provide estimates as in ml. However, under reml only the part of
the likelihood function which is location invariant is maximized (i.e. maximizing
the portion of the likelihood that does not involve µ, if estimating $\tau^2$, and vice
versa). The method does not always converge while in some cases the between-study
variance estimate is negative and set to zero (in which case the model is reduced to
the fe model). Estimates are reported as missing in the event of non-convergence.
Reported heterogeneity measures are estimated using the reml model.
pl Profile-likelihood (PL) random-effects model. Profile likelihood uses the same like-
lihood function as ml, but takes into account the uncertainty associated with the
between-study variance estimate when calculating an overall effect, by using nested
iterations to converge to a maximum. The confidence intervals provided by the
method are asymmetric and hence so is the diamond in the forest plot. However,
the method does not always converge. Values that were not computed are reported
as missing. Reported heterogeneity measures are estimated using the ml model,
since $\hat{\mu}$ and $\hat{\tau}^2$, the effect and between-study variance estimates, are the same (only
their confidence intervals are re-estimated). The method also provides a confidence
interval for the between-study variance estimate.
pe Permutations (PE) random-effects model. A non-parametric random-effects method
which utilises dl and does not assume a normal distribution for the random effects.
The confidence interval provided by the method is asymmetric and hence so is the
diamond in the forest plot. Reported heterogeneity measures are estimated using
the dl model.
varc Informs the program that the study effect variation variable varname2 holds vari-
ance values. If this option is omitted the program assumes the variable contains
standard error values (the default).
label(varname) Selects labels for the studies. Up to two variables can be selected and
converted to strings. If two variables are selected they will be separated by a comma.
Usually, the author names and the year of study are selected as labels. The final
string is truncated to 20 characters.
forest Requests a forest plot. The weights from the specified analysis are used for
plotting symbol sizes (PE uses DL weights).
forestw(#) Requests a forest plot with adjusted weight ratios for better display. The
value can be in the [1,50] range. For example, if the largest to smallest weight ratio
is 60 and the graph looks awkward, the user can use this option to improve the
appearance by requesting that the weights be rescaled to a largest/smallest weight
ratio of 30. Only the weight squares in the plot are affected, not the model; the
confidence intervals in the plot are also unaffected.
plplot(string) Requests a plot of the likelihood function for the average effect or
between-study variance estimate of the ml, pl or reml models. Option plplot(mu)
fixes the average effect parameter to its model estimate, in the likelihood function,
and creates a two-way plot of $\tau^2$ vs the likelihood function. Option plplot(tsq)
fixes the between-study variance to its model estimate, in the likelihood function,
and creates a two-way plot of µ vs the likelihood function.
2.3 Saved results
metaan saves the following scalar results (some varying by selected method) in r():

All methods
    r(Hsq)            heterogeneity measure $H^2_M$
    r(Isq)            heterogeneity measure $I^2$
    r(Q)              Cochran’s Q value
    r(Qpval)          p-value for Cochran’s Q
    r(df)             degrees of freedom
    r(effvar)         effect variance
    r(eff)            effect size
    r(efflo)          effect size, lower 95% CI
    r(effup)          effect size, upper 95% CI
fe, dl methods
    r(tausq_dl)       $\hat{\tau}^2$, from the DL method
ml method
    r(tausq_dl)       $\hat{\tau}^2$, from the DL method
    r(tausq_ml)       $\hat{\tau}^2$, from the ML method
    r(conv_ml)        ML convergence information
reml method
    r(tausq_dl)       $\hat{\tau}^2$, from the DL method
    r(tausq_reml)     $\hat{\tau}^2$, from the REML method
    r(conv_reml)      REML convergence information
pl method
    r(tausq_dl)       $\hat{\tau}^2$, from the DL method
    r(tausq_pl)       $\hat{\tau}^2$, from the PL method
    r(tausqlo_pl)     $\hat{\tau}^2$ (PL), lower 95% CI
    r(tausqup_pl)     $\hat{\tau}^2$ (PL), upper 95% CI
    r(cloeff_pl)      convergence information, PL effect size (lower CI)
    r(cupeff_pl)      convergence information, PL effect size (upper CI)
    r(ctausqlo_pl)    convergence information, PL $\hat{\tau}^2$ (lower CI)
    r(ctausqup_pl)    convergence information, PL $\hat{\tau}^2$ (upper CI)
    r(conv_ml)        ML convergence information
pe method
    r(tausq_dl)       $\hat{\tau}^2$, from the DL method
    r(exec_pe)        information on PE execution

In each case, heterogeneity measures $H^2_M$ and $I^2$ are computed using the returned
between-variance estimate $\hat{\tau}^2$. Convergence (and PE execution) information is returned
as 1 if successful and as 0 otherwise. r(effvar) cannot be computed for PE. r(effvar)
is the same for ML and PL, but for PL the confidence intervals are ‘amended’ to take
into account the $\hat{\tau}^2$ uncertainty.
2.4 Methods
The metaan command offers six meta-analysis methods for calculating a mean effect
estimate and its confidence intervals: fixed-effect model (FE), random-effects DerSimo-
nian & Laird method (DL), maximum-likelihood random-effects model (ML), restricted
maximum-likelihood random-effects model (REML), profile-likelihood random-effects
model (PL) and permutations method utilising a DL random-effects model (PE). Mod-
els of the random-effects family take into account the identified between-study variation,
estimate it and usually produce wider confidence intervals for the overall effect than a
fixed-effect analysis. Brief descriptions of the methods have been provided in section 2.2.
In this section, we will provide a few more details and practical advice in selecting be-
tween the methods. Their complexity prohibits complete descriptions in this paper and
users wishing to look into method details are encouraged to refer to the original papers
which have described them (DerSimonian and Laird 1986; Hardy and Thompson 1996;
Follmann and Proschan 1999; Brockwell and Gordon 2001).
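
As a rough illustration of the two simplest models, the following Python sketch computes the fixed-effect and DerSimonian-Laird pooled estimates for the seven studies listed in section 2.5. The helper names `pooled` and `dl_tau2` are hypothetical, the moment estimator is the textbook DerSimonian and Laird (1986) form, and nothing here is taken from the metaan source code.

```python
import math

def pooled(effects, ses, tau2=0.0):
    """Inverse-variance weighted mean and its standard error for a given tau^2."""
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    return mean, math.sqrt(1.0 / total)

def dl_tau2(effects, ses):
    """DerSimonian-Laird moment estimate of the between-study variance."""
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    mu_fe = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - mu_fe) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)  # negative values are set to zero

# Effect sizes and standard errors from the example dataset in section 2.5.
effects = [-0.3041526, 0.2124063, 0.0444239, -0.3991309,
           -0.9374746, -0.3098185, -0.4898825]
ses = [0.0958199, 0.0812414, 0.090661, 0.12079,
       0.0691572, 0.206331, 0.2001602]

tau2_dl = dl_tau2(effects, ses)
fe_mean, fe_se = pooled(effects, ses)           # fixed-effect model (tau^2 = 0)
dl_mean, dl_se = pooled(effects, ses, tau2_dl)  # DL random-effects model
print(f"FE: {fe_mean:.3f} (SE {fe_se:.3f})")
print(f"DL: {dl_mean:.3f} (SE {dl_se:.3f}), tau^2 = {tau2_dl:.3f}")
```

Whenever the estimated $\tau^2$ is positive, every study weight shrinks and the pooled standard error grows, which is why random-effects intervals are usually wider than fixed-effect ones.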
The three maximum likelihood methods are iterative and usually computationally
expensive. ML and PL derive the µ (overall effect) and $\tau^2$ estimates by maximizing the
log-likelihood function in (1), under different conditions. REML estimates $\tau^2$ and µ by
maximizing the restricted log-likelihood function in (2).
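$$
\log L(\mu, \tau^2) = -\frac{1}{2} \left[ \sum_{i=1}^{k} \log\left(2\pi(\hat{\sigma}_i^2 + \tau^2)\right) + \sum_{i=1}^{k} \frac{(\hat{y}_i - \mu)^2}{\hat{\sigma}_i^2 + \tau^2} \right], \quad \mu \in \mathbb{R} \;\&\; \tau^2 \ge 0 \tag{1}
$$

$$
\log L'(\mu, \tau^2) = -\frac{1}{2} \left[ \sum_{i=1}^{k} \log\left(2\pi(\hat{\sigma}_i^2 + \tau^2)\right) + \sum_{i=1}^{k} \frac{(\hat{y}_i - \hat{\mu})^2}{\hat{\sigma}_i^2 + \tau^2} \right] - \frac{1}{2} \log \sum_{i=1}^{k} \frac{1}{\hat{\sigma}_i^2 + \tau^2}, \quad \hat{\mu} \in \mathbb{R} \;\&\; \tau^2 \ge 0 \tag{2}
$$

where k is the number of studies to be meta-analysed, $\hat{y}_i$ and $\hat{\sigma}_i^2$ are the effect and
variance estimates for study i and $\hat{\mu}$ is the overall effect estimate.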
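
The two functions can be written down almost literally. The Python sketch below is only an illustration: the function names are hypothetical, and it assumes that the overall effect estimate in (2) is the inverse-variance weighted mean implied by the current value of $\tau^2$, which the text does not state explicitly. The data are the seven studies of section 2.5, evaluated at the estimates reported there.

```python
import math

def log_lik(mu, tau2, effects, variances):
    """Log-likelihood (1) for overall effect mu and between-study variance tau2."""
    total = 0.0
    for y, v in zip(effects, variances):
        total += math.log(2 * math.pi * (v + tau2)) + (y - mu) ** 2 / (v + tau2)
    return -0.5 * total

def restricted_log_lik(tau2, effects, variances):
    """Restricted log-likelihood (2).

    Assumption: mu-hat is the weighted mean with weights 1 / (variance + tau2).
    """
    weights = [1.0 / (v + tau2) for v in variances]
    mu_hat = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return log_lik(mu_hat, tau2, effects, variances) - 0.5 * math.log(sum(weights))

# Example data from section 2.5 (variances are squared standard errors).
effects = [-0.3041526, 0.2124063, 0.0444239, -0.3991309,
           -0.9374746, -0.3098185, -0.4898825]
variances = [se ** 2 for se in (0.0958199, 0.0812414, 0.090661, 0.12079,
                                0.0691572, 0.206331, 0.2001602)]

ll_example = log_lik(-0.308, 0.121, effects, variances)
rll_example = restricted_log_lik(0.121, effects, variances)
print(f"log-likelihood at the reported estimates: {ll_example:.3f}")
print(f"restricted log-likelihood at tau^2 = 0.121: {rll_example:.3f}")
```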
Maximum likelihood follows the simplest approach, maximizing (1) in a single itera-
tion loop. A criticism of ML is that it takes no account of the loss in degrees of freedom
that results from estimating the overall effect. Restricted Maximum Likelihood derives
the likelihood function in a way that adjusts for this and removes downward bias in
the between-studies variance estimator. A useful description for REML, in the meta-
analysis context, has been provided by Normand (1999). Profile likelihood uses the
same likelihood function as ML, but takes into account the uncertainty associated with
the between-study variance estimate when calculating an overall effect, through the use
of nested iterations to converge to a maximum. By incorporating this extra factor
of uncertainty, PL yields confidence intervals that are usually wider than for DL and
also asymmetric. PL has been shown to outperform DL in various scenarios (Brockwell
and Gordon 2001).
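
To make the single iteration loop concrete, here is a minimal Python sketch of one common fixed-point scheme for the ML and REML estimating equations. It is an assumption that this resembles what metaan does internally; the function names, starting value, tolerance and iteration limit are all hypothetical. The REML branch adds the term 1 / (sum of weights), which is the adjustment for the degree of freedom lost by estimating the overall effect.

```python
def weighted_mean(effects, variances, tau2):
    """Inverse-variance weighted mean for a given between-study variance."""
    w = [1.0 / (v + tau2) for v in variances]
    return sum(wi * y for wi, y in zip(w, effects)) / sum(w)

def fit(effects, variances, restricted=False, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for tau^2; returns (mu, tau2, converged)."""
    tau2 = 0.0
    for _ in range(max_iter):
        w = [1.0 / (v + tau2) for v in variances]
        mu = weighted_mean(effects, variances, tau2)
        update = sum(wi ** 2 * ((y - mu) ** 2 - v)
                     for wi, y, v in zip(w, effects, variances))
        update /= sum(wi ** 2 for wi in w)
        if restricted:
            update += 1.0 / sum(w)  # REML adjustment for estimating mu
        update = max(0.0, update)   # a negative estimate is set to zero
        if abs(update - tau2) < tol:
            return weighted_mean(effects, variances, update), update, True
        tau2 = update
    return weighted_mean(effects, variances, tau2), tau2, False

# Example data from section 2.5 (variances are squared standard errors).
effects = [-0.3041526, 0.2124063, 0.0444239, -0.3991309,
           -0.9374746, -0.3098185, -0.4898825]
variances = [se ** 2 for se in (0.0958199, 0.0812414, 0.090661, 0.12079,
                                0.0691572, 0.206331, 0.2001602)]

mu_ml, tau2_ml, ok_ml = fit(effects, variances)
mu_reml, tau2_reml, ok_reml = fit(effects, variances, restricted=True)
print(f"ML:   mu = {mu_ml:.3f}, tau^2 = {tau2_ml:.3f}, converged = {ok_ml}")
print(f"REML: mu = {mu_reml:.3f}, tau^2 = {tau2_reml:.3f}, converged = {ok_reml}")
```

The ML values can be compared with the overall effect and tau^2 estimate shown in the example output of section 2.5; the REML estimate of $\tau^2$ is larger, in line with the downward bias of ML described above.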
The PE method (Follmann and Proschan 1999) can be described as follows. First, in
line with a null hypothesis that all true study effects are zero and observed effects are due
to random variation, a dataset of all possible combinations of observed study outcomes
is created by permuting the sign of each observed effect. Next, the dl method is used
to compute an overall effect for each combination. Finally, the resulting distribution of
overall effect sizes is used to derive a confidence interval for the observed overall effect.
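
The first two steps can be sketched in a few lines of Python. This is an illustration only: `dl_pooled` is a hypothetical helper using the textbook DL estimator, and the last step shown is a plain two-sided permutation p-value rather than the confidence-interval construction of Follmann and Proschan (1999), which is not reproduced here. With seven studies there are 2^7 = 128 sign combinations, so full enumeration is cheap.

```python
from itertools import product

def dl_pooled(effects, variances):
    """Pooled effect under the DerSimonian-Laird random-effects model."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mu_fe = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - mu_fe) ** 2 for wi, y in zip(w, effects))
    tau2 = max(0.0, (q - (len(effects) - 1)) / (sw - sum(wi ** 2 for wi in w) / sw))
    w_re = [1.0 / (v + tau2) for v in variances]
    return sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)

# Example data from section 2.5 (variances are squared standard errors).
effects = [-0.3041526, 0.2124063, 0.0444239, -0.3991309,
           -0.9374746, -0.3098185, -0.4898825]
variances = [se ** 2 for se in (0.0958199, 0.0812414, 0.090661, 0.12079,
                                0.0691572, 0.206331, 0.2001602)]

observed = dl_pooled(effects, variances)

# Step 1: every combination of signs; step 2: DL pooled effect for each one.
permuted = [dl_pooled([s * y for s, y in zip(signs, effects)], variances)
            for signs in product((1, -1), repeat=len(effects))]

# Share of sign combinations at least as extreme as the observed pooled effect.
extreme = sum(1 for m in permuted if abs(m) >= abs(observed) - 1e-12)
p_value = extreme / len(permuted)
print(f"observed DL effect = {observed:.3f}, permutation p-value = {p_value:.3f}")
```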
Method performance is known to be affected by three factors: the number of studies
in the meta-analysis, the degree of heterogeneity in true effects and, provided there is
heterogeneity present, the distribution of the true effects (Brockwell and Gordon 2001).
Heterogeneity, which is attributed to clinical and/or methodological diversity, is a
major problem researchers have to face when combining study results in a meta-analysis
(Higgins and Green 2006). The variability that arises from different interventions,
populations, outcomes or follow-up times is described by clinical heterogeneity, while
differences in trial design and quality are accounted for by methodological heterogeneity
(Thompson 1994). Traditionally, heterogeneity is tested with Cochran’s Q, which
provides a p-value for the test of homogeneity when compared with a $\chi^2_{k-1}$ distribution
(Brockwell and Gordon 2001), where k is the number of studies. However, the test is
known to be poor at detecting heterogeneity since its power is low when the number of
studies is small (Hardy and Thompson 1998). An alternative measure is $I^2$, which is
thought to be more informative in assessing inconsistency between studies, with values
of 25%, 50% and 75% corresponding to low, moderate and high heterogeneity respectively
(Higgins et al. 2003). Another measure is $H^2_M$, the measure least affected by the
value of k, taking values in the [0, +∞) range with 0 indicating perfect homogeneity
(Mittlbock and Heinzl 2006). The between-study variance estimate $\hat{\tau}^2$ can
also be informative about the presence or absence of heterogeneity.
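
The measures can be illustrated with a short Python sketch. The formulas used for $I^2$ and $H^2_M$ are an assumption: both are expressed through a supplied $\tau^2$ and a 'typical' within-study variance, a formulation that reduces to the usual (Q - df)/Q and (Q - df)/df when the DL estimate is plugged in. The text only states that the measures are computed from the returned estimate, so metaan may differ in detail. The function name is hypothetical and the p-value for Q is omitted to keep the sketch free of external libraries.

```python
def heterogeneity(effects, variances, tau2):
    """Return Cochran's Q, its degrees of freedom, I^2 (in %) and H^2_M."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mu_fe = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - mu_fe) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    # 'Typical' within-study variance (assumed formulation).
    s2 = df * sw / (sw ** 2 - sum(wi ** 2 for wi in w))
    h2m = tau2 / s2
    i2 = 100.0 * tau2 / (tau2 + s2)
    return q, df, i2, h2m

# Example data from section 2.5 (variances are squared standard errors).
effects = [-0.3041526, 0.2124063, 0.0444239, -0.3991309,
           -0.9374746, -0.3098185, -0.4898825]
variances = [se ** 2 for se in (0.0958199, 0.0812414, 0.090661, 0.12079,
                                0.0691572, 0.206331, 0.2001602)]

# tau^2 = 0.121 is the rounded ML estimate shown in the example output.
q, df, i2, h2m = heterogeneity(effects, variances, 0.121)
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.2f}%, H^2_M = {h2m:.2f}")
```

Fed with the rounded ML estimate, the sketch gives values close to those in the example output of section 2.5; small differences are expected because 0.121 is itself rounded.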
The test for heterogeneity is often used as the basis for applying a fixed-effect or
a random-effects model. However, the often low power of the Q test makes it unwise
to base a decision on the result of the test alone. Research studies, even on the same
topic, can vary on a large number of factors, hence homogeneity is often an unlikely
assumption and some degree of variability between studies is to be expected (Thompson
and Pocock 1991). Some authors recommend the adoption of a random-effects model,
unless there are compelling reasons for doing otherwise, irrespective of the outcome of
the test for heterogeneity (Brockwell and Gordon 2001).
However, even though random-effects methods model heterogeneity, the performance
of the maximum likelihood methods (ML, REML and PL) in situations where the true
effects violate the assumptions of a Normal distribution may not be optimal (Brockwell
and Gordon 2001; Hardy and Thompson 1998; Bohning et al. 2002; Sidik and Jonkman
2007). The number of studies in the analysis is also an issue, since most meta-analysis
methods (including DL, ML, REML, PL, but not PE) are only asymptotically correct:
i.e. they provide the theoretical 95% coverage only as the number of studies increases
(approaches infinity). Method performance is therefore affected when the number of
studies is small, but the extent depends on the method (some are more susceptible),
along with the degree of heterogeneity and the distribution of the true effects (Brockwell
and Gordon 2001).
2.5 Example
As an example, we apply the metaan command to health risk outcome data from seven
studies. The information was collected for an unpublished meta-analysis and the data
is available from the authors. Using describe and list commands we provide details
of the dataset and proceed to perform a univariate meta-analysis with metaan.
. use metaan_example.dta,

. describe

Contains data from metaan_example.dta
  obs:             7
 vars:             4                          19 Apr 2010 12:19
 size:           532 (99.9% of memory free)

               storage  display     value
variable name  type     format      label      variable label
study          str16    %16s                   First author and year
outcome        str48    %35s                   Outcome description
effsize        float    %9.0g                  effect sizes
se             float    %9.0g                  SE of the effect sizes

Sorted by: study outcome

. list study outcome effsize se, noobs clean

study              outcome                        effsize         se
Bakx A, 1985       Serum cholesterol (mmol/L)   -.3041526   .0958199
Campbell A, 1998   Diet                          .2124063   .0812414
Cupples, 1994      BMI                           .0444239    .090661
Eckerlund          SBP                          -.3991309     .12079
Moher, 2001        Cholesterol (mmol/l)         -.9374746   .0691572
Woolard A, 1995    Alcohol intake (g/week)      -.3098185    .206331
Woolard B, 1995    Alcohol intake (g/week)      -.4898825   .2001602

. metaan effsize se, pl label(study) forest

Profile Likelihood method selected

Study                  Effect   [95% Conf. Interval]   % Weight
Bakx A, 1985           -0.304     -0.492     -0.116       15.09
Campbell A, 1998        0.212      0.053      0.372       15.40
Cupples, 1994           0.044     -0.133      0.222       15.20
Eckerlund              -0.399     -0.636     -0.162       14.49
Moher, 2001            -0.937     -1.073     -0.802       15.62
Woolard A, 1995        -0.310     -0.714      0.095       12.01
Woolard B, 1995        -0.490     -0.882     -0.098       12.19
Overall effect (pl)    -0.308     -0.622      0.004      100.00

ML method succesfully converged
PL method succesfully converged for both upper and lower CI limits

Heterogeneity Measures
                 value     df    p-value
Cochrane Q      139.81      6      0.000
I^2 (%)          91.96
H^2              11.44

                 value   [95% Conf. Interval]
tau^2 est        0.121      0.000      0.449

Estimate obtained with Maximum likelihood - Profile likelihood provides the CI
PL method succesfully converged for both upper and lower CI limits of the tau
> estimate

The PL method used in the example converged successfully, as did ML whose convergence
is a prerequisite. The overall effect is not statistically significant at the 5% level
(the 95% confidence interval includes zero) and there is considerable heterogeneity
across studies, according to the measures. The method also displays a 95% confidence
interval for the between-study variance estimate $\hat{\tau}^2$ (provided convergence is
achieved, as is the case in this example). The forest plot created by the command is
displayed in Figure 1.
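
The asymmetric interval for the overall effect can be approximated with a short Python sketch of a profile-likelihood interval. Several things here are assumptions rather than documented metaan behaviour: the interval is taken as the set of µ values whose profile log-likelihood lies within half the 95% point of a chi-square distribution with one degree of freedom of the maximum, the inner maximisation over $\tau^2$ uses a simple fixed-point iteration, and the limits are located by bisection within 2 units of the estimate. All function names are hypothetical.

```python
import math

# Example data from the list output above (variances are squared standard errors).
effects = [-0.3041526, 0.2124063, 0.0444239, -0.3991309,
           -0.9374746, -0.3098185, -0.4898825]
variances = [se ** 2 for se in (0.0958199, 0.0812414, 0.090661, 0.12079,
                                0.0691572, 0.206331, 0.2001602)]
CUTOFF = 3.841459 / 2  # half the 95% point of a chi-square with 1 df

def log_lik(mu, tau2):
    return -0.5 * sum(math.log(2 * math.pi * (v + tau2)) + (y - mu) ** 2 / (v + tau2)
                      for y, v in zip(effects, variances))

def tau2_given_mu(mu, tol=1e-12, max_iter=1000):
    """Inner iteration: maximise the log-likelihood over tau2 with mu held fixed."""
    tau2 = 0.0
    for _ in range(max_iter):
        w = [1.0 / (v + tau2) for v in variances]
        new = sum(wi ** 2 * ((y - mu) ** 2 - v) for wi, y, v in zip(w, effects, variances))
        new = max(0.0, new / sum(wi ** 2 for wi in w))
        if abs(new - tau2) < tol:
            return new
        tau2 = new
    return tau2

def profile(mu):
    return log_lik(mu, tau2_given_mu(mu))

def ml_estimate(tol=1e-12, max_iter=1000):
    """Outer iteration: alternate between the weighted mean and the tau2 update."""
    tau2 = 0.0
    for _ in range(max_iter):
        w = [1.0 / (v + tau2) for v in variances]
        mu = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
        new = tau2_given_mu(mu)
        if abs(new - tau2) < tol:
            break
        tau2 = new
    return mu, new

def bound(inside, outside, target, steps=60):
    """Bisect between a point inside the interval and a point outside it."""
    for _ in range(steps):
        mid = 0.5 * (inside + outside)
        if profile(mid) >= target:
            inside = mid
        else:
            outside = mid
    return inside

mu_hat, tau2_hat = ml_estimate()
target = log_lik(mu_hat, tau2_hat) - CUTOFF
lower = bound(mu_hat, mu_hat - 2.0, target)
upper = bound(mu_hat, mu_hat + 2.0, target)
print(f"mu = {mu_hat:.3f}, 95% profile-likelihood interval: {lower:.3f} to {upper:.3f}")
```

The printed limits can be compared with the 'Overall effect (pl)' row of the output above. Because $\tau^2$ is re-estimated at every candidate value of µ, the interval need not be symmetric around the estimate.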
[Figure 1 (forest plot): rows for the seven studies and the overall effect (pl); horizontal axis "Effect sizes and CIs", from −1 to .5; vertical axis "Studies". Plot note: Original weights (squares) displayed. Largest to smallest ratio: 1.30]
Figure 1: Forest plot displaying profile-likelihood meta-analysis.
Re-executing the analysis with the plplot(mu) and plplot(tsq) options we obtain
the log-likelihood function plots (Figures 2 & 3).
[Figure 2 (likelihood plot for mu fixed to the ML/PL estimate): horizontal axis "tau² values", from 0 to .2; vertical axis "log-likelihood", from −10 to −2]
Figure 2: Log-likelihood function plot, µ fixed to the model estimate.
[Figure 3 (likelihood plot for tau² fixed to the ML/PL estimate): horizontal axis "mu values", from −1.5 to .5; vertical axis "log-likelihood", from −25 to 0]
Figure 3: Log-likelihood function plot, $\tau^2$ fixed to the model estimate.
3 Discussion
The metaan command can be a useful meta-analysis tool which includes newer and, in
certain circumstances, better performing methods than the standard DerSimonian-Laird
random-effects model. Unpublished results exploring method performance in various
scenarios are available from the authors. Future work will involve implementing more
methods in the metaan command and embellishing the forest plot.
4 Acknowledgments
We would like to thank the authors of meta and metan for all their work and the
anonymous reviewer whose useful comments improved the paper considerably.
5 References
Bohning, D., U. Malzahn, E. Dietz, P. Schlattmann, C. Viwatwongkasem, and A. Big-
geri. 2002. Some General Points in Estimating Heterogeneity Variance with the
DerSimonian-Laird Estimator. Biostatistics 3(4): 445–457.
Brockwell, S. E., and I. R. Gordon. 2001. A Comparison of Statistical Methods for
Meta-Analysis. Statistics in Medicine 20(6): 825–840.
DerSimonian, R., and N. Laird. 1986. Meta-Analysis in Clinical Trials. Controlled
Clinical Trials 7(3): 177–188.
Follmann, D. A., and M. A. Proschan. 1999. Valid Inference in Random Effects Meta-
Analysis. Biometrics 55(3): 732–737.
Harbord, R. M., and J. P. T. Higgins. 2008. Meta-regression in Stata. Stata Journal
8(4): 493–519.
Hardy, R. J., and S. G. Thompson. 1996. A Likelihood Approach to Meta-Analysis with
Random Effects. Statistics in Medicine 15(6): 619–629.
———. 1998. Detecting and Describing Heterogeneity in Meta-Analysis. Statistics in
Medicine 17(8): 841–856.
Harris, R., M. Bradburn, J. Deeks, R. Harbord, D. Altman, and J. Sterne. 2008. Metan:
Fixed- and Random-Effects Meta-Analysis. Stata Journal 8(1): 3–28.
Higgins, J. P., and S. Green. 2006. Cochrane Handbook for Systematic Reviews of
Interventions: Version 4.2.6.
http://www.cochrane.org/resources/handbook/Handbook4.2.6Sep2006.pdf.
———. 2008. Cochrane Handbook for Systematic Reviews of Interventions: Version
5.0.1. http://www.cochrane-handbook.org/.
Higgins, J. P., S. G. Thompson, J. J. Deeks, and D. G. Altman. 2003. Measuring
Inconsistency in Meta-Analyses. British Medical Journal 327(7414): 557–560.
Huque, M. F. 1988. Experiences with Meta-Analysis in NDA Submissions. Proceedings
of the Biopharmaceutical Section of the American Statistical Association 2: 28–33.
Kontopantelis, E., and D. Reeves. 2009. MetaEasy: A Meta-Analysis Add-In for Mi-
crosoft Excel. Journal of Statistical Software 30(7): 1–25.
Lambert, P. C., A. J. Sutton, K. R. Abrams, and D. R. Jones. 2002. A comparison
of summary patient-level covariates in meta-regression with individual patient data
meta-analysis. Journal of Clinical Epidemiology 55(1): 86–94.
Mittlbock, M., and H. Heinzl. 2006. A Simulation Study Comparing Properties of
Heterogeneity Measures in Meta-Analyses. Statistics in Medicine 25(24): 4321–4333.
Normand, S. T. 1999. Tutorial in biostatistics. Meta-analysis: formulating, evaluating,
combining, and reporting. Statistics in Medicine 18: 321–359.
Olkin, I., and A. Sampson. 1998. Comparison of meta-analysis versus analysis of variance
of individual patient data. Biometrics 54(1): 317–322.
Sidik, K., and J. N. Jonkman. 2007. A comparison of heterogeneity variance
estimators in combining results of studies. Statistics in Medicine 26(9): 1964–1981.
http://dx.doi.org/10.1002/sim.2688.
Stewart, L. A., and M. J. Clarke. 1995. Practical methodology of meta-analyses
(overviews) using updated individual patient data. Cochrane Working Group.
Statistics in Medicine 14(19): 2057–2079.
Thompson, S. G. 1994. Why Sources of Heterogeneity in Meta-Analysis Should be
Investigated. British Medical Journal 309(6965): 1351–1355.
Thompson, S. G., and S. J. Pocock. 1991. Can Meta-Analyses be Trusted? The Lancet
338(8775): 1127–1130.
Thompson, S. G., and S. J. Sharp. 1999. Explaining heterogeneity in meta-analysis: a
comparison of methods. Statistics in Medicine 18(20): 2693–2708.
White, I. R. 2009. Multivariate random-effects meta-analysis. Stata Journal 9(1): 40–
56.
About the authors
Evangelos (Evan) Kontopantelis is a research fellow in statistics at the National Primary Care
Research and Development Centre, University of Manchester, England. His research interests
include statistical methods in health sciences with a focus on meta-analysis, longitudinal data
modeling and large clinical database management.
David Reeves is a senior research fellow in statistics at the Health Sciences Primary Care
Research Group, University of Manchester, England. David has worked as a statistician in
health services research for nearly three decades, mainly in the fields of learning disability
and primary care. His methodological research interests include the robustness of statistical
methods, the analysis of observational studies, and applications of social network analysis
methods to health systems.