Performance of statistical methods for meta-analysis when true study effects are non-normally distributed: A comparison between DerSimonian-Laird and restricted maximum likelihood

Abstract

Comment on: Performance of statistical methods for meta-analysis when true study effects are non-normally distributed: A simulation study.
Evangelos Kontopantelis
National Primary Care Research and Development Centre
University of Manchester, Williamson Building, 5th floor
Oxford Road, M13 9PL, UK
e.kontopantelis@manchester.ac.uk

David Reeves
Health Sciences Primary Care Research Group
University of Manchester, Williamson Building, 5th floor
Oxford Road, M13 9PL, UK
In a recent paper we evaluated the performance of seven different methods for random-effects meta-analysis under various non-normal distributions for the effect sizes [1]. However, due to computational limitations we did not include the Restricted Maximum Likelihood (REML) estimator for the between-study variance. Lately, we have observed that the iterative REML approach has been increasingly replacing the non-iterative DerSimonian-Laird (DL) approach [2] as the method of choice in published meta-analyses. Jackson et al [4] examined the performance of the two methods in terms of coverage, for normally distributed effects only, and found that the results for the two methods were similar. However, REML requires the assumption that study effects are normally distributed, which DL does not, and so the two methods may differ more substantially when this assumption is violated.
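The non-iterative nature of DL is easiest to see in code. The following is a minimal Python sketch of the DL method-of-moments estimator; it is not the simulation code used in our study, and the study effects and variances in the example are made-up numbers.

```python
# Illustrative sketch of the non-iterative DerSimonian-Laird (DL) estimator.
# y holds the study effect estimates, v the within-study variances.

def dersimonian_laird(y, v):
    """Return (tau2, mu): DL between-study variance and pooled effect."""
    k = len(y)
    w = [1.0 / vi for vi in v]                               # fixed-effect weights
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)    # fixed-effect mean
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    # Method-of-moments estimate, truncated at zero (no iteration needed)
    denom = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom)
    w_re = [1.0 / (vi + tau2) for vi in v]                   # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return tau2, mu

# Made-up numbers for four studies
tau2, mu = dersimonian_laird([0.1, 0.9, -0.2, 0.6], [0.04, 0.05, 0.06, 0.05])
```

When Q falls below its degrees of freedom (k - 1), the estimate is truncated to zero and the pooled effect reduces to the fixed-effect mean.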
Using the same simulation method and scenarios as in our previous paper, we assessed the performance of REML in terms of coverage, power and overall effect estimation when effect sizes do not follow a normal distribution. REML is a computationally expensive iterative method (it took several months and a few computers to complete the simulations in STATA [3]) which estimates the between-study variance $\hat{\tau}^2$ and the overall effect $\hat{\mu}$ by maximising the restricted log-likelihood function:
\[
\log L_R(\hat{\mu}, \hat{\tau}^2) = -\frac{1}{2}\sum_{i=1}^{k}\left[\log\left\{2\pi\left(\hat{\tau}^2 + \hat{\sigma}_i^2\right)\right\} + \frac{(y_i - \hat{\mu})^2}{\hat{\tau}^2 + \hat{\sigma}_i^2}\right] - \frac{1}{2}\log\left[\sum_{i=1}^{k}\frac{1}{\hat{\tau}^2 + \hat{\sigma}_i^2}\right], \qquad \hat{\tau}^2 \ge 0 \qquad (1)
\]
where $k$ is the number of studies being combined, $y_i$ and $\hat{\sigma}_i^2$ are the effect and variance estimates for study $i$, and $\hat{\mu}$ is the overall effect estimate, with
\[
\hat{\mu} = \left[\sum_{i=1}^{k}\frac{y_i}{\hat{\tau}^2 + \hat{\sigma}_i^2}\right]\left[\sum_{i=1}^{k}\frac{1}{\hat{\tau}^2 + \hat{\sigma}_i^2}\right]^{-1}.
\]
Non-negativity for $\hat{\tau}^2$ must be enforced at each iteration, and iteration continues until convergence or until the maximum number of iterations is reached. REML is considered an improvement over Maximum Likelihood (ML) since it adjusts for the loss of degrees of freedom due to the estimation of the overall effect $\hat{\mu}$ [4]. Non-convergence is a possibility, although it was rare in our simulations (around 0.1%). The method has been implemented in the STATA command metaan [5].
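As a rough illustration of the iterative scheme described above (a sketch of the general idea, not the metaan implementation), the restricted log-likelihood of equation (1) can be maximised with the standard REML fixed-point update for $\hat{\tau}^2$, truncating at zero on each iteration; the input data below are made-up numbers.

```python
import math

def restricted_loglik(tau2, y, v):
    """Restricted log-likelihood of equation (1), with the overall effect
    profiled out as the weighted mean at the current tau^2."""
    s = [vi + tau2 for vi in v]
    mu = sum(yi / si for yi, si in zip(y, s)) / sum(1.0 / si for si in s)
    return (-0.5 * sum(math.log(2 * math.pi * si) + (yi - mu) ** 2 / si
                       for yi, si in zip(y, s))
            - 0.5 * math.log(sum(1.0 / si for si in s)))

def reml(y, v, max_iter=1000, tol=1e-10):
    """Iterative REML estimates of tau^2 and the overall effect, using the
    standard fixed-point update with non-negativity enforced each step."""
    k = len(y)
    ybar = sum(y) / k
    # crude non-negative starting value for tau^2
    tau2 = max(sum((yi - ybar) ** 2 for yi in y) / (k - 1) - sum(v) / k, 0.0)
    for _ in range(max_iter):
        w = [1.0 / (vi + tau2) for vi in v]
        mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        new = (sum(wi ** 2 * ((yi - mu) ** 2 - vi)
                   for wi, yi, vi in zip(w, y, v)) / sum(wi ** 2 for wi in w)
               + 1.0 / sum(w))
        new = max(new, 0.0)              # enforce non-negativity
        if abs(new - tau2) < tol:        # convergence check
            tau2 = new
            break
        tau2 = new
    w = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return tau2, mu

# Made-up numbers for four studies
tau2, mu = reml([0.1, 0.9, -0.2, 0.6], [0.04, 0.05, 0.06, 0.05])
```

On these toy data the fixed point converges in a handful of iterations; in practice the loop can exit at max_iter without converging, which is the non-convergence case noted above.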
Coverage, power and confidence interval estimation (the estimated confidence interval as a percentage of the interval based on the true between-study variance) for the REML method are presented in Table 1, with results for DL provided for comparison. The two methods performed very similarly across all scenarios. In terms of coverage, DL outperformed REML slightly, by a maximum of 2%, particularly when heterogeneity was low. As expected, the picture was reversed with regard to power, with REML performing slightly better (by a maximum of 2% for $\tau^2 = 0$). As heterogeneity and/or the number of studies increased, the two methods converged to almost identical results. However, power for DL caught up somewhat more quickly than coverage did for REML. Results for confidence interval estimation do not identify a clear 'winner': DL performed better (by a maximum of 2%) in cases of small or moderate heterogeneity, and more so with larger numbers of studies, while REML returned a slightly more accurate interval (by a maximum of 1%) in certain scenarios combining many studies with high heterogeneity. Although the form of the effect size distribution had some overall impact on performance, it did not alter the comparison of results between the methods.
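For readers unfamiliar with the coverage metric, a toy Monte Carlo makes it concrete: simulate many meta-analyses with a known overall effect and record how often the fitted model's 95% confidence interval contains it. The sketch below does this for DL with normally distributed true effects; all parameter values are illustrative choices, not one of the scenarios in our simulations.

```python
import math
import random

def dl_ci(y, v):
    """DerSimonian-Laird pooled effect with a 95% Wald confidence interval."""
    k = len(y)
    w = [1.0 / vi for vi in v]
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))
    tau2 = max(0.0, (q - (k - 1)) /
               (sum(w) - sum(wi ** 2 for wi in w) / sum(w)))
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return mu - 1.96 * se, mu + 1.96 * se

random.seed(1)
true_mu, tau2, k, reps = 0.3, 0.05, 10, 2000   # illustrative values
hits = 0
for _ in range(reps):
    v = [random.uniform(0.02, 0.1) for _ in range(k)]            # within-study variances
    y = [random.gauss(true_mu, math.sqrt(tau2 + vi)) for vi in v]  # normal true effects
    lo, hi = dl_ci(y, v)
    hits += lo <= true_mu <= hi
coverage = hits / reps   # proportion of CIs containing the true effect
```

Nominal coverage is 0.95; values persistently below that, as in the low-heterogeneity cells of Table 1, indicate intervals that are too narrow.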
In conclusion, it seems that REML's performance does not justify the extra level of complexity associated with the method. In general, DL performed just as well in most scenarios and scored marginally better in some. We stand by our earlier recommendation to meta-analysts to use either DerSimonian-Laird or Profile Likelihood, depending on the scenario and the requirements, as described in our paper.
References
1. Kontopantelis E, Reeves D. Performance of statistical methods for meta-analysis when true study effects are non-normally distributed: A simulation study. Stat Methods Med Res.
2. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986; 7(3): 177-88.
3. STATA Statistical Software for Windows: Release 10.0 [program]. College Station, TX: Stata Corporation, 2007.
4. Jackson D, Bowden J, Baker R. How does the DerSimonian and Laird procedure for random effects meta-analysis compare with its more efficient but harder to compute counterparts? J Stat Plan Inference 2010; 140(4): 961-970.
5. Kontopantelis E, Reeves D. metaan: random-effects meta-analysis. The Stata Journal 2010; 10(3): 395-407.
Table 1: Coverage, power (25th centile) and confidence interval estimation by degree of heterogeneity (Hᵢ²), between-study effect distribution (skew, kurtosis), and number of studies, assuming χ²(1)-based within-study variances.

Coverage
Hᵢ²    Distribution        2-5          6-15         16-25        26-35
                           REML  DL     REML  DL     REML  DL     REML  DL
1      None                0.95  0.96   0.94  0.96   0.94  0.96   0.95  0.96
1.18   Normal (0,3)        0.93  0.94   0.92  0.94   0.93  0.94   0.93  0.94
1.18   Skew-normal (1,4)   0.93  0.94   0.93  0.94   0.93  0.94   0.93  0.94
1.18   Skew-normal (2,9)   0.93  0.95   0.93  0.94   0.93  0.94   0.94  0.94
1.18   Uniform             0.93  0.94   0.92  0.94   0.93  0.94   0.93  0.94
1.18   Bimodal             0.93  0.94   0.92  0.94   0.93  0.94   0.93  0.94
1.18   D-spike             0.93  0.94   0.92  0.94   0.93  0.94   0.93  0.94
1.54   Normal (0,3)        0.90  0.91   0.91  0.92   0.92  0.92   0.93  0.93
1.54   Skew-normal (1,4)   0.90  0.91   0.91  0.92   0.92  0.92   0.93  0.93
1.54   Skew-normal (2,9)   0.91  0.92   0.91  0.92   0.92  0.93   0.92  0.93
1.54   Uniform             0.89  0.90   0.90  0.91   0.92  0.92   0.93  0.93
1.54   Bimodal             0.89  0.90   0.90  0.91   0.91  0.92   0.92  0.92
1.54   D-spike             0.89  0.90   0.89  0.90   0.91  0.91   0.92  0.92
2.78   Normal (0,3)        0.86  0.87   0.90  0.90   0.92  0.92   0.93  0.93
2.78   Skew-normal (1,4)   0.86  0.86   0.89  0.90   0.91  0.92   0.93  0.93
2.78   Skew-normal (2,9)   0.87  0.88   0.88  0.89   0.90  0.91   0.92  0.92
2.78   Uniform             0.84  0.85   0.89  0.89   0.92  0.92   0.93  0.93
2.78   Bimodal             0.82  0.83   0.88  0.88   0.92  0.92   0.93  0.93
2.78   D-spike             0.80  0.81   0.87  0.87   0.92  0.92   0.93  0.93

Power (25th centile)
Hᵢ²    Distribution        2-5          6-15         16-25        26-35
                           REML  DL     REML  DL     REML  DL     REML  DL
1      None                0.30  0.29   0.69  0.67   0.93  0.92   0.99  0.99
1.18   Normal (0,3)        0.31  0.29   0.66  0.65   0.91  0.90   0.98  0.98
1.18   Skew-normal (1,4)   0.29  0.28   0.66  0.65   0.91  0.90   0.98  0.98
1.18   Skew-normal (2,9)   0.29  0.28   0.66  0.65   0.92  0.91   0.98  0.98
1.18   Uniform             0.29  0.28   0.66  0.65   0.91  0.90   0.98  0.98
1.18   Bimodal             0.30  0.29   0.66  0.65   0.91  0.90   0.98  0.98
1.18   D-spike             0.30  0.29   0.65  0.64   0.91  0.90   0.98  0.98
1.54   Normal (0,3)        0.33  0.32   0.65  0.65   0.90  0.90   0.97  0.97
1.54   Skew-normal (1,4)   0.31  0.30   0.66  0.65   0.91  0.91   0.98  0.98
1.54   Skew-normal (2,9)   0.29  0.28   0.64  0.63   0.90  0.89   0.98  0.98
1.54   Uniform             0.32  0.31   0.65  0.64   0.89  0.89   0.98  0.98
1.54   Bimodal             0.33  0.33   0.67  0.67   0.91  0.91   0.98  0.98
1.54   D-spike             0.32  0.32   0.67  0.67   0.91  0.91   0.98  0.98
2.78   Normal (0,3)        0.35  0.35   0.64  0.64   0.87  0.87   0.96  0.96
2.78   Skew-normal (1,4)   0.32  0.32   0.64  0.63   0.90  0.90   0.98  0.98
2.78   Skew-normal (2,9)   0.30  0.30   0.64  0.63   0.90  0.90   0.98  0.98
2.78   Uniform             0.36  0.36   0.66  0.66   0.91  0.91   0.98  0.98
2.78   Bimodal             0.36  0.36   0.67  0.67   0.92  0.92   0.99  0.99
2.78   D-spike             0.34  0.34   0.66  0.67   0.92  0.92   0.99  0.99

Confidence interval estimation
Hᵢ²    Distribution        2-5          6-15         16-25        26-35
                           REML  DL     REML  DL     REML  DL     REML  DL
1      None                1.00  1.00   1.00  1.00   1.00  1.00   1.00  1.00
1.18   Normal (0,3)        0.96  0.96   0.94  0.95   0.96  0.97   0.97  0.98
1.18   Skew-normal (1,4)   0.96  0.96   0.94  0.95   0.95  0.97   0.97  0.98
1.18   Skew-normal (2,9)   0.96  0.96   0.94  0.95   0.94  0.96   0.96  0.97
1.18   Uniform             0.96  0.96   0.94  0.95   0.96  0.97   0.97  0.98
1.18   Bimodal             0.96  0.96   0.95  0.95   0.96  0.97   0.97  0.98
1.18   D-spike             1.00  1.00   1.06  1.07   1.13  1.13   1.14  1.15
1.54   Normal (0,3)        0.90  0.91   0.93  0.94   0.97  0.97   0.98  0.98
1.54   Skew-normal (1,4)   0.90  0.91   0.93  0.94   0.96  0.97   0.97  0.98
1.54   Skew-normal (2,9)   0.90  0.90   0.90  0.92   0.94  0.95   0.95  0.96
1.54   Uniform             0.91  0.91   0.94  0.95   0.97  0.98   0.98  0.99
1.54   Bimodal             0.91  0.91   0.94  0.95   0.98  0.98   0.99  0.99
1.54   D-spike             1.04  1.06   1.29  1.30   1.37  1.37   1.38  1.39
2.78   Normal (0,3)        0.85  0.85   0.94  0.94   0.98  0.97   0.98  0.98
2.78   Skew-normal (1,4)   0.84  0.84   0.93  0.93   0.97  0.96   0.98  0.97
2.78   Skew-normal (2,9)   0.80  0.81   0.89  0.88   0.93  0.92   0.95  0.94
2.78   Uniform             0.86  0.86   0.96  0.96   0.99  0.98   0.99  0.99
2.78   Bimodal             0.87  0.87   0.97  0.97   0.99  0.99   0.99  0.99
2.78   D-spike             1.35  1.36   1.80  1.80   1.87  1.87   1.89  1.89