A substantial and confusing variation exists in handling of baseline covariates in randomized controlled trials: a review of trials published in leading medical journals.

Institute for Clinical Evaluative Sciences, G1 06, 2075 Bayview Avenue, Toronto, Ontario M4N 3M5, Canada.
Journal of clinical epidemiology (Impact Factor: 5.48). 09/2009; 63(2):142-53. DOI: 10.1016/j.jclinepi.2009.06.002
Source: PubMed

ABSTRACT Statisticians have criticized the use of significance testing to compare the distribution of baseline covariates between treatment groups in randomized controlled trials (RCTs). Furthermore, some have advocated for the use of regression adjustment to estimate the effect of treatment after adjusting for potential imbalances in prognostically important baseline covariates between treatment groups.
We examined 114 RCTs published in the New England Journal of Medicine, the Journal of the American Medical Association, The Lancet, and the British Medical Journal between January 1, 2007 and June 30, 2007.
Significance testing was used to compare baseline characteristics between treatment arms in 38% of the studies. The practice was very rare in British journals and more common in the U.S. journals. In 29% of the studies, the primary outcome was continuous, whereas in 65% of the studies, the primary outcome was either dichotomous or time-to-event in nature. Adjustment for baseline covariates was reported when estimating the treatment effect in 34% of the studies.
Our findings suggest the need for greater editorial consistency across journals in the reporting of RCTs. Furthermore, there is a need for greater debate about the relative merits of unadjusted vs. adjusted estimates of treatment effect.

Available from: Andrea Manca, Jun 30, 2015
  • Source
    ABSTRACT: The aim of this study was to quantify the effect of participation by New Zealand dairy farmers in a year-long extension programme designed to improve herd reproductive performance. This was estimated by comparing, over two successive years, the proportions of cows becoming pregnant during the first 6 weeks of the seasonal breeding programme (6 week in-calf rate) in herds involved in a full participation group (treatment), with herds in an actively monitored control group or a passively monitored control group. Possible interactions between treatment and various biophysical and socio-demographic factors were also assessed. Multivariable modelling was used to determine the effect of treatment on 6 week in-calf rate, adjusting for design factors (study year and region). It was estimated that the 6 week in-calf rate was 68% (95% confidence interval 67–69%) in the treatment group of farms that participated in the extension programme compared with 66% (95% confidence interval 65–67%) in the actively monitored control group of farms that did not participate in the extension programme (P = 0.05); thus the risk difference was 2.0% (95% confidence interval 0.0–3.9%). No significant interactions were found between treatment and region, study year or any of the biophysical and socio-demographic variables on the 6 week in-calf rate (P > 0.05). There was no significant difference in the 6 week in-calf rate between the actively and passively monitored control groups (P = 0.56). It was concluded that enrolment in the extension programme improved the 6 week in-calf rate, and that the treatment effect was not modified substantially by region, study year or any of the biophysical and socio-demographic variables assessed.
    The Veterinary Journal 01/2015; 203(2). DOI:10.1016/j.tvjl.2014.11.014 · 2.17 Impact Factor
  • Source
    ABSTRACT: In randomized controlled trials (RCTs), treatment assignment is unconfounded with baseline covariates, allowing outcomes to be directly compared between treatment arms. When outcomes are binary, the effect of treatment can be summarized using relative risks, absolute risk reductions and the number needed to treat (NNT). When outcomes are time-to-event in nature, the effect of treatment on the absolute reduction of the risk of an event occurring within a specified duration of follow-up and the associated NNT can be estimated. In observational studies of the effect of treatments on health outcomes, treatment is frequently confounded with baseline covariates. Regression adjustment is commonly used to estimate the adjusted effect of treatment on outcomes. We highlight several limitations of measures of treatment effect that are directly obtained from regression models. We illustrate how both regression-based approaches and propensity-score based approaches allow one to estimate the same measures of treatment effect as those that are commonly reported in RCTs. The CONSORT statement recommends that both relative and absolute measures of treatment effects be reported for RCTs with dichotomous outcomes. The methods described in this paper will allow for similar reporting in observational studies.
    The International Journal of Biostatistics 01/2011; 7(1):6-6. DOI:10.2202/1557-4679.1285 · 1.28 Impact Factor
  • Source
    ABSTRACT: Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
    The International Journal of Biostatistics 01/2010; 6(1):Article 16. DOI:10.2202/1557-4679.1195 · 1.28 Impact Factor
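The relative and absolute measures named in the second abstract above (relative risk, absolute risk reduction, and number needed to treat) can be computed directly from the event counts of a two-arm trial. The following is a minimal sketch only: the function name `binary_effect`, the `BinaryEffect` container, and the counts are hypothetical and are not taken from any of the studies listed here.

```python
from dataclasses import dataclass


@dataclass
class BinaryEffect:
    relative_risk: float
    absolute_risk_reduction: float
    number_needed_to_treat: float


def binary_effect(events_control, n_control, events_treated, n_treated):
    """Summarise a two-arm trial with a binary outcome (hypothetical helper)."""
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    if risk_control == 0:
        raise ValueError("relative risk is undefined when the control risk is zero")
    # Positive when the treated arm has the lower event risk.
    arr = risk_control - risk_treated
    # NNT is the reciprocal of the absolute risk reduction.
    nnt = float("inf") if arr == 0 else 1.0 / arr
    return BinaryEffect(risk_treated / risk_control, arr, nnt)


# Hypothetical counts chosen for illustration only.
effect = binary_effect(events_control=30, n_control=100,
                       events_treated=20, n_treated=100)
print(effect)
```

A negative absolute risk reduction gives a negative NNT, which is usually read as a number needed to harm. Confidence intervals are omitted here for brevity.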

Questions & Answers about this publication

  • Eik Vettorazzi added an answer in Systematic Reviews:
    What is the main advantage of using the "mean differences adjusted for baseline" over the commonly used "mean differences" to analyze effect size?
    My question specifically addresses continuous variables. When is the first method indicated over the second? When is it contraindicated? How is it calculated? Can I calculate it in RevMan?
    Eik Vettorazzi · University Medical Center Hamburg - Eppendorf
    Are both effect measures available in all studies? If so, then fortunately a lot has changed since the reviews of Altman and Doré (1990) and Austin et al. (2009). In that case, according to Stephen Senn (see above), the adjusted estimate is unbiased and more efficient.
  • Eik Vettorazzi added an answer in Statistical Analysis:
    Is there sense in using inferential statistics to assess baseline comparability in an RCT?
    Inferential statistics are essential for estimating likely population effects from sample data. But are they useful for comparing groups for baseline comparability?
    Eik Vettorazzi · University Medical Center Hamburg - Eppendorf
    To second Levente's nice comment, Austin et al. conducted a review of the usual practice in reporting RCTs, which found interesting country-specific approaches.

    I also find the article by Stephen Senn helpful.
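On the first question above ("How to calculate it?"), one common way to obtain a mean difference adjusted for baseline is an analysis of covariance: regress the follow-up value on a treatment indicator plus the baseline value, and read off the treatment coefficient. The sketch below uses simulated data and plain least squares. The helper `treatment_effect`, the variable names, and every number in the simulation are assumptions for illustration, not the method of any article mentioned in this thread.

```python
import numpy as np


def treatment_effect(y, columns):
    """OLS of y on an intercept plus `columns`.

    Returns (estimate, standard error) for the first entry of `columns`.
    """
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Residual variance with the usual degrees-of-freedom correction.
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1])


# Simulated trial (assumed values): true treatment effect of 2.0 and a
# baseline measurement that is strongly prognostic for the outcome.
rng = np.random.default_rng(0)
n = 200
baseline = rng.normal(50.0, 10.0, size=n)
treated = rng.permutation(np.repeat([0.0, 1.0], n // 2))  # 1:1 randomization
follow_up = 5.0 + 0.8 * baseline + 2.0 * treated + rng.normal(0.0, 5.0, size=n)

# Unadjusted: equivalent to the raw difference in follow-up means.
unadjusted, se_unadjusted = treatment_effect(follow_up, [treated])
# Adjusted: ANCOVA with the baseline value as a covariate.
adjusted, se_adjusted = treatment_effect(follow_up, [treated, baseline])

print(f"unadjusted: {unadjusted:.2f} (SE {se_unadjusted:.2f})")
print(f"adjusted:   {adjusted:.2f} (SE {se_adjusted:.2f})")
```

In this simulation both estimators target the same effect, and the adjusted one has the smaller standard error because the baseline explains much of the outcome variance. This matches the efficiency argument attributed to Senn above. The gain shrinks as the baseline becomes less prognostic.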