A substantial and confusing variation exists in handling of baseline covariates in randomized controlled trials: a review of trials published in leading medical journals.

Institute for Clinical Evaluative Sciences, G1 06, 2075 Bayview Avenue, Toronto, Ontario M4N 3M5, Canada.
Journal of clinical epidemiology (Impact Factor: 5.48). 09/2009; 63(2):142-53. DOI: 10.1016/j.jclinepi.2009.06.002
Source: PubMed

ABSTRACT Statisticians have criticized the use of significance testing to compare the distribution of baseline covariates between treatment groups in randomized controlled trials (RCTs). Furthermore, some have advocated for the use of regression adjustment to estimate the effect of treatment after adjusting for potential imbalances in prognostically important baseline covariates between treatment groups.
We examined 114 RCTs published in the New England Journal of Medicine, the Journal of the American Medical Association, The Lancet, and the British Medical Journal between January 1, 2007 and June 30, 2007.
Significance testing was used to compare baseline characteristics between treatment arms in 38% of the studies. The practice was very rare in British journals and more common in the U.S. journals. In 29% of the studies, the primary outcome was continuous, whereas in 65% of the studies, the primary outcome was either dichotomous or time-to-event in nature. Adjustment for baseline covariates was reported when estimating the treatment effect in 34% of the studies.
Our findings suggest the need for greater editorial consistency across journals in the reporting of RCTs. Furthermore, there is a need for greater debate about the relative merits of unadjusted vs. adjusted estimates of treatment effect.
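
To make the contrast between unadjusted and adjusted estimates concrete, the following is a minimal sketch on simulated data, not an analysis from the review. It assumes a continuous outcome, a single prognostic baseline covariate, and a true treatment effect of 1.0; the function names `simulate_trial` and `ols`, the sample size, and the coefficients are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_trial(n=2000, effect=1.0, seed=0):
    """Simulate a two-arm RCT with one prognostic baseline covariate (assumed setup)."""
    rng = np.random.default_rng(seed)
    treat = rng.integers(0, 2, size=n)      # randomized 1:1 allocation
    baseline = rng.normal(size=n)           # baseline covariate, independent of arm
    outcome = effect * treat + 2.0 * baseline + rng.normal(size=n)
    return treat, baseline, outcome

def ols(X, y):
    """Ordinary least squares: return coefficients and their standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the coefficient estimates
    return beta, np.sqrt(np.diag(cov))

treat, baseline, outcome = simulate_trial()
ones = np.ones_like(outcome)

# Unadjusted: outcome regressed on treatment only (a difference in means).
b_unadj, se_unadj = ols(np.column_stack([ones, treat]), outcome)

# Adjusted: outcome regressed on treatment plus the baseline covariate.
b_adj, se_adj = ols(np.column_stack([ones, treat, baseline]), outcome)

print(f"unadjusted effect: {b_unadj[1]:.3f} (SE {se_unadj[1]:.3f})")
print(f"adjusted effect:   {b_adj[1]:.3f} (SE {se_adj[1]:.3f})")
```

In this linear setting both estimates target the same treatment effect, and the adjusted one typically has the smaller standard error because the covariate explains outcome variance. For dichotomous or time-to-event outcomes, adjusted and unadjusted estimates can differ in meaning as well as precision, which this sketch does not cover.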

  • Source
    ABSTRACT: The aim of this study was to quantify the effect of participation by New Zealand dairy farmers in a year-long extension programme designed to improve herd reproductive performance. This was estimated by comparing, over two successive years, the proportions of cows becoming pregnant during the first 6 weeks of the seasonal breeding programme (6 week in-calf rate) in herds involved in a full participation group (treatment), with herds in an actively monitored control group or a passively monitored control group. Possible interactions between treatment and various biophysical and socio-demographic factors were also assessed. Multivariable modelling was used to determine the effect of treatment on 6 week in-calf rate, adjusting for design factors (study year and region). It was estimated that the 6 week in-calf rate was 68% (95% confidence interval 67–69%) in the treatment group of farms that participated in the extension programme compared with 66% (95% confidence interval 65–67%) in the actively monitored control group of farms that did not participate in the extension programme (P = 0.05); thus the risk difference was 2.0% (95% confidence interval 0.0–3.9%). No significant interactions were found between treatment and region, study year or any of the biophysical and socio-demographic variables on the 6 week in-calf rate (P > 0.05). There was no significant difference in the 6 week in-calf rate between the actively and passively monitored control groups (P = 0.56). It was concluded that enrolment in the extension programme improved the 6 week in-calf rate, and that the treatment effect was not modified substantially by region, study year or any of the biophysical and socio-demographic variables assessed.
    The Veterinary Journal 01/2015; DOI:10.1016/j.tvjl.2014.11.014 · 2.17 Impact Factor
  • Source
    ABSTRACT:
    Background: According to the CONSORT statement, significance testing of baseline differences in randomized controlled trials should not be performed. In fact, this practice has been discouraged by numerous authors throughout the last forty years. During that time span, reporting of baseline differences has substantially decreased in the leading general medical journals. Our own experience in the field of nutrition behavior research however, is that co-authors, reviewers and even editors are still very persistent in their demand for these tests. The aim of this paper is therefore to negate this demand by providing clear evidence as to why testing for baseline differences between intervention groups statistically is superfluous and why such results should not be published.
    Discussion: Testing for baseline differences is often propagated because of the belief that it shows whether randomization was successful and it identifies real or important differences between treatment arms that should be accounted for in the statistical analyses. Especially the latter argument is flawed, because it ignores the fact that the prognostic strength of a variable is also important when the interest is in adjustment for confounding. In addition, including prognostic variables as covariates can increase the precision of the effect estimate. This means that choosing covariates based on significance tests for baseline differences might lead to omissions of important covariates and, less importantly, to inclusion of irrelevant covariates in the analysis. We used data from four supermarket trials on the effects of pricing strategies on fruit and vegetables purchases, to show that results from fully adjusted analyses sometimes do appreciably differ from results from analyses adjusted for significant baseline differences only. We propose to adjust for known or anticipated important prognostic variables. These could or should be pre-specified in trial protocols. Subsequently, authors should report results from the fully adjusted as well as crude analyses, especially for dichotomous and time to event data.
    Summary: Based on our arguments, which were illustrated by our findings, we propose that journals in and outside the field of nutrition behavior actively adopt the CONSORT 2010 statement on this topic by not publishing significance tests for baseline differences anymore.
    International Journal of Behavioral Nutrition and Physical Activity 01/2015; 12(1):4. DOI:10.1186/s12966-015-0162-z · 3.68 Impact Factor
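
The risk difference reported in the first abstract above (68% vs 66%) can be illustrated with a simple two-proportion calculation. This is only a sketch: the counts below are hypothetical, the function name `risk_difference_ci` is made up for illustration, and the plain Wald interval ignores the clustering of cows within herds and the multivariable adjustment used in that study, so it is not expected to reproduce the published interval.

```python
import math

def risk_difference_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk difference (group A minus group B) with a Wald confidence interval.

    Assumes independent observations; z=1.96 gives an approximate 95% interval.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    rd = p_a - p_b
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return rd, rd - z * se, rd + z * se

# Hypothetical counts chosen only to match the 68% and 66% proportions.
rd, lower, upper = risk_difference_ci(680, 1000, 660, 1000)
print(f"risk difference: {rd:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

With counts this small the interval is much wider than the published one; the width depends on the true number of cows and herds, which the abstract does not state.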