Article

Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet 2000; 355: 1064-1069.

New England Research Institutes, Watertown, MA, USA.
The Lancet. 04/2000; 355(9209):1064-9. DOI: 10.1016/S0140-6736(00)02039-0
Source: PubMed

ABSTRACT

Baseline data collected on each patient at randomisation in controlled clinical trials can be used to describe the population of patients, to assess comparability of treatment groups, to achieve balanced randomisation, to adjust treatment comparisons for prognostic factors, and to undertake subgroup analyses. We assessed the extent and quality of such practices in major clinical trial reports.
A sample of 50 consecutive clinical-trial reports was obtained from four major medical journals during July to September, 1997. We tabulated the detailed information on uses of baseline data by use of a standard form.
Most trials presented baseline comparability in a table. These tables were often unduly large, and about half the trials inappropriately used significance tests for baseline comparison. Methods of randomisation, including possible stratification, were often poorly described. There was little consistency over whether to use covariate adjustment and the criteria for selecting baseline factors for which to adjust were often unclear. Most trials emphasised the simple unadjusted results and covariate adjustment usually made negligible difference. Two-thirds of the reports presented subgroup findings, but mostly without appropriate statistical tests for interaction. Many reports put too much emphasis on subgroup analyses that commonly lacked statistical power.
Clinical trials need a predefined statistical analysis plan for uses of baseline data, especially covariate-adjusted analyses and subgroup analyses. Investigators and journals need to adopt improved standards of statistical reporting, and exercise caution when drawing conclusions from subgroup findings.
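The abstract's point about subgroup findings reported "without appropriate statistical tests for interaction" can be illustrated with a minimal sketch: rather than testing the treatment effect separately within each subgroup, one tests whether the effect *differs* between subgroups. The function below is a hypothetical illustration (a simple z-style interaction contrast on group means), not the analysis method used in the paper:

```python
from statistics import mean, stdev
from math import sqrt

def interaction_test(t_sub1, c_sub1, t_sub2, c_sub2):
    """Test whether the treatment effect differs between two subgroups.

    The interaction contrast is the difference of the two subgroup
    treatment effects; its standard error combines the four group
    variances. A large |z| suggests genuine effect modification;
    otherwise subgroup differences are plausibly noise.
    """
    effect1 = mean(t_sub1) - mean(c_sub1)   # treatment effect in subgroup 1
    effect2 = mean(t_sub2) - mean(c_sub2)   # treatment effect in subgroup 2
    contrast = effect1 - effect2
    se = sqrt(sum(stdev(g) ** 2 / len(g)
                  for g in (t_sub1, c_sub1, t_sub2, c_sub2)))
    return contrast, contrast / se          # interaction estimate and z-score
```

Note that the standard error of the contrast is larger than that of either subgroup effect, which is exactly why subgroup analyses "commonly lacked statistical power" in the surveyed reports.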

    • "Simple randomization can fail if it creates groups of treated households that are unbalanced for critical characteristics that are known or suspected to affect treatment outcomes (Kernan et al., 1999). Furthermore, simple randomization can also fail if the analysis of interest is at the level of a particular subgroup of the population that is rare for which additional statistical power may be needed (Assmann et al., 2000). In our case, our goal is to study the impact of the cinema pack on the pirates' consumption of TV and Internet. "

    No preview · Conference Paper · Jul 2015
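The quoted passage contrasts simple randomization with designs that protect balance on known prognostic factors. A minimal sketch of stratified permuted-block randomization, one common alternative (the function name, block size, and arm labels are illustrative, not taken from any cited trial):

```python
import random

def stratified_block_randomize(units, stratum_of, block_size=4, seed=0):
    """Assign units to 'treat'/'control' using permuted blocks within strata.

    Within each stratum, allocation proceeds in shuffled blocks containing
    equal numbers of each arm, so arm sizes can never differ by more than
    block_size/2 within any stratum -- unlike simple randomization, which
    can leave rare subgroups badly unbalanced.
    """
    rng = random.Random(seed)
    assignment = {}
    queues = {}  # stratum -> remaining assignments in the current block
    for u in units:
        s = stratum_of(u)
        if not queues.get(s):
            block = ['treat', 'control'] * (block_size // 2)
            rng.shuffle(block)
            queues[s] = block
        assignment[u] = queues[s].pop()
    return assignment
```
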
    ABSTRACT: Peer-ratings have become increasingly important sources of product information, particularly in markets for "information goods." However, in spite of the increasing prevalence of this information, there are relatively few academic studies that analyze the impact of peer-ratings on consumers transacting in "real world" marketplaces. In this paper, we partner with a major cable company to analyze the impact of peer-ratings in a real-world Video-on-Demand market where consumer participation is organic and where movies are costly and well-known to consumers. After experimentally manipulating the initial conditions of product information displayed to consumers, we find that, consistent with the prior literature, peer-ratings influence consumer behavior independently from underlying product quality. However, we also find that, in contrast to the prior literature, at least in our setting there is little evidence of long-term bias due to herding effects. Specifically, when movies are artificially promoted or demoted in peer-rating lists, subsequent reviews cause them to return to their true quality position relatively quickly. One explanation for this difference is that consumers in our empirical setting likely had more outside information about the true quality of the products they were evaluating than did consumers in the studies reported in prior literature. While tentative, this explanation suggests that in real-world marketplaces where consumers have sufficient access to outside information about true product quality, peer-ratings may be more robust to herding effects, and thus provide more reliable signals of true product quality, than previously thought.
    Full-text · Article · Mar 2015
  • Source
    • "The statistical properties of baseline adjustment methods are complex and often poorly understood, resulting in confusion about the choice of the most appropriate statistical strategy.7 Assmann et al analyzed a sample of 50 trials from four top medical journals, British Medical Journal, Journal of the American Medical Association, The Lancet, and New England Journal of Medicine,8 and reported the use of seven different covariate-adjustment methods. The lack of consistency in the literature on pre–post design further contributes to the difficulty of establishing a standard statistical method. "
    ABSTRACT: Background Although seemingly straightforward, the statistical comparison of a continuous variable in a randomized controlled trial that has both a pre- and posttreatment score presents an interesting challenge for trialists. We present here empirical application of four statistical methods (posttreatment scores with analysis of variance, analysis of covariance, change in scores, and percent change in scores), using data from a randomized controlled trial of postoperative pain in patients following total joint arthroplasty (the Morphine COnsumption in Joint Replacement Patients, With and Without GaBapentin Treatment, a RandomIzed ControlLEd Study [MOBILE] trials). Methods Analysis of covariance (ANCOVA) was used to adjust for baseline measures and to provide an unbiased estimate of the mean group difference of the 1-year postoperative knee flexion scores in knee arthroplasty patients. Robustness tests were done by comparing ANCOVA with three comparative methods: the posttreatment scores, change in scores, and percentage change from baseline. Results All four methods showed similar direction of effect; however, ANCOVA (−3.9; 95% confidence interval [CI]: −9.5, 1.6; P=0.15) and the posttreatment score (−4.3; 95% CI: −9.8, 1.2; P=0.12) method provided the highest precision of estimate compared with the change score (−3.0; 95% CI: −9.9, 3.8; P=0.38) and percent change (−0.019; 95% CI: −0.087, 0.050; P=0.58). Conclusion ANCOVA, through both simulation and empirical studies, provides the best statistical estimation for analyzing continuous outcomes requiring covariate adjustment. Our empirical findings support the use of ANCOVA as an optimal method in both design and analysis of trials with a continuous primary outcome.
    Full-text · Article · Jul 2014 · Clinical Epidemiology
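The ANCOVA that the abstract above compares with change scores and post-only analysis can be sketched in its pooled-slope form: the adjusted effect is the difference in follow-up means, corrected for any chance baseline imbalance via the within-group regression of post on baseline. The function and data below are an illustrative sketch, not the MOBILE trial's analysis code:

```python
from statistics import mean

def adjusted_effects(base_t, post_t, base_c, post_c):
    """Compare three estimators of the treatment effect on a pre/post outcome.

    - post_only: difference in follow-up means (ignores baseline)
    - change_score: difference in (post - base) means (slope fixed at 1)
    - ancova: difference in post means, adjusted by the pooled
      within-group regression slope of post on baseline
    """
    def cov_var(base, post):
        bb, pb = mean(base), mean(post)
        cov = sum((b - bb) * (p - pb) for b, p in zip(base, post))
        var = sum((b - bb) ** 2 for b in base)
        return cov, var

    ct, vt = cov_var(base_t, post_t)
    cc, vc = cov_var(base_c, post_c)
    slope = (ct + cc) / (vt + vc)           # pooled within-group slope
    post_diff = mean(post_t) - mean(post_c)
    base_diff = mean(base_t) - mean(base_c)
    return {
        'post_only': post_diff,
        'change_score': post_diff - base_diff,
        'ancova': post_diff - slope * base_diff,
    }
```

When the estimated slope lies between 0 and 1, the ANCOVA estimate falls between the post-only and change-score estimates, which is one way to see why, as the abstract reports, ANCOVA gives the most precise estimate among the three.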