Subgroup analysis and other (mis)uses of baseline data in clinical trials

New England Research Institutes, Watertown, MA, USA.
The Lancet (Impact Factor: 39.21). 04/2000; 355(9209):1064-9. DOI: 10.1016/S0140-6736(00)02039-0
Source: PubMed

ABSTRACT: Baseline data collected on each patient at randomisation in controlled clinical trials can be used to describe the population of patients, to assess comparability of treatment groups, to achieve balanced randomisation, to adjust treatment comparisons for prognostic factors, and to undertake subgroup analyses. We assessed the extent and quality of such practices in major clinical trial reports.
A sample of 50 consecutive clinical-trial reports was obtained from four major medical journals during July to September, 1997. We tabulated the detailed information on uses of baseline data by use of a standard form.
Most trials presented baseline comparability in a table. These tables were often unduly large, and about half the trials inappropriately used significance tests for baseline comparison. Methods of randomisation, including possible stratification, were often poorly described. There was little consistency over whether to use covariate adjustment and the criteria for selecting baseline factors for which to adjust were often unclear. Most trials emphasised the simple unadjusted results and covariate adjustment usually made negligible difference. Two-thirds of the reports presented subgroup findings, but mostly without appropriate statistical tests for interaction. Many reports put too much emphasis on subgroup analyses that commonly lacked statistical power.
Clinical trials need a predefined statistical analysis plan for uses of baseline data, especially covariate-adjusted analyses and subgroup analyses. Investigators and journals need to adopt improved standards of statistical reporting, and exercise caution when drawing conclusions from subgroup findings.
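The abstract's central criticism is that subgroup findings are often presented without an appropriate test for interaction. A minimal sketch of what such a test looks like, assuming the common large-sample approach of comparing two subgroup-specific effect estimates (the function name and the illustrative numbers below are ours, not from the paper):

```python
import math

def interaction_z_test(effect_a, se_a, effect_b, se_b):
    """Test whether a treatment effect estimate (e.g. a log odds ratio)
    differs between two subgroups, instead of comparing the subgroups'
    separate p-values."""
    diff = effect_a - effect_b
    se_diff = math.sqrt(se_a**2 + se_b**2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

# Illustrative (made-up) estimates: an apparently "significant" effect in
# subgroup A (OR 0.6) and a null-looking effect in subgroup B (OR 0.9)
# need not differ significantly from each other.
z, p = interaction_z_test(math.log(0.6), 0.20, math.log(0.9), 0.25)
```

Here the interaction p-value is well above 0.05, so claiming the treatment "works in subgroup A but not B" would be exactly the kind of over-interpretation the paper warns against.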

  • ABSTRACT: The Transfusion Indication Threshold Reduction (TITRe2) trial is the largest randomized controlled trial to date to compare red blood cell transfusion strategies following cardiac surgery. This update presents the statistical analysis plan, detailing how the study will be analyzed and presented. The statistical analysis plan has been written following recommendations from the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, prior to database lock and the final analysis of trial data. Outlined analyses are in line with the Consolidated Standards of Reporting Trials (CONSORT). The study aims to randomize 2000 patients from 17 UK centres. Patients are randomized to either a restrictive (transfuse if haemoglobin concentration <7.5 g/dl) or liberal (transfuse if haemoglobin concentration <9 g/dl) transfusion strategy. The primary outcome is a binary composite outcome of any serious infectious or ischaemic event in the first 3 months following randomization. The statistical analysis plan details how non-adherence with the intervention, withdrawals from the study, and the study population will be derived and dealt with in the analysis. The planned analyses of the trial primary and secondary outcome measures are described in detail, including approaches taken to deal with multiple testing, model assumptions not being met, and missing data. Details of planned subgroup and sensitivity analyses and pre-specified ancillary analyses are given, along with potential issues that have been identified with such analyses and possible approaches to overcome such issues. Trial registration: ISRCTN70923932.
    Trials 12/2015; 16(1). DOI: 10.1186/s13063-015-0564-x (Impact Factor: 2.12)
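The two TITRe2 strategies described above reduce to a simple haemoglobin threshold check; a sketch under the abstract's stated thresholds (the function name and interface are ours, not from the trial's analysis plan):

```python
def should_transfuse(haemoglobin_g_dl: float, arm: str) -> bool:
    """Transfusion trigger for each TITRe2 arm, per the abstract:
    restrictive arm transfuses if Hb < 7.5 g/dl, liberal arm if Hb < 9 g/dl."""
    thresholds = {"restrictive": 7.5, "liberal": 9.0}
    return haemoglobin_g_dl < thresholds[arm]
```

A patient with Hb 8 g/dl would therefore be transfused under the liberal strategy but not under the restrictive one, which is the behavioural separation the trial relies on.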
  • ABSTRACT: Because each patient's baseline (pre-treatment) characteristics differ (e.g., age, sex, socioeconomic status, ethnicity/race, biomarkers), treatments do not work the same for every patient; some can even cause detrimental effects. To improve patient care, it is critical to identify such heterogeneity of treatment effects. But the standard analytic approach dichotomizes baseline characteristics (low vs. high), which often leads to a loss of critical patient-care information and of power to detect heterogeneity, as the results may depend strongly on the cut-points chosen. A more powerful analytic approach is to analyze baseline characteristics (i.e., covariates) measured on a continuous scale, retaining all of the information available for the covariate. In this article, we show how the Johnson-Neyman (J-N) method can be used to identify the prognostic and predictive value of baseline covariates measured on a continuous scale, findings that often cannot be determined using the traditional dichotomized approach. As an example, we used the J-N method to explore treatment effects for varying levels of the biomarker salivary mutans streptococci (MS) in a randomized clinical prevention trial comparing fluoride varnish with no fluoride varnish for 376 initially caries-free high-risk children, all of whom received oral health counseling. The J-N analysis showed that children with higher baseline MS values who were randomized to receive fluoride varnish had the poorest dental caries prognosis and may have benefitted most from the preventive agent. Such methods are likely to be an important tool in the field of personalized oral health care.
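The Johnson-Neyman method finds the covariate values at which the conditional treatment effect crosses the significance boundary. A minimal sketch for a linear model with a treatment-by-covariate interaction, assuming a large-sample critical value of 1.96; the coefficients and variances in the usage example are illustrative, not taken from the caries trial:

```python
import math

def johnson_neyman(b1, b3, v11, v13, v33, t_crit=1.96):
    """Johnson-Neyman boundaries for the model
    y = b0 + b1*T + b2*x + b3*T*x, where the conditional treatment
    effect at covariate value x is b1 + b3*x with variance
    v11 + 2*x*v13 + x**2*v33.  Solving |effect / se| = t_crit gives a
    quadratic in x; returns its real roots sorted (empty list if the
    significance status never changes over x)."""
    a = b3**2 - t_crit**2 * v33
    b = 2.0 * (b1 * b3 - t_crit**2 * v13)
    c = b1**2 - t_crit**2 * v11
    disc = b**2 - 4.0 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])

# Illustrative coefficients and (co)variances, not from the caries trial:
lo, hi = johnson_neyman(b1=0.5, b3=-0.8, v11=0.09, v13=-0.01, v33=0.04)
```

With these numbers the effect is significant outside the interval between the two boundaries, which is precisely the "region of significance" interpretation the dichotomized approach cannot provide.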
  • ABSTRACT: In randomized controlled trials (RCTs), the most compelling need is to determine whether the treatment condition was more effective than control. However, it is generally recognized that not all participants in the treatment group of most clinical trials benefit equally. While subgroup analyses are often used to compare treatment effectiveness across pre-determined subgroups categorized by patient characteristics, methods to empirically identify naturally occurring clusters of persons in the treatment group who benefit most have rarely been implemented. This article provides a modeling framework to accomplish this important task. Utilizing information about individuals from the treatment group who had poor outcomes, the present study proposes an a priori clustering strategy that classifies the individuals with initially good outcomes in the treatment group into: (a) group GE (good outcome, effective), the latent subgroup of individuals for whom the treatment is likely to be effective, and (b) group GI (good outcome, ineffective), the latent subgroup of individuals for whom the treatment is not likely to be effective. The method is illustrated through a re-analysis of a publicly available data set from the National Institute on Drug Abuse. The RCT examines the effectiveness of motivational enhancement therapy for 461 outpatients with substance abuse problems. The proposed method identified latent subgroups GE and GI, and the comparison between the two groups revealed several significantly different and informative characteristics even though both subgroups had good outcomes during the immediate post-therapy period. As a diagnostic check based on out-of-sample forecasting performance, the present study compared the relapse rates during the long-term follow-up period for the two subgroups. As expected, group GI, composed of individuals for whom the treatment was hypothesized to be ineffective, had a significantly higher relapse rate than group GE (63% vs. 27%; χ² = 9.99, p = .002).
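The relapse-rate comparison above is a Pearson chi-square test on a 2x2 (subgroup by relapse) table. A self-contained sketch follows; the counts are hypothetical, chosen only to roughly match the reported 63% vs. 27% split, so the resulting statistic does not reproduce the trial's exact value of 9.99:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )

# Hypothetical counts: 19/30 = 63% relapse in group GI vs.
# 16/60 = 27% in group GE (rows: GI, GE; columns: relapse, no relapse).
chi2 = chi_square_2x2(19, 11, 16, 44)
```

With 1 degree of freedom, any statistic above 3.84 is significant at the 0.05 level, so a difference of this size would be detected even at these modest hypothetical sample sizes.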

