Clinical Trials: Discerning Hype From Substance
ABSTRACT The interest in being able to interpret and report results in clinical trials as being favorable is pervasive throughout health care research. This important source of bias needs to be recognized, and approaches need to be implemented to effectively address it. The prespecified primary analyses of the primary and secondary end points of a clinical trial should be clearly specified when disseminating results in press releases and journal publications. There should be a focus on these analyses when interpreting the results. A substantial risk for biased conclusions is produced by conducting exploratory analyses with an intention to establish that the benefit-to-risk profile of the experimental intervention is favorable, rather than to determine whether it is. In exploratory analyses, P values will be misleading when the actual sampling context is not presented to allow for proper interpretation, and the effect sizes of outcomes having particularly favorable estimates are probably overestimated because of "random high" bias. Performing exploratory analyses should be viewed as generating hypotheses that usually require reassessment in prospectively conducted confirmatory trials. Awareness of these issues will meaningfully improve our ability to be guided by substance, not hype, in making evidence-based decisions about medical care.
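The "random high" bias described in the abstract can be illustrated with a small simulation. The sketch below is not from the article: it assumes ten exploratory outcomes that all share the same true effect (0.2) and the same standard error (0.1), and it compares the average of all estimates with the average of the most favorable estimate in each simulated trial. The function name and all parameter values are hypothetical, chosen only for illustration.

```python
import random
import statistics


def simulate_random_high(n_outcomes=10, true_effect=0.2, std_error=0.1,
                         n_trials=5000, seed=0):
    """Compare the mean of all estimates with the mean of the best estimate.

    Assumption (illustrative only): every outcome has the same true effect
    and independent, normally distributed estimation error.
    """
    rng = random.Random(seed)
    all_estimates = []
    best_estimates = []
    for _ in range(n_trials):
        # One simulated trial: an effect estimate for each exploratory outcome.
        estimates = [rng.gauss(true_effect, std_error)
                     for _ in range(n_outcomes)]
        all_estimates.extend(estimates)
        # Selecting the outcome that happens to look most favorable.
        best_estimates.append(max(estimates))
    return statistics.mean(all_estimates), statistics.mean(best_estimates)


mean_all, mean_best = simulate_random_high()
print(f"mean of all estimates:        {mean_all:.3f}")
print(f"mean of most favorable pick:  {mean_best:.3f}")
```

Under these assumptions the unselected estimates average close to the true effect, while the selected "best" estimate is systematically larger, which is why a particularly favorable exploratory result usually needs reassessment in a confirmatory trial.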
- Available from: Brian G. Moss
- "whether reported as grade or as pass/fail. With the usual call for caution that conclusions based on subset analyses should be made tenuously (Fleming 2010), we did find consistent results (for both grade and pass/fail) that males benefited from DE more than females. We also found evidence that White students were more likely to benefit from DE than multiracial students (the estimate of benefit, . "
ABSTRACT: Annually, American colleges and universities provide developmental education (DE) to millions of underprepared students; however, evaluation estimates of DE benefits have been mixed. Using a prototypic exemplar of DE, our primary objective was to investigate the utility of a replicative evaluative framework for assessing program effectiveness. Within the context of the regression discontinuity (RD) design, this research examined the effectiveness of a DE program for five sequential cohorts of first-time college students. Discontinuity estimates were generated for individual terms and cumulatively, across terms. Participants were 3,589 first-time community college students. DE program effects were measured by contrasting both college-level English grades and a dichotomous measure of pass/fail, for DE and non-DE students. Parametric and nonparametric estimates of overall effect were positive for continuous and dichotomous measures of achievement (grade and pass/fail). The variability of program effects over time was determined by tracking results within individual terms and cumulatively, across terms. Applying this replication strategy, DE's overall impact was modest (an effect size of approximately .20) but quite consistent, based on parametric and nonparametric estimation approaches. A meta-analysis of five RD results yielded virtually the same estimate as the overall, parametric findings. Subset analysis, though tentative, suggested that males benefited more than females, while academic gains were comparable for different ethnicities. The cumulative, within-study comparison, replication approach offers considerable potential for the evaluation of new and existing policies, particularly when effects are relatively small, as is often the case in applied settings.
Evaluation Review 03/2014; 37(5). DOI:10.1177/0193841X14523620 · 1.20 Impact Factor
- "It is well established that performing multiple statistical tests can inflate the chance of falsely concluding a significant treatment effect. The problem of multiplicity can arise due to multiple comparisons among different treatment groups, analyses of treatment effects in multiple subgroups, use of multiple outcome variables or the same outcome variable defined at multiple time points, interim analyses, or using multiple methods for statistical analysis to address the same scientific question."
ABSTRACT: Performing multiple analyses in clinical trials can inflate the probability of a type I error, or the chance of falsely concluding a significant effect of the treatment. Strategies to minimize type I error probability include pre-specification of primary analyses and statistical adjustment for multiple comparisons, when applicable. The objective of this study was to assess the quality of primary analysis reporting and frequency of multiplicity adjustment in three major pain journals (i.e., European Journal of Pain, Journal of Pain, Pain). A total of 161 randomized controlled trials investigating non-invasive pharmacological treatments or interventional treatments for pain, published between 2006 and 2012, were included. Only 52% of trials identified a primary analysis and only 10% of trials reported pre-specification of that analysis. Among the 33 articles that identified a primary analysis with multiple testing, 15 (45%) adjusted for multiplicity; of those 15, only 2 (13%) reported pre-specification of the adjustment methodology. Trials in clinical pain conditions and industry-sponsored trials identified a primary analysis more often than trials in experimental pain models and non-industry sponsored trials, respectively. The results of this systematic review demonstrate deficiencies in the reporting and possibly execution of primary analyses in published analgesic trials. These deficiencies can be rectified by changes in, or better enforcement of, journal policies pertaining to requirements for the reporting of analyses of clinical trial data.
Pain 11/2013; 155(3). DOI:10.1016/j.pain.2013.11.009 · 5.84 Impact Factor
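The type I error inflation described in this abstract can be made concrete with a short calculation. The sketch below is not taken from the reviewed trials: it assumes independent tests each run at a nominal level of 0.05, and it uses the Holm step-down procedure as one common example of a multiplicity adjustment (the abstract does not say which methods the trials used). The function names and the example P values are hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests
    when every null hypothesis is true: 1 - (1 - alpha)^n."""
    return 1.0 - (1.0 - alpha) ** n_tests


def holm_reject(p_values, alpha=0.05):
    """Holm step-down adjustment (shown as one example of a correction).

    Returns a list of booleans, in the original order of p_values,
    marking which hypotheses are rejected at familywise level alpha.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest P value is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # Once one test fails, all larger P values also fail.
    return reject


for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(k), 3))

# Hypothetical P values for three outcomes of one trial.
print(holm_reject([0.01, 0.04, 0.03]))
```

With ten independent unadjusted tests the chance of at least one spurious "significant" finding is roughly 40% under these assumptions, which is the motivation for prespecifying the primary analysis and any adjustment method.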
- Pain 03/2011; 152(8):1705-8. DOI:10.1016/j.pain.2011.02.026 · 5.84 Impact Factor