An exploratory test for an excess of significant findings

University of Ioannina, Yannina, Epirus, Greece
Clinical Trials (Impact Factor: 1.93). 02/2007; 4(3):245-53. DOI: 10.1177/1740774507079441
Source: PubMed


The published clinical research literature may be distorted by the pursuit of statistically significant results.
We aimed to develop a test to explore biases stemming from the pursuit of nominal statistical significance.
The exploratory test evaluates whether there is a relative excess of formally significant findings in the published literature for any reason (e.g., publication bias, selective analyses and outcome reporting, or fabricated data). The expected number of studies with statistically significant results is estimated and compared against the observed number of significant studies. The main application uses alpha = 0.05, but a range of alpha thresholds is also examined, and different fixed values or prior distributions for the effect size are assumed. Given the test's typically low power when few studies address a research question, it may be best applied across domains of many meta-analyses that share common characteristics (interventions, outcomes, study populations, research environment).
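The core comparison can be sketched in a few lines. The following is a minimal illustration, assuming a fixed true effect size and a normal approximation for each study's test statistic; the function and variable names are illustrative, not taken from the paper or its software, and the sketch does not reproduce the paper's sensitivity analyses across alpha thresholds or effect-size priors.

```python
from statistics import NormalDist
import math

_Z = NormalDist()

def study_power(theta, se, alpha=0.05):
    """Power of a two-sided z-test to detect a true effect `theta`
    in a study whose estimate has standard error `se`."""
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    u = theta / se
    return (1 - _Z.cdf(z_crit - u)) + _Z.cdf(-z_crit - u)

def excess_significance(std_errors, n_observed_sig, theta, alpha=0.05):
    """Compare the observed number of significant studies (O) against
    the number expected (E = sum of per-study powers) if the true
    effect were `theta` in every study.  Returns (E, chi-square
    statistic with 1 df, two-sided p-value)."""
    n = len(std_errors)
    expected = sum(study_power(theta, se, alpha) for se in std_errors)
    O, E = n_observed_sig, expected
    chi2 = (O - E) ** 2 / E + (O - E) ** 2 / (n - E)
    # For 1 df, P(X > chi2) equals 2 * (1 - Phi(sqrt(chi2))).
    p_value = 2 * (1 - _Z.cdf(math.sqrt(chi2)))
    return E, chi2, p_value
```

For example, `excess_significance([1.0] * 20, 8, 0.2)` asks whether 8 significant results out of 20 unit-standard-error trials is more than expected under a true effect of 0.2. A one-sided version (testing only for an excess, O > E) would halve the p-value.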
As an illustration, we evaluated eight meta-analyses of clinical trials with more than 50 studies each and 10 meta-analyses of clinical efficacy for neuroleptic agents in schizophrenia; the 10 neuroleptic meta-analyses were also examined as a composite domain. The test's results often differed from those of commonly used tests of publication bias. We demonstrated a clear or possible excess of significant studies in six of the eight large meta-analyses and in the wide domain of neuroleptic treatments.
The proposed test is exploratory, may depend on prior assumptions, and should be applied cautiously.
An excess of significant findings may be documented in some clinical research fields.

    • "To minimize the fourth concern— overestimation of mean effect sizes due to publication bias—we searched for, retrieved, and included results from as many unpublished experiments as possible. Moreover, we applied both classic and more recently developed statistical techniques to assess and correct meta-analytic estimates for the influence of small-study effects such as publication bias (Duval & Tweedie, 2000; Ioannidis & Trikalinos, 2007; Stanley & Doucouliagos, 2014). "
    ABSTRACT: Failures of self-control are thought to underlie various important behaviors (e.g., addiction, violence, obesity, poor academic achievement). The modern conceptualization of self-control failure has been heavily influenced by the idea that self-control functions as if it relied upon a limited physiological or cognitive resource. This view of self-control has inspired hundreds of experiments designed to test the prediction that acts of self-control are more likely to fail when they follow previous acts of self-control (the depletion effect). Here, we evaluated the empirical evidence for this effect with a series of focused, meta-analytic tests that address the limitations in prior appraisals of the evidence. We find very little evidence that the depletion effect is a real phenomenon, at least when assessed with the methods most frequently used in the laboratory. Our results strongly challenge the idea that self-control functions as if it relies on a limited psychological or physical resource. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
    Journal of Experimental Psychology General 06/2015; 144(4). DOI:10.1037/xge0000083 · 5.50 Impact Factor
    • "This IC index is not close to the > .90 criterion that has been suggested as a criterion to infer that there might exist more nonsignificant findings than have been reported (Ioannidis & Trikalinos, 2007; Schimmack, 2012). As Schimmack (2012) warns that a low IC score is a necessary but not sufficient condition to ensure credibility, we also looked at two other indices of publication bias. "
    ABSTRACT: A rich tradition in self-control research has documented the negative consequences of exerting self-control in 1 task for self-control performance in subsequent tasks. However, there is a dearth of research examining what happens when people exert self-control in multiple domains simultaneously. The current research aims to fill this gap. We integrate predictions from the most prominent models of self-control with recent neuropsychological insights in the human inhibition system to generate the novel hypothesis that exerting effortful self-control in 1 task can simultaneously improve self-control in completely unrelated domains. An internal meta-analysis on all 18 studies we conducted shows that exerting self-control in 1 domain (i.e., controlling attention, food consumption, emotions, or thoughts) simultaneously improves self-control in a range of other domains, as demonstrated by, for example, reduced unhealthy food consumption, better Stroop task performance, and less impulsive decision making. A subset of 9 studies demonstrates the crucial nature of task timing-when the same tasks are executed sequentially, our results suggest the emergence of an ego depletion effect. We provide conservative estimates of the self-control facilitation (d = |0.22|) as well as the ego depletion effect size (d = |0.17|) free of data selection and publication biases. These results (a) shed new light on self-control theories, (b) confirm recent claims that previous estimates of the ego depletion effect size were inflated due to publication bias, and (c) provide a blueprint for how to handle the power issues and associated file drawer problems commonly encountered in multistudy research projects. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
    Journal of Experimental Psychology General 03/2015; 144(3). DOI:10.1037/xge0000065 · 5.50 Impact Factor
    • "The TES has low power when only a limited number of studies is included in a meta-analysis (Francis, 2012, 2013; Ioannidis & Trikalinos, 2007a), and has particularly low power when population effects are heterogeneous (Francis, 2013). Ioannidis and Trikalinos (2007b) also recommend not using the test if between-study heterogeneity exists, but to first create homogeneous subgroups of effect sizes before applying the test. "
    Dataset: puniform