Xiao R, Boehnke M. Quantifying and correcting for the winner's curse in genetic association studies. Genet Epidemiol 2009; 33(5): 453-462.

Department of Biostatistics and Center for Statistical Genetics, University of Michigan, USA.
Genetic Epidemiology (Impact Factor: 2.95). 07/2009; 33(5):453-62. DOI: 10.1002/gepi.20398
Source: PubMed

ABSTRACT Genetic association studies are a powerful tool to detect genetic variants that predispose to human disease. Once an associated variant is identified, investigators are also interested in estimating the effect of the identified variant on disease risk. Estimates of the genetic effect based on new association findings tend to be upwardly biased due to a phenomenon known as the "winner's curse." Overestimation of genetic effect size in initial studies may cause follow-up studies to be underpowered and so to fail. In this paper, we quantify the impact of the winner's curse on the allele frequency difference and odds ratio estimators for one- and two-stage case-control association studies. We then propose an ascertainment-corrected maximum likelihood method to reduce the bias of these estimators. We show that overestimation of the genetic effect by the uncorrected estimator decreases as the power of the association study increases and that the ascertainment-corrected method reduces absolute bias and mean square error unless power to detect association is high.
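The relationship between power and overestimation described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a generic effect estimate that is normally distributed around the true effect, tested with a two-sided z-test, rather than the allele frequency difference or odds ratio estimators studied in the paper. The function names and the standardized effect `mu` (true effect divided by its standard error) are illustrative assumptions.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power_two_sided(mu, alpha=0.05):
    """Power of a two-sided z-test when the test statistic Z ~ N(mu, 1).

    mu is the true effect divided by its standard error (an assumed,
    simplified parameterization, not the one used in the paper).
    """
    c = _STD.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = 0.05
    return (1 - _STD.cdf(c - mu)) + _STD.cdf(-c - mu)


def proportional_bias(mu, alpha=0.05):
    """Relative overestimation of the effect among significant results.

    Computes E[Z | |Z| > c] / mu - 1 from the closed form for the mean of a
    normal variable restricted to the two rejection tails.
    """
    c = _STD.inv_cdf(1 - alpha / 2)
    p = power_two_sided(mu, alpha)
    conditional_mean = mu + (_STD.pdf(c - mu) - _STD.pdf(c + mu)) / p
    return conditional_mean / mu - 1


if __name__ == "__main__":
    # Larger standardized effects mean higher power and smaller relative bias.
    for mu in (0.5, 1.0, 2.0, 2.8, 4.0):
        print(f"mu={mu:>4}: power={power_two_sided(mu):.3f}, "
              f"proportional bias={proportional_bias(mu):.3f}")
```

Under these assumptions, power and proportional bias both depend only on `mu` and the significance level, so for a fixed significance level the bias is determined by power alone and shrinks as power grows.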

    • "Thus, even conservative estimates of the sample size needed for tests of replication based on the lower bound of the original effect size estimate may be too small. Statistical methods have been developed to improve sample size estimates under conditions of expected shrinking effect sizes, however (Xiao and Boehnke 2009). "
    Journal of Abnormal Child Psychology 04/2013; 41(4). DOI:10.1007/s10802-013-9741-0 · 3.09 Impact Factor
    • "In this case, a follow-up study designed to have 80% power at significance level α = 0.05 would include 595 samples, but have actual power of only 35%. As for case-control studies [Zöllner and Pritchard, 2007; Xiao and Boehnke, 2009], we found that for a fixed significance level α, the proportional bias in the uncorrected estimator of β1 is solely a function of power, whatever the sample size, allele frequency, and genetic model (Fig. 2). Given fixed power, different significance levels result in different proportional bias in the naïve estimator of β1."
    ABSTRACT: Quantitative traits (QT) are an important focus of human genetic studies both because of interest in the traits themselves and because of their role as risk factors for many human diseases. For large-scale QT association studies including genome-wide association studies, investigators usually focus on genetic loci showing significant evidence for SNP-QT association, and genetic effect size tends to be overestimated as a consequence of the winner's curse. In this paper, we study the impact of the winner's curse on QT association studies in which the genetic effect size is parameterized as the slope in a linear regression model. We demonstrate by analytical calculation that the overestimation in the regression slope estimate decreases as power increases. To reduce the ascertainment bias, we propose a three-parameter maximum likelihood method and then simplify this to a one-parameter method by excluding nuisance parameters. We show that both methods reduce the bias when power to detect association is low or moderate, and that the one-parameter model generally results in smaller variance in the estimate.
    Genetic Epidemiology 04/2011; 35(3):133-8. DOI:10.1002/gepi.20551 · 2.95 Impact Factor
    • "Genome-wide (population-based) association analysis is generally considered to be a main tool to infer causative links between genomic marker data and phenotype (Risch and Merikangas, 1996; McCarthy and Hirschhorn, 2008). This happens regardless of problems such as genetic heterogeneity (Terwilliger and Weiss, 1998; Sillanpää and Bhattacharjee, 2006), winner's curse (Lande and Thompson, 1990; Beavis, 1998; Göring et al., 2001; Xiao and Boehnke, 2009) and missing heritability (Maher, 2008; McCarthy and Hirschhorn, 2008; Slatkin, 2009). A particular property of marker data is the systematic spatial dependence along the chromosome."
    ABSTRACT: Population-based genomic association analyses are more powerful than within-family analyses. However, population stratification (unknown or ignored origin of individuals from multiple source populations) and cryptic relatedness (unknown or ignored covariance between individuals because of their relatedness) are confounding factors in population-based genomic association analyses, which inflate the false-positive rate. As a consequence, false association signals may arise in genomic data association analyses for reasons other than true association between the tested genomic factor (marker genotype, gene or protein expression) and the study phenotype. It is therefore important to correct or account for these confounders in population-based genomic data association analyses. The common correction techniques for population stratification and cryptic relatedness problems are presented here in the phenotype-marker association analysis context, and comments on their suitability for other types of genomic association analyses (for example, phenotype-expression association) are also provided. Even though many of these techniques have originally been developed in the context of human genetics, most of them are also applicable to model organisms and breeding populations.
    Heredity 04/2011; 106(4):511-9. DOI:10.1038/hdy.2010.91 · 3.80 Impact Factor