Klein RJ. Power analysis for genome-wide association studies. BMC Genetics 8: 58

Program in Cancer Biology and Genetics, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY, USA.
BMC Genetics (Impact Factor: 2.36). 02/2007; 8:58. DOI: 10.1186/1471-2156-8-58
Source: PubMed

ABSTRACT Genome-wide association studies are a promising new tool for deciphering the genetics of complex diseases. To choose the proper sample size and genotyping platform for such studies, power calculations that take into account genetic model, tag SNP selection, and the population of interest are required.
The power of genome-wide association studies can be computed using a set of tag SNPs and a large number of genotyped SNPs in a representative population, such as available through the HapMap project. As expected, power increases with increasing sample size and effect size. Power also depends on the tag SNPs selected. In some cases, more power is obtained by genotyping more individuals at fewer SNPs than fewer individuals at more SNPs.
Genome-wide association studies should be designed thoughtfully, with the choice of genotyping platform and sample size being determined from careful power calculations.
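The dependence of power on sample size and effect size can be illustrated with a much simpler calculation than the one the abstract describes. The sketch below is not the tag-SNP, HapMap-based procedure of the paper; it assumes a single causal SNP that is genotyped directly, an allelic (two-proportion) test, and a normal approximation. The function name `allelic_power` and all parameter values are illustrative assumptions.

```python
from statistics import NormalDist


def allelic_power(p_control, odds_ratio, n_cases, n_controls, alpha):
    """Approximate power of a two-sided allelic test at one directly typed SNP.

    Assumptions (not taken from the paper): the causal SNP itself is
    genotyped, alleles are independent (2 per individual), and the test
    statistic is approximately normal.
    """
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    p_case = odds_ratio * p_control / (1 - p_control + odds_ratio * p_control)
    # Standard error of the frequency difference, counting 2 alleles per person.
    se = (p_case * (1 - p_case) / (2 * n_cases)
          + p_control * (1 - p_control) / (2 * n_controls)) ** 0.5
    z_effect = abs(p_case - p_control) / se
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Probability that the statistic falls in either rejection tail.
    return nd.cdf(z_effect - z_crit) + nd.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical design: allele frequency 0.2, genome-wide threshold 5e-8.
    for n in (1000, 2000, 4000):
        for odds_ratio in (1.2, 1.5):
            power = allelic_power(0.2, odds_ratio, n, n, 5e-8)
            print(f"n={n} per group, OR={odds_ratio}: power={power:.3f}")
```

A calculation of this kind ignores incomplete tagging of the causal variant, so it will tend to overstate the power of a real genotyping platform.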

  • Source
    • "The small sample size of the GWAS was insufficient considering that for the detection of markers with small effects on quantitative traits several thousands of individuals are required (Klein, 2007; Spencer et al., 2009). Thus, it was not surprising that no effects were significant after correction for multiple testing."
    ABSTRACT: We present results from a genome-wide association study (GWAS) and a single-marker association study. The GWAS was performed with the Illumina PorcineSNP60 BeadChip from which five markers were selected for a validation analysis. Genetic effects were estimated for feed intake, weight gain and traits of fat and muscle tissue in German Landrace boars kept on performance test stations. The GWAS was carried out in a population of 288 boars and the validation study for another 432 boars. No statistically significant effect was found in the GWAS after adjusting for multiple testing. Effects of two markers, which were genome-wide significant before correction for multiple testing (P<0.00005), could be confirmed in the validation study. The major allele of marker ALGA0056781 on SSC1 was positively associated with both higher weight gain and fat deposition. The effect on live-weight gain was 2.25 g/day in the GWAS (P=0.0003) and 3.73 g/day in the validation study (P=0.01), and for back fat thickness 0.15 mm in the GWAS (P<0.0001) and 0.20 mm in the validation study (P=0.02). The marker had similar effects on test-day weight gain (GWAS: 3.85 g/day, P=0.001; validation study: 6.80 g/day, P=0.003) and back fat area (GWAS: 0.27 cm², P<0.0001; validation study: 0.35 cm², P=0.03). Marker ASGA0056782 on SSC13 was associated with live-weight gain. The major allele had negative effects in both studies (GWAS: -4.88 g/day, P<0.0001; validation study: -3.75 g/day, P=0.02). The effects of these two markers would have been excluded based on the GWAS alone but were shown to be significantly trait associated in the validation study indicating a false-negative result. The G protein-coupled receptor 126 (GPR126) gene ∼200kb down-stream of marker ALGA0001781 was shown to be associated with human height, and thus might explain the association with weight gain in pigs. Several traits were affected in an economically desired direction by the minor allele of the markers, pointing to the possibility of improvement through further selection.
    Journal of Animal Science 03/2014; 92(5). DOI:10.2527/jas.2013-7247 · 1.92 Impact Factor
  • Source
    • "Power is an even more complicated function of several factors: study design, correlation patterns in the genotypic data, sizes of cohorts, frequency of the susceptibility allele, relative risk conferred by the causal factor, genetic model (additive, dominant, recessive, multiplicative) [10]. As a consequence, the analytical computation of power requires simplified assumptions, including the approximation of the test statistic distribution under H1 through a probability law [11]. Most power calculators based on analytical approaches are used for two-stage GWA design, e.g. "
    ABSTRACT: Assessing the statistical power to detect susceptibility variants plays a critical role in GWA studies both from the prospective and retrospective points of view. Power is empirically estimated by simulating phenotypes under a disease model H1. For this purpose, the "gold" standard consists in simulating genotypes given the phenotypes (e.g. Hapgen). We introduce here an alternative approach for simulating phenotypes under H1 that does not require generating new genotypes for each simulation. In order to simulate phenotypes with a fixed total number of cases and under a given disease model, we suggest three algorithms: i) a simple rejection algorithm; ii) a numerical Markov Chain Monte-Carlo (MCMC) approach; iii) and an exact and efficient backward sampling algorithm. In our study, we validated the three algorithms both on a toy-dataset and by comparing them with Hapgen on a more realistic dataset. As an application, we then conducted a simulation study on a 1000 Genomes Project dataset consisting of 629 individuals (314 cases) and 8,048 SNPs from Chromosome X. We arbitrarily defined an additive disease model with two susceptibility SNPs and an epistatic effect. The three algorithms are consistent, but backward sampling is dramatically faster than the other two. Our approach also gives consistent results with Hapgen. Using our application data, we showed that our limited design requires a biological a priori to limit the investigated region. We also proved that epistatic effects can play a significant role even when simple marker statistics (e.g. trend) are used. We finally showed that the overall performance of a GWA study strongly depends on the prevalence of the disease: the larger the prevalence, the better the power.
    Human Heredity 01/2012; 73(2). DOI:10.1159/000336194 · 1.64 Impact Factor
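The first of the three algorithms named in this abstract, a simple rejection algorithm, can be sketched as follows. This is only one plausible reading of such a scheme, not the cited authors' implementation: the logistic disease model, the function name `simulate_fixed_cases`, and the coefficients `beta0` and `beta1` are all illustrative assumptions. Genotypes stay fixed; phenotypes are redrawn until the number of cases matches the target.

```python
import math
import random


def simulate_fixed_cases(genotypes, beta0, beta1, n_cases, rng, max_tries=100_000):
    """Draw case/control labels under H1 with exactly n_cases cases.

    Assumed disease model (hypothetical): logit P(case) = beta0 + beta1 * g,
    where g is the risk-allele count (0, 1 or 2) at a single SNP.
    """
    probs = [1.0 / (1.0 + math.exp(-(beta0 + beta1 * g))) for g in genotypes]
    for _ in range(max_tries):
        # Independent Bernoulli draw for every individual.
        phenotypes = [1 if rng.random() < p else 0 for p in probs]
        # Accept the draw only if the total number of cases is the target.
        if sum(phenotypes) == n_cases:
            return phenotypes
    raise RuntimeError("no accepted draw; the target case count may be too unlikely")


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical toy dataset: 20 individuals with allele counts 0, 1 or 2.
    genotypes = [rng.choice((0, 1, 2)) for _ in range(20)]
    phenotypes = simulate_fixed_cases(genotypes, beta0=-0.5, beta1=0.5,
                                      n_cases=10, rng=rng)
    print(genotypes)
    print(phenotypes)
```

The acceptance rate of such a scheme falls quickly as the cohort grows, which is consistent with the abstract's remark that backward sampling is dramatically faster.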
  • Source
    • "All association analyses were performed using PLINK v1.06, as detailed in the Appendix. For each of the 37 SNPs previously reported to be associated with prostate cancer, we used the reported odds ratio along with the minor allele frequency we observed to compute power for a significance level of α = 0.05 [13]. We performed 10 000 iterations of a simulation in which each SNP was randomly assigned "significant" or "not significant" based on the power, and the total number of "significant" SNPs was counted."
    ABSTRACT: Although case-control studies have identified numerous single nucleotide polymorphisms (SNPs) associated with prostate cancer, the clinical role of these SNPs remains unclear. Evaluate previously identified SNPs for association with prostate cancer and accuracy in predicting prostate cancer in a large prospective population-based cohort of unscreened men. This study used a nested case-control design based on the Malmö Diet and Cancer cohort with 943 men diagnosed with prostate cancer and 2829 matched controls. Blood samples were collected between 1991 and 1996, and follow-up lasted through 2005. We genotyped 50 SNPs, analyzed prostate-specific antigen (PSA) in blood from baseline, and tested for association with prostate cancer using the Cochran-Mantel-Haenszel test. We further developed a predictive model using SNPs nominally significant in univariate analysis and determined its accuracy to predict prostate cancer. Eighteen SNPs at 10 independent loci were associated with prostate cancer. Four independent SNPs at four independent loci remained significant after multiple test correction (p<0.001). Seven SNPs at five independent loci were associated with advanced prostate cancer defined as clinical stage≥T3 or evidence of metastasis at diagnosis. Four independent SNPs were associated with advanced or aggressive cancer defined as stage≥T3, metastasis, Gleason score≥8, or World Health Organization grade 3 at diagnosis. Prostate cancer risk prediction with SNPs alone was less accurate than with PSA at baseline (area under the curve of 0.57 vs 0.79), with no benefit from combining SNPs with PSA. This study is limited by our reliance on clinical diagnosis of prostate cancer; there are likely undiagnosed cases among our control group. Only a few previously reported SNPs were associated with prostate cancer risk in the large prospective Diet and Cancer cohort in Malmö, Sweden. SNPs were less useful in predicting prostate cancer risk than PSA at baseline.
    European Urology 11/2011; 61(3):471-7. DOI:10.1016/j.eururo.2011.10.047 · 12.48 Impact Factor
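The simulation described in the quoted passage (each SNP assigned "significant" with probability equal to its power, repeated 10 000 times, counting the total) can be sketched as below. The per-SNP power values here are made-up placeholders, not figures from the cited study, and the function name `simulate_significant_counts` is an assumption for illustration.

```python
import random
from collections import Counter


def simulate_significant_counts(powers, n_iter, rng):
    """For each iteration, count SNPs drawn as 'significant'.

    Each SNP is an independent Bernoulli trial whose success probability
    is that SNP's estimated power.
    """
    counts = []
    for _ in range(n_iter):
        counts.append(sum(1 for p in powers if rng.random() < p))
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical power values for a handful of SNPs (placeholders only).
    powers = [0.9, 0.5, 0.2, 0.7]
    counts = simulate_significant_counts(powers, 10_000, rng)
    distribution = Counter(counts)
    for k in sorted(distribution):
        print(f"{k} significant SNPs: {distribution[k] / len(counts):.3f}")
    print("mean:", sum(counts) / len(counts), "expected:", sum(powers))
```

The resulting distribution gives a reference against which the observed number of replicated SNPs can be compared.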