Sampling GWAS subjects from risk populations

Institute of Human Genetics, Klinikum rechts der Isar, TU München, Munich, Germany.
Genetic Epidemiology (Impact Factor: 2.6). 04/2011; 35(3):148-53. DOI: 10.1002/gepi.20562
Source: PubMed


Statistical power, and hence the required sample size, is a crucial issue in genome-wide association studies (GWAS) of disorders caused by a multitude of weak genetic effects. Here, we examine the influence of sampling cases and/or controls from populations that are subject to an external risk factor (such as smoking or nutritional factors). We use an additive threshold model and derive the necessary sample size as a function of the strength of the external risk factor and of the sampling scheme. If both cases and controls are sampled from the risk population, a loss of power must be expected. The loss of power (i.e. the increase in the necessary sample size) is even larger if only the cases are sampled from the risk population, whereas the inverse scheme (nonrisk cases and risk controls) provides a gain of power, since nonrisk cases are enriched for disease-favouring alleles while risk controls are enriched for protective alleles. For small effect sizes, we derive simple approximations in analytically closed form. From these results, a strategy for collecting GWAS samples from risk populations that minimizes the necessary sample size can be deduced; it applies generally as long as strong gene-environment interactions can be excluded.
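The comparison of sampling schemes described in the abstract can be illustrated with a minimal numerical sketch. It is not the authors' derivation: it assumes a single biallelic locus in Hardy-Weinberg equilibrium, a standard-normal residual liability, a fixed disease threshold, and an external risk factor that simply adds a constant `shift` to the liability (no gene-environment interaction). The required sample size comes from a textbook two-proportion z-test on allele frequencies. All function names, parameter names, and default values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal for the residual liability (assumption)


def genotype_probs(p):
    """Hardy-Weinberg probabilities for 0, 1, 2 copies of the risk allele."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def allele_freq(p, effect, threshold, shift, affected):
    """Risk-allele frequency among cases (affected=True) or controls drawn
    from a population whose liability is shifted by `shift`."""
    num = 0.0
    den = 0.0
    for g, pg in enumerate(genotype_probs(p)):
        # Additive threshold model: disease if effect*g + shift + noise > threshold.
        p_disease = 1.0 - STD_NORMAL.cdf(threshold - effect * g - shift)
        weight = pg * (p_disease if affected else 1.0 - p_disease)
        num += weight * g / 2.0
        den += weight
    return num / den


def required_n(p_case, p_ctrl, alpha=5e-8, power=0.8):
    """Subjects per group for a two-sided two-proportion z-test on alleles."""
    z = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) + STD_NORMAL.inv_cdf(power)
    var = p_case * (1.0 - p_case) + p_ctrl * (1.0 - p_ctrl)
    alleles_per_group = z * z * var / (p_case - p_ctrl) ** 2
    return alleles_per_group / 2.0  # each subject contributes two alleles


def scheme_sizes(p=0.3, effect=0.05, prevalence=0.01, shift=0.5):
    """Required subjects per group for the four case/control sampling schemes."""
    # Threshold chosen so that the nonrisk prevalence is roughly `prevalence`
    # (the small contribution of the locus itself is ignored here).
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    shifts = {"nonrisk": 0.0, "risk": shift}
    sizes = {}
    for case_pop, case_shift in shifts.items():
        for ctrl_pop, ctrl_shift in shifts.items():
            p_case = allele_freq(p, effect, threshold, case_shift, True)
            p_ctrl = allele_freq(p, effect, threshold, ctrl_shift, False)
            key = f"{case_pop} cases / {ctrl_pop} controls"
            sizes[key] = required_n(p_case, p_ctrl)
    return sizes


if __name__ == "__main__":
    for scheme, n in sorted(scheme_sizes().items(), key=lambda kv: kv[1]):
        print(f"{scheme}: about {n:.0f} subjects per group")
```

Under these illustrative defaults, the sketch reproduces the qualitative ordering stated in the abstract: nonrisk cases with risk controls need the fewest subjects, followed by the all-nonrisk design, then the all-risk design, with risk cases and nonrisk controls needing the most. The absolute numbers depend entirely on the assumed parameters and should not be read as results from the paper.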

  • Source
    ABSTRACT: Classical population genetics shows that varying permutations of genes and risk factors permit or disallow the effects of causative agents, depending on circumstance. For example, genes and environment determine whether a fox kills black or white rabbits on snow-covered or black-ash-covered islands. Risk-promoting effects are different on each island, but are obscured by meta-analysis or GWAS data from both islands unless the data are partitioned by the different contributory factors. In Alzheimer's disease, the foxes appear to be herpes, borrelia or chlamydial infection, hypercholesterolemia, hyperhomocysteinaemia, diabetes, cerebral hypoperfusion, oestrogen depletion, or vitamin A deficiency, all of which promote beta-amyloid deposition in animal models without the aid of gene variants. All relate to risk factors and subsets of susceptibility genes, which condition their effects. All are less prevalent in convents, where nuns appear less susceptible to the ravages of ageing. Antagonism of the antimicrobial properties of beta-amyloid by Abeta autoantibodies in the ageing population, likely generated by antibodies raised to beta-amyloid/pathogen protein homologues, may play a role in this scenario. These agents are treatable by diet and drugs, vitamin supplementation, pathogen detection and elimination, and autoantibody removal, although again, the beneficial effects of individual treatments may be tempered by genes and environment.
    01/2011; DOI:10.5402/2011/394678
  • Source
    ABSTRACT: With limited funding and biological specimen availability, choosing an optimal sampling design to maximize power for detecting gene-by-environment (G–E) interactions is critical. Exposure-enriched sampling is often used to select subjects with rare exposures for genotyping to enhance power for tests of G–E effects. However, exposure misclassification (MC) combined with biased sampling can affect characteristics of tests for G–E interaction and joint tests for marginal association and G–E interaction. Here, we characterize the impact of exposure-biased sampling under conditions of perfect exposure information and exposure MC on properties of several methods for conducting inference. We assess the Type I error, power, bias, and mean squared error properties of case-only, case–control, and empirical Bayes methods for testing/estimating G–E interaction and a joint test for marginal G (or E) effect and G–E interaction across three biased sampling schemes. Properties are evaluated via empirical simulation studies. With perfect exposure information, exposure-enriched sampling schemes enhance power as compared to random selection of subjects irrespective of exposure prevalence but yield bias in estimation of the G–E interaction and marginal E parameters. Exposure MC modifies the relative performance of sampling designs when compared to the case of perfect exposure information. Those conducting G–E interaction studies should be aware of exposure MC properties and the prevalence of exposure when choosing an ideal sampling scheme and method for characterizing G–E interactions and joint effects.
    European Journal of Epidemiology 06/2014; 30(5). DOI:10.1007/s10654-014-9908-1 · 5.34 Impact Factor