Gene-Environment Interactions in Genome-Wide Association Studies: A Comparative Study of Tests Applied to Empirical Studies of Type 2 Diabetes

Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts, USA.
American Journal of Epidemiology (Impact Factor: 5.23). 12/2011; 175(3):191-202. DOI: 10.1093/aje/kwr368
Source: PubMed


The question of which statistical approach is the most effective for investigating gene-environment (G-E) interactions in the context of genome-wide association studies (GWAS) remains unresolved. By using 2 case-control GWAS (the Nurses' Health Study, 1976-2006, and the Health Professionals Follow-up Study, 1986-2006) of type 2 diabetes, the authors compared 5 tests for interactions: standard logistic regression-based case-control; case-only; semiparametric maximum-likelihood estimation of an empirical-Bayes shrinkage estimator; and 2-stage tests. The authors also compared 2 joint tests of genetic main effects and G-E interaction. Elevated body mass index was the exposure of interest and was modeled as a binary trait to avoid an inflated type I error rate that the authors observed when the main effect of continuous body mass index was misspecified. Although both the case-only and the semiparametric maximum-likelihood estimation approaches assume that the tested markers are independent of exposure in the general population, the authors did not observe any evidence of inflated type I error for these tests in their studies with 2,199 cases and 3,044 controls. Both joint tests detected markers with known marginal effects. Loci with the most significant G-E interactions using the standard, empirical-Bayes, and 2-stage tests were strongly correlated with the exposure among controls. Study findings suggest that methods exploiting G-E independence can be efficient and valid options for investigating G-E interactions in GWAS.
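To make the contrast between the standard case-control test and the case-only test concrete, the following is a minimal sketch for the simplest setting: a binary genotype indicator G (carrier vs. non-carrier) and a binary exposure E, with no covariates. In that setting the interaction coefficient of a saturated logistic model equals the log of the ratio of the G-E odds ratios in cases and controls, while the case-only estimate uses the cases alone and is valid only under G-E independence in the source population (and, strictly, a rare disease). The function names and all cell counts are hypothetical and chosen for illustration; they are not the authors' code or data.

```python
import math

def log_or(n00, n01, n10, n11):
    """Log odds ratio and Wald variance for a 2x2 G-by-E table.

    n_ge is the count with genotype g (0/1) and exposure e (0/1).
    """
    est = math.log((n11 * n00) / (n10 * n01))
    var = 1 / n00 + 1 / n01 + 1 / n10 + 1 / n11
    return est, var

def wald_p(est, var):
    """Two-sided p-value from a normal (Wald) test."""
    z = est / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))

def case_only_test(cases):
    """Interaction estimate from cases alone (assumes G-E independence)."""
    est, var = log_or(*cases)
    return est, var, wald_p(est, var)

def case_control_test(cases, controls):
    """Standard interaction estimate: log OR in cases minus log OR in controls."""
    est_ca, var_ca = log_or(*cases)
    est_co, var_co = log_or(*controls)
    est, var = est_ca - est_co, var_ca + var_co
    return est, var, wald_p(est, var)

# Hypothetical counts, ordered (n00, n01, n10, n11); G and E are
# exactly independent among these made-up controls.
cases = (500, 700, 300, 600)
controls = (1200, 800, 600, 400)

co_est, co_var, co_p = case_only_test(cases)
cc_est, cc_var, cc_p = case_control_test(cases, controls)
print(f"case-only:    log OR = {co_est:.3f}, p = {co_p:.2g}")
print(f"case-control: log OR = {cc_est:.3f}, p = {cc_p:.2g}")
```

Because G and E are independent in the made-up controls, both tests target the same quantity, but the case-only variance omits the four control-cell terms and is therefore smaller. If G and E were correlated in the population, the case-only estimate would absorb that correlation and could yield false positives, which is the concern the abstract addresses.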



Available from: Peter Kraft, Oct 05, 2015
  • Source
    • "To account for cellular heterogeneity in epigenetic association studies, one commonly used approach is to include these white cell counts (or their proportions) as fixed effects in regression models (44). These factors are often included as linear terms in the regression model and can adjust for confounding in most situations, although non-linear effects from cell counts should be considered when examining interactions (45). "
    ABSTRACT: Platform technologies for measurement of CpG methylation at multiple loci across the genome have made ambitious epigenome-wide association studies affordable and practicable. In contrast to genetic studies, which estimate the effects of structural changes in DNA, and transcriptomic studies, which measure genomic outputs, epigenetic studies can access states of regulation of genome function in particular cells and in response to specific stimuli. Although many factors complicate the interpretation of epigenetic variation in human disease, cell-specific methylation patterns and the cellular heterogeneity present in peripheral blood and tissue biopsies are anticipated to cause the most problems. In this review, we suggest that the difficulties may be exaggerated and we explore how cellular heterogeneity may be embraced with appropriate study designs and analytical tools. We further suggest that systematic mapping of the loci influenced by age, sex and genetic polymorphisms will bring important biological insights as well as improved control of epigenome-wide association studies.
    Human Molecular Genetics 06/2014; 23(R1). DOI:10.1093/hmg/ddu284 · 6.39 Impact Factor
  • Source
    • "To facilitate such collaboration, international standards regarding type of biological samples and questionnaire information to be collected as well should be developed. Furthermore, taking into consideration the limited availability of sufficiently sized study cohorts, efforts should be made to develop more efficient study designs and statistical models for studying gene-environment interactions.137,138 "
    ABSTRACT: Over the last few decades, there have been numerous reports of adverse effects on the reproductive health of wildlife and laboratory animals caused by exposure to endocrine disrupting chemicals (EDCs). The increasing trends in human male reproductive disorders and the mounting evidence for causative environmental factors have therefore sparked growing interest in the health threat posed to humans by EDCs, which are substances in our food, environment and consumer items that interfere with hormone action, biosynthesis or metabolism, resulting in disrupted tissue homeostasis or reproductive function. The mechanisms of EDCs involve a wide array of actions and pathways. Examples include the estrogenic, androgenic, thyroid and retinoid pathways, in which the EDCs may act directly as agonists or antagonists, or indirectly via other nuclear receptors. Dioxins and dioxin-like EDCs exert their biological and toxicological actions through activation of the aryl hydrocarbon-receptor, which besides inducing transcription of detoxifying enzymes also regulates transcriptional activity of other nuclear receptors. There is increasing evidence that genetic predispositions may modify the susceptibility to adverse effects of toxic chemicals. In this review, potential consequences of hereditary predisposition and EDCs are discussed, with a special focus on the currently available publications on interactions between dioxin and androgen signaling.
    Asian Journal of Andrology 01/2014; 16(1):89-96. DOI:10.4103/1008-682X.122193 · 2.60 Impact Factor
  • Source
    • "If there are correlation between null SNPs and E, the two plots on the bottom of Figure 3 shows that the power advantage of SBERIA and SBERIA-M is reduced, which is expected because the correlation between null SNPs and E would make the null SNPs more likely to be selected and therefore dilute the interaction signal. It is worth noting, however, that gene-environment correlation in population is relatively rare in real applications [Cornelis et al., 2012]. "
    ABSTRACT: Identification of gene-environment interaction (G × E) is important in understanding the etiology of complex diseases. However, partially due to the lack of power, there have been very few replicated G × E findings compared to the success in marginal association studies. The existing G × E testing methods mainly focus on improving the power for individual markers. In this paper, we took a different strategy and proposed a set-based gene-environment interaction test (SBERIA), which can improve the power by reducing the multiple testing burdens and aggregating signals within a set. The major challenge of the signal aggregation within a set is how to tell signals from noise and how to determine the direction of the signals. SBERIA takes advantage of the established correlation screening for G × E to guide the aggregation of genotypes within a marker set. The correlation screening has been shown to be an efficient way of selecting potential G × E candidate SNPs in case-control studies for complex diseases. Importantly, the correlation screening in case-control combined samples is independent of the interaction test. With this desirable feature, SBERIA maintains the correct type I error level and can be easily implemented in a regular logistic regression setting. We showed that SBERIA had higher power than benchmark methods in various simulation scenarios, both for common and rare variants. We also applied SBERIA to real genome-wide association studies (GWAS) data of 10,729 colorectal cancer cases and 13,328 controls and found evidence of interaction between the set of known colorectal cancer susceptibility loci and smoking.
    Genetic Epidemiology 07/2013; 37(5). DOI:10.1002/gepi.21735 · 2.60 Impact Factor