Methods for investigating gene-environment interactions in candidate pathway and genome-wide association studies.

Department of Preventive Medicine, University of Southern California, Los Angeles, California, 90089-9011, USA.
Annual Review of Public Health (Impact Factor: 6.63). 04/2010; 31:21-36. DOI: 10.1146/annurev.publhealth.012809.103619
Source: PubMed

ABSTRACT Despite the considerable enthusiasm about the yield of novel and replicated discoveries of genetic associations from the new generation of genome-wide association studies (GWAS), the proportion of heritability explained by these discoveries remains small for most complex diseases studied to date. Some of this "dark matter" could be due to gene-environment (G x E) interactions or more complex pathways involving multiple genes and exposures. We review the basic epidemiologic study designs and statistical analysis approaches for studying G x E interactions individually and then consider more comprehensive approaches to studying entire pathways or GWAS data. In addition to the usual issues in genetic association studies, particular care is needed in exposure assessment, and very large sample sizes are required. Although hypothesis-driven, pathway-based approaches and agnostic GWAS approaches are generally viewed as opposite poles, we suggest that the two can be usefully married using hierarchical modeling strategies that exploit external pathway knowledge in mining genome-wide data.
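As a rough illustration of the kind of individual G x E analysis the abstract refers to, the sketch below computes a multiplicative interaction odds ratio from case-control counts, together with the corresponding case-only estimate. The counts, the binary coding of genotype and exposure, and the function names are all hypothetical and are not taken from the review. The case-only estimate matches the case-control one here only because genotype and exposure were made independent among the hypothetical controls.

```python
# Hypothetical counts keyed by (g, e): g = 1 for carriers, e = 1 for exposed.
cases = {(0, 0): 100, (1, 0): 60, (0, 1): 80, (1, 1): 96}
controls = {(0, 0): 200, (1, 0): 100, (0, 1): 100, (1, 1): 50}


def stratum_or(cases, controls, g, e):
    """Odds ratio for the (g, e) group relative to the (0, 0) reference group."""
    return (cases[(g, e)] * controls[(0, 0)]) / (cases[(0, 0)] * controls[(g, e)])


def interaction_or(cases, controls):
    """Multiplicative interaction: OR_11 / (OR_10 * OR_01)."""
    or_11 = stratum_or(cases, controls, 1, 1)
    or_10 = stratum_or(cases, controls, 1, 0)
    or_01 = stratum_or(cases, controls, 0, 1)
    return or_11 / (or_10 * or_01)


def case_only_or(cases):
    """G-E odds ratio among cases only.

    Estimates the same interaction parameter, but only under the assumption
    that G and E are independent in the source population.
    """
    return (cases[(1, 1)] * cases[(0, 0)]) / (cases[(1, 0)] * cases[(0, 1)])


if __name__ == "__main__":
    print("case-control interaction OR:", interaction_or(cases, controls))
    print("case-only interaction OR:", case_only_or(cases))
```

An interaction odds ratio of 1 would indicate that the joint effect of genotype and exposure is simply the product of their separate effects. Values away from 1 indicate departure from a multiplicative model.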


Available from: Duncan C Thomas, Jun 12, 2015
  • ABSTRACT: For many complex diseases, prognosis is of essential importance. It has been shown that, beyond the main effects of genetic (G) and environmental (E) risk factors, gene-environment (G×E) interactions also play a critical role. In practice, prognosis outcome data can be contaminated, and most existing methods are not robust to data contamination. In the literature, it has been shown that even a single contaminated observation can lead to severely biased model estimation. In this study, we describe prognosis using an accelerated failure time (AFT) model. An exponential squared loss is proposed to accommodate possible data contamination. A penalization approach is adopted for regularized estimation and marker selection. The proposed method is realized using an effective coordinate descent (CD) and minorization-maximization (MM) algorithm. Simulation shows that without contamination, the proposed method has performance comparable to or better than the non-robust alternative. With contamination, it outperforms the non-robust alternative and, under certain scenarios, can be superior to the robust method based on quantile regression. The proposed method is applied to the analysis of TCGA (The Cancer Genome Atlas) lung cancer data. It identifies interactions different from those identified by the alternatives. The identified markers have important implications and satisfactory stability.
  • ABSTRACT: After participating in this activity, learners should be better able to: 1. Evaluate current evidence regarding the genetic determinants of depression; 2. Assess findings from studies of gene-environment interaction; 3. Identify challenges to gene discovery in depression. Depression is one of the most prevalent, disabling, and costly mental health conditions in the United States and also worldwide. One promising avenue for preventing depression and informing its clinical treatment lies in uncovering the genetic and environmental determinants of the disorder as well as their interaction (G×E). The overarching goal of this review article is to translate recent findings from studies of genetic association and G×E related to depression, particularly for readers without in-depth knowledge of genetics or genetic methods. The review is organized into three major sections. In the first, we summarize what is currently known about the genetic determinants of depression, focusing on findings from genome-wide association studies (GWAS). In the second section, we review findings from studies of G×E, which seek to simultaneously examine the role of genes and exposure to specific environments or experiences in the etiology of depression. In the third section, we describe the challenges to genetic discovery in depression and promising strategies for future progress.
    Harvard Review of Psychiatry 01/2015; 23(1):1-18. DOI:10.1097/HRP.0000000000000054 · 2.49 Impact Factor
  • ABSTRACT: The recent successes of genome-wide association studies (GWAS) have renewed interest in genome-environment-wide interaction studies (GEWIS) to discover genetic factors that modulate penetrance of environmental exposures to human diseases. Indeed, gene-environment interactions (G × E), which have not been emphasized in the GWAS era, could be a source contributing to the missing heritability, a major bottleneck limiting continuing GWAS successes. In this manuscript, we describe a design and analytic strategy to focus on G × E using only exposed subjects, dubbed e-GEWIS. Operationally, an e-GEWIS analysis is equivalent to a GWAS analysis on exposed subjects only, and it has actually been used in some earlier GWAS without being explicitly identified as such. Through both analytics and simulations, e-GEWIS is shown to have better efficiency than the usual cross-product-based analysis of G × E interaction with both cases and controls (cc-GEWIS), and comparable efficiency to case-only analysis of G × E (c-GEWIS), with potentially smaller sample sizes. The formalization of e-GEWIS here provides a theoretical basis to legitimize this framework for routine investigation of G × E, for more efficient G × E study designs, and for improvement of reproducibility in replicating GEWIS findings. As an illustration, we apply e-GEWIS to a lung cancer GWAS data set to perform a GEWIS, focusing on gene and smoking interaction. The e-GEWIS analysis successfully uncovered positive genetic associations on chromosome 15 among current smokers, suggesting a gene-smoking interaction. Although this signal was detected earlier, the current finding serves as a positive control in support of the e-GEWIS strategy. © 2015 WILEY PERIODICALS, INC.
    Genetic Epidemiology 02/2015; DOI:10.1002/gepi.21890 · 2.95 Impact Factor
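The last abstract above describes an e-GEWIS analysis as operationally equivalent to a GWAS restricted to exposed subjects. A minimal single-marker sketch of that idea follows, assuming a binary carrier genotype and hypothetical counts among exposed subjects only. The Woolf log-odds-ratio test used here is an illustrative stand-in, not necessarily the statistic used by the authors, and the function name is invented for this example.

```python
from math import erfc, log, sqrt


def exposed_only_test(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Genotype-disease association among exposed subjects only.

    Returns the odds ratio, the Woolf z statistic (log OR divided by its
    standard error), and a two-sided normal-approximation p-value.
    """
    odds_ratio = (case_carriers * ctrl_noncarriers) / (case_noncarriers * ctrl_carriers)
    se_log_or = sqrt(
        1 / case_carriers
        + 1 / case_noncarriers
        + 1 / ctrl_carriers
        + 1 / ctrl_noncarriers
    )
    z = log(odds_ratio) / se_log_or
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return odds_ratio, z, p_value


if __name__ == "__main__":
    # Hypothetical counts for one marker among exposed subjects (e.g. current smokers).
    odds_ratio, z, p_value = exposed_only_test(96, 80, 50, 100)
    print(f"OR = {odds_ratio:.2f}, z = {z:.2f}, p = {p_value:.2g}")
```

In a genome-wide setting the same test would be repeated for every marker, with a multiple-testing threshold applied to the resulting p-values.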