Detecting gene-gene interactions that underlie human diseases.

Institute of Human Genetics, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK.
Nature Reviews Genetics (Impact Factor: 39.79). 06/2009; 10(6):392-404. DOI: 10.1038/nrg2579
Source: PubMed

ABSTRACT: Following the identification of several disease-associated polymorphisms by genome-wide association (GWA) analysis, interest is now focusing on the detection of effects that, owing to their interaction with other genetic or environmental factors, might not be identified by using standard single-locus tests. In addition to increasing the power to detect associations, it is hoped that detecting interactions between loci will allow us to elucidate the biological and biochemical pathways that underpin disease. Here I provide a critical survey of the methods and related software packages currently used to detect the interactions between genetic loci that contribute to human genetic disease. I also discuss the difficulties in determining the biological relevance of statistical interactions.

  • ABSTRACT: Detecting complex interactions among risk factors in case-control studies is a fundamental task in clinical and population research. However, though hypothesis testing using logistic regression (LR) is a convenient solution, the LR framework is poorly powered and ill-suited under several common circumstances in practice including missing or unmeasured risk factors, imperfectly correlated "surrogates", and multiple disease sub-types. The weakness of LR in these settings is related to the way in which the null hypothesis is defined. Here we propose the Asymmetric Independence Model (AIM) as a biologically-inspired alternative to LR, based on the key observation that the mechanisms associated with acquiring a "disease" versus maintaining "health" are asymmetric. We prove mathematically that, unlike LR, AIM is a robust model under the abovementioned confounding scenarios. Further, we provide a mathematical definition of a "synergistic" interaction, and prove that theoretically AIM has better power than LR for such interactions. We then experimentally show the superior performance of AIM as compared to LR on both simulations and four real datasets. While the principal application here involves genetic or environmental variables in the life sciences, our methodology is readily applied to other types of measurements and inferences, e.g. in the social sciences.
  • ABSTRACT: With the recent advent of high-throughput genotyping techniques, genetic data for genome-wide association studies (GWAS) have become increasingly available, which entails the development of efficient and effective statistical approaches. Although many such approaches have been developed and used to identify single-nucleotide polymorphisms (SNPs) that are associated with complex traits or diseases, few are able to detect gene-gene interactions among different SNPs. Genetic interactions, also known as epistasis, have been recognized to play a pivotal role in contributing to the genetic variation of phenotypic traits. However, because of an extremely large number of SNP-SNP combinations in GWAS, the model dimensionality can quickly become so overwhelming that no prevailing variable selection methods are capable of handling this problem. In this paper, we present a statistical framework for characterizing main genetic effects and epistatic interactions in a GWAS study. Specifically, we first propose a two-stage sure independence screening (TS-SIS) procedure and generate a pool of candidate SNPs and interactions, which serve as predictors to explain and predict the phenotypes of a complex trait. We also propose a rates adjusted thresholding estimation (RATE) approach to determine the size of the reduced model selected by an independence screening. Regularization regression methods, such as LASSO or SCAD, are then applied to further identify important genetic effects. Simulation studies show that the TS-SIS procedure is computationally efficient and has an outstanding finite sample performance in selecting potential SNPs as well as gene-gene interactions. We apply the proposed framework to analyze an ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select 23 active SNPs and 24 active epistatic interactions for the body mass index variation. It shows the capability of our procedure to resolve the complexity of genetic control.
  • ABSTRACT: We investigated the role of glutathione S-transferase (GST) genes in Autism Spectrum Disorder (ASD). We used data from 111 pairs of age- and sex-matched ASD cases and typically developing (TD) controls between 2 and 8 years of age from Jamaica to investigate the role of GST pi 1 (GSTP1), GST theta 1 (GSTT1), and GST mu 1 (GSTM1) polymorphisms in susceptibility to ASD. In univariable conditional logistic regression models we did not observe significant associations between ASD status and GSTT1, GSTM1, or GSTP1 genotype (all P > 0.15). However, in multivariable conditional logistic regression models, we identified a significant interaction between GSTP1 and GSTT1 in relation to ASD. Specifically, in children heterozygous for the GSTP1 Ile105Val polymorphism, the odds of ASD were significantly higher in those with the null GSTT1 genotype than in those with the other genotypes [matched odds ratio (MOR) = 2.97, 95% CI (1.09, 8.01), P = 0.03]. Replication in other populations is warranted.
    Research in Autism Spectrum Disorders 04/2015; 12. · 2.96 Impact Factor
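
The matched odds ratio quoted in the last abstract above (MOR = 2.97, 95% CI 1.09 to 8.01, P = 0.03) can be checked for internal consistency with a short calculation. The sketch below is not taken from that study. It assumes the interval is a Wald interval, symmetric on the log-odds scale, and the function name `wald_z_and_p` is hypothetical. Under that assumption, the standard error is recovered from the width of the log interval, and the z statistic and two-sided p-value follow from it.

```python
import math


def wald_z_and_p(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate a Wald z statistic and two-sided p-value.

    Assumes (hypothetically) that the reported 95% interval is
    exp(log(OR) +/- z_crit * SE), i.e. a Wald interval on the log scale.
    """
    log_or = math.log(odds_ratio)
    # Width of the log-scale interval equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values quoted in the abstract: MOR = 2.97, 95% CI (1.09, 8.01).
z, p = wald_z_and_p(2.97, 1.09, 8.01)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

If the Wald assumption holds, the recovered p-value rounds to the reported P = 0.03. A clear mismatch would instead suggest that the interval was built another way or that the figures were rounded heavily.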

