Detecting gene-gene interactions that underlie human diseases.

Institute of Human Genetics, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK.
Nature Reviews Genetics (Impact Factor: 39.79). 06/2009; 10(6):392-404. DOI: 10.1038/nrg2579
Source: PubMed

ABSTRACT Following the identification of several disease-associated polymorphisms by genome-wide association (GWA) analysis, interest is now focusing on the detection of effects that, owing to their interaction with other genetic or environmental factors, might not be identified by using standard single-locus tests. In addition to increasing the power to detect associations, it is hoped that detecting interactions between loci will allow us to elucidate the biological and biochemical pathways that underpin disease. Here I provide a critical survey of the methods and related software packages currently used to detect the interactions between genetic loci that contribute to human genetic disease. I also discuss the difficulties in determining the biological relevance of statistical interactions.

  • Source
    ABSTRACT: For many complex diseases, prognosis is of essential importance. It has been shown that, beyond the main effects of genetic (G) and environmental (E) risk factors, the gene-environment (G$\times$E) interactions also play a critical role. In practice, the prognosis outcome data can be contaminated, and most of the existing methods are not robust to data contamination. In the literature, it has been shown that even a single contaminated observation can lead to severely biased model estimation. In this study, we describe prognosis using an accelerated failure time (AFT) model. An exponential squared loss is proposed to accommodate possible data contamination. A penalization approach is adopted for regularized estimation and marker selection. The proposed method is realized using an effective coordinate descent (CD) and minorization maximization (MM) algorithm. Simulation shows that without contamination, the proposed method has performance comparable to or better than the unrobust alternative. With contamination, it outperforms the unrobust alternative and, under certain scenarios, can be superior to the robust method based on quantile regression. The proposed method is applied to the analysis of TCGA (The Cancer Genome Atlas) lung cancer data. It identifies interactions different from those identified by the alternatives. The identified markers have important implications and satisfactory stability.
  • Source
    ABSTRACT: Fifteen SNPs from nine different genes were genotyped on 1379 individuals, 758 T2DM patients and 621 controls, from the city of Hyderabad, India, using the Sequenom MassARRAY platform. These data were analyzed to examine the role of gene–gene and gene–environment interactions in the manifestation of T2DM. The multivariate analysis suggests that the TCF7L2, CDKAL1, IGF2BP2, HHEX and PPARG genes are significantly associated with T2DM, although only the first two of these five were associated in the univariate analysis. Significant gene–gene and gene–environment interactions were also observed for the TCF7L2, CAPN10 and CDKAL1 genes, highlighting their importance in the pathophysiology of T2DM. In the analysis of the cumulative effect of risk alleles, SLC30A8 emerges as a significant contributor to the disease through its presence in all combinations of risk alleles. A striking difference between the risk allele categories 1–4 and 5–6 was evident, with the former showing a protective role and the latter a susceptible role; the latter was characterized by the presence of TCF7L2 and CDKAL1 variants. Overall, the two genes TCF7L2 and CDKAL1 showed strong association with T2DM, either individually or in interaction with the other genes. However, further studies on gene–gene and gene–environment interactions among heterogeneous Indian populations are needed to obtain unequivocal conclusions that are applicable to the Indian population as a whole.
  • Source
    ABSTRACT: Familial Alzheimer's disease (AD), mostly associated with early onset, is caused by mutations in three genes (APP, PSEN1, and PSEN2) involved in the production of the amyloid β peptide. In contrast, the molecular mechanisms that trigger the most common form, late-onset sporadic AD, remain largely unknown. With the implementation of an increasing number of case-control studies and the advent of large-scale genome-wide association studies, there is a mounting list of genetic risk factors, in the form of common genetic variants, associated with sporadic AD. Apart from apolipoprotein E, which shows a strong association with the disease (OR∼4), these genes have moderate or low degrees of association, with OR ranging from 0.88 to 1.23. Taken together, these genes may account for only a fraction of the attributable AD risk; therefore, rare variants and epistatic gene interactions should be taken into account in order to get the full picture of the genetic risks associated with AD. Here, we review recent whole-exome studies looking for rare variants, somatic brain mutations with a strong association to the disease, and several studies dealing with epistasis as additional mechanisms conferring genetic susceptibility to AD. Altogether, recent evidence underlines the importance of defining molecular and genetic pathways and networks, rather than the contribution of specific genes.
    Frontiers in Cellular Neuroscience 04/2015; 9:138. DOI:10.3389/fncel.2015.00138 · 4.18 Impact Factor

