
Cordell, H. J. Detecting gene-gene interactions that underlie human diseases. Nat. Rev. Genet. 10, 392-404 (2009).

Institute of Human Genetics, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK.
Nature Reviews Genetics (Impact Factor: 36.98). 06/2009; 10(6):392-404. DOI: 10.1038/nrg2579
Source: PubMed

ABSTRACT

Following the identification of several disease-associated polymorphisms by genome-wide association (GWA) analysis, interest is now focusing on the detection of effects that, owing to their interaction with other genetic or environmental factors, might not be identified by using standard single-locus tests. In addition to increasing the power to detect associations, it is hoped that detecting interactions between loci will allow us to elucidate the biological and biochemical pathways that underpin disease. Here I provide a critical survey of the methods and related software packages currently used to detect the interactions between genetic loci that contribute to human genetic disease. I also discuss the difficulties in determining the biological relevance of statistical interactions.


Available from: ncbi.nlm.nih.gov
  • Source
    • "At present study, we aimed to develop a powerful score-based test statistic to identify co-association at gene or region level, which essentially captured the effect of covariance matrix between two genes on disease. Various simulation studies were conducted to assess its type I error rate and power, comparing with the commonly-used single SNP-based logistic regression model (SNP-LRT)[17,18,19], principle component analysis (PCA)-based logistic regression model (PCA-LRT)[20], the delta-square (δ²) statistic[16], the CCU statistic[8], the KCCU statistic[11] and the PLSPM-based statistic[10]. Finally, the proposed score-based statistic (SBS) was applied to analyze a rheumatoid arthritis (RA) data from GAW16 Problem 1."
    ABSTRACT: The genetic variants identified by genome-wide association studies (GWAS) account for only a small proportion of the total heritability of complex diseases. The existence of gene-gene joint effects, which comprise the main effects and their co-association, is one possible explanation for the “missing heritability” problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, owing not only to the traditional interaction under near-independence but also to the correlation between genes. Generally, genes tend to work collaboratively within specific pathways or networks that contribute to disease, and disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including the single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.
    Full-text · Article · Dec 2016 · BMC Genetics
  • Source
    • "RF has been widely used for modeling complex joint and interactive associations between response and multiple features[12,32,33,53]. In particular, many nice properties of RF make it an extremely attractive tool for genome studies: the data structure of response and features can be a mixture of categorical and continuous variables; it can nonparametrically incorporate complex nonlinear associations between feature and response; it can implicitly incorporate joint and unknown complex interactions among a large number of features (higher orders or any structure); it is able to handle big data with a large number of features but limited sample size; it can implicitly accommodate highly correlated features; it is less prone to over-fitting; it has good predictive performance even when the majority of features are noise; it is invariant to monotone transformations of the features; it is robust to changes in its tuning parameters; it performs internal estimation of error, so does not need to assess classification performance by cross-validation, and hence greatly reduces computational time[13,32,53,54]. "
    ABSTRACT: Background: Genome-wide association studies (GWAS) interrogate the whole genome at large scale to characterize the complex genetic architecture of biomedical traits. When the number of SNPs dramatically increases to half a million but the sample size is still limited to thousands, the traditional p-value based statistical approaches suffer from unprecedented limitations. Feature screening has proved to be an effective and powerful approach to handle ultrahigh dimensional data statistically, yet it has not received much attention in GWAS. Feature screening reduces the feature space from millions to hundreds by removing non-informative noise. However, the univariate measures used to rank features are mainly based on individual effects, without considering the mutual interactions with other features. In this article, we explore the performance of a random forest (RF) based feature screening procedure to emphasize the SNPs that have complex effects for a continuous phenotype. Results: Both simulation and real data analysis are conducted to examine the power of the forest-based feature screening. We compare it with five other popular feature screening approaches via simulation and conclude that RF can serve as a decent feature screening tool to accommodate complex genetic effects such as nonlinear, interactive, correlative, and joint effects. Unlike the traditional p-value based Manhattan plot, we use the Permutation Variable Importance Measure (PVIM) to display the relative significance and believe that it will provide as much useful information as the traditional plot. Conclusion: Most complex traits are found to be regulated by epistatic and polygenic variants. The forest-based feature screening is proven to be an efficient, easily implemented, and accurate approach to cope with whole-genome data with complex structures. Our explorations should add to the growing body of work extending feature screening to better serve the demands of contemporary genome data.
    Full-text · Article · Dec 2015 · BMC Genetics
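
The idea behind a permutation variable importance measure, as named in the abstract above, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the toy genotype data, the helper names (`mse`, `permutation_importance`) and the fixed `toy_model` are all hypothetical, and a hand-written predictor stands in for a fitted random forest so that the example needs nothing beyond the standard library.

```python
import random

def mse(model, X, y):
    """Mean squared error of a prediction function on rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature, n_repeats=20, seed=0):
    """Average increase in MSE after shuffling one feature column.

    Shuffling breaks the link between that feature and the phenotype
    while leaving the feature's marginal distribution unchanged.
    """
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        X_perm = [row[:feature] + [v] + row[feature + 1:]
                  for row, v in zip(X, column)]
        increases.append(mse(model, X_perm, y) - baseline)
    return sum(increases) / n_repeats

# Hypothetical toy data: three SNPs coded as allele counts 0/1/2.
# The phenotype depends only on the product of SNP 0 and SNP 1
# (a pure interaction); SNP 2 is noise.
data_rng = random.Random(1)
X = [[data_rng.randint(0, 2) for _ in range(3)] for _ in range(200)]
y = [row[0] * row[1] + data_rng.gauss(0, 0.1) for row in X]

def toy_model(row):
    """Stand-in for a fitted model (assumption: not a random forest)."""
    return row[0] * row[1]

importances = [permutation_importance(toy_model, X, y, j) for j in range(3)]
for j, value in enumerate(importances):
    print(f"SNP {j}: mean increase in MSE = {value:.3f}")
```

In this sketch the two interacting SNPs receive a clearly positive importance even though neither has an effect that is independent of the other, while the noise SNP scores zero because the stand-in model never uses it. With a real fitted model the noise score would typically be small rather than exactly zero.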
  • Source
    • "SNP-based methods test for all pairwise interactions between SNPs, while group-based methods detect interactions between groups of SNPs. Regression-based methods[2-8], haplotype-based methods[9-15], machine learning-based methods[16-20] are widely used for epistasis analysis. To date, most genetic analyses of phenotypes have focused on analyzing single traits or, analyzing each phenotype independently[21]."
    ABSTRACT: To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and holds the key to understanding the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate its type I error rates for testing interaction between two genes with multiple phenotypes and to compare its power with multivariate pair-wise interaction analysis and single trait interaction analysis by a single-variate functional regression model. To further evaluate its performance, the MFRG for epistasis analysis is applied to five phenotypes and exome sequence data from the NHLBI Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 136 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.
    Full-text · Article · Dec 2015
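
As a rough illustration of the regression-based pairwise SNP tests mentioned in the excerpt above, one common formulation fits a logistic model with a product term and tests whether its coefficient is zero. The coding of the two SNPs as allele counts $x_A, x_B \in \{0, 1, 2\}$ and the symbols below are assumptions made for this sketch, not notation taken from the cited papers:

```latex
\operatorname{logit} P(D = 1 \mid x_A, x_B)
  = \beta_0 + \beta_A x_A + \beta_B x_B + \beta_{AB}\, x_A x_B,
\qquad H_0 : \beta_{AB} = 0 .
```

Under this coding, a nonzero $\beta_{AB}$ is a departure from additivity on the log-odds scale. That is a statistical statement about the chosen scale and does not by itself establish a biological interaction.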

Questions & Answers about this publication

  • James R. Butler added an answer in Gene-gene Interaction:
    What is the best method to study gene-gene interaction?

    I am looking for the most reliable method to study gene-gene interaction.

    James R. Butler

    Here is a nice review on the topic, from Nature Reviews Genetics.

    Hope this is of help.

    all the best,

     James

      Preview · Article · Jun 2009 · Nature Reviews Genetics