The Ubiquitous Nature of Epistasis in Determining Susceptibility to Common Human Diseases

Program in Human Genetics, Department of Molecular Physiology and Biophysics, Vanderbilt University Medical School, Nashville, TN 37232-0700, USA.
Human Heredity (Impact Factor: 1.47). 02/2003; 56(1-3):73-82. DOI: 10.1159/000073735
Source: PubMed


There is increasing awareness that epistasis, or gene-gene interaction, plays a role in susceptibility to common human diseases. In this paper, we formulate a working hypothesis that epistasis is a ubiquitous component of the genetic architecture of common human diseases and that complex interactions are more important than the independent main effects of any one susceptibility gene. This working hypothesis is based on several bodies of evidence. First, the idea that epistasis is important is not new. In fact, the recognition that deviations from Mendelian ratios are due to interactions between genes has been around for nearly 100 years. Second, the ubiquity of biomolecular interactions in gene regulation and in biochemical and metabolic systems suggests that the relationship between DNA sequence variations and clinical endpoints is likely to involve gene-gene interactions. Third, positive results from studies of single polymorphisms typically do not replicate across independent samples. This is true for both linkage and association studies. Fourth, gene-gene interactions are commonly found when properly investigated. We review each of these points and then review an analytical strategy called multifactor dimensionality reduction for detecting epistasis. We end with ideas of how hypotheses about biological epistasis can be generated from statistical evidence using biochemical systems models. If this working hypothesis is true, it suggests that we need a research strategy for identifying common disease susceptibility genes that embraces, rather than ignores, the complexity of the genotype-to-phenotype relationship.
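The contrast the abstract draws between interactions and independent main effects can be made concrete with a small numerical sketch. The penetrance table below is a hypothetical, textbook-style illustration and is not taken from the paper; the genotype frequencies assume Hardy-Weinberg proportions with allele frequency 0.5 at both loci, and the function name `marginal_penetrance` is introduced here only for illustration. Under these assumptions, disease risk depends strongly on the two-locus genotype, yet every single-locus genotype carries the same marginal risk, so a one-gene-at-a-time analysis would see nothing.

```python
# Hypothetical two-locus penetrance model (illustrative values, not from the paper).
# Rows: genotype at locus A (AA, Aa, aa). Columns: genotype at locus B (BB, Bb, bb).
# Each entry is the assumed probability of disease given that two-locus genotype.
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

# Assumed genotype frequencies at each locus: Hardy-Weinberg with allele frequency 0.5.
GENOTYPE_FREQ = [0.25, 0.5, 0.25]


def marginal_penetrance(table, freqs, locus):
    """Average the table over the *other* locus.

    locus=0 returns one value per genotype at locus A (averaging over B);
    locus=1 returns one value per genotype at locus B (averaging over A).
    """
    size = len(freqs)
    if locus == 0:
        return [sum(freqs[j] * table[i][j] for j in range(size)) for i in range(size)]
    return [sum(freqs[i] * table[i][j] for i in range(size)) for j in range(size)]


if __name__ == "__main__":
    # Joint risks range from 0.0 to 0.1, but the single-locus averages are all equal.
    print("locus A marginals:", marginal_penetrance(PENETRANCE, GENOTYPE_FREQ, 0))
    print("locus B marginals:", marginal_penetrance(PENETRANCE, GENOTYPE_FREQ, 1))
```

Because the marginal risks are identical across genotypes at each locus, neither locus shows a main effect in this toy model; the association appears only when both loci are considered jointly, which is the situation interaction-aware methods are meant to address.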



Available from: Jason H Moore, Jun 10, 2015
    • "Therefore, all these network interactions are likely to involve epistatic effects. Canalization theory thus explains why so many variants only provide small contributions to the phenotype (Moore, 2003)."
    ABSTRACT: During the past decade, findings of genome-wide association studies (GWAS) improved our knowledge and understanding of disease genetics. To date, thousands of SNPs have been associated with diseases and other complex traits. Statistical analysis typically looks for association between a phenotype and a SNP taken individually via single-locus tests. However, geneticists admit this is an oversimplified approach to tackle the complexity of underlying biological mechanisms. Interaction between SNPs, namely epistasis, must be considered. Unfortunately, epistasis detection gives rise to analytic challenges since analyzing every SNP combination is at present impractical at a genome-wide scale. In this review, we will present the main strategies recently proposed to detect epistatic interactions, along with their operating principle. Some of these methods are exhaustive, such as multifactor dimensionality reduction, likelihood ratio-based tests or receiver operating characteristic curve analysis; some are non-exhaustive, such as machine learning techniques (random forests, Bayesian networks) or combinatorial optimization approaches (ant colony optimization, computational evolution system).
    Frontiers in Genetics 09/2015; 6. DOI:10.3389/fgene.2015.00285
    • "Historically, two statistical principles have evolved: one represents genetic model-based methods and one model-free method [27,28]. The model-based approaches are designed to calculate interactions based on linear regression [29]. "
    ABSTRACT: Background: Genetic contributions to major depressive disorder (MDD) are thought to result from multiple genes interacting with each other. Different procedures have been proposed to detect such interactions. Which approach is best for explaining the risk of developing disease is unclear. This study sought to elucidate the genetic interaction landscape in candidate genes for MDD by conducting a SNP-SNP interaction analysis using an exhaustive search through 3,704 SNP markers in 1,732 cases and 1,783 controls provided by the GAIN MDD study. We used three different methods to detect interactions: two logistic regression models (multiplicative and additive) and one data mining and machine learning (MDR) approach. Results: Although none of the interactions survived correction for multiple comparisons, the results provide important information for future genetic interaction studies in complex disorders. Among the 0.5% most significant observations, none had been reported previously for risk to MDD. Within this group of interactions, less than 0.03% would have been detectable based on a main-effect approach or an a priori algorithm. We evaluated correlations among the three different models and conclude that all three algorithms detected the same interactions to a low degree. Although the top interactions had a surprisingly large effect size for MDD (e.g. additive dominant model with uncorrected P = 9.10E-9 and attributable proportion (AP) value = 0.58, and multiplicative recessive model with uncorrected P = 6.95E-5 and odds ratio (OR estimated from β3) value = 4.99), the area under the curve (AUC) estimates were low (< 0.54). Moreover, the population attributable fraction (PAF) estimates were also low (< 0.15). Conclusions: We conclude that the top interactions on their own did not explain much of the genetic variance of MDD. The different statistical interaction methods we used in the present study did not identify the same pairs of interacting markers. Genetic interaction studies may uncover previously unsuspected effects that could provide novel insights into MDD risk, but much larger sample sizes are needed before this strategy can be powerfully applied.
    BioData Mining 09/2014; 7(1):19. DOI:10.1186/1756-0381-7-19 · 2.02 Impact Factor
    • "Alternatively, in some application domains one may wish to model purely epistatic landscapes in which no main effects are present. For example, in complex diseases there may be little if any association between single genes and incidence of disease [27]. Similarly, the electrical grid is explicitly ensured to be stable with respect to the loss of any one component, but interactions between two or more component outages can lead to large cascading failures [28]. "
    ABSTRACT: We propose NM landscapes as a new class of tunably rugged benchmark problems. NM landscapes are well-defined on alphabets of any arity, including both discrete and real-valued alphabets, include epistasis in a natural and transparent manner, are proven to have known value and location of the global maximum and, with some additional constraints, are proven to also have a known global minimum. Empirical studies are used to illustrate that, when coefficients are selected from a recommended distribution, the ruggedness of NM landscapes is smoothly tunable and correlates with several measures of search difficulty. We discuss why these properties make NM landscapes preferable to both NK landscapes and Walsh polynomials as benchmark landscape models with tunable epistasis.
    IEEE Transactions on Evolutionary Computation 09/2014; DOI:10.1109/TEVC.2015.2454857 · 3.65 Impact Factor