Cordell, H. J. Detecting gene-gene interactions that underlie human diseases. Nat. Rev. Genet. 10, 392-404 (2009).

Institute of Human Genetics, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK.
Nature Reviews Genetics (Impact Factor: 36.98). 06/2009; 10(6):392-404. DOI: 10.1038/nrg2579
Source: PubMed


Following the identification of several disease-associated polymorphisms by genome-wide association (GWA) analysis, interest is now focusing on the detection of effects that, owing to their interaction with other genetic or environmental factors, might not be identified by using standard single-locus tests. In addition to increasing the power to detect associations, it is hoped that detecting interactions between loci will allow us to elucidate the biological and biochemical pathways that underpin disease. Here I provide a critical survey of the methods and related software packages currently used to detect the interactions between genetic loci that contribute to human genetic disease. I also discuss the difficulties in determining the biological relevance of statistical interactions.

  • Source
    • "Consequently, a number of algorithms have been developed to detect 2-SNP epistatic interactions in high-throughput GWAS datasets (e.g. [3], [4], [5]). The main goal of these approaches is to find pairs of SNPs whose joint genotype frequencies show a statistically significant difference between cases and controls which potentially explains the effect of the genetic variation leading to disease."
    ABSTRACT: High-throughput genotyping technologies (such as SNP-arrays) allow the rapid collection of up to a few million genetic markers of an individual. Detecting epistasis (based on 2-SNP interactions) in Genome-Wide Association Studies is an important but time-consuming operation since statistical computations have to be performed for each pair of measured markers. Computational methods to detect epistasis therefore suffer from prohibitively long runtimes; e.g. processing a moderately-sized dataset consisting of about 500,000 SNPs and 5,000 samples requires several days using state-of-the-art tools on a standard 3GHz CPU. In this paper we demonstrate how this task can be accelerated using a combination of fine-grained and coarse-grained parallelism on two different computing systems. The first architecture is based on reconfigurable hardware (FPGAs) while the second architecture uses multiple GPUs connected to the same host. We show that both systems can achieve speedups of around four orders-of-magnitude compared to the sequential implementation. This significantly reduces the runtimes for detecting epistasis to only a few minutes for moderately-sized datasets and to a few hours for large-scale datasets.
    IEEE/ACM Transactions on Computational Biology and Bioinformatics 10/2015; 12(5):1-1. DOI:10.1109/TCBB.2015.2389958 · 1.44 Impact Factor
  • Source
    • "On the other hand, while IGF2BP2, HHEX and PPARG that were not independently associated, turned out to be significant in the multivariate context. IRS-1 and CAPN10 that were associated individually failed to show association in the multivariate analysis, both these situations conforming probably to the phenomenon of epistasis (Cordell, 2010; Culverhouse et al., 2002); where as in the first case, the genes with no independent effect, turn out to be significant in the presence of other genes, in case of the latter the potential independent effects of IRS-1 and CAPN10 were probably masked by other genes."
    ABSTRACT: Fifteen SNPs from nine different genes were genotyped on 1379 individuals, 758 T2DM patients and 621 controls, from the city of Hyderabad, India, using Sequenom Massarray platform. These data were analyzed to examine the role of gene–gene and gene–environment interactions in the manifestation of T2DM. The multivariate analysis suggests that TCF7L2, CDKAL1, IGF2BP2, HHEX and PPARG genes are significantly associated with T2DM, albeit only the first two of the above 5 were associated in the univariate analysis. Significant gene–gene and gene–environment interactions were also observed with reference to TCF7L2, CAPN10 and CDKAL1 genes, highlighting their importance in the pathophysiology of T2DM. In the analysis for cumulative effect of risk alleles, SLC30A8 steps in as significant contributor to the disease by its presence in all combinations of risk alleles. A striking difference between risk allele categories, 1–4 and 5–6, was evident in showing protective and susceptible roles, respectively, while the latter was characterized by the presence of TCF7L2 and CDKAL1 variants. Overall, these two genes TCF7L2 and CDKAL1 showed strong association with T2DM, either individually or in interaction with the other genes. However, we need further studies on gene–gene and gene–environment interactions among heterogeneous Indian populations to obtain unequivocal conclusions that are applicable for the Indian population as a whole.
    Meta Gene 09/2015; 5. DOI:10.1016/j.mgene.2015.05.001
  • Source
    • "Previous efforts have already addressed this problem in high-throughput GWAS datasets [4], [5]. Computing epistasis is highly time-consuming due to the large number of pairwise tests to be calculated."
    ABSTRACT: Development of new methods to detect pairwise epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important task in bioinformatics as they can help to explain genetic influences on diseases. As these studies are time-consuming operations, some tools exploit the characteristics of different hardware accelerators (such as GPUs and Xeon Phi coprocessors) to reduce the runtime. Nevertheless, all these approaches are not able to efficiently exploit the whole computational capacity of modern clusters that contain both GPUs and Xeon Phi coprocessors. In this paper we investigate approaches to map pairwise epistatic detection on heterogeneous clusters using both types of accelerators. The runtimes to analyze the well-known WTCCC dataset consisting of about 500K SNPs and 5K samples on one and two NVIDIA K20m are reduced by 27% thanks to the use of a hybrid approach with one additional single Xeon Phi coprocessor.
    IEEE Transactions on Parallel and Distributed Systems 07/2015; DOI:10.1109/TPDS.2015.2460247 · 2.17 Impact Factor