Accelerating epistasis analysis in human genetics with consumer graphics hardware

Computational Genetics Lab, Department of Genetics, Norris-Cotton Cancer Center, Dartmouth Medical School, Lebanon, NH, USA.
BMC Research Notes 08/2009; 2(1):149. DOI: 10.1186/1756-0500-2-149
Source: PubMed


Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high-order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers, Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to economies of scale that make these powerful components readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price-to-performance ratio of available solutions.
We found that running MDR on GPUs consistently increased performance per machine over both a feature-rich Java software package and a C++ cluster implementation. A GPU workstation running the GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. Furthermore, this GPU system provides extremely cost-effective performance while leaving the CPU available for other tasks. The GPU workstation, which contains three GPUs, costs $2000, whereas obtaining similar performance on a Beowulf cluster requires 150 CPU cores which, including the added infrastructure and support costs of the cluster system, cost approximately $82,500.
Computing on graphics hardware provides a cost-effective means to perform genetic analysis of epistasis using MDR on large datasets without the infrastructure of a computing cluster.
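To make the cost of an exhaustive MDR analysis concrete, the following is a minimal CPU-only Python sketch of the core MDR scoring step and a brute-force scan over locus combinations. It is not the authors' GPU, Java, or C++ implementation. The function names (`mdr_balanced_accuracy`, `exhaustive_mdr`), the toy dataset, and the choice of the overall case:control ratio as the high-risk threshold are assumptions made for illustration, and the cross-validation normally used with MDR is omitted.

```python
from collections import Counter
from itertools import combinations


def mdr_balanced_accuracy(genotypes, status, loci):
    """Score one combination of loci with the MDR high-risk/low-risk rule.

    genotypes: list of per-individual genotype tuples (e.g. values 0, 1, 2)
    status:    list of 0 (control) / 1 (case), same length as genotypes
    loci:      tuple of column indices to combine
    Assumes at least one case and one control are present.
    """
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    cases, controls = Counter(), Counter()
    for row, s in zip(genotypes, status):
        cell = tuple(row[i] for i in loci)  # multilocus genotype cell
        if s == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1

    true_pos = true_neg = 0
    for cell in set(cases) | set(controls):
        # High risk if the cell's case:control ratio exceeds the overall
        # ratio (assumed threshold); cross-multiplied to avoid dividing by 0.
        if cases[cell] * n_controls > controls[cell] * n_cases:
            true_pos += cases[cell]
        else:
            true_neg += controls[cell]
    return 0.5 * (true_pos / n_cases + true_neg / n_controls)


def exhaustive_mdr(genotypes, status, order=2):
    """Evaluate every combination of `order` loci and return (score, loci)."""
    n_loci = len(genotypes[0])
    return max(
        (mdr_balanced_accuracy(genotypes, status, loci), loci)
        for loci in combinations(range(n_loci), order)
    )


# Hypothetical toy data: loci 0 and 1 interact (XOR pattern), locus 2 is noise.
toy_genotypes = [
    (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1),
]
toy_status = [0, 0, 1, 1, 1, 1, 0, 0]

print(exhaustive_mdr(toy_genotypes, toy_status, order=2))  # (1.0, (0, 1))
```

Each combination of loci is scored independently of the others, which is what makes the scan a natural fit for parallel hardware. The number of combinations is also what makes it expensive: for one million variants, a pairwise scan alone covers roughly 5 × 10^11 locus pairs.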

Available from: Fabio Cancare
  • Source
    • "However, these implementations are developed for specific and expensive hardware, need consequent deployment efforts and lack of portability. While Graphics Processing Units (GPUs) systems provide cost effective performance compared to supercomputers, several works implemented the MDR method on GPUs [8] [15]. Despite these works achieved good performances, they require the users to well understand the GPU architecture to fully exploit the computation power of the GPU. "

    Full-text · Article · Jan 2012
  • Source
    • "As shown in previous studies [3] [4] [5], parallel GPU computing techniques can substantially improve the performance of genetic analysis tools. Our previously published block-based evolutionary optimization strategy was designed to improve the power and efficiency of gene-gene interaction detection by taking advantage of local linkage disequilibrium structure. "
    ABSTRACT: The analysis of gene-gene interactions related to common complex human diseases is complicated by the increasing scale of genetic association analysis. Concurrent with the advances in genetic technology that led to these large data sets, improvements have been made in parallel computing with graphics processing units (GPUs). The data-intensive nature of genetic association analysis makes this problem particularly suitable for improved computation with the powerful computing resources available in GPUs. In this study, we present a GPU-accelerated discrete optimization strategy to improve the computational efficiency of multi-locus association analysis. We implemented an adaptive evolutionary algorithm that takes advantage of linkage disequilibrium to reduce the need for exhaustive search for combinations of genetic markers. The proposed GPU algorithm was shown to have improved efficiency and equivalent power relative to the CPU version.
    Full-text · Article · Aug 2011 · Conference proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference
  • Source
    • "Recently, several authors have investigated the utility of using graphics processing units (GPUs) for computationally intensive statistical techniques, for example in statistical genetics (Jiang et al., 2009; Manavski and Valle, 2008; Sinnott-Armstrong et al., 2009; Suchard and Rambaut, 2009), and sequential Monte Carlo (Lee et al., 2009). In contrast to conventional parallel computing which employs perhaps a few dozen CPU-based threads, GPUs have sufficient hardware to support thousands or tens of thousands of simultaneous operations. "
    ABSTRACT: Slice sampling provides an easily implemented method for constructing a Markov chain Monte Carlo (MCMC) algorithm. However, slice sampling has two major drawbacks: (i) it requires repeated evaluation of likelihoods for each update, which can make it impractical when evaluations are expensive or as the number of evaluations grows (geometrically) with the dimension of the slice sampler, and (ii) since it can be challenging to construct multivariate updates, the updates are typically univariate, which often results in slow mixing samplers. We propose an approach to multivariate slice sampling that naturally lends itself to a parallel implementation. Our approach takes advantage of recent advances in computer architectures, for instance, the newest generation of graphics cards can execute roughly 30,000 threads simultaneously. We demonstrate that it is possible to construct a multivariate slice sampler that has good mixing properties and is efficient in terms of computing time. The contributions of this article are therefore twofold. We study approaches for constructing a multivariate slice sampler, and we show how parallel computing can be useful for making MCMC algorithms computationally efficient. We study various implementations of our algorithm in the context of real and simulated data.
    Full-text · Article · Jul 2011 · Statistics and Computing