Studying Gene and Gene-Environment Effects of Uncommon and Common Variants on Continuous Traits: A Marker-Set Approach Using Gene-Trait Similarity Regression

Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA.
The American Journal of Human Genetics (Impact Factor: 10.93). 08/2011; 89(2):277-88. DOI: 10.1016/j.ajhg.2011.07.007
Source: PubMed


Genomic association analyses of complex traits demand statistical tools that are capable of detecting small effects of common and rare variants and modeling complex interaction effects and yet are computationally feasible. In this work, we introduce a similarity-based regression method for assessing the main genetic and interaction effects of a group of markers on quantitative traits. The method uses genetic similarity to aggregate information from multiple polymorphic sites and integrates adaptive weights that depend on allele frequencies to accommodate common and uncommon variants. Collapsing information at the similarity level instead of the genotype level avoids canceling signals that have opposite etiological effects and is applicable to any class of genetic variants without the need for dichotomizing the allele types. To assess gene-trait associations, we regress trait similarities for pairs of unrelated individuals on their genetic similarities and assess association by using a score test whose limiting distribution is derived in this work. The proposed regression framework allows for covariates, has the capacity to model both main and interaction effects, can be applied to a mixture of different polymorphism types, and is computationally efficient. These features make it an ideal tool for evaluating associations between phenotype and marker sets defined by linkage disequilibrium (LD) blocks, genes, or pathways in whole-genome analysis.
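
The core idea of the abstract (regressing pairwise trait similarity on pairwise genetic similarity) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's implementation: genotypes are coded 0/1/2 as minor-allele counts, genetic similarity is a weighted identity-by-state score with inverse-frequency weights, trait similarity is the product of covariate-adjusted residuals, and the function names are hypothetical. The sketch returns a plain least-squares slope; the score test and its limiting distribution derived in the paper are not reproduced.

```python
import numpy as np

def weighted_ibs_similarity(G, weights=None):
    """Pairwise genetic similarity from an (n x M) genotype matrix.

    Assumes genotypes are minor-allele counts (0, 1, 2) and that at
    least one marker is polymorphic in the sample.
    """
    G = np.asarray(G, dtype=float)
    q = G.mean(axis=0) / 2.0  # sample allele frequency per marker
    if weights is None:
        # Assumed weight: 1/q up-weights rarer alleles; monomorphic
        # markers (q == 0) carry no information and get weight 0.
        weights = np.where(q > 0, 1.0 / np.where(q > 0, q, 1.0), 0.0)
    # Identity-by-state score per marker: 2 minus the genotype difference.
    ibs = 2.0 - np.abs(G[:, None, :] - G[None, :, :])
    # Weighted average over markers, scaled to lie in [0, 1].
    return (ibs * weights).sum(axis=2) / (2.0 * weights.sum())

def trait_similarity(y, X=None):
    """Pairwise trait similarity as products of covariate-adjusted residuals."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    design = np.ones((n, 1)) if X is None else np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return np.outer(resid, resid)

def pairwise_slope(S, Z):
    """Least-squares slope of Z_ij on S_ij over distinct pairs i < j."""
    iu = np.triu_indices_from(S, k=1)
    s, z = S[iu], Z[iu]
    s_centered = s - s.mean()
    return float((s_centered * (z - z.mean())).sum() / (s_centered ** 2).sum())
```

Because the aggregation happens on similarities rather than on summed genotypes, markers with opposite effect directions both contribute positively to the pairwise signal instead of cancelling. A slope near zero in this sketch would correspond to no detectable gene-trait association, but formal inference requires the paper's score test.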



  • Source
    • "Third, the kernel approach readily allows for inclusion of prior information (such as biological plausibility or association signals from prior association studies) in the form of weights to assist in the formation of the kernel matrix. SNP set methods have proved to be more powerful than univariate testing of main genetic effects [Kwee et al., 2008; Tzeng et al., 2011; Wu et al., 2010] and we anticipate similar trends when considering joint tests of gene and gene-environment effects."
    ABSTRACT: The etiology of complex traits likely involves the effects of genetic and environmental factors, along with complicated interaction effects between them. Consequently, there has been interest in applying genetic association tests of complex traits that account for potential modification of the genetic effect in the presence of an environmental factor. One can perform such an analysis using a joint test of gene and gene-environment interaction. An optimal joint test would be one that remains powerful under a variety of models ranging from those of strong gene-environment interaction effect to those of little or no gene-environment interaction effect. To fill this demand, we have extended a kernel machine based approach for association mapping of multiple SNPs to consider joint tests of gene and gene-environment interaction. The kernel-based approach for joint testing is promising, because it incorporates linkage disequilibrium information from multiple SNPs simultaneously in analysis and permits flexible modeling of interaction effects. Using simulated data, we show that our kernel machine approach typically outperforms the traditional joint test under strong gene-environment interaction models and further outperforms the traditional main-effect association test under models of weak or no gene-environment interaction effects. We illustrate our test using genome-wide association data from the Grady Trauma Project, a cohort of highly traumatized, at-risk individuals, which has previously been investigated for interaction effects. © 2015 WILEY PERIODICALS, INC.
    Genetic Epidemiology 04/2015; 39(5). DOI:10.1002/gepi.21901 · 2.60 Impact Factor
  • Source
    • "We can use the minor allele frequency of marker m, denoted as q_m, to up-weight similarities that are contributed by rare variants. Specifically, one can set a moderate weight, such as w_m = q_m^{-3/4} [Pongpanich, Neely and Tzeng (2012)] or w_m = q_m^{-1} [Tzeng et al. (2011)], to promote similarity attributed by rare alleles, or use a more extreme weight, such as w_m = (1 − q_m)^{24} [Wu et al. (2011)], to target rare variants only. The trait similarity, Z_{ij}, is quantified as follows."
    ABSTRACT: Gene/pathway-based methods are drawing significant attention due to their usefulness in detecting rare and common variants that affect disease susceptibility. The biological mechanism of drug responses indicates that a gene-based analysis has even greater potential in pharmacogenetics. Motivated by a study from the Vitamin Intervention for Stroke Prevention (VISP) trial, we develop a gene-trait similarity regression for survival analysis to assess the effect of a gene or pathway on time-to-event outcomes. The similarity regression has a general framework that covers a range of survival models, such as the proportional hazards model and the proportional odds model. The inference procedure developed under the proportional hazards model is robust against model misspecification. We derive the equivalence between the similarity survival regression and a random effects model, which further unifies the current variance-component based methods. We demonstrate the effectiveness of the proposed method through simulation studies. In addition, we apply the method to the VISP trial data to identify the genes that exhibit an association with the risk of a recurrent stroke. TCN2 gene was found to be associated with the recurrent stroke risk in the low-dose arm. This gene may impact recurrent stroke risk in response to cofactor therapy.
    The Annals of Applied Statistics 06/2014; 8(2):1232-1255. DOI:10.1214/14-AOAS735 · 1.46 Impact Factor
  • Source
    • "In contrast, few methods have been proposed for set-based G × E tests. Tzeng et al. [2011] developed a method to test for interaction between a set of markers and an environment variable by extending the set-based genetic similarity method to the G × E setting. As there is no competing method, they compared the new method with the benchmark minimum P-value method and their method showed favorable performance."
    ABSTRACT: Identification of gene-environment interaction (G × E) is important in understanding the etiology of complex diseases. However, partially due to the lack of power, there have been very few replicated G × E findings compared to the success in marginal association studies. The existing G × E testing methods mainly focus on improving the power for individual markers. In this paper, we took a different strategy and proposed a set-based gene-environment interaction test (SBERIA), which can improve the power by reducing the multiple testing burdens and aggregating signals within a set. The major challenge of the signal aggregation within a set is how to tell signals from noise and how to determine the direction of the signals. SBERIA takes advantage of the established correlation screening for G × E to guide the aggregation of genotypes within a marker set. The correlation screening has been shown to be an efficient way of selecting potential G × E candidate SNPs in case-control studies for complex diseases. Importantly, the correlation screening in case-control combined samples is independent of the interaction test. With this desirable feature, SBERIA maintains the correct type I error level and can be easily implemented in a regular logistic regression setting. We showed that SBERIA had higher power than benchmark methods in various simulation scenarios, both for common and rare variants. We also applied SBERIA to real genome-wide association studies (GWAS) data of 10,729 colorectal cancer cases and 13,328 controls and found evidence of interaction between the set of known colorectal cancer susceptibility loci and smoking.
    Genetic Epidemiology 07/2013; 37(5). DOI:10.1002/gepi.21735 · 2.60 Impact Factor