Integrating pathway analysis and genetics of gene expression for genome-wide association study of basal cell carcinoma.

Clinical Research Program, Department of Dermatology, Harvard Medical School, Brigham and Women's Hospital, Boston, MA 02115, USA.
Human Genetics (Impact Factor: 4.52). 10/2011; 131(4):615-23. DOI: 10.1007/s00439-011-1107-5
Source: PubMed

ABSTRACT Genome-wide association studies (GWASs) have primarily focused on marginal effects for individual markers and have incorporated external functional information only after identifying robust statistical associations. We applied a new approach combining the genetics of gene expression and functional classification of genes to the GWAS of basal cell carcinoma (BCC) to identify potential biological pathways associated with BCC. We first identified 322,324 expression-associated single-nucleotide polymorphisms (eSNPs) from two existing GWASs of global gene expression in lymphoblastoid cell lines (n = 955), and evaluated the association of these functionally annotated SNPs with BCC among 2,045 BCC cases and 6,013 controls in Caucasians. We then grouped them into 99 KEGG pathways for pathway analysis and identified two pathways associated with BCC with p value <0.05 and false discovery rate (FDR) <0.5: the autoimmune thyroid disease pathway (mainly HLA class I and II antigens, p < 0.001, FDR = 0.24) and Janus kinase-signal transducer and activator of transcription (JAK-STAT) signaling pathway (p = 0.02, FDR = 0.49). Seventy-nine (25.7%) out of 307 significant eSNPs in the JAK-STAT pathway were associated with BCC risk (p < 0.05) in an independent replication set of 278 BCC cases and 1,262 controls. In addition, the association of JAK-STAT signaling pathway was marginally validated using 16,691 eSNPs identified from 110 normal skin samples (p = 0.08). Based on the evidence of biological functions of the JAK-STAT pathway on oncogenesis, it is plausible that this pathway is involved in BCC pathogenesis.
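The selection rule quoted in the abstract (p value <0.05 and FDR <0.5 across pathways) can be illustrated with a minimal sketch. The abstract does not say how the FDR was estimated, so the Benjamini-Hochberg adjustment used here is an assumption, and the p-values are hypothetical toy numbers rather than results from the study.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order.

    Assumption: BH is used only for illustration; the abstract does not
    state which FDR estimator the authors applied.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the p-value ranking.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical pathway-level p-values (not taken from the paper).
toy_pvalues = [0.001, 0.02, 0.04, 0.30, 0.75]
q = bh_qvalues(toy_pvalues)

# Apply the two thresholds named in the abstract.
selected = [
    i for i, (p, qv) in enumerate(zip(toy_pvalues, q))
    if p < 0.05 and qv < 0.5
]
```

With 99 pathways tested, an FDR cutoff of 0.5 is lenient, which is consistent with the abstract presenting the two pathways as candidates that were then examined in a replication set.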

  • ABSTRACT: Increasing evidence suggests that single nucleotide polymorphisms (SNPs) associated with complex traits are more likely to be expression quantitative trait loci (eQTLs). Incorporating eQTL information hence has the potential to increase the power of genome-wide association studies (GWAS). In this paper, we propose using eQTL weights as prior information in SNP-based association tests to improve test power while maintaining control of the family-wise error rate (FWER) or the false discovery rate (FDR). We apply the proposed methods to the analysis of a GWAS for childhood asthma consisting of 1296 unrelated individuals with German ancestry. The results confirm that eQTLs are enriched for previously reported asthma SNPs. We also find that some SNPs are not significant using procedures without eQTL weighting, but become significant using eQTL-weighted Bonferroni or Benjamini-Hochberg procedures, while controlling the same FWER or FDR level. Some of these SNPs have been reported by independent studies in recent literature. The results suggest that the eQTL-weighted procedures provide a promising approach for improving the power of GWAS. We also report the results of our methods applied to the large-scale European GABRIEL consortium data.
    Frontiers in Genetics 01/2013; 4:103. DOI:10.3389/fgene.2013.00103
  • ABSTRACT: Genome-wide association studies (GWAS) have rapidly become a powerful tool in genetic studies of complex diseases and traits. Traditionally, single-marker tests have been widely used in GWAS and have uncovered tens of thousands of disease-associated SNPs. Network-assisted analysis (NAA) of GWAS data is an emerging area in which network-related approaches are developed and utilized to perform advanced analyses of GWAS data in order to study various human diseases or traits. Progress has been made in both methodology development and applications of NAA in GWAS data, and it has already been demonstrated that NAA results may enhance our interpretation and prioritization of candidate genes and markers. Inspired by the strong interest in and high demand for advanced GWAS data analysis, in this review article, we discuss the methodologies and strategies that have been reported for the NAA of GWAS data. Many NAA approaches search for subnetworks and assess the combined effects of multiple genes participating in the resultant subnetworks through a gene set analysis. With no restriction to pre-defined canonical pathways, NAA has the advantage of defining subnetworks with the guidance of the GWAS data under investigation. In addition, some NAA methods prioritize genes from GWAS data based on their interconnections in the reference network. Here, we summarize NAA applications to various diseases and discuss the available options and potential caveats related to their practical usage. Additionally, we provide perspectives regarding this rapidly growing research area.
    Human Genetics 10/2013; 133(2). DOI:10.1007/s00439-013-1377-1 · 4.52 Impact Factor
  • ABSTRACT: Genetic association studies have been a popular approach for assessing the association between common single nucleotide polymorphisms (SNPs) and complex diseases. However, other genomic data involved in the mechanism from SNPs to disease, e.g., gene expression, are usually neglected in these association studies. In this paper, we propose to exploit gene expression information to more powerfully test the association between SNPs and diseases by jointly modeling the relations among SNPs, gene expression and diseases. We propose a variance component test for the total effect of SNPs and a gene expression on disease risk. We cast the test within the causal mediation analysis framework with the gene expression as a potential mediator. For eQTL SNPs, the use of gene expression information can enhance power to test for the total effect of a SNP-set on disease risk, which combines the direct effects of the SNPs and their indirect effects mediated through the gene expression. We show that the test statistic under the null hypothesis follows a mixture of χ² distributions, which can be evaluated analytically or empirically using the resampling-based perturbation method. We construct tests for each of three disease models, in which disease risk is determined by SNPs only, by SNPs and gene expression, or by SNPs, gene expression and their interactions. As the true disease model is unknown in practice, we further propose an omnibus test to accommodate different underlying disease models. We evaluate the finite sample performance of the proposed methods using simulation studies, and show that our proposed test performs well and that the omnibus test can almost reach the optimal power attained when the disease model is known and correctly specified. We apply our method to re-analyze the overall effect of the SNP-set and expression of the ORMDL3 gene on the risk of asthma.
    The Annals of Applied Statistics 03/2014; 8(1):352-376. DOI:10.1214/13-AOAS690 · 1.69 Impact Factor
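
The eQTL-weighted Bonferroni and Benjamini-Hochberg procedures named in the first related abstract above can be sketched in their generic weighted form: each p-value is divided by a positive weight, with the weights rescaled to average 1, and the usual threshold is then applied. How that paper derives its weights from eQTL evidence is not described in the abstract, so the weights and p-values below are hypothetical, and the function names are illustrative only.

```python
def _weight_adjusted(pvalues, weights):
    """Divide each p-value by its weight after rescaling weights to mean 1.

    Weights must be strictly positive; a larger weight encodes stronger
    prior support (for example, eQTL evidence) for that SNP.
    """
    mean_w = sum(weights) / len(weights)
    return [p * mean_w / w for p, w in zip(pvalues, weights)]


def weighted_bonferroni(pvalues, weights, alpha=0.05):
    """Reject SNP i when p_i / w_i <= alpha / m (controls the FWER)."""
    adj = _weight_adjusted(pvalues, weights)
    m = len(adj)
    return [a <= alpha / m for a in adj]


def weighted_bh(pvalues, weights, alpha=0.05):
    """Benjamini-Hochberg step-up procedure applied to p_i / w_i."""
    adj = _weight_adjusted(pvalues, weights)
    m = len(adj)
    order = sorted(range(m), key=lambda i: adj[i])
    k = 0  # largest rank whose adjusted p-value passes its threshold
    for rank, i in enumerate(order, start=1):
        if adj[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Hypothetical toy data: the second SNP carries the largest prior weight.
toy_p = [0.004, 0.02, 0.5, 0.8]
toy_w = [1.0, 3.0, 1.0, 1.0]

unweighted = weighted_bonferroni(toy_p, [1.0] * 4)
weighted = weighted_bonferroni(toy_p, toy_w)
weighted_fdr = weighted_bh(toy_p, toy_w)
```

In this toy case the second SNP misses the unweighted Bonferroni threshold but passes the weighted one at the same nominal level, which is the kind of gain that abstract describes. With equal weights both functions reduce to the standard procedures.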
