Pervasive sharing of genetic effects in autoimmune disease.

Center For Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts, United States of America.
PLoS Genetics (Impact Factor: 8.17). 08/2011; 7(8):e1002254. DOI: 10.1371/journal.pgen.1002254
Source: PubMed

ABSTRACT Genome-wide association (GWA) studies have identified numerous replicable genetic associations between common single nucleotide polymorphisms (SNPs) and risk of common autoimmune and inflammatory (immune-mediated) diseases, some of which are shared between two diseases. Along with epidemiological and clinical evidence, this suggests that some genetic risk factors may be shared across diseases, as is the case with alleles in the Major Histocompatibility Locus. In this work we evaluate the extent of this sharing for 107 immune disease-risk SNPs in seven diseases: celiac disease, Crohn's disease, multiple sclerosis, psoriasis, rheumatoid arthritis, systemic lupus erythematosus, and type 1 diabetes. We have developed a novel statistic for Cross Phenotype Meta-Analysis (CPMA) which detects association of a SNP with multiple, but not necessarily all, phenotypes. With it, we find evidence that 47/107 (44%) immune-mediated disease risk SNPs are associated with multiple, but not all, immune-mediated diseases (SNP-wise P(CPMA) < 0.01). We also show that distinct groups of interacting proteins are encoded near SNPs which predispose to the same subsets of diseases; we propose these as the mechanistic basis of shared disease risk. We are thus able to leverage genetic data across diseases to construct biological hypotheses about the underlying mechanism of pathogenesis.

  • ABSTRACT: The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC), only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD) decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies.
    PLoS ONE 01/2014; 9(7):e97282. · 3.53 Impact Factor
  • ABSTRACT: Most complex disease-associated genetic variants are located in non-coding regions and are therefore thought to be regulatory in nature. Association mapping of differential allelic expression (AE) is a powerful method to identify SNPs with direct cis-regulatory impact (cis-rSNPs). We used AE mapping to identify cis-rSNPs regulating gene expression in 55 and 63 HapMap lymphoblastoid cell lines from a Caucasian and an African population, respectively, 70 fibroblast cell lines, and 188 purified monocyte samples and found 40–60% of these cis-rSNPs to be shared across cell types. We uncover a new class of cis-rSNPs, which disrupt footprint-derived de novo motifs that are predominantly bound by repressive factors and are implicated in disease susceptibility through overlaps with GWAS SNPs. Finally, we provide the proof-of-principle for a new approach for genome-wide functional validation of transcription factor–SNP interactions. By perturbing NFκB action in lymphoblasts, we identified 489 cis-regulated transcripts with altered AE after NFκB perturbation. Altogether, we perform a comprehensive analysis of cis-variation in four cell populations and provide new tools for the identification of functional variants associated with complex diseases.
    Molecular Systems Biology 10/2014; 10(10). · 14.10 Impact Factor
  • ABSTRACT: Genome-wide association studies (GWASs) have recently revealed many genetic associations that are shared between different diseases. We propose a method, disPCA, for genome-wide characterization of shared and distinct risk factors between and within disease classes. It flips the conventional GWAS paradigm by analyzing the diseases themselves, across GWAS datasets, to explore their "shared pathogenetics". The method applies principal component analysis (PCA) to gene-level significance scores across all genes and across GWASs, thereby revealing shared pathogenetics between diseases in an unsupervised fashion. Importantly, it adjusts for potential sources of heterogeneity present between GWASs, which can confound investigation of shared disease etiology. We applied disPCA to 31 GWASs, including autoimmune diseases, cancers, psychiatric disorders, and neurological disorders. The leading principal components separate these disease classes, as well as inflammatory bowel diseases from other autoimmune diseases. Generally, distinct diseases from the same class tend to be less separated, which is in line with their increased shared etiology. Enrichment analysis of genes contributing to leading principal components revealed pathways that are implicated in the immune system, while also pointing to pathways that have yet to be explored in this context. Our results point to the potential of disPCA in going beyond epidemiological findings of the co-occurrence of distinct diseases, to highlighting novel genes and pathways that unsupervised learning suggests to be key players in the variability across diseases.
    PLoS Computational Biology 09/2014; 10(9):e1003820. · 4.83 Impact Factor
