ABSTRACT: Despite three decades of successful, predominantly phenotype-driven discovery of the genetic causes of monogenic disorders, up to half of children with severe developmental disorders of probable genetic origin remain without a genetic diagnosis. Particularly challenging are those disorders rare enough to have eluded recognition as a discrete clinical entity, those with highly variable clinical manifestations, and those that are difficult to distinguish from other, very similar, disorders. Here we demonstrate the power of using an unbiased genotype-driven approach to identify subsets of patients with similar disorders. By studying 1,133 children with severe, undiagnosed developmental disorders, and their parents, using a combination of exome sequencing and array-based detection of chromosomal rearrangements, we discovered 12 novel genes associated with developmental disorders. These newly implicated genes increase by 10% (from 28% to 31%) the proportion of children that could be diagnosed. Clustering of missense mutations in six of these newly implicated genes suggests that normal development is being perturbed by an activating or dominant-negative mechanism. Our findings demonstrate the value of adopting a comprehensive strategy, both genome-wide and nationwide, to elucidate the underlying causes of rare genetic disorders.
ABSTRACT: To systematically investigate the impact of immune stimulation upon regulatory variant activity, we exposed primary monocytes from 432 healthy Europeans to interferon-γ (IFN-γ) or differing durations of lipopolysaccharide and mapped expression quantitative trait loci (eQTLs). More than half of cis-eQTLs identified, involving hundreds of genes and associated pathways, are detected specifically in stimulated monocytes. Induced innate immune activity reveals multiple master regulatory trans-eQTLs including the major histocompatibility complex (MHC), coding variants altering enzyme and receptor function, an IFN-β cytokine network showing temporal specificity, and an interferon regulatory factor 2 (IRF2) transcription factor-modulated network. Induced eQTL are significantly enriched for genome-wide association study loci, identifying context-specific associations to putative causal genes including CARD9, ATM, and IRF8. Thus, applying pathophysiologically relevant immune stimuli assists resolution of functional genetic variants.
ABSTRACT: The nuclear phase of the gene expression pathway culminates in the export of mature messenger RNAs (mRNAs) to the cytoplasm through nuclear pore complexes. GANP (germinal-centre associated nuclear protein) promotes the transfer of mRNAs bound to the transport factor NXF1 to nuclear pore complexes. Here, we demonstrate that GANP, a subunit of the TRanscription-EXport-2 (TREX-2) mRNA export complex, promotes selective nuclear export of a specific subset of mRNAs whose transport depends on NXF1. Genome-wide gene expression profiling showed that half of the transcripts whose nuclear export was impaired following NXF1 depletion also showed reduced export when GANP was depleted. GANP-dependent transcripts were highly expressed, yet short-lived, and were highly enriched in those encoding central components of the gene expression machinery such as RNA synthesis and processing factors. After injection into Xenopus oocyte nuclei, representative GANP-dependent transcripts showed faster nuclear export kinetics than representative transcripts that were not influenced by GANP depletion. We propose that GANP promotes the nuclear export of specific classes of mRNAs that may facilitate rapid changes in gene expression.
Nucleic Acids Research 02/2014; 42(8). DOI:10.1093/nar/gku095 · 9.11 Impact Factor
ABSTRACT: Using the ImmunoChip custom genotyping array, we analyzed 14,498 subjects with multiple sclerosis and 24,091 healthy controls for 161,311 autosomal variants and identified 135 potentially associated regions (P < 1.0 × 10⁻⁴). In a replication phase, we combined these data with previous genome-wide association study (GWAS) data from an independent 14,802 subjects with multiple sclerosis and 26,703 healthy controls. In these 80,094 individuals of European ancestry, we identified 48 new susceptibility variants (P < 5.0 × 10⁻⁸), 3 of which we found after conditioning on previously identified variants. Thus, there are now 110 established multiple sclerosis risk variants at 103 discrete loci outside of the major histocompatibility complex. With high-resolution Bayesian fine mapping, we identified five regions where one variant accounted for more than 50% of the posterior probability of association. This study enhances the catalog of multiple sclerosis risk variants and illustrates the value of fine mapping in the resolution of GWAS signals.
ABSTRACT: Combining data from genome-wide association studies (GWAS) conducted at different locations, using genotype imputation and fixed-effects meta-analysis, has been a powerful approach for dissecting complex disease genetics in populations of European ancestry. Here we investigate the feasibility of applying the same approach in Africa, where genetic diversity, both within and between populations, is far more extensive. We analyse genome-wide data from approximately 5,000 individuals with severe malaria and 7,000 population controls from three different locations in Africa. Our results show that the standard approach is well powered to detect known malaria susceptibility loci when sample sizes are large, and that modern methods for association analysis can control the potential confounding effects of population structure. We show that the pattern of association around the haemoglobin S allele differs substantially across populations due to differences in haplotype structure. Motivated by these observations we consider new approaches to association analysis that might prove valuable for multicentre GWAS in Africa: we relax the assumptions of SNP-based fixed effect analysis; we apply Bayesian approaches to allow for heterogeneity in the effect of an allele on risk across studies; and we introduce a region-based test to allow for heterogeneity in the location of causal alleles.
ABSTRACT: We used the Immunochip array to analyze 2,816 individuals with juvenile idiopathic arthritis (JIA), comprising the most common subtypes (oligoarticular and rheumatoid factor-negative polyarticular JIA), and 13,056 controls. We confirmed association of 3 known JIA risk loci (the human leukocyte antigen (HLA) region, PTPN22 and PTPN2) and identified 14 loci reaching genome-wide significance (P < 5 × 10⁻⁸) for the first time. Eleven additional new regions showed suggestive evidence of association with JIA (P < 1 × 10⁻⁶). Dense mapping of loci along with bioinformatics analysis refined the associations to one gene in each of eight regions, highlighting crucial pathways, including the interleukin (IL)-2 pathway, in JIA disease pathogenesis. The entire Immunochip content, the HLA region and the top 27 loci (P < 1 × 10⁻⁶) explain an estimated 18, 13 and 6% of the risk of JIA, respectively. In summary, this is the largest collection of JIA cases investigated so far and provides new insight into the genetic basis of this childhood autoimmune disease.
ABSTRACT: Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
ABSTRACT: To gain further insight into the genetic architecture of psoriasis, we conducted a meta-analysis of 3 genome-wide association studies (GWAS) and 2 independent data sets genotyped on the Immunochip, including 10,588 cases and 22,806 controls. We identified 15 new susceptibility loci, increasing to 36 the number associated with psoriasis in European individuals. We also identified, using conditional analyses, five independent signals within previously known loci. The newly identified loci shared with other autoimmune diseases include candidate genes with roles in regulating T-cell function (such as RUNX3, TAGAP and STAT3). Notably, they included candidate genes whose products are involved in innate host defense, including interferon-mediated antiviral responses (DDX58), macrophage activation (ZC3H12C) and nuclear factor (NF)-κB signaling (CARD14 and CARM1). These results portend a better understanding of shared and distinctive genetic determinants of immune-mediated inflammatory disorders and emphasize the importance of the skin in innate and acquired host defense.
ABSTRACT: Using the Immunochip custom SNP array, which was designed for dense genotyping of 186 loci identified through genome-wide association studies (GWAS), we analyzed 11,475 individuals with rheumatoid arthritis (cases) of European ancestry and 15,870 controls for 129,464 markers. We combined these data in a meta-analysis with GWAS data from additional independent cases (n = 2,363) and controls (n = 17,872). We identified 14 new susceptibility loci, 9 of which were associated with rheumatoid arthritis overall and 5 of which were specifically associated with disease that was positive for anticitrullinated peptide antibodies, bringing the number of confirmed rheumatoid arthritis risk loci in individuals of European ancestry to 46. We refined the peak of association to a single gene for 19 loci, identified secondary independent effects at 6 loci and identified association to low-frequency variants at 4 loci. Bioinformatic analyses generated strong hypotheses for the causal SNP at 7 loci. This study illustrates the advantages of dense SNP mapping analysis to inform subsequent functional investigations.
ABSTRACT: Trimethylation of histone H3 Lys 4 (H3K4me3) is a mark of active and poised promoters. The Set1 complex is responsible for most somatic H3K4me3 and contains the conserved subunit CxxC finger protein 1 (Cfp1), which binds to unmethylated CpGs and links H3K4me3 with CpG islands (CGIs). Here we report that Cfp1 plays unanticipated roles in organizing genome-wide H3K4me3 in embryonic stem cells. Cfp1 deficiency caused two contrasting phenotypes: drastic loss of H3K4me3 at expressed CGI-associated genes, with minimal consequences for transcription, and creation of "ectopic" H3K4me3 peaks at numerous regulatory regions. DNA binding by Cfp1 was dispensable for targeting H3K4me3 to active genes but was required to prevent ectopic H3K4me3 peaks. The presence of ectopic peaks at enhancers often coincided with increased expression of nearby genes. This suggests that CpG targeting prevents "leakage" of H3K4me3 to inappropriate chromatin compartments. Our results demonstrate that Cfp1 is a specificity factor that integrates multiple signals, including promoter CpG content and gene activity, to regulate genome-wide patterns of H3K4me3.
Genes & Development 08/2012; 26(15):1714-28. DOI:10.1101/gad.194209.112 · 10.80 Impact Factor
ABSTRACT: The fish swimbladder is a unique organ in vertebrate evolution that regulates buoyancy in most teleost species. It has long been postulated to be a homolog of the tetrapod lung, but molecular evidence is scarce. To understand the molecular function of the swimbladder and its relationship to the lungs of tetrapods, we carried out transcriptomic analyses of the zebrafish swimbladder by RNA-seq. Gene ontology classification showed that cytoskeleton and endoplasmic reticulum genes were enriched in the swimbladder. Further analyses identified gene sets and pathways closely related to cytoskeleton constitution and regulation, cell adhesion, and extracellular matrix. Several prominent transcription factor genes in the swimbladder, including hoxc4a, hoxc6a, hoxc8a and foxf1, were identified, and their expression in the developing swimbladder during embryogenesis was confirmed. By comparing transcripts enriched in the swimbladder with those enriched in human and mouse lungs, we established that the transcriptome of the zebrafish swimbladder resembles that of mammalian lungs. The transcriptomic data indicate that the predominant functions of the swimbladder lie in its epithelial and muscular tissues. Our comparative analyses also provide molecular evidence of the relatedness of the fish swimbladder and mammalian lung.
PLoS ONE 08/2011; 6(8):e24019. DOI:10.1371/journal.pone.0024019 · 3.23 Impact Factor
ABSTRACT: Coordinated regulation of gene expression is a hallmark of the Plasmodium falciparum asexual blood-stage development cycle. We report that carbon catabolite repressor protein 4 (CCR4)-associated factor 1 (CAF1) is critical in regulating more than 1,000 genes during malaria parasites' intraerythrocytic stages, especially egress and invasion proteins. CAF1 knockout results in mistimed expression, aberrant accumulation and localization of proteins involved in parasite egress, and invasion of new host cells, leading to premature release of predominantly half-finished merozoites, drastically reducing the intraerythrocytic growth rate of the parasite. This study demonstrates that CAF1 of the CCR4-Not complex is a significant gene regulatory mechanism needed for Plasmodium development within the human host.
ABSTRACT: Female mammals produce milk to feed their newborn offspring before teeth develop and permit the consumption of solid food. Intestinal enterocytes dramatically alter their biochemical signature during the suckling-to-weaning transition. The transcriptional repressor Blimp1 is strongly expressed in immature enterocytes in utero, but these are gradually replaced by Blimp1(-) crypt-derived adult enterocytes. Here we used a conditional inactivation strategy to eliminate Blimp1 function in the developing intestinal epithelium. There was no noticeable effect on gross morphology or formation of mature cell types before birth. However, survival of mutant neonates was severely compromised. Transcriptional profiling experiments reveal global changes in gene expression patterns. Key components of the adult enterocyte biochemical signature were substantially and prematurely activated. In contrast, those required for processing maternal milk were markedly reduced. Thus, we conclude Blimp1 governs the developmental switch responsible for postnatal intestinal maturation.
Proceedings of the National Academy of Sciences 06/2011; 108(26):10585-90. DOI:10.1073/pnas.1105852108 · 9.67 Impact Factor
ABSTRACT: Human and mouse genomes contain a similar number of CpG islands (CGIs), which are discrete CpG-rich DNA sequences associated with transcription start sites. In both species, ∼50% of all CGIs are remote from annotated promoters but, nevertheless, often have promoter-like features. To determine the role of CGI methylation in cell differentiation, we analyzed DNA methylation at a comprehensive CGI set in cells of the mouse hematopoietic lineage. Using a method that potentially detects ∼33% of genomic CpGs in the methylated state, we found that large differences in gene expression were accompanied by surprisingly few DNA methylation changes. There were, however, many DNA methylation differences between hematopoietic cells and a distantly related tissue, brain. Altered DNA methylation in the immune system occurred predominantly at CGIs within gene bodies, which have the properties of cell type-restricted promoters, but infrequently at annotated gene promoters or CGI flanking sequences (CGI "shores"). Unexpectedly, elevated intragenic CGI methylation correlated with silencing of the associated gene. Differentially methylated intragenic CGIs tended to lack H3K4me3 and associate with a transcriptionally repressive environment regardless of methylation state. Our results indicate that DNA methylation changes play a relatively minor role in the late stages of differentiation and suggest that intragenic CGIs represent regulatory sites of differential gene expression during the early stages of lineage specification.
Genome Research 05/2011; 21(7):1074-86. DOI:10.1101/gr.118703.110 · 14.63 Impact Factor
ABSTRACT: CpG islands (CGIs) are vertebrate genomic landmarks that encompass the promoters of most genes and often lack DNA methylation. Querying their apparent importance, the number of CGIs is reported to vary widely in different species and many do not co-localise with annotated promoters. We set out to quantify the number of CGIs in mouse and human genomes using CXXC Affinity Purification plus deep sequencing (CAP-seq). We also asked whether CGIs not associated with annotated transcripts share properties with those at known promoters. We found that, contrary to previous estimates, CGI abundance in humans and mice is very similar and many are at conserved locations relative to genes. In each species CpG density correlates positively with the degree of H3K4 trimethylation, supporting the hypothesis that these two properties are mechanistically interdependent. Approximately half of mammalian CGIs (>10,000) are "orphans" that are not associated with annotated promoters. Many orphan CGIs show evidence of transcriptional initiation and dynamic expression during development. Unlike CGIs at known promoters, orphan CGIs are frequently subject to DNA methylation during development, and this is accompanied by loss of their active promoter features. In colorectal tumors, however, orphan CGIs are not preferentially methylated, suggesting that cancer does not recapitulate a developmental program. Human and mouse genomes have similar numbers of CGIs, over half of which are remote from known promoters. Orphan CGIs nevertheless have the characteristics of functional promoters, though they are much more likely than promoter CGIs to become methylated during development and hence lose these properties. The data indicate that orphan CGIs correspond to previously undetected promoters whose transcriptional activity may play a functional role during development.
ABSTRACT: It has recently been shown that nucleosome distribution, histone modifications and RNA polymerase II (Pol II) occupancy show preferential association with exons ("exon-intron marking"), linking chromatin structure and function to co-transcriptional splicing in a variety of eukaryotes. Previous ChIP-sequencing studies suggested that these marking patterns reflect the nucleosomal landscape. By analyzing ChIP-chip datasets across the human genome in three cell types, we have found that this marking system is far more complex than previously observed. We show here that a range of histone modifications and Pol II are preferentially associated with exons. However, there is noticeable cell-type specificity in the degree of exon marking by histone modifications and, surprisingly, this is also reflected in some histone modification patterns showing biases towards introns. Exon-intron marking is laid down in the absence of transcription on silent genes, with some marking biases changing or becoming reversed for genes expressed at different levels. Furthermore, the relationship of this marking system with splicing is not simple, with only some histone modifications reflecting exon usage/inclusion, while others mirror patterns of exon exclusion. By examining nucleosomal distributions in all three cell types, we demonstrate that these histone modification patterns cannot solely be accounted for by differences in nucleosome levels between exons and introns. In addition, because of inherent differences between ChIP-chip array and ChIP-sequencing approaches, these platforms report different nucleosome distribution patterns across the human genome. Our findings confound existing views and point to active cellular mechanisms which dynamically regulate histone modification levels and account for exon-intron marking. We believe that these histone modification patterns provide links between chromatin accessibility, Pol II movement and co-transcriptional splicing.
PLoS ONE 08/2010; 5(8):e12339. DOI:10.1371/journal.pone.0012339 · 3.23 Impact Factor
ABSTRACT: The SCL (TAL1) transcription factor is a critical regulator of haematopoiesis and its expression is tightly controlled by multiple cis-acting regulatory elements. To elaborate further the DNA elements which control its regulation, we used genomic tiling microarrays covering 256 kb of the human SCL locus to perform a concerted analysis of chromatin structure and binding of regulatory proteins in human haematopoietic cell lines. This approach allowed us to characterise further or redefine known human SCL regulatory elements and led to the identification of six novel elements with putative regulatory function both upstream and downstream of the SCL gene. They bind a number of haematopoietic transcription factors (GATA1, E2A, LMO2, SCL, LDB1), CTCF or components of the transcriptional machinery and are associated with relevant histone modifications, accessible chromatin and low nucleosomal density. Functional characterisation shows that these novel elements are able to enhance or repress SCL promoter activity, have endogenous promoter function or enhancer-blocking insulator function. Our analysis opens up several areas for further investigation and adds new layers of complexity to our understanding of the regulation of SCL expression.
PLoS ONE 02/2010; 5(2):e9059. DOI:10.1371/journal.pone.0009059 · 3.23 Impact Factor