ABSTRACT: Enhancers control the correct temporal and cell-type-specific activation of gene expression in multicellular eukaryotes. Knowing their properties, regulatory activity and targets is crucial to understand the regulation of differentiation and homeostasis. Here we use the FANTOM5 panel of samples, covering the majority of human tissues and cell types, to produce an atlas of active, in vivo-transcribed enhancers. We show that enhancers share properties with CpG-poor messenger RNA promoters but produce bidirectional, exosome-sensitive, relatively short unspliced RNAs, the generation of which is strongly related to enhancer activity. The atlas is used to compare regulatory programs between different cells at unprecedented depth, to identify disease-associated regulatory single nucleotide polymorphisms, and to classify cell-type-specific and ubiquitous enhancers. We further explore the utility of enhancer redundancy, which explains gene expression strength rather than expression patterns. The online FANTOM5 enhancer atlas represents a unique resource for studies on cell-type-specific enhancers and gene regulation.
ABSTRACT: Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
ABSTRACT: Histone modifications play an important role in gene regulation. Acetylation of histone 3 lysine 9 (H3K9ac) is generally associated with transcription initiation and unfolded chromatin, thereby positively influencing gene expression. Deep sequencing of the 5' ends of gene transcripts using DeepCAGE delivers detailed information about the architecture and expression level of gene promoters. The combination of H3K9ac ChIP-chip and DeepCAGE in a myeloid leukemia cell line (THP-1) allowed us to study the spatial distribution of H3K9ac around promoters using a novel clustering approach. The promoter classes were analyzed for association with relevant genomic sequence features.
We clustered 4,481 promoters according to their surrounding H3K9ac signal and analyzed the clustered promoters for association with different sequence features. The clustering revealed three groups with the major H3K9ac signal upstream of, centered on and downstream of the promoter. Narrow single-peak promoters tend to have H3K9ac activity concentrated in the upstream region, while broad promoters tend to have H3K9ac activity and RNA polymerase II binding concentrated in the centered and downstream regions. A subset of promoters with a high gene expression level, compared to subsets with low and medium gene expression, shows a dramatic increase in H3K9ac activity in the upstream cluster only; this may indicate that promoters in the centered and downstream clusters are predominantly regulated at post-initiation steps. Furthermore, the upstream cluster is depleted in CpG islands and more likely to regulate un-annotated genes.
Clustering core promoters according to their surrounding acetylation signal is a promising approach for the study of histone modifications. When examining promoters clustered into groups according to their surrounding H3K9 acetylation signal, we find that the relative localization and intensity of H3K9ac depend strongly on characteristic sequence features of the promoter. Experimental data from DeepCAGE and ChIP-chip experiments using undifferentiated (monocyte) and differentiated (macrophage) THP-1 cells lead us to the same conclusions.
ABSTRACT: Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions. Analysis of the networks reveals that highly connected TFs are broadly expressed across tissues, and that roughly half of the measured interactions are conserved between mouse and human. The data highlight the importance of TF combinations for determining cell fate, and they lead to the identification of a SMAD3/FLI1 complex expressed during development of immunity. The availability of large TF combinatorial networks in both human and mouse will provide many opportunities to study gene regulation, tissue differentiation, and mammalian evolution.
ABSTRACT: It has been reported that relatively short RNAs of heterogeneous sizes are derived from sequences near the promoters of eukaryotic genes. In conjunction with the FANTOM4 project, we have identified tiny RNAs with a modal length of 18 nt that map within -60 to +120 nt of transcription start sites (TSSs) in human, chicken and Drosophila. These transcription initiation RNAs (tiRNAs) are derived from sequences on the same strand as the TSS and are preferentially associated with G+C-rich promoters. The 5′ ends of tiRNAs show peak density 10-30 nt downstream of TSSs, indicating that they are processed. tiRNAs are generally, although not exclusively, associated with highly expressed transcripts and sites of RNA polymerase II binding. We suggest that tiRNAs may be a general feature of transcription in metazoa and possibly all eukaryotes.
ABSTRACT: Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network. Our results indicate that cellular states are constrained by complex networks involving both positive and negative regulatory interactions among substantial numbers of transcription factors and that no single transcription factor is both necessary and sufficient to drive the differentiation process.
ABSTRACT: Although repetitive elements pervade mammalian genomes, their overall contribution to transcriptional activity is poorly defined. Here, as part of the FANTOM4 project, we report that 6-30% of cap-selected mouse and human RNA transcripts initiate within repetitive elements. Analysis of approximately 250,000 retrotransposon-derived transcription start sites shows that the associated transcripts are generally tissue specific, coincide with gene-dense regions and form pronounced clusters when aligned to full-length retrotransposon sequences. Retrotransposons located immediately 5' of protein-coding loci frequently function as alternative promoters and/or express noncoding RNAs. More than a quarter of RefSeqs possess a retrotransposon in their 3' UTR, with strong evidence for the reduced expression of these transcripts relative to retrotransposon-free transcripts. Finally, a genome-wide screen identifies 23,000 candidate regulatory regions derived from retrotransposons, in addition to more than 2,000 examples of bidirectional transcription. We conclude that retrotransposon transcription has a key influence upon the transcriptional output of the mammalian genome.
ABSTRACT: Finding and characterizing mRNAs, their transcription start sites (TSS), and their associated promoters is a major focus in post-genome biology. Mammalian cells have at least 5-10 magnitudes more TSS than previously believed, and deeper sequencing is necessary to detect all active promoters in a given tissue. Here, we present DeepCAGE, a new method for high-throughput sequencing of 5' cDNA tags that merges the Cap Analysis of Gene Expression method with ultra-high-throughput sequencing technology. We apply DeepCAGE to characterize 1.4 million sequenced TSS from mouse hippocampus and reveal a wealth of novel core promoters that are preferentially used in hippocampus; this is the most comprehensive promoter data set for any tissue to date. Using these data, we present evidence indicating a key role for the Arnt2 transcription factor in hippocampus gene regulation. DeepCAGE can also detect promoters used only in a small subset of cells within the complex tissue.
ABSTRACT: Sequences encoding specific domains which may have been excluded from FL-cDNA clones. The figures were modified from those found at TIGR OSA1. (a) AK058349, in which the sequence encoding the AP2 domain (IPR001471) was excluded; (b) AK062487, in which the sequence encoding the MYB domain (IPR001005) was excluded.
(8.85 MB DOC)
ABSTRACT: Signal intensity data for the five samples. (a) NB: “Nipponbare”; IR: IR64. (b) no-signal: number of loci at which the signal intensity was not significantly higher than the background intensity.
(0.02 MB XLS)