Landscape of transcription in human cells

Centre for Genomic Regulation and UPF, Doctor Aiguader 88, Barcelona 08003, Catalonia, Spain.
Nature (Impact Factor: 41.46). 09/2012; 489(7414):101-8. DOI: 10.1038/nature11233
Source: PubMed


Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

Available from: Igor Antoshechkin, Feb 17, 2014
    • "We found multiple PubMed references for most protein-coding genes; in contrast, most non-coding RNAs did not have any reference in PubMed [...] coding transcripts, which are predominantly cytosolic, noncoding RNA tends to be localized to the nucleus (Djebali et al. 2012), where the transcriptome is particular complex (Cheng et al. 2005, Kapranov et al. 2007, Fort et al. 2014). On average, lncRNAs have lower expression levels than protein-coding transcripts, with more than 80 % of lncRNAs detected in the ENCODE cell lines present at less than 1 copy per cell (Djebali et al. 2012). We note though that the expression varies by orders of magnitude levels between non-coding RNAs; as an example, the lncRNA MALAT1 was one of the most abundant RNAs across the FANTOM5 samples (Forrest et al. 2014). "
    ABSTRACT: Big leaps in science happen when scientists from different backgrounds interact. In the past 15 years, the FANTOM Consortium has brought together scientists from different fields to analyze and interpret genomic data produced with novel technologies, including mouse full-length cDNAs and, more recently, expression profiling at single-nucleotide resolution by cap-analysis gene expression. The FANTOM Consortium has provided the most comprehensive mouse cDNA collection for functional studies and extensive maps of the human and mouse transcriptome comprising promoters, enhancers, as well as the network of their regulatory interactions. More importantly, serendipitous observations of the FANTOM dataset led us to realize that the mammalian genome is pervasively transcribed, even from retrotransposon elements, which were previously considered junk DNA. The majority of products from the mammalian genome are long non-coding RNAs (lncRNAs), including sense-antisense, intergenic, and enhancer RNAs. While the biological function has been elucidated for some lncRNAs, more than 98 % of them remain without a known function. We argue that large-scale studies are urgently needed to address the functional role of lncRNAs.
    M. de Hoon et al., "Paradigm shifts in genomics through the FANTOM projects". Mammalian Genome 08/2015; 26(9). DOI:10.1007/s00335-015-9593-8 · 3.07 Impact Factor
    • "Mammalian genomes are more extensively transcribed than expected, giving rise to thousands of long non-coding RNAs (lncRNAs), which are defined as RNA transcripts non-coding for protein and longer than 200 nt (Bertone et al., 2004; Birney et al., 2007; Carninci et al., 2005; Cheng et al., 2005; Djebali et al., 2012; Kapranov et al., 2007; Yelin et al., 2003). Among lncRNAs, NATs have emerged as a large class of regulatory long ncRNAs (Faghihi and Wahlestedt, 2009; Magistri et al., 2012). "
    ABSTRACT: Long non-coding RNAs (lncRNAs), including natural antisense transcripts (NATs), are expressed more extensively than previously anticipated and have widespread roles in regulating gene expression. Nevertheless, the molecular mechanisms of action of the majority of NATs remain largely unknown. Here, we identify a NAT of low-density lipoprotein receptor-related protein 1 (Lrp1), referred to as Lrp1-AS, that negatively regulates Lrp1 expression. We show that Lrp1-AS directly binds to high-mobility group box 2 (Hmgb2) and inhibits the activity of Hmgb2 to enhance Srebp1a-dependent transcription of Lrp1. Short oligonucleotides targeting Lrp1-AS inhibit the interaction of antisense transcript and Hmgb2 protein and increase Lrp1 expression by enhancing Hmgb2 activity. Quantitative RT-PCR analysis of brain tissue samples from Alzheimer's disease patients and aged-matched controls revealed upregulation of LRP1-AS and downregulation of LRP1. Our data suggest a regulatory mechanism whereby a NAT interacts with a ubiquitous chromatin-associated protein to modulate its activity in a locus-specific fashion. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
    Cell Reports 04/2015; 11(6). DOI:10.1016/j.celrep.2015.04.011 · 8.36 Impact Factor
    • "Thus, it has been speculated, that ncRNAs might regulate development and consequently also brain function in humans (Mattick 2011). While ∼75% of the human genome is transcribed into RNA, the majority of these RNA transcripts lack protein-coding potential (Djebali et al. 2012) and thus might represent regulatory ncRNAs (Birney et al. 2007; Washietl et al. 2007). In the past, various ncRNA species have been shown to exhibit essential functions in the regulation of gene expression, thereby also playing key roles in neural development, neural plasticity, and brain aging (Mattick 2011). "
    ABSTRACT: We have generated a novel, neuro-specific ncRNA microarray, covering 1472 ncRNA species, to investigate their expression in different mouse models for central nervous system diseases. Thereby, we analyzed ncRNA expression in two mouse models with impaired calcium channel activity, implicated in Epilepsy or Parkinson's disease, respectively, as well as in a mouse model mimicking pathophysiological aspects of Alzheimer's disease. We identified well over a hundred differentially expressed ncRNAs, either from known classes of ncRNAs, such as miRNAs or snoRNAs or which represented entirely novel ncRNA species. Several differentially expressed ncRNAs in the calcium channel mouse models were assigned as miRNAs and target genes involved in calcium signaling, thus suggesting feedback regulation of miRNAs by calcium signaling. In the Alzheimer mouse model, we identified two snoRNAs, whose expression was deregulated prior to amyloid plaque formation. Interestingly, the presence of snoRNAs could be detected in cerebral spine fluid samples in humans, thus potentially serving as early diagnostic markers for Alzheimer's disease. In addition to known ncRNAs species, we also identified 63 differentially expressed, entirely novel ncRNA candidates, located in intronic or intergenic regions of the mouse genome, genomic locations, which previously have been shown to harbor the majority of functional ncRNAs.
    RNA 10/2014; 20(12). DOI:10.1261/rna.047225.114 · 4.94 Impact Factor