Article

Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs.

Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.
Nature Biotechnology (Impact Factor: 39.08). 05/2010; 28(5):503-10. DOI: 10.1038/nbt.1633
Source: PubMed

ABSTRACT Massively parallel cDNA sequencing (RNA-Seq) provides an unbiased way to study a transcriptome, including both coding and noncoding genes. Until now, most RNA-Seq studies have depended crucially on existing annotations and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We applied it to mouse embryonic stem cells, neuronal precursor cells and lung fibroblasts to accurately reconstruct the full-length gene structures for most known expressed genes. We identified substantial variation in protein coding genes, including thousands of novel 5' start sites, 3' ends and internal coding exons. We then determined the gene structures of more than a thousand large intergenic noncoding RNA (lincRNA) and antisense loci. Our results open the way to direct experimental manipulation of thousands of noncoding RNAs and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes.

  • ABSTRACT: RNA-seq studies play an important role in both large-scale analysis of gene expression and transcriptome reconstruction. However, the lack of software specifically developed for analyzing transcriptome structure in lower eukaryotes has so far limited comparative studies among different species and strains. To fill this gap, software called ORA (Overlapped Reads Assembler) was developed. It allows a simple and reliable analysis of the transcriptome structure in organisms with a low number of introns. It can also determine the size and position of untranslated regions (UTRs) and of polycistronic transcripts. As a case study, we analyzed the transcriptional landscape of six S. cerevisiae strains at two key steps of the fermentation process. This comparative analysis revealed differences in the UTRs of transcripts. By extending the transcriptome analysis to yeast species belonging to the Saccharomyces genus, it was possible to examine the conservation level of unknown non-coding RNAs and their putative functional role. Comparing the results obtained using ORA with previous studies and with the transcriptome structure determined with other software showed that ORA is highly reliable. The results obtained from the training set made it possible to detect transcripts with variable UTRs between S. cerevisiae strains. Finally, we propose a regulatory role for some non-coding transcripts conserved within the Saccharomyces genus and localized on the strand antisense to genes involved in meiosis and cell wall biosynthesis.
    BMC Genomics 12/2014; 15(1):1045. · 4.04 Impact Factor
  • Conference Paper: SpliceGrapherXT
    the International Conference; 01/2007
  • ABSTRACT: Background: Long noncoding RNAs (lncRNAs) play important roles in a wide range of biological processes in mammals and plants. However, the systematic examination of lncRNAs in plants lags behind that in mammals. Recently, lncRNAs have been identified in Arabidopsis and wheat; however, no systematic screening of potential lncRNAs has been reported for the rice genome. Results: In this study, we perform whole transcriptome strand-specific RNA sequencing (ssRNA-seq) of samples from rice anthers, pistils, and seeds 5 days after pollination and from shoots 14 days after germination. Using these data, together with 40 available rice RNA-seq datasets, we systematically analyze rice lncRNAs and definitively identify lncRNAs that are involved in the reproductive process. The results show that rice lncRNAs have some different characteristics compared to those of Arabidopsis and mammals and are expressed in a highly tissue-specific or stage-specific manner. We further verify the functions of a set of lncRNAs that are preferentially expressed in reproductive stages and identify several lncRNAs as competing endogenous RNAs (ceRNAs), which sequester miR160 or miR164 in a type of target mimicry. More importantly, one lncRNA, XLOC_057324, is demonstrated to play a role in panicle development and fertility. We also develop a source of rice lncRNA-associated insertional mutants. Conclusions: Genome-wide screening and functional analysis enabled the identification of a set of lncRNAs that are involved in the sexual reproduction of rice. The results also provide a source of lncRNAs and associated insertional mutants in rice.
    Genome Biology 12/2014; · 10.30 Impact Factor
