Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs.

Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.
Nature Biotechnology (Impact Factor: 39.08). 05/2010; 28(5):503-10. DOI: 10.1038/nbt.1633
Source: PubMed

ABSTRACT Massively parallel cDNA sequencing (RNA-Seq) provides an unbiased way to study a transcriptome, including both coding and noncoding genes. Until now, most RNA-Seq studies have depended crucially on existing annotations and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We applied it to mouse embryonic stem cells, neuronal precursor cells and lung fibroblasts to accurately reconstruct the full-length gene structures for most known expressed genes. We identified substantial variation in protein coding genes, including thousands of novel 5' start sites, 3' ends and internal coding exons. We then determined the gene structures of more than a thousand large intergenic noncoding RNA (lincRNA) and antisense loci. Our results open the way to direct experimental manipulation of thousands of noncoding RNAs and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes.

Available from: Manuel Garber, Jun 30, 2015
  • Source
    ABSTRACT: Changes in exon-intron structures and splicing patterns represent an important mechanism for the evolution of gene functions and species-specific regulatory networks. While exon creation is widespread during primate and human evolution and has been studied extensively, much less is known about the scope and potential impact of human-specific exon loss events. Historically, transcriptome data and exon annotations are significantly biased towards humans over nonhuman primates. This ascertainment bias makes it challenging to discover human-specific exon loss events. We carried out a transcriptome-wide search of human-specific exon loss events, by taking advantage of RNA-seq as a powerful and unbiased tool for exon discovery and annotation. Using RNA-seq data of humans, chimpanzees, and other primates, we reconstructed and compared transcript structures across the primate phylogeny. We discovered 33 candidate human-specific exon loss events, among which 6 exons passed stringent experimental filters for the complete loss of splicing activities in diverse human tissues. These events may result from human-specific deletion of genomic DNA, or small-scale sequence changes that inactivated splicing signals. The impact of human-specific exon loss events is predominantly regulatory. Three of the 6 events occurred in the 5'-UTR and affected cis regulatory elements of mRNA translation. In SLC7A6, a gene encoding an amino acid transporter, luciferase reporter assays showed that both a human-specific exon loss event and an independent human-specific single nucleotide substitution in the 5'-UTR increased mRNA translational efficiency. Our study provides novel insights into the molecular mechanisms and evolutionary consequences of exon loss during human evolution.
    Molecular Biology and Evolution 11/2014; 32(2). DOI:10.1093/molbev/msu317 · 14.31 Impact Factor
  • Source
    ABSTRACT: The current annotation of the human genome includes more than 12,000 long intergenic noncoding RNAs (lincRNA). While a handful of lincRNA have been shown to play important regulatory roles, the functionality of most remains unclear. Here, we examined the expression conservation and putative functionality of lincRNA in human and macaque prefrontal cortex (PFC) development and maturation. We analyzed transcriptome sequence (RNA-seq) data from 38 human and 40 macaque individuals covering the entire postnatal development interval. Using the human data set, we detected the expression of 5835 lincRNA annotated in GENCODE and further identified 1888 novel lincRNA. Most of these lincRNA show low DNA sequence conservation, as well as low expression levels. Remarkably, developmental expression patterns of these lincRNA were as conserved between humans and macaques as those of protein-coding genes. Transfection of development-associated lincRNA into human SH-SY5Y cells affected gene expression, indicating their regulatory potential. In brain, expression of these putative target genes correlated with the expression of the corresponding lincRNA during human and macaque PFC development. These results support the potential functionality of lincRNA in primate PFC development.
    RNA 05/2014; 20(7). DOI:10.1261/rna.043075.113 · 4.62 Impact Factor
  • Source
    ABSTRACT: The immune system of the horse has not been well studied, despite the fact that the horse displays several features, such as sensitivity to bacterial lipopolysaccharide, that make it in many ways a more suitable model of some human disorders than the current rodent models. The difficulty of working with large animal models has, however, limited characterisation of gene expression in the horse immune system, with current annotations for the equine genome restricted to predictions from other mammals and the few described horse proteins. This paper outlines sequencing of 184 million transcriptome short reads from immunologically active tissues of three horses, including the genome reference "Twilight". In a comparison with the Ensembl horse genome annotation, we found 8,763 potentially novel isoforms.
    PeerJ 05/2014; 2:e382. DOI:10.7717/peerj.382