Computational analysis of noncoding RNAs

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
WIREs RNA (Impact Factor: 6.02). 11/2012; 3(6):759-78. DOI: 10.1002/wrna.1134
Source: PubMed


Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for both experimental and computational biology. Here, we review computational methods for analyzing noncoding RNAs. The topics covered include basic and advanced techniques for predicting RNA structures, annotation of noncoding RNAs in genomic data, mining of RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources.
For further resources related to this article, please visit the WIREs website.


Available from: Loyal Andrew Goff, Oct 07, 2015
  • Source
    • "Next-generation sequencing technologies have reshaped our understanding of the molecular constituents of cells and their regulatory elements. The majority of the mammalian genome is transcribed generating a vast repertoire of transcripts that includes protein-coding RNAs and a surprisingly similar number of non-coding RNAs (ncRNAs), the latter category harboring transcripts that can greatly differ in size and biogenesis and whose biological activities remain largely unexplored (Carninci and Hayashizaki, 2007; Forrest and Carninci, 2009; Mercer et al., 2009; Washietl et al., 2012). Furthermore, the combination of technologies to isolate discrete cell types or tissues with the information gathered with modern sequencing platforms has critically improved the resolution of genome-wide transcriptional profiling thus revealing new scenarios in which biological paradigms had often to be adapted and reformulated. "
    ABSTRACT: By coupling laser capture microdissection to nanoCAGE technology and next-generation sequencing we have identified the genome-wide collection of active promoters in the mouse Main Olfactory Epithelium (MOE). Transcription start sites (TSSs) for the large majority of Olfactory Receptors (ORs) have been previously mapped increasing our understanding of their promoter architecture. Here we show that in our nanoCAGE libraries of the mouse MOE we detect a large number of tags mapped in loci hosting Type-1 and Type-2 Vomeronasal Receptors genes (V1Rs and V2Rs). These loci also show a massive expression of Long Interspersed Nuclear Elements (LINEs). We have validated the expression of selected receptors detected by nanoCAGE with in situ hybridization, RT-PCR and qRT-PCR. This work extends the repertory of receptors capable of sensing chemical signals in the MOE, suggesting intriguing interplays between MOE and VNO for pheromone processing and positioning transcribed LINEs as candidate regulatory RNAs for VRs expression.
    Frontiers in Cellular Neuroscience 02/2014; 8:41. DOI:10.3389/fncel.2014.00041 · 4.29 Impact Factor
  • Source
    • "Whole genome transcriptome sequencing, also known as RNA-seq, coupled with ab initio assembly has become an effective approach to discover novel lincRNAs [6]. To this end, RNAs are converted to cDNAs and subjected to high throughput sequencing; the obtained raw reads are then aligned to a reference genome and compared to known gene annotations to generate a list of novel transcripts. "
    ABSTRACT: Ab initio assembly of transcriptome sequencing data has been widely used to identify large intergenic non-coding RNAs (lincRNAs), a novel class of gene regulators involved in many biological processes. To differentiate real lincRNA transcripts from thousands of assembly artifacts, a series of filtering steps such as filters of transcript length, expression level and coding potential, need to be applied. However, an easy-to-use and publicly available bioinformatics pipeline that integrates these filters is not yet available. Hence, we implemented sebnif, an integrative bioinformatics pipeline to facilitate the discovery of bona fide novel lincRNAs that are suitable for further functional characterization. Specifically, sebnif is the only pipeline that implements an algorithm for identifying high-quality single-exonic lincRNAs that were often omitted in many studies. To demonstrate the usage of sebnif, we applied it on a real biological RNA-seq dataset from Human Skeletal Muscle Cells (HSkMC) and built a novel lincRNA catalog containing 917 highly reliable lincRNAs. Sebnif is available at
    PLoS ONE 01/2014; 9(1):e84500. DOI:10.1371/journal.pone.0084500 · 3.23 Impact Factor
  • Source
    ABSTRACT: The recent advent of high-throughput approaches has revealed widespread transcription of the human genome, leading to a new appreciation of transcription regulation, especially from noncoding regions. Distinct from most coding and small noncoding RNAs, long noncoding RNAs (lncRNAs) are generally expressed at low levels, are less conserved and lack protein-coding capacity. These intrinsic features of lncRNAs have not only hampered their full annotation in the past several years, but have also generated controversy concerning whether many or most of these lncRNAs are simply the result of transcriptional noise. Here, we assess these intrinsic features that have challenged lncRNA discovery and further summarize recent progress in lncRNA discovery with integrated methodologies, from which new lessons and insights can be derived to achieve better characterization of lncRNA expression regulation. Full annotation of lncRNA repertoires and the implications of such annotation will provide a fundamental basis for comprehensive understanding of pervasive functions of lncRNAs in biological regulation.
    03/2013; 3(1):226-41. DOI:10.3390/biom3010226
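
The sebnif abstract above describes separating real lincRNA transcripts from assembly artifacts with a series of filters on transcript length, expression level and coding potential, applied after comparison with known gene annotations. The following is a minimal sketch of that general idea in Python, not the sebnif implementation. The `Transcript` record, the function names and every threshold are hypothetical; the coding score is assumed to come from an external coding-potential tool, and 200 nt is used only because it is the conventional lower bound for long noncoding RNAs.

```python
from dataclasses import dataclass

# All thresholds are illustrative assumptions, not values taken from sebnif.
MIN_LENGTH_NT = 200      # conventional lower bound for "long" noncoding RNAs
MIN_EXPRESSION = 1.0     # hypothetical expression cutoff (e.g. in FPKM)
MAX_CODING_SCORE = 0.0   # hypothetical cutoff; higher scores suggest coding potential


@dataclass
class Transcript:
    """Hypothetical record for one assembled transcript."""
    transcript_id: str
    length_nt: int
    expression: float
    coding_score: float        # assumed output of an external coding-potential tool
    overlaps_known_gene: bool  # result of comparison with known gene annotations


def is_candidate_lincrna(t: Transcript) -> bool:
    """Return True if the transcript passes every filter."""
    if t.overlaps_known_gene:            # keep only novel, intergenic transcripts
        return False
    if t.length_nt < MIN_LENGTH_NT:      # transcript length filter
        return False
    if t.expression < MIN_EXPRESSION:    # expression level filter
        return False
    if t.coding_score > MAX_CODING_SCORE:  # coding potential filter
        return False
    return True


def filter_lincrnas(transcripts):
    """Keep the transcripts that pass all filters, preserving input order."""
    return [t for t in transcripts if is_candidate_lincrna(t)]


if __name__ == "__main__":
    # Toy data invented for illustration only.
    toy = [
        Transcript("t1", 1500, 3.2, -0.8, False),
        Transcript("t2", 150, 5.0, -0.5, False),   # too short
        Transcript("t3", 2000, 0.1, -0.9, False),  # too weakly expressed
        Transcript("t4", 1800, 4.0, 0.7, False),   # looks protein-coding
        Transcript("t5", 1200, 2.0, -0.3, True),   # overlaps a known gene
    ]
    for t in filter_lincrnas(toy):
        print(t.transcript_id)
```

Applying the filters as independent predicates keeps each criterion easy to tune or replace. A real pipeline would also need the special handling of single-exonic transcripts that the abstract mentions, which this sketch does not attempt.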