Computational analysis of noncoding RNAs

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
WIREs RNA (Impact Factor: 6.02). 11/2012; 3(6):759-78. DOI: 10.1002/wrna.1134
Source: PubMed


Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for both experimental and computational biology. Here, we review computational methods for analyzing noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources.
For further resources related to this article, please visit the WIREs website.


Available from: Loyal Andrew Goff
    • "For highly conserved structural RNAs, comparative data allow the secondary structure to be deduced (Gutell et al. 2002). In the absence of conservation, computational methods based on minimization of free energy or stochastic context-free grammars can be used to confidently predict the secondary structure (Washietl et al. 2012). For many RNAs, however, computational identification of the biologically relevant secondary structure remains challenging."
    ABSTRACT: Selective 2' Hydroxyl Acylation analyzed by Primer Extension (SHAPE) is an accurate method for probing of RNA secondary structure. In existing SHAPE methods, the SHAPE probing signal is normalized to a no-reagent control to correct for the background caused by premature termination of the reverse transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES-based selection of cDNA-RNA hybrids on streptavidin beads effectively removes the large majority of background signal present in SHAPE probing data and that sequencing-based SHAPES data contain the same amount of RNA structure data as regular sequencing-based SHAPE data obtained through normalization to a no-reagent control. Moreover, the selection efficiently enriches for probed RNAs, suggesting that the SHAPES strategy will be useful for applications with high-background and low-probing signal such as in vivo RNA structure probing. © 2015 Poulsen et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
    RNA 03/2015; 21(5). DOI:10.1261/rna.047068.114 · 4.94 Impact Factor
    • "Next-generation sequencing technologies have reshaped our understanding of the molecular constituents of cells and their regulatory elements. The majority of the mammalian genome is transcribed generating a vast repertoire of transcripts that includes protein-coding RNAs and a surprisingly similar number of non-coding RNAs (ncRNAs), the latter category harboring transcripts that can greatly differ in size and biogenesis and whose biological activities remain largely unexplored (Carninci and Hayashizaki, 2007; Forrest and Carninci, 2009; Mercer et al., 2009; Washietl et al., 2012). Furthermore, the combination of technologies to isolate discrete cell types or tissues with the information gathered with modern sequencing platforms has critically improved the resolution of genome-wide transcriptional profiling thus revealing new scenarios in which biological paradigms had often to be adapted and reformulated."
    ABSTRACT: By coupling laser capture microdissection to nanoCAGE technology and next-generation sequencing we have identified the genome-wide collection of active promoters in the mouse Main Olfactory Epithelium (MOE). Transcription start sites (TSSs) for the large majority of Olfactory Receptors (ORs) have been previously mapped increasing our understanding of their promoter architecture. Here we show that in our nanoCAGE libraries of the mouse MOE we detect a large number of tags mapped in loci hosting Type-1 and Type-2 Vomeronasal Receptors genes (V1Rs and V2Rs). These loci also show a massive expression of Long Interspersed Nuclear Elements (LINEs). We have validated the expression of selected receptors detected by nanoCAGE with in situ hybridization, RT-PCR and qRT-PCR. This work extends the repertory of receptors capable of sensing chemical signals in the MOE, suggesting intriguing interplays between MOE and VNO for pheromone processing and positioning transcribed LINEs as candidate regulatory RNAs for VRs expression.
    Frontiers in Cellular Neuroscience 02/2014; 8:41. DOI:10.3389/fncel.2014.00041 · 4.29 Impact Factor
    • "Whole genome transcriptome sequencing, also known as RNA-seq, coupled with ab initio assembly has become an effective approach to discover novel lincRNAs [6]. To this end, RNAs are converted to cDNAs and subjected to high throughput sequencing; the obtained raw reads are then aligned to a reference genome and compared to known gene annotations to generate a list of novel transcripts."
    ABSTRACT: Ab initio assembly of transcriptome sequencing data has been widely used to identify large intergenic non-coding RNAs (lincRNAs), a novel class of gene regulators involved in many biological processes. To differentiate real lincRNA transcripts from thousands of assembly artifacts, a series of filtering steps such as filters of transcript length, expression level and coding potential, need to be applied. However, an easy-to-use and publicly available bioinformatics pipeline that integrates these filters is not yet available. Hence, we implemented sebnif, an integrative bioinformatics pipeline to facilitate the discovery of bona fide novel lincRNAs that are suitable for further functional characterization. Specifically, sebnif is the only pipeline that implements an algorithm for identifying high-quality single-exonic lincRNAs that were often omitted in many studies. To demonstrate the usage of sebnif, we applied it on a real biological RNA-seq dataset from Human Skeletal Muscle Cells (HSkMC) and built a novel lincRNA catalog containing 917 highly reliable lincRNAs. Sebnif is available at
    PLoS ONE 01/2014; 9(1):e84500. DOI:10.1371/journal.pone.0084500 · 3.23 Impact Factor