
RIPSeeker: a statistical package for identifying protein-associated transcripts from RIP-seq experiments.

Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada, The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada, Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A4, Canada and Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada.
Nucleic Acids Research (Impact Factor: 8.81). 02/2013; DOI: 10.1093/nar/gkt142
Source: PubMed

ABSTRACT RIP-seq has recently been developed to discover genome-wide RNA transcripts that interact with a protein or protein complex. RIP-seq is similar to both RNA-seq and ChIP-seq, but presents unique properties and challenges. Currently, no statistical tool is dedicated to RIP-seq analysis. We developed RIPSeeker (http://www.bioconductor.org/packages/2.12/bioc/html/RIPSeeker.html), a free open-source Bioconductor/R package for de novo RIP peak prediction based on a hidden Markov model (HMM). To demonstrate the utility of the software package, we applied RIPSeeker and six other published programs to three independent RIP-seq datasets and two PAR-CLIP datasets corresponding to six distinct RNA-binding proteins. Based on receiver operating characteristic curves, RIPSeeker demonstrates superior sensitivity and specificity in discriminating high-confidence peaks that are consistently agreed on among a majority of the comparison methods, and it dominates 9 of the 12 evaluations, averaging 80% area under the curve. The peaks from RIPSeeker are further confirmed based on their significant enrichment for biologically meaningful genomic elements, published sequence motifs and association with canonical transcripts known to interact with the proteins examined. While RIPSeeker is specifically tailored for RIP-seq data analysis, it also provides a suite of bioinformatics tools integrated within a self-contained software package that comprehensively addresses issues ranging from post-alignment processing to visualization and annotation.
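As a rough illustration of what HMM-based peak prediction can look like, the sketch below decodes a toy two-state hidden Markov model (background vs. enriched) over binned read counts with the Viterbi algorithm. The abstract states only that RIPSeeker is based on an HMM; the Poisson emissions, the rates, the `stay` probability and the function names here are all assumptions made for illustration and do not reflect RIPSeeker's actual model or API.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_two_state(counts, rates=(1.0, 8.0), stay=0.9):
    """Most likely state path for binned read counts.

    State 0 = background, state 1 = enriched (hypothetical labels).
    rates: assumed Poisson mean count per bin for each state.
    stay:  assumed probability of remaining in the same state.
    """
    if not counts:
        return []
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)
    # Uniform initial distribution over the two states.
    score = [math.log(0.5) + poisson_logpmf(counts[0], rates[s]) for s in range(2)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in range(2):
            cands = [score[p] + (log_stay if p == s else log_switch) for p in range(2)]
            best_prev = max(range(2), key=lambda p: cands[p])
            pointers.append(best_prev)
            new_score.append(cands[best_prev] + poisson_logpmf(k, rates[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(2), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


counts = [0, 1, 0, 2, 9, 11, 7, 10, 1, 0, 1]  # made-up read counts per bin
path = viterbi_two_state(counts)
print(path)  # [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
```

Runs of consecutive bins decoded as state 1 would be reported as candidate peaks; a real analysis would estimate the emission and transition parameters from the data rather than fixing them by hand.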

  • ABSTRACT: Despite the prevalence of studies on DNA/chromatin-related epigenetics, such as histone modifications and DNA methylation, RNA epigenetics did not draw the attention it deserved until a new affinity-based sequencing approach, MeRIP-Seq, was developed and applied to survey global mRNA N6-methyladenosine (m(6)A) in mammalian cells. As a marriage of ChIP-Seq and RNA-Seq, MeRIP-Seq has the potential to study the transcriptome-wide distribution of various post-transcriptional RNA modifications. We have previously developed an R/Bioconductor package, 'exomePeak', for detecting RNA methylation sites under a specific experimental condition or identifying differential RNA methylation sites in a case-control study from MeRIP-Seq data. Compared with other relatively well-studied data types such as ChIP-Seq and RNA-Seq, the study of MeRIP-Seq data is still at a very early stage, and existing protocols are not optimized for dealing with the intrinsic characteristics of MeRIP-Seq data. We provide here a detailed and easy-to-use protocol for using the exomePeak R/Bioconductor package along with other software programs for the analysis of MeRIP-Seq data, which covers raw read alignment, RNA methylation site detection, motif discovery, differential RNA methylation analysis, and functional analysis. In particular, the rationale behind each processing step, the specific method used, best practices, and possible alternative strategies are briefly discussed. The exomePeak R/Bioconductor package is freely available from Bioconductor: http://www.bioconductor.org/packages/release/bioc/html/exomePeak.html.
    Methods 06/2014; · 3.22 Impact Factor
  • ABSTRACT: A number of long noncoding RNAs (lncRNAs) have been identified by deep sequencing methods, but molecular and cellular functions are known for only a limited number of them. Current databases on lncRNAs are mostly for cataloguing purposes and do not provide the in-depth information required to infer functions. A comprehensive resource on lncRNA function is an immediate need. We present a database for functional investigation of lncRNAs that encompasses annotation, sequence analysis, gene expression, protein binding, and phylogenetic conservation. We have compiled lncRNAs for 6 species (human, mouse, zebrafish, fruit fly, worm, yeast) from ENSEMBL, HGNC, MGI, and lncRNAdb. Each lncRNA was analyzed for coding potential and phylogenetic conservation in different lineages. Gene expression data from 208 RNA-Seq studies (4995 samples), collected from the GEO, ENCODE, modENCODE, and TCGA databases, were used to provide expression profiles in various tissues, diseases, and developmental stages. Importantly, we analyzed RNA-Seq data to identify co-expressed mRNAs that would provide ample insights on lncRNA functions. The resulting gene list can be subjected to enrichment analysis such as Gene Ontology or KEGG pathways. Furthermore, we compiled protein-lncRNA interactions by collecting and analyzing publicly available CLIP-seq or PAR-CLIP sequencing data. Finally, we explored evolutionarily conserved lncRNAs with correlated expression between human and six other organisms to identify functional lncRNAs. The whole contents are provided in a user-friendly web interface. lncRNAtor is available at http://lncrnator.ewha.ac.kr/. Contact: sanghyuk@ewha.ac.kr.
    Bioinformatics 05/2014; · 4.62 Impact Factor
  • ABSTRACT: The pervasive transcription of the genome creates many types of non-coding RNAs (ncRNAs). However, we know very little regarding the functions and the regulatory mechanisms of these ncRNAs. Exploring the interactions between RNA and RNA-binding proteins (RBPs) is vital because it can allow us to truly understand how these ncRNAs behave in vivo. High-throughput sequencing of RNA isolated by cross-linking immunoprecipitation (HITS-CLIP or CLIP-seq) and its variants have been successfully used as systemic techniques to study RBP binding sites. In this review, we explain the major differences between the CLIP techniques, summarize successful applications of these techniques, discuss the limitations of CLIP, present some suggested solutions and project their promising future roles in studying the RNA world.
    Science China. Life sciences 01/2015; 58(1):75-88. · 1.51 Impact Factor
