Sequence-specific binding of single-stranded RNA: Is there a code for recognition?

Department of Biology, Institute for Molecular Biology and Biophysics, ETH Zürich, CH-8093 Zürich, Switzerland.
Nucleic Acids Research (Impact Factor: 9.11). 02/2006; 34(17):4943-59. DOI: 10.1093/nar/gkl620
Source: PubMed


A code predicting the RNA sequence that will be bound by a certain protein based on its amino acid sequence or its structure would provide a useful tool for the design of RNA binders with desired sequence-specificity. Such de novo designed RNA binders could be of extraordinary use in both medical and basic research applications. Furthermore, a code could help to predict the cellular functions of RNA-binding proteins that have not yet been extensively studied. A comparative analysis of Pumilio homology domains, zinc-containing RNA binders, hnRNP K homology domains and RNA recognition motifs is performed in this review. Based on this, a set of binding rules is proposed that hints towards a code for RNA recognition by these domains. Furthermore, we discuss the intermolecular interactions that are important for RNA binding and summarize their importance in providing affinity and specificity.


Available from: Sigrid D Auweter, Aug 07, 2014
  • Source
    • "Other RBPs do not bind sequence motifs but instead recognize secondary structures, which we also included as features. The three-dimensional structure of the protein and the accessibility of the binding site influence the RNA binding [6]. The accessible surface area can be determined by inspecting three-dimensional structures, but there is no high-throughput approach to parse such three-dimensional information. "
    ABSTRACT: RNA-binding proteins interact with specific RNA molecules to regulate important cellular processes. It is therefore necessary to identify the RNA interaction partners in order to understand the precise functions of such proteins. Protein-RNA interactions are typically characterized using in vivo and in vitro experiments but these may not detect all binding partners. Therefore, computational methods that capture the protein-dependent nature of such binding interactions could help to predict potential binding partners in silico. We have developed three methods to predict whether an RNA can interact with a particular RNA-binding protein using support vector machines and different features based on the sequence (the Oli method), the motif score (the OliMo method) and the secondary structure (the OliMoSS method). We applied these approaches to different experimentally-derived datasets and compared the predictions with RNAcontext and RPISeq. Oli outperformed OliMoSS and RPISeq, confirming our protein-specific predictions and suggesting that tetranucleotide frequencies are appropriate discriminative features. Oli and RNAcontext were the most competitive methods in terms of the area under curve. A precision-recall curve analysis achieved higher precision values for Oli. On a second experimental dataset including real negative binding information, Oli outperformed RNAcontext with a precision of 0.73 vs. 0.59. Our experiments showed that features based on primary sequence information are sufficiently discriminating to predict specific RNA-protein interactions. Sequence motifs and secondary structure information were not necessary to improve these predictions. Finally, we confirmed that protein-specific experimental data concerning RNA-protein interactions are valuable sources of information that can be used for the efficient training of models for in silico predictions. The scripts are available upon request to the corresponding author.
    BMC Bioinformatics 04/2014; 15(1):123. DOI:10.1186/1471-2105-15-123 · 2.58 Impact Factor
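
The abstract above names tetranucleotide frequencies as the discriminative features behind the Oli method. As a rough illustration only, the following Python sketch computes such a 256-dimensional frequency vector for one RNA sequence. The function name `kmer_frequencies`, the T-to-U normalization, and the handling of ambiguous bases are assumptions made for this example; they are not taken from the authors' scripts, which are available only on request.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"


def kmer_frequencies(seq, k=4):
    """Return relative frequencies of all 4**k k-mers, in lexicographic order.

    Hypothetical helper for illustration; not the published Oli code.
    """
    # Normalize case and treat DNA-style input as RNA (assumption).
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing non-ACGU characters (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# One feature vector per RNA; a matrix of such vectors, labeled as bound or
# unbound for a given protein, could then be passed to any SVM implementation.
features = kmer_frequencies("AUGGCUUGUACAUAUGUA")
print(len(features))
```

Using relative rather than absolute counts keeps vectors comparable across RNAs of different lengths, which is one plausible reason to prefer frequencies as features.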
  • Source
    • "Studies of neuro-oncological ventral antigen 1 and 2 (NOVA1 and NOVA2), here collectively referred to as NOVA proteins, demonstrated that three or more short RNA motifs that are clustered closely together on the pre-mRNA are required for NOVA proteins to mediate splicing regulation [2]. Here we will refer to these motifs as 'multivalent RNA motifs', since they enable RBPs to achieve high-affinity binding by cooperative interactions between multiple RNA-binding domains and the clustered short RNA motifs [17,18]. Past computational methods for analysis of multivalent RNA motifs have focused on the known RNA motifs [19], or have predicted motifs based on the CLIP studies of protein-RNA interactions [17,18]. "
    ABSTRACT: RNA-binding proteins (RBPs) regulate splicing according to position-dependent principles, which can be exploited for analysis of regulatory motifs. Here we present RNAmotifs, a method that evaluates the sequence around differentially regulated alternative exons to identify clusters of short and degenerate sequences, referred to as multivalent RNA motifs. We show that diverse RBPs share basic positional principles, but differ in their propensity to enhance or repress exon inclusion. We assess exons differentially spliced between brain and heart, identifying known and new regulatory motifs, and predict the expression pattern of RBPs that bind these motifs. RNAmotifs is available at
    Genome biology 01/2014; 15(1):R20. DOI:10.1186/gb-2014-15-1-r20 · 10.81 Impact Factor
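
The excerpt above describes multivalent RNA motifs as three or more short motifs clustered closely together. A minimal Python sketch of that idea follows; it is not the RNAmotifs algorithm. The function `find_motif_clusters`, the 30-nt window, and the greedy grouping rule are all assumptions chosen for illustration, and the degenerate tetramer YCAY (written as the regular expression `[CU]CA[CU]`) is used only as an example pattern that does not come from the passage.

```python
import re


def find_motif_clusters(seq, motif_regex, min_sites=3, window=30):
    """Return (first_start, last_start, n_sites) for each cluster of motif hits.

    A cluster is a run of hits whose start positions lie within `window`
    nucleotides of the first hit in the run (a simplifying assumption).
    """
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile(f"(?=(?:{motif_regex}))")
    starts = [m.start() for m in pattern.finditer(seq)]
    clusters = []
    i = 0
    while i < len(starts):
        j = i
        # Extend the run while the next hit is still inside the window.
        while j + 1 < len(starts) and starts[j + 1] - starts[i] < window:
            j += 1
        n_sites = j - i + 1
        if n_sites >= min_sites:
            clusters.append((starts[i], starts[j], n_sites))
            i = j + 1  # continue after this cluster
        else:
            i += 1
    return clusters


# Three nearby hits followed by one isolated hit far downstream.
example = "GGUCAUAACCAUGGUCACGG" + "A" * 40 + "UCAUGG"
clusters = find_motif_clusters(example, "[CU]CA[CU]")
print(clusters)
```

In this toy sequence only the three closely spaced hits form a cluster; the isolated downstream hit is ignored because it falls short of `min_sites`.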
  • Source
    • "Different combinations of RNA-binding domains, which in isolation typically bind short, single-stranded nucleotide sequences, determine binding of RBPs to their target transcripts. However, the modular design of most RBPs allows them to recognize more complex RNA sequence and/or structural elements [4-6]. In order to increase our understanding of how these RNA binding domains work together to orchestrate binding of RBPs to defined sequence elements, it is essential to globally identify and characterize their binding preferences and target regions. "
    ABSTRACT: RNA-binding proteins (RBPs) mediate mRNA biogenesis, translation and decay. We recently developed an approach to profile transcriptome-wide RBP contacts on polyadenylated transcripts by next-generation sequencing. A comparison of such profiles from different biological conditions has the power to unravel dynamic changes in protein-contacted cis-regulatory mRNA regions without a priori knowledge of the regulatory protein component. We compared protein occupancy profiles of polyadenylated transcripts in MCF7 and HEK293 cells. Briefly, we developed a bioinformatics workflow to identify differential crosslinking sites in cDNA reads of 4-thiouridine crosslinked polyadenylated RNA samples. We identified 30,000 differential crosslinking sites between MCF7 and HEK293 cells at an estimated false-discovery rate of 10%. 73% of all reported differential protein-RNA contact sites cannot be explained by local changes in exon usage as indicated by complementary RNA-seq data. The majority of differentially crosslinked positions are located in 3′ UTRs, show distinct secondary-structure characteristics and overlap with binding sites of known RBPs, such as ELAVL1. Importantly, mRNA transcripts with the most significant occupancy changes show elongated mRNA half-lives in MCF7 cells. We present a global comparison of protein occupancy profiles from different cell types, and provide evidence for altered mRNA metabolism as a result of differential protein-RNA contacts. Additionally, we introduce POPPI, a bioinformatics workflow for the analysis of protein occupancy profiling experiments. Our work demonstrates the value of protein occupancy profiling for assessing cis-regulatory RNA sequence space and its dynamics in growth, development and disease.
    Genome biology 01/2014; 15(1):R15. DOI:10.1186/gb-2014-15-1-r15 · 10.81 Impact Factor