PRIDB: a protein-RNA interface database

Bioinformatics and Computational Biology Program, Iowa State University, Department of Genetics, Development and Cell Biology, Iowa State University, Department of Computer Science, Iowa State University, Ames, IA 50011, Department of Biology, Elon University, Elon, NC 27244 and Computational Systems Biology Summer Institute, Iowa State University, Ames, IA 50011, USA.
Nucleic Acids Research (Impact Factor: 9.11). 11/2010; 39(Database issue):D277-82. DOI: 10.1093/nar/gkq1108
Source: PubMed


The Protein–RNA Interface Database (PRIDB) is a comprehensive database of protein–RNA interfaces extracted from complexes
in the Protein Data Bank (PDB). It is designed to facilitate detailed analyses of individual protein–RNA complexes and their
interfaces, in addition to automated generation of user-defined data sets of protein–RNA interfaces for statistical analyses
and machine learning applications. For any chosen PDB complex or list of complexes, PRIDB rapidly displays interfacial amino
acids and ribonucleotides within the primary sequences of the interacting protein and RNA chains. PRIDB also identifies ProSite
motifs in protein chains and FR3D motifs in RNA chains and provides links to these external databases, as well as to structure
files in the PDB. An integrated JMol applet is provided for visualization of interacting atoms and residues in the context
of the 3D complex structures. The current version of PRIDB contains structural information regarding 926 protein–RNA complexes
available in the PDB (as of 10 October 2010). Atomic- and residue-level contact information for the entire data set can be
downloaded in a simple machine-readable format. Also, several non-redundant benchmark data sets of protein–RNA complexes are
provided. The PRIDB database is freely available online at

    • "We retrieved a total of 1546 protein-interacting RNA chains (RNA-1546) of PDB from PRIDB database [29]. We used these RNA chains and created 25% non-redundant 'RNA-208' dataset of 208 RNA chains using BLASTCLUST software."
    ABSTRACT: RNA-protein interactions play diverse roles in cells, so identifying RNA-protein interfaces is essential for understanding their function. Several methods have been developed for predicting RNA-interacting residues in proteins, but only limited efforts have been made to identify protein-interacting nucleotides in RNAs. To discriminate protein-interacting from non-interacting nucleotides, we used various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest and SVMlight) to develop prediction models from various features. SVMlight-based models achieved the highest performance: 83.92% sensitivity, 84.82% specificity, 84.62% accuracy and a Matthews correlation coefficient of 0.62. We observed that certain tri-nucleotides, such as ACA, ACC, AGA, CAC, CCA, GAG, UGA and UUU, are preferred in protein interactions. All the models were developed using a non-redundant dataset and evaluated using five-fold cross-validation. A web server called RNApin has been developed for the scientific community (
    Full-text · Article · Jan 2015 · Genomics
    • "Otherwise, the pair is non-interactive. The distance cutoff 5 Å was borrowed from the PRIDB database [10]. RNA–protein pairs in the training set can now be classified based on interactivity. "
    ABSTRACT: Though most transcripts are long non-coding RNAs (lncRNAs), little is known about their functions. lncRNAs usually function through interactions with proteins, so identifying the binding proteins of lncRNAs is important for understanding the molecular mechanisms underlying their functions. Only a few approaches are available for predicting interactions between lncRNAs and proteins. In this study, we introduce a new method, lncPro. By encoding RNA and protein sequences into numeric vectors, we used matrix multiplication to score each RNA–protein pair. This score can be used to measure the interaction between the RNA and the protein in a pair. The method effectively discriminates interacting from non-interacting RNA–protein pairs and predicts RNA–protein interactions within a given complex. Applying this method to all human proteins, we found that the long non-coding RNAs we collected tend to interact with nuclear proteins and RNA-binding proteins. Compared with existing approaches, our method shortens the time needed to train the matrix and obtains optimal results based on the model being used. The ability to predict associations between lncRNAs and proteins has also been enhanced. Our method provides an idea of how to integrate different information into the prediction process.
    Full-text · Article · Sep 2013 · BMC Genomics
    • "The Protein-RNA Interface Database [PRIDB (9)] contains structural information on RNA–protein complexes."
    ABSTRACT: The Nucleic acid-Protein Interaction DataBase ( contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contains DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.
    Full-text · Article · Nov 2012 · Nucleic Acids Research
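
The 5 Å distance cutoff mentioned in the lncPro excerpt above can be illustrated with a minimal sketch. It assumes that a residue and a nucleotide count as interacting when any pair of their atoms lies within the cutoff (inclusive); neither PRIDB's exact contact rule nor its file format is stated on this page, so the function names, the coordinate layout and the toy coordinates below are all hypothetical.

```python
import math
from itertools import product

# Cutoff value quoted in the excerpt; treating it as inclusive is an assumption.
CUTOFF_ANGSTROM = 5.0


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom in atoms_a and any in atoms_b."""
    return min(math.dist(a, b) for a, b in product(atoms_a, atoms_b))


def is_interacting(residue_atoms, nucleotide_atoms, cutoff=CUTOFF_ANGSTROM):
    """True if at least one atom pair lies within the cutoff (assumed contact rule)."""
    return min_distance(residue_atoms, nucleotide_atoms) <= cutoff


def interface_pairs(residues, nucleotides, cutoff=CUTOFF_ANGSTROM):
    """List (residue_id, nucleotide_id) pairs classified as interacting.

    Both arguments map a hypothetical identifier to a list of (x, y, z)
    atom coordinates in angstroms.
    """
    return [
        (res_id, nuc_id)
        for res_id, res_atoms in residues.items()
        for nuc_id, nuc_atoms in nucleotides.items()
        if is_interacting(res_atoms, nuc_atoms, cutoff)
    ]


# Toy coordinates, invented purely for illustration.
residues = {"ARG12": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], "GLY40": [(20.0, 0.0, 0.0)]}
nucleotides = {"U5": [(4.0, 0.0, 0.0)], "A9": [(30.0, 0.0, 0.0)]}

print(interface_pairs(residues, nucleotides))  # [('ARG12', 'U5')]
```

Pairs not returned by `interface_pairs` would be treated as non-interacting. The all-against-all comparison is adequate for a single complex; a spatial index would be a natural choice for larger data sets.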