PRIDB: a protein-RNA interface database

Bioinformatics and Computational Biology Program, Iowa State University, Department of Genetics, Development and Cell Biology, Iowa State University, Department of Computer Science, Iowa State University, Ames, IA 50011, Department of Biology, Elon University, Elon, NC 27244 and Computational Systems Biology Summer Institute, Iowa State University, Ames, IA 50011, USA.
Nucleic Acids Research (Impact Factor: 9.11). 11/2010; 39(Database issue):D277-82. DOI: 10.1093/nar/gkq1108
Source: PubMed

ABSTRACT The Protein-RNA Interface Database (PRIDB) is a comprehensive database of protein-RNA interfaces extracted from complexes in the Protein Data Bank (PDB). It is designed to facilitate detailed analyses of individual protein-RNA complexes and their interfaces, in addition to automated generation of user-defined data sets of protein-RNA interfaces for statistical analyses and machine learning applications. For any chosen PDB complex or list of complexes, PRIDB rapidly displays interfacial amino acids and ribonucleotides within the primary sequences of the interacting protein and RNA chains. PRIDB also identifies ProSite motifs in protein chains and FR3D motifs in RNA chains and provides links to these external databases, as well as to structure files in the PDB. An integrated JMol applet is provided for visualization of interacting atoms and residues in the context of the 3D complex structures. The current version of PRIDB contains structural information regarding 926 protein-RNA complexes available in the PDB (as of 10 October 2010). Atomic- and residue-level contact information for the entire data set can be downloaded in a simple machine-readable format. Also, several non-redundant benchmark data sets of protein-RNA complexes are provided. The PRIDB database is freely available online at
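As an illustration of what residue-level contact information of this kind can look like, here is a minimal sketch that flags interfacial amino acids and ribonucleotides using a simple distance cutoff. The abstract does not state which contact criterion PRIDB uses, so the 5 Å cutoff, the atom record layout, and the function name `interface_residues` are all assumptions made for this example.

```python
from math import dist

# Hypothetical atom record: (chain_id, residue_number, residue_name, (x, y, z)).
# The 5.0 angstrom default cutoff is an assumption, not PRIDB's documented criterion.
def interface_residues(protein_atoms, rna_atoms, cutoff=5.0):
    """Return the protein and RNA residues that have at least one
    atom pair within `cutoff` angstroms of each other."""
    protein_hits, rna_hits = set(), set()
    for p_chain, p_num, p_name, p_xyz in protein_atoms:
        for r_chain, r_num, r_name, r_xyz in rna_atoms:
            if dist(p_xyz, r_xyz) <= cutoff:
                protein_hits.add((p_chain, p_num, p_name))
                rna_hits.add((r_chain, r_num, r_name))
    return protein_hits, rna_hits

# Toy coordinates, invented for illustration only.
protein_atoms = [
    ("A", 10, "ARG", (0.0, 0.0, 0.0)),
    ("A", 11, "GLY", (20.0, 0.0, 0.0)),
]
rna_atoms = [
    ("B", 1, "U", (3.0, 0.0, 0.0)),
    ("B", 2, "A", (40.0, 0.0, 0.0)),
]

protein_if, rna_if = interface_residues(protein_atoms, rna_atoms)
print(protein_if, rna_if)
```

The all-pairs loop is quadratic in the number of atoms; for real complexes a spatial index (for example a k-d tree) would be the usual choice.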

  • Source
    • "We retrieved a total of 1546 protein-interacting RNA chains (RNA-1546) of PDB from PRIDB database [29]. We used these RNA chains and created 25% non-redundant 'RNA-208' dataset of 208 RNA chains using BLASTCLUST software."
    ABSTRACT: RNA-protein interactions play diverse roles in cells, so identifying RNA-protein interfaces is essential for understanding their function. Several methods have been developed for predicting RNA-interacting residues in proteins, but limited effort has gone into identifying protein-interacting nucleotides in RNAs. To discriminate protein-interacting from non-interacting nucleotides, we built prediction models with various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest and SVMlight) and various features; the SVMlight-based models performed best, achieving 83.92% sensitivity, 84.82% specificity, 84.62% accuracy and a Matthews correlation coefficient of 0.62. We observed that certain trinucleotides, such as ACA, ACC, AGA, CAC, CCA, GAG, UGA and UUU, are preferred in protein interaction. All models were developed using a non-redundant dataset and evaluated using five-fold cross-validation. A web server called RNApin has been developed for the scientific community (
    Genomics 01/2015; 105(4). DOI:10.1016/j.ygeno.2015.01.005 · 2.79 Impact Factor
  • Source
    • "For this reason, there have been many intense interests in experimental and statistical analyses of atomic contacts at RNA–Protein interfaces, and a large portion of these studies has focused on biological, chemical and physical aspects of these interactions. (Ban et al., 2000; Lewis et al., 2010; Morozova et al., 2006; Treger and Westhof, 2001; Wimberly et al., 2000). "
    ABSTRACT: Cellular functions are mediated by various biological processes, including biomolecular interactions such as protein-protein, DNA-protein and RNA-protein interactions. RNA-protein interactions are indispensable for many biological processes, such as cell development and viral replication. Unlike protein-protein and protein-DNA interactions, the precise mechanisms and structures of RNA-protein complexes are not fully understood. A large body of theoretical evidence accumulated over the past several years shows that computational geometry is the first step in understanding binding profiles and plays a key role in the study of intricate biological structures, interactions and complexes. In this paper, the RNA-protein interaction interface surface is computed via the weighted Voronoi diagram of atoms. Two filter operations provide a natural definition of interface atoms, as in classic methods. Unbounded parts of Voronoi facets that are far from the complex are trimmed using a modified convex hull of the atom centers. The algorithm is applied to a data set of RNA-protein complexes extracted from the Protein Data Bank (PDB). The features of the resulting interfaces are then computed and compared with the classic method. The results show high correlation coefficients between interface size in the Voronoi model and in the classical model based on solvent accessibility, as well as high accuracy and precision relative to the classical model.
    Journal of Theoretical Biology 01/2012; 293:55-64. DOI:10.1016/j.jtbi.2011.09.033 · 2.30 Impact Factor
  • Source
    ABSTRACT: The current 18th Database Issue of Nucleic Acids Research features descriptions of 96 new and 83 updated online databases covering various areas of molecular biology. It includes two editorials, one that discusses COMBREX, a new exciting project aimed at figuring out the functions of the 'conserved hypothetical' proteins, and one concerning BioDBcore, a proposed description of the 'minimal information about a biological database'. Papers from the members of the International Nucleotide Sequence Database collaboration (INSDC) describe each of the participating databases, DDBJ, ENA and GenBank, principles of data exchange within the collaboration, and the recently established Sequence Read Archive. A testament to the longevity of databases, this issue includes updates on the RNA modification database, Definition of Secondary Structure of Proteins (DSSP) and Homology-derived Secondary Structure of Proteins (HSSP) databases, which have not been featured here in >12 years. There is also a block of papers describing recent progress in protein structure databases, such as Protein DataBank (PDB), PDB in Europe (PDBe), CATH, SUPERFAMILY and others, as well as databases on protein structure modeling, protein-protein interactions and the organization of inter-protein contact sites. Other highlights include updates of the popular gene expression databases, GEO and ArrayExpress, several cancer gene databases and a detailed description of the UK PubMed Central project. The Nucleic Acids Research online Database Collection, available at:, now lists 1330 carefully selected molecular biology databases. The full content of the Database Issue is freely available online at the Nucleic Acids Research web site (
    Nucleic Acids Research 01/2011; 39(Database issue):D1-6. DOI:10.1093/nar/gkq1243 · 9.11 Impact Factor
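The RNApin abstract in the list above reports sensitivity, specificity, accuracy and a Matthews correlation coefficient. For readers who want to see how these four figures relate to one another, the sketch below computes them from the counts of a binary confusion matrix using the standard definitions. The function name `binary_metrics` and the example counts are hypothetical; they are not taken from any of the cited papers.

```python
from math import sqrt

def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts: interacting nucleotides are the positive class.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which makes it informative when interacting and non-interacting nucleotides are imbalanced.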