NAPS: a residue-level nucleic acid-binding prediction server

Department of Bioengineering/Bioinformatics, University of Illinois at Chicago, Chicago, IL, USA.
Nucleic Acids Research (Impact Factor: 8.81). 07/2010; 38(Web Server issue):W431-5. DOI: 10.1093/nar/gkq361
Source: PubMed

ABSTRACT Nucleic acid-binding proteins are involved in a great number of cellular processes. Understanding the mechanisms underlying these proteins first requires the identification of specific residues involved in nucleic acid binding. Prediction of NA-binding residues can provide practical assistance in the functional annotation of NA-binding proteins. Predictions can also be used to expedite mutagenesis experiments, guiding researchers to the correct binding residues in these proteins. Here, we present a method for the identification of amino acid residues involved in DNA- and RNA-binding using sequence-based attributes. The method used in this work combines the C4.5 algorithm with bootstrap aggregation and cost-sensitive learning. Our DNA-binding model achieved 79.1% accuracy, while the RNA-binding model reached an accuracy of 73.2%. The NAPS web server is freely available at
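The combination of decision-tree learning, bootstrap aggregation, and cost-sensitive learning named in the abstract can be illustrated with a small sketch. This is not the NAPS implementation: it swaps full C4.5 trees for one-level decision stumps chosen by the C4.5-style gain ratio, and it applies misclassification costs only at the voting step. The two-feature toy data, the function names, and the cost values (`cost_fn`, `cost_fp`) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(X, y, feat, thr):
    """C4.5-style gain ratio for the binary split x[feat] <= thr."""
    left = [lab for row, lab in zip(X, y) if row[feat] <= thr]
    right = [lab for row, lab in zip(X, y) if row[feat] > thr]
    if not left or not right:
        return 0.0
    pl, pr = len(left) / len(y), len(right) / len(y)
    gain = entropy(y) - pl * entropy(left) - pr * entropy(right)
    split_info = -(pl * math.log2(pl) + pr * math.log2(pr))
    return gain / split_info

def fit_stump(X, y):
    """One-level tree (a stand-in for a full C4.5 tree): best split by gain ratio."""
    best = (0.0, 0, X[0][0])
    for feat in range(len(X[0])):
        for thr in sorted({row[feat] for row in X}):
            score = gain_ratio(X, y, feat, thr)
            if score > best[0]:
                best = (score, feat, thr)
    _, feat, thr = best
    left = [lab for row, lab in zip(X, y) if row[feat] <= thr]
    right = [lab for row, lab in zip(X, y) if row[feat] > thr]

    def frac(labs):
        # Fraction of binding residues in a leaf; fall back to the overall rate.
        return sum(labs) / len(labs) if labs else sum(y) / len(y)

    return feat, thr, frac(left), frac(right)

def fit_bagged(X, y, n_models=25, seed=0):
    """Bootstrap aggregation: fit each stump on a resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_proba(models, row):
    """Average the leaf binding fractions over all bagged stumps."""
    probs = [pl if row[feat] <= thr else pr for feat, thr, pl, pr in models]
    return sum(probs) / len(probs)

def predict(models, row, cost_fn=3.0, cost_fp=1.0):
    """Cost-sensitive decision: call 'binding' (1) when the expected cost of a
    missed binding residue exceeds the expected cost of a false alarm."""
    p = predict_proba(models, row)
    return int(p * cost_fn > (1 - p) * cost_fp)

# Hypothetical toy data: two made-up per-residue features, label 1 = binding.
X = [[1.0, 0.2], [0.9, 0.4], [0.8, 0.1], [0.1, 0.9], [0.0, 0.7],
     [0.2, 0.8], [0.1, 0.3], [0.0, 0.5], [0.3, 0.6], [0.2, 0.2]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

models = fit_bagged(X, y)
print(predict(models, [0.95, 0.3]), predict(models, [0.0, 0.6]))
```

With unequal costs, the decision threshold on the averaged probability moves from 0.5 to `cost_fp / (cost_fp + cost_fn)`, which is one common way to favor the rarer binding class when binding residues are heavily outnumbered.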

  • ABSTRACT: The recognition of microRNA (miRNA)-binding residues in proteins would further enhance our understanding of how miRNAs silence their target genes and of related biological processes. Because labeled examples are scarce, traditional methods such as SVMs do not work well on such problems. We therefore propose a semi-supervised learning method, the Laplacian Support Vector Machine (LapSVM), for recognizing miRNA-binding residues in proteins from sequence by making use of both labeled and unlabeled data. Instances are encoded with a hybrid feature that incorporates evolutionary information from the amino acid sequence and mutual interaction propensities in protein-miRNA complex structures. The results indicate that the LapSVM model achieves good performance, with an F1 score of 22.06±0.28% and an AUC (area under the ROC curve) of 0.760±0.043. A web server called MBindR is freely available at http:// for academic use.
    2013 6th International Conference on Biomedical Engineering and Informatics (BMEI); 12/2013
  • ABSTRACT: Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at
    PLoS ONE 05/2014; 9(5):e97725. DOI:10.1371/journal.pone.0097725 · 3.53 Impact Factor
  • ABSTRACT: Computational prediction of RNA-binding residues is helpful in uncovering the mechanisms underlying protein-RNA interactions. Traditional algorithms apply either a feature-based or a template-based prediction strategy in isolation to recognize these crucial residues, which can restrict their predictive power. To improve RNA-binding residue prediction, we propose the first integrative algorithm, termed RBRDetector (RNA-Binding Residue Detector), which combines these two strategies. We developed a feature-based approach that is an ensemble learning predictor comprising multiple structure-based classifiers, in which well-defined evolutionary and structural features, in conjunction with the sequential or structural microenvironment, are used as the inputs to support vector machines. In parallel, we constructed a template-based predictor that recognizes putative RNA-binding regions by structurally aligning the query protein to RNA-binding proteins with known structures. The final RBRDetector algorithm is an ingenious fusion of our feature- and template-based approaches based on a piecewise function. By validating our predictors with diverse types of structural data, including bound and unbound structures, native and simulated structures, and protein structures binding to different RNA functional groups, we consistently demonstrated that RBRDetector not only had clear advantages over its component methods, but also significantly outperformed the current state-of-the-art algorithms. Nevertheless, the major limitation of our algorithm is that it performed relatively well on DNA-binding proteins and thus incorrectly predicted the DNA-binding regions as RNA-binding interfaces. Finally, we implemented the RBRDetector algorithm as a user-friendly web server, which is freely accessible at © 2014 Wiley Periodicals, Inc.
    Proteins Structure Function and Bioinformatics 10/2014; 82(10). DOI:10.1002/prot.24610 · 2.92 Impact Factor
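
The abstracts above report residue-level performance as accuracy, F1 score, and Matthews correlation coefficient (MCC). The following minimal sketch shows how these three quantities follow from the four confusion-matrix counts; the function names and the toy labels are assumptions for illustration and are not taken from any of the listed servers.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = binding residue."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Fraction of residues labeled correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 by convention if a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical toy labels: 3 binding residues among 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
all_negative = [0] * len(y_true)

counts = confusion_counts(y_true, y_pred)
baseline = confusion_counts(y_true, all_negative)
print(accuracy(*counts), f1_score(*counts), mcc(*counts))
print(accuracy(*baseline), f1_score(*baseline), mcc(*baseline))
```

On this toy example, a predictor that labels every residue as non-binding still reaches 70% accuracy while its F1 and MCC are both zero, which is why class-imbalanced binding-site benchmarks are often summarized with F1 or MCC in addition to accuracy.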
