Binding Site Prediction for Protein-Protein Interactions and Novel Motif Discovery using Re-occurring Polypeptide Sequences

School of Computer Science, Carleton University, Ottawa, ON K1S5B6, Canada.
BMC Bioinformatics (Impact Factor: 2.58). 06/2011; 12(1):225. DOI: 10.1186/1471-2105-12-225
Source: PubMed


While many methods exist for predicting protein-protein interactions, very few can determine the specific site of interaction on each protein. Characterization of the specific sequence regions mediating interaction (binding sites) is crucial for an understanding of cellular pathways. Experimental methods often report false binding sites due to experimental limitations, while computational methods tend to require data that are not available at the proteome scale. Here we present PIPE-Sites, a novel method of protein-specific binding site prediction based on pairs of re-occurring polypeptide sequences, which have previously been shown to accurately predict protein-protein interactions. PIPE-Sites operates at high specificity and requires only the sequences of the query proteins and a database of known binary interactions with no binding site data, making it applicable to binding site prediction at the proteome scale.
PIPE-Sites was evaluated using a dataset of 265 yeast and 423 human interacting protein pairs with experimentally determined binding sites. We found that PIPE-Sites predictions were closer to the confirmed binding site than those of two existing binding site prediction methods based on domain-domain interactions, when applied to the same dataset. Finally, we applied PIPE-Sites to two datasets of 2,347 yeast and 14,438 human novel protein pairs predicted to interact with high confidence. An analysis of the predicted interaction sites revealed a number of protein subsequences which are highly re-occurring in binding sites and which may represent novel binding motifs.
PIPE-Sites is an accurate method for predicting protein binding sites and is applicable at the proteome scale. Thus, PIPE-Sites could be useful for exhaustive analysis of protein binding patterns in whole proteomes as well as for the discovery of novel binding motifs. PIPE-Sites is available online at
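The idea of locating a binding site from pairs of re-occurring subsequences can be illustrated with a toy sketch. This is not the PIPE-Sites implementation: it assumes exact matching of short windows (the abstract does not describe the scoring used), and the function names, window length, and example sequences are all hypothetical. For each pair of windows taken from the two query proteins, it counts how many known interacting pairs contain both windows, then reports the window pair with the highest count as the candidate site.

```python
def windows(seq, w):
    """Return the set of all length-w substrings of seq."""
    return {seq[i:i + w] for i in range(len(seq) - w + 1)}


def cooccurrence_landscape(query_a, query_b, sequences, interactions, w=3):
    """Toy landscape: cell (i, j) counts the known interacting pairs in which
    window i of query_a occurs in one partner and window j of query_b occurs
    in the other. Exact matching is an assumption made for illustration."""
    win_sets = {name: windows(seq, w) for name, seq in sequences.items()}
    n_a = max(len(query_a) - w + 1, 0)
    n_b = max(len(query_b) - w + 1, 0)
    landscape = [[0] * n_b for _ in range(n_a)]
    for i in range(n_a):
        wa = query_a[i:i + w]
        for j in range(n_b):
            wb = query_b[j:j + w]
            for x, y in interactions:
                forward = wa in win_sets[x] and wb in win_sets[y]
                reverse = wa in win_sets[y] and wb in win_sets[x]
                if forward or reverse:
                    landscape[i][j] += 1
    return landscape


def peak_site(landscape, w):
    """Return ((start_a, end_a), (start_b, end_b), score) for the first
    highest-scoring cell, with end positions exclusive."""
    best_score, best_i, best_j = -1, 0, 0
    for i, row in enumerate(landscape):
        for j, score in enumerate(row):
            if score > best_score:
                best_score, best_i, best_j = score, i, j
    return (best_i, best_i + w), (best_j, best_j + w), best_score


# Hypothetical database: sequences and known binary interactions only,
# with no binding site annotations.
sequences = {
    "P1": "AAKLMAA", "P2": "GGWYFGG",
    "P3": "CCKLMCC", "P4": "TTWYFTT",
    "P5": "SSSSSSS", "P6": "DDDDDDD",
}
interactions = [("P1", "P2"), ("P3", "P4"), ("P5", "P6")]

landscape = cooccurrence_landscape("QQKLMQQ", "EEWYFEE", sequences, interactions)
site_a, site_b, score = peak_site(landscape, w=3)
print(site_a, site_b, score)
```

In this made-up example the windows KLM and WYF co-occur in two of the three known interacting pairs, so the peak falls on those two regions of the query proteins.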



Available from: James Robert Green
  • ABSTRACT: Computational prediction of residues that participate in protein-protein interactions is a difficult task, and state-of-the-art methods have shown only limited success in this arena. One possible problem with these methods is that they try to predict interacting residues without incorporating information about the partner protein, although it is unclear how much partner information could enhance prediction performance. To address this issue, the following two comparisons are of crucial significance: (a) a comparison of the predictability of inter-protein residue pairs, i.e., predicting exactly which residue pairs interact with each other given two protein sequences; this can be achieved either by combining conventional single-protein predictions or by making predictions using a new model trained directly on the residue pairs, and the performance of these two approaches may be compared; (b) a comparison of the predictability of the interacting residues in a single protein (irrespective of the partner residue or protein) between conventional methods and predictions converted from the pair-wise trained model. Using these two streams of training and validation procedures and employing similar two-stage neural networks, we showed that the models trained on pair-wise contacts outperformed the partner-unaware models in predicting both interacting pairs and interacting single-protein residues. Prediction performance decreased with the size of the conformational change upon complex formation; this trend is similar to docking, even though no structural information was used in our prediction. An example application that predicts two partner-specific interfaces of a protein was shown to be effective, highlighting the potential of the proposed approach. Finally, a preliminary attempt was made to score docking decoy poses using prediction of interacting residue pairs; this analysis produced an encouraging result.
    Full-text · Article · Dec 2011 · PLoS ONE
  • ABSTRACT: The molecular network sustained by different types of interactions among proteins is widely recognized as the fundamental driving force of cellular operations. Many biological functions are determined by the crosstalk between proteins rather than by the characteristics of their individual components. Thus, the search for protein partners in global networks is imperative when attempting to address the principles of biology. We have developed a web-based tool, "Sequence-based Protein Partners Search" (SPPS), to explore interacting partners of proteins by searching over a large repertoire of proteins across many species. SPPS provides a database containing more than 60,000 protein sequences with annotations and a protein-partner search engine with two modes (Single Query and Multiple Query). In this study, two proteins interacting with human FBXO6 were found using the service. In addition, users can refine potential protein partner hits by using annotations and possible interaction networks in the SPPS web server. SPPS provides a new type of tool to facilitate the identification of direct or indirect protein partners, which may guide scientists in the investigation of new signaling pathways. The SPPS server is available to the public at
    Full-text · Article · Jan 2012 · PLoS ONE
  • ABSTRACT: Binding site prediction for protein-protein complexes is a challenging problem in computational molecular biology. Using a set of double-chain complexes in Benchmark 3.0, we calculated the solvent accessible surface areas and inter-residue contact areas for each monomer and proposed a method for dividing the protein surface into patches. We found that the products of the solvent accessible surface areas and internal contact areas of patches, the PSAIA values, could provide protein binding site information. In a dataset of 78 complexes, either receptors or ligands of 74 complexes had interface patches with the first or second greatest PSAIA values among all surface patches. A good docking result was achieved when the binding site information obtained with this method was applied to Target 39 of the CAPRI experiment. This patch-based protein binding site prediction method differs from traditional methods, which are based on single residues and consider only surface residues. This provides a new method for binding site prediction in protein-protein interactions. The full text of this article is in Chinese.
    Full-text · Article · Dec 2012 · Acta Physico-Chimica Sinica
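The patch score described in the last abstract above, the product of a patch's solvent accessible surface area and its internal contact area, can be sketched in a few lines. This is only an illustration under stated assumptions: the `Patch` class, the function names, the units, and the numeric values are hypothetical and are not taken from the article, which also does not specify here how patches are constructed.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    name: str
    sasa: float          # solvent accessible surface area (units assumed: square angstroms)
    contact_area: float  # internal inter-residue contact area (same assumed units)


def psaia(patch):
    """PSAIA value as described in the abstract: the product of the two areas."""
    return patch.sasa * patch.contact_area


def rank_patches(patches):
    """Sort patches from highest to lowest PSAIA value."""
    return sorted(patches, key=psaia, reverse=True)


def candidate_interface(patches, top_n=2):
    """Names of the top_n patches; the abstract reports that interface patches
    often had the first or second greatest value, hence the default of 2."""
    return [p.name for p in rank_patches(patches)[:top_n]]


# Illustrative values only, not measured data.
patches = [
    Patch("A", sasa=500.0, contact_area=120.0),
    Patch("B", sasa=800.0, contact_area=40.0),
    Patch("C", sasa=300.0, contact_area=150.0),
    Patch("D", sasa=200.0, contact_area=50.0),
]

print(candidate_interface(patches))
```

Note that patch B has the largest exposed area but ranks below A and C, because the product also rewards patches whose residues are in close internal contact.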