Accurate Prediction of Peptide Binding Sites on Protein Surfaces

European Molecular Biology Laboratory, Heidelberg, Germany.
PLoS Computational Biology (Impact Factor: 4.62). 04/2009; 5(3):e1000335. DOI: 10.1371/journal.pcbi.1000335
Source: PubMed


Many important protein-protein interactions are mediated by the binding of a short peptide stretch in one protein to a large globular segment in another. Recent efforts have provided hundreds of examples of new peptides binding to proteins for which a three-dimensional structure is available (either known experimentally or readily modeled) but where no structure of the protein-peptide complex is known. To address this gap, we present an approach that can accurately predict peptide binding sites on protein surfaces. For peptides known to bind a particular protein, the method predicts binding sites with great accuracy, and the specificity of the approach means that it can also be used to predict whether or not a putative or predicted peptide partner will bind. We used known protein-peptide complexes to derive preferences, in the form of spatial position specific scoring matrices, which describe the binding-site environment in globular proteins for each type of amino acid in bound peptides. We then scan the surface of a putative binding protein for sites for each of the amino acids present in a peptide partner and search for combinations of high-scoring amino acid sites that satisfy constraints deduced from the peptide sequence. The method performed well in a benchmark and largely agreed with experimental data mapping binding sites for several recently discovered interactions mediated by peptides, including RG-rich proteins with SMN domains, Epstein-Barr virus LMP1 with TRADD domains, DBC1 with Sir2, and the Ago hook with Argonaute PIWI domain. The method, and associated statistics, is an excellent tool for predicting and studying binding sites for newly discovered peptides mediating critical events in biology.
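
The search step described in the abstract, combining high-scoring per-residue sites under constraints deduced from the peptide sequence, can be illustrated with a small sketch. This is not the authors' implementation: the function name best_site_chain, the (position, score) site representation, and the 4.0 Å max_step cutoff between consecutive residues are all hypothetical choices made for illustration. The sketch assumes each peptide residue already has a list of scored candidate surface sites and uses dynamic programming to pick one site per residue with the highest summed score.

```python
import math
from typing import List, Optional, Tuple

# A candidate site: (xyz position on the protein surface, score for that residue).
Site = Tuple[Tuple[float, float, float], float]


def best_site_chain(
    candidates: List[List[Site]], max_step: float = 4.0
) -> Optional[Tuple[float, List[int]]]:
    """Pick one site per peptide residue, maximizing the summed score.

    candidates[i] holds the candidate sites for peptide residue i.
    Consecutive picks must lie within max_step of each other
    (a hypothetical constraint, not a value taken from the paper).
    Returns (total score, chosen site indices), or None if no chain fits.
    """
    if not candidates or any(not layer for layer in candidates):
        return None

    # best[j] = (total score, path) of the best chain ending at site j
    best = [(score, [j]) for j, (_, score) in enumerate(candidates[0])]

    for i in range(1, len(candidates)):
        prev_sites = candidates[i - 1]
        new_best = []
        for j, (pos, score) in enumerate(candidates[i]):
            # Extend every feasible chain that ends close enough to this site.
            options = [
                (total + score, path + [j])
                for (total, path), (prev_pos, _) in zip(best, prev_sites)
                if path is not None and math.dist(pos, prev_pos) <= max_step
            ]
            if options:
                new_best.append(max(options, key=lambda o: o[0]))
            else:
                new_best.append((-math.inf, None))  # unreachable site
        best = new_best

    total, path = max(best, key=lambda b: b[0])
    return None if path is None else (total, path)


# Toy input: three peptide residues with made-up sites and scores.
toy = [
    [((0.0, 0.0, 0.0), 1.0), ((10.0, 0.0, 0.0), 5.0)],
    [((3.0, 0.0, 0.0), 2.0), ((20.0, 0.0, 0.0), 9.0)],
    [((6.0, 0.0, 0.0), 1.5)],
]
result = best_site_chain(toy)
```

In the toy input the highest-scoring individual sites are too far apart to form a chain, so the search settles on the lower-scoring but spatially consistent combination. A realistic version would need the actual scoring matrices and constraints from the paper.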

Available from: Evangelia Petsalaki, Oct 04, 2015
    • "The first two steps of modeling protein-peptide interactions can also be achieved using techniques for the combined search of binding sites and peptide poses [13] [22] [23]. "
    ABSTRACT: Protein-peptide interactions play essential functional roles in living organisms and their structural characterization is a hot subject of current experimental and theoretical research. Computational modeling of the structure of protein-peptide interactions is usually divided into two stages: prediction of the binding site at a protein receptor surface, and then docking (and modeling) the peptide structure into the known binding site. This paper presents a comprehensive CABS-dock method for the simultaneous search of binding sites and flexible protein-peptide docking, available as a user-friendly web server. We present example CABS-dock results obtained in the default CABS-dock mode and using its advanced options that enable the user to increase the range of flexibility for chosen receptor fragments or to exclude user-selected binding modes from docking search. Furthermore, we demonstrate a strategy to improve CABS-dock performance by assessing the quality of models with classical molecular dynamics. Finally, we discuss the promising extensions and applications of the CABS-dock method and provide a tutorial appendix for the convenient analysis and visualization of CABS-dock results. The CABS-dock web server is freely available. Copyright © 2015. Published by Elsevier Inc.
    Methods 05/2015; 19. DOI:10.1016/j.ymeth.2015.07.004 · 3.65 Impact Factor
    • "PepSite is a computational tool that scans the surface of a given protein for patches that are likely to bind individual amino acid residues or peptides up to ten amino acids [13,14], providing a score that reflects the propensity of the peptide to bind to the protein. The PepSite score is expressed in relative units and the higher scores mean better binding. "
    ABSTRACT: CCR5 and CXCR4 are the two membrane-standing proteins that, along with CD4, facilitate entry of HIV particles into the host cell. HIV strains differ in their ability to utilize either CCR5 or CXCR4, and this specificity, also known as viral tropism, is largely determined by the sequence of the V3 loop of the viral envelope protein gp120. With statistical and docking approaches we have computationally analyzed binding preferences of CCR5 and CXCR4 to both V3 loop sequences of virus strains of different tropism and endogenous ligands. We conclude that the tropism cannot be satisfactorily explained by amino-acid interactions alone, and suggest a two-step mechanism, by which initial coreceptor selection and approach of the ligand to the binding pocket is dominated by charge and glycosylation pattern of the viral envelope.
    Retrovirology 11/2013; 10(1):130. DOI:10.1186/1742-4690-10-130 · 4.19 Impact Factor
    • "The lower peptide binding score represents the accuracy of the prediction because of having stronger signal. However, even if the prediction does not produce any reliable score, this method performs with the efficiency of 55% of correct prediction (Petsalaki et al., 2009) (Table 4 and Supplementary Fig. S6). Our computational biology data indicate that the duplicated proteins have passed through several insertion, deletion and key substitutions in the amino acid sequences which might have taken place independently resulting into the acquisition of new gene function, thus providing a strong support regarding the asymmetric evolution of certain duplicates of NF-YB and NF-YC subunits coupled with the asymmetric divergence in gene function (Yang et al., 2005; Yamamoto et al., 2009). "
    ABSTRACT: NF-Y transcription factors encoded by HAP gene family, composed of three subunits (HAP2/NF-YA, HAP3/NF-YB and HAP5/NF-YC), are capable of transcriptional regulation of target genes with high specificity by binding to the CCAAT-containing promoter sequences. Here, we have characterized duplicated HAP genes in Selaginella moellendorffii and explored some features that might be involved in the regulation of gene expression and their function. Subsequently, the evolutionary relationships of LEC1-type of HAP3 genes have been studied starting from lycophytes to angiosperm to reveal the details of conservation and diversification of these genes during plant evolution. Computational analyses demonstrated the variation in length of cis-regulatory region of HAP3 duplicates in S. moellendorffii containing three thermodynamically stable and evolutionarily conserved RNA secondary structures. The homology modeling of NF-Y proteins, secondary structural details, DNA binding large positive patches, binding affinity of H2A-H2B interactive residues of NF-YC subunits on the duplicated NF-YB subunits, conserved domain analyses and protein structural alignments indicated that gene duplication process of HAP genes in S. moellendorffii, followed by structural diversification, provide specific hints about their functional specificity under various circumstances for the survival of this lycophytic plant. We have identified several conserved motifs in LEC1 proteins among all plant lineages during evolution.