Arbitrary protein-protein docking targets biologically relevant interfaces.
ABSTRACT Protein-protein recognition is of fundamental importance in the vast majority of biological processes. However, it has already been demonstrated that it is very hard to distinguish true complexes from false complexes in so-called cross-docking experiments, where binary protein complexes are separated and the isolated proteins are all docked against each other and scored. Does this result, at least in part, reflect a physical reality? False complexes could reflect possible nonspecific or weak associations.
In this paper, we investigate the twilight zone of protein-protein interactions, building on an interesting outcome of cross-docking experiments: false complexes seem to favor residues from the true interaction site, suggesting that randomly chosen partners dock in a non-random fashion on protein surfaces. Here, we carry out arbitrary docking of a non-redundant data set of 198 proteins, with more than 300 randomly chosen "probe" proteins. We investigate the tendency of arbitrary partners to aggregate at localized regions of the protein surfaces, the shape and compositional bias of the generated interfaces, and the potential of this property to predict biologically relevant binding sites. We show that the non-random localization of arbitrary partners after protein-protein docking is a generic feature of protein structures. The interfaces generated in this way are not systematically planar or curved, but tend to be closer than average to the center of the proteins. These results can be used to predict biological interfaces with an AUC value of up to 0.69 alone, and 0.72 when used in combination with evolutionary information. An appropriate choice of random partners and of the number of docking models makes this method computationally practical. We also note that nonspecific interfaces can point to alternate interaction sites in the case of proteins with multiple interfaces. We illustrate the usefulness of arbitrary docking using PEBP (Phosphatidylethanolamine binding protein), a kinase inhibitor with multiple partners.
An approach using arbitrary docking, and based solely on physical properties, can successfully identify biologically pertinent protein interfaces.
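As a rough illustration of the idea summarized above, the sketch below scores each surface residue by how often it falls inside the interface of an arbitrary docking model, then measures how well that score separates true interface residues using a rank-based AUC. This is a minimal sketch under stated assumptions, not the authors' scoring scheme: the function names, the residue numbering, and the toy data are all hypothetical, and the abstract does not specify how propensities are computed or normalized.

```python
from collections import Counter


def interface_propensity(docking_interfaces, surface_residues):
    """Fraction of docking models whose interface contains each surface residue.

    docking_interfaces: list of sets of residue ids, one set per docking model
                        (hypothetical input format).
    surface_residues:   iterable of residue ids on the protein surface.
    """
    surface = set(surface_residues)
    counts = Counter()
    for interface in docking_interfaces:
        counts.update(set(interface) & surface)
    n_models = len(docking_interfaces)
    return {res: counts[res] / n_models for res in surface}


def auc(scores, true_interface):
    """Rank-based AUC: probability that a true interface residue outscores
    a non-interface residue, counting ties as one half."""
    pos = [s for res, s in scores.items() if res in true_interface]
    neg = [s for res, s in scores.items() if res not in true_interface]
    if not pos or not neg:
        raise ValueError("need at least one interface and one non-interface residue")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): six surface residues, four docking models.
surface = [1, 2, 3, 4, 5, 6]
models = [{1, 2, 3}, {2, 3, 4}, {2, 3, 6}, {1, 2, 5}]
propensity = interface_propensity(models, surface)
toy_auc = auc(propensity, true_interface={2, 3, 4})
print(propensity)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to a random ranking of surface residues, which is the baseline against which the reported values of 0.69 and 0.72 should be read.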
Article: Structures of the interacting domains from yeast glutamyl-tRNA synthetase and tRNA-aminoacylation and nuclear-export cofactor Arc1p reveal a novel function for an old fold.
ABSTRACT: Eukaryotic aminoacyl-tRNA synthetases (aaRS) frequently contain additional appended domains that are absent from their prokaryotic counterparts and that mediate complex formation between eukaryotic aaRS and cofactors of aminoacylation and translation. However, the structural basis of such interactions has remained elusive. The heteromerization domain of yeast glutamyl-tRNA synthetase (GluRS) has been cloned, expressed, purified and crystallized in space group C222(1), with unit-cell parameters a = 52, b = 107, c = 168 Å. Phase information was obtained from multiple-wavelength anomalous dispersion with selenomethionine to 2.5 Å resolution and the structure, comprising two monomers per asymmetric unit, was determined and refined to 1.9 Å resolution. The structure of the interacting domain of its accessory protein Arc1p was determined and refined to 1.9 Å resolution in a crystal form containing 20 monomers organized in five tetramers per asymmetric unit (space group C2, unit-cell parameters a = 222, b = 89, c = 127 Å, β = 99.4°). Both domains adopt a GST-like fold, demonstrating a novel role for this fold as a protein-protein interaction module.
Acta Crystallographica Section D Biological Crystallography 01/2007; 62(Pt 12):1510-9. · 12.62 Impact Factor
ABSTRACT: MOTIVATION: Proteins function through interactions with other proteins and biomolecules. Protein-protein interfaces hold key information toward molecular understanding of protein function. In the past few years, there have been intensive efforts in developing methods for predicting protein interface residues. A review that presents the current status of interface prediction, gives an overview of its applications and projects future developments is in order. SUMMARY: Interface prediction methods rely on a wide range of sequence, structural and physical attributes that distinguish interface residues from non-interface surface residues. The input data are manipulated into either a numerical value or a probability representing the potential for a residue to be inside a protein interface. Predictions are now satisfactory for complex-forming proteins that are well represented in the Protein Data Bank, but less so for under-represented ones. Future developments will be directed at tackling problems such as building structural models for multi-component structural complexes.
Bioinformatics 10/2007; 23(17):2203-9. · 5.47 Impact Factor