Conference Paper

Structural Prediction of Protein-Protein Interactions in Saccharomyces cerevisiae

Kansas State Univ., Manhattan
DOI: 10.1109/BIBE.2007.4375729
Conference: Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering (BIBE 2007)
Source: IEEE Xplore

ABSTRACT Protein-protein interactions (PPI) are the associations between proteins; the term also covers the study of these associations. Several approaches have been used to address the problem of predicting PPI. Some are based on biological features extracted from a protein sequence (such as amino acid composition and GO terms); others use relational and structural features extracted from the PPI network, which can be represented as a graph. Our approach falls into the second category. We adapt a general approach to graph feature extraction that has previously been applied to collaborative recommendation of friends in social networks. Several structural features are identified from the PPI graph and used to learn classifiers for predicting new interactions. Two datasets containing Saccharomyces cerevisiae PPI are used to test the proposed approach. Both datasets were assembled from the Database of Interacting Proteins (DIP). We assembled the first dataset directly from DIP in April 2006; the second has been used in previous studies, which makes it easy to compare our approach with earlier ones. Several classifiers are trained on the structural features extracted from the interaction graph. The results show good performance (accuracy, sensitivity and specificity), indicating that the structural features are highly predictive of PPI.
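As a rough illustration of what "structural features extracted from the PPI graph" could look like, the sketch below computes a few standard link-prediction features for a candidate protein pair. The abstract does not list the features the authors actually used, so the feature set (degree, common neighbors, Jaccard overlap, shortest-path distance), the function names, and the toy protein identifiers `P1` to `P4` are all assumptions made for illustration.

```python
from collections import deque


def build_graph(edges):
    """Build undirected adjacency sets from (protein_a, protein_b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_distance(graph, source, target):
    """Breadth-first hop count between two proteins; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def pair_features(graph, u, v):
    """Hypothetical structural feature vector for the candidate pair (u, v)."""
    # Exclude the pair itself from the neighbor sets so that a known
    # interaction between u and v does not inflate its own features.
    neighbors_u = graph.get(u, set()) - {v}
    neighbors_v = graph.get(v, set()) - {u}
    common = neighbors_u & neighbors_v
    union = neighbors_u | neighbors_v
    return {
        "degree_u": len(neighbors_u),
        "degree_v": len(neighbors_v),
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Note: the distance uses the graph as given; for a pair that is
        # already an edge it is 1 unless that edge is removed beforehand.
        "distance": shortest_distance(graph, u, v),
    }


# Toy network with made-up protein identifiers.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
toy_graph = build_graph(toy_edges)
print(pair_features(toy_graph, "P1", "P4"))
```

Feature dictionaries like this, computed for known interacting pairs and for sampled non-interacting pairs, could then be passed to any standard classifier; which classifiers and negative-sampling scheme the paper used is not stated in the abstract.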

  • ABSTRACT: MOTIVATION: An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. The expectation is that this will provide a fuller appreciation of cellular processes and networks at the protein level, ultimately leading to a better understanding of disease mechanisms and suggesting new means for intervention. This paper addresses the question: can protein-protein interactions be predicted directly from primary structure and associated data? Using a diverse database of known protein interactions, a Support Vector Machine (SVM) learning system was trained to recognize and predict interactions based solely on primary structure and associated physicochemical properties. RESULTS: Inductive accuracy of the trained system, defined here as the percentage of correct protein interaction predictions for previously unseen test sets, averaged 80% for the ensemble of statistical experiments. Future proteomics studies may benefit from this research by proceeding directly from the automated identification of a cell's gene products to prediction of protein interaction pairs.
    Bioinformatics 06/2001; 17(5):455-60. · 5.32 Impact Factor
  • ABSTRACT: To understand the networks in living cells, it is essential to identify protein-protein interactions on a genomic scale. Unfortunately, doing so solely by experiment is both time-consuming and expensive, because the complexity of the problem is overwhelming, in keeping with the saying that "life is complicated". Therefore, developing computational techniques for predicting protein-protein interactions would be of significant value. By fusing the approach based on the gene ontology with the approach of pseudo-amino acid composition, a predictor called "GO-PseAA" was established to deal with this problem. As a showcase, prediction was performed on 6323 protein pairs from yeast. To avoid redundancy and homology bias, none of the protein pairs investigated has ≥ 40% sequence identity with any other. The overall success rate obtained by jackknife cross-validation was 81.6%, indicating that the GO-PseAA predictor is very promising for predicting protein-protein interactions from protein sequences, and might become a useful vehicle for studying network biology in the postgenomic era.
    Journal of Proteome Research 02/2006; 5(2):316-22. · 5.06 Impact Factor
  • ABSTRACT: Molecular networks guide the biochemistry of a living cell on multiple levels: its metabolic and signaling pathways are shaped by the network of interacting proteins, whose production, in turn, is controlled by the genetic regulatory network. To address topological properties of these two networks, we quantified correlations between connectivities of interacting nodes and compared them to a null model of a network, in which all links were randomly rewired. We found that for both interaction and regulatory networks, links between highly connected proteins are systematically suppressed, whereas those between highly connected and low-connected pairs of proteins are favored. This effect decreases the likelihood of cross talk between different functional modules of the cell and increases the overall robustness of a network by localizing effects of deleterious perturbations.
    Science 06/2002; 296(5569):910-3. · 31.20 Impact Factor
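
The null-model comparison described in the last abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract says links were "randomly rewired" without giving the procedure, so the degree-preserving edge swap used here, the function names, the hub threshold, and the toy edge list are all hypothetical choices rather than the authors' method.

```python
import random
from collections import Counter


def degrees(edges):
    """Number of links per node in an undirected edge list."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


def rewire(edges, n_swaps, seed=0):
    """Randomize an undirected network by swapping edge endpoints.

    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b), which
    keeps every node's degree unchanged (an assumed null model).
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible pairings at random
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # would create a duplicate link
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def hub_hub_links(edges, threshold):
    """Count links whose two endpoints both have degree >= threshold."""
    deg = degrees(edges)
    return sum(1 for a, b in edges if deg[a] >= threshold and deg[b] >= threshold)


# Toy network with made-up node names: two hubs, each with three partners.
toy = [("H1", "A"), ("H1", "B"), ("H1", "C"),
       ("H2", "D"), ("H2", "E"), ("H2", "F"), ("A", "D")]
randomized = rewire(toy, n_swaps=10, seed=1)
print(hub_hub_links(toy, 3), hub_hub_links(randomized, 3))
```

Comparing a statistic such as the hub-to-hub link count in the observed network against its distribution over many randomized copies gives a rough sense of whether such links are suppressed or favored.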

