Conference Paper

Structural Prediction of Protein-Protein Interactions in Saccharomyces cerevisiae

Kansas State Univ., Manhattan
DOI: 10.1109/BIBE.2007.4375729 Conference: Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
Source: IEEE Xplore

ABSTRACT Protein-protein interactions (PPI) refer to the associations between proteins and to the study of these associations. Several approaches have been used to address the problem of predicting PPI. Some are based on biological features extracted from a protein sequence (such as amino acid composition, GO terms, etc.); others use relational and structural features extracted from the PPI network, which can be represented as a graph. Our approach falls in the second category. We adapt a general approach to graph feature extraction that has previously been applied to collaborative recommendation of friends in social networks. Several structural features are identified based on the PPI graph and used to learn classifiers for predicting new interactions. Two datasets containing Saccharomyces cerevisiae PPI are used to test the proposed approach. Both datasets were assembled from the Database of Interacting Proteins (DIP). We assembled the first dataset directly from DIP in April 2006, while the second dataset has been used in previous studies, which makes it easy to compare our approach with previous approaches. Several classifiers are trained using the structural features extracted from the interaction graph. The results show good performance (accuracy, sensitivity and specificity), proving that the structural features are highly predictive with respect to PPI.
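
The kind of structural feature extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (build_graph, shortest_path_length, pair_features) and the particular features (node degree, common neighbors, Jaccard overlap, shortest-path distance) are assumptions chosen for illustration, and the abstract does not list the features actually used in the paper. The sketch treats the PPI network as an undirected graph and computes one feature vector per candidate protein pair.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def shortest_path_length(adj, source, target, skip_edge=None):
    """Breadth-first distance from source to target, or None if unreachable.

    skip_edge (hypothetical parameter) names one edge to ignore, so that a
    known interaction between the pair does not leak into its own features.
    """
    if source == target:
        return 0
    skipped = set(skip_edge) if skip_edge else None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj.get(node, ()):
            if skipped is not None and {node, nxt} == skipped:
                continue
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def pair_features(adj, u, v):
    """Illustrative structural features for the candidate pair (u, v)."""
    # Exclude the pair's own edge from both neighborhoods.
    nu = adj.get(u, set()) - {v}
    nv = adj.get(v, set()) - {u}
    common = nu & nv
    union = nu | nv
    return {
        "degree_u": len(nu),
        "degree_v": len(nv),
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "distance": shortest_path_length(adj, u, v, skip_edge=(u, v)),
    }


# Toy network with made-up protein labels.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
toy_graph = build_graph(toy_edges)
print(pair_features(toy_graph, "A", "D"))
```

Feature dictionaries like these, computed for known interacting pairs and for sampled non-interacting pairs, could then be passed to any standard classifier; the choice of classifier and of negative examples is left open here.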

  • Source
    ABSTRACT: MOTIVATION: An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. The expectation is that this will provide a fuller appreciation of cellular processes and networks at the protein level, ultimately leading to a better understanding of disease mechanisms and suggesting new means for intervention. This paper addresses the question: can protein-protein interactions be predicted directly from primary structure and associated data? Using a diverse database of known protein interactions, a Support Vector Machine (SVM) learning system was trained to recognize and predict interactions based solely on primary structure and associated physicochemical properties. RESULTS: Inductive accuracy of the trained system, defined here as the percentage of correct protein interaction predictions for previously unseen test sets, averaged 80% for the ensemble of statistical experiments. Future proteomics studies may benefit from this research by proceeding directly from the automated identification of a cell's gene products to prediction of protein interaction pairs.
    Bioinformatics 06/2001; 17(5):455-60. DOI:10.1093/bioinformatics/17.5.455 · 4.98 Impact Factor
  • Source
    ABSTRACT: Protein-protein interactions play pivotal roles in various aspects of the structural and functional organization of the cell, and their complete description is indispensable to thorough understanding of the cell. As an approach toward this goal, here we report a comprehensive system to examine two-hybrid interactions in all of the possible combinations between proteins of Saccharomyces cerevisiae. We cloned all of the yeast ORFs individually as a DNA-binding domain fusion ("bait") in a MATa strain and as an activation domain fusion ("prey") in a MATα strain, and subsequently divided them into pools, each containing 96 clones. These bait and prey clone pools were systematically mated with each other, and the transformants were subjected to strict selection for the activation of three reporter genes followed by sequence tagging. Our initial examination of approximately 4 × 10^6 different combinations, constituting approximately 10% of the total to be tested, has revealed 183 independent two-hybrid interactions, more than half of which are entirely novel. Notably, the obtained binary data allow us to extract more complex interaction networks, including the one that may explain a currently unsolved mechanism for the connection between distinct steps of vesicular transport. The approach described here thus will provide many leads for integration of various cellular functions and serve as a major driving force in the completion of the protein-protein interaction map.
    Proceedings of the National Academy of Sciences 03/2000; 97(3):1143-7. DOI:10.1073/pnas.97.3.1143 · 9.67 Impact Factor
  • Source
    ABSTRACT: Motivation: Proteins play a fundamental role in every process within the cell. Understanding how proteins interact, and the functional units they are part of, is important to furthering our knowledge of the entire biological process. There has been a growing amount of work, both experimental and computational, on determining the protein-protein interaction network. Recently, researchers have had success looking at this as a relational learning problem. Results: In this work, we further this investigation, proposing several novel relational features for predicting protein-protein interaction. These features can be used in any classifier. Our approach allows large and complex networks to be analyzed and is an alternative to using more expensive relational methods. We show that we are able to get an accuracy of 81.7% when predicting new links from noisy high throughput data.