Three-dimensional reconstruction of protein networks provides insight into human genetic disease

Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA.
Nature Biotechnology (Impact Factor: 41.51). 01/2012; 30(2):159-64. DOI: 10.1038/nbt.2106
Source: PubMed


To better understand the molecular mechanisms and genetic basis of human disease, we systematically examine relationships between 3,949 genes, 62,663 mutations and 3,453 associated disorders by generating a three-dimensional, structurally resolved human interactome. This network consists of 4,222 high-quality binary protein-protein interactions with their atomic-resolution interfaces. We find that in-frame mutations (missense point mutations and in-frame insertions and deletions) are enriched on the interaction interfaces of proteins associated with the corresponding disorders, and that the disease specificity for different mutations of the same gene can be explained by their location within an interface. We also predict 292 candidate genes for 694 unknown disease-to-gene associations with proposed molecular mechanism hypotheses. This work indicates that knowledge of how in-frame disease mutations alter specific interactions is critical to understanding pathogenesis. Structurally resolved interaction networks should be valuable tools for interpreting the wealth of data being generated by large-scale structural genomics and disease association studies.
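The enrichment claim above can be illustrated with a toy calculation. The sketch below is not the authors' procedure: it assumes a simple 2x2 comparison of mutated versus unmutated residues on and off an interface, scored with an odds ratio and a one-sided hypergeometric tail probability. The function name `interface_enrichment` and all counts are hypothetical.

```python
from math import comb


def interface_enrichment(mut_on_interface, mut_total, res_on_interface, res_total):
    """Odds ratio and one-sided hypergeometric p-value for interface enrichment.

    Assumption: each mutated residue is counted once, and every residue is
    either on an interaction interface or not.
    """
    a = mut_on_interface                       # mutated, on interface
    b = mut_total - mut_on_interface           # mutated, off interface
    c = res_on_interface - mut_on_interface    # unmutated, on interface
    d = (res_total - res_on_interface) - b     # unmutated, off interface

    # Odds ratio > 1 means mutations fall on interfaces more often than
    # expected from the interface's share of residues.
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")

    # P(X >= a): probability of seeing at least `a` interface hits when
    # `mut_total` residues are drawn at random from `res_total`.
    upper = min(mut_total, res_on_interface)
    tail = sum(
        comb(res_on_interface, k) * comb(res_total - res_on_interface, mut_total - k)
        for k in range(a, upper + 1)
    )
    p_value = tail / comb(res_total, mut_total)
    return odds_ratio, p_value


# Hypothetical protein: 200 residues, 40 on an interface,
# 30 disease mutations of which 12 hit the interface.
odds_ratio, p_value = interface_enrichment(12, 30, 40, 200)
```

With these made-up counts the interface holds 20% of residues but 40% of mutations, so the odds ratio exceeds 1. A real analysis would also need to control for residue accessibility, mutation type and multiple testing.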

  • Source
    • "To take advantage of the dynamic nature of PPI data, a new three dimensional representation should be stated integrating protein structure, conformation, isoforms and spatial information. Several recent research works take advantage of this idea to incorporate atomic-level protein structure information in PPI networks (Das et al., 2014) in order to examine the structural principles of disease mutations over a PPI network, or even to elucidate the genetic and molecular mechanisms of underlying human diseases (Wang et al., 2012). One of the ultimate goals of PPI analysis should be the biomarkers' discovery. "

    Frontiers in Genetics 09/2015; 6. DOI:10.3389/fgene.2015.00289
  • Source
    • "To identify possibly different roles of P and C domains in diseases, we investigated the distribution of oncogenic mutations in the DDI network. Previous reports (Wang et al., 2012) showed that disease-related mutations tend to be "

    Protein & Cell 05/2015; 6(8). DOI:10.1007/s13238-015-0158-0 · 3.25 Impact Factor
  • Source
    • "Many disease-related mutations have been found at the interface of protein complexes [32], [33]. These mutations can disrupt protein-protein interactions, affecting signalling pathways and leading to diseases. "
    ABSTRACT: Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both, domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases.
    PLoS ONE 09/2014; 9(9):e107353. DOI:10.1371/journal.pone.0107353 · 3.23 Impact Factor