Predicting disease genes using protein-protein interactions

Radboud University Nijmegen, Nijmegen, Gelderland, Netherlands
Journal of Medical Genetics (Impact Factor: 5.64). 09/2006; 43(8):691-8. DOI: 10.1136/jmg.2006.041376
Source: PubMed

ABSTRACT
Background: The responsible genes have not yet been identified for many genetically mapped disease loci. Physically interacting proteins tend to be involved in the same cellular process, and mutations in their genes may lead to similar disease phenotypes.
Objective: To investigate whether protein-protein interactions can predict genes for genetically heterogeneous diseases.
Methods: 72,940 protein-protein interactions between 10,894 human proteins were used to search 432 loci for candidate disease genes representing 383 genetically heterogeneous hereditary diseases. For each disease, the protein interaction partners of its known causative genes were compared with the disease-associated loci lacking identified causative genes. Interaction partners located within such loci were considered candidate disease gene predictions. Prediction accuracy was tested using a benchmark set of known disease genes.
Results: Almost 300 candidate disease gene predictions were made. Some of these have since been confirmed. On average, 10% or more are expected to be genuine disease genes, representing a 10-fold enrichment compared with positional information only. Examples of interesting candidates are AKAP6 for arrhythmogenic right ventricular dysplasia 3 and SYN3 for familial partial epilepsy with variable foci.
Conclusions: Exploiting protein-protein interactions can greatly increase the likelihood of finding positional candidate disease genes. When applied on a large scale, they can lead to novel candidate gene predictions.
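
The candidate selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the data layout (dictionaries keyed by disease and locus) and the toy gene identifiers (K1, C1 and so on) are all assumptions made for illustration. It shows only the core idea that a direct interaction partner of a known causative gene becomes a candidate when it lies inside a locus of the same disease that has no identified gene yet.

```python
from collections import defaultdict


def build_partner_map(interactions):
    """Map each protein to the set of its direct interaction partners."""
    partners = defaultdict(set)
    for a, b in interactions:
        partners[a].add(b)
        partners[b].add(a)
    return partners


def predict_candidates(interactions, known_genes, open_loci):
    """Return sorted (disease, locus, candidate, known_partner) tuples.

    interactions: iterable of (gene_a, gene_b) pairs (undirected)
    known_genes:  dict disease -> set of known causative genes
    open_loci:    dict disease -> dict locus -> set of genes located in
                  that locus, for loci without an identified causative gene
    All three structures are hypothetical layouts chosen for this sketch.
    """
    partners = build_partner_map(interactions)
    predictions = set()
    for disease, causative in known_genes.items():
        loci = open_loci.get(disease, {})
        for known in causative:
            # Only direct partners are considered, not partners of partners.
            for partner in partners.get(known, ()):
                if partner in causative:
                    continue  # already a known gene for this disease
                for locus, genes in loci.items():
                    if partner in genes:
                        predictions.add((disease, locus, partner, known))
    return sorted(predictions)


# Toy data with made-up identifiers (not real genes or loci).
toy_interactions = [("K1", "C1"), ("K1", "X9"), ("K2", "C2"), ("C1", "C3")]
toy_known = {"disease_1": {"K1", "K2"}}
toy_loci = {"disease_1": {"locus_a": {"C1", "Z5"}, "locus_b": {"C3", "Z6"}}}

toy_predictions = predict_candidates(toy_interactions, toy_known, toy_loci)
print(toy_predictions)
```

In the toy data, only C1 is reported: it interacts directly with the known gene K1 and sits in an open locus of the same disease. C3 sits in an open locus too, but it interacts only with C1, so this sketch does not select it.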

  • Source
    ABSTRACT: Many proteins are known to be associated with cancer. Quite often, their precise functional role in disease pathogenesis remains unclear. One strategy for gaining a better understanding of the function of these proteins is to combine different types of proteomics data. In this study, we extended Aragues's method by employing protein-protein interaction (PPI) data, domain-domain interaction (DDI) data, weighted domain frequency score (DFS), and cancer linker degree (CLD) data to predict cancer proteins. Performance was benchmarked in three kinds of experiments: (I) using individual algorithms, (II) combining algorithms, and (III) combining algorithms of the same classification type. Compared with Aragues's method, our proposed methods, that is, the machine learning algorithm and majority voting, are significantly superior in all seven performance measures. We demonstrated the accuracy of the proposed method on two independent datasets. The best algorithm can achieve a hit ratio of 89.4% and 72.8% for the lung cancer dataset and the lung cancer microarray study, respectively. It is anticipated that the current research could help in understanding disease mechanisms and in diagnosis.
    01/2015; 2015:312047. DOI:10.1155/2015/312047
  • Source
    ABSTRACT: A substantial proportion of Autism Spectrum Disorder (ASD) risk resides in de novo germline and rare inherited genetic variation. In particular, rare copy number variation (CNV) contributes to ASD risk in up to 10% of ASD subjects. Despite the striking degree of genetic heterogeneity, case-control studies have detected specific burden of rare disruptive CNV for neuronal and neurodevelopmental pathways. Here, we used machine learning methods to classify ASD subjects and controls, based on rare CNV data and comprehensive gene annotations. We investigated performance of different methods and estimated the percentage of ASD subjects that could be reliably classified based on presumed etiologic CNV they carry. We analyzed 1,892 Caucasian ASD subjects and 2,342 matched controls. Rare CNVs (frequency 1% or less) were detected using Illumina 1M and 1M-Duo BeadChips. Conditional Inference Forest (CF) typically performed as well as or better than other classification methods. We found a maximum AUC (area under the ROC curve) of 0.533 when considering all ASD subjects with rare genic CNVs, corresponding to 7.9% correctly classified ASD subjects and less than 3% incorrectly classified controls; performance was significantly higher when considering only subjects harboring de novo or pathogenic CNVs. We also found rare losses to be more predictive than gains and that curated neurally-relevant annotations (brain expression, synaptic components and neurodevelopmental phenotypes) outperform Gene Ontology and pathway-based annotations. CF is an optimal classification approach for case-control rare CNV data and it can be used to prioritize subjects with variants potentially contributing to ASD risk not yet recognized. The neurally-relevant annotations used in this study could be successfully applied to rare CNV case-control data-sets for other neuropsychiatric disorders.
    BMC Medical Genomics 01/2015; 8 Suppl 1:S7. DOI:10.1186/1755-8794-8-S1-S7 · 3.91 Impact Factor
  • Source
    ABSTRACT: Tumor necrosis factor–related apoptosis–inducing ligand (TRAIL) is an endogenous secreted peptide and, in preclinical studies, preferentially induces apoptosis in tumor cells rather than in normal cells. The acquisition of resistance in cells exposed to TRAIL or its mimics limits their clinical efficacy. Because kinases are intimately involved in the regulation of apoptosis, we systematically characterized kinases involved in TRAIL signaling. Using RNA interference (RNAi) loss-of-function and cDNA overexpression screens, we identified 169 protein kinases that influenced the dynamics of TRAIL-induced apoptosis in the colon adenocarcinoma cell line DLD-1. We classified the kinases as sensitizers or resistors or modulators, depending on the effect that knockdown and overexpression had on TRAIL-induced apoptosis. Two of these kinases that were classified as resistors were PX domain–containing serine/threonine kinase (PXK) and AP2-associated kinase 1 (AAK1), which promote receptor endocytosis and may enable cells to resist TRAIL-induced apoptosis by enhancing endocytosis of the TRAIL receptors. We assembled protein interaction maps using mass spectrometry–based protein interaction analysis and quantitative phosphoproteomics. With these protein interaction maps, we modeled information flow through the networks and identified apoptosis-modifying kinases that are highly connected to regulated substrates downstream of TRAIL. The results of this analysis provide a resource of potential targets for the development of TRAIL combination therapies to selectively kill cancer cells.
    Science Signaling 04/2015; DOI:10.1126/scisignal.2005700 · 7.65 Impact Factor