

Recent advances in high-throughput technologies have created exciting opportunities for systematically investigating the molecular basis of human disease. In addition to a growing catalogue of disease-associated genetic variations, we can now map out an increasingly detailed network diagram of the complex machinery of interacting molecules that constitutes the basis of (patho-) physiological states. The emerging field of ‘network medicine’ applies tools and concepts from network theory to interpret this diagram and elucidate the relation between perturbations on the molecular level and phenotypic disease manifestations. The interactome, i.e. the integrated network of all physical interactions within the cell, can be interpreted as a map and diseases as local perturbations. Network-based approaches can aid in identifying the specific interactome neighborhood that is perturbed in a certain disease, guide the search for therapeutic targets and reveal common molecular mechanisms between seemingly unrelated diseases.
Interactome-based approaches to human disease
Michael Caldera, Pisanu Buphamalai, Felix Müller and Jörg Menche
CeMM Research Center for Molecular Medicine of the Austrian
Academy of Sciences, Vienna, Austria
Corresponding author: Menche, Jörg (
These authors contributed equally to this work.
Current Opinion in Systems Biology 2017, 3:88-94
This review comes from a themed issue on Clinical and translational
systems biology (2017)
Edited by Jesper Tegnér and David Gomez-Cabrero
For a complete overview see the Issue and the Editorial
Available online 4 May 2017
2452-3100/© 2017 The Authors. Published by Elsevier Ltd. This is an
open access article under the CC BY-NC-ND license (http://creativeco
The Online Mendelian Inheritance in Man (OMIM)
database [1] currently lists over 3700 genes with mu-
tations that are known to have a phenotypic impact, e.g.
sequence alterations that are causal for Mendelian dis-
eases or variants that increase the susceptibility to
complex diseases or cancer. Yet, despite this ever
growing wealth of data, many details of how exactly
genetic alterations contribute to the disease pathobi-
ology remain in the dark. A crucial roadblock for
translating gene-level discoveries into a mechanistic
understanding of disease pathogenesis and concrete
strategies for prevention, diagnosis, and treatment is
that gene products do not act in isolation, but in the
context of other genes and proteins. Biological processes
are ultimately the result of a highly dynamic and regu-
lated interplay of macromolecules, such as interactions
between proteins or between proteins and DNA or
RNA. The entirety of all such biologically relevant interactions
forms a large and highly connected network, often referred to as
the ‘interactome’ (Box 1). The interactome can therefore be
understood as a map to investigate how individual (or several)
genetic alterations
propagate throughout the network and perturb the
system as a whole. The emerging field of ‘network
medicine’ applies tools and concepts from network
theory (Box 2) to interpret this map and elucidate the
relation between perturbations on the molecular level
and phenotypic disease manifestations [2]. In the last
decade, network-based approaches have been success-
fully applied to a broad range of diseases, with examples
ranging from rare Mendelian disorders [3], cancer [4] or
metabolic diseases [5], to identifying basic strategies by
which viruses hijack the host interactome [6], to name
but a few. In the following we will review the basic ideas
that underlie interactome-based approaches to human
disease and highlight important recent conceptual advances.
The interactome
The term ‘interactome’ is only loosely defined and may
refer to networks that contain rather different types of
interactions. It is instructive to distinguish between
physical and functional interactions. Physical interactions
involve actual physical contact between the partici-
pating biomolecules, for example proteins that assemble
in a complex or receptor-ligand binding. Functional
interaction, on the other hand, can refer to any kind of
biologically relevant relationship. In co-expression net-
works, for example, genes are connected if their
expression patterns are strongly correlated [7]. Another
important type of functional relationship is the ‘genetic
interaction’, where two genes are linked if the effect of a
simultaneous alteration of both genes differs from the
expectation based on the individual alterations. An
extreme form is synthetic lethality, where a combined loss
of two genes leads to cell death, while the loss of each
individual gene does not [8]. Synthetic viability,
conversely, occurs when the lethal effect of a mutation
in one gene is rescued by a simultaneous mutation in a
second gene [9]. While both functional and physical
interaction networks can yield important insights into
disease mechanisms, we will focus mostly on the more
narrowly defined physical interactions in the following.
A number of publicly available databases provide
comprehensive lists of physical protein-protein interactions
(PPIs), as well as other relevant interactions
(e.g. protein-DNA, protein-RNA, enzyme-metabolite)
in humans and in other species [10]. There are
three main sources for the PPIs reported therein: (i)
interactions curated from the scientific literature and
typically derived from small-scale experiments; (ii)
interactions from systematic, proteome-scale mapping
efforts, the two main techniques being yeast two-hybrid
assays for binary interactions [11] and affinity
purification coupled to mass spectrometry for
co-complexes [12]; and (iii) interactions from computational
predictions, for example based on protein structure
[13]. It is important to note that each of these sources
may introduce different kinds of noise and biases [14],
such as biases in the selection of which protein pairs
have been tested [15] or experimental biases, for
example towards highly expressed genes [11]. Another
important consideration for interactome-based analyses
is the considerable incompleteness of currently available
data. It is estimated, for example, that high-throughput
methods cover less than 20% of all potential pairwise
protein interactions in the human cell [11]. It is
therefore imperative to carefully evaluate both the
effect of potential biases and the influence of
missing interactions when analyzing and interpreting
interactome data. Box 1 summarizes the main topological
properties of a manually curated interactome from [16].
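The topological properties summarized in Box 1 (degree distribution, hubs, typical and maximum distances) can be computed from any list of interacting protein pairs. The following is a minimal sketch in plain Python; the function names and the toy edge list are illustrative assumptions and are not taken from [16].

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-interactions are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Return {degree k: number of nodes with degree k}."""
    counts = {}
    for neighbors in adj.values():
        k = len(neighbors)
        counts[k] = counts.get(k, 0) + 1
    return counts


def bfs_distances(adj, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_statistics(adj):
    """Mean shortest path length and maximum distance over connected pairs."""
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    mean = total / pairs if pairs else 0.0
    return mean, diameter


# Toy network (hypothetical): one hub "H" plus a short chain.
toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("D", "E")]
toy = build_graph(toy_edges)
hubs = [n for n, nb in toy.items() if len(nb) >= 4]
```

The all-pairs loop in `path_statistics` runs one breadth-first search per node, which is feasible for networks of the size described in Box 1 but slow; sampling a subset of source nodes is a common shortcut.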
Disease modules in the interactome
Among the first evidence for a direct correspondence
between the biological importance of a gene and the
interactome position of its product was the observation
that the phenotypic impact of deleting a gene in the
yeast Saccharomyces cerevisiae correlates with the number
of interaction partners of the corresponding protein
[17]. This trend was later confirmed also for genes that
Box 1. The interactome.
(a) A global picture of the interactome (as used in [16]) showing its highly complex and interconnected nature. It contains 13,460 proteins and
141,296 interactions that have been curated from different sources and cover various kinds of physical interactions, including binary interactions from
systematic yeast two-hybrid screens, protein complexes, kinase-substrate pairs and others. (b) The overall topology is characterized by a highly
heterogeneous degree distribution that follows a power law. The vast majority of proteins have only a few connections, but there is also a
considerable number of extremely highly connected proteins, so-called hubs (33 proteins have more than 300 interactions). (c) These hubs serve
as shortcuts, so that on average any two proteins are connected to each other by fewer than four intermediate steps, a phenomenon often
called the ‘small-world’ effect. The maximum distance between any two proteins in the interactome is 13. (d) Other important structural properties
of the interactome. (e) A comparison of the distances observed among genes associated with the same disease and the respective random
expectation reveals that disease genes are not scattered randomly in the interactome, but aggregate in local, disease-specific neighborhoods,
so-called disease modules.
are essential for the viability of human cell lines [18].
The topological properties of disease-associated genes
are generally more diverse and may differ between dis-
ease classes (e.g. complex diseases, Mendelian diseases
or cancer), as well as inheritance modes (autosomal
dominant or recessive): cancer driver genes generally
show a strong tendency towards high network centrality
(Box 2), while recessive disease genes are often more
isolated and located at the periphery of the interactome
To further elucidate the detailed mechanisms by which
a disease-associated perturbation contributes to the
pathobiological phenotype, it is important not only to
understand the network properties of individual asso-
ciated genes, but also their interactome environment
and emerging collective properties. This is particularly
evident for complex diseases that involve potentially
hundreds of genes. Similar to the functional coherence
of interactome neighbors (i.e., interacting proteins are
often involved in the same biological process [20]),
Box 2. Basic topological characteristics of networks.
The degree of a node is the number of links attached to it, i.e. the number of direct neighbors. The distribution of the degrees across all nodes is an
important global characteristic of a network.
Scale-free networks are characterized by a heterogeneous degree distribution that follows a power law: while most nodes have only a few
neighbors, there are also a few highly connected ‘hubs’ with a large number of neighbors.
A path between two nodes is a sequence of links connecting the two. The minimum number of links needed to connect the two is called the
‘shortest path length’ and represents their network distance.
Centrality measures exist for both nodes and links and quantify their topological importance within the network. There are different types of
centrality measures, e.g. the ‘degree centrality’ (simply given by the degree) or the ‘betweenness centrality’ (quantifying how many shortest paths of
the full network pass through a certain node).
Clustering describes a tendency observed in many biological (and other) networks that two neighbors of a node are often also connected to each
other, thus forming a triangle.
Motifs are small recurrent subgraphs in a network that occur particularly frequently.
Network communities are groups of tightly interconnected nodes that have more connections among themselves than to the rest of the network.
genes associated with the same disease have been found
to interact with each other more frequently than ex-
pected by chance [21]. This observation has been
verified systematically for a large number of diseases
[16], thus confirming a fundamental hypothesis of
interactome-based approaches to human disease,
namely that disease genes tend to cluster within so-
called disease modules. Such disease modules are
connected subgraphs of the interactome that contain all
molecular determinants of a certain disease. The first
step towards elucidating the biological mechanisms of a
disease in a network-based framework is therefore to
identify the respective disease module.
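One simple way to test whether a set of disease genes clusters in the interactome is to compare the size of the largest connected component (LCC) they form with the sizes obtained for randomly chosen gene sets. The sketch below assumes the interactome is given as an adjacency map `{gene: set(neighbors)}` and uses uniform random sampling as the null model. This choice is an assumption made for brevity; degree-preserving randomization is a stricter alternative, and all function names are illustrative.

```python
import random
from statistics import mean, pstdev


def largest_component_size(adj, nodes):
    """Size of the largest connected component induced by `nodes`."""
    nodes = set(nodes) & set(adj)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def module_z_score(adj, disease_genes, n_random=1000, seed=0):
    """Compare the observed LCC size with uniformly sampled gene sets.

    Returns (observed LCC size, mean random LCC size, z-score).
    """
    rng = random.Random(seed)
    genes = sorted(set(disease_genes) & set(adj))
    observed = largest_component_size(adj, genes)
    population = sorted(adj)
    null = [
        largest_component_size(adj, rng.sample(population, len(genes)))
        for _ in range(n_random)
    ]
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, mu, z
```

A clearly positive z-score indicates that the disease genes form a larger connected subgraph than expected by chance, which is the signature of a disease module described above.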
Interactome-based gene prioritization
In recent years, a plethora of disease-module identifi-
cation methods have been proposed that explore the
local network neighborhood around known disease-
associated genes (‘seed genes’) to infer likely new disease
gene candidates [22]. They can roughly be classified
into three main categories: (i) Path-based approaches
consider the genes along the shortest paths between the
known disease genes as potential candidate genes.
These candidate genes can then be further ranked, for
example according to the number [23] or significance
[24] of paths they participate in, or filtered such that
they form a minimal connected subgraph, a so-called
Steiner tree [25]. (ii) Dynamical approaches aim to identify
candidate genes by propagating known disease associations
using dynamical models, for example diffusive
processes, where the network neighborhood around
seed genes is scanned by simulating random walks along
the links [26-29]. Genes that are visited more
frequently are considered dynamically closer to the seed
genes and are therefore ranked higher. (iii) Connectivity-based
approaches rank candidate genes according to
their number of links to seed genes [30-32].
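Two of these categories can be illustrated compactly. The sketch below implements a random walk with restart as one example of a dynamical approach, and a plain count of links to seed genes as the simplest connectivity-based score. It is not the specific algorithm of any of the cited methods; the restart probability of 0.5, the convergence tolerance and all function names are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at the seeds.

    adj: {node: set(neighbors)} for an undirected network.
    """
    seeds = [s for s in set(seeds) if s in adj]
    if not seeds:
        raise ValueError("no seed gene is present in the network")
    p0 = {n: 0.0 for n in adj}
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in adj}
        for u, neighbors in adj.items():
            if not neighbors:
                nxt[u] += (1.0 - restart) * p[u]  # isolated node keeps its mass
                continue
            share = (1.0 - restart) * p[u] / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in adj)
        p = nxt
        if change < tol:
            break
    return p


def rank_by_random_walk(adj, seeds, **kwargs):
    """Non-seed genes ordered by decreasing visiting probability."""
    p = random_walk_with_restart(adj, seeds, **kwargs)
    seed_set = set(seeds)
    candidates = [n for n in adj if n not in seed_set]
    return sorted(candidates, key=lambda n: (-p[n], str(n)))


def seed_link_counts(adj, seeds):
    """Connectivity-based score: number of direct links to seed genes."""
    seed_set = set(seeds)
    return {n: len(adj[n] & seed_set) for n in adj if n not in seed_set}
```

The raw link count favors highly connected proteins, so connectivity-based methods typically assess the statistical significance of the count rather than the count itself; that step is omitted here.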
Relationship between diseases
Considering the highly connected interactome, it is
apparent that diseases can rarely be understood as in-
dependent entities. Uncovering such relationships be-
tween diseases systematically can help us understand
how different pathological phenotypes are linked
together at the molecular level and shed light on disease
comorbidity, i.e. the observation that certain groups of
diseases frequently arise together [33]. Indeed, a large-
scale evaluation of shared gene associations revealed a
highly connected ‘diseasome’, in which more than 500
diseases form a giant component and more than 800
diseases have at least one link to another disease [34].
Other disease-disease networks have been constructed
based on shared metabolic pathways [35], phenotype
similarity [36,37], the structure of disease ontologies
[38] or comorbidity extracted from patient records
[39,40]. In an interactome-based framework, the rela-
tionship between two diseases is represented by
overlapping disease modules, indicating that perturba-
tions causing one disease are likely to also affect the
other disease. A systematic study of over 44,000 disease
pairs revealed that the degree of this overlap is highly
predictive for the pathobiological similarity of diseases,
such that diseases with overlapping modules show sig-
nificant co-expression patterns, symptom similarity, and
comorbidity, while those that reside in separated inter-
actome neighborhoods are pathobiologically and clini-
cally distinct [16].
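The text does not state how module overlap is quantified. One possibility, given here as an assumption rather than as the definitive procedure of [16], is a network separation score of the form s_AB = <d_AB> - (<d_AA> + <d_BB>)/2, where <d_AA> and <d_BB> are the mean distances from each gene to the closest other gene of the same disease and <d_AB> is the mean distance to the closest gene of the other disease. Negative values indicate overlapping modules, positive values indicate separated ones. A minimal sketch with illustrative names:

```python
from collections import deque
from statistics import mean


def bfs_distances(adj, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def nearest_distances(adj, sources, targets, skip_self):
    """For each source, the distance to its closest reachable target."""
    result = []
    for s in sources:
        dist = bfs_distances(adj, s)
        reachable = [
            dist[t] for t in targets
            if t in dist and not (skip_self and t == s)
        ]
        if reachable:
            result.append(min(reachable))
    return result


def separation(adj, genes_a, genes_b):
    """Separation score s_AB; genes shared by both sets contribute distance 0."""
    a = set(genes_a) & set(adj)
    b = set(genes_b) & set(adj)
    d_aa = nearest_distances(adj, a, a, skip_self=True)
    d_bb = nearest_distances(adj, b, b, skip_self=True)
    d_ab = (nearest_distances(adj, a, b, skip_self=False)
            + nearest_distances(adj, b, a, skip_self=False))
    if not (d_aa and d_bb and d_ab):
        raise ValueError("each gene set needs at least two mutually reachable genes")
    return mean(d_ab) - (mean(d_aa) + mean(d_bb)) / 2
```

On a simple chain of eight proteins, two gene sets at opposite ends give a positive score, while two sets sharing genes give a negative one.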
The considerable molecular-level overlap that has been
observed for many diseases pinpoints a limitation of
canonical disease classifications that, historically, are
largely based on clinicopathological evidence and often
categorized according to the organ system that the
disease primarily affects. Interactome-based method-
ologies could provide a more holistic framework for
disease classification based on molecular mechanism
Tissue-specific interactomes
The studies discussed above considered an integrated
interactome containing interactions that have been
identified using various techniques and were observed
under different experimental and biological conditions.
While such a global interactome provides invaluable
information for discovering general principles of disease-
associated network perturbations, it cannot account for
the cell-type or tissue-specific manifestations that
characterize many diseases. Directly measured context-
specific interactome networks are scarce, but can be
approximated by integrating more widely available
transcriptome or proteome information [42,43]. The
main idea is to use tissue-specific expression informa-
tion to filter the global interactome for interactions that
are feasible in a given tissue, i.e. both interaction part-
ners are present [44]. Consequently, the resulting
tissue-specific interactomes are generally smaller and
sparser. In line with the observation that essential genes
are more central in the global interactome, genes that
are expressed across many tissues (such as ‘house-
keeping’ genes) were found to form a core interactome
to which the more tissue-specific genes then attach,
thus forming tissue-specific peripheries [45-47]. A
comparison between the global and tissue-specific
interactomes further revealed that diseases typically
manifest in those tissues in which the corresponding
disease module is least fragmented [48]. Tissue-specific
interactome networks can therefore shed light onto the
detailed disease-associated rewiring events [49,50] and
considerably improve disease gene prioritization
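The filtering step described above can be sketched in a few lines. The example assumes a dictionary of per-gene expression values for one tissue and an arbitrary presence threshold of 1.0; the units, the threshold and the function names are all hypothetical. The second function counts into how many disconnected pieces a disease module breaks in the filtered network, as a simple proxy for fragmentation.

```python
def tissue_specific_edges(edges, expression, threshold=1.0):
    """Keep interactions whose two partners are both expressed in the tissue."""
    def expressed(gene):
        return expression.get(gene, 0.0) >= threshold

    return [(u, v) for u, v in edges if expressed(u) and expressed(v)]


def count_module_fragments(edges, module_genes):
    """Number of connected pieces formed by the module genes.

    Only edges between two module genes are used; a module gene without
    any such edge counts as a piece of its own.
    """
    genes = set(module_genes)
    adj = {g: set() for g in genes}
    for u, v in edges:
        if u in genes and v in genes and u != v:
            adj[u].add(v)
            adj[v].add(u)
    seen, fragments = set(), 0
    for start in genes:
        if start in seen:
            continue
        fragments += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return fragments
```

Applying `count_module_fragments` to the filtered edge lists of several tissues gives a rough indication of where a module remains most intact.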
Drugs in the interactome
From a network-based perspective, the action of drugs
can be interpreted similarly to the effect of disease-
associated genetic variants, i.e. as a local perturbation of
the interactome. Many of the concepts and tools introduced
above can therefore be applied immediately in
the context of network pharmacology [53,54]. Several
studies of drug-target networks have shown that most
currently used drugs are less selective than previously
assumed and instead target multiple proteins [55,56].
These target proteins tend to be more highly connected
than random proteins, but less so than essential pro-
teins. Most drugs do not target the corresponding dis-
ease module as a whole, but only a small subset or
adjacent interactome neighborhood [57]. It was further
found that drugs whose affected interactome neigh-
borhood is closer to the disease module tend to be more
effective in the clinic. These insights could help in
selecting the most promising drug targets, for example
by prioritizing targets according to their topological
properties [58], as well as in designing multitarget drugs
that act specifically and directly on the respective disease
module [54]. Another promising application of
interactome-based drug-disease relationships is
drug repurposing, for example by systematically
identifying diseases with shared molecular
mechanisms that may be modulated by the same therapeutic
intervention [59].
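The notion of a drug being close to a disease module can be made concrete with a distance measure. The sketch below uses the mean distance from each drug target to the nearest disease gene, computed with a single multi-source breadth-first search. This particular measure and all names are assumptions for illustration; a fuller analysis would also compare the value with a randomized reference, which is omitted here.

```python
from collections import deque


def distance_to_nearest(adj, sources):
    """Distance from every reachable node to its closest source node."""
    dist = {s: 0 for s in sources if s in adj}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def drug_disease_proximity(adj, drug_targets, disease_genes):
    """Mean distance from each drug target to the nearest disease gene.

    Returns None if no target can reach the disease module.
    """
    dist = distance_to_nearest(adj, disease_genes)
    reachable = [dist[t] for t in set(drug_targets) if t in dist]
    if not reachable:
        return None
    return sum(reachable) / len(reachable)


def rank_drugs(adj, drugs, disease_genes):
    """Drug names ordered from closest to farthest from the disease module.

    drugs: {drug name: iterable of target proteins} (hypothetical input).
    """
    scores = {
        name: drug_disease_proximity(adj, targets, disease_genes)
        for name, targets in drugs.items()
    }
    scored = [name for name, s in scores.items() if s is not None]
    return sorted(scored, key=lambda name: (scores[name], name))
```

A proximity of 0 means that all targets lie inside the disease module; larger values mean that the drug acts on a more distant neighborhood.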
Interactome-based approaches to human disease have
matured considerably in the past few years, now
possessing both a firm theoretical foundation and
a broad range of successful applications across all major
areas of human disease research. At the same time, the
interactome represents only one layer of relevant in-
formation. A pressing challenge on the way towards the
next generation of (network) medicine is to integrate
the ever growing amount of omics data (e.g., genomics,
epigenomics, proteomics, metabolomics, lipidomics).
Interactome-based, and more generally, network-based
approaches are inherently holistic and integrative, thus
offering unique opportunities in this endeavor.
J.M. is supported by the Vienna Science and Technology Fund, WWTF
[grant number WWTF-VRG005].
Interactome: A global network representing all molecular interactions in a cell. In most cases, the term specifically refers to physical interaction networks consisting mostly of protein-protein interactions, but also of protein-DNA or protein-RNA interactions. More generally, the term interactome may also be used to describe functional interactions, such as genetic interactions.
Disease gene: Gene with a known disease association. Sometimes the term is reserved for genes with a known mutant genotype that causes an inherited disorder. More generally, the term is also used for genes containing a risk variant for complex diseases or other, more indirect associations with a particular disease.
Candidate gene: Gene with a suspected role in the pathobiology of a disease based on prior evidence. The goal of disease gene prioritization methods is to identify the most likely candidates.
Disease module: The comprehensive set of cellular components associated with a certain disease and their interactions. More specifically, the term refers to a connected subgraph of the interactome whose perturbation causes the disease. Network-based disease module detection methods aim to identify this subgraph, in analogy to gene prioritization.
Context-specific interactomes: Contain only interactions that occur in a given biological context, such as cell type, tissue, or a specific disease condition. Such interactomes are most commonly obtained by filtering out proteins that are not expressed in the respective context.
Comorbidity: The tendency of certain diseases to co-occur in the same patient, suggesting shared underlying molecular mechanisms.
Papers of particular interest, published within the period of review,
have been highlighted as:
* of special interest
** of outstanding interest
1. Amberger JS, Bocchini CA, Schiettecatte F, Scott AF, Hamosh A: Online Mendelian Inheritance in Man (OMIM), an online catalog of human genes and genetic disorders. Nucleic Acids Res 2015, 43:D789-D798.
2. Loscalzo J, Barabási A-L, Silverman EK: Network medicine: complex systems in human disease and therapeutics. Harvard University Press; 2017.
3. Smedley D, Schubach M, Jacobsen JOB, Köhler S, Zemojtel T, Spielmann M, et al.: A whole-genome analysis framework for effective identification of pathogenic regulatory variants in Mendelian disease. Am J Hum Genet 2016, 99:595-606.
   A beautiful example of how interactome information can be successfully combined with a variety of additional datasets to build a predictive tool for prioritizing variants in the non-coding region for Mendelian disease.
4. Leiserson MDM, Vandin F, Wu H-T, Dobson JR, Eldridge JV, Thomas JL, et al.: Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat Genet 2015, 47:106-114.
5. Chen Y, Zhu J, Lum PY, Yang X, Pinto S, MacNeil DJ, et al.: Variations in DNA elucidate molecular networks that cause disease. Nature 2008, 452:429-435.
6. Pichlmair A, Kandasamy K, Alvisi G, Mulhern O, Sacco R, Habjan M, et al.: Viral immune modulators perturb the human molecular network by common and unique strategies. Nature 2012, 487:486-490.
7. Zhang B, Horvath S: A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 2005, 4:17.
8. Srivas R, Shen JP, Yang CC, Sun SM, Li J, Gross AM, et al.: A network of conserved synthetic lethal interactions for exploration of precision cancer therapy. Mol Cell 2016, 63:
9. Motter AE, Gulbahce N, Almaas E, Barabási A-L: Predicting synthetic rescues in metabolic networks. Mol Syst Biol 2008,
10. De Las Rivas J, Fontanillo C: Protein-protein interactions essentials: key concepts to building and analyzing interactome networks. PLoS Comput Biol 2010, 6:e1000807.
11. Rolland T, Taşan M, Charloteaux B, Pevzner SJ, Zhong Q, Sahni N, et al.: A proteome-scale map of the human interactome network. Cell 2014, 159:1212-1226.
    This paper introduces the largest currently available binary interactome map obtained from a systematic yeast two-hybrid screen. Particular emphasis is given to a quantification of the influence of biases in literature-curated interaction maps.
12. Huttlin EL, Ting L, Bruckner RJ, Gebreab F, Gygi MP, Szpyt J, et al.: The BioPlex network: A systematic exploration of the human interactome. Cell 2015, 162:425-440.
    The largest currently available interaction map based on the affinity purification mass spectrometry approach. A detailed topological analysis reveals that the network architecture reflects biological organisation principles.
13. Zhang QC, Petrey D, Deng L, Qiang L, Shi Y, Thu CA, et al.: Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature 2012, 490:556-560.
14. Hakes L, Pinney JW, Robertson DL, Lovell SC: Protein-protein interaction networks and biology - what's the connection? Nat Biotechnol 2008, 26:69-72.
15. Gillis J, Ballouz S, Pavlidis P: Bias tradeoffs in the creation and analysis of protein-protein interaction networks. J Proteom 2014, 100:44-54.
16. Menche J, Sharma A, Kitsak M, Ghiassian SD, Vidal M, Loscalzo J, et al.: Uncovering disease-disease relationships through the incomplete interactome. Science 2015, 347:
    A systematic study of 299 diseases revealing that current interactome maps have reached sufficient coverage to show that genes associated with the same disease tend to cluster in the same interactome neighborhood. Disease pairs with overlapping disease modules show significant molecular similarity, elevated coexpression of their associated genes, similar symptoms and high comorbidity.
17. Jeong H, Mason SP, Barabási A-L, Oltvai ZN: Lethality and centrality in protein networks. Nature 2001, 411:41-42.
18. Blomen VA, Májek P, Jae LT, Bigenzahn JW, Nieuwenhuis J, Staring J, et al.: Gene essentiality and synthetic lethality in haploid human cells. Science 2015, 350:1092-1096.
    A first large-scale investigation of essential genes in human cell lines, confirming their central position in interactome networks.
19. Piñero J, Berenstein A, Gonzalez-Perez A, Chernomoretz A, Furlong LI: Uncovering disease mechanisms through network biology in the era of Next Generation Sequencing. Sci Rep 2016, 6:24570.
    A thorough investigation of the topological interactome properties of disease genes for different classes of diseases and inheritance modes, offering a much more diverse picture than previously appreciated that can also explain apparent contradictions in the
20. Hartwell LH, Hopfield JJ, Leibler S, Murray AW: From molecular to modular cell biology. Nature 1999, 402:C47-C52.
21. Feldman I, Rzhetsky A, Vitkup D: Network properties of genes harboring inherited disease mutations. Proc Natl Acad Sci U S A 2008, 105:4323-4328.
22. Wang X, Gulbahce N, Yu H: Network-based methods for human disease gene prediction. Brief Funct Gen 2011, 10:
23. George RA, Liu JY, Feng LL, Bryson-Richardson RJ, Fatkin D, Wouters MA: Analysis of protein sequence and interaction data for candidate disease gene prediction. Nucleic Acids Res 2006, 34:e130.
24. Dezső Z, Nikolsky Y, Nikolskaya T, Miller J, Cherba D, Webb C, et al.: Identifying disease-specific genes based on their topological significance in protein networks. BMC Syst Biol 2009, 3:36.
25. Bailly-Bechet M, Borgs C, Braunstein A, Chayes J, Dagkessamanskaia A, François J-M, et al.: Finding undetected protein associations in cell signaling by belief propagation. Proc Natl Acad Sci U S A 2011, 108:882-887.
26. Krauthammer M, Kaufmann CA, Gilliam TC, Rzhetsky A: Molecular triangulation: bridging linkage and molecular-network information for identifying candidate genes in Alzheimer's disease. Proc Natl Acad Sci U S A 2004, 101:15148-15153.
27. Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R: Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol 2010, 6:e1000641.
28. Vandin F, Upfal E, Raphael BJ: Algorithms for detecting significantly mutated pathways in cancer. J Comput Biol 2011,
29. Smedley D, Köhler S, Czeschik JC, Amberger J, Bocchini C, Hamosh A, et al.: Walking the interactome for candidate prioritization in exome sequencing studies of Mendelian diseases. Bioinformatics 2014, 30:3215-3222.
30. Guney E, Oliva B: Exploiting protein-protein interaction networks for genome-wide disease-gene prioritization. PLoS One 2012, 7:e43557.
31. Wang X-D, Huang J-L, Yang L, Wei D-Q, Qi Y-X, Jiang Z-L: Identification of human disease genes from interactome network using graphlet interaction. PLoS One 2014, 9:e86142.
32. Ghiassian SD, Menche J, Barabási A-L: A DIseAse MOdule Detection (DIAMOnD) algorithm derived from a systematic analysis of connectivity patterns of disease proteins in the human interactome. PLoS Comput Biol 2015, 11:e1004120.
33. Hu JX, Thomas CE, Brunak S: Network biology concepts in complex disease comorbidities. Nat Rev Genet 2016, 17:
34. Goh K-I, Cusick ME, Valle D, Childs B, Vidal M, Barabási A-L: The human disease network. Proc Natl Acad Sci U S A 2007, 104:
35. Lee D-S, Park J, Kay KA, Christakis NA, Oltvai ZN, Barabási A-L: The implications of human metabolic network topology for disease comorbidity. Proc Natl Acad Sci U S A 2008, 105:
36. Van Driel MA, Bruggeman J, Vriend G, Brunner HG, Leunissen JAM: A text-mining analysis of the human phenome. Eur J Hum Genet 2006, 14:535-542.
37. Zhou X, Menche J, Barabási A-L, Sharma A: Human symptoms-disease network. Nat Commun 2014, 5:4212.
38. Caniza H, Romero AE, Paccanaro A: A network medicine approach to quantify distance between hereditary disease modules on the interactome. Sci Rep 2015, 5:17658.
39. Hidalgo CA, Blumm N, Barabási A-L, Christakis NA: A dynamic network approach for the study of human phenotypes. PLoS Comput Biol 2009, 5:e1000353.
40. Klimek P, Aichberger S, Thurner S: Disentangling genetic and environmental risk factors for individual diseases from multiplex comorbidity networks. Sci Rep 2016, 6:39658.
    A systematic study of over 300 diseases that integrates comorbidity networks and molecular networks in order to dissect the role of environmental and genetic factors in the pathogenesis of each individual disease.
41. Chan SY, Loscalzo J: The emerging paradigm of network
medicine in the study of human disease.Circ Res 2012, 111:
42. Fagerberg L, Hallström BM, Oksvold P, Kampf C, Djureinovic D,
Odeberg J, et al.: Analysis of the human tissue-specific
expression by genome-wide integration of transcriptomics
and antibody-based proteomics.Mol Cell Proteom 2014, 13:
43. Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J,
Sammeth M, et al.: The human transcriptome across tissues
and individuals.Science 2015, 348:660665,
44. Yeger-Lotem E, Sharan R: Human protein interaction networks
across tissues and diseases.Front Genet 2015, 6:257, http://
45. Bossi A, Lehner B: Tissue specificity and the human protein
interaction network.Mol Syst Biol 2009, 5:260,
46. Liu W, Wang J, Wang T, Xie H: Construction and analyses of
human large-scale tissue specific networks.PLoS One 2014,
. Barshir R, Shwartz O, Smoly IY, Yeger-Lotem E: Comparative
analysis of human tissue interactomes reveals factors lead-
ing to tissue-specific manifestation of hereditary diseases.
PLoS Comput Biol 2014, 10:e1003632,
The authors examine the topological features of over 300 diseases in
tissue-specific interactomes and identify an increased number of in-
teractions as a major determinant for tissue-specific disease
48. Kitsak M, Sharma A, Menche J, Guney E, Ghiassian SD,
Loscalzo J, et al.: Tissue specificity of human disease module.
Sci Rep 2016, 6:35241,
49. Ideker T, Krogan NJ: Differential network biology. Mol Syst Biol
2012, 8:565,
50. Greene CS, Krishnan A, Wong AK, Ricciotti E, Zelaya RA,
Himmelstein DS, et al.: Understanding multicellular function
and disease with human tissue-specific networks. Nat Genet
2015, 47:569–576,
51. Magger O, Waldman YY, Ruppin E, Sharan R: Enhancing the
prioritization of disease-causing genes through tissue specific
protein interaction networks. PLoS Comput Biol 2012, 8:
52. Li M, Zhang J, Liu Q, Wang J, Wu F-X: Prediction of disease-
related genes based on weighted tissue-specific networks by
using DNA methylation. BMC Med Gen 2014, 7(Suppl 2):S4,
53. Hopkins AL: Network pharmacology: the next paradigm in
drug discovery. Nat Chem Biol 2008, 4:682–690, http://
54. Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R:
Structure and dynamics of molecular networks: a novel
paradigm of drug discovery: a comprehensive review.
Pharmacol Ther 2013, 138:333–408,
55. Yıldırım MA, Goh K-I, Cusick ME, Barabási A-L, Vidal M:
Drug–target network. Nat Biotechnol 2007, 25:1119–1126,
56. Keiser MJ, Setola V, Irwin JJ, Laggner C, Abbas AI, Hufeisen SJ,
et al.: Predicting new molecular targets for known drugs.
Nature 2009, 462:175–181,
57. Guney E, Menche J, Vidal M, Barabási A-L: Network-based in
silico drug efficacy screening. Nat Commun 2016, 7:10331,
An analysis of the interactome relation between drug targets and the
respective disease modules showed that the therapeutic effect of drugs
is localized in a small network neighborhood of the disease genes.
58. Li Z-C, Huang M-H, Zhong W-Q, Liu Z-Q, Xie Y, Dai Z, et al.:
Identification of drug-target interaction from interactome
network with 'guilt-by-association' principle and topology
features. Bioinformatics 2016, 32:1057–1064,
59. Li J, Zheng S, Chen B, Butte AJ, Swamidass SJ, Lu Z: A survey
of current trends in computational drug repositioning. Brief
Bioinform 2016, 17:2–12,
94 Clinical and translational systems biology (2017)
Current Opinion in Systems Biology 2017, 3:88–94
... To address this challenge, several theoretical and methodological advances have been proposed in recent years that aim to develop improved therapeutic options. These advances include a new field of medicine called Network Medicine, which applies tools and concepts from network theory to elucidate the relation between perturbations on the molecular level and phenotypic disease manifestations [5,6]. According to this revolutionary idea, diseases are rarely caused by the deregulation of a single gene, but more typically they are the result of molecules associated with a given disease (i.e., disease genes) and co-localized in specific regions (i.e., disease module) of the human interactome (i.e., the integrated network of all physical interactions within the cell) [5,6]. The Network Medicine paradigm states that not only a given disease but also the action of a given drug can be interpreted as a perturbation within the human interactome. As a consequence, for a drug to be on-target effective against a specific disease or to cause off-target adverse effects, its targets should be within or in the immediate vicinity of the corresponding disease module in the interactome [6]. This construct has fueled the development of several computational approaches for detecting novel therapeutic targets as well as drug repurposing candidates [7][8][9][10]. ...
Breast cancer (BC) is a heterogeneous and complex disease characterized by different subtypes with distinct morphologies and clinical implications and for which new and effective treatment options are urgently demanded. The computational approaches recently developed for drug repurposing provide a very promising opportunity to offer tools that efficiently screen potential novel medical indications for various drugs that are already approved and used in clinical practice. Here, we started with disease-associated genes that were identified through a transcriptome-based analysis, which we used to predict potential repurposable drugs for various breast cancer subtypes by using an algorithm that we developed for drug repurposing called SAveRUNNER. Our findings were also in silico validated by performing a gene set enrichment analysis, which confirmed that most of the predicted repurposable drugs may have a potential treatment effect against breast cancer pathophenotypes.
... The result of this is a high variance of node degree and the emergence of hub nodes [2,7]. The presence of hub nodes plays a role in the so-called small-world effect in networks: hubs offer shortcuts between nodes, resulting in generally short distances between any two nodes, even if the network is large [34,12,55]. ...
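The shortcut effect of hubs described in this excerpt can be illustrated with a minimal sketch. It assumes a toy ring network rather than a real interactome, and the helper names (`ring`, `add_hub`, `mean_shortest_path`) are hypothetical, introduced only for this illustration. The sketch compares the mean shortest-path length of the ring before and after a single high-degree node is attached to it.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def mean_shortest_path(adj):
    """Average hop distance over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

def ring(n):
    """Toy network (an assumption): n nodes on a ring, each linked to two neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def add_hub(adj, hub, spokes):
    """Return a copy of adj with one extra hub node linked to the given nodes."""
    new = {k: set(v) for k, v in adj.items()}
    new[hub] = set(spokes)
    for s in spokes:
        new[s].add(hub)
    return new

n = 40
plain = ring(n)
with_hub = add_hub(plain, "hub", range(0, n, 4))  # hub touches every 4th node

degrees = sorted(len(v) for v in with_hub.values())
print("max degree:", degrees[-1], "median degree:", degrees[len(degrees) // 2])
print("mean distance without hub:", round(mean_shortest_path(plain), 2))
print("mean distance with hub:   ", round(mean_shortest_path(with_hub), 2))
```

In this toy setting a single node with a much higher degree than the rest is enough to shorten typical distances considerably. Real interaction networks have many hubs and a broad degree distribution, which this sketch does not attempt to model.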
... Physical interaction involves actual physical interaction between biological molecules, as for example in PPI networks, whereas functional interaction is an umbrella term for any biological interaction. When genes are governed by the same regulators, resulting in a coordinated expression pattern, the genes are said to be co-expressed, which is a functional interaction [12]. ...
... In the last few years, the new paradigm of Network Medicine overcame the conventional medicine paradigm 'one gene, one drug, one disease' mainly focused on treating the symptoms rather than discovering the causes of diseases [4,5]. According to the Network Medicine paradigm, diseases are rarely caused by a single gene mutation, but more typically by the deregulation of a set of genes interconnected with each other within the human interactome. ...
... Moreover, it is becoming increasingly evident that the molecular determinants associated with a given disease (named disease genes) have a high propensity to agglomerate in specific regions of the interactome, suggesting the existence of specific disease network modules for each disease [5][6][7][8][9][10]. Thus, fully identifying these disease modules and understanding the effects of their perturbations on disease onset and progression could help unveil new diagnostic biomarkers as well as therapeutic targets. ...
Alzheimer’s disease (AD) is the most common neurodegenerative disease that currently lacks available effective therapy. Thus, identifying novel molecular biomarkers for diagnosis and treatment of AD is urgently demanded. In this study, we exploited tools and concepts of the emerging research area of Network Medicine to unveil a novel putative disease gene signature associated with AD. We proposed a new pipeline, which combines the strengths of two consolidated algorithms of Network Medicine: DIseAse MOdule Detection (DIAMOnD), designed to predict new disease-associated genes within the human interactome network; and SWItch Miner (SWIM), designed to predict important (switch) genes within the co-expression network. Our integrated computational analysis allowed us to enlarge the set of known disease genes associated with AD by 14 additional genes that may be proposed as new potential diagnostic biomarkers and therapeutic targets for the AD phenotype.
... Targeting such interactions is a promising strategy for new drug development [28,29]. Thus, it is strongly suggested that significant changes in protein-protein interactions occur depending on the phenotype of the individual (i.e., diseased versus healthy) [30]. ...
Cancer hallmark genes and proteins orchestrate and drive carcinogenesis to a large extent, therefore, it is important to study these features in different cancer types to understand the process of tumorigenesis and discover measurable indicators. We performed a pan-cancer analysis to map differentially interacting hallmarks of cancer proteins (DIHCP). The TCGA transcriptome data associated with 12 common cancers were analyzed and the differential interactome algorithm was applied to determine DIHCPs and DIHCP-centric modules (i.e., DIHCPs and their interacting partners) that exhibit significant changes in their interaction patterns between the tumor and control phenotypes. The diagnostic and prognostic capabilities of the identified modules were assessed to determine the ability of the modules to function as system biomarkers. In addition, the druggability of the prognostic and diagnostic DIHCPs was investigated. As a result, we found a total of 30 DIHCP-centric modules that showed high diagnostic or prognostic performance in any of the 12 cancer types. Furthermore, from the 16 DIHCP-centric modules examined, 29% of these were druggable. Our study presents candidate systems’ biomarkers that may be valuable for understanding the process of tumorigenesis and improving personalized treatment strategies for various cancers, with a focus on their ten hallmark characteristics.
... Early integration strategies (Fig. 1a) first combine the multimodal data into a uniform intermediate representation, such as a network (Caldera et al., 2017;Wang et al., 2014), which is then used for prediction and other analysis purposes. Intermediate strategies (Fig. 1b) jointly model the multiple datasets and their elements, such as genes, proteins, etc., also through a uniform intermediate representation. ...
Motivation Integrating multimodal data represents an effective approach to predicting biomedical characteristics, such as protein functions and disease outcomes. However, existing data integration approaches do not sufficiently address the heterogeneous semantics of multimodal data. In particular, early and intermediate approaches that rely on a uniform integrated representation reinforce the consensus among the modalities but may lose exclusive local information. The alternative late integration approach that can address this challenge has not been systematically studied for biomedical problems. Results We propose Ensemble Integration (EI) as a novel systematic implementation of the late integration approach. EI infers local predictive models from the individual data modalities using appropriate algorithms and uses heterogeneous ensemble algorithms to integrate these local models into a global predictive model. We also propose a novel interpretation method for EI models. We tested EI on the problems of predicting protein function from multimodal STRING data and mortality due to coronavirus disease 2019 (COVID-19) from multimodal data in electronic health records. We found that EI accomplished its goal of producing significantly more accurate predictions than each individual modality. It also performed better than several established early integration methods for each of these problems. The interpretation of a representative EI model for COVID-19 mortality prediction identified several disease-relevant features, such as laboratory test (blood urea nitrogen and calcium) and vital sign measurements (minimum oxygen saturation) and demographics (age). These results demonstrated the effectiveness of the EI framework for biomedical data integration and predictive modeling. Availability and implementation Code and data are available at Supplementary information Supplementary data are available at Bioinformatics Advances online.
... Examples of common networks include, but are not limited to: (1) gene co-expression networks (the ones I explored in Chapter 3), (2) protein-protein interactions to understand (or predict) how different proteins can interact and be activated under certain conditions [181,303], (3) metabolic networks which can inform how metabolites are transformed to synthesise other substances (and therefore each edge represents a metabolic reaction) [149], and (4) the interactome which loosely represents the integrated network of all physical/molecular interactions in a cell [51], therefore allowing for a holistic integration of many "-omics" in the same representation. ...
The assumptions made before modelling real-world data greatly affect performance tasks in machine learning. It is then paramount to find a good data representation in order to successfully develop machine learning models. When no considerable prior assumption exists on the data, values are directly represented in a "flatten", 1-Dimensional vector space. However, it is possible to go one step further and perceive more complex relational patterns: for example, a Graph-Dimensional space is used to illustrate the more structured way to represent data and their relational inductive bias. This thesis is focused on these two computational data dimensions across two scales of human biology: the micro scale of molecular biology using gene expression data, and the macro scale of neuroscience using neuroimaging data. Different modelling approaches will be explored to understand how one can model and represent high-dimensional brain data across the specific needs in the applied fields of these two scales. Specifically, for Graph-Dimensional data two approaches will be developed. Firstly, specific and shared genetic profiles that can be generalisable to external datasets will be extracted by applying multilayer co-expression networks across 49 human tissues. Then, a novel deep learning model will be introduced to leverage the entirety of resting-state fMRI data (i.e., spatial and temporal dynamics), as opposed to previous approaches in the literature that simplify and condense this type of data, while illustrating its robustness in an external multimodal dataset and explainability capacities. For 1-Dimensional data, an interpretable model will be developed for understanding cognitive factors using multimodal brain data. Overall, the research adopted in this thesis explores explainable data-driven representations and modelling approaches across the multidisciplinary scientific fields of machine learning, molecular biology, and neuroscience. It also helps highlight the contributions of these fields when modelling the brain and its intra- and inter-dynamics across the human body.
... In this perspective, drug repurposing offers an exciting and complementary alternative to rapidly approve some medicines already approved for other indications [23,24]. To identify novel drug-repurposing opportunities in the pursuit of unconventional but more efficacious MS treatments, promising insights come from the emerging field of system network theory and its application to medicine, known as network medicine [25][26][27]. As computational methods evolve, network medicine increases its capability of capturing the genetic and molecular intricacy of human diseases, and dissecting how such complexity rules disease manifestations, prognosis and, importantly, therapy [28][29][30]. ...
Multiple sclerosis is an autoimmune disease with a strong neuroinflammatory component that contributes to severe demyelination, neurodegeneration and lesions formation in white and grey matter of the spinal cord and brain. Increasing attention is being paid to the signaling of the biogenic amine histamine in the context of several pathological conditions. In multiple sclerosis, histamine regulates the differentiation of oligodendrocyte precursors, reduces demyelination, and improves the remyelination process. However, the concomitant activation of histamine H1–H4 receptors can sustain either damaging or favorable effects, depending on the specifically activated receptor subtype/s, the timing of receptor engagement, and the central versus peripheral target district. Conventional drug development has failed so far to identify curative drugs for multiple sclerosis, thus causing a severe delay in therapeutic options available to patients. In this perspective, drug repurposing offers an exciting and complementary alternative for rapidly approving some medicines already approved for other indications. In the present work, we have adopted a new network-medicine-based algorithm for drug repurposing called SAveRUNNER, for quantifying the interplay between multiple sclerosis-associated genes and drug targets in the human interactome. We have identified new histamine drug-disease associations and predicted off-label novel use of the histaminergic drugs amodiaquine, rupatadine, and diphenhydramine among others, for multiple sclerosis. Our work suggests that selected histamine-related molecules might get to the root causes of multiple sclerosis and emerge as new potential therapeutic strategies for the disease.
... Networks are a general mathematical formalism for representing relationships (links) between objects (nodes). Important examples in biology and medicine range from protein-protein interaction (PPI) networks representing physical interactions between proteins [137] or gene regulatory networks representing transcription factors binding to DNA [138], to signalling networks of immune cells [139] or networks of organs linked by metabolism [140]. More generally speaking, we can distinguish between physical networks, where the links represent a direct physical relationship (e.g., protein interactions) and functional networks, where links represent indirect relationships (e.g., co-expression networks) [141] (Figure 4A). ...
The early developmental phase is of critical importance for human health and disease later in life. To decipher the molecular mechanisms at play, current biomedical research is increasingly relying on large quantities of diverse omics data. The integration and interpretation of the different datasets pose a critical challenge towards the holistic understanding of the complex biological processes that are involved in early development. In this review, we outline the major transcriptomic and epigenetic processes and the respective datasets that are most relevant for studying the periconceptional period. We cover both basic data processing and analysis steps, as well as more advanced data integration methods. A particular focus is given to network-based methods. Finally, we review the medical applications of such integrative analyses.
Frontotemporal dementia (FTD) is a primary cause of dementia encompassing a broad range of clinical phenotypes and cellular pathologies. Genetic discoveries in FTD have largely been driven by linkage studies in well-documented extended families, explaining most of the patients with a known pathogenic mutation. In the context of complex diseases, it is hypothesized that mutations with reduced penetrance or a combination of low-effect size variants with environmental factors drive disease. Furthermore, these genes are likely to be part of the interaction networks of known FTD genes, contributing to converging cellular processes. In this review, we examine gene discovery approaches in FTD and introduce network biology concepts as tools to assist gene identification studies in genetically complex disease.
Background and Objective: With the advent of bioinformatics, biological databases have been constructed to computerize data. Biological systems can be described as interactions and relationships between elements constituting the systems, and they are organized in various biomedical open databases. These open databases have been used in approaches to predict functional interactions such as protein-protein interactions (PPI), drug-drug interactions (DDI) and disease-disease relationships (DDR). However, just combining interaction data has limited effectiveness in predicting the complex relationships occurring in a whole context. Each contributing source contains information on each element in a specific field of knowledge but there is a lack of inter-disciplinary insight in combining them. Methods: In this study, we propose the RWD Integrated platform for Discovering Associations in Biomedical research (RIDAB) to predict interactions between biomedical entities. RIDAB is established as a graph network to construct a platform that predicts the interactions of target entities. Biomedical open databases are combined with EMRs, representing a biomedical network and real-world data, respectively. To integrate databases from different domains to build the platform, mapping of the vocabularies was required. In addition, an appropriate network structure and graph embedding method had to be selected to fit the tasks. Results: The feasibility of the platform was evaluated using node similarity and link prediction for the drug repositioning task, a commonly used task for biomedical networks. In addition, we compared the US Food and Drug Administration (FDA)-approved repositioned drugs with the predicted result. By integrating the EMR database with biomedical networks, the platform showed an increased F1 score in predicting repositioned drugs, from 45.62% to 57.26%, compared to platforms based on biomedical networks alone.
Conclusions: This study demonstrates that the elements of biomedical research findings can be reflected by integrating EMR data with open-source biomedical networks. In addition, it showed the feasibility of using the established platform to represent the integration of biomedical networks and reflected the relationship between real-world networks.
Genes carrying mutations associated with genetic diseases are present in all human cells; yet, clinical manifestations of genetic diseases are usually highly tissue-specific. Although some disease genes are expressed only in selected tissues, the expression patterns of disease genes alone cannot explain the observed tissue specificity of human diseases. Here we hypothesize that for a disease to manifest itself in a particular tissue, a whole functional subnetwork of genes (disease module) needs to be expressed in that tissue. Driven by this hypothesis, we conducted a systematic study of the expression patterns of disease genes within the human interactome. We find that genes expressed in a specific tissue tend to be localized in the same neighborhood of the interactome. By contrast, genes expressed in different tissues are segregated in distinct network neighborhoods. Most important, we show that it is the integrity and the completeness of the expression of the disease module that determines disease manifestation in selected tissues. This approach allows us to construct a disease-tissue network that confirms known and predicts unexpected disease-tissue associations.
Most disorders are caused by a combination of multiple genetic and/or environmental factors. If two diseases are caused by the same molecular mechanism, they tend to co-occur in patients. Here we provide a quantitative method to disentangle how much genetic or environmental risk factors contribute to the pathogenesis of 358 individual diseases, respectively. We pool data on genetic, pathway-based, and toxicogenomic disease-causing mechanisms with disease co-occurrence data obtained from almost two million patients. From this data we construct a multilayer network where nodes represent disorders that are connected by links that either represent phenotypic comorbidity of the patients or the involvement of a certain molecular mechanism. From the similarity of phenotypic and mechanism-based networks for each disorder we derive a measure that allows us to quantify the relative importance of various molecular mechanisms for a given disease. We find that most diseases are dominated by genetic risk factors, while environmental influences prevail for disorders such as depressions, cancers, or dermatitis. We almost never find that more than one type of mechanism is involved in the pathogenesis of diseases.
Characterizing the behavior of disease genes in the context of biological networks has the potential to shed light on disease mechanisms, and to reveal both new candidate disease genes and therapeutic targets. Previous studies addressing the network properties of disease genes have produced contradictory results. Here we have explored the causes of these discrepancies and assessed the relationship between the network roles of disease genes and their tolerance to deleterious germline variants in human populations leveraging on: the abundance of interactome resources, a comprehensive catalog of disease genes and exome variation data. We found that the most salient network features of disease genes are driven by cancer genes and that genes related to different types of diseases play network roles whose centrality is inversely correlated to their tolerance to likely deleterious germline mutations. This proved to be a multiscale signature, including global, mesoscopic and local network centrality features. Cancer driver genes, the most sensitive to deleterious variants, occupy the most central positions, followed by dominant disease genes and then by recessive disease genes, which are tolerant to variants and isolated within their network modules.
The increasing cost of drug development together with a significant drop in the number of new drug approvals raises the need for innovative approaches for target identification and efficacy prediction. Here, we take advantage of our increasing understanding of the network-based origins of diseases to introduce a drug-disease proximity measure that quantifies the interplay between drug targets and diseases. By correcting for the known biases of the interactome, proximity helps us uncover the therapeutic effect of drugs, as well as to distinguish palliative from effective treatments. Our analysis of 238 drugs used in 78 diseases indicates that the therapeutic effect of drugs is localized in a small network neighborhood of the disease genes and highlights efficacy issues for drugs used in Parkinson and several inflammatory disorders. Finally, network-based proximity allows us to predict novel drug-disease associations that offer unprecedented opportunities for drug repurposing and the detection of adverse effects.
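To make the idea of a drug-disease proximity measure concrete, the following is a minimal sketch under stated assumptions; it is not the authors' implementation. It assumes one possible formulation, in which proximity is the mean hop distance from each drug target to its nearest disease gene, and it compares the observed value with randomly drawn node sets to obtain a z-score. The abstract states that proximity corrects for known interactome biases; the uniform random sampling used here is a deliberate simplification that does not do so. The toy path graph and all function names are hypothetical.

```python
import random
from collections import deque
from statistics import mean, pstdev

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def closest_distance(adj, targets, disease_genes):
    """Mean, over drug targets, of the hop distance to the nearest disease gene.

    Assumes a connected network, so every target reaches at least one gene.
    """
    per_target = []
    for t in targets:
        d = bfs_distances(adj, t)
        per_target.append(min(d[g] for g in disease_genes if g in d))
    return mean(per_target)

def proximity_z(adj, targets, disease_genes, n_random=200, seed=0):
    """Observed closest distance and its z-score against random node sets.

    Simplification (assumption): random sets are drawn uniformly, without
    matching node degree, so interactome biases are NOT corrected here.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = closest_distance(adj, targets, disease_genes)
    null = []
    for _ in range(n_random):
        rand_targets = rng.sample(nodes, len(targets))
        rand_genes = rng.sample(nodes, len(disease_genes))
        null.append(closest_distance(adj, rand_targets, rand_genes))
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, z

# Toy network (an assumption): a path of 20 nodes, 0 - 1 - ... - 19.
adj = {i: set() for i in range(20)}
for i in range(19):
    adj[i].add(i + 1)
    adj[i + 1].add(i)

disease_genes = [0, 1, 2]   # hypothetical disease module
near_targets = [3, 4]       # hypothetical drug acting next to the module
far_targets = [17, 18]      # hypothetical drug acting far from the module

d_near, z_near = proximity_z(adj, near_targets, disease_genes)
d_far, z_far = proximity_z(adj, far_targets, disease_genes)
print("near drug: distance", d_near, "z-score", round(z_near, 2))
print("far drug:  distance", d_far, "z-score", round(z_far, 2))
```

Because both calls use the same seed and set sizes, the two z-scores share one null distribution and differ only through the observed distances. A lower z-score indicates targets that lie closer to the disease genes than randomly chosen nodes do.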
We introduce a MeSH-based method that accurately quantifies similarity between heritable diseases at molecular level. This method effectively brings together the existing information about diseases that is scattered across the vast corpus of biomedical literature. We prove that sets of MeSH terms provide a highly descriptive representation of heritable disease and that the structure of MeSH provides a natural way of combining individual MeSH vocabularies. We show that our measure can be used effectively in the prediction of candidate disease genes. We developed a web application to query more than 28.5 million relationships between 7,574 hereditary diseases (96% of OMIM) based on our similarity measure.
The interpretation of non-coding variants still constitutes a major challenge in the application of whole-genome sequencing in Mendelian disease, especially for single-nucleotide and other small non-coding variants. Here we present Genomiser, an analysis framework that is able not only to score the relevance of variation in the non-coding genome, but also to associate regulatory variants to specific Mendelian diseases. Genomiser scores variants through either existing methods such as CADD or a bespoke machine learning method and combines these with allele frequency, regulatory sequences, chromosomal topological domains, and phenotypic relevance to discover variants associated to specific Mendelian disorders. Overall, Genomiser is able to identify causal regulatory variants as the top candidate in 77% of simulated whole genomes, allowing effective detection and discovery of regulatory variants in Mendelian disease.
The co-occurrence of diseases can inform the underlying network biology of shared and multifunctional genes and pathways. In addition, comorbidities help to elucidate the effects of external exposures, such as diet, lifestyle and patient care. With worldwide health transaction data now often being collected electronically, disease co-occurrences are starting to be quantitatively characterized. Linking network dynamics to the real-life, non-ideal patient in whom diseases co-occur and interact provides a valuable basis for generating hypotheses on molecular disease mechanisms, and provides knowledge that can facilitate drug repurposing and the development of targeted therapeutic strategies.
An emerging therapeutic strategy for cancer is to induce selective lethality in a tumor by exploiting interactions between its driving mutations and specific drug targets. Here we use a multi-species approach to develop a resource of synthetic lethal interactions relevant to cancer therapy. First, we screen in yeast ∼169,000 potential interactions among orthologs of human tumor suppressor genes (TSG) and genes encoding drug targets across multiple genotoxic environments. Guided by the strongest signal, we evaluate thousands of TSG-drug combinations in HeLa cells, resulting in networks of conserved synthetic lethal interactions. Analysis of these networks reveals that interaction stability across environments and shared gene function increase the likelihood of observing an interaction in human cancer cells. Using these rules, we prioritize ∼105 human TSG-drug combinations for future follow-up. We validate interactions based on cell and/or patient survival, including topoisomerases with RAD17 and checkpoint kinases with BLM.
Motivation: Identifying drug-target protein interaction is a crucial step in the process of drug research and development. Wet-lab experiments are laborious, time-consuming and expensive. Hence, there is a strong demand for the development of a novel theoretical method to identify potential interactions between drugs and target proteins. Results: We use all known proteins and drugs to construct a node- and edge-weighted, biologically relevant interactome network. On the basis of the "guilt-by-association" principle, novel network topology features are proposed to characterize interaction pairs, and a random forest algorithm is employed to identify potential drug-protein interactions. The accuracy of 92.53% derived from 10-fold cross-validation is about 10% higher than that of the existing method. We identify 2,272 potential drug-target interactions, some of which are associated with diseases, such as Torg-Winchester syndrome and rhabdomyosarcoma. The proposed method can not only accurately predict the interaction between drug molecule and target protein, but also help disease treatment and drug discovery. Contacts:; SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.