Article: Literature Review

The shortest path is not the one you know: Application of biological network resources in precision oncology research


Abstract

Several decades of molecular biology research have delivered a wealth of detailed descriptions of molecular interactions in normal and tumour cells. This knowledge has been functionally organised and assembled into dedicated biological pathway resources that serve as an invaluable tool, not only for structuring the information about molecular interactions but also for making it available for biological, clinical and computational studies. With the advent of high-throughput molecular profiling of tumours, close to complete molecular catalogues of mutations, gene expression and epigenetic modifications are available and require adequate interpretation. Taking into account the information about biological signalling machinery in cells may help to better interpret molecular profiles of tumours. Making sense of these descriptions requires biological pathway resources for functional interpretation of the data. In this review, we describe the available biological pathway resources and their characteristics in terms of construction mode, focus, aims and paradigms of biological knowledge representation. We present a new resource that is focused on cancer-related signalling, the Atlas of Cancer Signalling Networks. We briefly discuss current approaches for data integration, visualisation and analysis using biological networks, such as pathway scoring, guilt-by-association and network propagation. We then illustrate with several examples the added value of interpreting data in the context of biological networks and demonstrate that it may help in the analysis of high-throughput data, such as mutation, gene expression or small interfering RNA screening data, and can guide patient stratification. Finally, we discuss perspectives for improving precision medicine using biological network resources and tools.
Taking into account the information about biological signalling machinery in cells may help to better interpret molecular patterns of tumours and to bring precision oncology into general clinical practice. © The Author 2015. Published by Oxford University Press on behalf of the UK Environmental Mutagen Society. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.


... Edges connecting signaling network nodes are activating or inhibitory physical connections of participating proteins, microRNAs and DNA-sequences including enzymatic actions, like phosphorylation, or dephosphorylation [1][2][3][4][5]. ...
... There are several signaling network resources (such as the curated and multi-layered SignaLink database, http://signalink.org [2,5]), among which the Atlas of Cancer Signaling Networks [4] is primarily focused on signaling components important in cancer. Proteins with cancer-related mutations are often hubs of the signaling network, which are becoming enriched in positive regulatory loops during cancer development [6,7]. ...
... Incorporation of personalized data, such as mutation, single nucleotide polymorphism, transcriptional, proteome, signalome (e.g. phosphoproteome) and epigenetic profiles, into signaling networks significantly enhances patient- and disease stage-specific drug targeting in anti-cancer therapies [1][2][3][4][5][8,9]. Patient specificity can differentiate network behavior at no fewer than four levels: A.) at the level of the genetic background (e.g., single-nucleotide polymorphisms and cancer-related mutations, copy-number changes or chromatin rearrangements); B.) at the level of gene expression and translational changes (caused by e.g. ...
Article
Cancer initiation and development are increasingly perceived as systems-level phenomena, where intra- and inter-cellular signaling networks of the ecosystem of cancer and stromal cells offer efficient methodologies for outcome prediction and intervention design. Within this framework, RAS emerges as a 'contextual signaling hub', i.e. the final result of RAS activation or inhibition is determined by the signaling network context. Current therapies often 'train' cancer cells, shifting them to a novel attractor, which has increased metastatic potential and drug resistance. The few therapy-surviving cancer cells are surrounded by massive cell death, triggering a primordial adaptive and reparative general wound healing response. Overall, dynamic analysis of patient- and disease-stage specific intracellular and intercellular signaling networks may open new areas of anticancer therapy using multitarget drugs, drug combinations and edgetic drugs, as well as help design 'gentler' differentiation and maintenance therapies.
... Data visualisation: omics data visualisation (Kuperstein et al., 2015b; Satagopam et al., 2016; Mazein et al., 2021a); analysis of cell-specific mechanisms using single-cell expression data; RNA-Seq-based analysis of the activity of transcription factors; Network-based analysis: network analysis and representing map-based molecular signatures for sub-groups of patients in cancer (Kuperstein et al., 2015b; Jdey et al., 2016); structural network analysis together with omics data to rationalise the synergistic effects of drugs towards the design of complex disease stage-specific druggable interventions in preclinical studies (Monraz Gomez et al., 2019); network-based analysis and prediction of epithelial-to-mesenchymal-like transition mechanisms followed by experimental validation with the use of an animal model, a transgenic mouse (Chanrion et al., 2014); Computational modelling: creating computational models based on disease maps, for example, for rheumatoid arthritis (Miagoux et al., 2021; Aghakhani et al., 2022) and atherosclerosis (Parton et al., 2019); creating causal-interaction networks based on a disease map (Touré et al., 2021a); In preclinical studies with follow-up validation experiments: in preclinical studies in cancer (Chanrion et al., 2014; Jdey et al., 2016; Monraz Gomez et al., 2019) with validation of the proposed hypothesis by follow-up experiments; Fusing disease maps with other solutions: integrating disease maps with machine learning inferred networks (Miagoux et al., 2021); identifying new crosstalks between pathways from combined analysis of interactions and text mining datasets. ...
Article
Full-text available
As a conceptual model of disease mechanisms, a disease map integrates available knowledge and is applied for data interpretation, predictions and hypothesis generation. It is possible to model disease mechanisms on different levels of granularity and adjust the approach to the goals of a particular project. This rich environment, together with requirements for high-quality network reconstruction, makes it challenging for new curators and groups to be quickly introduced to the development methods. In this review, we offer a step-by-step guide for developing a disease map within its mainstream pipeline, which involves using the CellDesigner tool for creating and editing diagrams and the MINERVA Platform for online visualisation and exploration. We also describe how the Neo4j graph database environment can be used for efficiently managing and querying such a resource. To assess interoperability and reproducibility, we apply FAIR principles.
... capture the multiple cross-talks and interactions occurring between different cell processes (1). Analysis and visualization of omics data in the context of signalling network maps can help to detect patterns in the data projected onto the molecular mechanisms represented there. ...
... Transforming text to diagram: role of p53 and NOTCH in induction of EMT. The following statements were used for diagram construction: (1). Control of EMT program is performed by SNAIL and TWIST, the major transcription factors that can induce the executors EMT program (49). ...
Article
Full-text available
Generation and usage of high-quality molecular signalling network maps can be augmented by standardizing notations, establishing curation workflows and application of computational biology methods to exploit the knowledge contained in the maps. In this manuscript, we summarize the major aims and challenges of assembling information in the form of comprehensive maps of molecular interactions. Mainly, we share our experience gained while creating the Atlas of Cancer Signalling Network. In the step-by-step procedure, we describe the map construction process and suggest solutions for map complexity management by introducing a hierarchical modular map structure. In addition, we describe the NaviCell platform, a computational technology using Google Maps API to explore comprehensive molecular maps similar to geographical maps and explain the advantages of semantic zooming principles for map navigation. We also provide the outline to prepare signalling network maps for navigation using the NaviCell platform. Finally, several examples of cancer high-throughput data analysis and visualization in the context of comprehensive signalling maps are presented.
... This allows understanding the global picture and connectivity between processes that is very difficult to keep in mind just from reading multiple scientific papers. Once the processes are depicted together as diagrams, the relationship between molecular circuits in cells can be appreciated, which makes signaling network maps also didactic tools (1). ...
... From text to model. Representation of biochemical reactions from the following text from a molecular biology manuscript. Numbers correspond to the reactions in the diagram: « BRCA1 transcription (1) and translation (2) is positively regulated by E2F1/BRIT1* complex (3) and inhibited by p53 (4). BRCA1 protein is transported into nucleus (5), where CHEK2 kinase activates it by specific phosphorylation (6) and (7). ...
Chapter
Full-text available
Graphical representation of biological knowledge in the form of interactive diagrams has become widely used in molecular and computational biology. It enables the scientific community to exchange and discuss information on cellular processes described in numerous scientific publications and to interpret high-throughput data. Constructing a signaling network map is a laborious process; therefore, applying consistent procedures for representing molecular processes, together with accurately organized annotation, is essential for generating a high-quality signaling network map that can be used by various computational tools. We summarize here the major aims and challenges of assembling information in the form of comprehensive maps of molecular interactions and suggest an optimized workflow. We share our experience gained while creating a biological network resource, the Atlas of Cancer Signaling Network (ACSN), which was successfully applied in several studies. We explain the map construction process. Then we address the problem of user interaction with large signaling maps and suggest facilitating navigation through hierarchical organization of the map structure and application of semantic zooming principles. In addition, we describe a computational technology using Google Maps API to explore signaling networks in a manner similar to global geographical maps and provide the outline for preparing a biological network for this type of navigation. Nowadays the most demanded application of signaling maps is integration and functional interpretation of high-throughput data. We demonstrate several examples of cancer data visualization in the context of comprehensive signaling network maps.
... (Figure 5). We present this map in the form of a web-based atlas, which is a hierarchical and interconnected collection of maps browsable online [50]. The atlas depicts molecular mechanisms of Cell Cycle [8], DNA Repair, Cell Survival, Apoptosis, Epithelial-to-Mesenchymal Transition and Cell Motility and beyond. ...
... A screenshot of the cell cycle territory in the map of Atlas of Cancer Signalling Network (ACSN, http://acsn.curie.fr), with profiles of expression in several tumour samples shown on top of the protein icons [7,50]. ...
Article
The problem of dealing with complexity arises when we fail to achieve a desired behavior of biological systems (for example, in cancer treatment). In this review I formulate the problem of tackling biological complexity at the level of large-dimensional datasets and complex mathematical models of reaction networks. I show that in many cases the complexity can be reduced by using approximation by simpler objects (for example, using principal graphs for data dimension reduction, and using dominant systems for reducing complex models). Examples of dealing with complexity from various fields of molecular systems biology are used, in particular, from the analysis of cancer transcriptomes, mathematical modeling of protein synthesis and of cell fate decisions between death and life.
... Signatures and functional enrichment studies using pathway databases are suitable for stratifying cancers and understanding what molecular mechanisms are implicated in various cancer types, but these approaches still do not provide clues to the mechanistic basis of the disease and do not address the question of signaling network rewiring during cancer initiation and development. The step forward is to use molecular information detailed in pathway and signaling network resources such as Panther [23], Spike [24], Kegg Pathway [25], Reactome [26], ACSN [27] (Table 1). These resources provide a more global picture of cell signaling with sufficient granularity of molecular detail description, capturing crosstalks and feedback loops between molecular circuits. ...
... Defining a "field of influence", the distance in the vicinity of the deregulated node within which players are assumed to have a similar impact on the pathology, is a source of potential targets for interference. Exploiting the notion of "network distance" between proteins, namely if several affected proteins form a compact group when mapped onto the signaling network and can therefore be related functionally, they may together represent a set for intervention [27]. ...
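The two notions in this excerpt, a "field of influence" around a deregulated node and the "network distance" between affected proteins, can be made concrete with a minimal Python sketch. It assumes an unweighted, undirected network stored as an adjacency dictionary. The node names P1 to P6, the helper function names and the choice of mean pairwise shortest-path distance as a compactness measure are illustrative assumptions, not the method of the cited review.

```python
from collections import deque
from itertools import combinations


def build_undirected(edges):
    """Build an adjacency dictionary from a list of undirected edges."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def bfs_distances(adjacency, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def field_of_influence(adjacency, node, radius):
    """Nodes within `radius` hops of a deregulated node (the node itself excluded)."""
    distances = bfs_distances(adjacency, node)
    return {n for n, d in distances.items() if 0 < d <= radius}


def mean_pairwise_distance(adjacency, proteins):
    """Mean shortest-path distance over all pairs; lower means a more compact group."""
    total, pairs = 0, 0
    for a, b in combinations(list(proteins), 2):
        d = bfs_distances(adjacency, a).get(b)
        if d is None:  # the pair is disconnected
            return float("inf")
        total += d
        pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical toy network: a simple chain P1 - P2 - ... - P6.
toy = build_undirected(
    [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"), ("P5", "P6")]
)

print(field_of_influence(toy, "P3", radius=1))
print(mean_pairwise_distance(toy, ["P1", "P2", "P3"]))  # compact group
print(mean_pairwise_distance(toy, ["P1", "P3", "P6"]))  # scattered group
```

In this toy chain the first protein set scores lower than the second, which is the sense in which a group of affected proteins would be called compact. A real analysis would also need a background distribution (for example, random protein sets of the same size) to judge whether an observed compactness is unusual.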
Article
Signaling pathways implicated in cancer create a complex network with numerous regulatory loops and redundant pathways. This complexity explains the frequent failure of the one-drug-one-target paradigm of treatment, resulting in drug resistance in patients. To overcome the robustness of the cell signaling network, cancer treatment should be extended to a combination therapy approach. Integrating and analyzing patient high-throughput data together with the information about biological signaling machinery may help in deciphering molecular patterns specific to each patient and finding the best combinations of candidates for therapeutic targeting. We review the state of the art in the field of targeted cancer medicine from the computational systems biology perspective. We summarize major signaling network resources and describe their characteristics with respect to applicability for drug response prediction and suggestion of intervention targets. We then discuss methods for prediction of drug sensitivity and intervention combinations using signaling networks together with high-throughput data. Gradual integration of these approaches into clinical routine will improve prediction of response to standard treatments and adjustment of intervention schemes. Copyright © 2015. Published by Elsevier Inc.
... Yet, when exploring previously undiscovered MoA within vast biological networks, this approach proves relevant. Shortest path analysis has already been a common method for biological research (Kuperstein et al. 2015), including PPI network analysis and gene co-expression network analysis. In PPI networks, the shortest path analysis is used for indicating proteins with similar functions or functional protein complexes in biological separate disease modules (Garcia-Vaquero et al. 2018). ...
Article
Full-text available
Motivation: Many approaches in systems biology have been applied in drug repositioning due to the increased availability of omics data and computational biology tools. Using a multi-omics integrated network, which contains information on various biological interactions, could offer a more comprehensive perspective on and interpretation of the drug mechanism of action (MoA). Results: We developed a computational pipeline for dissecting the hidden MoAs of drugs (Open MoA). Our pipeline assigns confidence scores to edges that represent connections between genes/proteins in the integrated network. The interactions showing the highest confidence score could indicate potential drug targets and infer the underlying molecular MoAs. Open MoA was also validated by testing some well-established targets. Additionally, we applied Open MoA to reveal the MoA of a repositioned drug (JNK-IN-5A) that modulates the PKLR expression in HepG2 cells and found that STAT1 is the key transcription factor. Overall, Open MoA represents a first-generation tool that could be utilized for predicting the potential MoA of repurposed drugs and dissecting de novo targets for developing effective treatments. Availability and implementation: Source code is available at https://github.com/XinmengLiao/Open_MoA.
... This paper uses "cost" to indicate the distance, time, and other physical attributes of a network. It is applied in a variety of domains (both as a stand-alone model and as a subproblem in complex problems), such as transportation [1][2][3], biological networks [4], social networks [5], circuit board layout [6], and robotic search/navigation processes [7][8][9][10][11][12][13]. In recent years, the rapid development of sensing and Internet of Things (IoT) technology has allowed us to obtain environmental changes rapidly [14,15]. Since such changes impact SPP solutions, it becomes increasingly important for us to find SPP solutions efficiently. ...
Article
Full-text available
Shortest path problems are encountered in many engineering applications, e.g., intelligent transportation, robot path planning, and smart logistics. The environmental changes as sensed and transmitted via the Internet of Things make the shortest path change frequently, thus posing ever-increasing difficulty for traditional methods to meet the real-time requirements of many applications. Therefore, developing more efficient solutions has become particularly important. This paper presents an improved discrete Jaya algorithm (IDJaya) to solve the shortest path problem. A local search operation is applied to expand the scope of solution exploration and improve solution quality. The time complexity of IDJaya is analyzed. Experiments are carried out on seven real road networks and dense graphs in transportation-related processes. IDJaya is compared with the Dijkstra and ant colony optimization (ACO) algorithms. The results verify the superiority of the IDJaya over its peers. It can thus be well utilized to meet real-time application requirements.
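For readers unfamiliar with the Dijkstra baseline named in this abstract, the following is a minimal Python sketch of Dijkstra's algorithm on a weighted directed graph with non-negative edge costs. It is not the IDJaya algorithm, whose details are not given here. The node names and edge costs in `toy_roads` are hypothetical.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (total cost, path) of a cheapest route, or (inf, None) if unreachable.

    `graph` maps each node to a dict of {neighbour: non-negative edge cost}.
    """
    best = {source: 0.0}   # cheapest known cost to each node
    previous = {}          # predecessor on the cheapest known route
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            # Walk the predecessor links back to the source.
            path = [node]
            while path[-1] in previous:
                path.append(previous[path[-1]])
            return cost, path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))
    return float("inf"), None


# Hypothetical toy network; "cost" could stand for distance or travel time.
toy_roads = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 3},
    "C": {"B": 2, "D": 7},
    "D": {},
}

print(dijkstra(toy_roads, "A", "D"))
```

Here the cheapest route from A to D goes through C and B (cost 1 + 2 + 3) instead of taking the direct A to B edge. When edge costs change, as in the sensing scenario described above, this sketch would simply be rerun from scratch.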
... • network analysis and representing map-based molecular signatures for sub-groups of patients in cancer [64,65];
• structural network analysis together with omics data to rationalise the synergistic effects of drugs towards the design of complex disease stage-specific druggable interventions in preclinical studies [66];
• network-based analysis and prediction of epithelial-to-mesenchymal-like transition mechanisms followed by experimental validation with the use of an animal model, a transgenic mouse [67]; ...
Preprint
Full-text available
As a conceptual model of disease mechanisms, a disease map integrates available knowledge and is applied for data interpretation, predictions and hypothesis generation. It is possible to model disease mechanisms on different levels of granularity and adjust the approach to the goals of a particular project. This rich environment, together with requirements for high-quality network reconstruction, makes it challenging for new curators and groups to be quickly introduced to the development methods. In this review, we offer a step-by-step guide for developing a disease map within its mainstream pipeline, which involves using the CellDesigner tool for creating and editing diagrams and the MINERVA Platform for online visualisation and exploration. We also describe how the Neo4j graph database environment can be used for efficiently managing and querying such a resource. To assess interoperability and reproducibility, we apply FAIR principles.
... For example, the VANTED tool [5] creates a classification tree according to the KEGG pathway hierarchy and shows a biological network with omics data as barplots or pie charts attached to the nodes, which allows more complex data to be visualized than simple node coloring does. NaviCell [6] and the related pathway database Atlas of Cancer Signalling Network (ACSN), together with standard heat maps and barplots, provide more flexible data visualization tools such as glyphs (symbols with configurable shape, size and color) and map staining (using the network background for visualization) [7]. An interesting approach for data visualization using biological networks was developed in the NetGestalt online tool [8], which uses the NetSAM R package to create modules by hierarchical ordering of the network in one dimension and visualizes high-throughput data according to a chosen track as a combination of barplots and heat maps. ...
Preprint
Full-text available
Visualization and analysis of molecular profiling data together with biological networks can provide new mechanistic insights into biological functions. Currently, high-throughput data are usually visualized on top of predefined network layouts, which are not always adapted to a given data analysis task. We developed a Cytoscape app that constructs biological network layouts based on the data from molecular profiles imported as values of node attributes. DeDaL is a Cytoscape 3.0 app which uses linear and non-linear algorithms of dimension reduction to produce data-driven network layouts based on multidimensional data (typically gene expression). DeDaL implements several data pre-processing and layout post-processing steps, such as continuous morphing between two arbitrary network layouts and aligning one network layout with respect to another by rotating and mirroring. Combining these possibilities facilitates creating insightful network layouts representing both structural network features and the correlation patterns in multivariate data. DeDaL is the first method that allows biological network layouts to be constructed from high-throughput data. DeDaL is freely available for download, together with a step-by-step tutorial, at http://bioinfo-out.curie.fr/projects/dedal/.
... The shortest path algorithm. The shortest path algorithm, one of the network link algorithms, is used to intelligently identify the shortest connection between two genes or proteins in a graphical model that represents a cellular network [100,101]. The algorithm is illustrated in Fig. 3 and Algorithm 1. ...
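Fig. 3 and Algorithm 1 of the cited article are not reproduced on this page. As a stand-in, here is a minimal Python sketch of a shortest-path search between two proteins using breadth-first search on an unweighted, directed network. The toy wiring below is a simplified illustration built around the RAS, RAF, MEK, ERK cascade and is an assumption made for the example, not curated pathway data or the cited article's algorithm.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Return one shortest path from source to target as a list of nodes, or None."""
    if source == target:
        return [source]
    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour in parent:
                continue
            parent[neighbour] = node
            if neighbour == target:
                # Reconstruct the path by following parent links back to source.
                path = [target]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical, heavily simplified signalling network (directed edges).
toy_network = {
    "EGFR": ["RAS", "PI3K"],
    "RAS": ["RAF", "PI3K"],
    "RAF": ["MEK"],
    "MEK": ["ERK"],
    "PI3K": ["AKT"],
    "AKT": [],
    "ERK": [],
}

print(shortest_path(toy_network, "EGFR", "ERK"))
```

Breadth-first search suffices here because every edge counts as one step. If edges carried confidence scores or other costs, a weighted method such as Dijkstra's algorithm would be the usual replacement.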
Article
Full-text available
Artificial intelligence is an advanced method to identify novel anticancer targets and discover novel drugs from biology networks because the networks can effectively preserve and quantify the interaction between components of cell systems underlying human diseases such as cancer. Here, we review and discuss how to employ artificial intelligence approaches to identify novel anticancer targets and discover drugs. First, we describe the scope of artificial intelligence biology analysis for novel anticancer target investigations. Second, we review and discuss the basic principles and theory of commonly used network-based and machine learning-based artificial intelligence algorithms. Finally, we showcase the applications of artificial intelligence approaches in cancer target identification and drug discovery. Taken together, the artificial intelligence models have provided us with a quantitative framework to study the relationship between network characteristics and cancer, thereby leading to the identification of potential anticancer targets and the discovery of novel drug candidates.
... Detailed descriptions of disease mechanisms on the level of molecular processes have recently become available [1,2], with many examples of practical applications in the field of cancer research [3][4][5][6][7]. These disease maps are needed for integrating scattered knowledge and for advanced data interpretation and hypothesis generation [1,2]. ...
Article
Full-text available
Detailed maps of the molecular basis of the disease are powerful tools for interpreting data and building predictive models. Modularity and composability are considered necessary network features for large-scale collaborative efforts to build comprehensive molecular descriptions of disease mechanisms. An effective way to create and manage large systems is to compose multiple subsystems. Composable network components could effectively harness the contributions of many individuals and enable teams to seamlessly assemble many individual components into comprehensive maps. We examine manually built versions of the RAS–RAF–MEK–ERK cascade from the Atlas of Cancer Signalling Network, PANTHER and Reactome databases and review them in terms of their reusability and composability for assembling new disease models. We identify design principles for managing complex systems that could make it easier for investigators to share and reuse network components. We demonstrate the main challenges including incompatible levels of detail and ambiguous representation of complexes and highlight the need to address these challenges.
... For example, some cancers induce higher mortality in men [19], whereas other tumors have shown significant differences in response to treatment in female patients [20]. The advent of high-throughput gene expression technologies has increased understanding of molecular correlates of malignancy, providing novel ways to stratify patients, determine prognosis, and predict sensitivity to therapeutic treatments (reviewed in [21]). Molecular signatures associated with cancer have demonstrated that some types of cancers have sex-biased gene expression [22]. ...
Article
Full-text available
Simple Summary: Sex differences in tumor incidence and mortality have been documented for many different cancer types. In malignant pleural mesothelioma, a deadly disease, many studies have shown that women not only develop this cancer less frequently than men, but those who do are likely to live longer after surgery. These differences have been postulated to reflect circulating estrogen levels and tumor expression of estrogen receptors that may influence tumor progression. We found that high expression of the RAS like estrogen regulated growth inhibitor gene (RERG) correlated with longer survival after surgery among women. Survival in men was not associated with RERG expression. Additionally, we found no association between survival and tumor expression of estrogen receptor genes. Additional studies are needed to elucidate any role RERG may play in mesothelioma, and whether estrogen may be involved. Abstract: Sex differences in incidence, prognosis, and treatment response have been described for many cancers. In malignant pleural mesothelioma (MPM), a lethal disease associated with asbestos exposure, men outnumber women 4 to 1, but women consistently live longer than men following surgery-based therapy. This study investigated whether tumor expression of genes associated with estrogen signaling could potentially explain observed survival differences. Two microarray datasets of MPM tumors were analyzed to discover estrogen-related genes associated with survival. A validation cohort of MPM tumors was selected to balance the numbers of men and women and control for competing prognostic influences. The RAS like estrogen regulated growth inhibitor (RERG) gene was identified as the most differentially-expressed estrogen-related gene in these tumors and predicted prognosis in discovery datasets. In the sex-matched validation cohort, low RERG expression was significantly associated with increased risk of death among women. 
No association between RERG expression and survival was found among men, and no relationship between estrogen receptor protein or gene expression and survival was found for either sex. Additional investigations are needed to elucidate the molecular mechanisms underlying this association and its sex specificity.
... Detailed descriptions of disease mechanisms on the level of molecular processes have recently become available [1,2], with many examples of practical applications in the field of cancer research [3][4][5][6][7]. These disease maps are needed for integrating scattered knowledge and for advanced data interpretation and hypothesis generation [1,2]. ...
Preprint
Full-text available
Detailed maps of the molecular basis of the disease are powerful tools for interpreting data and building predictive models. Modularity and composability are considered necessary network features for large-scale collaborative efforts to build comprehensive molecular descriptions of disease mechanisms. An effective way to create and manage large systems is to compose multiple subsystems. Composable network components could effectively harness the contributions of many individuals and enable teams to seamlessly assemble many individual components into comprehensive maps. We examine manually-built versions of the RAS-RAF-MEK-ERK cascade from the Atlas of Cancer Signalling Network, PANTHER and Reactome databases and review them in terms of their reusability and composability for assembling new disease models. We identify design principles for managing complex systems that could make it easier for investigators to share and reuse network components. We demonstrate the main challenges including incompatible levels of detail and ambiguous representation of complexes and highlight the need to address these challenges.
... Beyond identifying therapeutic targets, multi-omic data have enhanced the understanding of tumor biology, providing novel ways to stratify patients, determining prognosis and predicting sensitivity to existing treatments (reviewed in [80]). ...
Chapter
Malignant pleural mesothelioma (MPM) is a highly aggressive tumor that arises from the mesothelial cells lining the pleural cavity. Asbestos is considered the major factor in the pathogenesis of this malignancy, with more than 80% of patients having a history of asbestos exposure. MPM is characterized by a long latency period, typically 20–40 years from the time of asbestos exposure to diagnosis, suggesting that multiple somatic genetic alterations are required for the tumorigenic conversion of a mesothelial cell. In the last few years, advancements in next-generation sequencing and “–omics” technologies have revolutionized the field of genomics and medical diagnosis. The focus of this chapter is to summarize recent studies that explore the molecular mechanisms underlying this disease and identify potential therapeutic targets in MPM.
... Representation of biochemical reactions from the following statements. Numbers correspond to the reactions in the diagram: « BRCA1 transcription (1) and translation (2) is positively regulated by E2F1/BRIT1* complex (3) and inhibited by p53 (4). BRCA1 protein is transported into nucleus (5), where CHEK2 kinase activates it by specific phosphorylation (6) and (7). ...
Preprint
Generation and usage of high-quality molecular signalling network maps can be augmented by standardising notations, establishing curation workflows and application of computational biology methods to exploit the knowledge contained in the maps. In this manuscript, we summarize the major aims and challenges of assembling information in the form of comprehensive maps of molecular interactions. Mainly, we share our experience gained while creating the Atlas of Cancer Signalling Network. In the step-by-step procedure, we describe the map construction process and suggest solutions for map complexity management by introducing a hierarchical modular map structure. In addition, we describe the NaviCell platform, a computational technology using Google Maps API to explore comprehensive molecular maps similar to geographical maps, and explain the advantages of semantic zooming principles for map navigation. We also provide the outline to prepare signalling network maps for navigation using the NaviCell platform. Finally, several examples of cancer high-throughput data analysis and visualization in the context of comprehensive signalling maps are presented.
... To understand the functional basis of network merging, relationships between drug-target and host-pathogen components were investigated on the basis of the shortest path parameter (a proxy to a 'functional distance' between proteins (Kuperstein et al., 2015)) connecting them within DPI network. The shortest paths were calculated using Dijkstra algorithm (Dijkstra, 1959), and it was observed that these followed a normal distribution as indicated by the Shapiro-Wilk test (p-value = 0.01). ...
Article
Nipah Virus (NiV) is a newly emergent paramyxovirus that has caused various outbreaks in Asian countries. Given its acute pathogenicity and the lack of approved therapeutics for human use, there is an urgent need to identify inhibitors against NiV. Hence, this work prospects for potential entry inhibitors by implementing an integrative structure- and network-based drug discovery approach. FDA-approved drugs were screened against the attachment glycoprotein (NiV-G, PDB: 2VSM), one of the prime targets for inhibiting viral entry, using a molecular docking approach that was benchmarked both on CCDC/ASTEX and on a known NiV-G inhibitor set. The predicted small molecules were prioritized on the basis of topological analysis of the chemical-protein interaction network, which was inferred by integrating the drug-target network, NiV-human interaction network, and human protein-protein interaction network. A total of 17 drugs were predicted to be NiV-G inhibitors using molecular docking studies; these were further prioritized to 3 novel leads (Nilotinib, Deslanoside and Acetyldigitoxin) on the basis of topological analysis of the inferred chemical-protein interaction network. While Deslanoside and Acetyldigitoxin belong to an already known class of anti-NiV inhibitors, Nilotinib belongs to the Benzenoids chemical class, which has not been reported hitherto for developing anti-NiV inhibitors. These identified drugs are expected to be successful in further experimental evaluation and therefore could be used for anti-Nipah drug discovery. In addition, we obtained various insights into the underlying chemical-protein interaction network, based on which several important network nodes were predicted. The applicability of our proposed approach was also demonstrated by prospecting for anti-NiV phytochemicals on an independent dataset. Communicated by Ramaswamy H. Sarma
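The excerpt above uses the shortest path between two proteins as a proxy for their functional distance and computes it with Dijkstra's algorithm. A minimal sketch of that computation follows, assuming a network stored as an adjacency dictionary with non-negative edge weights. The node names and weights are hypothetical and are not taken from the cited study.

```python
import heapq


def dijkstra(graph, source):
    """Return shortest-path distances from source to every reachable node.

    graph maps each node to a list of (neighbor, non-negative weight) pairs.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical toy network: a drug target, two intermediates, a host factor.
toy_network = {
    "drug_target": [("protein_a", 1.0), ("protein_b", 4.0)],
    "protein_a": [("protein_b", 1.0), ("host_factor", 5.0)],
    "protein_b": [("host_factor", 1.0)],
    "host_factor": [],
}
distances = dijkstra(toy_network, "drug_target")
```

In this toy case the direct edge to `protein_b` (weight 4) is not the shortest route; going through `protein_a` costs 2. In an unweighted interaction network, every weight can be set to 1 so that the distance is the number of interaction steps.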
... The distribution of gene weights from s_k vectors can be projected on top of genome-wide biological network reconstructions where the network edges represent different types of interactions or regulations between genes and/or proteins. This can be further used for various types of network-based analyses, leading to the determination of biological network "hotspot" areas and eliminating the need of having a reference gene set collection [50]. The s_k vectors (resulting from the analysis of transcriptomic or methylome data) can be projected onto genome and be a subject of peak-calling analysis, which can sometimes lead to associating a component to genomic alterations [33]. ...
Article
Independent component analysis (ICA) is a matrix factorization approach where the signals captured by each individual matrix factors are optimized to become as mutually independent as possible. Initially suggested for solving source blind separation problems in various fields, ICA was shown to be successful in analyzing functional magnetic resonance imaging (fMRI) and other types of biomedical data. In the last twenty years, ICA became a part of the standard machine learning toolbox, together with other matrix factorization methods such as principal component analysis (PCA) and non-negative matrix factorization (NMF). Here, we review a number of recent works where ICA was shown to be a useful tool for unraveling the complexity of cancer biology from the analysis of different types of omics data, mainly collected for tumoral samples. Such works highlight the use of ICA in dimensionality reduction, deconvolution, data pre-processing, meta-analysis, and others applied to different data types (transcriptome, methylome, proteome, single-cell data). We particularly focus on the technical aspects of ICA application in omics studies such as using different protocols, determining the optimal number of components, assessing and improving reproducibility of the ICA results, and comparison with other popular matrix factorization techniques. We discuss the emerging ICA applications to the integrative analysis of multi-level omics datasets and introduce a conceptual view on ICA as a tool for defining functional subsystems of a complex biological system and their interactions under various conditions. Our review is accompanied by a Jupyter notebook which illustrates the discussed concepts and provides a practical tool for applying ICA to the analysis of cancer omics datasets.
... The MAPK signalling network is coordinated with various processes implicated in cell survival, and currently included into the Cell Survival map of ACSN. The strategy involves two steps: (i) identification of tumour stage-specific active functional modules, i.e. sets of MAPK signalling network components that are transcriptionally deregulated in bladder cancer [37] compared with normal samples, and (ii) computation of intervention sets of MAPK map components, whose disruption block all the proliferative paths fostered by the identified active functional modules in bladder cancer [38] ( Figure 8A). ...
Article
Cancer initiation and progression are associated with multiple molecular mechanisms. The knowledge of these mechanisms is expanding and should be converted into guidelines for tackling the disease. Here, we discuss the formalization of biological knowledge into a comprehensive resource: the Atlas of Cancer Signalling Network (ACSN) and the Google Maps-based tool NaviCell, which supports map navigation. The application of ACSN for omics data visualization, in the context of signalling maps, is possible via the NaviCell Web Service module and through the NaviCom tool. It allows generation of network-based molecular portraits of cancer using multilevel omics data. We review how these resources and tools are applied for cancer preclinical studies. Structural analysis of the maps together with omics data helps to rationalize the synergistic effects of drugs and allows design of complex disease stage-specific druggable interventions. The use of ACSN modules and maps as signatures of biological functions can help in cancer data analysis and interpretation. In addition, they have enabled the finding of associations between perturbations in particular molecular mechanisms and the risk of developing a specific type of cancer. These approaches are helpful, among others, to study the interplay between molecular mechanisms of cancer. This opens an opportunity to decipher how gene interactions govern the hallmarks of cancer in specific contexts. We discuss a perspective to develop a flexible methodology and a pipeline to enable systematic omics data analysis in the context of signalling network maps, for stratifying patients and suggesting intervention points and drug repositioning in cancer and other diseases.
... A popular method to leverage this prior knowledge consists in using a diffusion process on the gene network. This technique first appeared for the analysis of gene expression and GWAS data [Köhler et al., 2008;Kuperstein et al., 2015;Qian et al., 2014;Rapaport et al., 2007;Vanunu et al., 2010], and has more recently been used for mutation profiles [Babaei et al., 2013;Hofree et al., 2013;Hou and Ma, 2014;Jia and Zhao, 2014;Vandin et al., 2011]. Network diffusion processes allow smoothing binary vectors of somatic gene mutations into non-negative real-valued vectors of mutational statuses, where the mutational status of a gene increases when it is close to mutated genes in the network. ...
Thesis
Since the first sequencing of the human genome in the early 2000s, large endeavours have set out to map the genetic variability among individuals, or DNA alterations in cancer cells. They have laid foundations for the emergence of precision medicine, which aims at integrating the genetic specificities of an individual with their conventional medical record to adapt treatment, or prevention strategies. Translating DNA variations and alterations into phenotypic predictions is however a difficult problem. DNA sequencers and microarrays measure more variables than there are samples, which poses statistical issues. The data is also subject to technical biases and noise inherent in these technologies. Finally, the vast and intricate networks of interactions among proteins obscure the impact of DNA variations on the cell behaviour, prompting the need for predictive models that are able to capture a certain degree of complexity. This thesis presents novel methodological contributions to address these challenges. First, we define a novel representation for tumour mutation profiles that exploits prior knowledge on protein-protein interaction networks. For certain cancers, this representation allows improving survival predictions from mutation data as well as stratifying patients into meaningful subgroups. Second, we present a new learning framework to jointly handle data normalisation with the estimation of a linear model. Our experiments show that it improves prediction performances compared to handling these tasks sequentially. Finally, we propose a new algorithm to scale up sparse linear models estimation with two-way interactions. The obtained speed-up makes this estimation possible and efficient for datasets with hundreds of thousands of main effects, thereby extending the scope of such models to the data from genome-wide association studies.
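The network diffusion idea described in the excerpt above (smoothing a binary mutation vector so that genes close to mutated genes receive a non-zero score) can be illustrated with a random walk with restart. This is a generic sketch of one common diffusion scheme and is not the method proposed in the thesis. The gene names, the `restart` value and the function name `diffuse` are assumptions made for illustration.

```python
def diffuse(adjacency, mutated, restart=0.5, iterations=100):
    """Smooth a binary mutation profile over an undirected gene network.

    adjacency maps each gene to the list of its neighbours (symmetric).
    mutated is the set of genes carrying a mutation.
    Returns a dict of non-negative real-valued scores, one per gene.
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    # Initial binary profile, normalised so the scores sum to 1.
    p0 = {n: (1.0 / len(mutated) if n in mutated else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(iterations):
        nxt = {}
        for n in nodes:
            # Each neighbour passes on an equal share of its current score.
            incoming = sum(p[m] / degree[m] for m in adjacency[n])
            nxt[n] = (1.0 - restart) * incoming + restart * p0[n]
        p = nxt
    return p


# Hypothetical chain of four genes; only GENE_A is mutated.
chain = {
    "GENE_A": ["GENE_B"],
    "GENE_B": ["GENE_A", "GENE_C"],
    "GENE_C": ["GENE_B", "GENE_D"],
    "GENE_D": ["GENE_C"],
}
scores = diffuse(chain, {"GENE_A"})
```

After diffusion every gene in the chain has a positive score, and the score decreases with distance from the mutated gene, which is the behaviour the excerpt describes.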
... Cluster of Cluster Assignments (COCA) [137] in breast cancer [138] and iCluster [139] in application to prostate cancer [42] and hepatocellular carcinoma [140]. Alternatively, network-based approaches [141][142][143] to data analysis have the potential to integrate data from disparate sources, while providing clinically relevant results. Multidisciplinary initiatives such as molecular tumour boards [144,145], which bring together bioinformaticians, biologists and clinicians, can also help address the issue of translating complex data to be relevant to clinical care providers and patients. ...
Article
There has been an exponential growth in the performance and output of sequencing technologies (omics data) with full genome sequencing now producing gigabases of reads on a daily basis. These data may hold the promise of personalized medicine, leading to routinely available sequencing tests that can guide patient treatment decisions. In the era of high-throughput sequencing (HTS), computational considerations, data governance and clinical translation are the greatest rate-limiting steps. To ensure that the analysis, management and interpretation of such extensive omics data is exploited to its full potential, key factors, including sample sourcing, technology selection and computational expertise and resources, need to be considered, leading to an integrated set of high-performance tools and systems. This article provides an up-to-date overview of the evolution of HTS and the accompanying tools, infrastructure and data management approaches that are emerging in this space, which, if used within a multidisciplinary context, may ultimately facilitate the development of personalized medicine.
... The shortest path problem is a classical problem [1], [2], but is also a tough issue especially in large-scale network [3], [4]. It appears in many practical applications, such as transportation networks [5], [6], [7], isometric feature mapping [8], biological networks analysis [9], subgraph similarity matching [10], pattern mining [11], [12], [13], RDF clustering [14], and social networks [15]. Motivated by these applications, a variety of shortest path problems have been investigated. ...
Article
The A-star algorithm is an efficient classical algorithm for solving the shortest path problem. The efficiency of the algorithm depends on the evaluation function, which is used to estimate the heuristic value of the shortest path from the current vertex to the target. When the vertex coordinates are known, the heuristic value of the shortest path is usually generated by the distance. In this paper, we present an Index-Based A-Star algorithm, IBAS, which aims to solve the shortest path problem in a weighted directed acyclic graph with unknown vertex coordinates. This paper constructs three indexes for each vertex, i.e., the earliest arrival index, reverse earliest arrival index, and latest arrival index. We can compute the lower bound and the upper bound of the shortest distance from the source vertex to the target based on the three indexes and prune the intermediate vertices that are not on the shortest path according to the lower and upper bounds. The IBAS algorithm not only makes use of the earliest arrival index to construct the evaluation function of the A-star algorithm, but also utilizes the three indexes to prune useless vertices, so as to improve the performance of the algorithm. Compared with the A-star algorithm, the additional time complexity and space complexity of the IBAS algorithm are O(|V| + |E|) and O(|V|), respectively. A real road network and benchmark datasets with large-scale networks are selected to verify the performance of IBAS. Experimental results verify the effectiveness of the proposed algorithm.
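To make the role of the evaluation function concrete, here is a minimal sketch of plain A-star on a small weighted directed acyclic graph. It is not the IBAS algorithm: the three indexes and the pruning step are not reproduced, and the heuristic is simply a hypothetical table of lower bounds on the remaining distance, assumed to be admissible. The graph, the table `lower_bound` and the function name `a_star` are illustrative assumptions.

```python
import heapq


def a_star(graph, source, target, heuristic):
    """Return (cost, path) of a shortest path, or (inf, []) if none exists.

    graph maps each vertex to a list of (successor, non-negative weight).
    heuristic(v) must never overestimate the true distance from v to target.
    """
    best_g = {source: 0.0}
    parent = {source: None}
    heap = [(heuristic(source), source)]
    while heap:
        f, node = heapq.heappop(heap)
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return best_g[target], path[::-1]
        g = best_g[node]
        if f > g + heuristic(node):
            continue  # stale entry: a cheaper route to node was found later
        for successor, weight in graph.get(node, []):
            candidate = g + weight
            if candidate < best_g.get(successor, float("inf")):
                best_g[successor] = candidate
                parent[successor] = node
                # Evaluation function: cost so far plus estimated remainder.
                heapq.heappush(heap, (candidate + heuristic(successor), successor))
    return float("inf"), []


# Hypothetical DAG and hypothetical admissible lower bounds to the target "t".
dag = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 1.0), ("t", 6.0)],
    "b": [("t", 1.0)],
    "t": [],
}
lower_bound = {"s": 2.0, "a": 2.0, "b": 1.0, "t": 0.0}
cost, path = a_star(dag, "s", "t", lower_bound.__getitem__)
```

The tighter the lower bound, the fewer vertices are expanded before the target is reached, which is why the abstract ties the efficiency of A-star to the quality of the evaluation function.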
... To our knowledge, the Google matrix approach or related ideas have been applied before in systems biology mostly to undirected networks, with few exceptions (e.g., [20]), in order to find activated network modules or to "smooth" high-throughput data [21][22][23], to establish connection of genes to diseases [24,25], to improve interpretability of genome-wide analyses [26][27][28] and to compute network-based cancer biomarkers [29,30]. PageRank approach has been used to quantify the functional proximity in undirected protein-protein interactions networks [31]. ...
Article
Signaling pathways represent parts of the global biological molecular network which connects them into a seamless whole through complex direct and indirect (hidden) crosstalk whose structure can change during development or in pathological conditions. We suggest a novel methodology, called Googlomics, for the structural analysis of directed biological networks using spectral analysis of their Google matrices, using parallels with quantum scattering theory, developed for nuclear and mesoscopic physics and quantum chaos. We introduce analytical “reduced Google matrix” method for the analysis of biological network structure. The method allows inferring hidden causal relations between the members of a signaling pathway or a functionally related group of genes. We investigate how the structure of hidden causal relations can be reprogrammed as a result of changes in the transcriptional network layer during cancerogenesis. The suggested Googlomics approach rigorously characterizes complex systemic changes in the wiring of large causal biological networks in a computationally efficient way.
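As background for the Google matrix terminology above, the following sketch computes PageRank on a small directed network by power iteration. It covers only the standard Google matrix construction (damped random walk with uniform teleportation) and does not implement the "reduced Google matrix" method of the paper. The damping value 0.85, the function name `pagerank` and the edge list are assumptions chosen for illustration.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Return the PageRank vector of a directed network as a dict.

    nodes is a list of node names; edges is a list of (source, target) pairs.
    Nodes without outgoing edges spread their score uniformly over all nodes.
    """
    n = len(nodes)
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out[v])
        # Teleportation term plus the redistributed dangling score.
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for dst in out[v]:
                nxt[dst] += damping * rank[v] / len(out[v])
        rank = nxt
    return rank


# Hypothetical directed cascade used only as a toy example.
cascade_nodes = ["EGFR", "RAS", "RAF", "MEK", "ERK"]
cascade_edges = [("EGFR", "RAS"), ("RAS", "RAF"), ("RAF", "MEK"), ("MEK", "ERK")]
ranks = pagerank(cascade_nodes, cascade_edges)
```

Because edge direction matters, the most downstream node of the toy cascade accumulates the highest score, whereas in an undirected version of the same chain the ranking would depend only on node degree.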
... A popular method to leverage this prior knowledge consists in using a diffusion process on the gene network. This technique first appeared for the analysis of gene expression and GWAS data [21][22][23][24][25], and has more recently been used for mutation profiles [26][27][28][29][30][31]. Network diffusion processes allow smoothing binary vectors of somatic gene mutations into non-negative real-valued vectors of mutational statuses, where the mutational status of a gene increases when it is close to mutated genes in the network. ...
Article
Author summary The transition from a normal cell to a cancer cell is driven by genetic alterations, such as mutations, that induce uncontrolled cell proliferation. With the advent of next-generation sequencing technologies (NGS) in the last decade, thousands of tumours have been sequenced and their mutation profiles determined. However, the statistical analysis of these mutation profiles remains challenging. Indeed, two patients usually do not share the same set of mutations and can even have none in common. Moreover, it is difficult to distinguish the few disease-causing mutations from the dozens, often hundreds of mutations observed in a tumour. To alleviate these challenges, it has been proposed to use gene-gene interaction networks as prior knowledge, with the idea that if a gene is mutated and non-functional, then its interacting neighbours might not be able to fulfil their function as well. Here we propose NetNorM, a method that transforms mutation data using gene networks so as to make mutation profiles more amenable to statistical learning. We show that NetNorM significantly improves the prognostic power of mutation data compared to previous approaches, and allows defining meaningful groups of patients based on their mutation profiles.
... This problem can be addressed with a systems biology approach -how to interpret expression and mutation data and take them to a higher level of understanding. Here we will briefly describe basic applications of systems biology and data integration, while more details on this topic could be found in other publications [146][147][148]. ...
Article
Nowadays, the personalized approach to health care and cancer care in particular is becoming more and more popular and is taking an important place in the translational medicine paradigm. In some cases, detection of the patient-specific individual mutations that point to a targeted therapy has already become a routine practice for clinical oncologists. Wider panels of genetic markers are also on the market which cover a greater number of possible oncogenes including those with lower reliability of resulting medical conclusions. In light of the large availability of high-throughput technologies, it is very tempting to use complete patient-specific New Generation Sequencing (NGS) or other "omics" data for cancer treatment guidance. However, there are still no gold standard methods and protocols to evaluate them. Here we will discuss the clinical utility of each of the data types and describe a systems biology approach adapted for single patient measurements. We will try to summarize the current state of the field focusing on the clinically relevant case-studies and practical aspects of data processing.
... In order to explain the 17 LST hi cases with neither BRCA1/2 nor RAD51C inactivation, 317 genes involved in DNA damage signaling and repair were explored [44,45]. No deleterious mutation associated with LOH was found in the 17 LST hi unexplained cases. Strikingly, cases with bi-allelic inactivation of WRN (1 case) and ATM (six cases) belonged to the LST lo subgroup. ...
Article
Therapeutic strategies targeting Homologous Recombination Deficiency (HRD) in breast cancer require patient stratification. The LST (Large-scale State Transitions) genomic signature previously validated for triple-negative breast carcinomas (TNBC) was evaluated as a biomarker of HRD in luminal (hormone receptor positive) and HER2-overexpressing (HER2+) tumors. The LST genomic signature, related to the number of large-scale chromosomal breakpoints in the SNP-array tumor profile, was applied to identify HRD in in-house and TCGA sets of breast tumors, in which the status of BRCA1/2 and other genes was also investigated. In the in-house dataset, HRD was predicted in 5% (20/385) of sporadic luminal or HER2+ tumors by the LST genomic signature, and the inactivation of BRCA1, BRCA2 or RAD51C confirmed this prediction in 75% (12/16) of the tested cases. In 14% (6/43) of tumors occurring in BRCA1/2 mutant carriers, the corresponding wild-type allele was retained, emphasizing the importance of determining the tumor status. In the TCGA luminal and HER2+ subtypes, HRD incidence was estimated at 5% (18/329, 95%CI:5-8%) and 2% (1/59, 95%CI:2-9%), respectively. In TNBC cisplatin-based neo-adjuvant clinical trials, HRD is shown to be a necessary condition for cisplatin sensitivity. This analysis demonstrates the high performance of the LST genomic signature for HRD detection in breast cancers, which suggests its potential as a biomarker for genetic testing and patient stratification for clinical trials evaluating platinum salts and PARP inhibitors.
... web service.html. In Supplementary Materials, we provide two case studies demonstrating visualization of ovary cancer data obtained from The Cancer Genome Atlas (21) on the large map of Atlas of Cancer Signalling Network (22) and an example of using the non-CellDesigner network map of the Ewing's sarcoma signalling network (23) for visualizing transcriptomic time series data. ...
Article
Data visualization is an essential element of biological research, required for obtaining insights and formulating new hypotheses on mechanisms of health and disease. NaviCell Web Service is a tool for network-based visualization of 'omics' data which implements several data visual representation methods and utilities for combining them together. NaviCell Web Service uses Google Maps and semantic zooming to browse large biological network maps, represented in various formats, together with different types of the molecular data mapped on top of them. For achieving this, the tool provides standard heatmaps, barplots and glyphs as well as the novel map staining technique for grasping large-scale trends in numerical values (such as whole transcriptome) projected onto a pathway map. The web service provides a server mode, which allows automating visualization tasks and retrieving data from maps via RESTful (standard HTTP) calls. Bindings to different programming languages are provided (Python and R). We illustrate the purpose of the tool with several case studies using pathway maps created by different research groups, in which data visualization provides new insights into molecular mechanisms involved in systemic diseases such as cancer and neurodegenerative diseases. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
... For example, VANTED tool [5] creates a classification tree according to the KEGG pathway hierarchy and shows a biological network with omics data as barplots or pie-charts attached to the nodes which allows to visualize more complex data than by simple node coloring. NaviCell [6] and related pathway database Atlas of Cancer Signalling Network (ACSN) together with standard heat maps and barplots provide more flexible data visualization tools such as glyphs (symbols with configurable shape, size and color) and map staining (using the network background for visualization) [7]. An interesting approach for data visualization using biological networks was developed in NetGestalt online tool [8] which uses a NetSAM R package to create modules by hierarchical ordering of the network in one dimension and visualizes high-throughput data accordingly to a chosen track as a combination of barplots and heat maps. ...
Article
Visualization and analysis of molecular profiling data together with biological networks are able to provide new mechanistic insights into biological functions. Currently, it is possible to visualize high-throughput data on top of pre-defined network layouts, but they are not always adapted to a given data analysis task. A network layout based simultaneously on the network structure and the associated multidimensional data might be advantageous for data visualization and analysis in some cases. We developed a Cytoscape app, which allows constructing biological network layouts based on the data from molecular profiles imported as values of node attributes. DeDaL is a Cytoscape 3 app, which uses linear and non-linear algorithms of dimension reduction to produce data-driven network layouts based on multidimensional data (typically gene expression). DeDaL implements several data pre-processing and layout post-processing steps such as continuous morphing between two arbitrary network layouts and aligning one network layout with respect to another one by rotating and mirroring. The combination of all these functionalities facilitates the creation of insightful network layouts representing both structural network features and correlation patterns in multivariate data. We demonstrate the added value of applying DeDaL in several practical applications, including an example of a large protein-protein interaction network. DeDaL is a convenient tool for applying data dimensionality reduction methods and for designing insightful data displays based on data-driven layouts of biological networks, built within Cytoscape environment. DeDaL is freely available for downloading at http://bioinfo-out.curie.fr/projects/dedal/ .
Article
Plants are under the influence of various stresses that negatively impact their growth and development. Despite vast understanding of stress-related pathways and genes, significant success in developing stress-resistant crops has not yet been achieved. At the molecular level, this is attributed to an incomplete or partial understanding of regulatory interactions among key pathway genes; a better understanding might provide new insights into the molecular pathways and mechanisms and thus could help in developing new stress-resistant plant varieties. Therefore, in this work, by taking into account the interactions among stress-related genes, an integrated computational pipeline was implemented to predict the most promising ‘key genes’ in the model plant Arabidopsis thaliana during the three most common abiotic stress conditions: salt, heat and cold. Overall, the sequential gene selection approach comprises (i) differential expression studies among stress and control samples, (ii) inferring stress-related gene co-expression networks, and (iii) logistic regression-based gene prioritization. During the analyses, various key candidate genes were predicted in salt, heat and cold stress, including a significant number of cross-talking genes among stresses. Comprehensive analyses also provided various systems-level insights into the topological characteristics of stress-related networks. Gene ontology enrichment analyses also indicated potential roles of identified candidate genes in stress. Applicability of this approach in selecting the key stress-related genes was also demonstrated in tomato (Solanum lycopersicum) that identified gene Solyc01g097190 to be coordinating among salt, heat and cold stress; this yet uncharacterised Arabidopsis homolog gene might provide ample scope for exploring its roles in multiple stresses. Overall, this computational approach could be advanced (by including other omics data) and implemented to predict candidate stress-related genes in other crops or economically important plants as well.
Article
Hematopoietic stem cells (HSCs) undergo functional deterioration with increasing age that causes loss of their self-renewal and regenerative potential. Despite various efforts, significant success in identifying molecular regulators of HSC aging has not been achieved, one prime reason being the non-availability of appropriate human HSC samples. To demonstrate the scope of integrating and re-analyzing the HSC transcriptomics data available, we used existing tools and databases to structure a sequential data analysis pipeline to predict potential candidate genes, transcription factors, and microRNAs simultaneously. This sequential approach comprises (i) collecting matched young and aged mice HSC sample datasets, (ii) identifying differentially expressed genes, (iii) identifying human homologs of differentially expressed genes, (iv) inferring gene co-expression network modules, and (v) inferring the microRNA-transcription factor-gene regulatory network. Systems-level analyses of HSC interaction networks provided various insights based on which several candidates were predicted. For example, 16 HSC aging-related candidate genes were predicted (e.g., CD38, BRCA1, AGTR1, GSTM1, etc.) from GCN analysis. Following this, the shortest path distance-based analyses of the regulatory network predicted several novel candidate miRNAs and TFs. Among these, miR-124-3p was a common regulator in candidate gene modules, while TFs MYC and SP1 were identified to regulate various candidate genes. Based on the regulatory interactions among candidate genes, TFs, and miRNAs, a potential regulation model of biological processes in each of the candidate modules was predicted, which provided systems-level insights into the molecular complexity of each module to regulate HSC aging.
Article
Delaying the human aging process and thus eliminating the risk factors for age-related diseases is one of the prime objectives. While various aging-associated genes and proteins have been characterized, which provide a significant understanding of the human aging process, significant success in regulating aging has not yet been achieved. Understanding how aging proteins interact with each other and also with other proteins could provide important insights into the underlying mechanisms governing aging. Therefore, in this work, gene expression information was integrated into the static aging-related protein interactome to understand the network-based relationships among aging-related essential (AE) proteins, aging-related non-essential (ANE) proteins, and housekeeping proteins that could regulate or influence aging. Comprehensive analyses provided various systems-level insights into the regulatory characteristics of aging; for example, (i) network-based correlation analysis predicted functional relationships among AE proteins and ANE proteins; (ii) network variability analysis predicted aging to affect different tissues in strikingly different ways by differentially regulating various regulatory interactions; (iii) cross-network comparisons identified two aging-related modules to be significantly conserved across most of the tissues. The findings obtained during this study could be helpful for researchers to delay, prevent, or even reverse various aspects of aging.
Article
The development of computational approaches in systems biology has reached a state of maturity that allows their transition to systems medicine. Despite this progress, intuitive visualisation and context-dependent knowledge representation still present a major bottleneck. In this paper, we describe the Disease Maps Project, an effort towards a community-driven computationally readable comprehensive representation of disease mechanisms. We outline the key principles and the framework required for the success of this initiative, including use of best practices, standards and protocols. We apply a modular approach to ensure efficient sharing and reuse of resources for projects dedicated to specific diseases. Community-wide use of disease maps will accelerate the conduct of biomedical research and lead to new disease ontologies defined from mechanism-based disease endotypes rather than phenotypes.
Article
The fuzzy optimal path problem under uncertainty is one of the basic network optimization problems. In an uncertain environment, edge weights are often represented by fuzzy numbers, such as interval numbers and triangular fuzzy numbers. These fuzzy numbers are then usually converted directly to real numbers, which reduces the optimal path problem to a shortest path selection problem. However, much of the uncertainty information is lost in this conversion. To preserve all of the original data, this paper proposes a fuzzy optimal path model based on the Monte Carlo method and an adaptive amoeba algorithm. In the Monte Carlo process, a random number that belongs to the fuzzy number is generated. The Physarum polycephalum algorithm is then used to solve the shortest path problem for each draw, and the result is recorded. After many repetitions, many shortest paths have been found and recorded. Finally, the optimal path is selected by analysing the characteristics of all the results. Several numerical examples illustrate the effectiveness of the proposed method; the results show that it can deal with fuzzy optimal path problems effectively.
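The Monte Carlo loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it swaps the Physarum polycephalum solver for a plain Dijkstra search, assumes triangular fuzzy weights given as `(low, mode, high)` tuples, and picks the most frequently recorded path as the "optimal" one. The graph, the function names and the selection rule are all assumptions made for the example.

```python
import heapq
import random
from collections import Counter


def dijkstra(graph, source, target):
    """Return the shortest path from source to target as a tuple of nodes."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return tuple(reversed(path))


def sample_graph(fuzzy_graph, rng):
    """Draw one crisp weight per edge from its triangular fuzzy number."""
    return {
        u: {v: rng.triangular(low, high, mode) for v, (low, mode, high) in nbrs.items()}
        for u, nbrs in fuzzy_graph.items()
    }


def monte_carlo_paths(fuzzy_graph, source, target, n_samples=500, seed=0):
    """Count how often each path is the shortest one across random draws."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        crisp = sample_graph(fuzzy_graph, rng)
        counts[dijkstra(crisp, source, target)] += 1
    return counts


# Hypothetical directed graph; each edge weight is (low, mode, high).
fuzzy_graph = {
    "A": {"B": (1, 2, 3), "C": (1, 3, 6)},
    "B": {"D": (1, 2, 3)},
    "C": {"D": (1, 3, 6)},
}
counts = monte_carlo_paths(fuzzy_graph, "A", "D")
best_path, best_count = counts.most_common(1)[0]
```

Here `counts` holds the frequency of every path that was shortest in at least one draw, so the spread of the results stays available for inspection instead of being collapsed into a single defuzzified weight.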
Article
Background: Internal medicine is in flux because of the 'omics revolution', with cancer medicine being a good example. Molecular technologies that detect alterations in gene-based structure or function are having an impact on diagnosis, prognosis and treatment of cancer. Objective: In this article, recent advances in gene-based characterisation of cancer are presented, and illustrated where possible by clinical applications. Discussion: The research-based vision of precision medicine is now on its way to becoming a clinical reality. A key limiting factor is the small number of therapeutic options available for customisation, which contrasts with the rising abundance of omics-derived data. However, further translational progress is anticipated over the next decade.
Chapter
In this study, we will explore the protein interaction between the Nucleotide Binding Domain (NBD) of human heat shock 70 kDa protein (Hsp70) and the E1A 32 kDa motif (PNLVP) of human adenovirus serotype 5 (Ad5) in the induction of viral replication. This protein interaction may enhance the tumor cell death rate in cancer treatment. Unfortunately, the specific protein interaction between the NBD and the PNLVP motif is still unknown. To investigate this interaction, we will construct three-dimensional structures of NBD mutants (K71L and T204V) and characterize their physicochemical properties using the ESBRI, Cys_Rec and SOPMA (Self-Optimized Prediction Method from Alignment) servers. We will then determine their stabilities by potential energy analysis after running a 50 ns Molecular Dynamics (MD) simulation. Next, the stable NBD structure will be docked with the PNLVP motif using Autodock version 4.2 and subjected to a 50 ns MD simulation. Finally, hydrogen bond, secondary structure and Solvent Accessible Surface Area (SASA) analyses will be carried out to determine which of the three protein-ligand complexes is the most stable and has the best binding affinity for the PNLVP motif. Thus, Hsp70 structure-based drug discovery may have potential for cancer treatment.
Article
The knowledge of cell molecular mechanisms implicated in human diseases is expanding and should be converted into guidelines for deciphering pathological cell signaling and suggesting appropriate treatment. The basic assumption is that during a pathological transformation, the cell does not create new signaling mechanisms, but rather it hijacks the existing molecular programs. This affects not only intracellular functions, but also a crosstalk between different cell types resulting in a new, yet pathological status of the system. There is a certain combination of molecular characteristics dictating specific cell signaling states that sustains the pathological disease status. Identifying and manipulating the key molecular players controlling these cell signaling states, and shifting the pathological status toward the desired healthy phenotype, are the major challenge for molecular biology of human diseases.
Thesis
The knowledge of cell molecular mechanisms implicated in human diseases is expanding and should be converted into guidelines for deciphering pathological cell signaling and suggesting appropriate treatment. The basic assumption is that during a pathological transformation, the cell does not create new signaling mechanisms, but rather it hijacks the existing molecular programs. This affects not only intracellular functions, but also a crosstalk between different cell types resulting in a new, yet pathological status of the system. There is a certain combination of molecular characteristics dictating specific cell signaling states that sustains the pathological disease status. Identifying and manipulating the key molecular players controlling these cell signaling states, and shifting the pathological status toward the desired healthy phenotype, are the major challenge for molecular biology of human diseases. http://arxiv.org/abs/1512.05234
Article
We analysed primary breast cancers by genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, microRNA sequencing and reverse-phase protein arrays. Our ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at >10% incidence across all breast cancers; however, there were numerous subtype-associated and novel gene mutations including the enrichment of specific mutations in GATA3, PIK3CA and MAP3K1 with the luminal A subtype. We identified two novel protein-expression-defined subgroups, possibly produced by stromal/microenvironmental elements, and integrated analyses identified specific signalling pathways dominant in each molecular subtype including a HER2/phosphorylated HER2/EGFR/phosphorylated EGFR signature within the HER2-enriched expression subtype. Comparison of basal-like breast tumours with high-grade serous ovarian tumours showed many molecular commonalities, indicating a related aetiology and similar therapeutic opportunities. The biological finding of the four main breast cancer subtypes caused by different subsets of genetic and epigenetic abnormalities raises the hypothesis that much of the clinically observable plasticity and heterogeneity occurs within, and not across, these major biological subtypes of breast cancer.
Article
Treatment of BRAF(V600E) mutant melanoma by small molecule drugs that target the BRAF or MEK kinases can be effective, but resistance develops invariably [1, 2]. In contrast, colon cancers that harbour the same BRAF(V600E) mutation are intrinsically resistant to BRAF inhibitors, due to feedback activation of the epidermal growth factor receptor (EGFR) [3, 4]. Here we show that 6 out of 16 melanoma tumours analysed acquired EGFR expression after the development of resistance to BRAF or MEK inhibitors. Using a chromatin-regulator-focused short hairpin RNA (shRNA) library, we find that suppression of sex determining region Y-box 10 (SOX10) in melanoma causes activation of TGF-β signalling, thus leading to upregulation of EGFR and platelet-derived growth factor receptor-β (PDGFRB), which confer resistance to BRAF and MEK inhibitors. Expression of EGFR in melanoma or treatment with TGF-β results in a slow-growth phenotype with cells displaying hallmarks of oncogene-induced senescence. However, EGFR expression or exposure to TGF-β becomes beneficial for proliferation in the presence of BRAF or MEK inhibitors. In a heterogeneous population of melanoma cells having varying levels of SOX10 suppression, cells with low SOX10 and consequently high EGFR expression are rapidly enriched in the presence of drug, but this is reversed when the drug treatment is discontinued. We find evidence for SOX10 loss and/or activation of TGF-β signalling in 4 of the 6 EGFR-positive drug-resistant melanoma patient samples. Our findings provide a rationale for why some BRAF or MEK inhibitor-resistant melanoma patients may regain sensitivity to these drugs after a ‘drug holiday’ and identify patients with EGFR-positive melanoma as a group that may benefit from re-treatment after a drug holiday.
Article
High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer.
Article
Prior biological knowledge greatly facilitates the meaningful interpretation of gene expression data. Causal networks constructed from individual relationships curated from the literature are particularly suited for this task, since they create mechanistic hypotheses that explain the expression changes observed in datasets. We present and discuss a suite of algorithms and tools for inferring and scoring regulator networks upstream of gene expression data based on a large-scale causal network derived from the Ingenuity Knowledge Base. We extend the method to predict downstream effects on biological functions and diseases and demonstrate the validity of our approach by applying it to example data sets. The causal analytics tools "Upstream Regulator Analysis", "Mechanistic Networks", "Causal Network Analysis", and "Downstream Effects Analysis" are implemented and available within Ingenuity Pathway Analysis (IPA) (http://www.ingenuity.com). Contact: akramer@ingenuity.com. Supplementary information: Supplementary material is available at Bioinformatics online.
Article
The Mitogen-Activated Protein Kinase (MAPK) network consists of tightly interconnected signalling pathways involved in diverse cellular processes, such as cell cycle, survival, apoptosis and differentiation. Although several studies reported the involvement of these signalling cascades in cancer deregulations, the precise mechanisms underlying their influence on the balance between cell proliferation and cell death (cell fate decision) in pathological circumstances remain elusive. Based on an extensive analysis of published data, we have built a comprehensive and generic reaction map for the MAPK signalling network, using CellDesigner software. In order to explore the MAPK responses to different stimuli and better understand their contributions to cell fate decision, we have considered the most crucial components and interactions and encoded them into a logical model, using the software GINsim. Our logical model analysis particularly focuses on urinary bladder cancer, where MAPK network deregulations have often been associated with specific phenotypes. To cope with the combinatorial explosion of the number of states, we have applied novel algorithms for model reduction and for the compression of state transition graphs, both implemented into the software GINsim. The results of systematic simulations for different signal combinations and network perturbations were found globally coherent with published data. In silico experiments further enabled us to delineate the roles of specific components, cross-talks and regulatory feedbacks in cell fate decision. Finally, tentative proliferative or anti-proliferative mechanisms can be connected with established bladder cancer deregulations, namely Epidermal Growth Factor Receptor (EGFR) over-expression and Fibroblast Growth Factor Receptor 3 (FGFR3) activating mutations.
Article
Reactome (http://www.reactome.org) is a manually curated open-source open-data resource of human pathways and reactions. The current version 46 describes 7088 human proteins (34% of the predicted human proteome), participating in 6744 reactions based on data extracted from 15 107 research publications with PubMed links. The Reactome Web site and analysis tool set have been completely redesigned to increase speed, flexibility and user friendliness. The data model has been extended to support annotation of disease processes due to infectious agents and to mutation.
Article
Global 'multi-omics' profiling of cancer cells harbours the potential for characterizing the signalling networks associated with specific oncogenes. Here we profile the transcriptome, proteome and phosphoproteome in a panel of non-small cell lung cancer (NSCLC) cell lines in order to reconstruct targetable networks associated with KRAS dependency. We develop a two-step bioinformatics strategy addressing the challenge of integrating these disparate data sets. We first define an 'abundance-score' combining transcript, protein and phospho-protein abundances to nominate differentially abundant proteins and then use the Prize Collecting Steiner Tree algorithm to identify functional sub-networks. We identify three modules centred on KRAS and MET, LCK and PAK1 and β-Catenin. We validate activation of these proteins in KRAS-dependent (KRAS-Dep) cells and perform functional studies defining LCK as a critical gene for cell proliferation in KRAS-Dep but not KRAS-independent NSCLCs. These results suggest that LCK is a potential druggable target protein in KRAS-Dep lung cancers.
Article
The goal of pathway analysis is to identify the pathways significantly impacted in a given phenotype. Many current methods are based on algorithms that consider pathways as simple gene lists, dramatically under-utilizing the knowledge that such pathways are meant to capture. During the past few years, a plethora of methods claiming to incorporate various aspects of the pathway topology have been proposed. These topology-based methods, sometimes referred to as "third generation," have the potential to better model the phenomena described by pathways. Although there is now a large variety of approaches used for this purpose, no review is currently available to offer guidance for potential users and developers. This review covers 22 such topology-based pathway analysis methods published in the last decade. We compare these methods based on: type of pathways analyzed (e.g., signaling or metabolic), input (subset of genes, all genes, fold changes, gene p-values, etc.), mathematical models, pathway scoring approaches, output (one or more pathway scores, p-values, etc.) and implementation (web-based, standalone, etc.). We identify and discuss challenges, arising both in methodology and in pathway representation, including inconsistent terminology, different data formats, lack of meaningful benchmarks, and the lack of tissue and condition specificity.
Article
We examined whether a combination of proliferation markers and estrogen receptor (ER) activity could predict early versus late relapses in ER-positive breast cancer and inform the choice and length of adjuvant endocrine therapy. Baseline Affymetrix gene-expression profiles from ER-positive patients who received no systemic therapy (n = 559) or adjuvant tamoxifen for 5 years (cohort-1: n = 683, cohort-2: n = 282) and from 58 patients treated with neoadjuvant letrozole for 3 months (gene expression available at baseline, 14 and 90 days) were analyzed. A proliferation score based on the expression of mitotic kinases (MKS) and an ER-related score (ERS) adopted from Oncotype DX® were calculated. The same analysis was performed using the Genomic Grade Index as the proliferation marker and the luminal gene score from the PAM50 classifier as the measure of estrogen-related genes. Median values were used to define low and high marker groups, and four combinations were created. Relapses were grouped into time cohorts of 0–2.5, 0–5 and >5–10 years. Over the full 10-year period, the proportional hazards assumption was violated for several biomarker groups, indicating time-dependent effects. In tamoxifen-treated patients, Low-MKS/Low-ERS cancers had a continuously increasing risk of relapse that was higher after 5 years than in Low-MKS/High-ERS cancers [0 to 10 years, HR 3.36; p = 0.013]. High-MKS/High-ERS cancers had a low risk of early relapse [0–2.5 years, HR 0.13; p = 0.0006] but a high risk of late relapse, which was higher than in the High-MKS/Low-ERS group [after 5 years, HR 3.86; p = 0.007]. The High-MKS/Low-ERS subset had most of the early relapses [0 to 2.5 years, HR 6.53; p < 0.0001], especially in node-negative tumors, and showed minimal response to neoadjuvant letrozole. These findings were qualitatively confirmed in a smaller independent cohort of tamoxifen-treated patients. Using different biomarkers provided similar results.
Early relapses are highest in highly proliferative/low-ERS cancers, in particular in node negative tumors. Relapses occurring after 5 years of adjuvant tamoxifen are highest among the highly-proliferative/high-ERS tumors although their risk of recurrence is modest in the first 5 years on tamoxifen. These tumors could be the best candidates for extended endocrine therapy.
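The median-based grouping used in this study can be illustrated with a short sketch. It assumes each sample carries two precomputed scores under the hypothetical keys `mks` and `ers`, and it treats values strictly above the median as "High"; the abstract does not say how ties at the median were handled, so that rule is an assumption.

```python
from statistics import median


def median_groups(samples):
    """Assign each sample to one of four Low/High MKS x ERS groups."""
    mks_cut = median(s["mks"] for s in samples.values())
    ers_cut = median(s["ers"] for s in samples.values())
    groups = {}
    for sample_id, s in samples.items():
        # Assumption: strictly greater than the median counts as "High".
        mks = "High" if s["mks"] > mks_cut else "Low"
        ers = "High" if s["ers"] > ers_cut else "Low"
        groups[sample_id] = f"{mks}-MKS/{ers}-ERS"
    return groups


# Hypothetical scores for four samples, for illustration only.
samples = {
    "p1": {"mks": 0.1, "ers": 0.2},
    "p2": {"mks": 0.9, "ers": 0.1},
    "p3": {"mks": 0.2, "ers": 0.8},
    "p4": {"mks": 0.8, "ers": 0.9},
}
groups = median_groups(samples)
```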
Article
Many forms of cancer have multiple subtypes with different causes and clinical outcomes. Somatic tumor genome sequences provide a rich new source of data for uncovering these subtypes but have proven difficult to compare, as two tumors rarely share the same mutations. Here we introduce network-based stratification (NBS), a method to integrate somatic tumor genomes with gene networks. This approach allows for stratification of cancer into informative subtypes by clustering together patients with mutations in similar network regions. We demonstrate NBS in ovarian, uterine and lung cancer cohorts from The Cancer Genome Atlas. For each tissue, NBS identifies subtypes that are predictive of clinical outcomes such as patient survival, response to therapy or tumor histology. We identify network regions characteristic of each subtype and show how mutation-derived subtypes can be used to train an mRNA expression signature, which provides similar information in the absence of DNA sequence.
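The idea of comparing tumours by "network regions" rather than by individual mutated genes can be sketched with a simple network propagation step (random walk with restart), in which each patient's binary mutation profile is smoothed over a gene network before clustering. The code below is an illustrative assumption rather than the published NBS implementation: the update rule, the restart weight `alpha`, and the four-gene path network with placeholder names `g1` to `g4` are all chosen for the example.

```python
def propagate(adjacency, seed_scores, alpha=0.7, n_iter=100, tol=1e-9):
    """Smooth seed scores over an undirected network by random walk with restart.

    adjacency maps each node to the set of its neighbours (assumed symmetric);
    seed_scores maps mutated genes to their initial score.
    """
    nodes = sorted(adjacency)
    degree = {u: len(adjacency[u]) for u in nodes}
    f0 = {u: float(seed_scores.get(u, 0.0)) for u in nodes}
    f = dict(f0)
    for _ in range(n_iter):
        new = {}
        for u in nodes:
            # Each neighbour v passes on an equal share of its current score.
            incoming = sum(f[v] / degree[v] for v in adjacency[u] if degree[v] > 0)
            new[u] = alpha * incoming + (1 - alpha) * f0[u]
        delta = max(abs(new[u] - f[u]) for u in nodes)
        f = new
        if delta < tol:
            break
    return f


# Hypothetical path network g1 - g2 - g3 - g4 with a single mutation in g1.
adjacency = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3"},
}
scores = propagate(adjacency, {"g1": 1.0})
```

After propagation, genes close to the mutated gene receive higher scores than distant ones, so two patients with mutations in neighbouring genes end up with similar smoothed profiles even if they share no mutated gene.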
Article
Major international projects are underway that are aimed at creating a comprehensive catalogue of all the genes responsible for the initiation and progression of cancer. These studies involve the sequencing of matched tumour-normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false-positive findings that overshadow true driver events. We show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumour-normal pairs and discover extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology, and in mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer.
Article
Background: Public repositories of biological pathways and networks have greatly expanded in recent years. Such databases contain many pathways that facilitate the analysis of high-throughput experimental work and the formulation of new biological hypotheses to be tested, a fundamental principle of the systems biology approach. However, large-scale molecular maps are not always easy to mine and interpret. Results: We have developed BiNoM (Biological Network Manager), a Cytoscape plugin, which provides functions for the import-export of some standard systems biology file formats (import from CellDesigner, BioPAX Level 3 and CSML; export to SBML, CellDesigner and BioPAX Level 3), and a set of algorithms to analyze and reduce the complexity of biological networks. BiNoM can be used to import and analyze files created with the CellDesigner software. BiNoM provides a set of functions for importing BioPAX files as well as for searching and editing their content. As such, BiNoM is able to efficiently manage large BioPAX files such as whole pathway databases (e.g. Reactome). BiNoM also implements a collection of powerful graph-based functions and algorithms such as path analysis, decomposition by involvement of an entity or cyclic decomposition, subnetwork clustering and decomposition of a large network into modules. Conclusions: Here, we provide an in-depth overview of the BiNoM functions, and we also detail novel aspects such as the support of the BioPAX Level 3 format and the implementation of a new algorithm for the quantification of pathways for influence networks. Finally, we illustrate some of the BiNoM functions on a detailed biological case study of a network representing the G1/S transition of the cell cycle, a crucial cellular process disturbed in most human tumors.
Article
Large-scale cancer genome sequencing has uncovered thousands of gene mutations, but distinguishing tumor driver genes from functionally neutral passenger mutations is a major challenge. We analyzed 800 cancer genomes of eight types to find single-nucleotide variants (SNVs) that precisely target phosphorylation machinery, important in cancer development and drug targeting. Assuming that cancer-related biological systems involve unexpectedly frequent mutations, we used novel algorithms to identify genes with significant phosphorylation-associated SNVs (pSNVs), phospho-mutated pathways, kinase networks, drug targets, and clinically correlated signaling modules. We highlight increased survival of patients with TP53 pSNVs, hierarchically organized cancer kinase modules, a novel pSNV in EGFR, and an immune-related network of pSNVs that correlates with prolonged survival in ovarian cancer. Our findings include multiple actionable cancer gene candidates (FLNB, GRM1, POU2F1), protein complexes (HCF1, ASF1), and kinases (PRKCZ). This study demonstrates new ways of interpreting cancer genomes and presents new leads for cancer research.
Article
In mammalian cells more than 90% of double-strand breaks are repaired by NHEJ. Impairment of this pathway is associated with cell cycle arrest, cell death, genomic instability and cancer. Human diseases such as Nijmegen breakage syndrome, due to mutations in the NBS1 gene, produce defects in resection of double-strand breaks. NBS1 hypomorphic mutant mice are viable, and cells from these mice are defective in S phase and G2/M checkpoints. NBS1 polymorphisms have been associated with increased risk of breast cancer. We previously demonstrated that estradiol protected estrogen receptor (ER)-positive (+) breast cancer cell lines against double-strand breaks and cell death. We now demonstrate that protection from double-strand break damage in ER+ cells is mediated via regulation by c-myc, p53, CBP and SRC1 coactivators in intron 1 of the NBS1 gene. We concluded that NBS1 is responsible for estradiol-mediated protection from double-strand breaks in ER+ breast cancer cells.
Article
The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms. BioGRID maintains complete curation coverage of the literature for the budding yeast Saccharomyces cerevisiae, the fission yeast Schizosaccharomyces pombe and the model plant Arabidopsis thaliana. A number of themed curation projects in areas of biomedical importance are also supported. BioGRID has established collaborations and/or shares data records for the annotation of interactions and phenotypes with most major model organism databases, including Saccharomyces Genome Database, PomBase, WormBase, FlyBase and The Arabidopsis Information Resource. BioGRID also actively engages with the text-mining community to benchmark and deploy automated tools to expedite curation workflows. BioGRID data are freely accessible through both a user-defined interactive interface and in batch downloads in a wide variety of formats, including PSI-MI2.5 and tab-delimited files. BioGRID records can also be interrogated and analyzed with a series of new bioinformatics tools, which include a post-translational modification viewer, a graphical viewer, a REST service and a Cytoscape plugin.
Article
Complete knowledge of all direct and indirect interactions between proteins in a given cell would represent an important milestone towards a comprehensive description of cellular mechanisms and functions. Although this goal is still elusive, considerable progress has been made—particularly for certain model organisms and functional systems. Currently, protein interactions and associations are annotated at various levels of detail in online resources, ranging from raw data repositories to highly formalized pathway databases. For many applications, a global view of all the available interaction data is desirable, including lower-quality data and/or computational predictions. The STRING database (http://string-db.org/) aims to provide such a global perspective for as many organisms as feasible. Known and predicted associations are scored and integrated, resulting in comprehensive protein networks covering >1100 organisms. Here, we describe the update to version 9.1 of STRING, introducing several improvements: (i) we extend the automated mining of scientific texts for interaction information, to now also include full-text articles; (ii) we entirely re-designed the algorithm for transferring interactions from one model organism to the other; and (iii) we provide users with statistical information on any functional enrichment observed in their networks.
Article
Importance of the field: Post-genome drug development has been driven by the need to study biological perturbations at the molecular system level. Systems biology visualization tools can help researchers extract hidden patterns from complex and large Omics data sets, model disease molecular mechanisms, and identify drug targets and drugs with good pharmacological and toxicological profiles. Areas covered in this review: This review covers basic concepts in developing and applying information visualization tools to systems biology. We describe a framework and basic data representation schemes for visual data analysis in systems biology. We review major application areas of these visualization tools within drug discovery by focusing on early-stage drug discovery tasks such as disease biology modeling, target identifications and lead identification. We also show case studies and summarize our experience using visualization tools as lessons to our readers. What the reader will gain: The reader will understand what visualization tools are available for diverse types of systems biology studies in drug discovery and understand how these tools can help advance drug development. Take home message: In spite of the complexity inherent in systems biology, proper use of information visualization tools may reveal emerging properties hidden in the data and enhance chances of success for drug discovery.
Article
Although hereditary breast cancers have defects in the DNA damage response that result in genomic instability, DNA repair abnormalities in sporadic breast cancers have not been extensively characterized. Recently, we showed that, relative to nontumorigenic breast epithelial MCF10A cells, estrogen receptor-positive (ER+) MCF7 breast cancer cells and progesterone receptor-positive (PR+) MCF7 breast cancer cells have reduced steady-state levels of DNA ligase IV, a component of the major DNA-protein kinase (PK)-dependent nonhomologous end joining (NHEJ) pathway, whereas the steady-state level of DNA ligase IIIα, a component of the highly error-prone alternative NHEJ (ALT NHEJ) pathway, is increased. Here, we show that tamoxifen- and aromatase-resistant derivatives of MCF7 cells and ER(-)/PR(-) cells have even higher steady-state levels of DNA ligase IIIα and increased levels of PARP1, another ALT NHEJ component. This results in increased dependence upon microhomology-mediated ALT NHEJ to repair DNA double-strand breaks (DSB) and the accumulation of chromosomal deletions. Notably, therapy-resistant derivatives of MCF7 cells and ER(-)/PR(-) cells exhibited significantly increased sensitivity to a combination of PARP and DNA ligase III inhibitors that increased the number of DSBs. Biopsies from ER(-)/PR(-) tumors had elevated levels of ALT NHEJ and reduced levels of DNA-PK-dependent NHEJ factors. Thus, our results show that ALT NHEJ is a novel therapeutic target in breast cancers that are resistant to frontline therapies and suggest that changes in NHEJ protein levels may serve as biomarkers to identify tumors that are candidates for this therapeutic approach.
Article
Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further.
Article
The identification of a constantly increasing number of genes whose mutations are causally implicated in tumor initiation and progression (cancer genes) requires the development of tools to store and analyze them. The Network of Cancer Genes (NCG 3.0) collects information on 1494 cancer genes that have been found mutated in 16 different cancer types. These genes were collected from the Cancer Gene Census as well as from 18 whole exome and 11 whole-genome screenings of cancer samples. For each cancer gene, NCG 3.0 provides a summary of the gene features and the cross-reference to other databases. In addition, it describes duplicability, evolutionary origin, orthology, network properties, interaction partners, microRNA regulation and functional roles of cancer genes and of all genes that are related to them. This integrated network of information can be used to better characterize cancer genes in the context of the system in which they act. The data can also be used to identify novel candidates that share the same properties of known cancer genes and may therefore play a similar role in cancer. NCG 3.0 is freely available at http://bio.ifom-ieo-campus.it/ncg.
Article
High-throughput studies of biological systems are rapidly accumulating a wealth of 'omics'-scale data. Visualization is a key aspect of both the analysis and understanding of these data, and users now have many visualization methods and tools to choose from. The challenge is to create clear, meaningful and integrated visualizations that give biological insight, without being overwhelmed by the intrinsic complexity of the data. In this review, we discuss how visualization tools are being used to help interpret protein interaction, gene expression and metabolic profile data, and we highlight emerging new directions.
Article
Network "guilt by association" (GBA) is a proven approach for identifying novel disease genes based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale, human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated, but few are known with very high certainty. Here, we resolve this by explicitly modeling the uncertainty of the associations and incorporating the uncertainty for the seed set into the GBA framework. We observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes by comparing our predictions to results from follow-up meta-analyses, with incorporation of the network serving to highlight the JAK-STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the GWAS study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, both for candidate gene-based and GWAS-based studies.
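The label propagation idea in the abstract above can be illustrated with a small sketch. The code below is not the HumanNet implementation; it is a generic random walk with restart, one common member of the PageRank-related family, written under stated assumptions: the network is undirected and unweighted, the node names are hypothetical placeholders rather than real genes, and the seed weights stand in for the uncertainty of each association (a confident seed gets a weight near 1, a weakly implicated one a smaller weight). The function name `propagate` and its parameters are likewise illustrative.

```python
def propagate(adjacency, seed_scores, restart=0.3, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected, unweighted network.

    adjacency:   dict mapping node -> set of neighbouring nodes (symmetric)
    seed_scores: dict mapping node -> prior weight; higher means more
                 confidence that the node is associated with the disease
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed must have a positive weight")
    # Normalise the seed weights into a restart distribution.
    prior = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    scores = dict(prior)
    for _ in range(max_iter):
        updated = {}
        for n in nodes:
            # Each neighbour spreads its current score evenly over its edges.
            inflow = sum(scores[m] / len(adjacency[m]) for m in adjacency[n])
            updated[n] = restart * prior[n] + (1 - restart) * inflow
        delta = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical toy network; node labels are placeholders, not real genes.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
# "A" is a confident seed, "B" a weakly implicated one.
seeds = {"A": 1.0, "B": 0.4}
scores = propagate(toy, seeds)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this sketch, non-seed nodes close to the seeds (here "C") receive higher scores than distant ones (here "E"), which is the sense in which network neighbours become "guilty by association". The restart probability of 0.3 is an arbitrary illustrative choice, not a value taken from the abstract.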
Article
Given the functional interdependencies between the molecular components in a human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects the perturbations of the complex intracellular and intercellular network that links tissue and organ systems. The emerging tools of network medicine offer a platform to explore systematically not only the molecular complexity of a particular disease, leading to the identification of disease modules and pathways, but also the molecular relationships among apparently distinct (patho)phenotypes. Advances in this direction are essential for identifying new disease genes, for uncovering the biological significance of disease-associated mutations identified by genome-wide association studies and full-genome sequencing, and for identifying drug targets and biomarkers for complex diseases.
Article
Despite initial and sometimes dramatic responses of specific NSCLC tumors to EGFR TKIs, nearly all will develop resistance and relapse. Gene expression analysis of NSCLC cell lines treated with the EGFR TKI, gefitinib, revealed increased levels of FGFR2 and FGFR3 mRNA. Analysis of gefitinib action on a larger panel of NSCLC cell lines verified that FGFR2 and FGFR3 expression is increased at the mRNA and protein level in NSCLC cell lines in which the EGFR is dominant for growth signaling, but not in cell lines where EGFR signaling is absent. A luciferase reporter containing 2.5 kilobases of fgfr2 5' flanking sequence was activated after gefitinib treatment, indicating transcriptional regulation as a contributing mechanism controlling increased FGFR2 expression. Induction of FGFR2 and FGFR3 protein as well as fgfr2-luc activity was also observed with Erbitux, an EGFR-specific monoclonal antibody. Moreover, inhibitors of c-Src and MEK stimulated fgfr2-luc activity to a similar degree as gefitinib, suggesting that these pathways may mediate EGFR-dependent repression of FGFR2 and FGFR3. Importantly, our studies demonstrate that EGFR TKI-induced FGFR2 and FGFR3 are capable of mediating FGF2 and FGF7 stimulated ERK activation as well as FGF-stimulated transformed growth in the setting of EGFR TKIs. In conclusion, this study highlights EGFR TKI-induced FGFR2 and FGFR3 signaling as a novel and rapid mechanism of acquired resistance to EGFR TKIs and suggests that treatment of NSCLC patients with combinations of EGFR and FGFR specific TKIs may be a strategy to enhance efficacy of single EGFR inhibitors.
Book
Cancer is a complex and heterogeneous disease that exhibits high levels of robustness against various therapeutic interventions. It is a constellation of diverse and evolving disorders that are manifested by the uncontrolled proliferation of cells that may eventually lead to fatal dysfunction of the host system. Although some cancer subtypes can be cured by early diagnosis and specific treatment, no effective treatment is yet established for a significant portion of cancer subtypes. In industrial countries where the average life expectancy is high, cancer is one of the major causes of death. Any contribution to an in-depth understanding of cancer should eventually lead to better care and treatment for patients. Due to the complex, heterogeneous, and evolving nature of cancer, it is essential to adopt a system-oriented view to reach such an understanding. The question is how to achieve an in-depth yet realistic understanding of cancer dynamics. Although large-scale experiments are now being deployed, there are practical limitations on how much they can convey of the reality of cancer pathology and progression within the patient’s body. Computational approaches with system-oriented thinking may complement the limitations of an experimental approach. Computational studies not only provide us with new insights from large-scale experimental data, but also enable us to perceive the conceivable characteristics of cancer under certain assumptions. They serve as an engine of thought and a proving ground for various hypotheses on how cancer may behave as well as how molecular mechanisms work under anomalous conditions. It is not just computing that helps us fight against cancer; a computational approach has to be combined with a proper theoretical framework that enables us to perceive “cancer” as a complex, dynamical and evolvable system with a robust yet fragile nature.
This recognition shifts our attention from the magic bullet approach of anti-cancer drugs to a more systematic control of cancer as complex dynamical phenomena. This leads to the view that a complex system has to be controlled by complex interventions. To understand such a system and design complex interventions, it is essential that we combine experimental and computational approaches. Thus, computational systems biology of cancer is an essential discipline for cancer biology and is expected to have major impacts for clinical decision-making. This is the first book specifically focused on computational systems biology of cancer with a coherent and proper vision on how to tackle this formidable challenge. Book website: http://www.cancer-systems-biology.net/
Article
Many cancer-associated genes and pathways remain to be identified in order to clarify the molecular mechanisms underlying cancer progression. In this area, genome-wide loss-of-function screens appear to be powerful biological tools, allowing the accumulation of large amounts of data. However, this approach currently lacks analytical tools to exploit the data with maximum efficiency, for which systems biology methods analyzing complex cellular networks may be extremely helpful. We report such a systems biology strategy based on the construction of a network for a biological process and specific for a given cell system (cell type). The networks are created from genome-wide loss-of-function screen data sets. We also propose tools to analyze network properties. As one of the tools, we suggest a mathematical model for discrimination between two distinct cell processes that may be affected by knocking down the activity of a gene, i.e., a decreased cell number may be caused by arrested cell proliferation or enhanced cell death. Next we show how this discrimination between the two cell processes helps to construct two corresponding subnetworks. Finally, we demonstrate an application of the proposed strategy to the identification and characterization of putative novel genes and pathways significant for the control of lung cancer cell growth, based on the results of a genome-wide proliferation/viability loss-of-function screen of human lung adenocarcinoma cells.
Article
The extensive molecular characterization of tumors with high throughput technologies has led to the segmentation of different tumors into very small molecularly defined subgroups. Many ongoing clinical trials are conducted only when specific molecular alterations are identified in tumor samples. In this review, we will describe the implementation of genome analysis in the clinical setting as it has expanded over the last four years in our Precision Medicine Program. This manuscript will also highlight the main limitations and challenges related to the development of broader and deeper genome analysis.
Article
Molecular biology knowledge can be formalized and systematically represented in a computer-readable form as a comprehensive map of molecular interactions. There is an increasing number of maps of molecular interactions containing detailed, step-wise descriptions of various cell mechanisms. It is difficult to explore these large maps, to organize discussion of their content and to maintain them. Several efforts were recently made to combine these capabilities in one environment, and NaviCell represents one of them. NaviCell is a web-based environment for exploiting large maps of molecular interactions, created in CellDesigner, allowing their easy exploration, curation and maintenance. It is characterized by a combination of three essential features: (1) efficient map browsing based on the Google Maps engine; (2) semantic zooming for viewing different levels of detail or abstraction of the map and (3) an integrated web-based blog for collecting community feedback. NaviCell can be easily used by experts in the field of molecular biology for studying molecular entities of interest in the context of signaling pathways and crosstalk between pathways within a global signaling network. NaviCell allows both exploration of detailed molecular mechanisms represented on the map and a more abstract view of the map up to a top-level modular representation. NaviCell greatly facilitates curation, maintenance and updating of comprehensive maps of molecular interactions in an interactive and user-friendly fashion thanks to an embedded blogging system. NaviCell provides user-friendly exploration of large-scale maps of molecular interactions, thanks to Google Maps and WordPress interfaces, with which many users are already familiar. Semantic zooming, which is used for navigating geographical maps, is adopted for molecular maps in NaviCell, making any level of visualization readable. In addition, NaviCell provides the framework for community-based curation of maps.
Article
The Biological Network Manager (BiNoM) is a software tool for the manipulation and analysis of biological networks. It facilitates the import and conversion of a set of well-established systems biology file formats. It also provides a large set of graph-based algorithms that allow users to analyze and extract relevant subnetworks from large molecular maps. It has been successfully used in several projects related to the analysis of large and complex biological data, or networks from databases. In this tutorial, we present a detailed and practical case study of how to use BiNoM to analyze biological networks.