Creating and analyzing pathway and protein interaction compendia for modelling signal transduction networks

Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA.
BMC Systems Biology (Impact Factor: 2.44). 05/2012; 6(1):29. DOI: 10.1186/1752-0509-6-29
Source: PubMed


Understanding the information-processing capabilities of signal transduction networks and how those networks are disrupted in disease, and rationally designing therapies to manipulate diseased states, require systematic and accurate reconstruction of network topology. Data on networks central to human physiology, such as the inflammatory signalling networks analyzed here, are spread across many online pathway and interactome databases (Cancer CellMap, GeneGo, KEGG, NCI-Pathway Interactome Database (NCI-PID), PANTHER, Reactome, I2D, and STRING). We sought to determine whether these databases contain overlapping information and whether they can be used to construct high-reliability prior knowledge networks for subsequent modeling of experimental data.
We have assembled an ensemble network from multiple online sources representing a significant portion of all machine-readable and reconcilable human knowledge on proteins and protein interactions involved in inflammation. This ensemble network has many features expected of complex signalling networks assembled from high-throughput data: a power law distribution of both node degree and edge annotations, and topological features of a "bow tie" architecture in which diverse pathways converge on a highly conserved set of enzymatic cascades focused around PI3K/AKT, MAPK/ERK, JAK/STAT, NFκB, and apoptotic signaling. Individual pathways exhibit "fuzzy" modularity that is statistically significant but still involves a majority of "cross-talk" interactions. However, we find that the most widely used pathway databases are highly inconsistent with respect to the actual constituents and interactions in this network. Using a set of growth factor signalling networks as examples (epidermal growth factor, transforming growth factor-beta, tumor necrosis factor, and wingless), we find a multiplicity of network topologies in which receptors couple to downstream components through myriad alternate paths. Many of these paths are inconsistent with well-established mechanistic features of signalling networks, such as a requirement for a transmembrane receptor in sensing extracellular ligands.
Wide inconsistencies among interaction databases, pathway annotations, and the numbers and identities of nodes associated with a given pathway pose a major challenge for deriving causal and mechanistic insight from network graphs. We speculate that these inconsistencies are at least partially attributable to the cell- and context-specificity of cellular signal transduction, which is largely unaccounted for in available databases, but the absence of standardized vocabularies is an additional confounding factor. As a result of discrepant annotations, it is very difficult to identify biologically meaningful pathways from interactome networks a priori. However, by incorporating prior knowledge, it is possible to successively build out network complexity with high confidence from a simple linear signal transduction scaffold. Such reduced-complexity networks appear suitable for use in mechanistic models while being richer and better justified than the simple linear pathways usually depicted in diagrams of signal transduction.
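
One common way to quantify how much two databases agree on the membership of a pathway is the Jaccard similarity coefficient (size of the intersection divided by size of the union of the two node sets). The sketch below is illustrative only: the database labels (`db_A`, `db_B`, `db_C`), the gene lists, and the helper names `jaccard` and `pairwise_jaccard` are hypothetical and are not taken from the paper or from any real database export.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(sets_by_name):
    """Return {(name_x, name_y): similarity} for every unordered pair of named sets."""
    return {
        (x, y): jaccard(sets_by_name[x], sets_by_name[y])
        for x, y in combinations(sorted(sets_by_name), 2)
    }


# Hypothetical toy data: nodes that three made-up databases assign to one pathway.
pathway_nodes = {
    "db_A": {"EGFR", "GRB2", "SOS1", "KRAS", "RAF1"},
    "db_B": {"EGFR", "GRB2", "SHC1", "KRAS"},
    "db_C": {"EGFR", "PIK3CA", "AKT1"},
}

scores = pairwise_jaccard(pathway_nodes)
for (x, y), s in scores.items():
    print(f"{x} vs {y}: {s:.3f}")
```

A score near 1 means two sources list nearly the same nodes for the pathway, while a score near 0 means they share few. The same function can be applied to sets of edges (for example, tuples of interacting proteins) rather than nodes.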

Available from: Daniel C Kirouac, Aug 29, 2014
  • Source
    • "The GRN method used a Bayesian model consensus scheme employing the Markov Chain Monte Carlo (MCMC) Metropolis-Hastings algorithm. It incorporates a gene expression data-derived proposal matrix and a priori probability distribution method for combining prior biological knowledge from multiple sources that included KEGG[37], REACTOME[62], BioGRID[8], DIP[63], IntAct[64], MINT[65], GO, and predicted and known transcription factors (TF) binding sites from TRANSFAC[66] and JASPER[67]. For the second step, the output of the GRN provided a unique approach to identifying high probability relationships between the highly correlated genes and other upstream and downstream genes. "
    ABSTRACT: Rift Valley fever Virus (RVFV), a negative-stranded RNA virus, is the etiological agent of the vector-borne zoonotic disease, Rift Valley fever (RVF). In both humans and livestock, protective immunity can be achieved through vaccination. Earlier and more recent vaccine trials in cattle and sheep demonstrated a strong neutralizing antibody and total IgG response induced by the RVF vaccine, authentic recombinant MP-12 (arMP-12). From previous work, protective immunity in sheep and cattle vaccinates normally occurs from 7 to 21 days after inoculation with arMP-12. While the serology and protective response induced by arMP-12 has been studied, little attention has been paid to the underlying molecular and genetic events occurring prior to the serologic immune response. To address this, we isolated RNA from whole blood of vaccinated calves over a time course of 21 days before and after vaccination with arMP-12. The time course RNAs were sequenced by RNASeq and bioinformatically analyzed. Our results revealed time-dependent activation or repression of numerous gene ontologies and pathways related to the vaccine induced immune response and its regulation. Additional bioinformatic analyses identified a correlative relationship between specific host immune response genes and protective immunity prior to the detection of protective serum neutralizing antibody responses. These results contribute an important proof of concept for identifying molecular and genetic components underlying the immune response to RVF vaccination and protection prior to serologic detection.
    Full-text · Article · Jan 2016 · PLoS ONE
  • Source
    • "The value of a standardized map, as opposed to an ad hoc cartoon, in depicting molecular interactions has been well appreciated: such maps can be used to organize information concisely, can be interpreted with minimal ambiguity, and can aid in logical analysis (7–11). After creation of a map, construction of a computational model can be viewed as the next level of information formalization (12). Through modeling, assumptions about molecular interactions (e.g., whether or not two interactions are competitive) are made more concrete and can thus be better assessed. "
    ABSTRACT: Antigen receptors play a central role in adaptive immune responses. Although the molecular networks associated with these receptors have been extensively studied, we currently lack a systems-level understanding of how combinations of non-covalent interactions and post-translational modifications are regulated during signaling to impact cellular decision-making. To fill this knowledge gap, it will be necessary to formalize and piece together information about individual molecular mechanisms to form large-scale computational models of signaling networks. To this end, we have developed an interaction library for signaling by the high-affinity IgE receptor, FcεRI. The library consists of executable rules for protein-protein and protein-lipid interactions. This library extends earlier models for FcεRI signaling and introduces new interactions that have not previously been considered in a model. Thus, this interaction library is a toolkit with which existing models can be expanded and from which new models can be built. As an example, we present models of branching pathways from the adaptor protein Lat, which influence production of the phospholipid PIP3 at the plasma membrane and the soluble second messenger IP3. We find that inclusion of a positive feedback loop gives rise to a bistable switch, which may ensure robust responses to stimulation above a threshold level. In addition, the library is visualized to facilitate understanding of network circuitry and identification of network motifs.
    Full-text · Article · Apr 2014 · Frontiers in Immunology
  • Source
    • "Major issues in modeling biological large-scale phenomena are the collection of information from the literature. While cell signaling pathways are described in numerous databases, a recent report demonstrated a high degree of inconsistencies when different databases are compared [34]. Based on the Jaccard similarity coefficient, the authors compared four well understood pathways that involve the cytokines EGF (Epidermal growth factor), TGF-β (Transforming growth factor), TNFα (Tumor necrosis factor) and the signaling protein WNT (wingless-type) described in six databases, including GeneGo (, "
    ABSTRACT: The TGF-beta transforming growth factor is the most pleiotropic cytokine controlling a broad range of cellular responses that include proliferation, differentiation and apoptosis. The context-dependent multifunctional nature of TGF-beta is associated with complex signaling pathways. Differential models describe the dynamics of the TGF-beta canonical pathway, but modeling the non-canonical networks constitutes a major challenge. Here, we propose a qualitative approach to explore all TGF-beta-dependent signaling pathways. Using a new formalism, CADBIOM, which is based on guarded transitions and includes temporal parameters, we have built the first discrete model of TGF-beta signaling networks by automatically integrating the 137 human signaling maps from the Pathway Interaction Database into a single unified dynamic model. Temporal property-checking analyses of 15934 trajectories that regulate 145 TGF-beta target genes reveal the association of specific pathways with distinct biological processes. We identify 31 different combinations of TGF-beta with other extracellular stimuli involved in non-canonical TGF-beta pathways that regulate specific gene networks. Extensive analysis of gene expression data further demonstrates that genes sharing CADBIOM trajectories tend to be co-regulated. As applied here to TGF-beta signaling, CADBIOM allows, for the first time, a full integration of highly complex signaling pathways into dynamic models that permit to explore cell responses to complex microenvironment stimuli.
    Full-text · Article · Mar 2014 · BMC Systems Biology