Michael Schubert’s research while affiliated with EMBL-EBI and other places


Publications (26)


Figure 1. SBML Level 3 (Hucka et al, 2019) consists of a core (center) and specialized SBML Level 3 packages (in blue), which provide syntactical constructs to support additional modeling approaches.
SBML Level 3: an extensible format for the exchange and reuse of biological models
Literature Review · August 2020 · 625 Reads · 244 Citations

Molecular Systems Biology

Sarah M Keating · [...] · Jeremy Zucker

Abstract: Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction‐based models and packages that extend the core with features suited to other model types including constraint‐based models, reaction‐diffusion models, logical network models, and rule‐based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single‐cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.
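As a rough illustration of what a reaction-based model in the SBML Level 3 core might look like, the following Python sketch builds a minimal document with one compartment, two species, and one reaction, using only the standard library. The model content (`toy_model`, `cell`, `A`, `B`, `A_to_B`) is hypothetical, the Level 3 Version 1 core namespace and attribute set are written from memory, and the output has not been validated against the specification or with libSBML.

```python
import xml.etree.ElementTree as ET

# Namespace of the SBML Level 3 Version 1 core (assumed; check the spec).
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"


def build_minimal_model():
    """Build a toy reaction-based model: A -> B inside one compartment."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "3", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": "toy_model"})

    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cell", "constant": "true"})

    species = ET.SubElement(model, "listOfSpecies")
    for species_id in ("A", "B"):  # hypothetical species names
        ET.SubElement(species, "species", {
            "id": species_id,
            "compartment": "cell",
            "hasOnlySubstanceUnits": "false",
            "boundaryCondition": "false",
            "constant": "false",
        })

    reactions = ET.SubElement(model, "listOfReactions")
    reaction = ET.SubElement(reactions, "reaction", {
        "id": "A_to_B", "reversible": "false", "fast": "false",
    })
    reactants = ET.SubElement(reaction, "listOfReactants")
    ET.SubElement(reactants, "speciesReference",
                  {"species": "A", "stoichiometry": "1", "constant": "true"})
    products = ET.SubElement(reaction, "listOfProducts")
    ET.SubElement(products, "speciesReference",
                  {"species": "B", "stoichiometry": "1", "constant": "true"})
    return sbml


# Serialize to a string; a real workflow would validate this with an SBML library.
document = ET.tostring(build_minimal_model(), encoding="unicode")
print(document)
```

Packages such as those for constraint-based or rule-based models would add their own namespaced elements on top of this core; that is not shown here.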




Fig. 5 Response signatures outperform pathway methods for patient survival. a Pan-cancer associations between pathway scores and patient survival. Pathways on the horizontal axis and different methods on the vertical axis. Associations of survival increase (green) and decrease (red). Significance labels as indicated. Shades correspond to effect size, p values as indicated. b Volcano plot of cancer-specific associations between patient survival and inferred pathway score using PROGENy. Effect size on the horizontal axis. Below zero indicates increased survival (green), above decreased survival (red). FDR-adjusted p values on the vertical axis. Size of the dots corresponds to number of patients in each cohort. c Kaplan-Meier curves of individual associations for kidney (KIRC), low-grade glioma (LGG), and adrenocortical carcinoma (ACC). Pathway scores are split in top and bottom quartiles and center half. Lines show the fraction of patients (vertical axis) that are alive at a given time (horizontal axis) within one year. P values for discretized scores
Deriving pathway-response signatures for 11 pathways. a Reasoning about pathway activation. Most pathway approaches make use of either the set (top panel) or infer or incorporate structure (middle panel) of signaling molecules to make statements about a possible activation, while signature-based approaches such as PROGENy consider the genes affected by perturbing the pathway. b Workflow of the data curation and model building. (1) Finding and curation of 208 publicly available experiment series in the ArrayExpress database, (2) Extracting 556 perturbation experiments from series’ raw data, (3) Performing QC metrics and discarding failures, (4) Computing z-scores per experiment, (5) Using a linear regression model to fit genes responsive to all pathways simultaneously obtaining the z-coefficients matrix, (6) Assigning pathway scores using the coefficients matrix and basal expression data. See methods section for details. c Size of the data set compared to an individual gene expression signature experiment. The number of experiments that comprise each pathway is shown to scale and indicated. Figure 1b (2) created by Guillaume Paumier is published under a CC-BY-SA license, sourced from https://commons.wikimedia.org/wiki/File:DNA_microarray.svg. Figure 1b (4) is an adaptation (by Chen-Pan Liao) of the original work of User:Jhguch at en.wikipedia, published under a CC-BY-SA license, sourced from https://commons.wikimedia.org/wiki/File:Boxplot_vs_PDF.svg. Figure 1b (6) is an adaptation (by User:Ogrebot) of the original work of User:Bilou at en.wikipedia, published under a CC-BY-SA license, sourced from https://commons.wikimedia.org/wiki/File:Matrix_multiplication_diagram_2.svg
Evaluation of pathway-response signatures. a Associations for PROGENy pathway scores with experimental perturbation for experiments that the model was not built with (leave-one-out cross-validation). Each pathway is strongly associated with its own perturbation, and we observe few cases of cross-talk in agreement with biological knowledge. b Pathway perturbations in HEK293 cell line activate the corresponding signaling proteins. MEK and ERK for MAPK pathway, Stat3 for Interferon-induced JAK-STAT, AKT for PI3K, Smad2 for TGFb, and IKb for TNF-alpha-induced NFkB. As expected, all increased upon stimulation except AKT that decreased upon inhibition. Activation shown relative to maximum readout per antibody, p values reported for one-sample one-sided t test. Results are significant if p < 0.05 and perturbation is at least 30% of maximum. c PROGENy correctly infers pathway activity from gene expression in the HEK293 experiments. Associations are significant if p value of two-sample one-sided t test <0.05 and experiments are at least 1.5 standard deviations above or below the control. d Stability of basal pathway scores when bootstrapping input experiments. Bars show how much more variance in pathway scores (GDSC panel) is introduced by cell line identity over using resampled perturbation experiments in model building. Variance by cell line is over five times as high for most pathways, and roughly twice as high for Trail and VEGF
Ability of pathway methods to recover well-known mutations. a Volcano plot of pan-cancer associations between driver mutations and copy number aberrations with differences in pathway score. Pathway scores calculated from basal gene expression in the TCGA for primary tumors. Size of points corresponds to occurrence of aberration. Type of aberration is indicated by superscript “mut” if mutated and “amp”/“del” if amplified or deleted, with colors as indicated. Effect sizes on the horizontal axis larger than zero indicate pathway activation and smaller than zero indicate inferred inhibition. P values on the vertical axis FDR-adjusted with a significance threshold of 5%. Associations shown without correcting for different cancer types. Associations with a black outer ring are also significant if corrected. b Comparison of pathway scores (vertical axes) across different methods (horizontal axes) for TP53 and KRAS mutations, EGFR amplifications and VHL mutations. Wald statistic shown as shades of green for downregulated and red for upregulated pathways. P value labels shown as indicated. White squares where a pathway was not available for a method
MAPK and p53 scores drive drug response across all cancer types. a Comparison of the associations obtained by different pathway methods. Number of associations on the vertical and FDR on the horizontal axis. PROGENy yields more and stronger associations than all other pathway methods. Mutation associations are only stronger for TP53/Nutlin-3a and drugs that were specifically designed to bind to a mutated protein. PARADIGM not shown because no associations <10% FDR. [...] markers (green) and greater than zero resistance markers (red). P values FDR-corrected. b Pathway context of the strongest associations (Supplementary Fig. 10) between EGFR/MAPK pathways and their inhibitors obtained by PROGENy. c Comparison of stratification by mutations and pathway scores. MAPK pathway (BRAF, NRAS, or KRAS) mutations and Trametinib on top left panel, AZ628 bottom left, BRAF mutations and Dabrafenib top right, and p53 pathway/TP53 mutations/Nutlin-3a bottom right. For each of the four cases, the leftmost violin plot shows the distribution of IC50s across all cell lines, followed by a stratification in wild-type (green) and mutant cell lines (blue box). The three rightmost violin plots show stratification of all the cell lines by the top, the two middle, and the bottom quartile of inferred pathway score (indicated by shade of color). The two remaining violin plots in the middle show mutated (BRAF, KRAS, or NRAS; blue color) or wild-type (TP53; green color) cell lines stratified by the top and bottom quartiles of MAPK or p53 pathway scores (Mann–Whitney U-test statistics as indicated)
Perturbation-response genes reveal signaling footprints in cancer gene expression

January 2018 · 506 Reads · 552 Citations

Aberrant cell signaling can cause cancer and other diseases and is a focal point of drug research. A common approach is to infer signaling activity of pathways from gene expression. However, mapping gene expression to pathway components disregards the effect of post-translational modifications, and downstream signatures represent very specific experimental conditions. Here we present PROGENy, a method that overcomes both limitations by leveraging a large compendium of publicly available perturbation experiments to yield a common core of Pathway RespOnsive GENes. Unlike pathway mapping methods, PROGENy can (i) recover the effect of known driver mutations, (ii) provide or improve strong markers for drug indications, and (iii) distinguish between oncogenic and tumor suppressor pathways for patient survival. Collectively, these results show that PROGENy accurately infers pathway activity from gene expression in a wide range of conditions.
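The scoring step named in the figure caption for this paper (assigning pathway scores from a coefficient matrix and expression data) amounts to a weighted sum of gene expression per pathway. The sketch below illustrates that idea in plain Python on toy data. It is not the PROGENy implementation: the gene names, pathway names, and coefficients are made up, and standardizing each gene across samples is an assumption made only for this example.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize one gene across samples (mean 0, sample sd 1)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def pathway_scores(expression, weights):
    """Score each sample as a weighted sum of gene expression per pathway.

    expression: {gene: [value per sample]}
    weights:    {pathway: {gene: coefficient}} (hypothetical toy coefficients)
    Genes missing from the expression data are skipped.
    """
    n_samples = len(next(iter(expression.values())))
    return {
        pathway: [
            sum(coef * expression[gene][i]
                for gene, coef in coefs.items() if gene in expression)
            for i in range(n_samples)
        ]
        for pathway, coefs in weights.items()
    }


# Toy data: three samples, three genes (made-up values, not from the paper).
expr = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [3.0, 2.0, 1.0],
    "GENE_C": [2.0, 2.0, 5.0],
}
expr_z = {gene: zscore(values) for gene, values in expr.items()}

# Hypothetical coefficient matrix: positive weights for genes that go up when
# the pathway is perturbed, negative weights for genes that go down.
weights = {
    "PathwayX": {"GENE_A": 1.0, "GENE_B": -1.0},
    "PathwayY": {"GENE_C": 0.5},
}

scores = pathway_scores(expr_z, weights)
for pathway, values in scores.items():
    print(pathway, [round(v, 3) for v in values])
```

In this toy case the third sample scores highest for both pathways, because its expression pattern matches the sign of the coefficients most closely.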





Supplementary Material 8

August 2017 · 10 Reads

Table S7. Associations between Drug Response and Molecular Data, Related to Figures 7 and S7. The predictive performance, data type, and target pathway are shown for each association. The column “Corrected for ABCBs?” indicates whether the feature was predicted with the mean protein abundance of ABCB1 and ABCB11 regressed out from drug response data.



Supplementary Material 3

August 2017 · 10 Reads

Table S2. Relative Phosphopeptide Abundances in the COREAD Cell Lines, Related to Figures 6, S2, and S4. Scaled phosphopeptide quantification values for 50 colorectal cancer cell lines. Phosphopeptides are annotated by the known regulatory kinase and by KEGG pathway where applicable. The values are not corrected or normalized for total protein levels.


Citations (9)


... PMs have been drawn based on SBGN using the CellDesigner structured diagram editor (Funahashi et al., 2008). CellDesigner is used to graphically represent SBGN-compatible networks, storing the maps in the Systems Biology Markup Language (SBML), a free and open software data format for describing models in systems biology (Keating et al., 2020). CellDesigner does not fully comply with SBGN (e.g. for the shape of the nodes), but we have ensured that the representation of the network is as close as possible to the standard. ...

Reference:

Mapping Physiology: A Systems Biology Approach for the Development of Alternative Methods in Toxicology
SBML Level 3: an extensible format for the exchange and reuse of biological models

Molecular Systems Biology

... Further analysis covered a wide range of cellular phenotypes and focused on calculating the gene signature scores of T cells [15], NK cells [17], B cells [18], macrophages [19], dendritic cells [19], and fibroblasts [20]. These scores were derived via the AddModuleScore function in the Seurat framework, which incorporates gene sets (Table S2-7) curated from previous studies [15,17,18,19,20]. Pathway-responsive genes for activity inference (PROGENy) [21] analysis was also used to compare and assess the activation levels of specific tumour signaling pathways in epithelial cells between PL and LM. ...

Perturbation-response genes reveal signaling footprints in cancer gene expression

... CRC is a highly heterogeneous disease that comprises various phenotypes, a plastic condition greatly discouraging therapy outcomes and patient survival. The extent of intratumor diversity in CRC has been revealed through studies on transcriptomic [34,35], proteomics [36,37], metabolic status [38,39] and functional response [40][41][42]. In particular, the recognition of epigenetic diversity in CRC is well documented and extended to exploring intratumor epigenetic heterogeneity [43][44][45]. ...

Genomic Determinants of Protein Abundance Variation in Colorectal Cancer Cells

Cell Reports

... At present, there seems to be no marked difference in technical systems of drug development between primary and metastatic tumors. As mentioned above, a vast number of potential drug targets emerge every year [79][80][81][82][83][84][85][86][87]. They are categorized as signal transduction, AMF, HGF/c-Met, TGF-β inhibitors, β-catenin inhibitors, cell movement inhibitors and other drug targets fuel for therapeutic promotion globally. ...

A Landscape of Pharmacogenomic Interactions in Cancer

Cell

... The combinatorial properties occurring among the alterations are then analyzed and used to define cost functions, for example, based on the tendency of a group of genes to be mutated in a mutually exclusive manner. On the basis of these cost functions, optimal sub-networks are identified and interpreted as novel cancer driver pathways [22][23][24]. However, at the moment there is no consensual method to rigorously define a mathematical metric for mutual exclusivity and compute its statistical significance, and a number of interpretations exist [22,23,25,26,27]. ...

Exploiting Combinatorial Patterns in Cancer Genomic Data for Personalized Therapy and New Target Discovery

... Dataset was deposited into a publicly accessible repository (https://data.mendeley.com/preview/2rw5pz75yr). The dynamic 13C-flux model has been deposited in the Biomodels database (https://www.ebi.ac.uk/biomodels) [45] with the identifier MODEL2310250001. ...

BioModels: Ten-year anniversary

Nucleic Acids Research

... Additionally, HRP has eight glycosylation sites, contributing to a carbohydrate content of 18% to 20% [5]. In 2014, 28 sequences encoding various HRP isoforms were extracted from a pyrosequenced transcriptome of Armoracia rusticana, revealing divergent characteristics [6]. These characteristics allow for the categorization of HRP isoforms into acidic, neutral, and basic isoenzymes, as determined by their isoelectric points [7]. ...

Peroxidase gene discovery from the horseradish transcriptome

BMC Genomics

... However, in spite of a few attempts [8,9], the model development process is far from being automatic and standardized. Parameter optimization frameworks have been implemented in a diverse manner, with different specification formats for the parameters, the experimental datasets, the parameter bounds, the objective functions and the choice of optimization methods [10][11][12][13][14][15][16][17][18][19][20][21][22][23]. ...

Path2Models: Large-scale generation of computational models from biochemical pathway maps

BMC Systems Biology