Michael P H Stumpf

Imperial College London | Imperial · Department of Life Sciences

DPhil


319
Publications
53,784
Reads
14,491
Citations
Additional affiliations
March 2002 - September 2003
University College London
Position
  • Wellcome Trust Research Career Development Fellow
Position
  • Professor for Theoretical Systems Biology
Education
October 1995 - February 1999
University of Oxford
Field of study
  • Statistical Physics


Publications (319)
Preprint
Full-text available
Many cellular processes involve information processing and decision making. We can probe these processes at increasing molecular detail. The analysis of heterogeneous data remains a challenge that requires new ways of thinking about cells in quantitative, predictive, and mechanistic ways. We discuss the role of mathematical models in the context of...
Preprint
Cells actively regulate their size along the cell cycle to maintain volume homeostasis across generations. While various mathematical models of cell size regulation have been proposed to explain how this is achieved, relating these models to experimentally observed cell size distributions has proved challenging. In this paper we derive a simple for...
Preprint
Full-text available
Interactions and relations between objects may be pairwise or higher-order in nature, and so network-valued data are ubiquitous in the real world. The "space of networks", however, has a complex structure that cannot be adequately described using conventional statistical tools. We introduce a measure-theoretic formalism for modeling generalized net...
Article
Full-text available
Single-cell technologies allow us to gain insights into cellular processes at unprecedented resolution. In stem cell and developmental biology snapshot data allow us to characterize how the transcriptional states of cells change between successive cell types. Here, we show how approximate Bayesian computation (ABC) can be employed to calibrate math...
Article
Full-text available
Cells are the fundamental units of life, and like all life forms, they change over time. Changes in cell state are driven by molecular processes; of these many are initiated when molecule numbers reach and exceed specific thresholds, a characteristic that can be described as “digital cellular logic”. Here we show how molecular and cellular noise pr...
Preprint
Full-text available
Deep learning methods have revolutionized our ability to predict protein structures, allowing us a glimpse into the entire protein universe. As a result, our understanding of how protein structure drives function is now lagging behind our ability to determine and predict protein structure. Here, we describe how topology, the branch of mathematics c...
Preprint
Full-text available
Changes in cell state are driven by key molecular events whose timing can often be measured experimentally. Of particular interest is the time taken for the levels of RNA or protein molecules to reach a critical threshold defining the triggering of a cellular event. While this mean trigger time can be estimated by numerical integration of determini...
Article
Major computational challenges exist in relation to the collection, curation, processing and analysis of large genomic and imaging datasets, as well as the simulation of larger and more realistic models in systems biology. Here we discuss how a relative newcomer among programming languages, Julia, is poised to meet the current and emerging demands in...
Article
Full-text available
The complexity of biological systems, and the increasingly large amount of associated experimental data, necessitates that we develop mathematical models to further our understanding of these systems. Because biological systems are generally not well understood, most mathematical models of these systems are based on experimental data, resulting in...
Article
Biology is data-rich, and it is equally rich in concepts and hypotheses. Part of trying to understand biological processes and systems is therefore to confront our ideas and hypotheses with data using statistical methods to determine the extent to which our hypotheses agree with reality. But doing so in a systematic way is becoming increasingly cha...
Article
Modelling and simulation of complex biochemical reaction networks form cornerstones of modern biophysics. Many of the approaches developed so far capture temporal fluctuations due to the inherent stochasticity of the biophysical processes, referred to as intrinsic noise. Stochastic fluctuations, however, predominantly stem from the interplay of the...
Preprint
Full-text available
Biology is data-rich, and it is equally rich in concepts and hypotheses. Part of trying to understand biological processes and systems is therefore to confront our ideas and hypotheses with data using statistical methods to determine the extent to which our hypotheses agree with reality. But doing so in a systematic way is becoming increasingly cha...
Preprint
Full-text available
Modelling and simulation of complex biochemical reaction networks form cornerstones of modern biophysics. Many of the approaches developed so far capture temporal fluctuations due to the inherent stochasticity of the biophysical processes, referred to as intrinsic noise. Stochastic fluctuations, however, predominantly stem from the interplay of the...
Article
Innovation in synthetic biology often still depends on large-scale experimental trial and error, domain expertise, and ingenuity. The application of rational design engineering methods promises to make this more efficient, faster, cheaper, and safer. However, this requires mathematical models of cellular systems. For these models, we then have to d...
Preprint
Full-text available
The complexity of biological systems, and the increasingly large amount of associated experimental data, necessitates that we develop mathematical models to further our understanding of these systems. Because biological systems are generally not well understood, most mathematical models of these systems are based on experimental data, resulting in...
Article
Full-text available
In many scientific and technological contexts, we have only a poor understanding of the structure and details of appropriate mathematical models. We often, therefore, need to compare different models. With available data we can use formal statistical model selection to compare and contrast the ability of different mathematical models to describe su...
Article
Full-text available
Single-cell expression profiling opens up new vistas on cellular processes. Extensive cell-to-cell variability at the transcriptomic and proteomic level has been one of the stand-out observations. Because most experimental analyses are destructive we only have access to snapshot data of cellular states. This loss of temporal information presents si...
Article
The Waddington epigenetic landscape has become an iconic representation of the cellular differentiation process. Recent single-cell transcriptomic data provide new opportunities for quantifying this originally conceptual tool, offering insight into the gene regulatory networks underlying cellular development. While many methods for constructing the...
Preprint
Full-text available
Increasing emphasis on data and quantitative methods in the biomedical sciences is making biological research more computational. Collecting, curating, processing, and analysing large genomic and imaging data sets poses major computational challenges, as does simulating larger and more realistic models in systems biology. Here we discuss how a rela...
Article
The formation of spatial structures lies at the heart of developmental processes. However, many of the underlying gene regulatory and biochemical processes remain poorly understood. Turing patterns constitute a main candidate to explain such processes, but they appear sensitive to fluctuations and variations in kinetic parameters, raising the quest...
Article
Full-text available
The predictive power of machine learning models often exceeds that of mechanistic modeling approaches. However, the interpretability of purely data-driven models, without any mechanistic basis, is often complicated, and predictive power by itself can be a poor metric by which we might want to judge different methods. In this work, we focus on the re...
Preprint
Full-text available
Cell fate decision making is known to be a complex process and is still far from being understood. The intrinsic complexity, but also features such as molecular noise represent challenges for modelling these systems. Waddington's epigenetic landscape has become the overriding metaphor for developmental processes: it both serves as pictorial represe...
Preprint
Full-text available
The metaphor of the Waddington epigenetic landscape has become an iconic representation of the cellular differentiation process. Recent accessibility of single-cell transcriptomic data has provided new opportunities for quantifying this originally conceptual tool that could offer insight into the gene regulatory networks underlying cellular develop...
Preprint
Full-text available
The formation of spatial structures lies at the heart of developmental processes. However, many of the underlying gene regulatory and biochemical processes remain poorly understood. Turing patterns constitute a main candidate to explain such processes, but they appear sensitive to fluctuations and variations in kinetic parameters, raising the quest...
Preprint
Full-text available
Single-cell expression profiling is destructive, giving rise to only static snapshots of cellular states. This loss of temporal information presents significant challenges in inferring dynamics from population data. Here we provide a formal analysis of the extent to which dynamic variability from within individual systems (“intrinsic noise”) is dis...
Article
Stochastic models are key to understanding the intricate dynamics of gene expression. However, the simplest models that only account for active and inactive states of a gene fail to capture common observations in both prokaryotic and eukaryotic organisms. Here, we consider multistate models of gene expression that generalize the canonical Telegraph...
Article
Noise in gene expression is one of the hallmarks of life at the molecular scale. Here we derive analytical solutions to a set of models describing the molecular mechanisms underlying transcription of DNA into RNA. Our ansatz allows us to incorporate the effects of extrinsic noise—encompassing factors external to the transcription of the individual...
Article
Full-text available
Motivation: Approximate Bayesian computation (ABC) is an important framework within which to infer the structure and parameters of a systems biology model. It is especially suitable for biological systems with stochastic and nonlinear dynamics, for which the likelihood functions are intractable. However, the associated computational cost often lim...
Preprint
Full-text available
Stochastic models are key to understanding the intricate dynamics of gene expression. But the simplest models which only account for e.g. active and inactive states of a gene fail to capture common observations in both prokaryotic and eukaryotic organisms. Here we consider multistate models of gene expression which generalise the canonical Telegrap...
Article
Turing patterns (TPs) underlie many fundamental developmental processes, but they operate over narrow parameter ranges, raising the conundrum of how evolution can ever discover them. Here we explore TP design space to address this question and to distill design rules. We exhaustively analyze 2- and 3-node biological candidate Turing systems, amount...
Chapter
Single cell experimental techniques now allow us to quantify gene expression in up to thousands of individual cells. These data reveal the changes in transcriptional state that occur as cells progress through development and adopt specialized cell fates. In this chapter we describe in detail how to use our network inference algorithm (PIDC)—and the...
Article
Full-text available
One of the central tasks in systems biology is to understand how cells regulate their metabolism. Hierarchical regulation analysis (HRA) is a powerful tool to study this regulation at the metabolic, gene-expression and signaling levels. It has been widely applied to study the steady-state regulation; but analysis of the metabolic dynamics remains c...
Article
Full-text available
Background Reverse engineering of gene regulatory networks from time series gene-expression data is a challenging problem, not only because of the vast sets of candidate interactions but also due to the stochastic nature of gene expression. We limit our analysis to nonlinear differential equation based inference methods. In order to avoid the compu...
Article
Full-text available
Many components of signaling pathways are functionally pleiotropic, and signaling responses are marked with substantial cell-to-cell heterogeneity. Therefore, biochemical descriptions of signaling require quantitative support to explain how complex stimuli (inputs) are encoded in distinct activities of pathway effectors (outputs). A unique perspec...
Article
Full-text available
The construction of effective and informative landscapes for stochastic dynamical systems has proven a long-standing and complex problem. In many situations, the dynamics may be described by a Langevin equation while constructing a landscape comes down to obtaining the quasipotential, a scalar function that quantifies the likelihood of reaching eac...
Preprint
Pluripotent stem cells (PSCs) can self-renew indefinitely while maintaining the ability to generate all cell types of the body. This plasticity is proposed to require heterogeneity in gene expression, driving a metastable state which may allow flexible cell fate choices. Contrary to this, naive PSC grown in fully defined '2i' environmental conditio...
Preprint
Turing patterns (TPs) underlie many fundamental developmental processes, but they operate over narrow parameter ranges, raising the conundrum of how evolution can ever discover them. Here we explore TP design space to address this question and to distill design rules. We exhaustively analyze 2- and 3-node biological candidate Turing systems: crucia...
Preprint
Full-text available
The construction of effective and informative landscapes for stochastic dynamical systems has proven a long-standing and complex problem. In many situations, the dynamics may be described by a Langevin equation while constructing a landscape comes down to obtaining the quasi-potential, a scalar function that quantifies the likelihood of reaching ea...
Preprint
Full-text available
Models describing the process of stem-cell differentiation are plentiful, and may offer insights into the underlying mechanisms and experimentally observed behaviour. Waddington’s epigenetic landscape has been providing a conceptual framework for differentiation processes since its inception. It also allows, however, for detailed mathematical and q...
Preprint
Full-text available
Reverse engineering of gene regulatory networks from time series gene-expression data is a challenging problem, not only because of the vast sets of candidate interactions but also due to the stochastic nature of gene expression. To avoid the computational cost of large-scale simulations, a two-step Gaussian process interpolation based gradient mat...
Article
Full-text available
Motivation: Different experiments provide differing levels of information about a biological system. This makes it difficult, a priori, to select one of them beyond mere speculation and/or belief, especially when resources are limited. With the increasing diversity of experimental approaches and general advances in quantitative systems biology, me...
Data
Table S1. PhysioSpace Annotation, Related to Figures 1C and S1, Table S2, and STAR Methods. List of global gene expression microarray data sourced from public repositories to construct the similarity score.
Data
Results of Gene Ontology search for differentially expressed genes based on global gene expression microarrays.
Data
Table S4. TaqMan Probes Used and Gene Annotation, Related to Figures 1 and 2 and STAR Methods. List of oligonucleotides used in this study for single-cell gene expression arrays and corresponding gene annotation based on the literature.
Preprint
Full-text available
While single-cell gene expression experiments present new challenges for data processing, the cell-to-cell variability observed also reveals statistical relationships that can be used by information theory. Here, we use multivariate information theory to explore the statistical dependencies between triplets of genes in single-cell gene expression d...
Article
Full-text available
Pluripotent stem cells can self-renew in culture and differentiate along all somatic lineages in vivo. While much is known about the molecular basis of pluripotency, the mechanisms of differentiation remain unclear. Here, we profile individual mouse embryonic stem cells as they progress along the neuronal lineage. We observe that cells pass from th...
Article
Full-text available
While single-cell gene expression experiments present new challenges for data processing, the cell-to-cell variability observed also reveals statistical relationships that can be used by information theory. Here, we use multivariate information theory to explore the statistical dependencies between triplets of genes in single-cell gene expression d...
Article
Full-text available
Controlling the behaviour of cells by rationally guiding molecular processes is an overarching aim of much of synthetic biology. Molecular processes, however, are notoriously noisy and frequently nonlinear. We present an approach to studying the impact of control measures on motifs of molecular interactions that addresses the problems faced in many...
Article
Full-text available
Dynamical systems describing whole cells are on the verge of becoming a reality. But as models of reality, they are only useful if we have realistic parameters for the molecular reaction rates and cell physiological processes. There is currently no suitable framework to reliably estimate hundreds, let alone thousands, of reaction rate parameters. H...
Article
Single cell transcriptomic data allow us to probe the transcriptional changes occurring during cell development in unprecedented detail. These complex datasets are driving the development of new computational and statistical tools that are revolutionizing our understanding of differentiation processes. Many clustering and dimensionality reduction m...
Article
Full-text available
We determine p53 protein abundances and cell to cell variation in two human cancer cell lines with single cell resolution, and show that the fractional width of the distributions is the same in both cases despite a large difference in average protein copy number. We developed a computational framework to identify dominant mechanisms controlling the...
Data
Uniform prior distributions used for Bayesian analysis. Priors were used for the first iteration; in subsequent steps the previously derived posterior provided the prior. Results of Data I are presented in the main text, Data II and Data III in S1 and S2 Figs. (PDF)
Data
Model selection results for four models. To explore further possibilities, we repeat our analysis with the inclusion of two more models, so that a total of four models are compared by the model selection algorithm. The additional models are logical extensions of Model I and Model II: Model 0 corresponds to identical rates in all reactions in the two c...
Data
Framework validation on synthetic datasets and parameter values used for synthetic data generation. To validate the discriminative ability of the applied methodology in its final form, we performed model selection on synthetic datasets, generated to reflect the properties of the actual measurements and our hypotheses. We use a basic model of one ce...
Data
Results of computational analysis on the second experimental dataset. (a) Evidence supporting transcription (dark orange curves) and protein degradation (green curves) control. Different markers denote replicates with different prior parameter distributions used at initialising the algorithm. (b) Comparison of experimental and simulated distributio...
Data
Model selection results using modified (non-linear) models. Evidence supporting transcription (dark orange curves) and protein degradation (green curves) regulation. The models used in the inference and selection algorithm are identical to Model I and Model II with the exception of the rate parameter of protein degradation, changed from k3 to k3*p53, ma...
Data
Parameter estimation results of the protein-degradation based model. Posterior distributions of the four parameters of Model II. Horizontal and vertical axes show possible parameter values and their probability, respectively. Prior distributions of the estimation algorithm were set to uniform ranges as summarised in “Data I—repeat III” in S1 Table....
Data
Comparing relative widths of the distributions of basal p53 protein expression from MCF7 and BE cell lines. To compare the variation in the distributions we fit a gamma distribution to each data set. The shape (k) and scale (θ) parameters were 9.05 and 5.43 × 10^5 for MCF7 cells and 2.44 and 5.19 × 10^6 for BE cells, respectively. Consequently, the c...
Data
Model selection results of three synthetic (validation) data sets. Evidence supporting transcription (dark orange curves) and protein degradation (green curves) control. Parameter values used for the generation of target datasets are as indicated in S2 Table. (PDF)
Article
Full-text available
Background: Rhinovirus infection is a major cause of asthma exacerbations. Objectives: We studied nasal and bronchial mucosal inflammatory responses during experimental rhinovirus-induced asthma exacerbations. Methods: We used nasosorption on days 0, 2–5 and 7 and bronchosorption at baseline and day 4 to sample mucosal lining fluid to investigate a...
Article
Full-text available
Osteopontin is a pleiotropic cytokine that is involved in several diseases including multiple sclerosis. Secreted osteopontin is cleaved by few known proteases, modulating its pro-inflammatory activities. Here we show by in vitro experiments that secreted osteopontin can be processed by extracellular proteasomes, thereby producing fragments with no...
Preprint
Full-text available
Pluripotent stem cells are able to self-renew indefinitely in culture and differentiate into all somatic cell types in vivo. While much is known about the molecular basis of pluripotency, the molecular mechanisms of lineage commitment are complex and only partially understood. Here, using a combination of single cell profiling and mathematical mod...
Article
Full-text available
It has previously been shown that subnets differ from global networks from which they are sampled for all but a very limited number of theoretical network models. These differences are of qualitative as well as quantitative nature, and the properties of subnets may be very different from the corresponding properties in the true, unobserved network....
Article
Full-text available
The haematopoietic stem cell (HSC) niche provides essential micro-environmental cues for the production and maintenance of HSCs within the bone marrow. During inflammation, haematopoietic dynamics are perturbed, but it is not known whether changes to the HSC-niche interaction occur as a result. We visualise HSCs directly in vivo, enabling detailed...
Article
Full-text available
Controlling the behaviour of cells by rationally guiding molecular processes is an overarching aim of much of synthetic biology. Molecular processes, however, are notoriously noisy and frequently non-linear. We present an approach to studying the impact of control measures on motifs of molecular interactions, that addresses the problems faced in bi...
Data
Movie S6. In Vivo Model of the Inflammatory Response to Chronic Non-healing Wounds, Related to Figure 6. In vivo imaging of extra-large wounds that fail to heal and remain open even 24 hours post-injury. Unlike normal healing wounds, the inflammatory response to these non-healing wounds is significantly attenuated, even at the earliest stages post-...
Data
Movie S2. Conditions Used to Analyze Spatiotemporal Dynamics of Immune Cell Behavior before and after Wounding, Related to Figure 1. In vivo time-lapse movies of the dynamic behavior of Drosophila immune cells in control unwounded tissue (left) and in response to tissue wounding (center and right, for small and large wounds, respectively) generated...