Thesis
PDF Available

Deterministic approximation schemes with computable errors for the distributions of Markov chains

http://hdl.handle.net/10044/1/59103
... bounds on ρ(w) for norm-like functions w, as in (2.5)-(2.6)). Such bounds can be obtained using either Foster-Lyapunov criteria [34,35,15,27,30] or mathematical programming [24,45,19,27,29,44,8,9,14]. ...
... Bounds of this type can be obtained through various analytical and numerical methods. For instance, the moment bounds for the applications in this paper can be computed using either Foster-Lyapunov criteria [15,35,34,27,30] or mathematical programming approaches, i.e. by solving linear or semidefinite programs [24,19,27,45,29,44,8,9,14]. ...
Article
Full-text available
We study a class of countably infinite linear programs (CILPs) whose feasible sets are bounded subsets of appropriately defined spaces of measures. The optimal value, optimal points, and minimal points of these CILPs can be approximated by solving finite-dimensional linear programs. We show how to construct finite-dimensional programs that lead to approximations with easy-to-evaluate error bounds, and we prove that the errors converge to zero as the size of the finite-dimensional programs approaches that of the original problem. We discuss the use of our methods in the computation of the stationary distributions, occupation measures, and exit distributions of Markov chains.
... Among these are methods that approximate the distribution only within a given finite subset (a truncation) of the state space, neglecting the rest. Such truncation-based schemes date back to the late 60s [149,150] and several ways to solve the truncated problem have been proposed over the last decade [12,133,36,74,106,113,115,114,86,25,24,69,108,107,34,156,99,98,95,13,35]. How to compute and control the approximation errors introduced by these schemes is a matter of ongoing research. ...
... are any increasing truncations approaching the state space (3.2). These notions of convergence form a hierarchy [95, Chap. 5]: if w is positive and norm-like, then convergence (3.8) in w-norm ⇒ w-weakly* ⇒ in ℓ¹ ⇔ in total variation ⇒ pointwise, for any sequence of points in the space ℓ¹. If the approximations π_r are probability distributions (i.e. ...
... see Section 5), truncations with small tail masses typically also result in small scheme-specific errors for the schemes studied in Section 4. Choosing a truncation with a verifiably small tail mass is a difficult task. We know of two ways to systematically generate truncations accompanied by bounds on their tail masses: using Foster-Lyapunov criteria [34,36,156] or using moment bounds [99,95,98]. For the sake of simplicity, we focus here on the latter and leave the former to Remark 3.1 below. ...
Article
Full-text available
Computing the stationary distributions of a continuous-time Markov chain (CTMC) involves solving a set of linear equations. In most cases of interest, the number of equations is infinite or too large, and the equations cannot be solved analytically or numerically. Several approximation schemes overcome this issue by truncating the state space to a manageable size. In this review, we first give a comprehensive theoretical account of the stationary distributions and their relation to the long-term behaviour of CTMCs that is readily accessible to non-experts and free of irreducibility assumptions made in standard texts. We then review truncation-based approximation schemes for CTMCs with infinite state spaces paying particular attention to the schemes' convergence and the errors they introduce, and we illustrate their performance with an example of a stochastic reaction network of relevance in biology and chemistry. We conclude by elaborating on computational trade-offs associated with error control and several open questions.
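As a minimal sketch of the truncation idea this abstract describes, the snippet below approximates the stationary distribution of a birth-death chain by restricting it to the states {0, ..., r-1} and suppressing births out of the last state. The function name `truncated_stationary`, the rate functions, and the example rates are assumptions made for illustration; this is one simple truncation scheme, not necessarily any of the schemes covered in the review. The example chain (constant birth rate a, death rate b·x) is chosen because its stationary distribution is known to be Poisson with mean a/b, so the truncation error can be checked directly.

```python
import math

def truncated_stationary(birth, death, r):
    """Stationary distribution of a birth-death chain restricted to {0, ..., r-1}.

    birth(x) and death(x) are hypothetical rate functions (assumptions of this
    sketch). Births out of state r-1 are suppressed, so the truncated chain is
    a finite birth-death chain and detailed balance yields its stationary
    distribution.
    """
    weights = [1.0]
    for x in range(r - 1):
        # Detailed balance: pi(x) * birth(x) = pi(x + 1) * death(x + 1).
        weights.append(weights[-1] * birth(x) / death(x + 1))
    total = sum(weights)
    return [w / total for w in weights]

# Example: birth rate a, death rate b * x; the exact answer is Poisson(a / b).
a, b, r = 10.0, 1.0, 40
approx = truncated_stationary(lambda x: a, lambda x: b * x, r)
exact = [math.exp(-a / b) * (a / b) ** x / math.factorial(x) for x in range(r)]

# l1 distance to the exact distribution: the error inside the truncation plus
# the exact mass that lies outside it.
l1_error = sum(abs(p - q) for p, q in zip(approx, exact)) + (1.0 - sum(exact))
```

In this example the error is governed entirely by the mass that the exact distribution places outside the truncation, which is why a verifiable bound on that tail mass is the quantity of interest.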
... In the linear case, we obtain analytical formulas for the mean extinction time. In the nonlinear case, we design a numerical scheme based on a rigorous finite state projection (see Munsky and Khammash 2006; Kuntz 2017) and coupling techniques to assess the mean extinction time. In both cases, we study the sensitivity of the extinction time, as well as of the proliferative cell number at extinction time, with respect to the parameter values. ...
... 2. For E[C_τ], we take, for all c ∈ N, g_0(c) = c and α = 0. We can notice that system (13), which is similar to the Kolmogorov backward equation, is unclosed, and there exists no analytical solution. We now obtain a numerical estimate for the scalar g(f_0, 0) using a domain truncation method, as proposed in Munsky and Khammash (2006) and Kuntz (2017). ...
... We have studied in detail the extinction time of the precursor cell population, and designed an algorithm to compute numerically both the mean extinction time and the mean number of proliferative cells at the extinction time. The algorithm is based on a domain truncation similar to the finite state projection (FSP) method proposed in Munsky and Khammash (2006) and Kuntz (2017). The FSP approach aims to approximate the law of the process at a given time by solving a truncated version of the Kolmogorov forward system. ...
Article
Full-text available
In mammals, female germ cells are sheltered within somatic structures called ovarian follicles, which remain in a quiescent state until they get activated, throughout reproductive life. We investigate the sequence of somatic cell events occurring just after follicle activation, starting with the awakening of precursor somatic cells, and their transformation into proliferative cells. We introduce a nonlinear stochastic model accounting for the joint dynamics of the two cell types, and allowing us to investigate the potential impact of a feedback from proliferative cells onto precursor cells. To tackle the key issue of whether cell proliferation is concomitant or posterior to cell awakening, we assess both the time needed for all precursor cells to awake, and the corresponding increase in the total cell number with respect to the initial cell number. Using the probabilistic theory of first passage times, we design a numerical scheme based on a rigorous finite state projection and coupling techniques to compute the mean extinction time and the cell number at extinction time. We find that the feedback term clearly lowers the number of proliferative cells at the extinction time. We calibrate the model parameters using an exact likelihood approach. We carry out a comprehensive comparison between the initial model and a series of submodels, which helps to select the critical cell events taking place during activation, and suggests that awakening is prominent over proliferation.
... The validity of the finite state projection has already been proven in [58] and subsequent work. More recently, Thomas Kuntz's thesis work has tackled the characterization of a Markov chain exit from a domain (absorbing frontier) [59,60]. This approach is detailed in Chapter 2 where we develop a rigorous method based on it to simulate hitting times (specifically, extinction times). ...
... In the linear case (β_1 = 0), we obtain analytical formulas for the mean extinction time. In the nonlinear case, we design a numerical scheme based on a rigorous Finite State Projection (see [58,59]) and coupling techniques to assess the mean extinction time. In both cases, we study the sensitivity of the extinction time, as well as the cell number at extinction time, with respect to the parameter values. ...
... which is similar to the Kolmogorov backward equation, is unclosed, and there exists no analytical solution. We can obtain a numerical estimate for the scalar g(f_0, 0) using a domain truncation method, as proposed in [58,59]. ...
Thesis
This thesis aims to design and analyze population dynamics models dedicated to the dynamics of somatic cells during the early stages of ovarian follicle growth. The model behaviors are analyzed through theoretical and numerical approaches, and the calibration of parameters is performed by proposing maximum likelihood strategies adapted to our specific dataset. A non-linear stochastic model, that accounts for the joint dynamics of two cell types (precursors and proliferative), is dedicated to the activation of follicular growth. In particular, we compute the extinction time of precursor cells. A rigorous finite state projection approach is implemented to characterize the system state at extinction. A linear multitype age-structured model for the proliferative cell population is dedicated to the early follicle growth. The different types correspond here to the spatial cell positions. This model is of decomposable kind; the transitions are unidirectional from the first to the last spatial type. We prove the long-term convergence for both the stochastic Bellman-Harris model and the multi-type McKendrick-VonFoerster equation. We adapt existing results in a context where the Perron-Frobenius theorem does not apply, and obtain explicit analytical formulas for the asymptotic moments of cell numbers and stable age distribution. We also study the well-posedness of the inverse problem associated with the deterministic model.
... Among these are methods that approximate the distribution only within a given finite subset (a truncation) of the state space, neglecting the rest. Such truncation-based schemes date back to the late 60s [149,150] and several ways to solve the truncated problem have been proposed over the last decade [12,133,36,74,106,113,115,114,86,25,24,69,108,107,34,156,99,98,95,13,35]. How to compute and control the approximation errors introduced by these schemes is a matter of ongoing research. ...
... are any increasing truncations approaching the state space (3.2). These notions of convergence form a hierarchy [95, Chap. 5]: if w is positive and norm-like, then convergence (3.8) in w-norm ⇒ w-weakly* ⇒ in ℓ¹ ⇔ in total variation ⇒ pointwise, for any sequence of points in the space ℓ¹. If the approximations π_r are probability distributions (i.e. ...
... see Section 5), truncations with small tail masses typically also result in small scheme-specific errors for the schemes studied in Section 4. Choosing a truncation with a verifiably small tail mass is a difficult task. We know of two ways to systematically generate truncations accompanied by bounds on their tail masses: using Foster-Lyapunov criteria [34,36,156] or using moment bounds [99,95,98]. For the sake of simplicity, we focus here on the latter and leave the former to Remark 3.1 below. ...
Preprint
Full-text available
Computing the stationary distributions of a continuous-time Markov chain involves solving a set of linear equations. In most cases of interest, the number of equations is infinite or too large, and the equations cannot be solved analytically or numerically. Several approximation schemes overcome this issue by truncating the state space to a manageable size. In this review, we first give a comprehensive theoretical account of the stationary distributions and their relation to the long-term behaviour of the Markov chain, which is readily accessible to non-experts and free of irreducibility assumptions made in standard texts. We then review truncation-based approximation schemes paying particular attention to their convergence and to the errors they introduce, and we illustrate their performance with an example of a stochastic reaction network of relevance in biology and chemistry. We conclude by elaborating on computational trade-offs associated with error control and some open questions.
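The citing passages for this review mention that moment bounds can be used to generate truncations accompanied by bounds on their tail masses. A minimal sketch of that reasoning, assuming a norm-like function w and a known constant c with Σ w(x)π(x) ≤ c: Markov's inequality gives π({x : w(x) ≥ r}) ≤ c/r, so choosing r ≥ c/ε yields a truncation {x : w(x) < r} whose tail mass is at most ε. The function name `truncation_from_moment_bound` and the example (a Poisson distribution with mean 10 and w(x) = x², whose second moment is known exactly) are assumptions made for illustration.

```python
import math

def truncation_from_moment_bound(w, moment_bound, eps, candidates):
    """Return (r, states) with states = {x in candidates : w(x) < r}.

    Assumes sum_x w(x) * pi(x) <= moment_bound for the target distribution pi.
    By Markov's inequality, pi({x : w(x) >= r}) <= moment_bound / r, so taking
    r >= moment_bound / eps keeps the tail mass below eps.
    """
    r = math.ceil(moment_bound / eps)
    return r, [x for x in candidates if w(x) < r]

# Illustration: pi is Poisson with mean 10, w(x) = x^2, and the "bound" is the
# exact second moment mean^2 + mean (in practice it would be an upper bound).
mean = 10.0
c = mean ** 2 + mean
eps = 0.01
r, states = truncation_from_moment_bound(lambda x: x * x, c, eps, range(1000))

def poisson_pmf(x):
    return math.exp(-mean) * mean ** x / math.factorial(x)

# Actual tail mass of the truncation, for comparison with the guarantee eps.
tail = 1.0 - sum(poisson_pmf(x) for x in states)
```

The guarantee is typically conservative: the actual tail mass in this example is far smaller than ε, which illustrates the trade-off between the size of the truncation and the sharpness of the a priori error control.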
... bounds on ρ(w) for norm-like functions w, as in (2.5)-(2.6)). Such bounds can be obtained using either Foster-Lyapunov criteria [34,35,15,27,30] or mathematical programming [24,45,19,27,29,44,8,9,14]. ...
... Bounds of this type can be obtained through various analytical and numerical methods. For instance, the moment bounds for the applications in this paper can be computed using either Foster-Lyapunov criteria [15,35,34,27,30] or mathematical programming approaches, i.e. by solving linear or semidefinite programs [24,19,27,45,29,44,8,9,14]. ...
Preprint
Full-text available
We study a class of countably-infinite-dimensional linear programs (CILPs) whose feasible sets are bounded subsets of appropriately defined spaces of measures. The optimal value, optimal points, and minimal points of these CILPs can be approximated by solving finite-dimensional linear programs. We show how to construct finite-dimensional programs that lead to approximations with easy-to-evaluate error bounds, and we prove that the errors converge to zero as the size of the finite-dimensional programs approaches that of the original problem. We discuss the use of our methods in the computation of the stationary distributions, occupation measures, and exit distributions of Markov chains.
... which can be checked using Foster-Lyapunov criteria [44,38]. Note that a bound on the approximation error of ν_r (1.17) could be derived from an upper bound on the mean exit time E[τ]. ...
... Theorem 2.2 (Kolmogorov's forward equations, see Section 2.3 of [38]). Suppose that the diagonal of the rate matrix is γ-integrable: ...
... This can be achieved, for instance, by employing simulation-based criteria to guide the truncation choice (as suggested for the FSP scheme [49,57]) or by expanding the truncation efficiently based on properties of the exit distribution from the truncated space [51]. Alternatively, one could use moment bounds and Markov's inequality to guide the truncation choice and to quantify the approximation errors a priori (see [38,39,46,29] for more on this type of approach). ...
Article
Full-text available
We introduce the exit time finite state projection (ETFSP) scheme, a truncation-based method that yields approximations to the exit distribution and occupation measure associated with the time of exit from a domain (i.e., the time of first passage to the complement of the domain) of time-homogeneous continuous-time Markov chains. We prove that: (i) the computed approximations bound the measures from below; (ii) the total variation distances between the approximations and the measures decrease monotonically as states are added to the truncation; and (iii) the scheme converges, in the sense that, as the truncation tends to the entire state space, the total variation distances tend to zero. Furthermore, we give a computable bound on the total variation distance between the exit distribution and its approximation, and we delineate the cases in which the bound is sharp. We also revisit the related finite state projection scheme and give a comprehensive account of its theoretical properties. We demonstrate the use of the ETFSP scheme by applying it to two biological examples: the computation of the first passage time associated with the expression of a gene, and the fixation times of competing species subject to demographic noise.
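The related finite state projection (FSP) scheme approximates the law of the chain at a given time by solving a truncated version of the Kolmogorov forward equations. The sketch below illustrates this for a birth-death chain with birth rate a and death rate b·x on the truncation {0, ..., r-1}. The function name `fsp_birth_death`, the example rates, and the use of uniformization (a truncated Poisson series) to solve the truncated equations are assumptions of this sketch, not the implementation used in the cited work. Starting from state 0, the exact time-t law of this chain is Poisson with mean (a/b)(1 − e^{−bt}), which allows a direct check.

```python
import math

def fsp_birth_death(a, b, r, t, x0=0, tol=1e-12, max_terms=100_000):
    """FSP approximation on {0, ..., r-1} of the time-t law of a birth-death
    chain (birth rate a, death rate b * x) started at x0, via uniformization."""
    lam = a + b * (r - 1)          # uniformization rate: largest exit rate in the truncation
    p = [0.0] * r                  # p holds p0 * P^k for the substochastic P = I + Q_r / lam
    p[x0] = 1.0
    weight = math.exp(-lam * t)    # Poisson(lam * t) probability of k = 0 jumps
    approx = [weight * v for v in p]
    acc = weight                   # accumulated Poisson weights
    k = 0
    while 1.0 - acc > tol and k < max_terms:
        new = [0.0] * r
        for x in range(r):
            up, down = a / lam, b * x / lam
            new[x] += p[x] * (1.0 - up - down)
            if x + 1 < r:
                new[x + 1] += p[x] * up   # a birth out of state r-1 leaves the truncation and is dropped
            if x > 0:
                new[x - 1] += p[x] * down
        p = new
        k += 1
        weight *= lam * t / k
        acc += weight
        for x in range(r):
            approx[x] += weight * p[x]
    return approx

a, b, t, r = 10.0, 1.0, 1.0, 40
approx = fsp_birth_death(a, b, r, t)

# Missing mass: probability of having left the truncation by time t, plus the
# neglected remainder of the Poisson series (at most tol).
fsp_error = 1.0 - sum(approx)

m = (a / b) * (1.0 - math.exp(-b * t))
exact = [math.exp(-m) * m ** x / math.factorial(x) for x in range(r)]
```

Because every term in the series is non-negative and mass leaving the truncation is never returned, the computed values bound the exact probabilities from below, and the missing mass `fsp_error` serves as a computable error estimate.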
... Here, we present two different mathematical programming approaches that yield bounds on, and approximations of, the stationary solutions of the CME. Our first approach builds on our previous work [26,32,33], and uses semidefinite programming to obtain upper and lower bounds on the moments of stationary solutions of networks with polynomial and rational propensities. The scheme constrains the possible solutions of a truncated, underdetermined set of moment equations by appending semidefinite inequalities that are satisfied by all probability distributions on the state space. ...
... Furthermore, the rational moments satisfy well-known semidefinite inequalities [33,41,42]. Specifically, the localising matrices are positive semidefinite: ...
... When w is norm-like, convergence in weak* implies convergence in total variation; see Ref. 33 (Remark 5.10). ...
Article
Full-text available
The stochastic dynamics of biochemical networks are usually modelled with the chemical master equation (CME). The stationary distributions of CMEs are seldom solvable analytically, and numerical methods typically produce estimates with uncontrolled errors. Here, we introduce mathematical programming approaches that yield approximations of these distributions with computable error bounds which enable the verification of their accuracy. First, we use semidefinite programming to compute increasingly tight upper and lower bounds on the moments of the stationary distributions for networks with rational propensities. Second, we use these moment bounds to formulate linear programs that yield convergent upper and lower bounds on the stationary distributions themselves, their marginals and stationary averages. The bounds obtained also provide a computational test for the uniqueness of the distribution. In the unique case, the bounds form an approximation of the stationary distribution with a computable bound on its error. In the non-unique case, our approach yields converging approximations of the ergodic distributions. We illustrate our methodology through several biochemical examples taken from the literature: Schlögl's model for a chemical bifurcation, a two-dimensional toggle switch, a model for bursty gene expression, and a dimerisation model with multiple stationary distributions.
... Theorem 10 as given here seems to have been first shown in (Kuntz, 2017), although I would take this with a grain of salt. Lemma 11 which makes up the bulk of the theorem's proof goes back to (Chung, 1967), if not earlier. ...
... Notes and references. The treatment in this section follows that in (Kuntz, 2017;?). However, the simple ideas underpinning the above characterisation are classical. ...
Preprint
Full-text available
A rigorous and largely self-contained account of (a) the bread-and-butter concepts and techniques in Markov chain theory and (b) the long-term behaviour of chains. As much as possible, the treatment is probabilistic instead of analytical (I stay away from semigroup theory). Personally, I tend to find that the intuition lies with the former and not the latter. This manuscript is geared towards those interested in the use of Markov chains as models of real-life phenomena. For this reason, I focus on the type of chains most commonly encountered in practice (time-homogeneous, minimal, and right-continuous in the discrete topology) and choose a starting point very familiar to this audience: the (Kendall-) Gillespie Algorithm commonly used to simulate these chains. In order to keep the prerequisite knowledge and technical complications to a minimum, I take a 'bare-bones' approach that keeps the focus on chains (instead of more general processes) to an almost pathological degree: I use the 'jump and hold' structure of chains extensively; almost no martingale theory; minimal coupling; no stochastic calculus; and, even though regeneration and renewal (of course!) feature in the manuscript, they do so exclusively in the context of chains. I have also taken some extra steps to avoid imposing certain assumptions encountered in other texts that sometimes prove to be stumbling blocks in practice (e.g., irreducibility of the state space, boundedness of test functions and of stopping times). For more details, see the preface.
Article
Full-text available
We consider the problem of computing first-passage time distributions for reaction processes modeled by master equations. We show that this generally intractable class of problems is equivalent to a sequential Bayesian inference problem for an auxiliary observation process. The solution can be approximated efficiently by solving a closed set of coupled ordinary differential equations (for the low-order moments of the process) whose size scales with the number of species. We apply it to an epidemic model and a trimerization process and show good agreement with stochastic simulations.
Article
Full-text available
The chemical master equation (CME) is frequently used in systems biology to quantify the effects of stochastic fluctuations that arise due to biomolecular species with low copy numbers. The CME is a system of ordinary differential equations that describes the evolution of probability density for each population vector in the state-space of the stochastic reaction dynamics. For many examples of interest, this state-space is infinite, making it difficult to obtain exact solutions of the CME. To deal with this problem, the Finite State Projection (FSP) algorithm was developed by Munsky and Khammash (Jour. Chem. Phys. 2006), to provide approximate solutions to the CME by truncating the state-space. The FSP works well for finite time-periods but it cannot be used for estimating the stationary solutions of CMEs, which are often of interest in systems biology. The aim of this paper is to develop a version of FSP which we refer to as the stationary FSP (sFSP) that allows one to obtain accurate approximations of the stationary solutions of a CME by solving a finite linear-algebraic system that yields the stationary distribution of a continuous-time Markov chain over the truncated state-space. We establish that these approximations are guaranteed to converge to the exact stationary solution as the truncated state-space expands to the full state-space. We provide several examples to illustrate our sFSP method and demonstrate its efficiency in estimating the stationary distributions. In particular, we show that using a quantized tensor train (QTT) implementation of our sFSP method, problems admitting more than 100 million states can be efficiently solved.
Article
The method of moments has been proposed as a potential means to reduce the dimensionality of the chemical master equation (CME) appearing in stochastic chemical kinetics. However, attempts to apply the method of moments to the CME usually result in the so-called closure problem. Several authors have proposed moment closure schemes, which allow them to obtain approximations of quantities of interest, such as the mean molecular count for each species. However, these approximations have the dissatisfying feature that they come with no error bounds. This paper presents a fundamentally different approach to the closure problem in stochastic chemical kinetics. Instead of making an approximation to compute a single number for the quantity of interest, we calculate mathematically rigorous bounds on this quantity by solving semidefinite programs. These bounds provide a check on the validity of the moment closure approximations and are in some cases so tight that they effectively provide the desired quantity. In this paper, the bounded quantities of interest are the mean molecular count for each species, the variance in this count, and the probability that the count lies in an arbitrary interval. At present, we consider only steady-state probability distributions, intending to discuss the dynamic problem in a future publication.
Article
The method of moments has been proposed as a potential means to reduce the dimensionality of the chemical master equation (CME) appearing in stochastic chemical kinetics. However, attempts to apply the method of moments to the CME usually result in the so-called closure problem. Several authors have proposed moment closure schemes, which allow them to obtain approximations of quantities of interest, such as the mean count of molecules of each species. However, these approximations have the dissatisfying feature that they come with no error bounds. Recently, a method was proposed for calculating rigorous bounds on quantities of interest for stochastic chemical kinetic systems at steady state. In particular, the method used semidefinite programming to calculate bounds on mean molecular counts, variances in these counts, and probability histograms. In this paper, we extend that idea to the associated dynamic problem -- calculating rigorous time-varying bounds on means and variances.
Article
This book concerns discrete-time homogeneous Markov chains that admit an invariant probability measure. The main objective is to give a systematic, self-contained presentation on some key issues about the ergodic behavior of that class of Markov chains. These issues include, in particular, the various types of convergence of expected and pathwise occupation measures, and ergodic decompositions of the state space. Some of the results presented appear for the first time in book form. A distinguishing feature of the book is the emphasis on the role of expected occupation measures to study the long-run behavior of Markov chains on uncountable spaces. The intended audience are graduate students and researchers in theoretical and applied probability, operations research, engineering and economics.
Book
The book contains review articles on recent advances in first-passage phenomena and applications contributed by leading international experts. It is intended for graduate students and researchers who are interested in learning about this intriguing and important topic. © 2014 by World Scientific Publishing Co. Pte. Ltd. All rights reserved.
Article
Model-based prediction of stochastic noise in biomolecular reactions often resorts to approximation with unknown precision. As a result, unexpected stochastic fluctuation causes a headache for the designers of biomolecular circuits. This paper proposes a convex optimization approach to quantifying the steady state moments of molecular copy counts with theoretical rigor. We show that the stochastic moments lie in a convex semi-algebraic set specified by linear matrix inequalities. Thus, the upper and the lower bounds of some moments can be computed by a semidefinite program. Using a protein dimerization process as an example, we demonstrate that the proposed method can precisely predict the mean and the variance of the copy number of the monomer protein.