
arXiv:1006.4425v1 [math.PR] 23 Jun 2010

On-the-fly Uniformization of Time-Inhomogeneous Infinite Markov Population Models

Aleksander Andreychenko, Pepijn Crouzen, and Verena Wolf
Computer Science, Saarland University, Saarbrücken, Germany
{andreychenko, crouzen, wolf}@cs.uni-saarland.de

Abstract—This paper presents an on-the-fly uniformization technique for the analysis of time-inhomogeneous Markov population models. This technique is applicable to models with infinite state space and unbounded rates, which are, for instance, encountered in the realm of biochemical reaction networks. To deal with the infinite state space, we dynamically maintain a finite subset of the states where most of the probability mass is located. This approach yields an under-approximation of the original, infinite system. We present experimental results to show the applicability of our technique.

I. INTRODUCTION

Markov population models (MPMs) are continuous-time

Markov processes, where the state of the system is a vector

of natural numbers (i.e., the populations). Such models are

used in various application domains: biology, where the state

variables describe the population sizes of different organisms,

queueing theory, where we model a state as a vector of queue

occupancies, and chemistry, where the state variables represent the number of molecules of different chemical species, etc. [9].

Besides the expectations and variances of the different

populations, the probabilities of certain events occurring can

be of interest when studying MPMs. It may be necessary

to know the probability of the extinction of a species, the

probability that a population reaches a certain threshold, or

even the full distribution of the MPM at a certain time-point,

for instance to calibrate model parameters.

Many Markov population models have inﬁnitely many

states. In the case of biological or chemical applications, we

normally cannot provide hard upper bounds for population

numbers and in the ﬁeld of queueing theory it may be

interesting to consider unbounded queues. The evaluation of

inﬁnite MPMs through numerical [3] or statistical [5] analysis

has been well-studied for time-homogeneous models where the

dynamics of the system are independent of time.

However, we also ﬁnd many time-inhomogeneous Markov

models, where the dynamics of the system do indeed change

over time. When modeling an epidemic, we may have to take

into account that infection rates vary seasonally. For trafﬁc

models, time-dependent arrival rates can be used to model the

morning and evening rush hours. In cellular biology we see

that reaction propensities depend on the cell volume, which waxes and wanes as the cell grows and divides. The class

of ﬁnite time-inhomogeneous Markov models has also been

studied in recent years [2], [13], [14].

In this paper, we develop a numerical algorithm to ap-

proximate transient probability distributions (i.e., the proba-

bility to be in a certain state at a certain time) for inﬁnite

time-inhomogeneous MPMs. We consider MPMs with state-

dependent rates and do not require the existence of an upper-

bound for the transition rates in the MPM.

Our algorithm is based on the uniformization technique,

which is a well-known method to approximate the tran-

sient probability distribution of ﬁnite time-homogeneous

Markov models [8], [7]. Recently, two adaptations of uni-

formization have been developed. These adaptations re-

spectively approximate the transient probabilities for ﬁnite

time-inhomogeneous [2] and inﬁnite time-homogeneous [3]

Markov models. Our algorithm combines and reﬁnes these two

techniques such that inﬁnite time-inhomogeneous MPMs with

unbounded rates can be tackled. We present two case studies

to investigate the effectiveness of our approach.

II. MARKOV POPULATION MODELS

Markov chains with large or even infinite state spaces are usually described by some high-level modeling formalism that allows the generation of a (possibly infinite) set of states and transitions. Here, we use transition classes to specify a Markov population model, that is, a continuous-time Markov chain (CTMC) {X(t), t ≥ 0} with state space S = Z_+^n = {0, 1, ...}^n, where the i-th state variable represents the number of instances of the i-th species. Depending on the application area, "species" stands for types of system components, molecules, customers, etc. The application areas that we have in mind are chemical reaction networks, performance evaluation of computer systems, logistics, epidemics, etc. [9].

Definition 1 (Transition Class): A transition class τ is a triple (G, v, α), where G ⊆ Z_+^n is the guard, v ∈ Z^n is the change vector, and α : G × R_{≥0} → R_{≥0} is the rate function. The guard is the set of states where an instance of τ is possible, and if the current state is x ∈ G then x + v ∈ Z_+^n is the state after an instance of τ has occurred. The rate α(x, t) determines the time-dependent transition probabilities for an infinitesimal time step dt:

Pr(X(t + dt) = x + v | X(t) = x) = α(x, t) · dt.

A CTMC X can be specified by a set of m transition classes τ_1, ..., τ_m as follows. For j ∈ {1, ..., m}, let τ_j = (G_j, v_j, α_j). We define the generator matrix Q(t) of X such that the row that describes the transitions of a state x has entry α_j(x, t) at position Q(t)_{x, x+v_j} whenever x ∈ G_j. Moreover, the diagonal entries of Q(t) are the negative sums of the off-diagonal row entries because the row sums of a generator matrix are zero. We assume that each change vector v_j has at least one non-zero entry.
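A transition-class specification translates into code almost literally. The following minimal Python sketch (illustrative only — the paper's own implementation is in C++, and all names here are ours) represents a transition class as a guard, a change vector, and a rate function, and assembles the non-zero entries of one row of Q(t):

```python
# Minimal sketch of transition classes (Definition 1); illustrative names.
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[int, ...]

@dataclass
class TransitionClass:
    guard: Callable[[State], bool]          # G: states where the transition is possible
    change: Tuple[int, ...]                 # v: change vector
    rate: Callable[[State, float], float]   # alpha(x, t): time-dependent rate function

def generator_row(x: State, classes, t: float):
    """Non-zero entries of row x of Q(t): off-diagonal entries alpha_j(x, t)
    at position x + v_j, and the diagonal entry as the negative sum of the
    off-diagonal row entries."""
    row = {}
    for tc in classes:
        if tc.guard(x):
            y = tuple(xi + vi for xi, vi in zip(x, tc.change))
            row[y] = row.get(y, 0.0) + tc.rate(x, t)
    row[x] = -sum(row.values())
    return row
```

Because each change vector has a non-zero entry, x + v_j never collides with the diagonal entry of x itself.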

Example 1: We consider a simple gene expression model for E. coli cells [12]. It consists of the transcription of a gene into messenger RNA (mRNA) and subsequent translation of the latter into proteins. A state of the system is uniquely determined by the number of mRNA and protein molecules, that is, a state is a pair (x_R, x_P) ∈ Z_+^2. We assume that initially there are no mRNA molecules and no proteins in the system, i.e., Pr(X(0) = (0, 0)) = 1. Four types of reactions occur in the system. Let j ∈ {1, ..., 4} and let τ_j = (G_j, u_j, α_j) be the transition class that describes the j-th reaction type. We first define the guard sets G_1, ..., G_4 and the update functions u_1, ..., u_4.

• Transition class τ_1 models gene transcription. The corresponding stoichiometric equation is ∅ → mRNA. If a τ_1-transition occurs, the number of mRNA molecules increases by one. Thus, u_1(x_R, x_P) = (x_R + 1, x_P). This transition class is possible in all states, i.e., G_1 = Z_+^2.
• We represent the translation of mRNA into protein by τ_2 (mRNA → mRNA + P). A τ_2-transition is only possible if there is at least one mRNA molecule in the system. We set G_2 = {(x_R, x_P) ∈ Z_+^2 | x_R > 0} and u_2(x_R, x_P) = (x_R, x_P + 1). Note that in this case mRNA is a reactant that is not consumed.
• Both mRNA and protein molecules can degrade, which is modeled by τ_3 and τ_4 (mRNA → ∅ and P → ∅). Hence, G_3 = G_2, G_4 = {(x_R, x_P) ∈ Z_+^2 | x_P > 0}, u_3(x_R, x_P) = (x_R − 1, x_P), and u_4(x_R, x_P) = (x_R, x_P − 1).

Let k_1, k_2, k_3, k_4 be positive real-valued constants. We assume that transcription happens at rate α_1(x_R, x_P, t) = k_1 · V(t), that is, the rate is proportional to the cell volume V(t) [15]. The (time-independent) translation rate depends linearly on the number of mRNA molecules. Therefore, α_2(x_R, x_P, t) = k_2 · x_R. Finally, for degradation, we set α_3(x_R, x_P, t) = k_3 · x_R and α_4(x_R, x_P, t) = k_4 · x_P.
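The four transition classes of this model can be written down as (guard, change vector, rate) triples. In the Python sketch below, the constants k1, ..., k4 and the volume function V(t) are illustrative placeholders, not calibrated parameters from the paper:

```python
# Gene expression model of Example 1 as transition classes; the rate
# constants and the volume function are placeholder values for illustration.
k1, k2, k3, k4 = 0.05, 0.1, 0.005, 0.002
V = lambda t: 1.0 + t / 3600.0        # linear volume growth over one cell cycle

classes = [
    # guard,              v,        alpha(x, t)
    (lambda x: True,      (1, 0),   lambda x, t: k1 * V(t)),   # transcription
    (lambda x: x[0] > 0,  (0, 1),   lambda x, t: k2 * x[0]),   # translation
    (lambda x: x[0] > 0,  (-1, 0),  lambda x, t: k3 * x[0]),   # mRNA degradation
    (lambda x: x[1] > 0,  (0, -1),  lambda x, t: k4 * x[1]),   # protein degradation
]

def exit_rate(x, t):
    """alpha_0(x, t): sum of the rates of all transition classes enabled in x."""
    return sum(rate(x, t) for guard, _, rate in classes if guard(x))
```

The exit rate computed here is the quantity that the uniformization rate Λ of Section III has to dominate.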

We now discuss the transient probability distribution of an MPM. Let S be the state space of X and let the transition function P(t, t+Δ) be such that the entry for the pair (x, y) of states equals

P(t, t+Δ)_{x,y} = Pr(X(t+Δ) = y | X(t) = x),  t, Δ ≥ 0.

If the initial probabilities Pr(X(0) = x) are specified for each x ∈ S, the transient state probabilities p^(t)(x) := Pr(X(t) = x) are given by

p^(t)(y) = Σ_{x∈S} p^(0)(x) · P(0, t)_{x,y}.

We assume that a transition class description uniquely specifies a CTMC and rule out "pathological cases" by assuming that the sample paths X(t)(ω) are right-continuous step functions. In this case the transition functions are the unique solution of the Kolmogorov backward and forward equations

(d/dt) P(t_0, t) = Q(t) · P(t_0, t),   (1)
(d/dt) P(t_0, t) = P(t_0, t) · Q(t),   (2)

where 0 ≤ t_0 ≤ t. Multiplication of Eq. (2) with the row vector p^(t) with entries p^(t)(x) gives

(d/dt) p^(t) = p^(t) · Q(t).   (3)

If S is finite, algorithms for the computation of p^(t) are usually based on the numerical integration of the linear system of differential equations in Eq. (3) with initial condition p^(0).

Here, we focus on another approach called uniformization that

is widely used for time-homogeneous Markov chains [8]. It

has been adapted for time-inhomogeneous Markov chains by

Van Dijk [13] and subsequently improved [14], [2]. The main

advantage of solution techniques based on uniformization is

that they provide an underapproximation of the vector p(t)

and, thus, provide tight error bounds. Moreover, they are

numerically stable and often superior to numerical integration

methods in terms of running times [11].

III. UNIFORMIZATION

Uniformization is based on the idea to construct, for a CTMC X, a Poisson process N(t), t ≥ 0, and a subordinated discrete-time Markov chain (DTMC) Y(i), i ∈ N, such that for all x and for all t

Pr(X(t) = x) = Pr(Y(N(t)) = x).   (4)

For a finite time-homogeneous MPM with state space S, the rate Λ of the Poisson process N (also called the uniformization rate) is chosen to be greater than or equal to the maximal exit rate appearing in X:

Λ ≥ max_{x∈S} Σ_{j=1}^m α_j(x).

For the DTMC Y we find the transition probabilities

Pr(Y(i+1) = x + v_j | Y(i) = x) = α_j(x)/Λ.
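For a small finite time-homogeneous chain, this construction can be carried out directly. The Python sketch below (illustrative, not the paper's implementation) computes p(t) = Σ_i Pois(Λt; i) · p(0)·Pⁱ with P = I + Q/Λ, using a crude truncation criterion in place of the Fox-Glynn algorithm that the paper employs later:

```python
# Classical uniformization for a small finite, time-homogeneous CTMC:
# p(t) = sum_i Poisson(Lambda*t; i) * p(0) * P^i  with  P = I + Q/Lambda.
import math

def uniformize(p0, Q, t, eps=1e-10):
    n = len(p0)
    Lam = max(-Q[i][i] for i in range(n))   # uniformization rate >= max exit rate
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / Lam for j in range(n)]
         for i in range(n)]
    result = [0.0] * n
    v = list(p0)                            # v = p(0) * P^i, updated iteratively
    poisson = math.exp(-Lam * t)            # Poisson(Lambda*t; 0)
    i, acc = 0, 0.0
    while acc < 1.0 - eps:                  # stop once 1 - eps mass is accumulated
        result = [r + poisson * vi for r, vi in zip(result, v)]
        acc += poisson
        v = [sum(v[k] * P[k][j] for k in range(n)) for j in range(n)]
        i += 1
        poisson *= Lam * t / i              # next Poisson weight
    return result
```

For the two-state chain with Q = [[−1, 1], [0, 0]], the survival probability of state 0 at time t is e^{−t}, which the sketch reproduces.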

When X is time-inhomogeneous, Arns et al. [2] suggest to define the time-dependent uniformization rate Λ(t) of N as

Λ(t) ≥ max_{x∈S} Σ_{j=1}^m α_j(x, t).   (5)

The (time-dependent) transition probabilities of the DTMC Y are then such that α_j(x, t)/Λ(t) is the probability to enter state x + v_j from state x if a state change occurs at time t. Arns et al. prove that Eq. (4) is true if the α_j are (right- or left-) continuous functions in t and if S is finite (see Theorem 7 in [2]). Here, we relax the latter condition and allow S to be infinite. If sup_{x∈S} Σ_j α_j(x, t) < ∞ during the time interval of interest, the proof of Eq. (4) goes along similar lines. If, however, sup_{x∈S} Σ_j α_j(x, t) = ∞, then the Poisson process N is not well-defined, as its rate must be infinite according to Eq. (5). Therefore, the infinite state space has to be truncated in an appropriate way.

A. State Space Truncation

We consider a time interval [t, t+Δ) of length Δ, where the transient distribution at time t, p^(t), of the infinite time-inhomogeneous MPM X is known. We now wish to approximate the transient distribution at time t+Δ, p^(t+Δ). We assume that p^(t) has finite support S_{t,0}. Define Pr(N(t, t+Δ) = i) = Pr(N(t+Δ) − N(t) = i) as the probability that N performs i steps within [t, t+Δ). For a fixed positive ε ≪ 1, let R and the rate function Λ be such that S_{t,R} is the set of states that are reachable from the set S_{t,0} within at most R transitions, where R is the minimal number of steps that N performs within [t, t+Δ) with probability 1 − ε, i.e.,

Σ_{i=0}^R Pr(N(t, t+Δ) = i) ≥ 1 − ε.   (6)

Furthermore, we have that the rate of N at time t′ ∈ [t, t+Δ) must satisfy

Λ(t′) ≥ max_{x∈S_{t,R}} Σ_{j=1}^m α_j(x, t′).   (7)

Note that Λ(t′) is adaptive and depends on t′, t, Δ, S_{t,0}, and R, as opposed to Arns et al., where Λ(t′) depends only on t′, t, and Δ.

Finding appropriate values for Δ and R is non-trivial, as Λ(t′) determines the speed of the Poisson process N and thereby influences the value of R. On the other hand, R determines the size of the set S_{t,R} and thus influences Λ(t′). We discuss how to find appropriate choices for Δ and R given the set S_{t,0} in Section IV-A.

Assume that we find Δ and R with the above-mentioned properties and define Λ(t′) as in Eq. (7). Then, for all x ∈ S, we get an ε-approximation

Pr(X(t+Δ) = x) ≥ Σ_{i=0}^R Pr(Y(i) = x ∧ N(t, t+Δ) = i),   (8)

where Y has initial distribution p^(t). The probabilities Pr(Y(i) = x ∧ N(t, t+Δ) = i) can now be approximated in the same way as for the finite case [2].

From Eq. (8) we see that it is beneficial if R is small, since this means fewer probabilities have to be computed on the right-hand side of Eq. (8). Note that the truncation point R is small when the uniformization rates Λ(t′) are small during [t, t+Δ), because if N jumps at a slower rate then Pr(N(t, t+Δ) > i) becomes smaller. Thus, it is beneficial to choose Λ(t′) as small as possible while still satisfying Eq. (7).
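The truncation point R of Eq. (6) is simply the (1−ε)-quantile of a Poisson distribution. A naive Python stand-in (the paper later uses the more careful Fox-Glynn algorithm for this) is:

```python
# Smallest right truncation point R with sum_{i<=R} Poisson(mu; i) >= 1 - eps
# (Eq. (6)); a naive stand-in for the Fox-Glynn algorithm referenced later.
import math

def right_truncation_point(mu, eps):
    term = math.exp(-mu)      # Poisson(mu; 0)
    acc, i = term, 0
    while acc < 1.0 - eps:
        i += 1
        term *= mu / i        # Poisson(mu; i) from Poisson(mu; i-1)
        acc += term
    return i
```

The recurrence avoids factorials; for large μ a robust implementation would work in a scaled regime, which is exactly what Fox-Glynn provides.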

B. Bounding approach

Let p̂^(t+Δ)(x) denote the right-hand side of Eq. (8), i.e., the approximation of the transient probability of state x at time t+Δ. We compute this approximation with the uniformization method as follows. The processes Y and N are independent, which implies that

Pr(Y(i) = x ∧ N(t, t+Δ) = i) = Pr(Y(i) = x) · Pr(N(t, t+Δ) = i).

The probabilities Pr(N(t, t+Δ) = i) follow a Poisson distribution with parameter Λ̄(t, t+Δ) · Δ, where

Λ̄(t, t+Δ) = (1/Δ) ∫_t^{t+Δ} Λ(t′) dt′.

For the distribution Pr(Y(i) = x), Arns et al. suggest an underapproximation that relies on the fact that for any time point t′ ∈ [t, t+Δ) we have

α_j(x, t′)/Λ(t′) ≥ min_{t″∈[t,t+Δ)} α_j(x, t″)/Λ(t″) =: u_j(x, t, t+Δ).

Thus, for i ∈ {1, 2, ..., R}, we iteratively approximate Pr(Y(i) = y) as

Pr(Y(i) = y) ≥ Σ_{x,j : y = x+v_j} Pr(Y(i−1) = x) · u_j(x, t, t+Δ) + Pr(Y(i−1) = y) · u_0(y, t, t+Δ).   (9)

Here, x ranges over all direct predecessors of y, and the self-loop probability u_0(y, t, t+Δ) of y is given by

u_0(y, t, t+Δ) = min_{t′∈[t,t+Δ)} ( 1 − Σ_{j=1}^m α_j(y, t′)/Λ(t′) ).

Note that often we can split α_j(x, t′) into two factors λ_j(t′) and r_j(x) such that α_j(x, t′) = λ_j(t′) · r_j(x) for all t′, j, x.¹ Thus, the functions λ_j : R_{≥0} → R_{>0} contain the time-dependent part (but are state-independent) and the functions r_j : S → R_{>0} contain the state-dependent part (but are time-independent). Then each minimum defined above can be computed for all states by considering

min_{t′∈[t,t+Δ)} λ_j(t′)/Λ(t′).

In particular, if λ_j and Λ are monotone, the above minimum is easily found analytically.

The approximation in Eq. (9) implies that, for the time interval [t, t+Δ), we compute a sequence of substochastic vectors v(1), v(2), ..., v(R) to approximate the probabilities Pr(Y(i) = x). Initially we start the DTMC Y with the approximation p̂^(t) =: v(0) of the previous step. Then we compute v(i+1) from v(i) based on the transition probabilities u_j(x, t, t+Δ) for i ∈ {0, 1, ..., R−1}. Since these transition probabilities may sum up to less than one, the resulting vector v(i+1) may also sum up to less than one. Since, for the computation of p̂^(t+Δ), we weight these vectors with the Poisson probabilities and add them up, the underapproximation p̂^(t+Δ) contains an additional approximation error. In general, the larger the time period Δ, the worse the underapproximations u_j(x, t, t+Δ) are, and thus the underapproximation p̂^(t+Δ) becomes worse as well. We illustrate this effect by applying the bounding approach to our running example.

¹Note that this decomposition is always possible for chemical reaction networks where the time-dependence stems from fluctuations in reaction volume or temperature.
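In code, one step of this propagation and the Poisson weighting can be sketched as follows (Python with illustrative names; sparse dictionaries stand in for the substochastic vectors v(i)):

```python
# Bounding approach: propagate v(i) to v(i+1) with the lower-bound transition
# probabilities u_j of Eq. (9), then weight the vectors by Poisson probabilities.
# 'transitions' maps a state to (successor, u_j) pairs; 'selfloop' gives u_0.

def propagate(v, transitions, selfloop):
    nxt = {}
    for x, prob in v.items():
        nxt[x] = nxt.get(x, 0.0) + prob * selfloop(x)   # self-loop term
        for y, u in transitions(x):
            nxt[y] = nxt.get(y, 0.0) + prob * u         # flow to direct successors
    return nxt

def bound_transient(v0, transitions, selfloop, poisson_weights):
    """Underapproximation of p(t+Delta): sum of Poisson-weighted v(0..R)."""
    result, v = {}, dict(v0)
    for w in poisson_weights:                           # weights for i = 0, ..., R
        for x, prob in v.items():
            result[x] = result.get(x, 0.0) + w * prob
        v = propagate(v, transitions, selfloop)
    return result
```

Because the u_j may sum to less than one, each propagation loses mass, and the returned vector is substochastic, exactly as described above.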

Example 2: In the gene expression model of Example 1, the time-dependence is due to the volume and only affects the rate function α_1 of the first transition class. The time until an E. coli cell divides varies widely, from about 20 minutes to many hours, and depends on growth conditions. Here, we assume a cell cycle time of one hour and linear growth [1]. Thus, if at time t = 0 we consider a cell immediately after division, then the cell volume doubles after 3600 sec. Assume that Δ ≤ 3600. Then, α_1(x, t′) = k′_1 · (1 + t′/3600) for all x ∈ S. Assume we have a right truncation point R such that

Λ(t′) = max_{x_R, x_P} [ k′_1 · (1 + t′/3600) + (k_2 + k_3) · x_R + k_4 · x_P ],

where x_R and x_P range over all states (x_R, x_P) ∈ S_{0,R} and Eq. (6) holds. Then we find, for each time point t′ ∈ [0, Δ), the same state for which the exit rate α_0(x, t′) := Σ_{j=1}^m α_j(x, t′) is maximal, since the only time-dependent propensity is independent of the state variables. Let (x_R^max, x_P^max) denote this state. In general this is not the case; for instance, in the realm of chemical reaction systems the propensities of bimolecular reactions (reactions of the form A + B → ...) depend both on the cell volume and on the population numbers. For such a system we may find that different states have the maximal exit rate within the time frame [0, Δ). We discuss how to overcome this difficulty in Subsection IV-B. The transition probabilities of the DTMC Y are now defined as

u_1(x_R, x_P, 0, Δ) = min_{t′∈[0,Δ)} α_1(x_R, x_P, t′)/Λ(t′) = α_1(x, 0)/Λ(0)
  = k′_1 / [ k′_1 + (k_2 + k_3) · x_R^max + k_4 · x_P^max ]

and, for j ∈ {2, 3},

u_j(x_R, x_P, 0, Δ) = min_{t′∈[0,Δ)} α_j(x_R, x_P, t′)/Λ(t′)
  = k_j · x_R / [ k′_1 · (1 + Δ/3600) + (k_2 + k_3) · x_R^max + k_4 · x_P^max ],

u_4(x_R, x_P, 0, Δ)
  = k_4 · x_P / [ k′_1 · (1 + Δ/3600) + (k_2 + k_3) · x_R^max + k_4 · x_P^max ].

For the self-loop probability we find

u_0(x_R, x_P, 0, Δ) = min_{t′∈[0,Δ)} ( 1 − Σ_{j=1}^4 α_j(x_R, x_P, t′)/Λ(t′) )
  = 1 − max_{t′∈[0,Δ)} Σ_{j=1}^4 α_j(x_R, x_P, t′)/Λ(t′)
  = 1 − Σ_{j=1}^4 α_j(x_R, x_P, Δ)/Λ(Δ)
  = 1 − [ k′_1 · (1 + Δ/3600) + (k_2 + k_3) · x_R + k_4 · x_P ] / [ k′_1 · (1 + Δ/3600) + (k_2 + k_3) · x_R^max + k_4 · x_P^max ].

We now calculate the fraction of probability lost during the computation of v(i+1) from v(i), i.e.,

1 − Σ_{j=0}^4 u_j(x_R, x_P, 0, Δ)
  = k′_1 · (1 + Δ/3600) / [ k′_1 · (1 + Δ/3600) + (k_2 + k_3) · x_R^max + k_4 · x_P^max ]
    − k′_1 / [ k′_1 + (k_2 + k_3) · x_R^max + k_4 · x_P^max ]
  = [ (k_2 + k_3) · x_R^max + k_4 · x_P^max ] / [ k′_1 + (k_2 + k_3) · x_R^max + k_4 · x_P^max ]
    − [ (k_2 + k_3) · x_R^max + k_4 · x_P^max ] / [ k′_1 · (1 + Δ/3600) + (k_2 + k_3) · x_R^max + k_4 · x_P^max ].

For Δ = 0 we have a probability loss of 0, and for Δ > 0 we can see that the probability loss increases with increasing Δ.

C. Time-stepping approach

Given that a large time horizon may lead to decreased accuracy, Arns et al. [2] suggest partitioning the time period of interest [0, t_max) into steps of length Δ. In each step, an approximation of the transient distribution at the current time instant, p̂^(t), is computed and used as the initial condition for the next step. The number of states that we consider, that is, |S_{t,R}|, grows in each step. The probabilities of all remaining states of S are approximated as zero. Thus, each step yields a vector p̂^(t+Δ) with positive entries for all states x ∈ S_{t,R} that approximate Pr(X(t+Δ) = x). The vector p̂^(t+Δ) with support S_{t,R} = S_{t+Δ,0} is then used as the initial distribution to approximate the vector p̂^(t+Δ+Δ′). See Figure 1 for a sketch of the state truncation approach. Note that the chosen time period Δ may vary between steps of the approach.

It is easy to see that the total error is the sum of the errors in

each step, where the error of a single step equals the amount of

probability mass that “got lost” due to the underapproximation.

More precisely, we have two sources of error, namely the error

due to the truncation of the inﬁnite sum in Eq. (4) and the error

due to the bounding approach that relies on Eq. (9).

In [2], Arns et al. give exact formulas for the first three terms of the sum in Eq. (8). Thus, if the approximation p̂^(t) of p^(t) is exact, then p̂^(t+Δ) is an underapproximation due to the remaining terms in Eq. (8). This implies that the smaller R becomes, the closer the error will be to the error bound ε. On the other hand, a small truncation point means that only a small time step Δ is possible (see Eq. (6)), which means that many steps are necessary until the final time instant t_max is reached. In order to explore the trade-off between running time and accuracy, we run experiments with different values for the predefined truncation point R that determines the step size Δ. We report on these experiments in Section V.

[Figure 1: three panels over states (x_1, x_2) — support at time t; truncation for the first step; truncation for the second step.]
Fig. 1. Illustration of the state space truncation approach for the two-dimensional case. Given the distribution p̂^(t) with support S_{t,0}, a truncation point R, and a time step Δ, we compute in the first step the distribution p̂^(t+Δ) with support S_{t,R} = S_{t+Δ,0}. For the next step we consider the set S_{t+Δ,R}.

IV. ON-THE-FLY ALGORITHM

As we can see in Figure 1, the number of states that are considered to compute p̂^(t_max) from p̂^(t) grows in each step, since all states within a radius of R transitions from a state in the previous set S_{t,0} are added. This makes the approach infeasible for Markov models with a large or even infinite state space because the memory requirements are too large. Therefore, we suggest using a strategy similar to that described in previous work [3] to keep the memory requirements low and achieve faster running times.

The underlying principle of this approach is to dynamically maintain a snapshot of the part of the state space where most of the transient probability mass is located. We achieve this by adding and removing states in an on-the-fly fashion. The decision which states to add and which states to remove depends on a small probability threshold δ > 0. The computation of the probabilities v(i)(x) that approximate Pr(Y(i) = x) is done without explicitly constructing the transition matrix of Y. Instead, in the implementation, a state x is represented as a record with the fields
• x.DTMC, containing the current DTMC probability v(i)(x),
• x.CTMC, containing the current CTMC probability p̂^(t)(x),
• x.income, which is initialized as zero,
• x.u_j, which contains the transition probability u_j(x, t, t+Δ) for j ∈ {1, ..., m},
as well as pointers to all direct successors x + v_j. Let

S(0) := {x : v(0)(x) > 0} = S_{t,0}

and, for i ∈ {1, ..., R}, let S(i) be the set of states that we consider to compute v(i+1) from v(i). For each state x ∈ S(i) we add the value x.DTMC · x.u_j to the field (x + v_j).income for each j ∈ {1, ..., m}, and we add x.DTMC · x.u_0 to the field x.income. Afterwards we iterate once more over all states x ∈ S(i) and set x.DTMC := x.income and x.income := 0 whenever the value x.income is greater than or equal to δ. Otherwise, if x.income < δ, we remove the state x; i.e., S(i+1) contains all states x with v(i+1)(x) ≥ δ. Similarly, if the direct successor x + v_j does not exist yet and there is probability flow from x to x + v_j, then we create it and add it to S(i+1) if the propagated probability x.DTMC · x.u_j is greater than or equal to δ. Note that x + v_j might in total receive more than δ; nevertheless, to improve the efficiency of our method, we do not create it and add it to S(i+1) unless a single propagated flow reaches δ. This strategy avoids creating many states only to test whether the sum of their incoming probability flow is large enough, and immediately deleting them because it is not.
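A Python sketch of one such on-the-fly propagation step could look as follows (illustrative names; the fields x.DTMC and x.income become dictionary entries, and successors are only materialized when a single incoming flow already reaches δ):

```python
# On-the-fly step: probability flows into 'income' entries; states whose
# accumulated income stays below delta are dropped, and successor states are
# created only when a single incoming flow already reaches delta.

def on_the_fly_step(v, transitions, selfloop, delta):
    income = {}
    for x, prob in v.items():
        income[x] = income.get(x, 0.0) + prob * selfloop(x)
        for y, u in transitions(x):
            flow = prob * u
            if y in income or y in v or flow >= delta:   # create y only for large flows
                income[y] = income.get(y, 0.0) + flow
    # keep only states whose new probability reaches the threshold delta
    return {x: p for x, p in income.items() if p >= delta}
```

The mass neglected by the threshold can be accumulated alongside to track the additional error, as described below for Eq. (10).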

For many Markov population models, the approximate on-the-fly solution leads to an enormous reduction of the memory requirements, as already reported in [3]. Moreover, it decreases the speed of the Poisson process N, since the sets S_{t′,0} and S_{t′,R} are smaller and thus the maximum in Eq. (7) is now taken over fewer states. We illustrate this effect in Figure 2. This effect is particularly important if, during an interval [t, t_max), the dynamics of the system is fast in certain parts of the state space while it is slow in other parts, where the latter contain the main part of the probability mass. On the other hand, the threshold δ introduces another approximation error, which may become large if the time horizon of interest is long. Moreover, if ρ is a bound for the error introduced by the above strategy of neglecting certain states, we can reserve a portion of ρ · Δ/t_max for the interval [t, t+Δ) and repeat the computation with a smaller threshold δ if more than the allowed portion of probability was neglected. Note that we can easily track how much probability got "lost" by adding up the probability inflow that was not added to any income field.

The approximation that we suggest above is again an underapproximation, and since the approximations suggested in the previous sections are so as well, we are still able to compute the total error of the approximation p̂^(t) of p^(t) as

1 − Σ_{x∈S_{t,R}} p̂^(t)(x).   (10)

[Figure 2: three panels over states (x_1, x_2) — support at time t; truncation for the first step and approximate support of p̂^(t+Δ); truncation for the second step.]
Fig. 2. Illustration of the on-the-fly algorithm for the two-dimensional case. Given the distribution p̂^(t) with support S_{t,0}, a truncation point R, and a time step Δ, we compute in the first step the distribution p̂^(t+Δ) with approximate support S_{t+Δ,0} ⊂ S_{t,R}. For the next step we consider the set S_{t+Δ,R}.

Clearly, t′ > t implies that the error at time t′ is higher than the error at time t. For our experimental results in Section V we choose δ ∈ {10^−10, 10^−12} and report on the total error of the approximation at time t_max.

A. Determining the step-size

Given an error bound ε > 0, a time point t for which the support of p̂^(t) is S_{t,0}, and a time point t_max for which we wish to approximate the transient probability distribution, we now discuss how to find a time step Δ such that Eqs. (6) and (7) hold. Recall that the probabilities Pr(N(t, t+Δ) = i) follow a Poisson distribution with parameter Λ̄(t, t+Δ) · Δ, which we denote by μ_{R,Δ} to emphasize the dependence on Δ and the right truncation point R. Note that the latter dependence is due to the maximum in Eq. (7), which is defined over the set S_{t,R}, the set of all states that are reachable from a state in S_{t,0} by at most R transitions. We have

μ_{R,Δ} = ∫_t^{t+Δ} Λ(t′) dt′.   (11)

Here, we propose to first choose a desired right truncation point R∗ and then find a time step Δ such that Eqs. (6) and (7) hold. We perform an iteration where in each step we systematically choose different values for Δ and compare the associated right truncation point R with R∗. Since μ_{R∗,Δ} is monotone in Δ, this can be done in a binary-search fashion as described in Algorithm 1. We start with the two bounds Δ− = 0 and Δ+ = t_max − t. The function FindMaxState(Δ, R∗) finds a state x_max such that for all time points t′ ∈ [t, t+Δ) we have

Σ_{j=1}^m α_j(x_max, t′) ≥ max_{x′∈S_{t,R∗}} Σ_{j=1}^m α_j(x′, t′).   (12)

The choice of x_max also determines the uniformization rate

Λ(t′) = Σ_{j=1}^m α_j(x_max, t′).

It immediately follows from Eq. (12) that Eq. (7) holds. In Section IV-B, we discuss why we find Λ by selecting a state x_max, and how we can implement the function FindMaxState(Δ, R∗) efficiently while avoiding that the uniformization rates Λ(t′) are chosen to be very large.

The function ComputeParameter(t, t+Δ, x_max) now computes the integral μ_{R∗,Δ} using x_max. If possible we compute the integral analytically; otherwise we use a numerical integration technique. The function FoxGlynn(μ, ε) computes the right truncation point of a homogeneous Poisson process with rate μ for a given error bound ε, i.e., the smallest positive integer R̂ such that

Σ_{i=0}^{R̂} (μ^i / i!) · e^{−μ} ≥ 1 − ε.

For the refinement of the bounds Δ− and Δ+ in lines 13–17 we exploit that R is monotone in Δ.
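The binary search of Algorithm 1 can be sketched in Python as follows. Here `mean_jumps(Delta)` is a hypothetical stand-in for ComputeParameter (the integral of Λ over [t, t+Δ)) and must be monotone in Δ; a naive Poisson quantile plays the role of FoxGlynn, and, as a simplification, we accept any Δ whose truncation point is at most R∗ rather than exactly R∗:

```python
# Simplified Algorithm 1: find the step size Delta whose Poisson right
# truncation point does not exceed a desired R*. Names are illustrative.
import math

def right_truncation_point(mu, eps):
    term, acc, i = math.exp(-mu), math.exp(-mu), 0
    while acc < 1.0 - eps:
        i += 1
        term *= mu / i
        acc += term
    return i

def find_step_size(mean_jumps, t, t_max, r_star, eps, iters=60):
    lo, hi = 0.0, t_max - t
    if right_truncation_point(mean_jumps(hi), eps) <= r_star:
        return hi                                   # whole remaining horizon fits
    for _ in range(iters):                          # bisection on Delta
        mid = (lo + hi) / 2.0
        if right_truncation_point(mean_jumps(mid), eps) <= r_star:
            lo = mid                                # feasible: try a larger step
        else:
            hi = mid                                # infeasible: shrink the step
    return lo
```

The invariant is that `lo` always satisfies the truncation condition, so the returned step size is safe even if the bisection stops early.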

B. Determining the maximal rates

The function FindMaxState(∆, R∗)in Algorithm 1 ﬁnds

a state xmax such that its exit-rate is greater or equal

than the maximal exit-rate α0(x, t′) = Pm

j=1 αj(x, t′)over

all states xin St,R∗. In principal it is enough to ﬁnd a

function Λ(t′)with this property, for instance the function

maxx∈St,R∗Pm

j=1 αj(x, t′), but this function may be hard to

determine analytically and it is also not clear how to represent

such a function practically in an implementation. Selecting a

state xmax and deﬁning Λ(t′)to be the exit-rate of this state

solves these problems.

We now present two ways of implementing the function

FindMaxState.

a) For this approach we assume that all rate functions increase

monotonically in the state variables. This is, for instance,

always the case for models from chemical kinetics. We

exploit that the change vectors are constant and deﬁne for

each dimension k∈ {1,...,n}

vmax

k:= maxj∈{1,...,m}vjk

where vjk is the k-th entry of the change vector vj. For

the set St,0we compute, the maximum value for each

dimension k∈ {1,...,n}

ymax

k:= max

y∈St,0

yk.

Input: R∗, t, t_max, ε
Output: Δ, x_max
Global: state space Ŝ, ...

1   Δ+ := t_max − t;                          // upper bound for Δ
2   x_max := FindMaxState(Δ+, R∗);
3   μ_{R∗,Δ+} := ComputeParameter(t, t + Δ+, x_max);
4   R+ := FoxGlynn(μ_{R∗,Δ+}, ε);
5   if R+ ≤ R∗ then
6       Δ := Δ+;
7   else
8       R− := 0; Δ− := 0;                     // lower bound for Δ
9       while R ≠ R∗
10          Δ := (Δ− + Δ+)/2;
11          μ_{R∗,Δ} := ComputeParameter(t, t + Δ, x_max);
12          R := FoxGlynn(μ_{R∗,Δ}, ε);
13          if R− < R∗ < R
14              R+ := R; Δ+ := Δ;
15          elseif R < R∗ < R+
16              R− := R; Δ− := Δ;
17          endif
18      endwhile
19  endif

Alg. 1. The step size Δ is determined in a binary-search fashion.

We now find the state x_max, which is guaranteed to have a higher exit rate than any state in S_{t,R∗} for all time points in the interval [t, t+Δ), as follows:

x_k^max := y_k^max + R∗ · v_k^max.

It is obvious that the state variables x_k^max are upper bounds for the state variables appearing in S_{t,R∗}. Then, since all rates increase monotonically in the state variables, we have that the exit rate of x_max = (x_1^max, ..., x_n^max) must be an upper bound for the exit rates appearing in S_{t,R∗} for all time points.
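Method a) is a few lines of code. The Python sketch below (illustrative names) adds one small safeguard relative to the formula above: if the largest change in some dimension is negative, the bound falls back to y_k^max, since S_{t,R∗} also contains the unchanged states of S_{t,0}:

```python
# FindMaxState, method a): bound every state reachable within R* transitions
# by a componentwise-maximal state x_max = y_max + R* * max(v_max, 0), valid
# when all rate functions increase monotonically in the state variables.

def find_max_state(support, changes, r_star):
    n = len(changes[0])
    v_max = [max(v[k] for v in changes) for k in range(n)]   # largest gain per dimension
    y_max = [max(y[k] for y in support) for k in range(n)]   # largest coordinate in S_{t,0}
    return tuple(y_max[k] + r_star * max(v_max[k], 0) for k in range(n))
```

Evaluating the exit rate at this single state then yields the uniformization rate Λ(t′) for the whole interval.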

b) The first two moments of a Markov population model can be accurately approximated using the method of moments proposed by Engblom [4]. This approximation assumes that the expectations and the (co-)variances change continuously and deterministically in time, and it is accurate if the rate functions are at most quadratic in the state variables. We approximate the means E_k(t′) := E[X_k(t′)] and the variances σ_k^2(t′) := VAR[X_k(t′)] for all k ∈ {1, ..., n}. For each k, we determine the time instant t̂ ∈ [t, t+Δ) at which E_k(t̂) + ℓ · σ_k(t̂) is maximal for some fixed ℓ. We use this maximum to determine the spread of the distribution, i.e., we assume that the values of X_k(t′) will stay below x_k^max := E_k(t̂) + ℓ · σ_k(t̂). Note that a more detailed approach would be to consider the multivariate normal distribution with mean E[X(t′)] and covariance matrix COV[X(t′)]. But since the spread of a multivariate normal distribution is difficult to derive in higher dimensions, we simply consider each dimension independently. We now have x_max = (x_1^max, ..., x_n^max). If during the analysis a state is found which exceeds x_max in one dimension, then we repeat our computation with a higher value for ℓ. To make this approach efficient, ℓ has to be chosen in an appropriate way. Our experimental results indicate that for two-dimensional systems the choice ℓ = 4 yields the best results.

C. Complete algorithm

Our complete algorithm now proceeds as follows. Given an initial distribution p^(0) with finite support S_{0,0}, a time bound t_max, thresholds δ and ε, and a desired right truncation point R∗, we first set t := 0.

Now we compute a time step Δ and the state x_max using Algorithm 1 with inputs R∗, t, t_max, and ε. We then approximate the transient distribution p̂^(t+Δ) using an on-the-fly version of the bounding approach [2], where the state space is dynamically maintained and states with probability less than δ are discarded as described above. For the rate function Λ we use the exit rate of state x_max. When computing DTMC probabilities, we use exact formulas for the first two terms [2] of the sum in Eq. (8) and lower bounds, given by Eq. (9), for the rest. This gives us the approximation p̂^(t+Δ) with finite support S_{t+Δ,0}. We now set t := t + Δ and repeat the above step with initial distribution p̂^(t) until t = t_max.
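As an end-to-end illustration, the following Python sketch runs the complete time-stepping loop on a deliberately trivial model: a pure birth process with state-independent rate λ(t′) = 1 + t′. Since the exit rate is the same for every state, Λ(t′) = λ(t′) is exact, every DTMC step is a sure birth (u = 1, no self-loop), and the only error is the Poisson truncation. The exact distribution at t_max is Poisson with mean ∫_0^{t_max} λ(t′) dt′, which the computed vector underapproximates; all names are ours, not the paper's:

```python
# Time-stepping loop for a pure birth process with rate lambda(t) = 1 + t.
import math

def step(v, t, dt, R):
    """One uniformization step over [t, t+dt): Lambda(t') = 1 + t' equals the
    exit rate of every state, so the jump probability alpha/Lambda is 1."""
    mu = dt * (1.0 + t + dt / 2.0)             # integral of Lambda over [t, t+dt)
    result, w = {}, math.exp(-mu)              # w = Poisson(mu; 0)
    for i in range(R + 1):
        for x, p in v.items():
            result[x] = result.get(x, 0.0) + w * p
        v = {x + 1: p for x, p in v.items()}   # sure birth in each DTMC step
        w *= mu / (i + 1)
    return result

def run(t_max, dt, R):
    v, t = {0: 1.0}, 0.0
    while t < t_max - 1e-12:
        h = min(dt, t_max - t)
        v = step(v, t, h, R)
        t += h
    return v
```

The total error of Eq. (10), 1 − Σ_x p̂(x), is directly observable from the returned vector.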

V. CASE STUDY

We implemented the approach outlined in Section IV in C++ and ran experiments on a 2.4 GHz Linux machine with 4 GB of RAM. We consider a Markov population model that describes a network of chemical reactions. According to the theory of stochastic chemical kinetics [6], the form of the rate function of a reaction depends on how many molecules of each chemical species are needed for one instance of the reaction to occur. The relationship to the volume has been discussed in detail by Wolkenhauer et al. [15]. If no reactants are needed², that is, the reaction is of the form ∅ → ..., then α_j(x, t) = k_j · V(t), where k_j is a positive constant and V(t) is the volume of the compartment in which the reactions take place. If one molecule is needed (case S_i → ...) then α_j(x, t) = k_j · x_i, where x_i is the number of molecules of type S_i. Thus, in this case, α_j(x, t) is independent of time. If two distinct molecules are needed (case S_i + S_ℓ → ...) then α_j(x, t) = (k_j / V(t)) · x_i · x_ℓ.

All these theoretical considerations are based on the assumption that the chemical reactions are elementary, that is, they are not a combination of several reactions. Our example may contain non-elementary reactions, and thus a realistic biological model may contain different volume dependencies. But since the focus of the paper is on the numerical algorithm, we do not aim for an accurate biological description here.
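The three volume dependencies above can be captured in a small helper. This is an illustrative sketch, not code from the paper; the function and argument names (propensity, reactants, x, V) are our own.

```python
def propensity(reactants, k, x, V):
    """Rate of an elementary reaction under a time-varying volume V = V(t).

    reactants: indices of the reactant species; () for a source reaction
    (empty set -> ...), (i,) for S_i -> ..., and (i, l) for S_i + S_l -> ...
    with distinct species S_i, S_l. x holds the current molecule counts.
    """
    if len(reactants) == 0:
        return k * V                   # zeroth order: scales with volume
    if len(reactants) == 1:
        return k * x[reactants[0]]     # first order: volume-independent
    i, l = reactants
    return (k / V) * x[i] * x[l]       # second order: scales with 1/V
```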

                          TABLE I
RESULTS OF THE ANALYSIS OF THE EXCLUSIVE SWITCH EXAMPLE.

FindMaxState implementation | δ      | R∗ | total error | execution time | max |S|
method a)                   | 10^-12 | 10 | 4·10^-4     | 22 min         | 57
                            | 10^-10 | 10 | 7·10^-4     | 11 min         | 25
                            | 10^-12 | 20 | 5·10^-5     | 43 min         | 179
                            | 10^-10 | 20 | 9·10^-2     | 27 min         | 104
method b)                   | 10^-12 | 10 | 3·10^-4     | 38 min         | 214
                            | 10^-10 | 10 | 1·10^-3     | 24 min         | 119
                            | 10^-12 | 20 | 1·10^-3     | 96 min         | 344
                            | 10^-10 | 20 | 2·10^-3     | 38 min         | 214

²Typically, reactions requiring no reactants are used in the case of open systems, where it is assumed that the reaction is always possible at a constant rate and the reactant population is not explicitly modeled.

The reaction network that we consider is a gene regulatory network, called the exclusive switch [10]. It consists of two

genes with a common promotor region. Each of the two gene products P1 and P2 inhibits the expression of the other product if a molecule is bound to the promotor region. More precisely, if the promotor region is free, molecules of both types P1 and P2 are produced. If a molecule of type P1 is bound to the promotor region, only molecules of type P1 are produced. If a molecule of type P2 is bound to the promotor region, only molecules of type P2 are produced. No other configuration of the promotor region exists. The probability distribution of the exclusive switch is bistable, which means that after a certain amount of time the probability mass concentrates on two distinct regions in the state space. The system has five chemical species, of which two have an infinite range, namely P1 and P2. We define the transition classes τ_j = (G_j, u_j, α_j), j ∈ {1,...,10}, as follows.

• For j ∈ {1,2} we describe production of P_j by G_j = {x ∈ N^5 | x_3 > 0}, u_j(x) = x + e_j, and α_j(x, t) = 0.5 · x_3. Here, x_3 denotes the number of unbound DNA molecules, which is either zero or one, and the vector e_j is such that all its entries are zero except the j-th entry, which is one.
• We describe degradation of P_j by G_{j+2} = {x ∈ N^5 | x_j > 0}, u_{j+2}(x) = x − e_j, and α_{j+2}(x, t) = 0.005 · x_j. Here, x_j denotes the number of P_j molecules.
• We model the binding of P_j to the promotor as G_{j+4} = {x ∈ N^5 | x_3 > 0, x_j > 0}, u_{j+4}(x) = x − e_j − e_3 + e_{j+3}, and α_{j+4}(x, t) = (0.1 − (0.05/3600) · t) · x_j · x_3 for t ≤ 3600. Here, x_{j+3} is one if a molecule of type P_j is bound to the promotor region and zero otherwise.
• For unbinding of P_j we define G_{j+6} = {x ∈ N^5 | x_{j+3} > 0}, u_{j+6}(x) = x + e_j + e_3 − e_{j+3}, and α_{j+6}(x, t) = 0.005 · x_{j+3}.
• Finally, we have production of P_j if a molecule of type P_j is bound to the promotor, i.e., G_{j+8} = {x ∈ N^5 | x_{j+3} > 0}, u_{j+8}(x) = x + e_j, and α_{j+8}(x, t) = 0.5 · x_{j+3}.
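As an illustration, the ten transition classes above can be encoded as (guard, update, rate) triples. This sketch is our own, not code from the paper: it uses 0-based state indices, Python lists of lambdas, and groups the classes per gene product rather than following the paper's numbering.

```python
import numpy as np

def e(j, n=5):
    """Unit vector e_j (1-based), all zeros except entry j."""
    v = np.zeros(n, dtype=int)
    v[j - 1] = 1
    return v

def exclusive_switch_classes():
    """(guard, update, rate) triples for the exclusive switch.

    State x = (#P1, #P2, free promotor, P1 bound, P2 bound), 0-indexed.
    """
    classes = []
    for j in (1, 2):
        # production of Pj while the promotor is free
        classes.append((lambda x: x[2] > 0,
                        lambda x, j=j: x + e(j),
                        lambda x, t: 0.5 * x[2]))
        # degradation of Pj
        classes.append((lambda x, j=j: x[j - 1] > 0,
                        lambda x, j=j: x - e(j),
                        lambda x, t, j=j: 0.005 * x[j - 1]))
        # binding of Pj to the promotor (the only time-dependent rates)
        classes.append((lambda x, j=j: x[2] > 0 and x[j - 1] > 0,
                        lambda x, j=j: x - e(j) - e(3) + e(j + 3),
                        lambda x, t, j=j: (0.1 - 0.05 / 3600 * t) * x[j - 1] * x[2]))
        # unbinding of Pj
        classes.append((lambda x, j=j: x[j + 2] > 0,
                        lambda x, j=j: x + e(j) + e(3) - e(j + 3),
                        lambda x, t, j=j: 0.005 * x[j + 2]))
        # production of Pj while Pj is bound to the promotor
        classes.append((lambda x, j=j: x[j + 2] > 0,
                        lambda x, j=j: x + e(j),
                        lambda x, t, j=j: 0.5 * x[j + 2]))
    return classes
```

Note the `j=j` default arguments, which pin the loop variable inside each lambda so that both gene products get their own closures.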

Note that only the rate functions α_5 and α_6, which describe the binding of a protein to the promotor region, are time-dependent. This is intuitively clear, since if the cell volume grows it becomes less likely that a protein molecule is located close to the promotor region. We started the system at time t = 0 in state (0,0,1,0,0) with probability one and considered a time horizon of t = 3600. Table I contains the results of our experiments. The first column refers to the two variants for implementing the method FindMaxState which we suggest in Section IV-B. The second and third columns list the different values that we used for the threshold δ and the right truncation point R∗. We list the total error at time t_max in the fourth column (see Eq. (10)). The last column, with heading max |S|, contains the maximal size of the set S_{t,R∗} that we considered during the analysis. For our implementation we kept the input ǫ = 10^-10 of Algorithm 1 fixed.

A. Discussion

We now discuss the effect of the different input parameters on the performance of our algorithm. As expected, decreasing the threshold δ increases the accuracy, since fewer states are discarded on-the-fly. However, this comes at the cost of using more memory, since more states have to be represented, and the running time also increases.

We also see that using method "b" to find the uniformization rate is less effective than method "a" (see Section IV-B). Method "b" chooses a larger uniformization rate than method "a", which leads to longer execution times and increased memory usage. The effect of this choice on the accuracy is not completely clear, although here, too, method "a" seems to be somewhat better.

The effect of the choice of R∗ is most interesting. Choosing a larger value for R∗ means that more summands on the right-hand side of Eq. (9) have to be approximated using the bounding approach. This should decrease the accuracy of the algorithm, but we see that for one of the experiments this is not the case (method "a", δ = 10^-12). This may be caused by the fact that increasing R∗ also increases the time-steps ∆ and the uniformization rate Λ(t). With larger time-steps, fewer steps have to be taken to reach the final time-point t_max, which decreases the probability lost by the truncation of the uniformization sum. We also see that increasing R∗ increases the memory and time needed for computation.

VI. CONCLUSION

We have presented an algorithm for the numerical approximation of transient distributions for infinite time-inhomogeneous Markov population models with unbounded rates. Our algorithm provides a strict lower bound for this transient distribution. There is a trade-off between the tightness of the bound and the performance of the algorithm, both in terms of computation time and required memory.

As future work, we will investigate more closely the relationship between the parameters of our approach (the truncation point, the significance threshold δ, and the method by which we determine the rate of the Poisson process), the accuracy, and the running time of the algorithm. For this we will consider Markov population models with different structures and dynamics.

REFERENCES

[1] A. Arkin, J. Ross, and H. H. McAdams. Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics, 149:1633–1648, 1998.
[2] M. Arns, P. Buchholz, and A. Panchenko. On the numerical analysis of inhomogeneous continuous time Markov chains. INFORMS Journal on Computing. To appear.
[3] F. Didier, T. A. Henzinger, M. Mateescu, and V. Wolf. Fast adaptive uniformization of the chemical master equation. In Proc. of HIBI, 2009. To appear.
[4] S. Engblom. Computing the moments of high dimensional solutions of the master equation. Appl. Math. Comput., 180:498–515, 2006.
[5] D. T. Gillespie. A general method for numerically simulating the time evolution of coupled chemical reactions. J. Comput. Phys., 22:403–434, 1976.
[6] D. T. Gillespie. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem., 81(25):2340–2361, 1977.
[7] W. K. Grassmann. Computational methods in probability theory. In D. P. Heyman and M. J. Sobel, editors, Stochastic Models, volume 2 of Handbooks in Operations Research and Management Science, chapter 5, pages 199–254. Elsevier, 1990.
[8] A. Jensen. Markoff chains as an aid in the study of Markoff processes. Skandinavisk Aktuarietidskrift, 36:87–91, 1953.
[9] J. F. C. Kingman. Markov population processes. Journal of Applied Probability, 6(1):1–16, 1969.
[10] A. Loinger, A. Lipshtat, N. Q. Balaban, and O. Biham. Stochastic simulations of genetic switch systems. Phys. Rev. E, 75(2):021904, 2007.
[11] W. J. Stewart. Introduction to the Numerical Solution of Markov Chains. Princeton University Press, 1995.
[12] M. Thattai and A. van Oudenaarden. Intrinsic noise in gene regulatory networks. PNAS, 98(15):8614–8619, 2001.
[13] N. M. van Dijk. Uniformization for nonhomogeneous Markov chains. Operations Research Letters, 12(5):283–291, 1992.
[14] A. P. A. van Moorsel and K. Wolter. Numerical solution of non-homogeneous Markov processes through uniformization. In Proc. of the European Simulation Multiconference – Simulation, pages 710–717. SCS Europe, 1998.
[15] O. Wolkenhauer, M. Ullah, W. Kolch, and K. Cho. Modeling and simulation of intracellular dynamics: Choosing an appropriate framework. IEEE Transactions on NanoBioscience, 3(3):200–207, 2004.