G.Ciobanu, M.Koutny (Eds.): Membrane Computing and
Biologically Inspired Process Calculi 2010 (MeCBIC 2010)
EPTCS 40, 2010, pp. 142–161, doi:10.4204/EPTCS.40.10
Lumpability Abstractions of Rule-based Systems∗
Jérôme Feret
LIENS (INRIA/ENS/CNRS)
Paris, France
feret@ens.fr
Thomas Henzinger
Institute of Science and Technology Austria
Vienna, Austria
thenzinger@ist.ac.at
Heinz Koeppl
School of Computer and Communication Sciences
EPFL
Lausanne, Switzerland
heinz.koeppl@epfl.ch
Tatjana Petrov
School of Computer and Communication Sciences
EPFL
Lausanne, Switzerland
tatjana.petrov@epfl.ch
The induction of a signaling pathway is characterized by transient complex formation and mutual
posttranslational modification of proteins. To faithfully capture this combinatorial process in a math-
ematical model is an important challenge in systems biology. Exploiting the limited context on which
most binding and modification events are conditioned, attempts have been made to reduce the combinatorial complexity by quotienting the reachable set of molecular species into species aggregates
while preserving the deterministic semantics of the thermodynamic limit. Recently we proposed a
quotienting that also preserves the stochastic semantics and that is complete in the sense that the
semantics of individual species can be recovered from the aggregate semantics. In this paper we
prove that this quotienting yields a sufficient condition for weak lumpability and that it gives rise to a
backward Markov bisimulation between the original and aggregated transition system. We illustrate
the framework on a case study of the EGF/insulin receptor crosstalk.
1 Introduction
Often a few elementary events of binding and covalent modification [31] in a biomolecular reaction
system give rise to a combinatorial number of non-isomorphic reachable species or complexes [17, 18].
Instances of such systems are signaling pathways, polymerizations involved in cytoskeleton maintenance,
and the formation of transcription factor complexes in gene regulation.
For such biomolecular systems, traditional chemical kinetics face fundamental limitations that are
related to the question of how biomolecular events are represented and translated into a mathematical
model [24]. More specifically, chemical reactions can only operate on a collection of fully specified
molecular species and each such species gives rise to one differential equation, describing the rate of
change of that species’ concentration. Many combinatorial systems do not permit the enumeration of all
molecular species and thus render their traditional differential description prohibitive. However, even if
one could enumerate them, it remains questionable whether chemical reactions are the appropriate way
to represent and to reason about such systems.
As the dynamics of a biomolecular reaction mixture comes about through the repeated execution of a
few elementary events one may wonder about the effective degrees of freedom of the reaction mixture’s
dynamics. If the velocity of all events – or their probabilities to occur per time-unit per instance –
∗Jérôme Feret's contribution was partially supported by the ABSTRACTCELL ANR-Chair of Excellence. Heinz Koeppl
acknowledges the support from the Swiss National Science Foundation, grant no. 200020-117975/1. Tatjana Petrov acknowl-
edges the support from SystemsX.ch, the Swiss Initiative in Systems Biology.
are different for all complexes (w.r.t. modification) and pairs of complexes (w.r.t. binding) to which
the events can apply, then the degrees of freedom would equal the number of molecular species.
However, due to the local nature of physical forces underlying molecular dynamics, the kinetics of most
events appear to be ignorant with respect to the global configuration of the complexes they are operating
on. More provocatively, one may say that even if there were variations in the kinetics of an event from
one context to another, experimental biology does not – and most likely never will – have the means to
discern between all different contexts. For instance, fluorescence resonance energy transfer (FRET) may
report on a specific protein-binding event and even its velocity; however, we have no means to determine
whether the binding partners are already part of a protein complex – not to speak of the composition and
modification state of these complexes. To this end, molecular species remain elusive and appear to be
inappropriate entities of descriptions.
To align with the mentioned experimental insufficiencies and with the underlying biophysical lo-
cality, rule-based or agent-based descriptions were introduced as a framework to encode such reaction
mixtures succinctly and to enable their mathematical analysis [8, 3]. Rules exploit the limited context
on which most elementary events are conditioned. They just enumerate that part of a molecular species
that is relevant for a rule to be applicable. Thus, in contrast to chemical reactions, rules can operate on a
collection of partially specified molecular species. Recently, attempts have been made to identify the set
of those partially specified species that allow for a self-consistent description of the rule-set’s dynamics
[11, 6]. Naturally, as partially specified species – or fragments – in general encompass many fully specified
species, the cardinality of that set is less than that of the set of molecular species. These approaches aim
to obtain a self-consistent fragment dynamics based on ordinary differential equations. It represents the
dynamics in the thermodynamic limit of stochastic kinetics when scaling species multiplicities to infinity
while maintaining a constant concentration (multiplicity per unit volume) [22]. In many applications in
cellular biology this limiting dynamics is an inappropriate model due to the low multiplicities of some
molecular species – think of transcription factor - DNA binding events. We recently showed that the
obtained differential fragments cannot be used to describe the finite volume case of stochastic kinetics
[12]. Exploiting statistical independence of events we instead derived stochastic fragments that repre-
sent the effective degrees of freedom in the stochastic case. Conceptually, the procedure replaces the
Cartesian product by a Cartesian sum for statistically independent states. In contrast to the differential
case, stochastic fragments have the important property that the sample paths of molecular species can be
recovered from that of partially specified species.
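As a toy illustration of this counting (our own sketch, not taken from [12]): for an agent with n statistically independent binary sites, tracking full species requires the Cartesian product of the per-site state spaces, while tracking each site marginally requires only their disjoint sum.

```python
from itertools import product

# Toy sketch (hypothetical agent, not from the paper): n independent binary sites.
n = 10
site_states = [("u", "p")] * n  # each site: unmodified or phosphorylated

# Cartesian product: one state per fully specified species.
species_count = len(list(product(*site_states)))  # 2**n

# "Cartesian sum": one state per (site, value) pair, which suffices when the
# sites are statistically independent.
fragment_count = sum(len(states) for states in site_states)  # 2*n

print(species_count, fragment_count)  # 1024 20
```

Already for ten sites, the aggregate description needs 20 states instead of 1024.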
We believe that interdisciplinary fields, such as systems biology, can move forward quickly enough
only if the well-established knowledge in each of the disciplines involved is exploited to its maximum,
i.e. if the standardized, well-established theories are recognized and re-used. For that reason, in this
paper we translate our abstraction method [12] into the language of well-established frameworks of abstraction
for probabilistic systems – lumpability and bisimulation. Lumpability is mostly considered
from a theoretical point of view in the theory of stochastic processes [19, 13, 28, 26, 27, 4]. A Markov
chain is lumpable with respect to a given aggregation (quotienting) of its states, if the lumped chain pre-
serves the Markov property [21]. An aggregation that is sound for any initial probability distribution is referred
to as strong lumpability, while one that is sound only for some initial distributions is termed weak lumpability [4, 29]. Approximate aggregation
techniques for Markov chains of biochemical networks are discussed in [14]. Probabilistic bisimulation
was introduced as an extension to classic bisimulation in [23]. It is extended to continuous-state and
continuous-time in [9] and, for the discrete-state case, to weak bisimulation [2]. For instance, in [9]
the authors use bisimulation of labelled Markov processes, the state space of which is not necessarily
discrete, and they provide a logical characterization of probabilistic bisimulation. Another notion of
weak bisimulation was recently introduced in [10]. Therein two labeled Markov chains are defined to
be equivalent if every finite sequence of observations has the same probability of occurring in the two
chains. Herein we recognize the sound aggregations of [12] as a form of backward Markov bisimulations
on weighted labeled transition systems (WLTS), and we show it to be equivalent to the notion of weak
lumpability on Markov chains.
The remaining part of the paper is organized as follows. In Section 2, we introduce weighted
labeled transition systems (WLTS) and assign to them the trace density semantics of a continuous-time
Markov chain (CTMC). Moreover, we define the Kappa language, and we assign a WLTS to a Kappa
specification. Based on the notion of the annotated contact map, we briefly summarize in Section 3 the
general procedure to compute stochastic fragments, as it is offered in [12]. In Section 4, we introduce
the characterization of sound and complete abstractions of a WLTS as a backward Markov bisimulation.
Moreover, we show that this notion is equivalent to weak lumpability on Markov chains. Finally, we provide
in Section 5 results for the achieved dimensionality reduction for a rule-based model of the crosstalk
between the EGF/insulin signaling pathway [6]. This mechanistic model comprises 76 rules giving rise
to 42956 reactions and 2768 molecular species.
2 Preliminaries
The stochastic semantics of a biochemical network is modelled by a continuous-time Markov chain
(CTMC). The main object that we will use in the analysis is the weighted labelled transition system
(WLTS) on a countable state space. We will assign a WLTS to a given Kappa specification, and we
manipulate that object when reasoning about abstractions.
2.1 CTMC
We will observe the CTMC that is generated by the weighted labelled transition system (WLTS) on a
countable state space. We define the CTMC of a WLTS by defining the Borel σ-algebra containing all
cylinder sets of traces [20] that can occur in the system, and the corresponding probability distribution
over them. We also introduce the standard notation of a rate matrix, which we will use when analysing
the lumpability and bisimulation properties in Sec. 4.
Definition 1. (WLTS) A weighted labelled transition system W is a tuple (X, L, w, π_0), where
• X is a countable state space;
• L is a set of labels;
• w : X × L × X → R+_0 is the weighting function that maps two states and a label to a non-negative real value;
• π_0 : X → [0,1] is an initial probability distribution.
We assume that the label fully identifies the transition, i.e. for any x ∈ X and l ∈ L, there is at most one
x′ ∈ X such that w(x, l, x′) > 0. Moreover, we assume that the system is finitely branching, in the sense
that (i) the set {x ∈ X | π_0(x) > 0} is finite, and (ii) for an arbitrary x̂ ∈ X, the set {(l, x′) ∈ L × X |
w(x̂, l, x′) > 0} is finite.
The activity of the state x_i, given by the function a : X → R+_0, is the sum of all weights originating at x_i, i.e.

a(x_i) := ∑{ w(x_i, l, x_j) | x_j ∈ X, l ∈ L }.

The definition of a WLTS implicitly defines a transition relation → ⊆ X × X, such that (x_i, x_j) ∈ →,
if and only if there exists a non-zero transition from state x_i to state x_j, i.e. the total weight over all labels
is strictly bigger than zero, written ∑{ w(x_i, l, x_j) | l ∈ L } > 0. Moreover, we can distinguish the initial
set of states I ⊆ X, such that their initial probabilities are positive, i.e. I = {x ∈ X | π_0(x) > 0}.
Definition 2. (Rate matrix of a WLTS) Given a WLTS W = (X, L, w, π_0), we assign to it the CTMC rate
matrix R : X × X → R, given by R(x_i, x_j) = ∑{ w(x_i, l, x_j) | l ∈ L }.
As a consequence, we do not enforce R(x_i, x_i) = −∑{ R(x_i, x_j) | i ≠ j }, as is usual for the
generator matrix of a CTMC. This, however, affects neither the transient nor the steady-state behavior
of the CTMC [1]. We do so for the following reason. When abstracting the WLTS by partitioning the
state space, we get another WLTS. If two states x and x′ which have a transition between each other
are aggregated in the same partition class x̃, this results in a prolongation of the residence time in the
abstract state x̃, i.e. we will have a self-loop in the abstract WLTS.
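The following small sketch (state and label names are our own, not from the paper) shows how Definition 2 collapses labels, and how a self-loop simply contributes to the diagonal entry:

```python
# Minimal sketch (hypothetical weights): w(x, l, x') encoded as a dict.
w = {
    ("x0", "a", "x1"): 1.5,
    ("x0", "b", "x1"): 0.5,   # a second label between the same pair of states
    ("x1", "c", "x0"): 2.0,
    ("x1", "d", "x1"): 1.0,   # a self-loop, as produced by aggregation
}

def rate(xi, xj):
    # R(xi, xj) = sum of w(xi, l, xj) over all labels l  (Definition 2)
    return sum(v for (s, _, t), v in w.items() if s == xi and t == xj)

def activity(xi):
    # a(xi) = total weight originating at xi, self-loops included
    return sum(v for (s, _, _), v in w.items() if s == xi)

print(rate("x0", "x1"), rate("x1", "x1"), activity("x1"))  # 2.0 1.0 3.0
```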
If we refer to the generated stochastic Markov process, written as a continuous-time random variable
{X_t}_{t∈R+_0} over the countable state space X, we write Pr(X_t = x_i) for the probability that the process takes
the value x_i at time point t. It thus holds that Pr(X_0 = x_i) = π_0(x_i), and Pr(X_{t+dt} = x_j | X_t = x_i) =
R(x_i, x_j)dt when i ≠ j, whereas Pr(X_{t+dt} = x_i | X_t = x_i) = R(x_i, x_i)dt + (1 − ∑{ R(x_i, x_j′)dt | x_j′ ∈ X }),
which gives after simplification: Pr(X_{t+dt} = x_i | X_t = x_i) = 1 − ∑{ R(x_i, x_j′)dt | j′ ≠ i }.
We define the cylinder sets of traces that can be observed in the system W. By observing the trace at
a certain time point, we mean observing the sequence of visited states, the labels that were assigned to the
executed transitions, and the time points at which the transitions happened.

Definition 3. (A trace of a WLTS) Let us observe the WLTS W = (X, L, w, π_0) and its CTMC. Given a
number k in N, we define a trace of length k as τ ∈ (X × L × R+_0)^k × X, written

τ = x_0 --(l_1, t_1)--> x_1 ... x_{k−1} --(l_k, t_1+...+t_k)--> x_k.

If the trace τ is such that (i) π_0(x_0) > 0, and (ii) for all i, 1 ≤ i ≤ k, we have that w(x_{i−1}, l_i, x_i) > 0, then
we say that τ belongs to the set of traces of W, and we write τ ∈ T(W).

The 'time stamps' on the transitions denote intuitively the absolute time of each transition, measured
from the moment when the system was started (t = 0). We cannot assign a probability distribution to
the traces in T(W), since the probability of any such trace is zero. We thus introduce the cylinder sets of
traces over intervals of time.

Definition 4. (Cylinder set of traces) If I_R is the set of all nonempty intervals in R+_0, then we define the
cylinder set of traces τ_IR ∈ (X × L × I_R)^k × X, such that

τ_IR = x_0 --(l_1, I_1)--> x_1 ... x_{k−1} --(l_k, I_k)--> x_k     (1)

denotes the set of all traces τ = x_0 --(l_1, t_1)--> x_1 ... x_{k−1} --(l_k, t_1+...+t_k)--> x_k, such that t_i ∈ I_i,
1 ≤ i ≤ k. If the cylinder set of traces τ_IR is such that π_0(x_0) > 0, and for all i = 1,...,k, we have that
w(x_{i−1}, l_i, x_i) > 0, then we say that τ_IR belongs to the cylinder set of traces of W, and we write
τ_IR ∈ T_IR(W).

Let Ω(T_IR(W)) be the smallest Borel σ-algebra that contains all the cylinder sets of traces in
T_IR(W). We define a probability measure over Ω(T_IR(W)) in the following way.

Definition 5. (Trace density semantics on a WLTS) Given a WLTS (X, L, w, π_0), and a number k in N,
the probability of the cylinder set of traces τ_IR ∈ T_IR(W), specified as in expression (1), is given by:

π(τ_IR) = π(x_0 --(l_1, I_1)--> x_1 ... x_{k−1} --(l_k, I_k)--> x_k)
        = π_0(x_0) · ∏_{i=1}^{k} [ w(x_{i−1}, l_i, x_i) / a(x_{i−1}) ] · ( e^{−a(x_{i−1})·inf(I_i)} − e^{−a(x_{i−1})·sup(I_i)} ).

Note that ∫_{I_i} a(x_{i−1}) e^{−a(x_{i−1})·t} dt = e^{−a(x_{i−1})·inf(I_i)} − e^{−a(x_{i−1})·sup(I_i)} is the probability of exiting the state
x_{i−1} in the time interval I_i, since the probability density function of the residence time of x_{i−1} is equal to
a(x_{i−1}) e^{−a(x_{i−1})·t}.
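To make the trace density semantics concrete, here is a small numerical sketch (the two-state system and its rates are our own choice, not from the paper):

```python
import math

# Hypothetical two-state WLTS: w(x0, l01, x1) = 2.0, w(x1, l10, x0) = 3.0,
# with initial distribution pi0(x0) = 1.
w = {("x0", "l01", "x1"): 2.0, ("x1", "l10", "x0"): 3.0}
pi0 = {"x0": 1.0}

def activity(x):
    return sum(v for (s, _, _), v in w.items() if s == x)

def cylinder_prob(steps):
    # steps: list of (x_prev, label, x_next, (inf_I, sup_I)), evaluated as
    #   pi0(x0) * prod_i  w(x_{i-1}, l_i, x_i) / a(x_{i-1})
    #                   * (exp(-a(x_{i-1})*inf(I_i)) - exp(-a(x_{i-1})*sup(I_i)))
    p = pi0.get(steps[0][0], 0.0)
    for (xp, l, xn, (lo, hi)) in steps:
        a = activity(xp)
        p *= (w[(xp, l, xn)] / a) * (math.exp(-a * lo) - math.exp(-a * hi))
    return p

# Probability that the jump x0 -> x1 occurs at some time in [0, 1):
print(cylinder_prob([("x0", "l01", "x1", (0.0, 1.0))]))  # 1 - exp(-2)
```

Here a(x0) = 2.0 and the label l01 carries the full weight, so the jump probability factor is 1 and only the residence-time term remains.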
2.2 Kappa
We present Kappa in a process-like notation. We start with an operational semantics, then define the
stochastic semantics of a Kappa model.
We assume a finite set of agent names A, representing different kinds of proteins; a finite set of
sites S, corresponding to protein domains; a finite set of internal states I; and two signature maps Σ_ι, Σ_λ
from A to ℘(S), listing the domains of a protein which can bear, respectively, an internal state and a
binding state. We denote by Σ the signature map that associates to each agent name A ∈ A the combined
interface Σ_ι(A) ∪ Σ_λ(A).
Definition 6. (Kappa agent) A Kappa agent A(σ) is defined by its type A ∈ A and its interface σ. In
A(σ), the interface σ is a sequence of sites s in Σ(A), with internal states (as subscript) and binding
states (as superscript). The internal state of the site s may be written as s_ε, which means that either it
does not have an internal state (when s ∈ Σ(A)\Σ_ι(A)), or it is not specified. A site that bears an internal
state m ∈ I is written s_m (in such a case s ∈ Σ_ι(A)). The binding state of a site s can be specified as s^ε
if it is free; otherwise it is bound (which is possible only when s ∈ Σ_λ(A)). There are several levels of
information about the binding partner: we use a binding label i ∈ N when we know the binding partner,
or a wildcard bond − when we only know that the site is bound. The detailed description of the syntax
of a Kappa agent is given by the following grammar:
a ::= N(σ)                  (agent)
N ::= A ∈ A                 (agent name)
σ ::= ε | s, σ              (interface)
s ::= n^λ_ι                 (site)
n ::= x ∈ S                 (site name)
ι ::= ε | m ∈ I             (internal state)
λ ::= ε | − | i ∈ N         (binding state)
We generally omit the symbol ε.
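For instance, under a hypothetical signature with A ∈ A, x, y ∈ Σ(A), x ∈ Σ_ι(A) ∩ Σ_λ(A), and m ∈ I, the grammar admits agents such as (internal states as subscripts, binding states as superscripts):

```
A(x_m^1)    site x carries internal state m and is bound with binding label 1
A(x_m^-)    site x carries internal state m and is bound to an unspecified partner
A(x, y)     sites x and y with unspecified internal states; both sites free
```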
Definition 7. (Kappa expression) A Kappa expression E is a sequence of agents A(σ) and fictitious agents ∅.
Thus the syntax of a Kappa expression is defined as follows:
E ::= ε | a , E | ∅ , E.
The structural equivalence ≡ is defined as the smallest binary equivalence relation between expressions
that satisfies the following rules:

E , A(σ, s, s′, σ′) , E′ ≡ E , A(σ, s′, s, σ′) , E′
E , a , a′ , E′ ≡ E , a′ , a , E′
E , ∅ ≡ E
E[i/j] ≡ E    whenever i, j ∈ N and i does not occur in E
E[ε/i] ≡ E    whenever i ∈ N and i occurs only once in E

It stipulates that neither the order of sites in interfaces nor the order of agents in expressions matters, that
a fictitious agent might as well not be there, that binding labels can be injectively renamed, and that
dangling bonds can be removed.
Definition 8. (Kappa pattern, Kappa mixture) A Kappa pattern is a Kappa expression which satisfies
the following five conditions: (i) no site name occurs more than once in a given interface; (ii) each site
name s in the interface of the agent A occurs in Σ(A); (iii) each site s which occurs in the interface of
the agent A with a non-empty internal state occurs in Σ_ι(A); (iv) each site s which occurs in the interface
of the agent A with a non-empty binding state occurs in Σ_λ(A); and (v) each binding label i ∈ N occurs
exactly twice if it occurs at all – there are no dangling bonds. A mixture is a pattern that is fully specified,
i.e. each agent A documents its full interface Σ(A), a site can only be free or tagged with a binding label
i ∈ N, each site in Σ_ι(A) bears an internal state in I, and no fictitious agent occurs.
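To illustrate the distinction, assume a hypothetical signature with Σ(A) = {x} and Σ(B) = {y, z}, where z ∈ Σ_ι(B) and u ∈ I:

```
A(x^1), B(y^1, z_u)    a pattern that is also a mixture: every interface is
                       complete and all binding labels occur twice
A(x^1), B(y^1)         a pattern, but not a mixture: site z of B is not documented
A(x^1), B(z_u)         not a pattern: binding label 1 occurs only once
                       (a dangling bond, violating condition (v))
```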
Page 6
J. Feret, T. Henzinger, H. Koeppl, T. Petrov 147
Definition 9. (Kappa rule) A Kappa rule r is defined by two Kappa patterns E_ℓ and E_r, and a rate
k ∈ R+_0, and is written: r = E_ℓ → E_r @k.
A rule r is well-defined, if the expression E_r is obtained from E_ℓ by finite application of the following
operations: (i) creation (some fictitious agents ∅ are replaced with some fully defined agents of the form
A(σ); moreover σ documents all the sites occurring in Σ(A) and each site in Σ_ι(A) bears an internal state
in I), (ii) unbinding (some occurrences of the wildcard and binding labels are removed), (iii) deletion
(some agents with only free sites are replaced with the fictitious agent ∅), (iv) modification (some non-empty
internal states are replaced with some non-empty internal states), (v) binding (some free sites are bound
pair-wise by using binding labels in N).
From now on, we assume all rules to be well-defined. We sometimes omit the rate of a rule. Moreover,
we denote by E_ℓ ↔ E_r @k_1, k_2 the two rules E_ℓ → E_r @k_1 and E_r → E_ℓ @k_2.
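As a hypothetical illustration (agent types, sites, and rates are ours), with Σ_λ(A) = {x}, Σ_λ(B) = {y}, Σ_ι(B) = {z}, and u, p ∈ I, the following rules are each well-defined by one of the listed operations:

```
A(x), B(y)  ->  A(x^1), B(y^1)  @ k1     binding: two free sites become bound
A(x^1), B(y^1)  ->  A(x), B(y)  @ k2     unbinding
B(z_u)  ->  B(z_p)  @ k3                 modification of the internal state of z
A(x)  ->  ∅  @ k4                        deletion of an agent with only free sites
```

The first two rules could be abbreviated as A(x), B(y) ↔ A(x^1), B(y^1) @ k1, k2.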
Definition 10. (Kappa system) A Kappa system R = (π_0^R, {r_1,...,r_n}) is given by a finite distribution
over initial mixtures π_0^R : {M_01,...,M_0k} → [0,1], and a finite set of rules {r_1,...,r_n}.
In order to apply a rule r := E_ℓ → E_r @k to a mixture M, we use the structural equivalence ≡ to bring
the participating agents to the front of E (with their sites in the same order as in E_ℓ), rename binding
labels if necessary, and introduce a fictitious agent for each agent that is created by r. This yields an
equivalent expression E′ that matches the left hand side (lhs) E_ℓ, which is written E′ |= E_ℓ, as defined as
follows:

E |= ε
σ |= ε
∅ |= ∅
ι′ ∈ {ε, ι} ⟹ ι |= ι′
λ′ ∈ {−, λ} ⟹ λ |= λ′
ι |= ι′ ∧ λ |= λ′ ⟹ n^λ_ι |= n^{λ′}_{ι′}
s |= s′ ∧ σ |= σ′ ⟹ s, σ |= s′, σ′
σ |= σ′ ⟹ N(σ) |= N(σ′)
a |= a′ ∧ E |= E′ ⟹ a , E |= a′ , E′
Note that in order to find a matching, we only use structural equivalence on E, not on E_ℓ. We then
replace E′ by E′[E_r], which is defined as follows:

E[ε] = E
N(σ)[N(σ_r)] = N(σ[σ_r])
σ[ε] = σ
(s, σ)[s_r, σ_r] = s[s_r], σ[σ_r]
n^λ_ι [n^{λ_r}_{ι_r}] = n^{λ[λ_r]}_{ι[ι_r]}
(a , E)[a_r , E_r] = a[a_r] , E[E_r]
∅[a_r] = a_r
a_r[∅] = ∅
ι_r ∈ I ⟹ ι[ι_r] = ι_r
λ_r ∈ N ∪ {ε} ⟹ λ[λ_r] = λ_r
λ[−] = λ

This may produce dangling bonds (if r unbinds a wildcard bond or destroys an agent on one side of a
bond) or fictitious agents (if r destroys agents), so we use ≡ to resolve them.
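For instance (a hypothetical rule and mixture of ours), unbinding a wildcard bond leaves a dangling label on the partner, which the structural equivalence then removes:

```
r:  A(x^-)  ->  A(x)  @ k               a rule unbinding a wildcard bond

A(x^1), B(y^1)   --r-->   A(x), B(y^1)   ≡   A(x), B(y)
                          (the dangling label 1 is removed by E[ε/1] ≡ E)
```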
2.2.1 Population-based stochastic semantics
In addition to the rate constants k, careful counting of the number of times each rule can be applied to
a mixture is required to define the system's quantitative semantics correctly [7, 25]. Thus we define the
notion of embedding between a mixture and an expression. Let Z = a_1, ..., a_m and Z′ = c_1, ..., c_n be
two patterns with no occurrence of the fictitious agent and such that there exists a pattern Z″ = b_1, ..., b_m
that satisfies both Z ≡ Z″ and Z″ |= Z′ (and so, in particular, n ≤ m).
The agent permutations used in the proof that Z ≡ Z″ allow us to derive a permutation p such that
a_{p(i)} ≡ b_i. The restriction φ of p to the integers between 1 and n is called an embedding between Z′ and
Z. This is written Z′ ↪_φ Z. There may be several embeddings between Z′ and Z for the same Z″; if so,
this influences the relative weight of the reaction in the stochastic semantics. We denote by [Z′, Z] the set
of embeddings between Z′ and Z. This notion of embedding is extended to patterns (including fictitious
agents) by defining Z′ ↪_φ Z if, and only if, (↓_∅ Z′) ↪_φ (↓_∅ Z), where ↓_∅ removes all occurrences of the
fictitious agent in patterns.
We assume that E_ℓ is the lhs of a rule r := E_ℓ → E_r @k and that Z is a mixture such that E_ℓ ↪_φ Z. Let
Z = a_1,...,a_m and ↓_∅ E_ℓ = c_1,...,c_n. Take Z′ ≡ Z (we write ↓_∅ Z′ = b_1,...,b_m) and a bijection p such
that Z′ |= E_ℓ, b_i ≡ a_{p(i)} for 1 ≤ i ≤ m, and φ(j) = p(j) for 1 ≤ j ≤ n. The result of applying r
along φ to the mixture Z is defined (modulo ≡) as any pattern that is ≡-equivalent to Z′[E_r]. In other
words, the embedding φ between E_ℓ and Z fully defines the action of r on Z up to structural equivalence.
We are now ready to define the stochastic semantics by means of a WLTS. In this semantics, the
state is a soup of agents, that is to say, we do not care about the order of agents in a mixture. So the
states of the system are the ≡-equivalence classes of mixtures.
Defining species as connected mixtures, the state of the system can be seen as a multi-set of species.
The formal definition of a Kappa species is as follows:
Definition 11. (Kappa species) A pattern E is reducible whenever E ≡ E′, E″ for some non-empty patterns
E′, E″; a Kappa species is the ≡-equivalence class of an irreducible Kappa mixture.
As explained earlier, the action of a rule r on a mixture E is fully defined (up to ≡) by an embedding
φ between the lhs E_ℓ of the rule r and the mixture. So as to consider computation steps over ≡-equivalence
classes of mixtures, we introduce an equivalence relation ≡_L over triples (r, E, φ), where φ is an embedding
of the lhs E_ℓ of r into E. We say that (r_1, E_1, φ_1) ≡_L (r_2, E_2, φ_2) if, and only if, (i) r_1 = r_2 and (ii) there
exists an embedding ψ ∈ [E_1, E_2] such that φ_2 = ψ ∘ φ_1.
Definition 12. (WLTS of a Kappa system) Let R = (π_0^R, {r_1,...,r_n}) be a Kappa system. We define the
WLTS W_R = (X, L, w, π_0) where: (i) X is the set of all ≡-equivalence classes of mixtures; (ii) L is the
set of all ≡_L-equivalence classes of triples (r, E, φ) such that φ is an embedding between the lhs E_ℓ of r
and E; (iii) w(x, l, x′) = k/|[E_ℓ, E_ℓ]| whenever there exist a rule r = E_ℓ → E_r @k, two mixtures E and E′, and an
embedding φ ∈ [E_ℓ, E], such that x = [E]_≡, x′ = [E′]_≡, l = [r, E, φ]_{≡_L}, and E′ is the result (up to ≡) of the application
of r along φ to the mixture E; otherwise w(x, l, x′) = 0; (iv) π_0(x) = ∑{ π_0^R(E′) | E′ ∈ Dom(π_0^R) ∩ x }.
The stochastic semantics of a Kappa system R is then defined as the trace distribution semantics of
the WLTS W_R.
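The factor 1/|[E_ℓ, E_ℓ]| corrects for the symmetries of the lhs. For instance, for a hypothetical homo-dimerisation rule (our own example):

```
r:  A(x), A(x)  ->  A(x^1), A(x^1)  @ k
```

the lhs E_ℓ = A(x), A(x) has two embeddings into itself (the identity and the swap of the two A agents), so |[E_ℓ, E_ℓ]| = 2 and each transition induced by r carries weight k/2; this compensates for the fact that every unordered pair of free A agents in the mixture is found by two distinct embeddings.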
3 Reduction procedure
In this section, we assume, without any loss of generality, that Σ_ι and Σ_λ are disjoint. This can always
be achieved by taking two disjoint copies S_ι and S_λ of S and using site names in S_ι to bear internal
states, and site names in S_λ to bear binding states.
Informally, a contact map represents a summary of the agents and their potential bindings.
Definition 13. (Contact map) Given a Kappa system R, a contact map (CM) is a graph object (N, E),
where the set of nodes N consists of the agent types equipped with their corresponding interfaces, and the edges
are specified between the sites of the nodes. Formally, we have that N = {(A, Σ(A)) | A ∈ A} and
E ⊆ {((A, s), (A′, s′)) | A, A′ ∈ A and s ∈ Σ(A), s′ ∈ Σ(A′)}. If the site s of an agent of type A and the
site s′ of an agent of type A′ bear the same binding state in the rhs E_r of a rule, then there exists an edge
e = ((A, s), (A′, s′)) ∈ E between s ∈ Σ(A) and s′ ∈ Σ(A′).
We say that a site s of the agent a is tested by the rule r, if it is contained in the lhs E_ℓ of the rule r.
Definition 14. (Annotated contact map) Given a Kappa system R and its CM (N, E), a valid annotated
contact map (ACM) (N, E) is a contact map where all agents are annotated with respect to
the rule set R. The annotation on the agent of type A ∈ A with respect to the rule r is given by an
equivalence relation on its set of sites, ≈_A ⊆ Σ(A) × Σ(A), such that:
• if a rule r tests the sites s_1 and s_2 of agents a_1, a_2 (it is possible that a_1 = a_2) of type A, then
s_1 ≈_A s_2;
• if a rule r creates an agent a of type A, then all the sites of Σ(A) are in the same equivalence class,
i.e. ≈_A = Σ(A) × Σ(A).
Note that there can be several annotations of the agent type A ∈ A which satisfy the conditions. More
precisely, the conditions only require that certain pairs of sites be related, so if two equivalence relations
satisfy them, then so does their intersection. This allows us to define the smallest such equivalence relation
≈_A, which we call the minimal annotation of agent A. An ACM is minimal whenever each agent type is
annotated by its minimal annotation.
Let r be a rule and consider an ACM which is valid with respect to the singleton {r}. Then for any agent
type A ∈ A, either A does not occur in the lhs of r, or A occurs but all occurrences of A have an empty
interface, or A occurs and tests some sites which all belong to a same equivalence class C in ≈_A. In the
latter case, we define test^ACM_r(A) = C; otherwise, we define test^ACM_r(A) = ∅.
The meaning of the ACM is to summarize the dependences between sites that can occur during the
simulation of a Kappa system. If two sites s and s′ in Σ(A) are correlated by the relation ≈_A,
i.e. s ≈_A s′, it suggests that they are dependent in the following way: we must not aggregate in the same
equivalence class any two states x and x′ that contain the agent A with a different evaluation
of the sites s and s′. On the other hand, if the two sites s and s′ are not correlated by ≈_A, then we
may aggregate the states by the 'marginal' criteria, i.e. by conditions which involve only one of the
sites. Therefore, the fewer sites are related by (≈_A)_{A∈A}, the better the reduction will be. To numerically
justify this, we can imagine an agent type A whose interface has n different sites s_1,...,s_n, each with
two possible internal states. Let us observe the two limiting relations, i.e. ≈_A = {(s_i, s_j) | 1 ≤ i, j ≤ n}
and ≈′_A = {(s_i, s_i) | 1 ≤ i ≤ n}. The annotation ≈_A enforces at least 2^n states to describe all modifications
of the agent A, whereas the annotation ≈′_A suggests that it is enough to use only 2·n of them.
The ACM can be used to identify parts of Kappa species that we call fragments.
Definition 15. (Kappa fragments) A fragment is the ≡-equivalence class of a non-empty irreducible pattern
E such that: (i) the set of sites in the interface σ of an agent A(σ) in E is an equivalence class of
≈_A; (ii) sites can only be free or tagged with a binding label i ∈ N, and sites in Σ_ι are tagged with an
internal state in I; (iii) there is no occurrence of the fictitious agent ∅.
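For instance, take a hypothetical agent type A with Σ(A) = Σ_ι(A) = {x, y, z} and I = {u, p}, and suppose no rule ever tests two of its sites together, so the minimal annotation has the three classes {x}, {y}, {z}. The fragments of A are then:

```
A(x_u), A(x_p), A(y_u), A(y_p), A(z_u), A(z_p)     six fragments (2·3)
```

Had the three sites been in one ≈_A-class instead, the fragments would be the eight fully specified agents A(x_·, y_·, z_·), i.e. the 2^n versus 2·n counting discussed above.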
We can use fragments to abstract the WLTS W_R, by identifying the mixtures which have the same
(multi-)set of fragments. To reach that goal, we first overload the definition of ≡ in order to identify
mixtures having the same fragments. We introduce the binary relation ≡♯ as the smallest equivalence
relation over patterns which is compatible with ≡ and such that:

A(σ), A(σ′), E ≡♯ A(↑_C σ′, ↑_{Σ(A)\C} σ), A(↑_C σ, ↑_{Σ(A)\C} σ′), E

for any agent type A ∈ A, interfaces σ, σ′, pattern E, and C an ≈_A-equivalence class of sites. For any
set of sites X ⊆ S, the projection function ↑_X over interfaces keeps only the sites in X; formally, ↑_X is
defined by ↑_X ε = ε, ↑_X (s^λ_ι, σ′) = s^λ_ι, ↑_X σ′ whenever s ∈ X, and ↑_X (s^λ_ι, σ′) = ↑_X σ′ otherwise.
Now we define the relation ∼♯_L which stipulates that the rule r_1 applies on E_1 along φ_1 the same
way as the rule r_2 on E_2 along φ_2. More formally, we write (r_1, E_1, φ_1) ∼♯_L (r_2, E_2, φ_2) whenever the
following properties are all satisfied:
1. r_1 = r_2;
2. E_1 ≡♯ E_2;
3. φ_2 = ψ ∘ φ_1, where ψ is the permutation which tracks how the sub-interface ↑_{test^ACM_r(A_i)}(A_i(σ_i)) is
moved in the proof that E_1 ≡♯ E_2, for any agent A_i(σ_i) occurring in E_1.
More precisely, the transposition [i, i+1] is associated to an agent permutation of the agents at positions
i and i+1; the transposition [1, 2] is associated to a step which permutes the sub-interface test^ACM_r(A)
of two agents of type A, for any agent type A ∈ A; any other step is associated with the identity
function (over N). The function ψ is defined as the composition of all the permutations (in the
reverse order) which are associated to the elementary steps in the proof that E_1 ≡♯ E_2;
4. the result of the application of r_1 to E_1 along φ_1 is ≡-equivalent to the result of the application of
r_2 to E_2 along φ_2.
Definition 16. (Abstract WLTS of a Kappa system) Let R = (π_0^R, {r_1,...,r_n}) be a Kappa system. We
define the WLTS W̃_R = (X̃, L̃, w̃, π̃_0) where:
• X̃ is the set of all ≡♯-equivalence classes of mixtures;
• L̃ is the set of all ≡♯_L-equivalence classes of triples (r, E, φ) such that φ is an embedding between
the lhs E_ℓ of r and E;
• w̃(x̃, λ̃, x̃′) is equal to k/|[E_ℓ, E_ℓ]| whenever there exist a mixture E, a rule r = E_ℓ → E_r @k, and an embedding φ such
that x̃ = [E]_{≡♯} and λ̃ = [r, E, φ]_{≡♯_L}; otherwise, w̃(x̃, λ̃, x̃′) is equal to 0;
• for any x̃ ∈ X̃, π̃_0(x̃) = ∑{ π_0^R(E′) | E′ ∈ Dom(π_0^R) ∩ x̃ }.
Let us define the relation ∼ over X by [E_1]_≡ ∼ [E_2]_≡ if, and only if, E_1 ≡♯ E_2, and the relation ∼_L
over L by [λ_1]_{≡_L} ∼_L [λ_2]_{≡_L} if, and only if, λ_1 ≡♯_L λ_2.
4 Abstraction
We introduce abstractions on WLTS by aggregating the states and labels into partition classes. We
obtain a new WLTS defined over the aggregated states and labels. Each non-trivial abstraction is a loss
of information. However, some abstractions are such that it is possible to do the stochastic analysis on the
aggregates rather than on the concrete states. We address the problem of characterizing when this is possible,
and if so, how the weights in the abstracted system are computed. We also discuss the reverse process:
given the abstracted system and a particular probability distribution over the aggregates, whether we
can draw conclusions about the traces in the concrete system. We carry out the general theoretical analysis of
abstractions on WLTS, and afterwards we show the relation to the reduction of Kappa systems
presented in Sec. 3.
Definition 17. (Abstraction) Consider a WLTS W = (X, L, w, π_0), and a pair of equivalence relations
∼ ⊆ X × X and ∼_L ⊆ L × L, such that each ∼-equivalence class and each ∼_L-equivalence class is finite. We
denote the equivalence classes by x̃, l̃, and we write x ∈ x̃ to indicate that x belongs to the equivalence
class x̃, and l ∈ l̃ to indicate that the label l belongs to the equivalence class l̃.
A WLTS of the form W̃ = (X/∼, L/∼_L, w̃, π̃_0), where π̃_0(x̃) = ∑{ π_0(x) | x ∈ x̃ }, is called an abstraction
of W, induced by the pair of equivalence relations (∼, ∼_L). Note that several abstractions can be
induced by W, depending on how w̃ is defined.
Moreover, for any two cylinder sets of traces τ̃_IR ∈ T_IR(W̃) and τ_IR ∈ T_IR(W), we say that τ̃_IR =
x̃_0 --(l̃_1, I_1)--> x̃_1 ... x̃_{k−1} --(l̃_k, I_k)--> x̃_k is an abstraction of τ_IR = x_0 --(l_1, I_1)--> x_1 ... x_{k−1} --(l_k, I_k)--> x_k
whenever x_i ∈ x̃_i and l_i ∈ l̃_i for all i, and we write τ_IR ∈ τ̃_IR.
Definition 18. (Sound abstraction: Aggregation) We say that the abstraction ˜W is a sound abstraction of W if the probability of any cylinder set of traces ˜τIR ∈ TIR(˜W) is equal to the sum of the probabilities of all the cylinder sets of traces τIR ∈ TIR(W) whose abstraction is ˜τIR:

π(˜τIR) = ∑{π(τIR) | τIR ∈ ˜τIR}.
We introduce a function γ : X/∼ → (X → [0,1]), which assigns to each partition class ˜x ∈ X/∼ a probability distribution over the states that belong to this partition class. The set of all such functions γ, denoted ΓX,∼, is defined as:

ΓX,∼ = {γ | γ : X/∼ → (X → [0,1]) and ∀˜x ∈ X/∼, ∑{γ(˜x,x) | x ∈ ˜x} = 1}.

We can think of the value γ(˜x,x) as the conditional probability of being in the state x, knowing that we are in the aggregate ˜x, i.e. Pr(Xt = x | Xt ∈ ˜x) = γ(˜x,x). We note that, when thought of as a conditional probability, γ should be a time-dependent value; however, we refer to γ as a single, constant distribution. This will be justified in Lem. 1.
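To make these objects concrete, the following Python sketch (a toy three-state example of our own; all state names and probabilities are assumptions, not taken from the paper) builds the aggregated distribution ˜π0 and a family γ ∈ ΓX,∼ obtained by conditioning π0 on the aggregates:

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite state space, partition, and initial distribution.
partition = {"x1": "A", "x2": "A", "x3": "B"}          # state -> aggregate
pi0 = {"x1": Fraction(1, 4), "x2": Fraction(1, 4), "x3": Fraction(1, 2)}

# Aggregated initial distribution: pi0_tilde(x~) = sum of pi0 over the class.
pi0_tilde = defaultdict(Fraction)
for x, p in pi0.items():
    pi0_tilde[partition[x]] += p

# gamma(x~, x): conditional probability of x given its aggregate x~.
gamma = {x: pi0[x] / pi0_tilde[partition[x]] for x in pi0}

# Every class of gamma sums to 1, as required for membership in Gamma_{X,~}.
class_sums = defaultdict(Fraction)
for x, g in gamma.items():
    class_sums[partition[x]] += g
print(dict(pi0_tilde), dict(class_sums))
```

Here γ is exactly the family of conditional distributions π0|˜x used below; exact rational arithmetic avoids any rounding in the normalization check.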
Definition 19. (Complete abstraction: Deaggregation) We say that the abstraction ˜W is a complete abstraction of W for γ ∈ ΓX,∼ if the following holds: given the probability of an arbitrary abstract cylinder set of traces of length k ≥ 1 that ends in the abstract state ˜xk (written ˜τIR → ˜xk), we can recompute the probability of ending the trace in the concrete state xk ∈ ˜xk in the following way:

π(˜τIR → xk) = γ(˜xk,xk)·π(˜τIR → ˜xk).

Sound and complete abstractions ˜W cannot be induced by an arbitrary pair of relations (∼,∼L), because there might not exist a weighting function ˜w : X/∼ × L/∼L × X/∼ → R such that the conditions from Dfn. 18 and Dfn. 19 are met. Moreover, even if such a ˜w exists, the remaining question is whether the information in the abstract system is enough to compute it.
We now restate the main theorem from [12]: the abstractions of Kappa systems that we summarized in Sec. 3 are sound and complete.

Theorem 1. (The abstraction induced by the ACM is sound and complete) Given a Kappa system R = (πR0, {r1,...,rn}) and an ACM (N,E), the abstraction ˜WR = (X/∼, L/∼L, ˜w, ˜π0), induced by the partition classes (∼,∼L) ⊆ X²×L² as proposed in Dfn. 16, is a sound and complete abstraction of the WLTS WR = (X,L,w,π0), provided that, for any two mixtures M and M′ such that M ≡♯ M′, we have:

π0([M]≡)·|[M′,M′]| = π0([M′]≡)·|[M,M]|.

We consider a mixture M. We denote by x ∈ X the equivalence class [M]≡, and by ˜x ∈ ˜X the equivalence class [M]≡♯ = [x]∼. The conditional probability γ(˜x,x) is computed as the ratio between the number of automorphisms of x (embeddings between x and x) and the sum of the numbers of automorphisms of the ∼-equivalent states. Thus we have:

γ(˜x,x) = |[x,x]| / ∑{|[x′,x′]| | x ∼ x′}.

The reader can find the detailed proof in [12].
4.1 Lumpability
Now we define different versions of lumpability and investigate the relationship with sound and complete
abstractions.
Definition 20. (Lumped process) Given a WLTS W = (X,L,w,π0), where X = {x1,x2,...}, and a partition ∼ ⊆ X×X of its state space, we observe the continuous-time stochastic process {Xt}t∈R+ that is generated by W (Dfn. 2). We define the lumped process {Yt} on the state space X/∼ = {˜x1, ˜x2, ...} (aggregates are denoted by capital indices, i.e. ˜xI, ˜xJ) and with initial distribution ˜π0, so that

Pr(Yt = ˜xJ | Y0 = ˜x0) = Pr(Xt ∈ ˜xJ | X0 ∈ ˜x0).
The lumped process is not necessarily a Markov process.
Definition 21. (Lumpability) Given a WLTS W = (X,L,w,π0) that generates the process {Xt}, we say that it is lumpable with respect to the equivalence relation ∼ ⊆ X×X if, and only if, its lumped process {Yt} has the Markov property.
The evolution of a process depends on the initial distribution, and so does the lumpability property. We thus define the set of initial distributions for which lumpability holds. We denote the set of all probability distributions over X by PX:

PX = {π | π : X → [0,1] and ∑{π(x) | x ∈ X} = 1}.

Moreover, we denote by PI_{X,∼} the set of initial distributions that produce a chain which is lumpable with respect to the given equivalence relation ∼:

PI_{X,∼} := {π | the lumped process initialized with π is Markov}.

Whenever a distribution π ∈ PX is positive on the equivalence class ˜x, i.e. ∑{π(x) | x ∈ ˜x} > 0, we denote by π|˜x(x) the conditional distribution over the states of ˜x: π|˜x(x) = π(x)/π(˜x) when x ∈ ˜x, and π|˜x(x) = 0 otherwise.
Definition 22. (Strong and weak lumpability) Given a WLTS W = (X,L,w,π0) that generates the process {Xt}, and an equivalence relation ∼ ⊆ X×X, we say that {Xt} is:
• strongly lumpable with respect to ∼, if the lumped process {Yt} is Markov for any initial distribution, i.e. PI_{X,∼} = PX;
• weakly lumpable with respect to ∼, if there exists an initial distribution that makes the lumped process {Yt} Markov, i.e. PI_{X,∼} ≠ ∅.

Note that the definitions of strong and weak lumpability involve the quantifiers "for all" and "there exists" over the probability distributions over a set of states; thus, checking either of them involves in general an infinite number of checks. Sufficient conditions for strong and weak lumpability have been given for discrete-time Markov chains (DTMCs) [21, 26], and these results have been extended to the continuous-time case [4, 27]. We rephrase the sufficient conditions stated therein.

In order to understand the sense of the weak lumpability characterization, we discuss the meaning of γ. We recall the semantics of a WLTS W by observing the cylinder sets of traces, i.e. τIR = x0 →(l1,I1) x1 ... xk−1 →(lk,Ik) xk ∈ TIR(W). The abstraction ˜W of W induced by (∼,∼L) generates an abstract cylinder set of traces, denoted ˜τIR = ˜x0 →(˜l1,I1) ˜x1 ... ˜xk−1 →(˜lk,Ik) ˜xk ∈ TIR(˜W).
For any cylinder set of traces ˜τIR ∈ TIR(˜W), we denote by γ˜τIR the conditional distribution over the states of the lumped state ˜xk, given that the abstract cylinder set of traces ˜τIR, which ends in the abstract state ˜xk, was observed, i.e.

γ˜τIR(xk) = π(˜τIR → xk) / π(˜τIR).

The definition of the complete abstraction suggests that, if γ˜τIR were independent of the trace ˜τIR on which it is conditioned, then completeness would hold.
Theorem 2. (Lumpability on CTMCs) Let us observe a WLTS W = (X,L,w,π0) that generates the process {Xt}, and an equivalence relation ∼ ⊆ X×X. We consider the rate matrix R : X×X → R. If the lumped process is Markov, we denote its rate matrix by ˜R : X/∼ × X/∼ → R. Then we have the following characterizations of the lumped process {Yt}:

• If for all xi1, xi2 ∈ X such that xi1 ∼ xi2, and for all ˜xJ ∈ X/∼, we have that

∑{R(xi1,xj) | xj ∈ ˜xJ} = ∑{R(xi2,xj) | xj ∈ ˜xJ},    (2)

then {Xt} is strongly lumpable with respect to ∼, and we have:

˜R(˜xI,˜xJ) = ∑{R(xi1,xj) | xj ∈ ˜xJ};

• If there exists a family of probability distributions over the lumped states, γ ∈ ΓX,∼, such that for all xj1, xj2 ∈ X with xj1 ∼ xj2, and for all ˜xI ∈ X/∼, we have that

a(xj1) = a(xj2) and ∑{γ(˜xI,xi)·R(xi,xj1) | xi ∈ ˜xI} / γ(˜xJ,xj1) = ∑{γ(˜xI,xi)·R(xi,xj2) | xi ∈ ˜xI} / γ(˜xJ,xj2),    (3)

where ˜xJ denotes the common class of xj1 and xj2, then:

1. if the distribution γ is in accordance with π0, i.e. π0|X/∼ = γ, then, for any finite sequence of states (x0,...,xk) ∈ X^(k+1) and any sequence of time intervals (I1,...,Ik) ∈ IR^k, consider the set ˜τIR of the traces of the form x′0 →(l1,t1) x′1 ... x′k−1 →(lk,t1+...+tk) x′k, where xi ∼ x′i for all 0 ≤ i ≤ k, and ti ∈ Ii and li ∈ L for all 1 ≤ i ≤ k; we have that if π(˜τIR) > 0 then γ˜τIR = γ. In other words, knowing that we are in state ˜xI, the conditional probability of being in a state x ∈ ˜xI is invariant in time;

2. the process {Xt} is weakly lumpable with respect to ∼. Moreover, we have:

˜R(˜xI,˜xJ) = ∑{γ(˜xI,xi)·R(xi,xj2) | xi ∈ ˜xI} / γ(˜xJ,xj2).
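Both sufficient conditions are finitely checkable on a given rate matrix. The following Python sketch (a toy two-aggregate CTMC of our own; all state names and rates are assumptions, not taken from the paper) tests condition (2) and the γ-weighted condition (3):

```python
from itertools import combinations

# Toy CTMC (our own example): states a1,a2 aggregate to A, and b1,b2 to B.
R = {("a1", "b1"): 1.0, ("a1", "b2"): 2.0,
     ("a2", "b1"): 2.0, ("a2", "b2"): 1.0,
     ("b1", "a1"): 1.0, ("b2", "a2"): 1.0}
classes = {"A": ["a1", "a2"], "B": ["b1", "b2"]}
rate = lambda x, y: R.get((x, y), 0.0)
activity = lambda x: sum(r for (s, _), r in R.items() if s == x)

def strong_condition():
    # Condition (2): ~-equivalent states have equal total rate into each class.
    return all(
        sum(rate(x1, y) for y in ys) == sum(rate(x2, y) for y in ys)
        for xs in classes.values() for x1, x2 in combinations(xs, 2)
        for ys in classes.values())

def weak_condition(gamma):
    # Condition (3): equal activities and equal normalised gamma-weighted
    # incoming rates for ~-equivalent target states.
    for ys in classes.values():
        for y1, y2 in combinations(ys, 2):
            if activity(y1) != activity(y2):
                return False
            for xs in classes.values():
                d1 = sum(gamma[x] * rate(x, y1) for x in xs) / gamma[y1]
                d2 = sum(gamma[x] * rate(x, y2) for x in xs) / gamma[y2]
                if d1 != d2:
                    return False
    return True

uniform = {x: 1 / len(xs) for xs in classes.values() for x in xs}
print(strong_condition(), weak_condition(uniform))  # True True
```

In this toy chain both conditions hold; removing, for instance, the transition ("b2","a2") breaks the weak condition, because the activities of b1 and b2 then differ.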
One shall notice that Thm. 2 gives a weaker condition than the completeness of the WLTS abstraction (cf. Dfn. 19). The main reason is that we do not 'track' the transition labels, in the sense that we observe the abstraction on the cylinder sets of traces induced only by ∼, and not also by ∼L. Yet, in the particular case when the states fully determine the transition labels (i.e., if w(x1,l1,x′1) > 0, w(x2,l2,x′2) > 0, x1 ∼ x2 and x′1 ∼ x′2 imply l1 ∼L l2), the given condition for weak lumpability coincides with the definition of the complete abstraction of the WLTS.

The characterization of weak (resp. strong) lumpability given in Thm. 2 is a sufficient, but not a necessary condition: there exist systems which are strongly or weakly lumpable, but do not satisfy the conditions given in the theorem. Interestingly, there are systems that the characterization from Thm. 2 would detect as strongly, but not weakly, lumpable, which is counter-intuitive given the terminology. One shall also notice that the conditions of Thm. 2 imply that, in order to aggregate two states of the CTMC, they must not have different waiting times until the next transition (i.e. they should have the same activity). This is stated explicitly in the characterization of weak lumpability, and it can be obtained by summation over the outgoing class in the characterization of strong lumpability.
We consider a WLTS W = (X,L,w,π0) and the set of all equivalence relations ∼ on X, denoted PTX. We introduce the subsets PS, PW, CS, CW of PTX with the following meaning: (i) PS is the set of all equivalence relations such that {Xt} is strongly lumpable with respect to ∼; (ii) PW is the set of all equivalence relations such that {Xt} is weakly lumpable with respect to ∼; (iii) CS is the set of all equivalence relations such that {Xt} satisfies the condition for strong lumpability given in Thm. 2; (iv) CW is the set of all equivalence relations such that {Xt} satisfies the condition for weak lumpability given in Thm. 2.
Lemma 1. (Relations between lumpability properties and conditions) Consider an arbitrary WLTS W = (X,L,w,π0) and an equivalence relation ∼ ⊆ X×X. We have the following relations: (1) if ∼ ∈ PS then ∼ ∈ PW; if ∼ ∈ CS then ∼ ∈ PS; and if ∼ ∈ CW then ∼ ∈ PW; the converse implications do not hold for any of the statements; (2a) ∼ ∈ CW does not imply ∼ ∈ CS; (2b) ∼ ∈ CS does not imply ∼ ∈ CW.

Proof. The statements in part (1) follow directly from Dfn. 21 and Thm. 2. To show (2a) and (2b), we consider the WLTS W specified in Fig. 1(a), with the state space X = {x,y1,y2,y3,z1,z2,z3}. Let ∼1 be the equivalence relation on X such that y1 ∼1 y2 and z1 ∼1 z2. By lumping the states by ∼1, we get the system ˜W1, as shown in Fig. 1(b). It is easy to check that ∼1 ∈ CS. Moreover, we have that ∼1 ∈ CW, since the weak lumpability condition is satisfied for the γ which is equal to 1 on the singleton classes and equal to (0.5,0.5) on the lumped classes; so ∼1 ∈ CS ∩ CW. We further lump the states y12 and y3, by taking the transitive closure ∼2 = tc(∼1 ∪ {(y1,y3)}) (Fig. 1(c)), which yields the system ˜W2. This lumping is such that ∼2 ∉ CS, because we have y1 ∼ y3, but w(y1,l,z12) > 0 and w(y3,l,z12) = 0. On the other hand, for the γ which assigns (1/3,1/3,1/3) to the class y123, we argue that ∼2 ∈ CW. Therefore, if the initial distribution is in accordance with γ, the abstraction is sound and complete.

If we rather lump z12 and z3, by ∼3 = tc(∼1 ∪ {(z1,z3)}), we get the system ˜W3 (Fig. 1(d)). This system is such that ∼3 ∈ CS\CW. More precisely, we cannot find a γ which would witness ∼3 ∈ CW: if such a γ existed, we would have γ(˜x,x) = 1, and consequently γ(˜y12) = (0.5,0.5) and γ(˜y3) = 1. This implies that the conditional distribution γ(˜z123) cannot be invariant in time: it will alternate between the distributions (0,0,1) and (0.5,0.5,0), depending on the choice made in x. Note that, since ∼3 ∈ CS, it follows that ∼3 ∈ PS, and this implies ∼3 ∈ PW.

This discussion indicates that, if we decide to check for weak lumpability instead of strong lumpability by using the characterization from Thm. 2, it might happen that we eliminate aggregations that are strongly lumpable. In the case of the reductions of Kappa systems, we will use the weak lumpability characterization.
[Figure 1 depicts the concrete system W over the states x, y1, y2, y3, z1, z2, z3, together with its three lumped variants.]

Figure 1: Different abstractions of system W: (a) W: the concrete system; (b) ˜W1; ∼1 ∈ PS∩PW∩CS∩CW; (c) ˜W2; ∼2 ∈ CW\CS; (d) ˜W3; ∼3 ∈ CS\CW.
4.2 Bisimulations

Aiming to define an algorithm that abstracts the WLTS of a Kappa system, we start by recasting the lumpability properties as bisimulation notions. Bisimulation is typically defined by logically characterizing the distinguishing property of the states that may be aggregated.

We define three kinds of bisimulation relations on WLTS, which are based on the lumpability characterizations given in Thm. 2. We adopt the terminology of [5]. The forward bisimulations arise from the characterization of strong lumpability: bisimilar states have the same forward behaviour, in the sense that they target any other lumped state with the same total affinity (total outgoing rate). This concept is well established for dependability and performance analysis [16, 15]. What we use in the abstractions of Kappa systems is backward bisimulation: bisimilar states have the same backward behaviour, in the sense that they are reached from the predecessors in one lumped state with the same probabilistic quantity, which becomes the rate in the abstract system. It is, however, less established, and it has been applied in only a few approaches to stochastic modelling [30]. The backward uniform bisimulation is an instance of backward bisimulation with the additional constraint that only equally probable states may be aggregated.
Definition 23. Given a WLTS W = (X,L,w,π0), and a pair (∼,∼L) of equivalence relations respectively over X and L, we define a function δF : X × L/∼L × X/∼ → R≥0 by:

δF(xi,˜l,˜xj) = ∑{|w(xi,l,xj)| | l ∈ ˜l and xj ∈ ˜xj}.

Furthermore, given a family of probability distributions over the partitions, γ ∈ ΓX,∼, we define the quantity δB : X/∼ × L/∼L × X → R≥0 by:

δB(˜xi,˜l,xj) = ∑{γ(˜xi,xi)·|w(xi,l,xj)| | l ∈ ˜l, xi ∈ ˜xi} / γ(˜xj,xj).

Specifically, if γ is the uniform distribution over the equivalence classes, we can express the latter quantity in terms of the cardinalities of the equivalence classes:

δBU(˜xi,˜l,xj) = (|˜xj|/|˜xi|)·∑{|w(xi,l,xj)| | l ∈ ˜l, xi ∈ ˜xi}.
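The quantities δF and δBU are directly computable from the weight function. A minimal Python sketch (toy weights and a single label class; the states and numbers are our own assumptions, not from the paper):

```python
# Toy WLTS (our own example): weights w[(x, l, x')] with one label class.
w = {("a1", "l", "b1"): 1.0, ("a1", "l", "b2"): 2.0,
     ("a2", "l", "b1"): 2.0, ("a2", "l", "b2"): 1.0}
classes = {"A": ["a1", "a2"], "B": ["b1", "b2"]}
label_class = ["l"]

def delta_F(x, lcls, target_cls):
    # Total weight from the state x into the aggregate target_cls.
    return sum(w.get((x, l, y), 0.0)
               for l in lcls for y in classes[target_cls])

def delta_BU(src_cls, lcls, y):
    # Uniform-backward quantity: |x~j|/|x~i| times the total incoming weight.
    xs = classes[src_cls]
    ycls = next(c for c, ys in classes.items() if y in ys)
    total = sum(w.get((x, l, y), 0.0) for l in lcls for x in xs)
    return len(classes[ycls]) / len(xs) * total

print(delta_F("a1", label_class, "B"), delta_BU("A", label_class, "b1"))
```

In this toy system δF(a1,˜l,B) = δF(a2,˜l,B) and δBU(A,˜l,b1) = δBU(A,˜l,b2), so the pair of relations is a candidate for both the forward and the backward uniform bisimulation defined next.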
Definition 24. (Forward and backward Markov bisimulation) Given a WLTS W = (X,L,w,π0), and a pair (∼,∼L) of equivalence relations respectively over X and L, we say that (∼,∼L) is a:

1. forward Markov bisimulation, if for all xi and xj the following is satisfied: xi ∼ xj iff, for all equivalence classes ˜x ∈ X/∼ and ˜l ∈ L/∼L, we have that a(xi) = a(xj) and δF(xi,˜l,˜x) = δF(xj,˜l,˜x). Remark: note that this entails bisimulation in the classical sense: if xi has a successor in some class, then xj has one as well, and the transitions are related by appropriate labels (and, in this case, probabilities);

2. backward Markov bisimulation, if there exists a γ ∈ ΓX,∼ such that for all xi and xj the following is satisfied: xi ∼ xj iff, for all equivalence classes ˜x ∈ X/∼ and ˜l ∈ L/∼L, we have that a(xi) = a(xj) and δB(˜x,˜l,xi) = δB(˜x,˜l,xj).

Theorem 3. (Forward Markov bisimulation implies sound abstraction) Let W = (X,L,w,π0) be a WLTS. If (∼,∼L) induces a forward Markov bisimulation, then for any aggregates ˜xi, ˜l, and ˜xj we can define

˜w(˜xi,˜l,˜xj) = δF(xi,˜l,˜xj).

The so-defined abstraction ˜W = (X/∼, L/∼L, ˜w, ˜π0) is sound. We then say that W refines ˜W by a forward Markov bisimulation (∼,∼L), written W ⊵F,(∼,∼L) ˜W.

Theorem 4. (Backward Markov bisimulation implies sound and complete abstraction) Given a WLTS W = (X,L,w,π0), if (∼,∼L) induces a backward Markov bisimulation with conditional probabilities over the aggregates γ ∈ ΓX,∼, then for any aggregates ˜xi, ˜l, and ˜xj we can define

˜w(˜xi,˜l,˜xj) = δB(˜xi,˜l,xj).    (4)

If γ(˜x) = π0|˜x, then the so-defined abstraction ˜W = (X/∼, L/∼L, ˜w, ˜π0) is sound and complete. We then say that W refines ˜W by a backward Markov bisimulation (∼,∼L) with conditional distributions γ, written W ⊵B,(∼,∼L),γ ˜W. In particular, if we know that γ is uniform, equation (4) becomes ˜w(˜xi,˜l,˜xj) = δBU(˜xi,˜l,xj), written also W ⊵BU,(∼,∼L) ˜W.
4.3 Proving bisimulations

The forward bisimulation relation for abstracting transition systems with CTMC semantics is well established and used in applications (e.g., [16, 15]). Moreover, an algorithm for computing the backward bisimulation when γ is uniform is given in [5, 30]. It is based on an alternative characterization of the backward uniform Markov bisimulation, which eases the analysis.
Lemma 2. (Proving backward uniform Markov bisimulation) Let W = (X,L,w,π0) be a WLTS and (∼,∼L) a pair of equivalence relations respectively over X and L. For any state x′ ∈ X, and any pair of classes ˜x ∈ X/∼ and ˜l ∈ L/∼L, let us define the set Pred(˜x,˜l,x′) of transitions from a state in ˜x to the state x′ with a label in ˜l as follows:

Pred(˜x,˜l,x′) = {(x,l) ∈ ˜x × ˜l | w(x,l,x′) > 0}.

Assume that: (1) π0|X/∼ = ˜π0, and (2) for any x′i, x′j ∈ X such that x′i ∼ x′j, and any ˜x ∈ X/∼ and ˜l ∈ L/∼L, there exists a bijective map φ between Pred(˜x,˜l,x′i) and Pred(˜x,˜l,x′j), such that for any (xi,li) ∈ Pred(˜x,˜l,x′i), if φ(xi,li) = (xj,lj), then we have that w(xi,li,x′i) = w(xj,lj,x′j). Then W refines the abstraction ˜W = (X/∼, L/∼L, ˜w, ˜π0) by a backward uniform Markov bisimulation, i.e. W ⊵BU,(∼,∼L) ˜W.
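The criterion of Lem. 2 reduces to comparing multisets of predecessor weights: a weight-preserving bijection between Pred(˜x,˜l,x′i) and Pred(˜x,˜l,x′j) exists if, and only if, the two multisets of transition weights coincide. A Python sketch (toy system and encoding of our own, not from the paper):

```python
from collections import Counter

# Toy WLTS (our own example); a weight-preserving bijection between two
# Pred sets exists iff their multisets of transition weights coincide.
w = {("a1", "l", "b1"): 1.0, ("a2", "l", "b1"): 2.0,
     ("a1", "l", "b2"): 2.0, ("a2", "l", "b2"): 1.0}
classes = {"A": ["a1", "a2"], "B": ["b1", "b2"]}

def pred_weights(src_cls, lcls, target):
    # Multiset of weights of the transitions from src_cls into `target`
    # with a label in lcls, i.e. Pred(x~, l~, target) up to the bijection.
    return Counter(v for (x, l, y), v in w.items()
                   if y == target and x in classes[src_cls] and l in lcls)

# Condition (2) of Lem. 2 for the pair b1 ~ b2:
print(pred_weights("A", ["l"], "b1") == pred_weights("A", ["l"], "b2"))
```

Representing each Pred set by a Counter sidesteps the explicit search for the bijection φ: equal multisets always admit one.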
[Figure 2 depicts the proof scheme of Lem. 3: Wi ⊵BU,(∼1,∼L1) W and Wi ⊵BU,(∼2,∼L2) ˜W together imply W ⊵B,(∼,∼L),γ ˜W.]

Figure 2: Proving backward refinement
On the other hand, as soon as γ over the aggregates is not uniform, we cannot observe a bijection between the predecessors of the states, and proving that a given abstraction is a backward bisimulation cannot be done unless we have the right 'guess' of the distributions γ. Lem. 3 states how to avoid proving a backward bisimulation directly, by instead proving two backward uniform bisimulations. More precisely, if we want to prove the backward refinement between the systems W and ˜W, it is enough to observe a system Wi which is a backward uniform refinement of both of the systems W and ˜W (Fig. 2).

Lemma 3. (Proving backward Markov bisimulation) Consider a WLTS W = (X,L,w,π0), and any aggregation relation (∼,∼L), such that ˜W = (X/∼, L/∼L, ˜w, ˜π0). We assume that there exist a system Wi = (Xi,Li,wi,πi0) and two pairs of equivalence relations (∼1,∼L1) and (∼2,∼L2), such that Wi ⊵BU,(∼1,∼L1) W, Wi ⊵BU,(∼2,∼L2) ˜W, ∼1 ⊆ ∼2 (in the sense that, for any xi1, xi2 ∈ Xi, xi1 ∼1 xi2 ⇒ xi1 ∼2 xi2), ∼L1 ⊆ ∼L2, and, for any xi1, xi2 such that xi1 ∼2 xi2, the number of states which are ∼1-equivalent to xi1 is equal to the number of states which are ∼1-equivalent to xi2. Under this assumption, we have that W ⊵B,(∼,∼L),γ ˜W, where γ is defined, for any xi0 such that [xi0]∼1 = x, as

γ(˜x,x) = |{xi ∈ Xi | xi ∼1 xi0}| / |{xi ∈ Xi | xi ∼2 xi0}|.

This lemma contains the key observation for the abstraction of Kappa systems, and for proving Thm. 1; it thus completes the intention of the theoretical analysis in this paper. More precisely, we observe the WLTS WR of a given Kappa system R, as defined in Dfn. 12, and its abstraction generated as proposed in the reduction procedure (Sec. 3, Dfn. 16). The main observation is that the system WR is already an abstraction. More concretely, the states of WR are multisets of species, and as such they abstract away the individual species. For example, a state that contains two agents of type A(su) abstracts away the potential individual behaviour of these two agents, say A1(su) and A2(su). To show that the abstraction is sound and complete, we observe the system Wi, which is the individual-based semantics of the Kappa system, where each individual agent is uniquely identified. The backward uniform refinement between Wi and W is established by the modelling assumptions. We are left to prove the backward uniform refinement between Wi and ˜W; this is done by inspection of the ACMs (Dfn. 14).
4.4 Example

We consider the following Kappa system. We have the agent types A = {A,B}, the site names {s,t}, the signatures Σι(A) = Σι(B) = {s} and Σλ(A) = Σλ(B) = {t}, and the alphabet of internal states I = {u,p}. The contact map is defined by (N,E), such that N = {(A,s),(A,t),(B,s),(B,t)} and E = {((A,t),(B,t))}, and we have the following rules:

r1 : A(su) ↔ A(sp) @ k1, k1−
r2 : B(su) ↔ B(sp) @ k2, k2−
r3 : A(t), B(t) ↔ A(t¹), B(t¹) @ k3, k3−

Moreover, using the minimal ACM for annotating the agents, as written in Dfn. 14, we get that ≈A has the two equivalence classes {s} and {t}, and that ≈B has the two equivalence classes {s} and {t} as well. The fragments derived from the ACM (Dfn. 15) are the following: F1 = A(su), F2 = A(sp), F3 = A(t), F4 = A(t¹), B(t¹), F5 = B(su), F6 = B(sp), F7 = B(t).

Let us pick a (finite) initial distribution π0. Now we observe the WLTS W = (X,L,w,π0) assigned to the Kappa system RAB (introduced in Dfn. 12), and the state y which is the ≡-equivalence class of the mixture Ey defined as follows:

A(sp,t¹), B(sp,t¹), A(su,t²), B(su,t²), A(su,t³), B(su,t³)

Up to ≡, the unique other mixture which is ≡♯-equivalent, but not ≡-equivalent, to Ey is the mixture Ey′ defined as follows:

A(sp,t¹), B(su,t¹), A(su,t²), B(sp,t²), A(su,t³), B(su,t³)

We denote y′ = [Ey′]≡; note that [Ey′]≡♯ equals ˜y = [Ey]≡♯. We now compute the distribution γ over the states of the aggregate ˜y: we have γ(˜y,y) = 1/3 and γ(˜y,y′) = 2/3. Roughly speaking, this comes from the following counting argument: annotate the fragments of type A and of type B in ˜y with the identifiers 1, 2, 3 (there are 36 possible annotations), and assume that agents with the same identifier are bound together. Then there are 12 annotations in which the phosphorylated A and B are bound together, and 24 in which this is not the case. A more detailed analysis of this model is given in [12].
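The counting argument can be replayed mechanically. The sketch below (our own encoding of the annotations: the three A-fragments and the three B-fragments of ˜y are listed by the internal state of their site s) enumerates the 36 annotations and counts those in which the phosphorylated A and B share a bond:

```python
from itertools import permutations

# Encoding (assumed): the aggregate y~ contains three A-fragments and three
# B-fragments; exactly one A and one B carry a phosphorylated site s.
# An annotation assigns identifiers 0..2 to the A's and to the B's; agents
# with the same identifier are bound together (rule r3 creates the A-B bond).
A_sites = ("p", "u", "u")   # internal state of site s for the three A agents
B_sites = ("p", "u", "u")   # internal state of site s for the three B agents

together = 0  # annotations in which the phosphorylated A and B share a bond
total = 0
for pa in permutations(range(3)):       # identifiers for the A agents
    for pb in permutations(range(3)):   # identifiers for the B agents
        total += 1
        ida = pa[A_sites.index("p")]    # identifier of the phosphorylated A
        idb = pb[B_sites.index("p")]    # identifier of the phosphorylated B
        if ida == idb:
            together += 1

print(total, together, total - together)  # 36 12 24
```

The ratio 12/36 = 1/3 is exactly γ(˜y,y).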
5 Case study
In this section, we apply our framework to a case study: we have refactored in Kappa the model of a crosstalk between the EGF receptor and the insulin receptor pathways described in [6]. In this model, two kinds of receptors, the EGF receptor (EGFR) and the insulin receptor (IR), can recruit a protein called Sos, which can be phosphorylated, or not. Each kind of receptor has its own pathway, and these two pathways share some common proteins.

The contact map is given in Fig. 3. One can notice that some sites can be bound to several other sites, which denotes a competition (concurrency). Moreover, the site d of an EGF receptor (EGFR) can be bound to the site d of another EGFR (since there is a loop in the contact map). The rules are given in Table 1. We do not give the rates of the rules, and we make no hypotheses on them (rates are parameters; some of them might be equal, or not). Moreover, each rule is reversible.

We roughly explain how each pathway works, focusing on the forward direction of the rules. First, we describe how EGFR can recruit a transport molecule (Grb). EGFR can recruit a ligand (EGF) on site a (r01,r02), and two EGFRs can form a dimer (r03,r04,r05). We have used two rules to encode the EGF/EGFR binding, in order to model the fact that the rate of association may depend on whether EGFR is in a dimer, or not. In the same way, the rate of dimer formation/dissociation may depend on the number of ligands that are bound to the receptors. The site b of EGFR can be phosphorylated (r06,r07) at a rate which depends on whether the receptor is in a dimer, or not. Then, EGFR can recruit an adapter molecule called Shc (r08), and EGFR can phosphorylate Shc (r09,r10) (the rate depends on whether the receptor is still in a dimer, or not). Shc can then recruit a transport molecule (Grb) (r11). Yet, each receptor has a shorter way to recruit a transport molecule: the site c of EGFR can be phosphorylated (r12,r13), and then recruit Grb directly (r14,r15).

Then, we describe how an insulin receptor (IR) can recruit Grb. IR can recruit insulin molecules (Ins) on the two sites a (r16,r17) and b (r18,r19) (the rate may depend on whether an insulin molecule
Figure 3: Contact map for the EGFR/Insulin crosstalk.
r01: EGF(a), EGFR(a,d) ↔ EGF(a¹), EGFR(a¹,d)
r02: EGF(a), EGFR(a,d−) ↔ EGF(a¹), EGFR(a¹,d−)
r03: EGFR(a,d), EGFR(a−,d) ↔ EGFR(a,d¹), EGFR(a−,d¹)
r04: EGFR(a,d), EGFR(a,d) ↔ EGFR(a,d¹), EGFR(a,d¹)
r05: EGFR(a−,d), EGFR(a−,d) ↔ EGFR(a−,d¹), EGFR(a−,d¹)
r06: EGFR(bu,d) ↔ EGFR(bp,d)
r07: EGFR(bu,d−) ↔ EGFR(bp,d−)
r08: EGFR(bp), Shc(a) ↔ EGFR(b¹p), Shc(a¹)
r09: EGFR(b¹p,d), Shc(a¹,bu) ↔ EGFR(b¹p,d), Shc(a¹,bp)
r10: EGFR(b¹p,d−), Shc(a¹,bu) ↔ EGFR(b¹p,d−), Shc(a¹,bp)
r11: Grb(a), Shc(bp) ↔ Grb(a¹), Shc(b¹p)
r12: EGFR(cu,d) ↔ EGFR(cp,d)
r13: EGFR(cu,d−) ↔ EGFR(cp,d−)
r14: EGFR(cp,d), Grb(a) ↔ EGFR(c¹p,d), Grb(a¹)
r15: EGFR(cp,d−), Grb(a) ↔ EGFR(c¹p,d−), Grb(a¹)
r16: IR(a,b), Ins(a) ↔ IR(a¹,b), Ins(a¹)
r17: IR(a,b−), Ins(a) ↔ IR(a¹,b−), Ins(a¹)
r18: IR(a,b), Ins(a) ↔ IR(a,b¹), Ins(a¹)
r19: IR(a−,b), Ins(a) ↔ IR(a−,b¹), Ins(a¹)
r20: IR(a,b,cu) ↔ IR(a,b,cp)
r21: IR(a−,b,cu) ↔ IR(a−,b,cp)
r22: IR(a,b−,cu) ↔ IR(a,b−,cp)
r23: IR(a−,b−,cu) ↔ IR(a−,b−,cp)
r24: IR(cp), Shc(a) ↔ IR(c¹p), Shc(a¹)
r25: IR(a−,b−,c¹), Shc(a¹,bu) ↔ IR(a−,b−,c¹), Shc(a¹,bp)
r26: IR(a,b,du) ↔ IR(a,b,dp)
r27: IR(a−,b,du) ↔ IR(a−,b,dp)
r28: IR(a,b−,du) ↔ IR(a,b−,dp)
r29: IR(a−,b−,du) ↔ IR(a−,b−,dp)
r30: IR(dp), IRS(a) ↔ IR(d¹p), IRS(a¹)
r31: IR(a−,b−,d¹), IRS(a¹,bu) ↔ IR(a−,b−,d¹), IRS(a¹,bp)
r32: Grb(a), IRS(bp) ↔ Grb(a¹), IRS(b¹p)
r33: Grb(b), Sos(du) ↔ Grb(b¹), Sos(d¹u)
r34: Grb(b), Sos(dp) ↔ Grb(b¹), Sos(d¹p)
r35: Grb(b¹), Sos(d¹u) ↔ Grb(b¹), Sos(d¹p)
r36: Sos(du) ↔ Sos(dp)
r37: Shc(bu) ↔ Shc(bp)
r38: IRS(bu) ↔ IRS(bp)

Table 1: Rule set for the EGFR/Insulin crosstalk.
has already been recruited). The site c of IR can be phosphorylated (r20,r21,r22,r23) at a rate which depends on the number of recruited insulin molecules (in practice the rates of the rules r21 and r22 are the same). Then, IR can recruit the adapter Shc (r24). Whenever IR is also bound to two insulin molecules, Shc can be phosphorylated (r25). Shc can then recruit Grb (r11). Yet, IR has another way of recruiting a Grb: the site d of IR can be phosphorylated (r26,r27,r28,r29), and then recruit another adapter called IRS (r30), which can be activated when the insulin receptor is bound to two insulin molecules (r31). Then, IRS can recruit Grb (r32).

Independently, Grb can recruit a protein Sos (r33,r34), and Sos can be activated (r35,r36) at a rate which may depend on whether it is bound to a Grb, or not. The remaining rules describe the spontaneous (de)phosphorylation of Shc (r37) and IRS (r38).
In this model, 2,768 different complexes may occur. This number is mainly due to the fact that each dimer made of two EGFR proteins has 4 sites (the sites b and c of each EGFR) to recruit a Grb, which induces a small combinatorial blow-up. Scanning the set of rules, one can notice that no rule tests both the site a and the site b of a protein Grb. Thus, the partition {{a},{b}} can be used safely for the sites of Grb in the annotated contact map. As a consequence, the number of fragments is only 609. Unfortunately, this is the only reduction that we can do (i.e., the partition for the sites of any other kind of protein is the coarsest one).

Last, one can notice that, given some additional hypotheses on the rates of some rules, the sites a and b of IR play a symmetric role in the system. We could exploit this symmetry to reduce the set of considered fragments, by identifying pairs of symmetric fragments, such as IR(a¹,b), Ins(a¹) and IR(a,b¹), Ins(a¹).
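The site-partition argument for Grb can be checked mechanically by recording, for each rule, which sites of Grb it tests. A sketch (the rule-to-site map below is transcribed by hand from Table 1 and is our own encoding):

```python
# Sites of Grb tested by each rule that mentions Grb (from Table 1).
grb_tests = {
    "r11": {"a"}, "r14": {"a"}, "r15": {"a"}, "r32": {"a"},
    "r33": {"b"}, "r34": {"b"}, "r35": {"b"},
}
# The partition {{a},{b}} is safe iff no rule tests both sites simultaneously.
safe = all(len(sites) < 2 for sites in grb_tests.values())
print(safe)  # True
```

Since no rule tests both sites at once, splitting the interface of Grb into {a} and {b} is sound, which is what brings the number of fragments down to 609.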
6 Conclusions

Reducing the complexity of combinatorial reaction mixtures is an important milestone towards the simulation and analysis of large-scale realistic models of cellular signal transduction. In this paper we study a scalable reduction method that is applicable to any rule-based specification. The reduction is sound and, moreover, complete, i.e. the sample traces of individual molecular species can be reconstructed from the traces of aggregated species in the reduced model. We put this method into the general context of abstractions of probabilistic transition systems and show that it yields a sufficient condition for weak lumpability and that it is equivalent to a backward Markov bisimulation. The reduction factor depends on the number of independent molecular events and is strictly smaller than that of the less-demanding reduction based on the differential semantics.
A compelling problem for future work is thus to analyze differential fragments in the context of
stochastic semantics and to obtain error bounds for this reduction as a function of the kinetic parameters
of the system.
References
[1] Christel Baier, Boudewijn Haverkort, Holger Hermanns & Joost-Pieter Katoen (2003): Model-checking algorithms for continuous-time Markov chains. IEEE Transactions on Software Engineering 29(7).
[2] Christel Baier & Holger Hermanns (1999): Weak Bisimulation for Fully Probabilistic Processes.
[3] Michael L. Blinov, James R. Faeder & William S. Hlavacek (2004): BioNetGen: software for rule-based
modeling of signal transduction based on the interactions of molecular domains. Bioinformatics 20, pp.
3289–3292.
[4] Peter Buchholz (1994): Exact and Ordinary Lumpability in Finite Markov Chains. Journal of Applied Probability 31(1), pp. 59–75.
[5] Peter Buchholz (2008): Bisimulation relations for weighted automata. Theoretical Computer Science 393(1-3), pp. 109–123.
[6] Holger Conzelmann, Dirk Fey & Ernst D. Gilles (2008): Exact model reduction of combinatorial reaction networks. BMC Systems Biology 2(78), pp. 342–351.
[7] Vincent Danos, Jérôme Feret, Walter Fontana, Russel Harmer & Jean Krivine (2008): Rule-based modelling,
symmetries, refinements. In: FMSB’08, LNBI.
[8] Vincent Danos & Cosimo Laneve (2003): Core Formal Molecular Biology. Theoretical Computer Science
325, pp. 69–110.
[9] Josée Desharnais, Abbas Edalat & Prakash Panangaden (2002): Bisimulation for labelled Markov processes.
Inf. Comput. 179(2), pp. 163–193.
[10] Laurent Doyen, Thomas A. Henzinger & Jean-Francois Raskin (2008): Equivalence of Labeled Markov
Chains. Int. J. Found. Comput. Sci. 19(3), pp. 549–563. Available at http://dx.doi.org/10.1142/
S0129054108005814.
[11] Jérôme Feret, Vincent Danos, Jean Krivine, Russ Harmer & Walter Fontana (2009): Internal coarse-graining
of molecular systems. Proc. Natl. Acad. Sci. USA 106(16), pp. 6453–6458. Available at http://dx.doi.
org/10.1073/pnas.0809908106.
[12] Jerome Feret, Heinz Koeppl & Tatjana Petrov: Stochastic fragments: A framework for the exact reduction of the stochastic semantics of rule-based models. International Journal of Software and Informatics. To appear.
[13] Donald Gross & Douglas R. Miller (1984): The Randomization Technique as a Modeling Tool and Solution Procedure for Transient Markov Processes. Operations Research 32(2), pp. 343–361.
[14] Thomas A. Henzinger, Maria Mateescu & Verena Wolf (2009): Sliding Window Abstraction for Infinite
Markov Chains. In: CAV, pp. 337–352.
[15] Holger Hermanns (2002): Interactive Markov Chains And the Quest for Quantified Quality. Ph.D. thesis,
University of Marburg.
[16] Jane Hillston (1996): A compositional approach to performance modelling. Cambridge University Press,
New York, NY, USA.
[17] William S. Hlavacek, James R. Faeder, Michael L. Blinov, Alan S. Perelson & Byron Goldstein (2005): The complexity of complexes in signal transduction. Biotechnol. Bioeng. 84, pp. 783–794.
[18] William S. Hlavacek, James R. Faeder, Michael L. Blinov, Richard G. Posner, Michael Hucka & Walter
Fontana (2006): Rules for Modeling Signal-Transduction Systems. Science’s STKE 2006(344).
[19] Jianjun Paul Tian & D. Kannan (2006): Lumpability and Commutativity of Markov Processes. Stochastic Analysis and Applications 24(3), pp. 685–702.
[20] J. G. Kemeny, J. L. Snell & A. W. Knapp (1976): Denumerable Markov Chains. Springer-Verlag, New York,
NY, USA.
[21] John Kemeny & James L. Snell (1960): Finite Markov Chains. Van Nostrand.
[22] Thomas G. Kurtz (1971): Limit Theorems for Sequences of Jump Markov Processes Approximating Ordinary
Differential Processes. Journal of Applied Probability 8(2), pp. 344–356.
[23] Kim G. Larsen & Arne Skou (1989): Bisimulation through probabilistic testing (preliminary report). In:
POPL ’89, ACM, New York, NY, USA, pp. 344–352.
[24] Donald A. McQuarrie (1967): Stochastic approach to chemical kinetics. Journal of Applied Probability 4(3),
pp. 413–478.
[25] Elaine Murphy, Vincent Danos, Jérôme Feret, Russell Harmer & Jean Krivine (2009): Rule Based Modelling and Model Refinement. In: Elements of Computational Systems Biology, Wiley.
[26] Gerardo Rubino & Bruno Sericola (1991): A Finite Characterization of Weak Lumpable Markov Processes. Part I: The Discrete Time Case. Stochastic Processes and their Applications 38, pp. 195–204.
[27] Gerardo Rubino & Bruno Sericola (1993): A Finite Characterization of Weak Lumpable Markov Processes. Part II: The Continuous Time Case. Stochastic Processes and their Applications 45, pp. 115–125.
[28] Roger B. Sidje, Kevin Burrage & Shev MacNamara (2007): Inexact Uniformization Method for Computing Transient Distributions of Markov Chains. SIAM Journal on Scientific Computing 29(6), pp. 2562–2580.
[29] Ana Sokolova & Erik P. de Vink (2003): On relational properties of lumpability. In: Proceedings of the 4th
PROGRESS.
[30] Jeremy Sproston & Susanna Donatelli (2004): Backward Stochastic Bisimulation in CSL Model Checking.
In: QEST ’04, Washington, DC, USA, pp. 220–229.
[31] Christopher T. Walsh (2006): Posttranslational Modification of Proteins: Expanding Nature’s Inventory. Roberts and Co. Publishers.