G.Ciobanu, M.Koutny (Eds.): Membrane Computing and

Biologically Inspired Process Calculi 2010 (MeCBIC 2010)

EPTCS 40, 2010, pp. 142–161, doi:10.4204/EPTCS.40.10

Lumpability Abstractions of Rule-based Systems∗

Jerome Feret

LIENS (INRIA/ENS/CNRS)

Paris, France

feret@ens.fr

Thomas Henzinger

Institute of Science and Technology Austria

Vienna, Austria

thenzinger@ist.ac.at

Heinz Koeppl

School of Computer and Communication Sciences

EPFL

Lausanne, Switzerland

heinz.koeppl@epfl.ch

Tatjana Petrov

School of Computer and Communication Sciences

EPFL

Lausanne, Switzerland

tatjana.petrov@epfl.ch

The induction of a signaling pathway is characterized by transient complex formation and mutual posttranslational modification of proteins. To faithfully capture this combinatorial process in a mathematical model is an important challenge in systems biology. Exploiting the limited context on which most binding and modification events are conditioned, attempts have been made to reduce the combinatorial complexity by quotienting the reachable set of molecular species into species aggregates while preserving the deterministic semantics of the thermodynamic limit. Recently we proposed a quotienting that also preserves the stochastic semantics and that is complete in the sense that the semantics of individual species can be recovered from the aggregate semantics. In this paper we prove that this quotienting yields a sufficient condition for weak lumpability and that it gives rise to a backward Markov bisimulation between the original and aggregated transition system. We illustrate the framework on a case study of the EGF/insulin receptor crosstalk.

1 Introduction

Often a few elementary events of binding and covalent modification [31] in a biomolecular reaction system give rise to a combinatorial number of non-isomorphic reachable species or complexes [17, 18]. Instances of such systems are signaling pathways, the polymerizations involved in cytoskeleton maintenance, and the formation of transcription factor complexes in gene regulation.

For such biomolecular systems, traditional chemical kinetics face fundamental limitations that are related to the question of how biomolecular events are represented and translated into a mathematical model [24]. More specifically, chemical reactions can only operate on a collection of fully specified molecular species, and each such species gives rise to one differential equation, describing the rate of change of that species' concentration. Many combinatorial systems do not permit the enumeration of all molecular species and thus render their traditional differential description prohibitive. However, even if one could enumerate them, it remains questionable whether chemical reactions are the appropriate way to represent and to reason about such systems.

As the dynamics of a biomolecular reaction mixture comes about through the repeated execution of a few elementary events, one may wonder about the effective degrees of freedom of the reaction mixture's dynamics. If the velocities of all events – or their probabilities to occur per time-unit per instance –

∗Jérôme Feret's contribution was partially supported by the ABSTRACTCELL ANR-Chair of Excellence. Heinz Koeppl acknowledges the support from the Swiss National Science Foundation, grant no. 200020-117975/1. Tatjana Petrov acknowledges the support from SystemsX.ch, the Swiss Initiative in Systems Biology.

are different for all complexes (w.r.t. modification) and pairs of complexes (w.r.t. binding) to which the events can apply, then the degrees of freedom would equal the number of molecular species. However, due to the local nature of the physical forces underlying molecular dynamics, the kinetics of most events appear to be ignorant with respect to the global configuration of the complexes they are operating

on. More provocatively, one may say that even if there were variations in the kinetics of an event from one context to another, experimental biology does not – and most likely never will – have the means to discern between all the different contexts. For instance, fluorescence resonance energy transfer (FRET) may report on a specific protein-binding event and even its velocity; however, we have no means to determine whether the binding partners are already part of a protein complex – not to speak of the composition and modification state of these complexes. To this end, molecular species remain elusive and appear to be inappropriate entities of description.

To align with the mentioned experimental insufficiencies and with the underlying biophysical locality, rule-based or agent-based descriptions were introduced as a framework to encode such reaction mixtures succinctly and to enable their mathematical analysis [8, 3]. Rules exploit the limited context on which most elementary events are conditioned. They just enumerate that part of a molecular species that is relevant for a rule to be applicable. Thus, in contrast to chemical reactions, rules can operate on a collection of partially specified molecular species. Recently, attempts have been made to identify the set of those partially specified species that allow for a self-consistent description of the rule-set's dynamics [11, 6]. Naturally, as partially specified species – or fragments – in general encompass many fully specified species, the cardinality of that set is less than that of the set of molecular species. These approaches aim to obtain a self-consistent fragment dynamics based on ordinary differential equations. It represents the dynamics in the thermodynamic limit of stochastic kinetics when scaling species multiplicities to infinity while maintaining a constant concentration (multiplicity per unit volume) [22]. In many applications in cellular biology this limiting dynamics is an inappropriate model due to the low multiplicities of some molecular species – think of transcription factor - DNA binding events. We recently showed that the obtained differential fragments cannot be used to describe the finite-volume case of stochastic kinetics [12]. Exploiting statistical independence of events, we instead derived stochastic fragments that represent the effective degrees of freedom in the stochastic case. Conceptually, the procedure replaces the Cartesian product by a Cartesian sum for statistically independent states. In contrast to the differential case, stochastic fragments have the important property that the sample paths of molecular species can be recovered from those of partially specified species.

We believe that interdisciplinary fields, such as systems biology, can move forward quickly enough only if the well-established knowledge in each of the disciplines involved is exploited to its maximum, i.e. if the standardized, well-established theories are recognized and re-used. For that reason, in this paper we translate our abstraction method [12] into the language of well-established contexts of abstraction for probabilistic systems – lumpability and bisimulation. Lumpability is mostly considered from a theoretical point of view in the theory of stochastic processes [19, 13, 28, 26, 27, 4]. A Markov chain is lumpable with respect to a given aggregation (quotienting) of its states, if the lumped chain preserves the Markov property [21]. An aggregation that is sound for any initial probability distribution is referred to as strong lumpability, while otherwise it is termed weak lumpability [4, 29]. Approximate aggregation techniques for Markov chains of biochemical networks are discussed in [14]. Probabilistic bisimulation was introduced as an extension of classic bisimulation in [23]. It is extended to continuous state spaces and continuous time in [9] and, for the discrete-state case, to weak bisimulation [2]. For instance, in [9] the authors use bisimulation of labelled Markov processes, the state space of which is not necessarily discrete, and they provide a logical characterization of probabilistic bisimulation. Another notion of weak bisimulation was recently introduced in [10]. Therein two labelled Markov chains are defined to

be equivalent if every finite sequence of observations has the same probability of occurring in the two chains. Herein we recognize the sound aggregations of [12] as a form of backward Markov bisimulation on weighted labelled transition systems (WLTS), and we show it to be equivalent to the notion of weak lumpability on Markov chains.

The remaining part of the paper is organized as follows. In Section 2, we introduce weighted labelled transition systems (WLTS) and we assign them the trace density semantics of a continuous-time Markov chain (CTMC). Moreover, we define the Kappa language, and we assign a WLTS to a Kappa specification. Based on the notion of the annotated contact map, we briefly summarize in Section 3 the general procedure to compute stochastic fragments, as offered in [12]. In Section 4, we introduce the characterizations of sound and complete abstractions on WLTS as a backward Markov bisimulation. Moreover, we show this notion to be equivalent to weak lumpability on Markov chains. Finally, we provide in Section 5 results on the achieved dimensionality reduction for a rule-based model of the crosstalk between the EGF and insulin signaling pathways [6]. This mechanistic model comprises 76 rules, giving rise to 42956 reactions and 2768 molecular species.

2 Preliminaries

The stochastic semantics of a biochemical network is modelled by a continuous-time Markov chain

(CTMC). The main object that we will use in the analysis is the weighted labelled transition system

(WLTS) on a countable state space. We will assign a WLTS to a given Kappa specification, and we

manipulate that object when reasoning about abstractions.

2.1 CTMC

We consider the CTMC that is generated by a weighted labelled transition system (WLTS) on a countable state space. We define the CTMC of a WLTS by defining the Borel σ-algebra containing all cylinder sets of traces [20] that can occur in the system, and the corresponding probability distribution over them. We also introduce the standard notation of a rate matrix, which we will use when analysing the lumpability and bisimulation properties in Sec. 4.

Definition 1. (WLTS) A weighted-labelled transition system W is a tuple (X, L, w, π0), where:

• X is a countable state space;

• L is a set of labels;

• w : X × L × X → R≥0 is the weighting function that maps two states and a label to a real value;

• π0 : X → [0, 1] is an initial probability distribution.

We assume that the label fully identifies the transition, i.e. for any x ∈ X and l ∈ L, there is at most one x′ ∈ X such that w(x, l, x′) > 0. Moreover, we assume that the system is finitely branching, in the sense that (i) the set {x ∈ X | π0(x) > 0} is finite, and (ii) for arbitrary x̂ ∈ X, the set {(l, x′) ∈ L × X | w(x̂, l, x′) > 0} is finite.

The activity of the state xi, denoted by the function a : X → R≥0, is the sum of all weights originating at xi, i.e.

a(xi) := ∑{w(xi, l, xj) | xj ∈ X, l ∈ L}.

The definition of a WLTS implicitly defines a transition relation → ⊆ X × X, such that (xi, xj) ∈ →, if and only if there exists a non-zero transition from state xi to state xj, i.e. the total weight over all labels is strictly bigger than zero, written ∑{w(xi, l, xj) | l ∈ L} > 0. Moreover, we can distinguish the initial set of states I ⊆ X, such that their initial probabilities are positive, i.e. I = {x ∈ X | π0(x) > 0}.

Definition 2. (Rate matrix of a WLTS) Given a WLTS W = (X, L, w, π0), we assign it the CTMC rate matrix R : X × X → R, given by R(xi, xj) = ∑{w(xi, l, xj) | l ∈ L}.

The consequence is that we do not enforce R(xi, xi) = −∑{R(xi, xj) | i ≠ j}, as is usual for the generator matrix of CTMCs. This, however, affects neither the transient nor the steady-state behavior of the CTMC [1]. We do so for the following reason. When abstracting the WLTS by partitioning the state space, we get another WLTS. If two states x and x′ which have a transition between each other were aggregated in the same partition class x̃, this results in a prolongation of the residence time in the abstract state x̃, i.e. we will have a self-loop in the abstract WLTS.
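As a small illustration (a sketch with invented toy states, labels, and weights, not taken from the paper), the rate matrix of Definition 2 and the activity of Definition 1 can be computed directly from the weighting function:

```python
from collections import defaultdict

# Toy WLTS (invented states/labels): weights keyed by (state, label, state').
w = {
    ("x1", "r1", "x2"): 2.0,
    ("x1", "r2", "x2"): 1.0,   # two labels between the same pair of states
    ("x1", "r3", "x1"): 0.5,   # a self-loop, which Definition 2 keeps
    ("x2", "r1", "x1"): 3.0,
}

def rate_matrix(w):
    """R(x, x') = sum of w(x, l, x') over all labels l (Definition 2)."""
    R = defaultdict(float)
    for (x, _l, y), weight in w.items():
        R[(x, y)] += weight
    return dict(R)

def activity(w, x):
    """a(x): total weight of all transitions originating at x."""
    return sum(weight for (src, _l, _y), weight in w.items() if src == x)

R = rate_matrix(w)
# R("x1", "x2") sums both labels; the diagonal is NOT forced to -sum(row).
```

Note that the self-loop weight stays on the diagonal, matching the discussion above: aggregated states keep their self-loops rather than folding them into a generator-style diagonal.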

If we refer to the generated stochastic Markov process, written as a continuous-time random variable {Xt}t∈R≥0 over the countable state space X, we write Pr(Xt = xi) for the probability that the process takes the value xi at time point t. It thus holds that Pr(X0 = xi) = π0(xi), and Pr(Xt+dt = xj | Xt = xi) = R(xi, xj)dt when i ≠ j, whereas Pr(Xt+dt = xi | Xt = xi) = R(xi, xi)dt + (1 − ∑{R(xi, xj′)dt | xj′ ∈ X}), which gives after simplification: Pr(Xt+dt = xi | Xt = xi) = 1 − ∑{R(xi, xj′)dt | j′ ≠ i}.

We define the cylinder sets of traces that can be observed in the system W. By observing the trace at a certain time point, we mean observing the sequence of visited states, the labels that were assigned to the executed transitions, and the time points of when the transitions happened.

Definition 3. (A trace of a WLTS) Let us observe the WLTS W = (X, L, w, π0) and its CTMC. Given a number k in N, we define a trace of length k as τ ∈ (X × L × R≥0)^k × X, written

τ = x0 −l1,t1→ x1 ... xk−1 −lk,t1+...+tk→ xk.

If the trace τ is such that (i) π0(x0) > 0, and (ii) for all i, 0 ≤ i < k, we have that w(xi, li, xi+1) > 0, then we say that τ belongs to the set of traces of W, and we write τ ∈ T(W).

The 'time stamps' on each of the transitions denote intuitively the absolute time of the transition, from the moment when the system was started (t = 0). We cannot assign a probability distribution to the traces in T(W), since the probability of any such trace is zero. We thus introduce the cylinder sets of traces over intervals of time.

Definition 4. (Cylinder set of traces) If IR is the set of all nonempty intervals in R≥0, then we define the cylinder set of traces τIR ∈ (X × L × IR)^k × X, such that:

τIR = x0 −l1,I1→ x1 ... xk−1 −lk,Ik→ xk    (1)

denotes the set of all traces τ = x0 −l1,t1→ x1 ... xk−1 −lk,t1+...+tk→ xk, such that ti ∈ Ii, 1 ≤ i ≤ k. If the cylinder set of traces τIR is such that π0(x0) > 0, and for all i = 0, ..., k−1, we have that w(xi, li, xi+1) > 0, then we say that τIR belongs to the cylinder sets of traces of W, and we write τIR ∈ TIR(W).

Let Ω(TIR(W)) be the smallest Borel σ-algebra that contains all the cylinder sets of traces in TIR(W). We define a probability measure over Ω(TIR(W)) in the following way.

Definition 5. (Trace density semantics on a WLTS) Given a WLTS (X, L, w, π0), and a number k in N, the probability of the cylinder set of traces τIR ∈ TIR(W), specified as in expression (1), is given by:

π(τIR) = π(x0 −l1,I1→ x1 ... xk−1 −lk,Ik→ xk) = π0(x0) · ∏_{i=1}^{k} [ (w(xi−1, li, xi) / a(xi−1)) · (e^{−a(xi−1)·inf(Ii)} − e^{−a(xi−1)·sup(Ii)}) ].

Note that ∫_{Ii} a(xi−1) e^{−a(xi−1)·t} dt = e^{−a(xi−1)·inf(Ii)} − e^{−a(xi−1)·sup(Ii)} is the probability of exiting the state xi−1 in a time interval Ii, since the probability density function of the residence time of xi−1 is equal to a(xi−1) e^{−a(xi−1)·t}.
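The trace density semantics is straightforward to evaluate numerically. The following sketch (with an invented one-step toy system) computes the probability of a cylinder set as the product of, per step, the label-selection probability w/a and the probability of exiting within the given interval:

```python
import math

def trace_probability(pi0, w, a, cyl):
    """Cylinder-set probability per the trace density semantics:
    pi0(x0) * prod_i [ w(x_{i-1}, l_i, x_i) / a(x_{i-1})
                       * (exp(-a*inf(I_i)) - exp(-a*sup(I_i))) ]."""
    x0, steps = cyl
    p = pi0[x0]
    prev = x0
    for label, (lo, hi), nxt in steps:
        act = a[prev]
        p *= (w[(prev, label, nxt)] / act) * (math.exp(-act * lo) - math.exp(-act * hi))
        prev = nxt
    return p

# Toy one-step system: x1 -r-> x2 with weight 2, so a(x1) = 2.
pi0 = {"x1": 1.0}
w = {("x1", "r", "x2"): 2.0}
a = {"x1": 2.0}
p = trace_probability(pi0, w, a, ("x1", [("r", (0.0, 1.0), "x2")]))
# p = (2/2) * (e^0 - e^{-2}) = 1 - e^{-2}
```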

2.2 Kappa

We present Kappa in a process-like notation. We start with an operational semantics, then define the

stochastic semantics of a Kappa model.

We assume a finite set of agent names A, representing different kinds of proteins; a finite set of sites S, corresponding to protein domains; a finite set of internal states I; and two signature maps Σι, Σλ from A to ℘(S), listing the domains of a protein which can bear, respectively, an internal state and a binding state. We denote by Σ the signature map that associates to each agent name A ∈ A the combined interface Σι(A) ∪ Σλ(A).

Definition 6. (Kappa agent) A Kappa agent A(σ) is defined by its type A ∈ A and its interface σ. In

A(σ), the interface σ is a sequence of sites s in Σ(A), with internal states (as subscript) and binding

states (as superscript). The internal state of the site s may be written as sε, which means that either it

does not have internal states (when s ∈ Σ(A)\Σι(A)), or it is not specified. A site that bears an internal

state m ∈ I is written sm (in such a case s ∈ Σι(A)). The binding state of a site s can be specified as sε if it is free; otherwise it is bound (which is possible only when s ∈ Σλ(A)). There are several levels of

information about the binding partner: we use a binding label i ∈ N when we know the binding partner,

or a wildcard bond − when we only know that the site is bound. The detailed description of the syntax

of a Kappa agent is given by the following grammar:

a ::= N(σ)            (agent)
N ::= A ∈ A           (agent name)
σ ::= ε | s , σ       (interface)

s ::= n^λ_ι           (site: site name with internal state ι as subscript and binding state λ as superscript)
n ::= x ∈ S           (site name)
ι ::= ε | m ∈ I       (internal state)
λ ::= ε | − | i ∈ N   (binding state)

We generally omit the symbol ε.

Definition 7. (Kappa expression) A Kappa expression E is a set of agents A(σ) and fictitious agents ∅. Thus the syntax of a Kappa expression is defined as follows:

E ::= ε | a , E | ∅ , E.

The structural equivalence ≡, defined as the smallest binary equivalence relation between expressions that satisfies the rules given as follows:

E , A(σ, s, s′, σ′) , E′ ≡ E , A(σ, s′, s, σ′) , E′

E , a , a′ , E′ ≡ E , a′ , a , E′

E , ∅ ≡ E

i, j ∈ N and i does not occur in E =⇒ E[i/j] ≡ E

i ∈ N and i occurs only once in E =⇒ E[ε/i] ≡ E

stipulates that neither the order of sites in interfaces nor the order of agents in expressions matters, that

a fictitious agent might as well not be there, that binding labels can be injectively renamed and that

dangling bonds can be removed.

Definition 8. (Kappa pattern, Kappa mixture) A Kappa pattern is a Kappa expression which satisfies the following five conditions: (i) no site name occurs more than once in a given interface; (ii) each site name s in the interface of the agent A occurs in Σ(A); (iii) each site s which occurs in the interface of the agent A with a non-empty internal state occurs in Σι(A); (iv) each site s which occurs in the interface of the agent A with a non-empty binding state occurs in Σλ(A); and (v) each binding label i ∈ N occurs exactly twice if it does at all – there are no dangling bonds. A mixture is a pattern that is fully specified, i.e. each agent A documents its full interface Σ(A), a site can only be free or tagged with a binding label i ∈ N, a site in Σι(A) bears an internal state in I, and no fictitious agent occurs.
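Some of the mixture conditions of Definition 8 can be checked mechanically. The following is a hedged sketch with a minimal, invented agent encoding (not Kappa's concrete syntax or official tooling); it checks interface completeness, absence of wildcard bonds, and pairing of binding labels, eliding the internal-state condition:

```python
from dataclasses import dataclass, field
from collections import Counter

# Hypothetical minimal encoding: a site has a name, an optional internal
# state, and a binding state, where None means free, an int is a binding
# label, and "-" is the wildcard bond.
@dataclass
class Site:
    name: str
    internal: object = None
    binding: object = None

@dataclass
class Agent:
    name: str
    sites: list = field(default_factory=list)

def is_mixture(expr, signature):
    """Partial check of Definition 8's mixture conditions: every agent
    documents its full interface, no wildcard bonds remain, and every
    binding label occurs exactly twice (no dangling bonds)."""
    labels = Counter()
    for ag in expr:
        if {s.name for s in ag.sites} != signature[ag.name]:
            return False
        for s in ag.sites:
            if s.binding == "-":
                return False          # wildcard not allowed in a mixture
            if isinstance(s.binding, int):
                labels[s.binding] += 1
    return all(c == 2 for c in labels.values())

sig = {"A": {"x"}, "B": {"y"}}
ok = is_mixture([Agent("A", [Site("x", binding=1)]),
                 Agent("B", [Site("y", binding=1)])], sig)
bad = is_mixture([Agent("A", [Site("x", binding=1)])], sig)  # dangling bond
```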

Definition 9. (Kappa rule) A Kappa rule r is defined by two Kappa patterns Eℓ and Er, and a rate k ∈ R≥0, and is written: r = Eℓ → Er @k.

A rule r is well-defined, if the expression Er is obtained from Eℓ by finite application of the following operations: (i) creation (some fictitious agents ∅ are replaced with some fully defined agents of the form A(σ); moreover, σ documents all the sites occurring in Σ(A), and all sites in Σι(A) bear an internal state in I), (ii) unbinding (some occurrences of the wildcard and binding labels are removed), (iii) deletion (some agents with only free sites are replaced with the fictitious agent ∅), (iv) modification (some non-empty internal states are replaced with some non-empty internal states), (v) binding (some free sites are bound pair-wise by using binding labels in N).

From now on, we assume all rules to be well-defined. We sometimes omit the rate of a rule. Moreover, we denote by Eℓ ↔ Er @k1,k2 the two rules Eℓ → Er @k1 and Er → Eℓ @k2.

Definition 10. (Kappa system) A Kappa system R = (πR0, {r1, ..., rn}) is given by a finite distribution over initial mixtures πR0 : {M01, ..., M0k} → [0, 1], and a finite set of rules {r1, ..., rn}.

In order to apply a rule r := Eℓ → Er @k to a mixture M, we use the structural equivalence ≡ to bring the participating agents to the front of E (with their sites in the same order as in Eℓ), rename binding labels if necessary, and introduce a fictitious agent for each agent that is created by r. This yields an equivalent expression E′ that matches the left hand side (lhs) Eℓ, which is written E′ |= Eℓ, as defined as follows:

E |= ε

σ |= ε

σ |= σ′ =⇒ N(σ) |= N(σ′)

s |= s′ ∧ σ |= σ′ =⇒ s, σ |= s′, σ′

ι |= ι′ ∧ λ |= λ′ =⇒ n^λ_ι |= n^λ′_ι′

ι′ ∈ {ε, ι} =⇒ ι |= ι′

λ′ ∈ {−, λ} =⇒ λ |= λ′

a |= a′ ∧ E |= E′ =⇒ a, E |= a′, E′

∅ |= ∅

Note that in order to find a matching, we only use structural equivalence on E, not on Eℓ. We then replace E′ by E′[Er], which is defined as follows:

E[ε] = E

(a , E)[ar , Er] = a[ar] , E[Er]

∅[ar] = ar

ar[∅] = ∅

N(σ)[N(σr)] = N(σ[σr])

σ[ε] = σ

(s, σ)[sr, σr] = s[sr], σ[σr]

n^λ_ι[n^λr_ιr] = n^λ[λr]_ι[ιr]

ιr ∈ I =⇒ ι[ιr] = ιr

λr ∈ N ∪ {ε} =⇒ λ[λr] = λr

λ[−] = λ

This may produce dangling bonds (if r unbinds a wildcard bond or destroys an agent on one side of a bond) or fictitious agents (if r destroys agents), so we use ≡ to resolve them.

2.2.1 Population-based stochastic semantics

In addition to the rate constants k, careful counting of the number of times each rule can be applied to a mixture is required to define the system's quantitative semantics correctly [7, 25]. Thus we define the notion of embedding between a mixture and an expression. Let Z = a1, ..., am and Z′ = c1, ..., cn be two patterns with no occurrence of the fictitious agent and such that there exists a pattern Z″ = b1, ..., bm that satisfies both Z ≡ Z″ and Z″ |= Z′ (and so, in particular, n ≤ m).

The agent permutations used in the proof that Z ≡ Z″ allow us to derive a permutation p such that a_p(i) ≡ bi. The restriction φ of p to the integers between 1 and n is called an embedding between Z′ and Z. This is written Z′ ↪φ Z. There may be several embeddings between Z′ and Z for the same Z″; if so, this influences the relative weight of the reaction in the stochastic semantics. We denote by [Z′, Z] the set of embeddings between Z′ and Z. This notion of embedding is extended to patterns (including fictitious agents) by defining Z′ ↪φ Z if, and only if, (↓∅ Z′) ↪φ (↓∅ Z), where ↓∅ removes all occurrences of the fictitious agent in patterns.

We assume that Eℓ is the lhs of a rule r := Eℓ → Er @k and Z is a mixture such that Eℓ ↪φ Z. Let Z = a1, ..., am and ↓∅ Eℓ = c1, ..., cn. Given Z′ ≡ Z (we write ↓∅ Z′ = b1, ..., bm) and a bijection p such that we have Z′ |= Eℓ, bi ≡ a_p(i) for 1 ≤ i ≤ m and φ(j) = p(j) for 1 ≤ j ≤ n. The result of applying r along φ to the mixture Z is defined (modulo ≡) as any pattern that is ≡-equivalent to Z′[Er]. In other words, the embedding φ between Eℓ and Z fully defines the action of r on Z up to structural equivalence.

We are now ready to define the stochastic semantics by means of a WLTS. In this semantics, the state is a soup of agents, that is to say that we do not care about the order of agents in the mixture. So the states of the system are the ≡-equivalence classes of mixtures.

Defining species as connected mixtures, the state of the system can be seen as a multi-set of species. The formal definition of a Kappa species is as follows:

Definition 11. (Kappa species) A pattern E is reducible whenever E ≡ E′, E″ for some non-empty patterns E′, E″; a Kappa species is the ≡-equivalence class of an irreducible Kappa mixture.

As explained earlier, the action of a rule r on a mixture E is fully defined (up to ≡) by an embedding φ between the lhs Eℓ of the rule r and the mixture. So as to consider computation steps over ≡-equivalence classes of mixtures, we introduce an equivalence relation ≡L over triples (r, E, φ) where φ is an embedding of the lhs Eℓ of r into E. We say that (r1, E1, φ1) ≡L (r2, E2, φ2) if, and only if, (i) r1 = r2 and (ii) there exists an embedding ψ ∈ [E1, E2] such that φ2 = ψ ∘ φ1.

Definition 12. (WLTS of a Kappa system) Let R = (πR0, {r1, ..., rn}) be a Kappa system. We define the WLTS WR = (X, L, w, π0) where: (i) X is the set of all ≡-equivalence classes of mixtures; (ii) L is the set of all ≡L-equivalence classes of triples (r, E, φ) such that φ is an embedding between the lhs Eℓ of r and E; (iii) w(x, l, x′) = k/|[Eℓ, Eℓ]| whenever there exist a rule r = Eℓ → Er @k, two mixtures E and E′, and an embedding φ ∈ [Eℓ, E], such that x = [E]≡, x′ = [E′]≡, l = [r, E, φ]≡L, and E′ is the result (up to ≡) of the application of r along φ to the mixture E; otherwise w(x, l, x′) = 0; (iv) π0(x) = ∑{πR0(E′) | E′ ∈ Dom(πR0) ∩ x}.

The stochastic semantics of a Kappa system R is then defined as the trace distribution semantics of the WLTS WR.
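The symmetry correction k/|[Eℓ, Eℓ]| in the weight definition can be illustrated by brute-force embedding counting. The sketch below is a toy (agents are bare names matched by equality, sites are ignored) for a symmetric dimerisation-style lhs with two identical agents:

```python
from itertools import permutations

def embeddings(pattern, mixture):
    """Brute-force injections of pattern agents into mixture agents; in
    this toy sketch agents are bare names and match when names agree."""
    return [p for p in permutations(range(len(mixture)), len(pattern))
            if all(pattern[i] == mixture[p[i]] for i in range(len(pattern)))]

lhs = ["A", "A"]                   # symmetric lhs of a dimerisation rule
mix = ["A", "A", "A"]              # a mixture with three copies of A
auts = len(embeddings(lhs, lhs))   # |[E_l, E_l]| = 2 automorphisms
embs = len(embeddings(lhs, mix))   # 3*2 = 6 ordered embeddings
k = 1.0
total_rate = k / auts * embs       # each unordered pair now counted once
```

Dividing by the 2 automorphisms of the lhs ensures the 6 ordered embeddings contribute one reaction event per unordered pair of agents, i.e. a total rate of 3k.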

3 Reduction procedure

In this section, we assume, without any loss of generality, that Σι and Σλ are disjoint. This can always be achieved by taking two disjoint copies Sι and Sλ of S, and using site names in Sι to bear internal states and site names in Sλ to bear binding states.

Informally, a contact map represents a summary of the agents and their potential bindings.

Definition 13. (Contact map) Given a Kappa system R, a contact map (CM) is a graph object (N, E), where the set of nodes N are agent types equipped with the corresponding interface, and the edges are specified between the sites of the nodes. Formally, we have that N = {(A, Σ(A)) | A ∈ A} and E ⊆ {((A, s), (A′, s′)) | A, A′ ∈ A and s ∈ Σ(A), s′ ∈ Σ(A′)}. If the site s of an agent of type A and the site s′ of an agent of type A′ bear the same binding state in the rhs Er of a rule, then there exists an edge e = ((A, s), (A′, s′)) ∈ E between s ∈ Σ(A) and s′ ∈ Σ(A′).

We say that a site s of the agent a is tested by the rule r if it is contained in the lhs Eℓ of the rule r.
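To make the contact-map construction concrete, here is a minimal sketch; the flat bond encoding and the two example rules are invented for illustration and are not the paper's EGF/insulin model:

```python
# Hypothetical flat encoding of rule right-hand sides: each rule contributes
# a list of bonds, a bond being the pair of (agent type, site) endpoints
# that share a binding label in the rhs.
rhs_bonds = [
    [(("E", "r"), ("R", "l"))],    # a ligand-receptor binding rule
    [(("R", "d"), ("R", "d"))],    # a receptor dimerisation rule
]

def contact_map_edges(rhs_bonds):
    """Edge set of the CM: one undirected edge per pair of sites that ever
    share a binding state in some rule's rhs."""
    edges = set()
    for bonds in rhs_bonds:
        for e1, e2 in bonds:
            edges.add(tuple(sorted((e1, e2))))   # undirected: sort endpoints
    return edges

cm_edges = contact_map_edges(rhs_bonds)
```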

Definition 14. (Annotated contact map) Given a Kappa system R and its CM (N, E), a valid annotated contact map (ACM) (N, E) is a contact map where all agents are annotated with respect to the rule set R. The annotation on the agent of type A ∈ A with respect to the rule r is given by the equivalence relation on its set of sites ≈A ⊆ Σ(A) × Σ(A) such that:

• If a rule r tests the sites s1 and s2 of agents a1, a2 (it is possible that a1 = a2) of type A, then s1 ≈A s2;

• If a rule r creates an agent a of type A, then all the sites of Σ(A) are in the same equivalence class, i.e. ≈A = Σ(A) × Σ(A).

Note that there can be several annotations of the agent type A ∈ A which satisfy the conditions. More precisely, if the equivalence relation ≈A meets the condition, then so does any of its refinements. This allows us to define the smallest such equivalence relation ≈A, which we call the minimal annotation of agent A. An ACM is minimal whenever each agent type is annotated by its minimal annotation.

Let r be a rule and an ACM which is valid with respect to the singleton {r}. Then for any agent type A ∈ A, either A does not occur in the lhs of r, or A occurs but all occurrences of A have an empty interface, or A occurs and tests some sites which all belong to a same equivalence class C in ≈A. In the latter case, we define test^ACM_r(A) = C; otherwise, we define test^ACM_r(A) = ∅.

The meaning of the ACM is to summarize the dependences between sites that can occur during the simulation of a Kappa system. If the two sites s and s′ in Σ(A) are correlated by the relation ≈A, i.e. s ≈A s′, it suggests that they are dependent in the following way. We must not aggregate in the same equivalence class any two states x and x′ such that they contain the agent A in a different evaluation of the sites s and s′. On the other hand, if the two sites s and s′ are not correlated by ≈A, then we may aggregate the states by the 'marginal' criteria, i.e. the condition which involves only one of the sites. Therefore, the fewer sites are related by (≈A)A∈A, the better the reduction will be. To numerically justify this, we can imagine having an agent type A whose interface has n different sites s1, ..., sn, and each of them has two possible internal state modifications. Let us observe the two limiting relations ≈A, i.e. ≈A = {(si, sj) | 1 ≤ i ≤ n, 1 ≤ j ≤ n}, and ≈′A = {(si, si) | 1 ≤ i ≤ n}. The annotation ≈A enforces at least 2^n states to describe all modifications of the agent A, whereas the annotation ≈′A suggests that it is enough to use only 2·n of them.

The ACM can be used to identify parts of Kappa species that we call fragments.

Definition 15. (Kappa fragments) A fragment is the ≡-equivalence class of a non-empty irreducible pattern E such that: (i) the set of sites in the interface σ of an agent A(σ) in E is an equivalence class of ≈A, (ii) sites can only be free or tagged with a binding label i ∈ N and sites in Σι are tagged with an internal state in I, (iii) there is no occurrence of the fictitious agent ∅.

We can use fragments to abstract the WLTS WR, by identifying the mixtures which have the same (multi-)set of fragments. To reach that goal, we first overload the definition of ≡ in order to identify mixtures having the same fragments. We introduce the binary relation ≡′ as the smallest equivalence relation over patterns which is compatible with ≡ and such that:

A(σ), A(σ′), E ≡′ A(↑C σ′, ↑Σ(A)\C σ), A(↑C σ, ↑Σ(A)\C σ′), E

for any agent type A ∈ A, interfaces σ, σ′, pattern E, and ≈A-equivalence class of sites C. For any set of sites X ⊆ S, the projection function ↑X over interfaces keeps only the sites in X; formally, ↑X is defined by ↑X ε = ε, ↑X(s^λ_ι, σ′) = s^λ_ι, ↑X σ′ whenever s ∈ X, and ↑X(s^λ_ι, σ′) = ↑X σ′ otherwise.

Now we define the relation ≡′L which stipulates that the rule r1 applies on E1 along φ1 the same way as the rule r2 on E2 along φ2. More formally, we write (r1, E1, φ1) ≡′L (r2, E2, φ2) whenever the following properties are all satisfied:

1. r1 = r2;

2. E1 ≡′ E2;

3. φ2 = ψ ∘ φ1, where ψ is the permutation which tracks how the sub-interface ↑test^ACM_r(Ai)(Ai(σi)) is moved in the proof that E1 ≡′ E2, for any agent Ai(σi) occurring in E1. More precisely, the transposition [i ↔ i+1] is associated to an agent permutation of the agents at positions i and i+1; the transposition [1 ↔ 2] is associated to a step which permutes the sub-interfaces test^ACM_r(A) of two agents of type A, for any agent type A ∈ A; any other step is associated with the identity function (over N). The function ψ is defined as the composition of all the permutations (in the reverse order) which are associated to the elementary steps in the proof that E1 ≡′ E2;

4. the result of the application of r1 to E1 along φ1 is ≡-equivalent to the result of the application of r2 to E2 along φ2.

Definition 16. (Abstract WLTS of a Kappa system) Let R = (πR0, {r1, ..., rn}) be a Kappa system. We define the WLTS W̃R = (X̃, L̃, w̃, π̃0) where:

• X̃ is the set of all ≡′-equivalence classes of mixtures;

• L̃ is the set of all ≡′L-equivalence classes of triples (r, E, φ) such that φ is an embedding between the lhs Eℓ of r and E;

• w̃(x̃, λ̃, x̃′) is equal to k/|[Eℓ, Eℓ]| whenever there exist a mixture E, a rule r = Eℓ → Er @k, and an embedding φ such that x̃ = [E]≡′, λ̃ = [r, E, φ]≡′L, and x̃′ is the ≡′-equivalence class of the result of the application of r along φ to the mixture E; otherwise, w̃(x̃, λ̃, x̃′) is equal to 0;

• for any x̃ ∈ X̃, π̃0(x̃) = ∑{πR0(E′) | E′ ∈ Dom(πR0) ∩ x̃}.

Let us define the relation ∼ over X by [E1]≡ ∼ [E2]≡ if, and only if, E1 ≡′ E2, and the relation ∼L over L by [λ1]≡L ∼L [λ2]≡L if, and only if, λ1 ≡′L λ2.
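The 2^n versus 2·n count from the annotation discussion above is easy to reproduce by enumeration (a toy sketch; the two internal states are arbitrarily named "u" and "p"):

```python
from itertools import product

n = 4  # an agent with n sites, each in one of two internal states "u"/"p"

# Full annotation (all sites pairwise related by the relation on A's sites):
# a local view must fix the joint valuation of all n sites at once.
joint_states = list(product("up", repeat=n))          # 2**n of them

# Discrete annotation {(s_i, s_i)}: each equivalence class is a single
# site, so a fragment fixes one site's state only.
fragments = [(i, v) for i in range(n) for v in "up"]  # 2*n of them
```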

4 Abstraction

We introduce abstractions on WLTS by aggregating the states and labels into partition classes. We obtain a new WLTS defined over the aggregated states and labels. Each non-trivial abstraction is a loss of information. However, some abstractions are such that it is possible to do the stochastic analysis on the aggregates rather than on concrete states. We address the problem of characterizing when this is possible, and if so, how the weights in the abstracted system are computed. We also discuss the reverse process: given the abstracted system and a particular probability distribution over the aggregates, whether we can draw conclusions about the traces in the concrete system. We do the general theoretical analysis of the abstractions on WLTS, and afterwards we show the relation with the reduction of Kappa systems that is presented in Sec. 3.

Definition 17. (Abstraction) Consider a WLTS W = (X, L, w, π0) and a pair of equivalence relations (∼,∼L) ⊆ X² × L², such that each ∼-equivalence class and each ∼L-equivalence class is finite. We denote the equivalence classes by ˜x, ˜l; we write x ∈ ˜x to indicate that x belongs to the equivalence class ˜x, and l ∈ ˜l to indicate that the label l belongs to the equivalence class ˜l.

A WLTS of the form ˜W = (X/∼, L/∼L, ˜w, ˜π0), where ˜π0(˜x) = ∑{π0(x) | x ∈ ˜x}, is called an abstraction of W, induced by the pair of equivalence relations (∼,∼L). Note that several abstractions can be induced by (∼,∼L), depending on how ˜w is defined.

Moreover, for any two cylinder sets of traces ˜τIR ∈ TIR(˜W) and τIR ∈ TIR(W), we say that ˜τIR = ˜x0 –(˜l1,I1)→ ˜x1 ... ˜x(k−1) –(˜lk,Ik)→ ˜xk is an abstraction of τIR = x0 –(l1,I1)→ x1 ... x(k−1) –(lk,Ik)→ xk whenever xi ∈ ˜xi for all i, 0 ≤ i ≤ k, and li ∈ ˜li for all i, 1 ≤ i ≤ k; we write τIR ∈ ˜τIR.
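As a small illustration (a hypothetical encoding, not from the paper), the abstraction of a concrete trace is obtained by replacing each state and label by its equivalence class, keeping the time intervals; `state_class` and `label_class` below stand for the quotient maps induced by ∼ and ∼L:

```python
# Sketch: abstracting a concrete cylinder-set trace componentwise.
# All state/label names are illustrative.
state_class = {"x0": "X0", "x1": "X01", "x2": "X01"}   # ~-classes
label_class = {"a": "A", "b": "A", "c": "C"}           # ~L-classes

def abstract_trace(trace):
    """trace: [x0, (l1, I1), x1, ..., (lk, Ik), xk] with time intervals Ii."""
    out = []
    for i, item in enumerate(trace):
        if i % 2 == 0:                      # states sit at even positions
            out.append(state_class[item])
        else:                               # (label, interval) pairs in between
            l, interval = item
            out.append((label_class[l], interval))
    return out

concrete = ["x0", ("a", (0.0, 1.0)), "x1", ("c", (0.0, 2.0)), "x2"]
print(abstract_trace(concrete))
# ['X0', ('A', (0.0, 1.0)), 'X01', ('C', (0.0, 2.0)), 'X01']
```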


Definition 18. (Sound abstraction: Aggregation) We say that the abstraction ˜W is a sound abstraction of W if the probability of any cylinder set of traces ˜τIR ∈ TIR(˜W) is equal to the sum of the probabilities of all the cylinder sets of traces τIR ∈ TIR(W) whose abstraction is ˜τIR:

π(˜τIR) = ∑{π(τIR) | τIR ∈ ˜τIR}.

We introduce a function γ : X/∼ → (X → [0,1]) which assigns to each partition class ˜x ∈ X/∼ a probability distribution over the states that belong to this partition class. The set of all such functions γ, denoted Γ_{X,∼}, is defined as:

Γ_{X,∼} = {γ | γ : X/∼ → (X → [0,1]) ∧ ∀˜x ∈ X/∼, ∑{γ(˜x,x) | x ∈ ˜x} = 1}.

We can think of the value γ(˜x,x) as the conditional probability of being in the state x, knowing that we are in the class ˜x, i.e. Pr(Xt = x | Xt ∈ ˜x) = γ(˜x,x). We note that, when thinking of γ as this conditional probability, it should be a time-dependent value. However, we refer to γ as a single, constant distribution. This will be justified in Lem. 1.
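A minimal sketch of this definition (all names illustrative): representing γ as a nested dict and checking the defining condition of Γ_{X,∼}, namely that each class's distribution is supported on the class and sums to 1.

```python
# Sketch: membership test for Gamma_{X,~}. gamma: {class: {state: prob}};
# partition: {class: set of member states}. Names are illustrative.
def in_Gamma(gamma, partition):
    for cls, states in partition.items():
        dist = gamma.get(cls, {})
        if set(dist) - states:      # probability mass outside the class
            return False
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            return False
    return True

partition = {"Y": {"y1", "y2"}, "Z": {"z"}}
print(in_Gamma({"Y": {"y1": 0.5, "y2": 0.5}, "Z": {"z": 1.0}}, partition))  # True
```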

Definition 19. (Complete abstraction: Deaggregation) We say that the abstraction ˜W is a complete abstraction of W for γ ∈ Γ_{X,∼} if the following holds. Given the probability of an arbitrary abstract cylinder set of traces of length k ≥ 1 that ends in the abstract state ˜xk (written ˜τIR → ˜xk), we can recompute the probability of ending the trace in the concrete state xk ∈ ˜xk in the following way:

π(˜τIR → xk) = γ(˜xk,xk) · π(˜τIR → ˜xk).

Sound and complete abstractions ˜W cannot be induced by every pair of relations (∼,∼L), because there might not exist a weighting function ˜w : X/∼ × L/∼L × X/∼ → R such that the conditions from Dfn. 18 and Dfn. 19 are met. Moreover, even if such a ˜w exists, the remaining question is whether the information in the abstract system is enough to compute it.

We now restate the main theorem from [12]: the abstractions for Kappa systems that we resumed in Sec. 3 are sound and complete.

Theorem 1. (The abstraction induced by the ACM is sound and complete) Given a Kappa system R = (π^R_0, {r1,...,rn}) and an ACM (N,E), the abstraction ˜WR = (X/∼, L/∼L, ˜w, ˜π0) induced by the partition classes (∼,∼L) ⊆ X² × L², as proposed in Def. 16, is a sound and complete abstraction of the WLTS WR = (X, L, w, π0), provided that for any two mixtures M and M′ such that M ≡♯ M′, we have:

π0([M]≡) · |[M′,M′]| = π0([M′]≡) · |[M,M]|.

We consider a mixture M. We denote by x ∈ X the equivalence class [M]≡, and by ˜x ∈ ˜X the equivalence class [M]≡♯ = [x]∼. The conditional probability γ(˜x,x) is computed as the ratio between the number of automorphisms of x (embeddings between x and x) and the sum of the numbers of automorphisms of the ∼-equivalent states. Thus we have:

γ(˜x,x) = |[x,x]| / ∑{|[x′,x′]| | x ∼ x′}.

The reader can find the detailed proof in [12].
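As a small illustration (not taken from [12]), the formula for γ can be evaluated directly once the automorphism counts |[x,x]| are known; the counts below are made-up numbers, not computed from an actual Kappa mixture.

```python
# Sketch: gamma(~x, x) = |[x,x]| / sum of |[x',x']| over x' ~ x, as in Thm. 1.
def gamma_from_automorphisms(aut, partition):
    """aut: {state: |[x,x]|}; partition: {class: list of member states}."""
    gamma = {}
    for cls, states in partition.items():
        total = sum(aut[x] for x in states)
        gamma[cls] = {x: aut[x] / total for x in states}
    return gamma

aut = {"x1": 2, "x2": 2, "x3": 4}          # hypothetical automorphism counts
print(gamma_from_automorphisms(aut, {"X123": ["x1", "x2", "x3"]}))
# {'X123': {'x1': 0.25, 'x2': 0.25, 'x3': 0.5}}
```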


4.1 Lumpability

Now we define different versions of lumpability and investigate the relationship with sound and complete

abstractions.

Definition 20. (Lumped process) Given a WLTS W = (X, L, w, π0), where X = {x1, x2, ...}, and a partition ∼ ⊆ X × X on its state space, we observe the continuous-time stochastic process {Xt}, t ∈ R+, that is generated by W (Dfn. 2). We define the lumped process {Yt} on the state space X/∼ = {˜x1, ˜x2, ...} (indexed by capital letters, i.e. ˜xI, ˜xJ) and with initial distribution ˜π0, so that

Pr(Yt = ˜xJ | Y0 = ˜x0) = Pr(Xt ∈ ˜xJ | X0 ∈ ˜x0).

The lumped process is not necessarily a Markov process.

Definition 21. (Lumpability) Given a WLTS W =(X ,L,w,π0) that generates the process {Xt}, we say

that it is lumpable with respect to the equivalence relation ∼⊆ X ×X if and only if its lumped process

{Yt} has the Markov property.

The evolution of a process depends on the initial distribution, and so does the lumpability property. We thus define the set of initial distributions for which lumpability holds. We denote the set of all probability distributions over X by PX:

PX = {π | π : X → [0,1] and ∑{π(x) | x ∈ X} = 1}.

Moreover, we denote the set of initial distributions that produce a chain lumpable with respect to the given equivalence relation ∼ by P^I_{X,∼}:

P^I_{X,∼} := {π | the process initialized with π is lumpable with respect to ∼}.

Whenever a distribution π ∈ PX is positive on the equivalence class ˜x, i.e. ∑{π(x) | x ∈ ˜x} > 0, we denote by π|˜x(x) the conditional distribution over the states of ˜x: π|˜x(x) = π(x)/π(˜x) when x ∈ ˜x, and π|˜x(x) = 0 otherwise.

Definition 22. (Strong and weak lumpability) Given a WLTS W = (X, L, w, π0) that generates the process {Xt}, and an equivalence relation ∼ ⊆ X × X, we say that {Xt} is:

• strongly lumpable with respect to ∼, if the lumped process {Yt} is Markov with respect to any initial distribution, i.e. P^I_{X,∼} = PX;

• weakly lumpable with respect to ∼, if there exists an initial distribution that makes the lumped process {Yt} Markov, i.e. P^I_{X,∼} ≠ ∅.

Note that the definitions of strong and weak lumpability involve the quantifiers "for all" and "there exists" over the probability distributions over a set of states. Thus, checking either of them involves in general an infinite number of checks. Sufficient conditions for strong and weak lumpability have been given for discrete-time Markov chains (DTMCs) [21, 26]. These results have been extended to the continuous-time case [4, 27]. We rephrase the sufficient conditions stated therein.

In order to understand the sense of the weak lumpability characterization, we discuss the meaning of γ. We recall the semantics of a WLTS W by observing the cylinder sets of traces, i.e. τIR = x0 –(l1,I1)→ x1 ... x(k−1) –(lk,Ik)→ xk ∈ TIR(W). The abstraction ˜W of W, induced by (∼,∼L), generates an abstract cylinder set of traces, denoted ˜τIR = ˜x0 –(˜l1,I1)→ ˜x1 ... ˜x(k−1) –(˜lk,Ik)→ ˜xk ∈ TIR(˜W).


For any cylinder set of traces ˜τIR ∈ TIR(˜W), we denote by γ˜τIR the distribution of the conditional probabilities over the lumped state ˜xk, knowing that the abstract cylinder set of traces ˜τIR, which ends in the abstract state ˜xk, was observed, i.e.

γ˜τIR(xk) = π(˜τIR → xk) / π(˜τIR).

The definition of the complete abstraction suggests that, if γ˜τIR were independent of the trace ˜τIR on which it is conditioned, then completeness would hold.

Theorem 2. (Lumpability on CTMCs) Let us observe a WLTS W = (X, L, w, π0) that generates the process {Xt}, and an equivalence relation ∼ ⊆ X × X. We consider the rate matrix R : X × X → R. If the lumped process is Markov, then we denote its rate matrix by ˜R : X/∼ × X/∼ → R. Then we have the following characterizations of the lumped process {Yt}:

• If for all xi1, xi2 ∈ X such that xi1 ∼ xi2, and for all ˜xJ ∈ X/∼, we have that

∑{R(xi1,xj) | xj ∈ ˜xJ} = ∑{R(xi2,xj) | xj ∈ ˜xJ},   (2)

then {Xt} is strongly lumpable with respect to ∼. We have:

˜R(˜xI, ˜xJ) = ∑{R(xi1,xj) | xj ∈ ˜xJ};

• If there exists a family of probability distributions over the lumped states, γ ∈ Γ_{X,∼}, such that for all xj1, xj2 ∈ X with xj1 ∼ xj2, and for all ˜xI ∈ X/∼, we have that

a(xj1) = a(xj2) and (∑{R(xi,xj1) | xi ∈ ˜xI}) / γ(˜xJ,xj1) = (∑{R(xi,xj2) | xi ∈ ˜xI}) / γ(˜xJ,xj2),   (3)

then

1. If the distribution γ is in accordance with π0, i.e. π0|X/∼ = γ, then for any finite sequence of states (x0,...,xk) ∈ X^(k+1) and any sequence of time intervals (I1,...,Ik) ∈ IR^k, we consider the set ˜τIR of the traces of the form x′0 –(l1,t1)→ x′1 ... x′(k−1) –(lk,t1+...+tk)→ x′k, where xi ∼ x′i for all i, 0 ≤ i ≤ k, and ti ∈ Ii, li ∈ L for all i, 1 ≤ i ≤ k. We have that if π(˜τIR) > 0, then γ˜τIR = γ. In other words, knowing that we are in state ˜xI, the conditional probability of being in state x ∈ ˜xI is invariant in time.

2. The process {Xt} is weakly lumpable with respect to ∼. Moreover, we have:

˜R(˜xI, ˜xJ) = (∑{R(xi,xj2) | xi ∈ ˜xI}) / γ(˜xJ,xj2);
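The strong-lumpability condition (2) is directly checkable on a finite rate matrix; a minimal sketch (the rates and partition below are illustrative, not the system of Fig. 1):

```python
# Sketch: check condition (2) of Thm. 2 on a finite rate matrix R (a dict of
# off-diagonal rates) and, if it holds, build the lumped rate matrix ~R.
def strongly_lumpable(R, partition):
    """R: {(xi, xj): rate}; partition: {class name: list of member states}."""
    def row_sum(xi, cls):
        return sum(R.get((xi, xj), 0.0) for xj in partition[cls])
    for members in partition.values():
        for cls_J in partition:
            # condition (2): the row sums into ~xJ agree on each ~-class
            if len({round(row_sum(xi, cls_J), 12) for xi in members}) > 1:
                return None
    # lumped rates, from an arbitrary representative of each class
    return {(I, J): row_sum(partition[I][0], J)
            for I in partition for J in partition if I != J}

R = {("y1", "z1"): 0.5, ("y1", "z2"): 0.5,
     ("y2", "z1"): 0.5, ("y2", "z2"): 0.5}
partition = {"Y": ["y1", "y2"], "Z": ["z1", "z2"]}
print(strongly_lumpable(R, partition))
# {('Y', 'Z'): 1.0, ('Z', 'Y'): 0.0}
```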

One shall notice that Thm. 2 gives a weaker condition than the completeness of the WLTS abstraction (see Dfn. 19). The main reason is that we do not 'track' transition labels, in the sense that we observe the abstraction on the cylinder sets of traces induced only by ∼, and not also by ∼L. Yet, in the particular case when the states fully determine the transition labels (i.e., if w(x1,l1,x′1) > 0, w(x2,l2,x′2) > 0, x1 ∼ x2, and x′1 ∼ x′2 imply l1 ∼L l2), the given condition for weak lumpability coincides with the definition of the complete abstraction of a WLTS.

The characterization of weak (resp. strong) lumpability given in Thm. 2 is a sufficient, but not a necessary, condition: there exist systems which are strongly or weakly lumpable, but do not satisfy the conditions given in the theorem. Interestingly, there are systems that the characterization from Thm. 2 would detect as strongly, but not weakly, lumpable, which is counter-intuitive with respect to the terminology. One shall also notice that the conditions of Thm. 2 imply that, in order to aggregate two states in the CTMC, they must not have different waiting times until the next transition (i.e., they should have the same activity). This is stated explicitly in the characterization of weak lumpability, and it can be obtained by summation over the outgoing class in the characterization of strong lumpability.

We consider a WLTS W = (X, L, w, π0) and the set of all equivalence relations ∼ on X, denoted PTX. We introduce the subsets of PTX denoted PS, PW, CS, CW, with the following meaning: (i) PS is the set of all equivalence relations such that {Xt} is strongly lumpable with respect to ∼; (ii) PW is the set of all equivalence relations such that {Xt} is weakly lumpable with respect to ∼; (iii) CS is the set of all equivalence relations such that {Xt} satisfies the condition for strong lumpability given in Thm. 2; (iv) CW is the set of all equivalence relations such that {Xt} satisfies the condition for weak lumpability given in Thm. 2.

Lemma 1. (Relations between lumpability properties and conditions) Consider an arbitrary WLTS W = (X, L, w, π0) and an equivalence relation ∼ ⊆ X × X. We have the following relations: (1) if ∼ ∈ PS then ∼ ∈ PW; if ∼ ∈ CS then ∼ ∈ PS; and if ∼ ∈ CW then ∼ ∈ PW; the converse implication does not hold for any of these statements; (2a) ∼ ∈ CW does not imply ∼ ∈ CS; (2b) ∼ ∈ CS does not imply ∼ ∈ CW.

Proof. The statement in part (1) trivially follows from Dfn. 21 and Thm. 2. To show (2a) and (2b), we consider the WLTS W specified in Fig. 1(a), with the state space X = {x, y1, y2, y3, z1, z2, z3}. Let ∼1 be the equivalence relation on X such that y1 ∼1 y2 and z1 ∼1 z2. By lumping the states by ∼1, we get the system ˜W1, as shown in Fig. 1(b). It is easy to check that ∼1 ∈ CS. Moreover, we have that ∼1 ∈ CW, since for the γ that assigns probability 1 to each singleton class and (0.5, 0.5) to each two-element class, the weak lumpability condition is satisfied; so ∼1 ∈ CS ∩ CW.

We further lump the states y12 and y3, by taking the transitive closure of the relation ∼1 extended with (y1,y3), denoted ∼2 = tc(∼1 ∪ {(y1,y3)}), which yields the system ˜W2 (Fig. 1(c)). This lumping is such that ∼2 ∉ CS, because we have y1 ∼ y3, but w(y1,l,z12) > 0 and w(y3,l,z12) = 0. On the other hand, for the γ that assigns (1/3, 1/3, 1/3) to the class y123 (and 1 to each singleton class), we argue that ∼2 ∈ CW. Therefore, if the initial distribution is in accordance with γ, the abstraction is sound and complete.

If we rather lump the states z12 and z3, by ∼3 = tc(∼1 ∪ {(z1,z3)}), we get the system ˜W3 (Fig. 1(d)). This system is such that ∼3 ∈ CS \ CW. More precisely, we cannot find a γ which would witness ∼3 ∈ CW: if such a γ existed, we would have γ({x})(x) = 1, and consequently γ(y12) = (0.5, 0.5) and γ(y3) = 1. This implies that the conditional distribution γ(z123) cannot be invariant in time: it would alternate between the distributions (0, 0, 1) and (0.5, 0.5, 0), depending on the choice made in x. Note that, since ∼3 ∈ CS, it follows that ∼3 ∈ PS, and this implies ∼3 ∈ PW.

This discussion indicates that if we decide to check for weak lumpability instead of strong lumpability by using the characterization from Thm. 2, it may happen that we eliminate aggregations that are strongly lumpable. In the case of reductions of Kappa systems, we will use the weak lumpability characterization.


[Figure 1 (graph drawings omitted): Different abstractions of system W. (a) W: the concrete system; (b) ˜W1, with ∼1 ∈ PS ∩ PW ∩ CS ∩ CW; (c) ˜W2, with ∼2 ∈ CW \ CS; (d) ˜W3, with ∼3 ∈ CS \ CW.]

4.2 Bisimulations

Aiming to define an algorithm that abstracts the WLTS of a Kappa system, we start by recasting the lumpability properties as bisimulation notions. A bisimulation is typically defined by logically characterizing the distinguishing property of the states that may be aggregated.

We define three kinds of bisimulation relations on WLTS, which are based on the lumpability characterizations given in Thm. 2. We adopt the terminology of [5]. Forward bisimulations arise from the characterization of strong lumpability: the bisimilar states have the same forward behaviour, in the sense that they each target any other lumped state with the same total affinity (total outgoing rate). This concept is well established for dependability and performance analysis [16, 15]. What we use in the abstractions of Kappa systems is backward bisimulation. Here the bisimilar states have the same backward behaviour, in the sense that they are reached from the predecessors in one lumped state with the same probabilistic quantity, which becomes the rate in the abstract system. It is, however, less established, and has been applied in only a few approaches to stochastic modelling [30]. Backward uniform bisimulation is an instance of backward bisimulation with the additional constraint that only equally probable states may be aggregated.

Definition 23. Given a WLTS W = (X, L, w, π0), and a pair (∼,∼L) of equivalence relations respectively over X and L, we define a function δF : X × L/∼L × X/∼ → R+_0:

δF(xi, ˜l, ˜xj) = ∑{|w(xi,l,xj)| | l ∈ ˜l and xj ∈ ˜xj}.

Furthermore, given a family of probability distributions over the partitions γ ∈ Γ_{X,∼}, we define the quantity δB : X/∼ × L/∼L × X → R+_0:

δB(˜xi, ˜l, xj) = ∑{γ(˜xi,xi) · |w(xi,l,xj)| | l ∈ ˜l, xi ∈ ˜xi} / γ(˜xj,xj).

Specifically, if γ is the uniform distribution over the equivalence classes, we can express the latter expression in terms of the cardinalities of the equivalence classes:

δBU(˜xi, ˜l, xj) = (|˜xj| / |˜xi|) · ∑{|w(xi,l,xj)| | l ∈ ˜l, xi ∈ ˜xi}.
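A minimal sketch of these two quantities for a finite WLTS whose weighted transitions are encoded as a dict `{(x, l, x'): weight}`; all names and numbers below are illustrative, not from the paper:

```python
# Sketch: deltaF and deltaBU from Dfn. 23 over a dict-encoded WLTS.
def delta_F(w, xi, lbl_cls, tgt_cls):
    """Total weight of transitions from xi, labels in ~l, targets in ~xj."""
    return sum(abs(v) for (x, l, xp), v in w.items()
               if x == xi and l in lbl_cls and xp in tgt_cls)

def delta_BU(w, src_cls, lbl_cls, xj, tgt_cls):
    """Backward uniform quantity: (|~xj| / |~xi|) * total incoming weight."""
    total = sum(abs(v) for (x, l, xp), v in w.items()
                if x in src_cls and l in lbl_cls and xp == xj)
    return len(tgt_cls) / len(src_cls) * total

w = {("y1", "a", "z1"): 0.5, ("y2", "b", "z1"): 0.5, ("y1", "a", "z2"): 0.5}
print(delta_F(w, "y1", {"a"}, {"z1", "z2"}))                      # 1.0
print(delta_BU(w, {"y1", "y2"}, {"a", "b"}, "z1", {"z1", "z2"}))  # 1.0
```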


Definition 24. (Forward and backward Markov bisimulation) Given a WLTS W = (X, L, w, π0), and a pair (∼,∼L) of equivalence relations respectively over X and L, we say that (∼,∼L) is a

1. forward Markov bisimulation, if for all xi and xj the following is satisfied: xi ∼ xj iff, for all equivalence classes ˜x ∈ X/∼ and ˜l ∈ L/∼L, we have that a(xi) = a(xj) and δF(xi, ˜l, ˜x) = δF(xj, ˜l, ˜x).
Remark. Note that this subsumes bisimulation in the classical sense: if xi has a successor in some class, then xj has one as well, and the transitions are related by appropriate labels (and, in this case, probabilities).

2. backward Markov bisimulation, if there exists a γ ∈ Γ_{X,∼} such that for all xi and xj the following is satisfied: xi ∼ xj iff, for all equivalence classes ˜x ∈ X/∼ and ˜l ∈ L/∼L, we have that a(xi) = a(xj) and δB(˜x, ˜l, xi) = δB(˜x, ˜l, xj).

Theorem 3. (Forward Markov bisimulation implies sound abstraction) Let W = (X, L, w, π0) be a WLTS. If (∼,∼L) induces a forward Markov bisimulation, then for any aggregates ˜xi, ˜l, and ˜xj, we can define

˜w(˜xi, ˜l, ˜xj) = δF(xi, ˜l, ˜xj).

The so-defined abstraction ˜W = (X/∼, L/∼L, ˜w, ˜π0) is sound. We then say that W refines ˜W by a forward Markov bisimulation (∼,∼L), written W ⪯_{F,(∼,∼L)} ˜W.

Theorem 4. (Backward Markov bisimulation implies sound and complete abstraction) Given a WLTS W = (X, L, w, π0), if (∼,∼L) induces a backward Markov bisimulation with conditional probabilities over the aggregates γ ∈ Γ_{X,∼}, then for any aggregates ˜xi, ˜l, and ˜xj, we can define

˜w(˜xi, ˜l, ˜xj) = δB(˜xi, ˜l, xj).   (4)

If γ(˜x) = π0|˜x, then the so-defined abstraction ˜W = (X/∼, L/∼L, ˜w, ˜π0) is sound and complete. We then say that W refines ˜W by a backward Markov bisimulation (∼,∼L) with conditional distributions γ, written W ⪯_{B,(∼,∼L),γ} ˜W.

In particular, if we know that γ is uniform, equation (4) becomes ˜w(˜xi, ˜l, ˜xj) = δBU(˜xi, ˜l, xj), written also W ⪯_{BU,(∼,∼L)} ˜W.

4.3 Proving bisimulations

The forward bisimulation relation for abstracting transition systems with CTMC semantics is well established and has been used in applications (e.g., [16, 15]). Moreover, a method for computing the backward uniform bisimulation, i.e. when γ is uniform, has been defined [5, 30]. It is based on an alternative characterization of the backward uniform Markov bisimulation, which eases the analysis.

Lemma 2. (Proving backward uniform Markov bisimulation) Let W = (X, L, w, π0) be a WLTS and (∼,∼L) a pair of equivalence relations respectively over X and L. For any state x′ ∈ X, and any pair of classes ˜x ∈ X/∼, ˜l ∈ L/∼L, let us define the set Pred(˜x, ˜l, x′) of transitions from a state in ˜x to the state x′ and with a label in ˜l as follows:

Pred(˜x, ˜l, x′) = {(x,l) ∈ ˜x × ˜l | w(x,l,x′) > 0}.

Assume that: (1) π0|X/∼ = ˜π0, and (2) for any x′i, x′j ∈ X such that x′i ∼ x′j, and any ˜x ∈ X/∼, ˜l ∈ L/∼L, there exists a bijective map φ between Pred(˜x, ˜l, x′i) and Pred(˜x, ˜l, x′j), such that for any (xi,li) ∈ Pred(˜x, ˜l, x′i), if φ(xi,li) = (xj,lj), then we have that w(xi,li,x′i) = w(xj,lj,x′j).

Then we have that W refines the abstraction ˜W = (X/∼, L/∼L, ˜w, ˜π0) by a backward uniform Markov bisimulation, i.e. W ⪯_{BU,(∼,∼L)} ˜W.
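The bijection condition (2) can be tested concretely: a weight-preserving bijection between Pred(˜x, ˜l, x′i) and Pred(˜x, ˜l, x′j) exists iff the multisets of incoming transition weights coincide. A minimal sketch under that observation (the dict encoding and all names are illustrative, not from the paper):

```python
# Sketch: test condition (2) of Lem. 2 by comparing multisets of incoming
# weights; Counter equality is exactly multiset equality.
from collections import Counter

def pred_weights(w, src_cls, lbl_cls, xp):
    """Multiset of weights of transitions from ~x, with a label in ~l, into xp."""
    return Counter(v for (x, l, x2), v in w.items()
                   if x in src_cls and l in lbl_cls and x2 == xp and v > 0)

def lemma2_condition(w, partition, label_partition):
    for members in partition.values():
        xi = members[0]
        for xj in members[1:]:
            for src in partition.values():
                for lbl in label_partition.values():
                    if (pred_weights(w, set(src), set(lbl), xi)
                            != pred_weights(w, set(src), set(lbl), xj)):
                        return False
    return True

w = {("x", "a", "y1"): 0.5, ("x", "a", "y2"): 0.5}
print(lemma2_condition(w, {"X": ["x"], "Y": ["y1", "y2"]}, {"A": ["a"]}))
# True
```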