All content in this area was uploaded by Gerrit Grossmann on May 17, 2018

Lumping the Approximate Master Equation for Multistate Processes on Complex Networks

Gerrit Großmann¹, Charalampos Kyriakopoulos¹, Luca Bortolussi², and Verena Wolf¹

¹ Computer Science Department, Saarland University
² Department of Mathematics and Geosciences, University of Trieste

Abstract. Complex networks play an important role in human society

and in nature. Stochastic multistate processes provide a powerful frame-

work to model a variety of emerging phenomena such as the dynamics of

an epidemic or the spreading of information on complex networks. In re-

cent years, mean-ﬁeld type approximations gained widespread attention

as a tool to analyze and understand complex network dynamics. They

reduce the model’s complexity by assuming that all nodes with a similar

local structure behave identically. Among these methods the approxi-

mate master equation (AME) provides the most accurate description of

complex networks’ dynamics by considering the whole neighborhood of a

node. The size of a typical network though renders the numerical solution

of multistate AME infeasible. Here, we propose an eﬃcient approach for

the numerical solution of the AME that exploits similarities between the

diﬀerential equations of structurally similar groups of nodes. We clus-

ter a large number of similar equations together and solve only a single

lumped equation per cluster. Our method allows the application of the

AME to real-world networks, while preserving its accuracy in computing

estimates of global network properties, such as the fraction of nodes in

a state at a given time.

Keywords: Complex Networks, Multistate Processes, AME, Model Re-

duction, Lumping

1 Introduction

Various emerging phenomena of social, biological, technical, or economic nature

can be modeled as stochastic multistate processes on complex networks [1, 3,

24, 26]. Such networks typically consist of millions or even billions of nodes [1,

3], each one being in one of a ﬁnite number of states. The state of a node can

potentially change over time as a result of interaction with one of its neighbor-

ing nodes. The interactions among neighbors are speciﬁed by rules and occur

independently at random time points, governed by the exponential distribution.

Hence, the underlying process is a continuous-time Markov chain (CTMC) with a discrete state space. Its state space consists of all labeled graphs representing all possible configurations of the complex network. For instance, in the susceptible-infective (SI) model, which describes the spread of a simple epidemic process, each node can be either susceptible or infected; infected nodes propagate the infection to their susceptible neighbors [19, 5].

arXiv:1804.02981v1 [cs.SI] 9 Apr 2018

Monte-Carlo simulations can be carried out only for small networks [11, 19],

as they become very expensive for large networks, due to the large number

of simulation runs which are necessary to draw reliable conclusions about the

network’s dynamics.

An alternative and viable approach is based on mean-field approximations, in which nodes sharing a similar local structure are assumed to behave identically and can be described by a single equation capturing their mean behavior [18, 3, 4, 10, 12]. The heterogeneous (also called degree-based) mean-field (DBMF) approach proposes a system of ordinary differential equations (ODEs) with one equation approximating the nodes of degree $k$ that are in a certain state [25, 9, 18]. The approximate master equation (AME) provides a far more accurate approximation of the network's dynamics by explicitly considering the complete neighborhood of a node in a certain state [16, 17, 14]. However, the number of differential equations that have to be solved is of the order $O(k_{\max}^{|\mathcal{S}|})$, where $k_{\max}$ is the network's largest degree and $|\mathcal{S}|$ the number of possible states. A coarser approximation, called pair approximation (PA), can be derived from the AME by imposing the multinomial assumption on the number of neighbors in a state [16, 17]. Solving the PA instead of the AME is faster, but for many networks it is not accurate enough [17].

Lumping is a popular model reduction technique for Markov chains and systems of ODEs [21, 6, 28, 7, 8]. It has also been applied to the underlying model of epidemic contact processes [27, 19] and has recently been shown to be extremely effective for the DBMF equations as well as for the PA approach [20]. In this work, we generalize the approach of [20] by providing a lumping scheme for the AME, leveraging the observation that nodes with a large degree and a similar neighborhood structure typically behave very similarly. We show that it is possible to massively reduce the number of equations of the AME while preserving the accuracy of global statistical properties of the network. In particular, our contributions are the following: (i) we provide a fully automated aggregation scheme for the multistate AME; (ii) we introduce a heuristic to find a reasonable trade-off between the number of equations and accuracy; (iii) we evaluate our method on different models from the literature and compare our results with the original AME and Monte-Carlo simulation; (iv) we provide an open-source tool³ written in Python, which takes a model specification as input and generates and solves the lumped (or original) AME.

The remainder of this paper is organized as follows: In Section 2 we describe

multistate Markovian processes in networks and formally introduce the AME. In

Section 3 we derive lumped equations for a given clustering scheme and in Section

4 we propose and evaluate a clustering algorithm for grouping similar equations

together. Case studies are presented in Section 5. We draw ﬁnal conclusions and

identify open research problems in Section 6.

³ https://github.com/gerritgr/LumPyQest

2 The Multistate Approximate Master Equation

In this section, we ﬁrst deﬁne contact processes and introduce our notation and

terminology for the multistate AME.

2.1 Multistate Markovian Processes

We describe a contact process in a network by a tuple $(G, \mathcal{S}, R, L)$ consisting of a finite undirected graph $G = (V, E)$, a finite set of states $\mathcal{S}$, a set of rules $R$, and an initial state for each agent (node) of the graph, $L: V \to \mathcal{S}$. We use $s, s', s''$ and $s_1, s_2, \ldots$ to denote elements of $\mathcal{S}$. At each time point $t \geq 0$, each node $v \in V$ is in a state $s \in \mathcal{S}$. The rules $R$ define how neighboring nodes influence the probability of state transitions. A rule consists of a consumed state, a produced state, and a transition rate, which depends on the neighborhood of the node. We use integer vectors to model a node's neighborhood. For a given set of states $\mathcal{S}$ and maximal degree $k_{\max}$, the set of all potential neighborhood vectors is

$$
\mathcal{M} = \Big\{ \mathbf{m} \in \mathbb{Z}_{\geq 0}^{|\mathcal{S}|} \;\Big|\; \sum_{s \in \mathcal{S}} \mathbf{m}[s] \leq k_{\max} \Big\},
$$

where we write $\mathbf{m}[s]$ to refer to the number of neighbors in state $s$.

A rule $r \in R$ is a triplet $r = (s, f, s')$ with $s, s' \in \mathcal{S}$, $s \neq s'$, and a rate function $f: \mathcal{M} \to \mathbb{R}_{\geq 0}$ that gives the rate of the corresponding exponential distribution. A rule $r$ (also denoted as $s \xrightarrow{f} s'$) can be applied at every node in state $s$ and, when applied, it transforms this node into state $s'$. Note that this general formulation of a rule containing the rate function can express all types of rules that are described in [16, 17, 14], such as spontaneous changes of a node's state (independent rules) or changes due to the state of a neighbor (contact rules). The delay until a certain rule is applied is exponentially distributed with rate $f(\mathbf{m})$, with rules competing in a race condition where the one with the shortest delay is executed. This results in an underlying stochastic model described by a CTMC.

In the following, we denote by $R^{s+} = \{(s', f, s) \in R \mid s' \in \mathcal{S}\}$ all the rules that change the state of a node into $s$, and by $R^{s-} = \{(s, f, s') \in R \mid s' \in \mathcal{S}\}$ all rules that change an $s$-node into a different state.

Example. In the SIS model, a susceptible node can become infected by one of its neighbors. An infected node becomes susceptible again, independently of its neighbors. Hence, the infection rule is $S \xrightarrow{\lambda_1 \cdot \mathbf{m}[I]} I$ and the recovery rule is $I \xrightarrow{\lambda_2} S$, where $\mathbf{m}[I]$ denotes the number of infected neighbors and $\lambda_1, \lambda_2 \in \mathbb{R}_{\geq 0}$ are rule-specific rate constants.
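The rule triplets of this section can be sketched in Python as follows. This is only an illustration of the notation, not the data model of the authors' tool: the dictionary-based neighborhood, the helper names `sis_rules`, `rules_into`, and `rules_out_of`, and the numeric rate constants are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Neighborhood = Dict[str, int]  # maps a state label s to the neighbor count m[s]
Rule = Tuple[str, Callable[[Neighborhood], float], str]  # (consumed, rate function f, produced)


def sis_rules(lam1: float, lam2: float) -> List[Rule]:
    """SIS rules as (s, f, s') triplets; lam1 and lam2 are the rate constants."""
    infection = ("S", lambda m: lam1 * m["I"], "I")  # contact rule
    recovery = ("I", lambda m: lam2, "S")            # independent rule
    return [infection, recovery]


def rules_into(rules: List[Rule], s: str) -> List[Rule]:
    """All rules that produce state s (the set R^{s+})."""
    return [r for r in rules if r[2] == s]


def rules_out_of(rules: List[Rule], s: str) -> List[Rule]:
    """All rules that consume state s (the set R^{s-})."""
    return [r for r in rules if r[0] == s]


# Hypothetical rate constants and neighborhood, chosen only for illustration.
rules = sis_rules(lam1=3.0, lam2=2.0)
m = {"S": 2, "I": 3}
infection_rate = rules[0][1](m)  # lam1 * m["I"]
recovery_rate = rules[1][1](m)   # lam2, independent of the neighborhood
```

Storing the rate as a function of the whole neighborhood keeps independent rules and contact rules in one representation, which mirrors the general rule format described above.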

2.2 Multistate AME

Here, we briefly present the multistate AME, similarly to [14, 20]. The AME assumes that all nodes in a certain state and with the same neighborhood structure are indistinguishable. We define $\mathcal{M}_k = \{\mathbf{m} \in \mathcal{M} \mid \sum_{s \in \mathcal{S}} \mathbf{m}[s] = k\}$ to be the subset of neighborhood vectors referring to nodes of degree $k$. In addition, for $s_1, s_2 \in \mathcal{S}$ and $\mathbf{m} \in \mathcal{M}$, we use $\mathbf{m}^{\{s_1^+, s_2^-\}}$ to denote a neighborhood vector whose entries are all equal to those of $\mathbf{m}$, apart from the $s_1$-th entry, which is equal to $\mathbf{m}[s_1] + 1$, and the $s_2$-th entry, which is equal to $\mathbf{m}[s_2] - 1$.

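Let $x_{s,\mathbf{m}}(t)$ be the fraction of network nodes that are in state $s$ and have neighborhood $\mathbf{m}$ at time $t$, and assume the initial state $x_{s,\mathbf{m}}(0)$ is known. Formally, the AME approximates the time evolution of $x_{s,\mathbf{m}}$ with the following set of deterministic ODEs⁴:

$$
\begin{aligned}
\frac{\partial x_{s,\mathbf{m}}}{\partial t} ={}& \sum_{(s',f,s) \in R^{s+}} f(\mathbf{m})\, x_{s',\mathbf{m}} \;-\; \sum_{(s,f,s') \in R^{s-}} f(\mathbf{m})\, x_{s,\mathbf{m}} \\
&+ \sum_{\substack{(s_1,s_2) \in \mathcal{S}^2 \\ s_1 \neq s_2}} \beta^{ss_1 \to ss_2}\, x_{s,\mathbf{m}^{\{s_1^+,s_2^-\}}}\, \mathbf{m}^{\{s_1^+,s_2^-\}}[s_1] \\
&- \sum_{\substack{(s_1,s_2) \in \mathcal{S}^2 \\ s_1 \neq s_2}} \beta^{ss_1 \to ss_2}\, x_{s,\mathbf{m}}\, \mathbf{m}[s_1],
\end{aligned}
\tag{1}
$$

where the term $\beta^{ss_1 \to ss_2}$ is the average rate at which an $(s, s_1)$-edge changes into an $(s, s_2)$-edge, for $s, s_1, s_2 \in \mathcal{S}$ with $s_1 \neq s_2$.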

The first term on the right-hand side models the inflow into $(s, \mathbf{m})$ nodes from $(s', \mathbf{m})$ nodes, while the second term models the outflow from $(s, \mathbf{m})$ due to the application of a rule. The other two terms describe indirect effects on an $(s, \mathbf{m})$ node due to changes in its neighboring nodes, again considering inflow and outflow (cf. Fig. 1). In particular, a node in the neighborhood $\mathbf{m}$ of an $(s, \mathbf{m})$ node, say in state $s_1$, changes to state $s_2$ by the firing of a rule.

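To compute $\beta^{ss_1 \to ss_2}$ we need to define the subset of rules which consume an $s_1$-node and produce an $s_2$-node: $R^{s_1 \to s_2} = \{(s_1, f, s_2) \in R \mid f: \mathcal{M} \to \mathbb{R}_{\geq 0}\}$. Then

$$
\beta^{ss_1 \to ss_2} = \frac{\sum_{\mathbf{m} \in \mathcal{M}} \sum_{(s_1,f,s_2) \in R^{s_1 \to s_2}} f(\mathbf{m})\, x_{s_1,\mathbf{m}}\, \mathbf{m}[s]}{\sum_{\mathbf{m} \in \mathcal{M}} x_{s_1,\mathbf{m}}\, \mathbf{m}[s]},
\tag{2}
$$

where the denominator normalizes by the fraction of $(s, s_1)$ edges.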
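Eq. (2) can be evaluated directly from a table of fractions. The sketch below is a hedged illustration: the representation of $x$ as a dictionary keyed by `(state, m)`, the function name `beta`, the convention of returning 0 when no $(s, s_1)$ edge exists, and the toy numbers are all assumptions of this example.

```python
def beta(x, rules, states, s, s1, s2):
    """Average rate at which an (s, s1)-edge turns into an (s, s2)-edge, Eq. (2).

    x      : dict mapping (state, m) -> fraction, with m a tuple of neighbor counts
    rules  : list of (consumed, rate function, produced) triplets
    states : list of state labels; its order defines the coordinates of m
    """
    i = states.index(s)
    numerator = 0.0
    denominator = 0.0
    for (state, m), fraction in x.items():
        if state != s1:
            continue
        # Sum the rates of all rules that turn an s1-node into an s2-node.
        rate = sum(f(m) for (a, f, b) in rules if a == s1 and b == s2)
        numerator += rate * fraction * m[i]
        denominator += fraction * m[i]
    # Assumption: if there is no (s, s1)-edge at all, report a rate of zero.
    return numerator / denominator if denominator > 0 else 0.0


# Toy SIS instance with hypothetical numbers; m = (m[S], m[I]).
states = ["S", "I"]
rules = [("S", lambda m: 3.0 * m[1], "I"), ("I", lambda m: 2.0, "S")]
x = {("S", (1, 1)): 0.5, ("S", (0, 2)): 0.25, ("I", (2, 0)): 0.25}
beta_SS_SI = beta(x, rules, states, "S", "S", "I")  # (S,S)-edge becomes (S,I)-edge
beta_IS_II = beta(x, rules, states, "I", "S", "I")  # (I,S)-edge becomes (I,I)-edge
```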

The total number of equations of the AME is determined by the number of states $|\mathcal{S}|$ and the maximal degree $k_{\max}$, and equals

$$
\binom{k_{\max} + |\mathcal{S}|}{|\mathcal{S}| - 1} (k_{\max} + 1).
\tag{3}
$$

The binomial arises from the number of ways in which, for a fixed degree $k$, one can distribute $k$ neighbors into $|\mathcal{S}|$ different states; see [20] for the proof.
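To get a feeling for Eq. (3), the following sketch enumerates $\mathcal{M}$ by brute force for a small instance and compares the count with the closed form. The function names are illustrative choices of this example, and the brute-force enumeration is only practical for small $k_{\max}$.

```python
from itertools import product
from math import comb


def neighborhood_vectors(num_states: int, k_max: int):
    """Enumerate all m in M: non-negative integer vectors with sum <= k_max."""
    return [m for m in product(range(k_max + 1), repeat=num_states)
            if sum(m) <= k_max]


def ame_size(num_states: int, k_max: int) -> int:
    """Number of AME equations according to Eq. (3)."""
    return comb(k_max + num_states, num_states - 1) * (k_max + 1)


# Brute-force check on a small instance: one equation per pair (s, m).
M_small = neighborhood_vectors(num_states=3, k_max=6)
equations_small = 3 * len(M_small)

# Three states and k_max = 60, as in the SIR and rumor spreading case studies.
equations_large = ame_size(3, 60)
vectors_large = equations_large // 3  # size of M
```

For three states and $k_{\max} = 60$ this gives $|\mathcal{M}| = 39711$ neighborhood vectors, which agrees with the number of clusters reported for the original AME in Section 5.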

As the $x_{s,\mathbf{m}}$ are fractions of network nodes, the following identity holds for all $t$:

$$
\sum_{(s,\mathbf{m}) \in \mathcal{S} \times \mathcal{M}} x_{s,\mathbf{m}}(t) = 1.
\tag{4}
$$

⁴ We omit $t$ for ease of notation.

[Figure 1: flow diagram around $x_{s,(2,2)}$ with two groups of arrows, labeled "node changes" and "neighborhood changes".]

Fig. 1: Illustration of how the AME governs the fraction $x_{s,(2,2)}$ in a two-state model with rules $(s', f_1, s)$, $(s, f_2, s')$. The inflow and outflow between $x_{s,(2,2)}$ and $x_{s',(2,2)}$ is induced by the direct change of a node's state from $s$ to $s'$ or vice versa. The inflow and outflow between $x_{s,(2,2)}$ and $x_{s,(3,1)}$, $x_{s,(1,3)}$ is attributed to the change of state of a node's neighbor.

Moreover, we use $x_s$ to denote the global fraction of nodes in a fixed state $s$, which we get by summing over all possible neighborhood vectors:

$$
x_s(t) = \sum_{\mathbf{m} \in \mathcal{M}} x_{s,\mathbf{m}}(t),
\tag{5}
$$

again with $\sum_{s \in \mathcal{S}} x_s(t) = 1$. Intuitively, $x_s$ is the probability that a randomly chosen node from the network is in state $s$. This is the value of primary interest in many applications, e.g. [3, 24, 26]. Finally, the degree distribution $P(k)$ gives the probability that a randomly chosen node is of degree $k$ ($0 \leq k \leq k_{\max}$). If we sum up all $x_{s,\mathbf{m}}$ that belong to a specific $k$ (i.e. $\mathbf{m} \in \mathcal{M}_k$), we necessarily obtain the corresponding degree probability, as the network structure is assumed to be static. Hence, for each $t \geq 0$, we have

$$
\sum_{(s,\mathbf{m}) \in \mathcal{S} \times \mathcal{M}_k} x_{s,\mathbf{m}}(t) = P(k).
\tag{6}
$$

3 Lumping

The key idea of this paper is to group together equations of the AME which have

a similar structure and to solve only a single lumped equation per group. This

lumped equation will capture the evolution of the sum of the AME variables in

each group.

Therefore, we divide the set $\{x_{s,\mathbf{m}} \mid s \in \mathcal{S}, \mathbf{m} \in \mathcal{M}\}$ into groups or clusters, constructing our clustering such that two variables $x_{s,\mathbf{m}}$, $x_{s',\mathbf{m}'}$ can only end up in the same group if $s = s'$ and $\mathbf{m}$ is 'sufficiently' similar to $\mathbf{m}'$. This ensures that the fractions within a cluster as well as their time derivatives are similar, provided the change in the rate as a function of $\mathbf{m}$ is relatively small when $\mathbf{m}$ is large.

In the sequel, we consider a clustering $\mathcal{C}$ defined as a partition of $\mathcal{M}$, i.e., $\mathcal{C} \subset 2^{\mathcal{M}}$, $\bigcup_{C \in \mathcal{C}} C = \mathcal{M}$, and all clusters $C$ are pairwise disjoint and non-empty. Before we discuss in detail the construction of $\mathcal{C}$ in Section 4, we derive the lumped equations for a given clustering $\mathcal{C}$.

First, recall that we want to approximate the global fractions for each state (cf. Eq. (5)), which can be split into sums over the clusters:

$$
x_s(t) = \sum_{C \in \mathcal{C}} \sum_{\mathbf{m} \in C} x_{s,\mathbf{m}}(t).
\tag{7}
$$

Our goal is now to construct a smaller equation system, where the variables $z_{s,C}$ approximate the sum over all $x_{s,\mathbf{m}}$ with $\mathbf{m} \in C$:

$$
z_{s,C}(t) \approx \sum_{\mathbf{m} \in C} x_{s,\mathbf{m}}(t).
\tag{8}
$$

Hence, we can approximate the global fractions as

$$
x_s(t) \approx \sum_{C \in \mathcal{C}} z_{s,C}(t).
\tag{9}
$$

The number of equations is then given by $|\mathcal{S}| \cdot |\mathcal{C}|$. As one might expect, there is a trade-off between the accuracy of $z_{s,C}(t)$ and the computational cost, which is proportional to the number of clusters.

3.1 Lumping the Initial State and the Time Derivative

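As the initial values of $x_{s,\mathbf{m}}$ are given, we define the initial lumped values

$$
z_{s,C}(0) = \sum_{\mathbf{m} \in C} x_{s,\mathbf{m}}(0).
\tag{10}
$$

To achieve the criterion in Eq. (8) for the fractions computed at $t > 0$, we seek time derivatives which fulfill

$$
\frac{\partial z_{s,C}}{\partial t} \approx \sum_{\mathbf{m} \in C} \frac{\partial x_{s,\mathbf{m}}}{\partial t}.
\tag{11}
$$

Note that an exact lumping is in general not possible as $\frac{\partial z_{s,C}}{\partial t}$ is a function of the individual $x_{s,\mathbf{m}}(t)$. In order to close the equations for $z_{s,C}$, we need to express $x_{s,\mathbf{m}}$ as an approximate function of $z_{s,C}$. The naive idea is to assume that the true fractions $x_{s,\mathbf{m}}$ are similar for all $\mathbf{m}$ that belong to the same cluster, i.e., if $\mathbf{m}, \mathbf{m}' \in C$ then $x_{s,\mathbf{m}} \approx x_{s,\mathbf{m}'}$, leading to an approximation of $x_{s,\mathbf{m}}$ as $z_{s,C} / |C|$.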

This is, however, problematic, as it neglects the fact that the sets of possible neighborhoods of nodes of different degree have different sizes. In fact, even if for two degrees $k_1 < k_2$ in the same cluster we have $P(k_1) = P(k_2)$ (while typically $P(k_1) > P(k_2)$), the number of possible different neighborhoods $\mathbf{m}_2$ of a $k_2$-node is larger than the number of different neighborhoods $\mathbf{m}_1$ of a $k_1$-node, $|\mathcal{M}_{k_1}| < |\mathcal{M}_{k_2}|$. Hence, typically $x_{s,\mathbf{m}_2} < x_{s,\mathbf{m}_1}$, as the mass of $P(k_2)$ has to be split among more variables. In order to correct for this asymmetry between degrees in each cluster, we introduce the following assumption:

Assumption: All fractions $x_{s,\mathbf{m}}$ inside a cluster $C$ that refer to the same degree contribute equally to the sum $z_{s,C}$. Equations of different degree contribute proportionally to their degree probability $P(k)$ and inversely proportionally to the neighborhood size for that degree.

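Based on the above assumption, we define a degree-dependent scaling factor $w_{C,k} \in \mathbb{R}_{\geq 0}$, which only depends on the corresponding cluster $C$ and degree $k$. According to the above assumption, $w_{C,k} \propto \frac{P(k)}{|\mathcal{M}_k|}$. To ensure that the weights of one cluster sum up to one, we define

$$
w_{C,k} = \frac{P(k)}{|\mathcal{M}_k|} \cdot \left( \sum_{\mathbf{m} \in C} \frac{P(k_{\mathbf{m}})}{|\mathcal{M}_{k_{\mathbf{m}}}|} \right)^{-1},
\tag{12}
$$

where $k_{\mathbf{m}} = \sum_{s \in \mathcal{S}} \mathbf{m}[s]$ is the degree of a neighborhood $\mathbf{m}$. We compute approximations of $x_{s,\mathbf{m}}$ based on $z_{s,C}$ as

$$
x_{s,\mathbf{m}} \approx z_{s,C} \cdot w_{C,k_{\mathbf{m}}}.
\tag{13}
$$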
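The scaling factors of Eq. (12) can be computed per cluster as in the following sketch. The function names, the tuple representation of $\mathbf{m}$, and the toy degree distribution are assumptions of this example; the closed form used for $|\mathcal{M}_k|$ is the standard count of ways to distribute $k$ neighbors over $|\mathcal{S}|$ states.

```python
from math import comb


def size_of_Mk(num_states: int, k: int) -> int:
    """|M_k|: number of ways to distribute k neighbors over the states."""
    return comb(k + num_states - 1, num_states - 1)


def cluster_weights(cluster, P, num_states):
    """Return {k: w_{C,k}} for the degrees occurring in `cluster` (Eq. (12)).

    cluster : list of neighborhood vectors m (tuples) forming one cluster C
    P       : mapping from degree k to its probability P(k)
    """
    share = {}
    for m in cluster:
        k = sum(m)
        share[k] = P[k] / size_of_Mk(num_states, k)
    # Normalization: sum over all m in C, so degrees are counted once per vector.
    norm = sum(share[sum(m)] for m in cluster)
    return {k: value / norm for k, value in share.items()}


# Toy two-state cluster containing all vectors of degree 2 and 3 (hypothetical P).
cluster = [(0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), (3, 0)]
P = {2: 0.6, 3: 0.4}
w = cluster_weights(cluster, P, num_states=2)
total_weight = sum(w[sum(m)] for m in cluster)
```

By construction the weights add up to one when summed over all $\mathbf{m} \in C$, so that summing the approximation in Eq. (13) over the cluster returns $z_{s,C}$.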

3.2 Building the Lumped Equations

To define a differential equation for the lumped fraction $z_{s,C}$, we consider again Eq. (11) and replace $\frac{\partial x_{s,\mathbf{m}}}{\partial t}$ by the right-hand side of Eq. (1). Then we substitute every occurrence of $x_{s,\mathbf{m}}$ by its corresponding lumped variable multiplied by the scaling factor, i.e., $z_{s,C} \cdot w_{C,k_{\mathbf{m}}}$, where $\mathbf{m} \in C$. Since $\mathbf{m} \in C$ does not generally imply that $\mathbf{m}^{\{s_1^+,s_2^-\}} \in C$, the substitution of $x_{s,\mathbf{m}^{\{s_1^+,s_2^-\}}}$ is somewhat more complicated. Let $C(\mathbf{m})$ denote the cluster $\mathbf{m}$ belongs to. If $\mathbf{m}$ lies "at the border" of a cluster, then $C(\mathbf{m}^{\{s_1^+,s_2^-\}})$ might be different from $C(\mathbf{m})$. The lumped AME then takes the following form:

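$$
\begin{aligned}
\frac{\partial z_{s,C}}{\partial t} ={}& \sum_{(s',f,s) \in R^{s+}} z_{s',C} \sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}}\, f(\mathbf{m})
\;-\; \sum_{(s,f,s') \in R^{s-}} z_{s,C} \sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}}\, f(\mathbf{m}) \\
&+ \sum_{\substack{(s_1,s_2) \in \mathcal{S}^2 \\ s_1 \neq s_2}} \beta_L^{ss_1 \to ss_2} \sum_{\mathbf{m} \in C} w_{C(\mathbf{m}^{\{s_1^+,s_2^-\}}),k_{\mathbf{m}}}\; z_{s,C(\mathbf{m}^{\{s_1^+,s_2^-\}})}\; \mathbf{m}^{\{s_1^+,s_2^-\}}[s_1] \\
&- \sum_{\substack{(s_1,s_2) \in \mathcal{S}^2 \\ s_1 \neq s_2}} \beta_L^{ss_1 \to ss_2}\, z_{s,C} \sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}}\, \mathbf{m}[s_1],
\end{aligned}
\tag{14}
$$

where

$$
\beta_L^{ss_1 \to ss_2} = \frac{\sum_{C \in \mathcal{C}} z_{s_1,C} \sum_{(s_1,f,s_2) \in R^{s_1 \to s_2}} \sum_{\mathbf{m} \in C} f(\mathbf{m})\, w_{C,k_{\mathbf{m}}}\, \mathbf{m}[s]}{\sum_{C \in \mathcal{C}} z_{s_1,C} \sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}}\, \mathbf{m}[s]}.
\tag{15}
$$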

To gain a significant speedup compared to the original equation system, it is necessary that the lumped equations can be efficiently evaluated. In particular, we want the number of terms in the lumped equation system to be proportional to the number of fractions $z_{s,C}$ and not to the number of $x_{s,\mathbf{m}}$. This is possible for Eq. (14), because each time we have a sum over $\mathbf{m} \in C$, for instance $\sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}} f(\mathbf{m})$, we can precompute this value during the generation of the equations and do not have to evaluate it at every step of the ODE solver. The sum

$$
\sum_{\mathbf{m} \in C} z_{s,C(\mathbf{m}^{\{s_1^+,s_2^-\}})}\; \mathbf{m}^{\{s_1^+,s_2^-\}}[s_1]
$$

can be evaluated efficiently since we only have to consider lumped variables that correspond to clusters $C(\mathbf{m}^{\{s_1^+,s_2^-\}})$ that are close to $C(\mathbf{m})$, i.e., that can be reached from a state in $C(\mathbf{m})$ by the application of a rule. The number of such neighboring clusters is typically small, due to our definition of clusters; see Section 4.

Remark 1. For large $k_{\max}$, the number of neighborhood vectors in $\mathcal{M}$, i.e. the size of the AME, becomes prohibitively large. For instance, for a maximum degree of the order of ten thousand, which is quite common in real networks, the size of $\mathcal{M}$ becomes of the order of $10^{12}$. Even summing a number of elements of this order while generating the equations becomes very costly. To overcome this limit, the solution is to approximate the terms involving summations in Eq. (14). Consider for instance $\sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}} f(\mathbf{m})$. Instead of evaluating $f$ at every $\mathbf{m} \in C$ and averaging it w.r.t. $w_{C,k_{\mathbf{m}}}$, we can evaluate $f$ only at the mean neighborhood vector $\langle \mathbf{m} \rangle_C$, where each coordinate is defined as $\langle \mathbf{m} \rangle_C[s] = \sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}} \mathbf{m}[s]$. We can then approximate $\sum_{\mathbf{m} \in C} w_{C,k_{\mathbf{m}}} f(\mathbf{m}) \approx f(\langle \mathbf{m} \rangle_C)$. See Appendix A for details.
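The approximation in Remark 1 can be illustrated numerically. In the sketch below, the cluster, the weights, and the rate function are hypothetical values chosen for this example; the helper names are not taken from the paper.

```python
def weighted_rate_sum(cluster, w, f):
    """Exact value of sum_{m in C} w_{C,k_m} * f(m)."""
    return sum(w[sum(m)] * f(m) for m in cluster)


def mean_neighborhood(cluster, w, num_states):
    """<m>_C[s] = sum_{m in C} w_{C,k_m} * m[s] for every coordinate s."""
    return tuple(sum(w[sum(m)] * m[s] for m in cluster)
                 for s in range(num_states))


# Toy two-state cluster; the weights sum to one over the cluster (3*0.2 + 4*0.1).
cluster = [(0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), (3, 0)]
w = {2: 0.2, 3: 0.1}


def contact_rate(m):
    """Hypothetical contact rule rate, linear in the neighborhood vector."""
    return 3.0 * m[1]


exact = weighted_rate_sum(cluster, w, contact_rate)
approx = contact_rate(mean_neighborhood(cluster, w, num_states=2))
```

Because the weights sum to one over the cluster, the two values coincide (up to floating-point rounding) whenever the rate function is linear in $\mathbf{m}$, as for the contact rules used in the case studies. For nonlinear rate functions the mean-neighborhood evaluation is only an approximation.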

4 Partitioning of the Neighborhood Set

In this section we describe an algorithm to partition $\mathcal{M}$ and construct the clustering $\mathcal{C}$. Our algorithm builds partitions with a varying granularity to control the trade-off between accuracy and execution speed. We consider three main criteria: the similarity of different equations, their impact on the global error, and how fast the lumped equations can be evaluated. Furthermore, as the size of $\mathcal{M}$ can be extremely large, we cannot rely on typical hierarchical clustering algorithms, which have a cubic runtime in the number of elements to be clustered. Our solution is to decompose each $\mathbf{m}$ into two components: its degree $k_{\mathbf{m}}$ (encoding its length) and its projection to the unit simplex (encoding its direction). We cluster these two components independently.

4.1 Hierarchical Clustering for Degrees

Since our clustering is degree-dependent, we first partition the set of degrees $\{0, \ldots, k_{\max}\}$. Let $\mathcal{K} \subset 2^{\{0,\ldots,k_{\max}\}}$ be a degree partitioning, i.e., the disjoint union of all $K \in \mathcal{K}$ is the set of degrees. The goal of the degree clustering is to merge together consecutive degrees with small probability while putting degrees with high probability mainly in separate clusters. This is particularly relevant for the power-law distribution, which is predominantly found in real-world networks [2, 1], as it allows us to cluster a large number of high degrees with low total probability all together without losing much information.

We use an iterative procedure inspired by bottom-up hierarchical clustering to determine $\mathcal{K}$. We start by assigning to each degree an individual cluster and iteratively join the two consecutive clusters that increase the cost function $\mathcal{L}$ by the least amount. The cost function $\mathcal{L}$ punishes disparity in the spread of probability mass over clusters, leading to clusters that have approximately the same total probability mass. It is defined as

$$
\mathcal{L}(\mathcal{K}) = \sum_{K \in \mathcal{K}} \Big( \sum_{k \in K} P(k) \Big)^2.
\tag{16}
$$

Note that $\mathcal{L}(\mathcal{K})$ is minimal when all $\sum_{k \in K} P(k)$ have equal values. The algorithm needs $O(k_{\max}^2)$ comparisons to determine the degree cluster of each element. At the end of this procedure, each $\mathbf{m} \in \mathcal{M}$ has a corresponding degree cluster $K$ with $k_{\mathbf{m}} \in K$.
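A minimal sketch of this greedy merging is given below. It assumes that the procedure stops once a requested number of clusters is reached (the target size is a parameter of this example) and uses the fact that merging two clusters with masses $a$ and $b$ raises $\mathcal{L}$ by $(a+b)^2 - a^2 - b^2 = 2ab$. The function name and the toy distribution are illustrative.

```python
def cluster_degrees(P, num_clusters):
    """Greedy bottom-up merge of consecutive degrees.

    P            : list with P[k] for k = 0..k_max
    num_clusters : target size of the partition (assumed stopping rule)
    """
    clusters = [[k] for k in range(len(P))]  # start with one cluster per degree
    mass = [float(p) for p in P]             # probability mass of each cluster
    while len(clusters) > num_clusters:
        # Merging neighbors j and j+1 raises the cost by 2 * mass[j] * mass[j+1],
        # so the cheapest merge minimizes the product of the two masses.
        i = min(range(len(clusters) - 1), key=lambda j: mass[j] * mass[j + 1])
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        mass[i:i + 2] = [mass[i] + mass[i + 1]]
    return clusters


# Toy degree distribution over the degrees 0..5 (hypothetical values).
P = [0.0, 0.5, 0.25, 0.125, 0.0625, 0.0625]
degree_clusters = cluster_degrees(P, num_clusters=3)
```

Each pass scans all pairs of consecutive clusters, so the total work is quadratic in the number of degrees, in line with the $O(k_{\max}^2)$ bound stated above.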

4.2 Proportionality Clustering

Independently of $\mathcal{K}$, we partition $\mathcal{M}$ along the different components of the vectors $\mathbf{m} \in \mathcal{M}$. First, observe that if we normalize $\mathbf{m}$ by dividing each dimension by $k_{\mathbf{m}}$, we can embed each $\mathcal{M}_k$ into the unit simplex in $\mathbb{R}^{|\mathcal{S}|}$. The idea is then to partition the unit simplex and apply the same partition to all $\mathcal{M}_k$. More specifically, we construct such a partition coordinate-wise. As each element of the normalized $\mathbf{m}$ takes values in $[0, 1]$, we split the unit interval into $p$ subintervals $\mathcal{P} = \{[0, \frac{1}{p}), [\frac{1}{p}, \frac{2}{p}), \ldots, [\frac{p-1}{p}, 1]\}$. Then, two normalized neighborhood vectors are in the same proportionality cluster if and only if, for each coordinate, their entries belong to the same subinterval $P \in \mathcal{P}$, which may be a different subinterval for each coordinate.

Fig. 2: Left (a): Clustering of $\mathcal{M}$ for a 2-state model with $k_{\max} = 20$ and $|\mathcal{K}| = |\mathcal{P}| = 7$. Right (b): Proportionality clustering of a 3-state model with $k_{\max} = 50$ and $|\mathcal{P}| = 5$. Only the plane $\mathcal{M}_{50}$ is shown.
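The coordinate-wise binning can be sketched as follows. The function name and the treatment of the zero vector (degree 0, for which the normalization is undefined) are assumptions of this example; integer arithmetic is used to avoid rounding issues at the interval borders.

```python
def proportionality_cluster(m, p):
    """Tuple of subinterval indices, one per coordinate of the normalized m.

    Index j stands for the interval [j/p, (j+1)/p); the last interval is closed.
    """
    k = sum(m)
    if k == 0:
        # Assumption: the zero vector is placed in the first bin of every coordinate.
        return (0,) * len(m)
    # m[s]/k lies in bin j exactly if j = floor(p * m[s] / k); the value 1 joins bin p-1.
    return tuple(min(p * ms // k, p - 1) for ms in m)


# With p = 5, the vectors (10, 40) and (1, 4) have the same proportions (0.2, 0.8).
key_large = proportionality_cluster((10, 40), p=5)
key_small = proportionality_cluster((1, 4), p=5)
key_pure = proportionality_cluster((0, 20), p=5)
```

Two neighborhood vectors then share a proportionality cluster exactly if this function returns the same tuple for both.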

4.3 Joint Clusters

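Finally, we construct $\mathcal{C}$ such that two points $\mathbf{m}, \mathbf{m}'$ are in the same cluster if and only if they are in the same degree cluster (i.e., $\exists K \in \mathcal{K}: k_{\mathbf{m}}, k_{\mathbf{m}'} \in K$) and in the same proportionality cluster (i.e., for each dimension $s \in \mathcal{S}$, there exists a $P \in \mathcal{P}$ such that $\frac{\mathbf{m}[s]}{k_{\mathbf{m}}}, \frac{\mathbf{m}'[s]}{k_{\mathbf{m}'}} \in P$).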

The eﬀect of combining degree and proportionality clusters, for a model with

two diﬀerent states, is shown in Fig. 2a, where the proportionality clustering gives

equally sized triangles that are cut at diﬀerent degrees by the degree clustering. If

we ﬁx a degree k, each cluster has only two neighbors (one in each direction). In

the 3d-case, the proportionality clustering creates tetrahedra, which correspond

to triangles if we ﬁx a degree k(cf. Fig. 2b).

The above clustering admits some advantageous properties: (1) If we ﬁx a

degree, all clusters have approximately the same size and spatial shape; (2) The

number of ‘direct spatial neighboring clusters’ of each cluster is always small,

which simpliﬁes the identiﬁcation of clusters in the ‘border’ cases and eases

the generation and evaluation of the lumped AME. Hence, the clusters can be

eﬃciently computed even if Mis very large. Next, we discuss how to choose the

size of the clusterings Kand P.
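Combining the two criteria amounts to using the pair (degree cluster, proportionality cluster) as a key. The following sketch builds the joint clustering by brute-force enumeration, which is only feasible for small $k_{\max}$. The function name, the handling of degree 0, and the toy degree partition are assumptions of this example.

```python
from collections import defaultdict
from itertools import product


def joint_clustering(num_states, k_max, degree_clusters, p):
    """Group all m in M by (index of degree cluster, proportionality key)."""
    degree_cluster_of = {k: i for i, K in enumerate(degree_clusters) for k in K}
    clusters = defaultdict(list)
    for m in product(range(k_max + 1), repeat=num_states):
        k = sum(m)
        if k > k_max:
            continue  # not a valid neighborhood vector
        if k == 0:
            prop = (0,) * num_states  # assumption: zero vector goes to the first bins
        else:
            prop = tuple(min(p * ms // k, p - 1) for ms in m)
        clusters[(degree_cluster_of[k], prop)].append(m)
    return dict(clusters)


# Toy two-state instance with a hypothetical degree partition and p = 2.
clusters = joint_clustering(num_states=2, k_max=4,
                            degree_clusters=[[0, 1], [2], [3, 4]], p=2)
num_vectors = sum(len(c) for c in clusters.values())
```

Every neighborhood vector lands in exactly one cluster, so the result is a partition of $\mathcal{M}$.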

4.4 Stopping Heuristic

To find an adequate number of clusters, we solve the lumped AME of the model multiple times while increasing the number of clusters. We stop when the difference between consecutive lumped solutions converges. The underlying assumption is that the approximations become more accurate with an increasing number of clusters and that the difference between consecutive lumped solutions becomes noticeably smaller when the error starts to level off. Our goal is to stop when an increase in the number of clusters does not bring an appreciable increase in accuracy.

Let $\mathbf{z}'(t)$, $\mathbf{z}''(t)$ be two solution vectors of the lumped AME, i.e., vectors containing the fractions of nodes in each state at time $t$, that correspond to two different clusterings $\mathcal{C}'$ and $\mathcal{C}''$. We define the difference between two such solutions $\mathbf{z}'$, $\mathbf{z}''$ as their maximal Euclidean distance over the considered time interval $[0, H]$:

$$
\epsilon(\mathbf{z}', \mathbf{z}'') = \max_{0 \leq t \leq H} \sqrt{\sum_{s \in \mathcal{S}} \big( z'_s(t) - z''_s(t) \big)^2}.
\tag{17}
$$

For the initial clustering we choose $|\mathcal{K}| = |\mathcal{P}| = c_0$. In each step, we increase the number of clusters by multiplying the previous $c_i$ by a fixed constant, thus $c_{i+1} = \lfloor r c_i \rfloor$ ($r > 1$). We find this to be a more robust approach than increasing $c_i$ by only a fixed amount in each step. We stop when the difference between two consecutive solutions is smaller than $\epsilon_{\text{stop}} > 0$. We consistently observe in all our case studies that $\epsilon(\mathbf{z}', \mathbf{z}'')$ is a very good indicator of the behavior of the real error (cf. Fig. 6b). For our experiments we empirically set $c_0 = 10$, $r = 1.3$, and $\epsilon_{\text{stop}} = 0.01$.
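The loop described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the authors' tool: `solve` is a hypothetical callback that returns the lumped solution for $|\mathcal{K}| = |\mathcal{P}| = c$ on a fixed time grid, `max_steps` is a safety bound added for this example, and the toy solver at the end only mimics a converging sequence.

```python
import math


def solution_distance(z1, z2):
    """Eq. (17) on a shared time grid: maximum over t of the Euclidean distance.

    z1, z2 : lists (one entry per time point) of per-state fraction lists.
    """
    return max(math.sqrt(sum((a - b) ** 2 for a, b in zip(row1, row2)))
               for row1, row2 in zip(z1, z2))


def run_heuristic(solve, c0=10, r=1.3, eps_stop=0.01, max_steps=50):
    """Refine the clustering until two consecutive solutions are close.

    solve     : hypothetical callback, solve(c) -> lumped solution for |K| = |P| = c
    max_steps : safety bound (an assumption of this sketch, must be >= 1)
    """
    c = c0
    previous = solve(c)
    for _ in range(max_steps):
        c = math.floor(r * c)  # c_{i+1} = floor(r * c_i)
        current = solve(c)
        if solution_distance(previous, current) < eps_stop:
            break
        previous = current
    return c, current


def toy_solver(c):
    """Stand-in for the lumped AME solver: one time point, two states."""
    return [[1.0 / c, 1.0 - 1.0 / c]]


c_final, z_final = run_heuristic(toy_solver)
```

With the toy solver, the cluster parameter runs through 10, 13, 16, 20, 26, 33, 42 before the distance between consecutive solutions drops below the threshold.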

5 Case Studies

We demonstrate our approach on three different processes, namely the well-known SIR model, a rumor spreading model, and an SIS model with competing pathogens [22, 13]. We test how the number of clusters, the accuracy, and the runtime of our lumping method relate. In addition, we compare the dynamics of the original and lumped AME with the outcome of Monte-Carlo simulations on a synthetic network of $10^5$ nodes [15, 23]. We performed our experiments on an Ubuntu machine with 8 GB of RAM and a quad-core AMD Athlon II X4 620 processor. The code is written in Python 3.5 using SciPy's vode⁵ ODE solver. The lumping error we report is the difference between lumped solutions (corresponding to different granularities) and the outcome of the original AME. That is, for the original solution $\mathbf{x}$ and a lumped solution $\mathbf{z}$, we define the lumping error of $\mathbf{z}$ as $\epsilon(\mathbf{x}, \mathbf{z})$. To generate the error curves, we start with $|\mathcal{P}| = |\mathcal{K}| = 5$ and increase both quantities by one in each step. Note that we test our approach on models with comparatively small $k_{\max}$. In general, this undermines the effectiveness of our lumping approach; however, using a larger $k_{\max}$ would have hindered the generation of the complete error curve.

(a) (b)

Fig. 3: SIR model. (a): Lumping error and runtime of the ODE solver w.r.t. the

number of clusters. (b): Fractions of S,I,R nodes over time, as predicted by the

original AME (solid line), by the lumped AME (dashed line), and based on

Monte-Carlo simulations (diamonds).

5.1 SIR

First, we examine the well-known SIR model, where infected nodes (I) go through a recovery state (R) before they become susceptible (S) again:

$$
S \xrightarrow{\lambda_1 \cdot \mathbf{m}[I]} I \qquad I \xrightarrow{\lambda_2} R \qquad R \xrightarrow{\lambda_3} S.
$$

We choose $(\lambda_1, \lambda_2, \lambda_3) = (3.0, 2.0, 1.0)$ and assume a network structure with $k_{\max} = 60$ and a truncated power-law degree distribution with $\gamma = 2.5$. The initial distribution is $(x_I(0), x_R(0), x_S(0)) = (0.25, 0.25, 0.5)$.
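A truncated power-law degree distribution can be tabulated as in the sketch below. The text does not state the smallest degree with non-zero probability, so the default `k_min = 1` is an assumption of this example, as is the function name; the form $P(k) \propto k^{-\gamma}$ for $k_{\min} \leq k \leq k_{\max}$ is the usual definition.

```python
def truncated_power_law(gamma, k_max, k_min=1):
    """Return a list with P[k] proportional to k**(-gamma) for k_min <= k <= k_max.

    Degrees below k_min get probability zero; k_min must be at least 1,
    because k**(-gamma) is undefined for k = 0.
    """
    weights = [k ** (-gamma) if k >= k_min else 0.0 for k in range(k_max + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


# Parameters of the SIR case study: gamma = 2.5 and k_max = 60.
P = truncated_power_law(gamma=2.5, k_max=60)
tail_mass = sum(P[30:])  # probability mass of the degrees 30..60
```

The small probability mass in the tail is what allows the degree clustering of Section 4.1 to merge many high degrees into a few clusters.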

In this model the lumping is extremely accurate. In particular, we see that the lumping error of our method quickly becomes very small (Fig. 3a) and that we only need a few hundred ODEs to get a reasonable approximation of the original AME. The lumped solution $\mathbf{z}$ we get from the stopping heuristic, consisting of fewer than 5% of the original equations, is almost indistinguishable from the original AME solution $\mathbf{x}$ and the Monte-Carlo simulation (Fig. 3b). The lumping error is $\epsilon(\mathbf{x}, \mathbf{z}) = 0.0015$. The lumped solution uses 1791 clusters and has a runtime of 235 seconds, whereas solving the original AME requires 39711 clusters and 7848 seconds.

5.2 Rumor Spreading

In the rumor spreading model [13], agents are either ignorants (I) who do not know about the rumor, spreaders (S) who spread the rumor, or stiflers (R) who know about the rumor but are not interested in spreading it. Ignorants learn about the rumor from spreaders, and spreaders lose interest in the rumor when they meet stiflers or other spreaders. Thus, the rules of the model are the following:

$$
I \xrightarrow{\lambda_1 \cdot \mathbf{m}[S]} S \qquad S \xrightarrow{\lambda_2 \cdot \mathbf{m}[R]} R \qquad S \xrightarrow{\lambda_3 \cdot \mathbf{m}[S]} R.
$$

We assume $(\lambda_1, \lambda_2, \lambda_3) = (6.0, 0.5, 0.5)$ with $k_{\max} = 60$ and $\gamma = 3.0$. The initial distribution is set to $(x_I(0), x_R(0), x_S(0)) = (\frac{3}{5}, \frac{1}{5}, \frac{1}{5})$. Again, we find that Monte-Carlo simulations, original AME, and lumped AME are in excellent agreement (Fig. 4b). The error curve converges to zero more slowly than in the SIR model, but it still gets close to zero quickly enough (Fig. 4a). The lumped solution corresponds to 1032 clusters with a lumping error of 0.0059 and a runtime of 35 seconds, compared to 39711 clusters and a runtime of 1606 seconds for the original AME solution.

Fig. 4: Rumor spreading model. (a): Lumping error and runtime of the ODE solver w.r.t. the number of clusters. (b): Fractions of nodes in each state over time given by the original AME (solid line), by the lumped AME (dashed line), and based on Monte-Carlo simulations (diamonds).

Fig. 5: Competing pathogens dynamics. (a): Fractions of nodes in I, J, S: original AME (solid line); lumped AME (dashed line); Monte-Carlo simulations (diamonds). (b): Comparison of pair approximation with Monte-Carlo simulation (diamonds).

⁵ https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.integrate.ode.html

5.3 Competing Pathogens

Finally, we examine an epidemic model with two competing pathogens [22]. The pathogens are denoted by I and J and the susceptible state by S:

$$
S \xrightarrow{\lambda_1 \cdot \mathbf{m}[I]} I \qquad S \xrightarrow{\lambda_2 \cdot \mathbf{m}[J]} J \qquad I \xrightarrow{\lambda_3} S \qquad J \xrightarrow{\lambda_4} S.
$$

We assume that both pathogens have the same infection rate and differ only in their respective recovery rates. Specifically, we set $(\lambda_1, \lambda_2, \lambda_3, \lambda_4) = (5.0, 5.0, 1.5, 1.0)$ and assume network parameters of $k_{\max} = 55$ and $\gamma = 2.5$. The initial distribution is $(x_I(0), x_J(0), x_S(0)) = (0.2, 0.1, 0.7)$.

This model is the most challenging case study for our approach. The AME solution and, naturally, the lumped AME are not in perfect alignment with Monte-Carlo simulations (Fig. 5a), and our lumping approach needs a larger number of clusters than in the previous cases to get a reasonably good approximation of the AME (Fig. 6a). The computational gain is, however, large as well. The lumped solution that comes with an approximation error of 0.02 corresponds to 2135 clusters and a runtime of 961 seconds, compared to 30856 clusters and 17974 seconds for the original AME solution. The pair approximation, even though faster (40 seconds) than the lumped AME, would have resulted in a much larger approximation error than our method (cf. Fig. 5b).

Finally, the slow convergence of the error curve makes the competing pathogens model a good test case for our stopping heuristic. The heuristic evaluates the model for three different clusterings (509, 986, 2135 clusters). It stops as the difference between the last two clusterings is smaller than $\epsilon_{\text{stop}}$, showing its effectiveness also for challenging models. In Fig. 6b we show the alignment between the true lumping error and the surrogate error used by the heuristic.

(a) (b)

Fig. 6: Competing pathogens lumping. (a): Lumping error and runtime of the

ODE solver w.r.t. the number of clusters. (b): Lumping error compared to the

error used by the heuristic.

6 Conclusions and Future Work

In this paper, we present a novel model-reduction technique to overcome the

large computational burden of the multistate AME and make it tractable for

real world problems. We show that it is possible to describe complex global

behavior of dynamical processes using only an extremely small fraction of the

original equations. Our approach exploits the high similarity among the original

equations as well as the comparably small impact of equations belonging to the

tail of the power-law degree distribution. In addition, we propose an approach

for ﬁnding a reasonable trade-oﬀ between accuracy and runtime of our method.

Our approach is particularly useful in situations where several evaluations of

the AME are necessary such as for the estimation of parameters or for model

selection.

For future work, we plan to develop a method for on-the-ﬂy clustering, which

joins equations and breaks them apart during integration. This would allow the

clustering to take into account the concrete (local) dynamics and to analyze

adaptive networks with a variable degree distribution.

7 Acknowledgments

This research has been partially funded by the German Research Council (DFG) as part of the Collaborative Research Center "Methods and Tools for Understanding and Controlling Privacy". We thank James P. Gleeson for his comments regarding the performance of AME on specific models and Michael Backenköhler for his comments on the manuscript.

References

1. Albert-László Barabási. Network science. Cambridge university press, 2016.

2. Albert-László Barabási and Réka Albert. Emergence of scaling in random networks.

science, 286(5439):509–512, 1999.

3. Alain Barrat, Marc Barthelemy, and Alessandro Vespignani. Dynamical processes

on complex networks. Cambridge university press, 2008.

4. Luca Bortolussi, Jane Hillston, Diego Latella, and Mieke Massink. Continuous

approximation of collective system behaviour: A tutorial. Performance Evaluation,

70(5):317–349, 2013.

5. Fred Brauer. Mathematical epidemiology: Past, present, and future. Infectious

Disease Modelling, 2(2):113–127, 2017.

6. Peter Buchholz. Exact and ordinary lumpability in ﬁnite markov chains. Journal

of applied probability, 31(1):59–75, 1994.

7. Luca Cardelli, Mirco Tribastone, Max Tschaikowski, and Andrea Vandin. Erode:

a tool for the evaluation and reduction of ordinary diﬀerential equations. In Inter-

national Conference on Tools and Algorithms for the Construction and Analysis of

Systems, pages 310–328. Springer, 2017.

8. Luca Cardelli, Mirco Tribastone, Max Tschaikowski, and Andrea Vandin. Syntactic

markovian bisimulation for chemical reaction networks. In Models, Algorithms,

Logics and Tools, 2017.

9. Claudio Castellano and Romualdo Pastor-Satorras. Thresholds for epidemic

spreading in networks. Physical review letters, 105(21):218701, 2010.

10. Eric Cator and Piet Van Mieghem. Second-order mean-ﬁeld susceptible-infected-

susceptible epidemic threshold. Physical review E, 85(5):056111, 2012.

11. Wesley Cota and Silvio C Ferreira. Optimized gillespie algorithms for the sim-

ulation of markovian epidemic processes on large and heterogeneous networks.

Computer Physics Communications, 219:303–312, 2017.

12. G Demirel, F Vazquez, GA Böhme, and T Gross. Moment-closure approximations

for discrete adaptive networks. Physica D: Nonlinear Phenomena, 267:68–80, 2014.

13. Nick Fedewa, Emily Krause, and Alexandra Sisson. Spread of a rumor. Society for

Industrial and Applied Mathematics. Central Michigan University, 25, 2013.

14. P. G. Fennell. Stochastic processes on complex networks: techniques and explo-

rations. PhD thesis, University of Limerick, 2015.

15. Bailey K Fosdick, Daniel B Larremore, Joel Nishimura, and Johan Ugander.

Conﬁguring random graph models with ﬁxed degree sequences. arXiv preprint

arXiv:1608.00607, 2016.

16. James P Gleeson. High-accuracy approximation of binary-state dynamics on net-

works. Physical Review Letters, 107(6):068701, 2011.

17. James P Gleeson. Binary-state dynamics on complex networks: Pair approximation

and beyond. Physical Review X, 3(2):021004, 2013.

18. James P Gleeson, Sergey Melnik, Jonathan A Ward, Mason A Porter, and Pe-

ter J Mucha. Accuracy of mean-ﬁeld theory for dynamics on real-world networks.

Physical Review E, 85(2):026106, 2012.

19. Istvan Z Kiss, Joel C Miller, and Péter L Simon. Mathematics of epidemics on

networks: from exact to approximate models. Forthcoming in Springer TAM series,

2016.

20. Charalampos Kyriakopoulos, Gerrit Grossmann, Verena Wolf, and Luca Borto-

lussi. Lumping of degree-based mean-ﬁeld and pair-approximation equations for

multistate contact processes. Physical Review E, 97(1):012301, 2018.

21. Genyuan Li and Herschel Rabitz. A general analysis of approximate lumping in

chemical kinetics. Chemical engineering science, 45(4):977–1002, 1990.

22. Naoki Masuda and Norio Konno. Multi-state epidemic processes on complex net-

works. Journal of Theoretical Biology, 243(1):64–75, 2006.

23. Mark EJ Newman. The structure and function of complex networks. SIAM review,

45(2):167–256, 2003.

24. Romualdo Pastor-Satorras, Claudio Castellano, Piet Van Mieghem, and Alessandro

Vespignani. Epidemic processes in complex networks. Reviews of modern physics,

87(3):925, 2015.

25. Romualdo Pastor-Satorras and Alessandro Vespignani. Epidemic spreading in

scale-free networks. Physical review letters, 86(14):3200, 2001.

26. Mason Porter and James Gleeson. Dynamical systems on networks: A tutorial,

volume 4. Springer, 2016.

27. Péter L Simon, Michael Taylor, and Istvan Z Kiss. Exact epidemic models on

graphs using graph-automorphism driven lumping. Journal of mathematical biol-

ogy, 62(4):479–508, 2011.

28. James Wei and James CW Kuo. Lumping analysis in monomolecular reaction sys-

tems. analysis of the exactly lumpable system. Industrial & Engineering chemistry

fundamentals, 8(1):114–123, 1969.

A Simpliﬁcation of Equation Generation

We constructed the lumped equations such that they can be evaluated eﬃciently

in each step of the ODE solver. However, for very large $k_{\max}$, the size of $\mathcal{M}$ becomes enormous. This makes the generation of the equations costly because we iterate over $\mathcal{M}$ multiple times: one pass is needed each time we compute a scalar of the form $\sum_{m \in C} w_{C,k_m} f(m)$. In this section, we introduce an approximate scheme to generate lumped equations without the computational burden of looking at each individual $m \in \mathcal{M}$.

Our main idea is to consider only the center of each cluster w.r.t. $w_{C,k_m}$, instead of each cluster element. We use $\langle m \rangle_C$ to denote the center of cluster $C$, each entry being defined as:

$$\langle m \rangle_C[s] = \sum_{m \in C} w_{C,k_m} \, m[s]. \tag{18}$$

We can efficiently compute $\langle m \rangle_C$ by only considering the direction of a cluster (which depends only on the associated proportionality cluster) and the mean degree of a cluster (which can be computed from the degree distribution alone).

Next, we approximate the average cluster rate by only evaluating the rate

function of each rule at the cluster mean:

$$\sum_{m \in C} w_{C,k_m} f(m) \approx f\big(\langle m \rangle_C\big). \tag{19}$$

This, of course, only makes sense if the rate function is reasonably smooth (which

is the case in our models).
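To make Eqs. (18) and (19) concrete, the following sketch compares the exact weighted rate of a small cluster with the rate evaluated at the cluster center. The three neighborhood vectors, the equal weights (assumed here to sum to one within the cluster), and all function names are hypothetical. The rate $3.0 \cdot m[I]$ is the infection rate used in the SIR example of this appendix; because it is linear in $m$, both sides of Eq. (19) coincide, whereas for a nonlinear but smooth rate they would only agree approximately.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Tuple[int, ...]  # neighborhood vector, here ordered as (S, I, R)


def cluster_center(cluster: List[Vector], weights: Dict[Vector, float]) -> List[float]:
    """Eq. (18): weighted mean of the neighborhood vectors, entry by entry."""
    n_states = len(cluster[0])
    return [sum(weights[m] * m[s] for m in cluster) for s in range(n_states)]


def exact_mean_rate(
    cluster: List[Vector],
    weights: Dict[Vector, float],
    rate: Callable[[Sequence[float]], float],
) -> float:
    """Left-hand side of Eq. (19): iterate over every cluster element."""
    return sum(weights[m] * rate(m) for m in cluster)


def approx_mean_rate(
    cluster: List[Vector],
    weights: Dict[Vector, float],
    rate: Callable[[Sequence[float]], float],
) -> float:
    """Right-hand side of Eq. (19): one evaluation at the cluster center."""
    return rate(cluster_center(cluster, weights))


# Hypothetical cluster of three degree-4 neighborhood vectors with equal weights.
cluster = [(2, 2, 0), (1, 3, 0), (1, 2, 1)]
weights = {m: 1.0 / len(cluster) for m in cluster}


def infection_rate(m: Sequence[float]) -> float:
    return 3.0 * m[1]  # 3.0 * m[I]


exact = exact_mean_rate(cluster, weights, infection_rate)
approx = approx_mean_rate(cluster, weights, infection_rate)
```

In an actual implementation the center would not be computed by iterating over the cluster as done here; the point of the scheme is to obtain it from the cluster direction and mean degree instead.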

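Likewise, inside the $\beta^{ss_1 \to ss_2}$, we approximate:

$$\sum_{m \in C} f(m) \, w_{C,k_m} \, m[s] \approx f\big(\langle m \rangle_C\big) \cdot w_{C,k_m} \cdot \langle m \rangle_C[s]. \tag{20}$$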

Finally, we approximate the in- and outflow related to the $\beta$s. Note that, by design, our clustering has the property that for given $s_1, s_2 \in \mathcal{S}$ and $C \in \mathcal{C}$ there is exactly one neighboring cluster into which probability mass can flow by adding state $s_1$ to (resp. subtracting state $s_2$ from) the neighborhood vector. We now assume fixed $s_1, s_2$ and use $C' \in \mathcal{C}$ to denote this cluster. We define

$$C_{NB} = \{ m \mid m^{\{s_1^+, s_2^-\}} \in C \}, \qquad C_B = \{ m \mid m^{\{s_1^+, s_2^-\}} \notin C \} = \{ m \mid m^{\{s_1^+, s_2^-\}} \in C' \}. \tag{21}$$

We see that all $m \in C_{NB}$ occur in the third term (inflow) and in the fourth term (outflow) in the lumped AME. Hence, they cancel out and we can ignore them. To determine the flow of probability mass between two clusters, only $C_B$ is of interest. Since the flow is symmetrical (i.e., the inflow of $C'$ is the outflow of $C$), it is sufficient to approximate one direction. We use

$$\beta_L^{ss_1 \to ss_2} \cdot z_{s,C'} \cdot \langle m \rangle_{C_B}[s_1] \cdot \frac{|C_B|}{|C|} \tag{22}$$

to approximate the flow from $C$ to $C'$. Hence, we add (resp. subtract) this value in the equation corresponding to $C'$ (resp. $C$). Note that $\langle m \rangle_{C_B}$ denotes an approximation of the mean value of $C_B$. As these points lie at the border to $C'$, we use:

$$\langle m \rangle_{C_B}[s_1] = \tfrac{1}{2} \langle m \rangle_C[s_1] + \tfrac{1}{2} \langle m \rangle_{C'}[s_1]. \tag{23}$$

$\frac{|C_B|}{|C|}$ is a scaling factor that corresponds to the size of the border between the clusters (the larger the border area, the more probability flows between the clusters). Note that both the cardinality of $C$ and the cardinality of $C_B$ can be efficiently approximated by combinatorial reasoning without looking at the individual elements.
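The border-flow approximation of Eqs. (22) and (23) reduces to a few arithmetic operations per pair of neighboring clusters. The sketch below is only an illustration under stated assumptions: the function and parameter names are hypothetical, and all numerical inputs are made up to exercise the formula.

```python
from typing import Sequence


def border_mean(center_c: Sequence[float], center_c_prime: Sequence[float], s1: int) -> float:
    """Eq. (23): midpoint of the two cluster centers in the s1 coordinate."""
    return 0.5 * center_c[s1] + 0.5 * center_c_prime[s1]


def approx_border_flow(
    beta_l: float,
    z_s_c_prime: float,
    center_c: Sequence[float],
    center_c_prime: Sequence[float],
    s1: int,
    border_size: float,
    cluster_size: float,
) -> float:
    """Eq. (22): approximate probability flow from cluster C to cluster C'.

    beta_l       -- value of beta_L for the fixed pair (s1, s2)
    z_s_c_prime  -- value of z_{s,C'} as it appears in Eq. (22)
    border_size  -- approximation of |C_B|
    cluster_size -- approximation of |C|
    """
    scaling = border_size / cluster_size
    return beta_l * z_s_c_prime * border_mean(center_c, center_c_prime, s1) * scaling


# Made-up numbers: centers given as (S, I) entries, s1 refers to index 1.
flow = approx_border_flow(
    beta_l=0.5,
    z_s_c_prime=0.2,
    center_c=[2.0, 4.0],
    center_c_prime=[2.0, 6.0],
    s1=1,
    border_size=10.0,
    cluster_size=100.0,
)

# The same value is added to the equation of C' and subtracted from that of C.
derivative = {"C": 0.0, "C_prime": 0.0}
derivative["C_prime"] += flow
derivative["C"] -= flow
```

Because the flow is computed once and applied with opposite signs, the total probability mass over both clusters is unchanged by this term.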

First, consider $|C|$. If we fix a $k$, each cluster has approximately the same number of elements (cf. Fig. 2b). We get this number by dividing $|\mathcal{M}_k|$ (the number of neighbor vectors for that degree) by the number of proportionality clusters. To determine $|C|$, we simply aggregate this value over all degrees that occur inside $C$. Next, consider $|C_B|$. It denotes the number of points inside $C$ but next to one particular neighboring cluster. Luckily, our clustering has a nice geometrical structure (namely a triangular one), which we exploit here. The size of a face (surface area in one direction) of each cluster $C \in \mathcal{C}$ in $n$ dimensions for a fixed degree $k$ is exactly the number of elements in a cluster for $n-1$ dimensions for that $k$. For example, the face of a tetrahedron is a triangle, and the size of the triangle can be determined by clustering three dimensions instead of four.
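The counting argument above can be sketched in a few lines. The number of neighborhood vectors of degree $k$ over $n$ states is the standard stars-and-bars count $\binom{k+n-1}{n-1}$; everything else in this block (the function names, the parameter `n_prop_clusters_lower` for the number of proportionality clusters in the lower-dimensional problem, and the choice to include degree 0 in the total) is an assumption made for illustration.

```python
from math import comb
from typing import Iterable


def num_neighbor_vectors(k: int, n_states: int) -> int:
    """|M_k|: number of ways to distribute k neighbors over n_states states."""
    return comb(k + n_states - 1, n_states - 1)


def approx_cluster_size(degrees: Iterable[int], n_states: int, n_prop_clusters: int) -> float:
    """Approximate |C|: per degree, |M_k| is split evenly over the
    proportionality clusters; the result is summed over the degrees in C."""
    return sum(num_neighbor_vectors(k, n_states) / n_prop_clusters for k in degrees)


def approx_face_size(k: int, n_states: int, n_prop_clusters_lower: int) -> float:
    """Approximate size of one face for a fixed degree k: the size of a
    cluster in the (n_states - 1)-dimensional problem for the same k."""
    return num_neighbor_vectors(k, n_states - 1) / n_prop_clusters_lower


# Three states and k_max = 500, as in the SIR example of this appendix.
total_vectors = sum(num_neighbor_vectors(k, 3) for k in range(501))
```

For three states and degrees up to 500, `total_vectors` exceeds 21 million, which matches the order of magnitude quoted for the unlumped model in Fig. 7.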

We present two examples of approximative equation generation in Fig. 7. First, we compare these with equations generated using the original approach, which iterates over all of $\mathcal{M}$ (Fig. 7a). We find that their respective dynamics do not differ significantly.

In addition, we test the approximative equations on a model for which the traditional generation would introduce a significant overhead and where testing the original AME is practically impossible (Fig. 7b). We choose an SIR model (where nodes are trapped in state R) with $k_{\max} = 500$, $\gamma = 2.5$, an infection rate of $3.0 \cdot m[I]$, and a recovery rate of $0.3$. Again, we see that the lumped AME is in excellent agreement with the numerical simulations.


Fig. 7: Dynamics of approximative equations. (a): The same SIR model as in Fig. 3: Lumping of the AME (solid line) and approximation of the lumped equation (dashed line), corresponding to 10 proportionality clusters and 20 degree clusters. (b): New SIR model with 50 degree clusters and 15 proportionality clusters. Approximative equations (solid line) are compared with Monte-Carlo simulations (diamonds). The lumped AME has 8583 clusters, in contrast to more than 21 million clusters of the original model.