Exchangeable Random Measures for Sparse and
Modular Graphs with Overlapping Communities
Adrien Todeschini and François Caron
INRIA & Institut de Mathématiques de Bordeaux, France
e-mail: Adrien.Todeschini@inria.fr
Department of Statistics, University of Oxford, UK
e-mail: caron@stats.ox.ac.uk
Abstract: We propose a novel statistical model for sparse networks with overlapping community
structure. The model is based on representing the graph as an exchangeable point process, and naturally
generalizes existing probabilistic models with overlapping block-structure to the sparse regime. Our
construction builds on vectors of completely random measures, and has interpretable parameters, each
node being assigned a vector representing its level of affiliation to some latent communities. We develop
methods both for simulating this class of random graphs and for performing posterior inference. We show
that the proposed approach can recover interpretable structure from two real-world networks and can
handle graphs with thousands of nodes and tens of thousands of edges.
Keywords and phrases: Networks, Random Graphs, Multiview Networks, Multigraphs, Completely
Random Measures, Lévy measure, Multivariate Subordinator, Sparsity, Non-Negative Factorization,
Exchangeability, Point Processes.
1. Introduction
There has been a growing interest in the analysis, understanding and modeling of network data in recent
years. A network is composed of a set of nodes, or vertices, with connections between them. Network data
arise in a wide range of fields, and include social networks, collaboration networks, communication networks,
biological networks, food webs and are a useful way of representing interactions between sets of objects. Of
particular importance is the elaboration of random graph models, which can capture the salient properties
of real-world graphs. Following the seminal work of Erdős and Rényi (1959), various network models have
been proposed; see the overviews of Newman (2003b, 2009), Kolaczyk (2009), Bollobás (2001), Goldenberg
et al. (2010), Fienberg (2012) or Jacobs and Clauset (2014). In particular, a large body of the literature has
concentrated on models that can capture some modular or community structure within the network. The
first statistical network model in this line of research is the popular stochastic block-model (Holland et al.,
1983; Snijders and Nowicki, 1997; Nowicki and Snijders, 2001). The stochastic block-model assumes that
each node belongs to one of $p$ latent communities, and the probability of connection between two nodes is
given by a $p \times p$ connectivity matrix. This model has been extended in various directions, by introducing
degree-correction parameters (Karrer and Newman, 2011), by allowing the number of communities to grow
with the size of the network (Kemp et al., 2006), or by considering overlapping communities (Airoldi et al.,
2008; Miller et al., 2009; Latouche et al., 2011; Palla et al., 2012; Yang and Leskovec, 2013). Stochastic
block-models and their extensions have been shown to offer a very flexible modeling framework, with interpretable
parameters, and have been successfully used for the analysis of numerous real-world networks. However, as
outlined by Orbanz and Roy (2015), when one makes the usual assumption that the ordering of the nodes
is irrelevant in the definition of the statistical network model, the Bayesian probabilistic versions of those
models lead to dense networks¹: that is, the number of edges grows quadratically with the number
of nodes. This property is rather undesirable, as many real-world networks are believed to be sparse.
¹ We refer to graphs whose number of edges scales quadratically with the number of nodes as dense, and as sparse if it scales
sub-quadratically.
arXiv:1602.02114v1 [stat.ME] 5 Feb 2016
A. Todeschini and F. Caron/Exchangeable Random Measures for Sparse and Modular Graphs 2
Recently, Caron and Fox (2014) proposed an alternative framework for statistical network modeling. The
framework is based on representing the graph as an exchangeable random measure on the plane. More
precisely, the nodes are embedded at some location $\theta_i \in \mathbb{R}_+$ and, for simple graphs, a connection exists
between two nodes $i$ and $j$ if there is a point at locations $(\theta_i, \theta_j)$ and $(\theta_j, \theta_i)$. An undirected simple graph is
therefore represented by a symmetric point process $Z$ on the plane
$$Z = \sum_{i,j} z_{ij}\, \delta_{(\theta_i, \theta_j)} \tag{1}$$
where $z_{ij} = z_{ji} = 1$ if $i$ and $j$ are connected, 0 otherwise; see Figure 1 for an illustration. Caron and Fox (2014)
noted that jointly exchangeable random measures, a notion to be defined in Eq. (21), admit a representation
theorem due to Kallenberg (1990), providing a general construction for exchangeable random measures hence
random graphs represented by such objects. This connection is further explored by Veitch and Roy (2015)
and Borgs et al. (2016), who provide a detailed description and extensive theoretical analysis of the associated
class of random graphs, which they name Kallenberg exchangeable graphs or graphon processes. Within this
class of models, Caron and Fox (2014) consider in particular the following simple generative model, where
two nodes $i \neq j$ connect with probability
$$\Pr(z_{ij} = 1 \mid (w_\ell)_{\ell=1,2,\ldots}) = 1 - e^{-2 w_i w_j} \tag{2}$$
where the $(w_i, \theta_i)_{i=1,2,\ldots}$ are the points of a Poisson point process on $\mathbb{R}_+^2$. The parameters $w_i > 0$ can
be interpreted as sociability parameters. Depending on the properties of the mean measure of the Poisson
process, the authors show that it is possible to generate both dense and sparse graphs, with potentially
heavy-tailed degree distributions, within this framework. The construction (2) is however rather limited in
terms of capturing structure in the network. Herlau et al. (2015) proposed an extension of (2), which can
accommodate a community structure. More precisely, introducing latent community membership variables
$c_i \in \{1, \ldots, p\}$, two nodes $i \neq j$ connect with probability
$$\Pr(z_{ij} = 1 \mid (w_\ell, c_\ell)_{\ell=1,2,\ldots}, (\eta_{k\ell})_{1 \leq k,\ell \leq p}) = 1 - e^{-2 \eta_{c_i c_j} w_i w_j} \tag{3}$$
where the $(w_i, c_i, \theta_i)_{i=1,2,\ldots}$ are the points of a (marked) Poisson point process on $\mathbb{R}_+ \times \{1, \ldots, p\} \times \mathbb{R}_+$ and $\eta_{k\ell}$
are positive random variables parameterizing the strength of interaction between nodes in community $k$ and
nodes in community $\ell$. The model is similar in spirit to the degree-corrected stochastic block-model (Karrer
and Newman, 2011), but within the point process framework (1), and can thus accommodate both sparse and
dense networks with community structure. The model of Herlau et al. (2015) however shares the limitations
of the (degree-corrected) stochastic block-model, in the sense that it cannot model overlapping community
structures, each node being assigned to a single community; see Latouche et al. (2011) and Yang and Leskovec
(2013) for more discussion along these lines. Other extensions with block structure or mixed membership block
structure are also suggested by Borgs et al. (2016).
In this paper, we consider that each node $i$ is assigned a set of latent non-negative parameters $w_{ik}$,
$k = 1, \ldots, p$, and that the probability that two nodes $i \neq j$ connect is given by
$$\Pr(z_{ij} = 1 \mid (w_{\ell 1}, \ldots, w_{\ell p})_{\ell=1,2,\ldots}) = 1 - e^{-2 \sum_{k=1}^p w_{ik} w_{jk}}. \tag{4}$$
These non-negative weights can be interpreted as measuring the level of affiliation of node $i$ to the latent
communities $k = 1, \ldots, p$. For example, in a friendship network, these communities can correspond to
colleagues, family, or sport partners, and the weights measure the level of affiliation of an individual to each
community. Note that as individuals can have high weights in different communities, the model can capture
overlapping communities. The link probability (4) builds on a non-negative factorization; it has been used
by other authors for network modeling (Yang and Leskovec, 2013; Zhou, 2015) and is also closely related
to the model for multigraphs of Ball et al. (2011). The main contribution of this paper is to use the link
probability (4) within the point process framework of Caron and Fox (2014). To this aim, we consider that
the node locations and weights $(w_{i1}, \ldots, w_{ip}, \theta_i)_{i=1,2,\ldots}$ are drawn from a Poisson point process on $\mathbb{R}_+^{p+1}$ with
a given mean measure $\nu$. The construction of such a multivariate point process relies on vectors of completely
random measures (or equivalently multivariate subordinators). In particular, we build on the flexible though
tractable construction recently introduced by Griffin and Leisen (2014).
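As a small numerical illustration of the link probability (4), the following Python sketch evaluates it for hand-picked weight vectors. The function name `link_probability` and the three example nodes are hypothetical and chosen only for illustration; they are not part of the paper.

```python
import math


def link_probability(w_i, w_j):
    """Probability that two distinct nodes connect, following Eq. (4)."""
    if len(w_i) != len(w_j):
        raise ValueError("weight vectors must have the same length p")
    # Inner product of the two non-negative affiliation vectors.
    overlap = sum(a * b for a, b in zip(w_i, w_j))
    return 1.0 - math.exp(-2.0 * overlap)


# Hypothetical weights for p = 3 communities (colleagues, family, sport).
alice = [1.0, 0.0, 0.5]
bob = [0.8, 0.0, 0.0]
carol = [0.0, 1.2, 0.0]

p_alice_bob = link_probability(alice, bob)      # one shared community
p_alice_carol = link_probability(alice, carol)  # no shared community
```

Nodes that share no community with positive weight have a zero inner product and therefore never connect, while the probability increases with the overlap of the affiliation vectors.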
The proposed model generalizes that of Caron and Fox (2014) by allowing the model to capture more
structure in the network, while retaining its main features, and is shown to have the following properties:
• Interpretability: each node is assigned a set of positive parameters, which can be interpreted as measuring the levels of affiliation of a node to latent communities; once those parameters are learned, they can be used to uncover the latent structure in the network.
• Sparsity: we can generate graphs whose number of edges grows subquadratically with the number of nodes.
• Exchangeability: in the sense of Kallenberg (1990).
Fig 1. Representation of an undirected graph via a point process $Z$. Each node $i$ is embedded in $\mathbb{R}_+$ at some location $\theta_i$ and is associated with a set of positive attributes $(w_{i1}, \ldots, w_{ip})$. An edge between nodes $\theta_i$ and $\theta_j$ is represented by a point at locations $(\theta_i, \theta_j)$ and $(\theta_j, \theta_i)$ in $\mathbb{R}_+^2$.
Additionally, we develop a Markov chain Monte Carlo (MCMC) algorithm for posterior inference with this
model, and show experiments on two real-world networks with thousands of nodes and tens of thousands of
edges.
The article is organized as follows. The class of random graph models is introduced in Section 2. Properties
of the class of graphs and simulation are described in Section 3. We derive a scalable MCMC algorithm for
posterior inference in Section 4. In Section 5 we provide illustrations of the proposed method on simulated
data and on two networks: a network of citations between political blogs and a network of connections between
US airports. We show that the approach is able to discover interpretable structure in the data.
2. Sparse graph models with overlapping communities
In this section, we present the statistical model for simple graphs. The construction builds on vectors of
completely random measures (CRM, Kingman, 1967). We only provide here the necessary material for the
definition of the network model; please refer to Appendix A for additional background on vectors of CRMs.
The model described in this section can also be extended to bipartite graphs; see Appendix D.
2.1. General construction using vectors of CRMs
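We consider that each node $i$ is embedded at some location $\theta_i \in \mathbb{R}_+$, and has some set of positive weights
$(w_{i1}, \ldots, w_{ip}) \in \mathbb{R}_+^p$. The points $(w_{i1}, \ldots, w_{ip}, \theta_i)_{i=1,\ldots,\infty}$ are assumed to be drawn from a Poisson process
with mean measure
$$\nu(dw_1, \ldots, dw_p, d\theta) = \rho(dw_1, \ldots, dw_p)\, \lambda(d\theta) \tag{5}$$
where $\lambda$ is the Lebesgue measure and $\rho$ is a $\sigma$-finite measure on $\mathbb{R}_+^p$, concentrated on $\mathbb{R}_+^p \setminus \{0\}$, which satisfies
$$\int_{\mathbb{R}_+^p} \min\left(1, \sum_{k=1}^p w_k\right) \rho(dw_1, \ldots, dw_p) < \infty. \tag{6}$$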
Fig 2. An example of (a) the restriction on $[0,1]^2$ of the two atomic measures $D_1$ and $D_2$, (b) the corresponding multiview directed multigraphs (top: view 1; bottom: view 2) and (c) corresponding undirected graph.
Under this condition (Skorohod, 1991; Barndorff-Nielsen et al., 2001), we can describe the set of weights and
locations using a vector of completely random measures $(W_1, \ldots, W_p)$ on $\mathbb{R}_+$:
$$W_k = \sum_{i=1}^{\infty} w_{ik}\, \delta_{\theta_i}, \quad \text{for } k = 1, \ldots, p. \tag{7}$$
We simply write
$$(W_1, \ldots, W_p) \sim \mathrm{CRM}(\rho, \lambda). \tag{8}$$
Mimicking the hierarchical construction of Caron and Fox (2014), we introduce integer-valued random
measures $D_k$ on $\mathbb{R}_+^2$, $k = 1, \ldots, p$,
$$D_k = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} n_{ijk}\, \delta_{(\theta_i, \theta_j)} \tag{9}$$
where the $n_{ijk}$ are natural integers. The vector of random measures $(D_1, \ldots, D_p)$ can be interpreted as
representing a multiview (a.k.a. multiplex or multi-relational) directed multigraph (Verbrugge, 1979;
Salter-Townshend and McCormick, 2013), where $n_{ijk}$ represents the number of interactions from node $i$ to node $j$
in the view $k$; see Figure 2 for an illustration. Conditionally on the vector of CRMs, the measures $D_k$ are
independently drawn from a Poisson process² with mean measure $W_k \times W_k$
$$D_k \mid (W_1, \ldots, W_p) \sim \mathrm{Poisson}(W_k \times W_k) \tag{10}$$
that is, the $n_{ijk}$ are independently Poisson distributed with rate $w_{ik} w_{jk}$.
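Finally, the point process $Z$ representing the graph (Eq. (1)) is deterministically obtained from $(D_1, \ldots, D_p)$
by setting $z_{ij} = 1$ if there is at least one directed connection between $i$ and $j$ in any view, and 0 otherwise,
therefore $z_{ij} = \min(1, \sum_{k=1}^p n_{ijk} + n_{jik})$. To sum up, the graph model is described as follows:
$$\begin{aligned}
W_k &= \sum_{i=1}^{\infty} w_{ik}\, \delta_{\theta_i}, & (W_1, \ldots, W_p) &\sim \mathrm{CRM}(\rho, \lambda) \\
D_k &= \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} n_{ijk}\, \delta_{(\theta_i, \theta_j)}, & D_k \mid W_k &\sim \mathrm{Poisson}(W_k \times W_k) \\
Z &= \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \min\left(1, \sum_{k=1}^p n_{ijk} + n_{jik}\right) \delta_{(\theta_i, \theta_j)}.
\end{aligned} \tag{11}$$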
² Note that we consider a generalized definition of a Poisson process, where the mean measure is allowed to have atoms; see
e.g. Daley and Vere-Jones (2008a, Section 2.4).
Fig 3. An example, for $p = 2$, of (a) the product measures $W_k \times W_k$, (b) a draw of the directed multigraph measures (integer point processes) $D_k \mid W_k \sim \mathrm{Poisson}(W_k \times W_k)$ and (c) corresponding undirected measure (point process) $Z = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \min(1, \sum_{k=1}^p n_{ijk} + n_{jik})\, \delta_{(\theta_i, \theta_j)}$.
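The model construction is illustrated in Figure 3. Integrating out the measures $D_k$, $k = 1, \ldots, p$, the
construction can be expressed as, for $i \leq j$,
$$z_{ij} \mid (w_{\ell 1}, \ldots, w_{\ell p})_{\ell=1,2,\ldots} \sim
\begin{cases}
\mathrm{Ber}\left(1 - \exp\left(-2 \sum_{k=1}^p w_{ik} w_{jk}\right)\right) & i \neq j \\
\mathrm{Ber}\left(1 - \exp\left(-\sum_{k=1}^p w_{ik}^2\right)\right) & i = j
\end{cases} \tag{12}$$
and $z_{ji} = z_{ij}$; see Figure 1.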
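The hierarchical construction (11) can be sketched in a few lines of Python for a finite set of nodes with given weights. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the weights are fixed by hand instead of being drawn from a CRM, the helper names `sample_poisson` and `sample_graph` are hypothetical, and the Poisson sampler uses Knuth's multiplication method, which is only suitable for moderate rates.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_graph(weights, rng):
    """Sample directed counts and the simple graph z as in Eq. (11).

    weights[i][k] plays the role of w_ik for a finite set of nodes
    (an assumption: the model itself may have infinitely many atoms).
    """
    n_nodes, p = len(weights), len(weights[0])
    # n[i][j] = sum over views k of n_ijk, with n_ijk ~ Poisson(w_ik * w_jk).
    n = [[sum(sample_poisson(weights[i][k] * weights[j][k], rng)
              for k in range(p))
          for j in range(n_nodes)]
         for i in range(n_nodes)]
    # z_ij = min(1, sum_k n_ijk + n_jik), symmetric by construction.
    z = [[min(1, n[i][j] + n[j][i]) for j in range(n_nodes)]
         for i in range(n_nodes)]
    return n, z


rng = random.Random(0)
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]  # hypothetical w_ik, p = 2
n, z = sample_graph(weights, rng)
```

Averaging `z[0][1]` over many draws should approach $1 - e^{-2 \sum_k w_{0k} w_{1k}}$, which is the marginal link probability stated in Eq. (12).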
Graph Restrictions. Except in trivial cases, we have $W_k(\mathbb{R}_+) = \infty$ a.s. and therefore $Z(\mathbb{R}_+^2) = \infty$ a.s.,
so the number of points over the plane is infinite a.s. For $\alpha > 0$, we consider restrictions of the measures $W_k$,
$k = 1, \ldots, p$, to the interval $[0, \alpha]$ and of the measures $D_k$ and $Z$ to the box $[0, \alpha]^2$, and write respectively
$W_{k\alpha}$, $D_{k\alpha}$ and $Z_\alpha$ for these restrictions. Note that condition (6) ensures that $W_{k\alpha}([0, \alpha]) < \infty$ a.s. hence
$D_{k\alpha}([0, \alpha]^2) < \infty$ and $Z_\alpha([0, \alpha]^2) < \infty$ a.s. As a consequence, for a given $\alpha > 0$, the model yields a finite
number of edges a.s., even though there may be an infinite number of points $(w_i, \theta_i) \in \mathbb{R}_+ \times [0, \alpha]$; see
Section 3.
Remark 1 The model defined above can also be used for random multigraphs, where $n_{ij} = \sum_{k=1}^p n_{ijk}$ is the
number of directed interactions between $i$ and $j$. Then we have
$$n_{ij} \mid (w_{\ell 1}, \ldots, w_{\ell p})_{\ell=1,2,\ldots} \sim \mathrm{Poisson}\left(\sum_{k=1}^p w_{ik} w_{jk}\right)$$
which is a Poisson non-negative factorization (Lee, 1999; Cemgil, 2009; Psorakis et al., 2011; Ball et al.,
2011; Gopalan et al., 2015).
Fig 4. Graph sampled from the model with three latent communities, identified by colors red, green, blue. For each node,
the intensity of each color is proportional to the value of the associated weight in that community. Pure red/green/blue color
indicates the node is only strongly affiliated to a single community. A mixture of those colors indicates balanced affiliations to
different communities. Graph generated with the software Gephi (Bastian et al., 2009).
Remark 2 The model defined by Eq. (12) can represent networks which exhibit assortativity (Newman,
2003a), meaning that two nodes with similar characteristics (here similar sets of weights) are more likely to
connect than nodes with dissimilar characteristics. The link function can be generalized to (see e.g. Zhou,
2015)
$$z_{ij} \sim \mathrm{Ber}\left(1 - \exp\left(-\sum_{k=1}^p \sum_{\ell=1}^p \eta_{k\ell}\, w_{ik} w_{j\ell}\right)\right)$$
where $\eta_{k\ell} \geq 0$, in order to be able to capture both assortative and dissortative mixing in the network. In
particular, setting larger values off the diagonal than on the diagonal of the matrix $(\eta_{k\ell})_{1 \leq k,\ell \leq p}$ captures
dissortative mixing. The properties and algorithms for simulation and posterior inference can trivially be
extended to this more general case. In order to keep the notation as simple as possible, we focus here on the
simpler link function (12).
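The effect of the interaction matrix in the generalized link function of Remark 2 can be illustrated with a short Python sketch. The function name `generalized_link_probability` and the two example matrices are hypothetical choices made for illustration only.

```python
import math


def generalized_link_probability(w_i, w_j, eta):
    """1 - exp(-sum_k sum_l eta[k][l] * w_i[k] * w_j[l]), as in Remark 2."""
    total = sum(eta[k][l] * w_i[k] * w_j[l]
                for k in range(len(w_i))
                for l in range(len(w_j)))
    return 1.0 - math.exp(-total)


# Hypothetical interaction matrices for p = 2 communities.
assortative = [[1.0, 0.1], [0.1, 1.0]]   # larger values on the diagonal
dissortative = [[0.1, 1.0], [1.0, 0.1]]  # larger values off the diagonal

# Two nodes in the same community, and two nodes in different communities.
same = ([1.0, 0.0], [1.0, 0.0])
different = ([1.0, 0.0], [0.0, 1.0])

p_same_assort = generalized_link_probability(*same, assortative)
p_diff_assort = generalized_link_probability(*different, assortative)
p_same_dissort = generalized_link_probability(*same, dissortative)
p_diff_dissort = generalized_link_probability(*different, dissortative)
```

With the diagonal-dominant matrix, nodes in the same community are more likely to connect; swapping the diagonal and off-diagonal values reverses the ordering.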
2.2. Particular model based on compound CRMs
The key component in our statistical network model is the multivariate Lévy measure $\rho$ in (8). Various
approaches have been developed for constructing multivariate Lévy measures (Tankov, 2003; Cont and Tankov,
2003; Kallsen and Tankov, 2006; Barndorff-Nielsen et al., 2001; Skorohod, 1991), or more specifically vectors
of completely random measures (Epifani and Lijoi, 2010; Leisen and Lijoi, 2011; Leisen et al., 2013; Griffin
et al., 2013; Lijoi et al., 2014). In this paper we consider the following particular form:
$$\rho(dw_1, \ldots, dw_p) = e^{-\sum_{k=1}^p \gamma_k w_k} \int_0^{\infty} w_0^{-p}\, F\left(\frac{dw_1}{w_0}, \ldots, \frac{dw_p}{w_0}\right) \rho_0(dw_0) \tag{13}$$
where $F(d\beta_1, \ldots, d\beta_p)$ is some score probability distribution on $\mathbb{R}_+^p$, with moment generating function $M(t_1, \ldots, t_p)$,
$\rho_0$ is a base Lévy measure on $\mathbb{R}_+$ and $\gamma_k \geq 0$ are exponential tilting parameters for $k = 1, \ldots, p$. The model
defined by (5) and (13) is a special case of the compound completely random measure (CCRM) model
proposed by Griffin and Leisen (2014). It admits the following hierarchical construction, which makes
interpretability, characterization of the conditionals and analysis of this class of models particularly easy. Let
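$$W_0 = \sum_{i=1}^{\infty} w_{i0}\, \delta_{\theta_i} \sim \mathrm{CRM}(\tilde{\rho}_0, \lambda) \tag{14}$$
where $\tilde{\rho}_0$ is a measure on $\mathbb{R}_+$ defined by $\tilde{\rho}_0(dw_0) = M(-w_0 \gamma_1, \ldots, -w_0 \gamma_p)\, \rho_0(dw_0)$, and for $k = 1, \ldots, p$ and
$i = 1, 2, \ldots$
$$w_{ik} = \beta_{ik} w_{i0}$$
where the scores $\beta_{ik}$ have the following joint distribution
$$(\beta_{i1}, \ldots, \beta_{ip}) \mid w_{i0} \overset{\text{ind}}{\sim} H(\cdot \mid w_{i0}) \tag{15}$$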
where $H$ is an exponentially tilted version of $F$:
$$H(d\beta_1, \ldots, d\beta_p \mid w_0) = \frac{e^{-w_0 \sum_{k=1}^p \gamma_k \beta_k}\, F(d\beta_1, \ldots, d\beta_p)}{\int_{\mathbb{R}_+^p} e^{-w_0 \sum_{k=1}^p \gamma_k \tilde{\beta}_k}\, F(d\tilde{\beta}_1, \ldots, d\tilde{\beta}_p)}. \tag{16}$$
Additionally, the set of points $(w_{i0}, \beta_{i1}, \ldots, \beta_{ip})_{i=1,2,\ldots}$ is a Poisson point process with mean measure
$$e^{-w_0 \sum_{k=1}^p \gamma_k \beta_k}\, F(d\beta_1, \ldots, d\beta_p)\, \rho_0(dw_0). \tag{17}$$
Dependence between the different CRMs is tuned both by the shared scaling parameter $w_{i0}$ and by potential
dependency between the scores $(\beta_{i1}, \ldots, \beta_{ip})$. The hierarchical construction has the following interpretation:
• The weight $w_{i0}$ is an individual scaling parameter for node $i$ whose distribution is tuned by the base Lévy measure $\rho_0$. It can be considered as a degree correction, as often used in network models (Karrer and Newman, 2011; Zhao et al., 2012; Herlau et al., 2015). As shown in Section 3, $\rho_0$ tunes the overall sparsity properties of the network.
• The community-related scores $\beta_{ik}$ tune the level of affiliation of node $i$ to community $k$; this is controlled by both the score distribution $F$ and the tilting coefficients $\gamma_k$. These parameters tune the overlapping block-structure of the network.
An example of such a graph with three communities is displayed in Figure 4.
Specific choices for $F$ and $\rho_0$. We now give specific choices of score distribution $F$ and base Lévy
measure $\rho_0$, which lead to scalable inference algorithms. As in Griffin and Leisen (2014), we consider that $F$
is a product of independent gamma distributions
$$F(d\beta_1, \ldots, d\beta_p) = \prod_{k=1}^p \beta_k^{a_k - 1} e^{-b_k \beta_k} \frac{b_k^{a_k}}{\Gamma(a_k)}\, d\beta_k \tag{18}$$
where $a_k > 0$, $b_k > 0$, $k = 1, \ldots, p$, which leads to
$$H(d\beta_1, \ldots, d\beta_p \mid w_0) \propto \prod_{k=1}^p \beta_k^{a_k - 1} e^{-b_k \beta_k - w_0 \gamma_k \beta_k}\, d\beta_k$$
which is also a product of gamma distributions.
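With the gamma score distribution (18), the tilted distribution $H(\cdot \mid w_0)$ is a product of gamma distributions with shapes $a_k$ and rates $b_k + \gamma_k w_0$, so the node weights $w_{ik} = \beta_{ik} w_{i0}$ can be drawn directly once $w_{i0}$ is known. The Python sketch below illustrates this step only; the function name `sample_weights` and the parameter values are hypothetical, and the value of $w_{i0}$ is fixed by hand rather than drawn from the CRM (14).

```python
import random


def sample_weights(w0, a, b, gamma, rng):
    """Draw (w_1, ..., w_p) = w0 * (beta_1, ..., beta_p) for one node.

    Each score beta_k | w0 is Gamma(shape=a_k, rate=b_k + gamma_k * w0).
    random.gammavariate takes a scale parameter, hence the reciprocal.
    """
    scores = [rng.gammavariate(a_k, 1.0 / (b_k + g_k * w0))
              for a_k, b_k, g_k in zip(a, b, gamma)]
    return [w0 * beta_k for beta_k in scores]


rng = random.Random(1)
# Hypothetical hyperparameters for p = 2 communities.
a, b, gamma = [2.0, 0.5], [1.0, 1.0], [1.0, 0.0]
w = sample_weights(1.0, a, b, gamma, rng)
```

A larger tilting coefficient $\gamma_k$ increases the rate of the corresponding gamma distribution and therefore shrinks the scores of nodes with a large base weight $w_{i0}$ in community $k$.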
$\rho_0$ is set to be the mean measure of the jump part of a generalized gamma process (Hougaard, 1986;
Brix, 1999), which has been extensively used in BNP models due to its generality, the interpretability of its
parameters and its attractive conjugacy properties (James, 2002; Lijoi et al., 2007; Saeedi and Bouchard-Côté,
2011; Caron, 2012; Caron et al., 2014). The Lévy measure in this case is
$$\rho_0(dw_0) = \frac{1}{\Gamma(1 - \sigma)}\, w_0^{-1-\sigma} \exp(-w_0 \tau)\, dw_0 \tag{19}$$
where the parameters $(\sigma, \tau)$ verify
$$\sigma \in (0, 1),\ \tau \geq 0 \quad \text{or} \quad \sigma \in (-\infty, 0],\ \tau > 0. \tag{20}$$
The gamma process ($\sigma = 0$), the inverse Gaussian process ($\sigma = \frac{1}{2}$) and the stable process ($\sigma \in (0, 1)$,
$\tau = 0$) are special cases. Using (18) and (19), the multivariate Lévy measure has the following analytic form
$$\rho(dw_1, \ldots, dw_p) = \frac{2\, e^{-\sum_{k=1}^p \gamma_k w_k}}{\Gamma(1 - \sigma)} \left[\prod_{k=1}^p \frac{w_k^{a_k - 1} b_k^{a_k}}{\Gamma(a_k)}\right] \left(\frac{\tau}{\sum_{k=1}^p b_k w_k}\right)^{\kappa/2} K_\kappa\left(2 \sqrt{\tau \sum_{k=1}^p b_k w_k}\right) dw_1 \ldots dw_p$$
where $\kappa = \sigma + \sum_{k=1}^p a_k$ and $K$ is the modified Bessel function of the second kind.
3. Properties and Simulation
3.1. Exchangeability
The point process $Z$ defined by (11) is jointly exchangeable in the sense of Kallenberg (1990, 2005). For any
$h > 0$ and any permutation $\pi$ of $\mathbb{N}$
$$(Z(A_i \times A_j)) \overset{d}{=} (Z(A_{\pi(i)} \times A_{\pi(j)})) \quad \text{for } (i, j) \in \mathbb{N}^2 \tag{21}$$
where $A_i = [h(i - 1), hi]$. This follows directly from the fact that the vector of CRMs $(W_1, \ldots, W_p)$ has
independent and identically distributed increments, hence
$$(W_1(A_i), \ldots, W_p(A_i)) \overset{d}{=} (W_1(A_{\pi(i)}), \ldots, W_p(A_{\pi(i)})). \tag{22}$$
The model thus falls into the general representation theorem for exchangeable point processes (Kallenberg,
1990).
3.2. Sparsity
In this section, following the asymptotic notations of Janson (2011), we derive the sparsity properties of our
graph model, first for the general construction of Section 2.1, then for the specific construction on compound
CRMs of Section 2.2. Similarly to the notations in Caron and Fox (2014), let $Z_\alpha$ be the restriction of $Z$ to
the box $[0, \alpha]^2$. Let $(N_\alpha)_{\alpha \geq 0}$ and $(N_\alpha^{(e)})_{\alpha \geq 0}$ be counting processes respectively corresponding to the number
of nodes and edges in $Z_\alpha$:
$$N_\alpha = \mathrm{card}(\{\theta_i \in [0, \alpha] \mid Z(\{\theta_i\} \times [0, \alpha]) > 0\})$$
$$N_\alpha^{(e)} = Z(\{(x, y) \in \mathbb{R}_+^2 \mid 0 \leq x \leq y \leq \alpha\}).$$
Note that in the propositions below, we discard the trivial case $\int_{\mathbb{R}_+^p} \rho(dw_1, \ldots, dw_p) = 0$ which implies
$N_\alpha^{(e)} = N_\alpha = 0$ a.s.
General construction. The next proposition characterizes the sparsity properties of the random graph
depending on the properties of the Lévy measure $\rho$. In particular, if
$$\int_{\mathbb{R}_+^p} \rho(dw_1, \ldots, dw_p) = \infty \tag{23}$$
then, for any $\alpha > 0$, there is a.s. an infinite number of $\theta_i \in [0, \alpha]$ for which $\sum_k w_{ik} > 0$ and the vector of
CRMs is called infinite-activity. Otherwise, it is finite-activity.
Proposition 3 Assume that, for any $k = 1, \ldots, p$,
$$\int_{\mathbb{R}_+^p} w_k\, \rho(dw_1, \ldots, dw_p) < \infty. \tag{24}$$
Then
$$N_\alpha^{(e)} = \begin{cases} \Theta(N_\alpha^2) & \text{if } (W_1, \ldots, W_p) \text{ is finite-activity} \\ o(N_\alpha^2) & \text{otherwise} \end{cases}$$
a.s. as $\alpha$ tends to $\infty$.
The proof is given in Appendix B.
Construction based on CCRMs. For the CCRM Lévy measure (13), the sparsity properties are solely
tuned by the base Lévy measure $\rho_0$. Ignoring trivial degenerate cases for the score distribution $F$, it is easily
shown that the CCRM model defined by (5) and (13) is infinite-activity iff the Lévy measure $\rho_0$ verifies
$$\int_0^{\infty} \rho_0(dw) = \infty. \tag{25}$$
In this case all CRMs $W_0, W_1, \ldots, W_p$ are infinite-activity. Otherwise they are all finite-activity and the vector
of CRMs is finite-activity. In the particular case of a CCRM with independent gamma distributed scores (18)
and generalized gamma process base measure (19), the condition (25) is satisfied whenever $\sigma \geq 0$. The next
proposition characterizes the sparsity of the network depending on the properties of the base Lévy measure
$\rho_0$.
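Proposition 4 Assume that
$$\int_0^{\infty} w\, \rho_0(dw) < \infty \tag{26}$$
and $F$ is not degenerate at 0. Then
$$N_\alpha^{(e)} = \begin{cases} \Theta(N_\alpha^2) & \text{if } \int_0^{\infty} \rho_0(dw) < \infty \\ o(N_\alpha^2) & \text{otherwise} \end{cases}$$
a.s. as $\alpha$ tends to $\infty$. Furthermore, if the tail Lévy intensity $\bar{\rho}_0$ defined by
$$\bar{\rho}_0(x) = \int_x^{\infty} \rho_0(dw), \tag{27}$$
is a regularly varying function, i.e.
$$\frac{\bar{\rho}_0(x)}{x^{-\sigma}\, \ell(1/x)} \to 1 \quad \text{as } x \to 0$$
for some $\sigma \in (0, 1)$ where $\ell$ is a slowly varying function verifying $\lim_{t \to \infty} \ell(at)/\ell(t) = 1$ for any $a > 0$ and
$\lim_{t \to \infty} \ell(t) > 0$, then
$$N_\alpha^{(e)} = O(N_\alpha^{2/(1+\sigma)})$$
a.s. as $\alpha$ tends to $\infty$. In the particular case of a CCRM with independent gamma distributed scores (18) and
generalized gamma process base measure (19), condition (26) is equivalent to having $\tau > 0$. In this case, we
therefore have
$$N_\alpha^{(e)} = \begin{cases} \Theta(N_\alpha^2) & \text{if } \sigma < 0 \\ o(N_\alpha^2) & \text{if } \sigma \geq 0 \\ O(N_\alpha^{2/(1+\sigma)}) & \text{if } \sigma \in (0, 1). \end{cases}$$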
The proof is given in Appendix B. Figure 5(a) provides an empirical illustration of Proposition 4 for a
CCRM with independent gamma scores and generalized gamma base Lévy measure. Figure 5(b) shows
empirically that the degree distribution also exhibits a power-law behaviour when $\sigma \in (0, 1)$.
3.3. Simulation
The point process $Z$ is defined on the plane. We describe in this section how to sample realizations of
restrictions $Z_\alpha$ of $Z$ to the box $[0, \alpha]^2$.
General construction. The hierarchical construction given by Eq. (11) suggests a direct way to sample
from the model:
1. Sample $(w_{i1}, \ldots, w_{ip}, \theta_i)_{i=1,2,\ldots}$ from a Poisson process with mean measure $\nu(dw_1, \ldots, dw_p, d\theta)\, \mathbb{1}_{\theta \in [0, \alpha]}$.
2. For each pair of points, sample $z_{ij}$ from (12).
Fig 5. Empirical analysis of the properties of CCRM based graphs generated with parameters $p = 2$, $\tau = 1$, $a_k = 0.2$, $b_k = 1/p$
and averaging over various $\alpha$. (a) Number of edges versus the number of nodes and (b) degree distributions on a log-log scale
for various $\sigma$: one finite-activity CCRM ($\sigma = -0.5$) and three infinite-activity CCRMs ($\sigma = 0.2$, $\sigma = 0.5$ and $\sigma = 0.8$). In (a)
we note growth at a rate $\Theta(N_\alpha^2)$ for $\sigma = -0.5$ and $O(N_\alpha^{2/(1+\sigma)})$ for $\sigma \in (0, 1)$.
There are two caveats to this strategy. First, for infinite-activity CRMs, the number of points in $\mathbb{R}_+^p \times [0, \alpha]$
is almost surely infinite; even for finite-activity CRMs, it may be so large that it is not practically feasible.
We therefore need to resort to an approximation, by sampling from a Poisson process with an approximate
mean measure $\nu^{\varepsilon}(dw_1, \ldots, dw_p, d\theta)\, \mathbb{1}_{\theta \in [0, \alpha]} = \rho^{\varepsilon}(dw_1, \ldots, dw_p)\, \lambda(d\theta)\, \mathbb{1}_{\theta \in [0, \alpha]}$ where
$$\int_{\mathbb{R}_+^p} \rho^{\varepsilon}(dw_1, \ldots, dw_p) < \infty$$
with $\varepsilon > 0$ controlling the level of approximation. The approximation is specific to the choice of the mean
measure, and we describe such an approximation for CCRMs below.
The second caveat is that, for applying Eq. (12), we need to consider all pairs $i \leq j$, which can be
computationally problematic. We can instead, similarly to Caron and Fox (2014), use the hierarchical Poisson
construction as follows:
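1. Sample $(w_{i1}, \ldots, w_{ip}, \theta_i)_{i=1,2,\ldots,K}$ from a Poisson process with mean measure $\nu^{\varepsilon}(dw_1, \ldots, dw_p, d\theta)\, \mathbb{1}_{\theta \in [0, \alpha]}$. Let $W_{k,\alpha}^{\varepsilon} = \sum_{i=1}^K w_{ik}\, \delta_{\theta_i}$ be the associated truncated CRMs and $W_{k,\alpha}^{\varepsilon *} = \sum_{i=1}^K w_{ik}$ their total masses.
2. For $k = 1, \ldots, p$, sample $D_{k,\alpha}^{*} \mid W_{k,\alpha}^{\varepsilon *} \sim \mathrm{Poisson}((W_{k,\alpha}^{\varepsilon *})^2)$.
3. For $k = 1, \ldots, p$, $\ell = 1, \ldots, D_{k,\alpha}^{*}$, $j = 1, 2$, sample $U_{k\ell j} \mid W_{k,\alpha}^{\varepsilon} \overset{\text{ind}}{\sim} W_{k,\alpha}^{\varepsilon} / W_{k,\alpha}^{\varepsilon *}$.
4. Set $D_{k,\alpha}^{\varepsilon} = \sum_{\ell=1}^{D_{k,\alpha}^{*}} \delta_{(U_{k\ell 1}, U_{k\ell 2})}$.
5. Obtain $Z$ from $(D_1, \ldots, D_p)$ as in (11).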
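Steps 2 to 5 above can be sketched in Python once a finite set of weights is available. The sketch below is illustrative only: it assumes the truncated weights from step 1 are already given as a list of lists, identifies nodes by their index instead of their location $\theta_i$, and uses hypothetical helper names (`sample_poisson`, `sample_edges`). The Poisson sampler relies on Knuth's multiplication method, which is only suitable for moderate squared total masses.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; only suitable for moderate rates."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_edges(weights, rng):
    """Return the set of undirected edges (i, j), i <= j, for given weights.

    weights[i][k] plays the role of the truncated weight w_ik.
    """
    n_nodes, p = len(weights), len(weights[0])
    edges = set()
    for k in range(p):
        column = [weights[i][k] for i in range(n_nodes)]
        total_mass = sum(column)
        if total_mass == 0.0:
            continue
        # Step 2: number of directed interactions in view k.
        n_pairs = sample_poisson(total_mass ** 2, rng)
        # Step 3: both endpoints drawn from the normalized weights.
        endpoints = rng.choices(range(n_nodes), weights=column, k=2 * n_pairs)
        # Steps 4-5: collapse directed multi-edges into undirected edges.
        for source, target in zip(endpoints[0::2], endpoints[1::2]):
            edges.add((min(source, target), max(source, target)))
    return edges


rng = random.Random(0)
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [0.0, 0.0]]  # hypothetical
edges = sample_edges(weights, rng)
```

The cost of this scheme is driven by the number of sampled interactions rather than by the number of node pairs, which is the point of the second caveat.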
Construction based on CCRMs. The hierarchical construction of compound CRMs suggests an
algorithm to simulate a vector of CRMs. We consider the following (truncated) mean measure
$$\rho^{\varepsilon}(dw_1, \ldots, dw_p) = e^{-\sum_{k=1}^p \gamma_k w_k} \int_{\varepsilon}^{\infty} w_0^{-p}\, F\left(\frac{dw_1}{w_0}, \ldots, \frac{dw_p}{w_0}\right) \rho_0(dw_0) \tag{28}$$
with $\varepsilon \geq 0$. We can sample from the (truncated) CCRM as follows
1. (a) Sample $(w_{i0}, \theta_i)_{i=1,\ldots,K}$ from a Poisson point process with mean measure $\tilde{\rho}_0(dw_0)\, \lambda(d\theta)\, \mathbb{1}_{\{w_0 > \varepsilon,\, \theta \in [0, \alpha]\}}$.
(b) For $i = 1, \ldots, K$ and $k = 1, \ldots, p$, set $w_{ik} = \beta_{ik} w_{i0}$ where $(\beta_{i1}, \ldots, \beta_{ip}) \mid w_{i0}$ is drawn from (15).
The truncation level $\varepsilon$ is set to 0 for finite-activity CCRMs, and $\varepsilon > 0$ otherwise. We explain in Appendix C
how to perform step 1.(a) in the case of a tilted generalized gamma process.
4. Posterior inference
In this section, we describe an MCMC algorithm for posterior inference of the model parameters and
hyperparameters in the statistical network model defined in Section 2. We first describe the data augmentation
scheme and characterization of conditionals. We then describe the sampler for a general Lévy measure $\rho$, and
finally derive the sampler for compound CRMs.
4.1. Characterization of conditionals and data augmentation
Assume that we have observed a set of connections $(z_{ij})_{1 \leq i,j \leq N_\alpha}$, where $N_\alpha$ is the number of nodes with
at least one connection. We aim at inferring the positive parameters $(w_{i1}, \ldots, w_{ip})_{i=1,\ldots,N_\alpha}$ associated to
the nodes with at least one connection. We also want to estimate the positive parameters associated to the
other nodes with no connection. The number of such nodes may be large, and even infinite for
infinite-activity CRMs; but under our model, these parameters are only identifiable through their sum, denoted
$(w_{*1}, \ldots, w_{*p})$. Note that the node locations $\theta_i$ are not likelihood identifiable, and we will not try to infer
them. We assume that there is a set of unknown hyperparameters $\phi$ of the mean intensity $\rho$, with prior $p(\phi)$.
We assume that the Lévy measure $\rho$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^p$,
and write simply $\rho(dw_1, \ldots, dw_p; \phi) = \rho(w_1, \ldots, w_p; \phi)\, dw_1 \ldots dw_p$. The parameter $\alpha$ is also assumed to be
unknown, with some prior $\alpha \sim \mathrm{Gamma}(a_\alpha, b_\alpha)$ with $a_\alpha > 0$, $b_\alpha > 0$. We therefore aim at approximating
$p((w_{1k}, \ldots, w_{N_\alpha k}, w_{*k})_{k=1,\ldots,p}, \phi, \alpha \mid (z_{ij})_{1 \leq i,j \leq N_\alpha})$.
As a first step, we characterize the conditional distribution of the restricted vector of CRMs $(W_{1\alpha}, \ldots, W_{p\alpha})$
given the restricted measures $(D_{1\alpha}, \ldots, D_{p\alpha})$. Proposition 5 below extends Theorem 12 in Caron and Fox
(2014) to the multivariate setting.
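Proposition 5 Let $(\theta_1, \ldots, \theta_{N_\alpha})$, $N_\alpha \geq 0$ be the support points of $(D_{1\alpha}, \ldots, D_{p\alpha})$, with
$$D_{k\alpha} = \sum_{1 \leq i,j \leq N_\alpha} n_{ijk}\, \delta_{(\theta_i, \theta_j)}.$$
The conditional distribution of $(W_{1\alpha}, \ldots, W_{p\alpha})$ given $(D_{1\alpha}, \ldots, D_{p\alpha})$ is equivalent to the distribution of
$$\left(\widetilde{W}_1 + \sum_{i=1}^{N_\alpha} w_{i1}\, \delta_{\theta_i}, \ldots, \widetilde{W}_p + \sum_{i=1}^{N_\alpha} w_{ip}\, \delta_{\theta_i}\right) \tag{29}$$
where $(\widetilde{W}_1, \ldots, \widetilde{W}_p)$ is a vector of discrete random measures, which depends on $(D_{1\alpha}, \ldots, D_{p\alpha})$ only through
the total masses $w_{*k} = \widetilde{W}_k([0, \alpha])$.
The set of weights $(w_{ik})_{i=1,\ldots,N_\alpha;\, k=1,\ldots,p}$ and $(w_{*k})_{k=1,\ldots,p}$ are dependent, with joint conditional distribution
$$p((w_{1k}, \ldots, w_{N_\alpha k}, w_{*k})_{k=1,\ldots,p} \mid (n_{ijk})_{1 \leq i,j \leq N_\alpha;\, k=1,\ldots,p}, \phi, \alpha)
\propto \left[\prod_{i=1}^{N_\alpha} \prod_{k=1}^p w_{ik}^{m_{ik}}\right] e^{-\sum_{k=1}^p \left(w_{*k} + \sum_{i=1}^{N_\alpha} w_{ik}\right)^2} \left[\prod_{i=1}^{N_\alpha} \rho(w_{i1}, \ldots, w_{ip}; \phi)\right] \alpha^{N_\alpha}\, g_\alpha(w_{*1}, \ldots, w_{*p}; \phi) \tag{30}$$
where $m_{ik} = \sum_{j=1}^{N_\alpha} n_{ijk} + n_{jik}$ and $g_\alpha(w_{*1}, \ldots, w_{*p}; \phi)$ is the probability density function of the random
vector $(W_1([0, \alpha]), \ldots, W_p([0, \alpha]))$.
The proof can be straightforwardly adapted from that of Caron and Fox (2014), or from Proposition 5.2 of
James (2014) and is omitted here. It builds on other posterior characterizations in Bayesian nonparametric
models (Prünster, 2002; James, 2002, 2005; James et al., 2009).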
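Data augmentation. Similarly to Caron and Fox (2014), we introduce latent count variables $\tilde{n}_{ijk} = n_{ijk} + n_{jik}$ with
$$\begin{aligned}
(\tilde{n}_{ij1}, \ldots, \tilde{n}_{ijp}) \mid w, z &\sim \begin{cases} \delta_{(0, \ldots, 0)} & \text{if } z_{ij} = 0 \\ \mathrm{tPoisson}(2 w_{i1} w_{j1}, \ldots, 2 w_{ip} w_{jp}) & \text{if } z_{ij} = 1,\ i \neq j \end{cases} \\
\left(\frac{\tilde{n}_{ij1}}{2}, \ldots, \frac{\tilde{n}_{ijp}}{2}\right) \mid w, z &\sim \mathrm{tPoisson}(w_{i1}^2, \ldots, w_{ip}^2) \quad \text{if } z_{ij} = 1,\ i = j
\end{aligned} \tag{31}$$
where $\mathrm{tPoisson}(\lambda_1, \ldots, \lambda_p)$ is the multivariate Poisson distribution truncated at zero, whose pmf is
$$\mathrm{tPoisson}(x_1, \ldots, x_p; \lambda_1, \ldots, \lambda_p) = \frac{\prod_{k=1}^p \mathrm{Poisson}(x_k; \lambda_k)}{1 - \exp\left(-\sum_{k=1}^p \lambda_k\right)}\, \mathbb{1}_{\{\sum_{k=1}^p x_k > 0\}}.$$
One can sample from this distribution by first sampling $x = \sum_{k=1}^p x_k$ from a zero-truncated Poisson
distribution with rate $\sum_{k=1}^p \lambda_k$, and then $(x_1, \ldots, x_p) \mid (\lambda_1, \ldots, \lambda_p), x \sim \mathrm{Multinomial}\left(x, \left(\frac{\lambda_1}{\sum_k \lambda_k}, \ldots, \frac{\lambda_p}{\sum_k \lambda_k}\right)\right)$.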
4.2. Markov chain Monte Carlo algorithm: General construction
Using the data augmentation scheme together with the posterior characterization (30), we can derive the
following MCMC sampler, which uses Metropolis-Hastings (MH) and Hamiltonian Monte Carlo (HMC)
updates within a Gibbs sampler, and iterates as described in Algorithm 1.
Algorithm 1 Markov chain Monte Carlo sampler for posterior inference.
At each iteration
1. Update $(w_{i1}, \ldots, w_{ip})$, $i = 1, \ldots, N_\alpha$, given the rest using MH or HMC.
2. Update hyperparameters $(\phi, \alpha)$ and total masses $(w_1, \ldots, w_p)$ given the rest using MH.
3. Update the latent variables given the rest using (31).
In general, if the Lévy intensity $\rho$ can be evaluated pointwise, one can use a MH update for Step 1, but it would scale poorly with the number of nodes. Alternatively, if the Lévy intensity $\rho$ is differentiable, one can use a Hamiltonian Monte Carlo update (Duane et al., 1987; Neal, 2011).
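The challenging part of Algorithm 1 is Step 2. From Eq. (30) we have
$$p\big((w_k)_{k=1,\ldots,p}, \phi, \alpha \mid \text{rest}\big) \propto p(\phi)\,p(\alpha)\, e^{-\sum_{k=1}^{p}\left(w_k + \sum_{i=1}^{N_\alpha} w_{ik}\right)^2} \left[\prod_{i=1}^{N_\alpha} \rho(w_{i1}, \ldots, w_{ip}; \phi)\right] \alpha^{N_\alpha} g_\alpha(w_1, \ldots, w_p; \phi).$$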
This conditional distribution is not of standard form and involves the multivariate pdf $g_\alpha(w_1, \ldots, w_p)$ of the random vector $(W_1([0, \alpha]), \ldots, W_p([0, \alpha]))$, for which there is typically no analytical expression. All that is available is its Laplace transform, which is given by
$$E\left[e^{-\sum_{k=1}^{p} t_k W_k([0, \alpha])}\right] = e^{-\alpha \psi(t_1, \ldots, t_p; \phi)} \tag{32}$$
where
$$\psi(t_1, \ldots, t_p; \phi) = \int_{\mathbb{R}_+^p} \left(1 - e^{-\sum_{k=1}^{p} t_k w_k}\right) \rho(dw_1, \ldots, dw_p; \phi) \tag{33}$$
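is the multivariate Laplace exponent, which involves a $p$-dimensional integral. We propose to use a Metropolis-Hastings step, with proposal
$$q(\widetilde{w}_{1:p}, \widetilde{\phi}, \widetilde{\alpha} \mid w_{1:p}, \phi, \alpha) = q(\widetilde{w}_{1:p} \mid w_{1:p}, \widetilde{\phi}, \widetilde{\alpha}) \times q(\widetilde{\phi} \mid \phi) \times q(\widetilde{\alpha} \mid \alpha, \widetilde{\phi}, w_{1:p}) \tag{34}$$
where
$$q(\widetilde{\alpha} \mid \alpha, \widetilde{\phi}, w_{1:p}) = \mathrm{Gamma}\big(\widetilde{\alpha};\, a_\alpha + N_\alpha,\, b_\alpha + \psi(\lambda_1, \ldots, \lambda_p; \widetilde{\phi})\big) \tag{35}$$
and the proposal for $w_{1:p}$ is an exponentially tilted version of $g_\alpha$
$$q\big((\widetilde{w}_k)_{k=1,\ldots,p} \mid (w_k)_{k=1,\ldots,p}, \widetilde{\phi}, \widetilde{\alpha}\big) = \frac{e^{-\sum_{k=1}^{p} \lambda_k \widetilde{w}_k}\, g_{\widetilde{\alpha}}(\widetilde{w}_1, \ldots, \widetilde{w}_p; \widetilde{\phi})}{e^{-\widetilde{\alpha}\, \psi(\lambda_1, \ldots, \lambda_p; \widetilde{\phi})}} \tag{36}$$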
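where $\lambda_k = w_k + 2\sum_{i=1}^{N_\alpha} w_{ik}$ and $q(\widetilde{\phi} \mid \phi)$ can be freely specified by the user. This leads to the following acceptance ratio
$$r = \frac{p(\widetilde{\phi})\, q(\phi \mid \widetilde{\phi})}{p(\phi)\, q(\widetilde{\phi} \mid \phi)} \left[\prod_{i=1}^{N_\alpha} \frac{\rho(w_{i1}, \ldots, w_{ip}; \widetilde{\phi})}{\rho(w_{i1}, \ldots, w_{ip}; \phi)}\right] \left(\frac{b_\alpha + \psi(\widetilde{\lambda}_1, \ldots, \widetilde{\lambda}_p; \phi)}{b_\alpha + \psi(\lambda_1, \ldots, \lambda_p; \widetilde{\phi})}\right)^{a_\alpha + N_\alpha} e^{\sum_{k=1}^{p}\left[w_k^2 - \widetilde{w}_k^2\right]}$$
where $\widetilde{\lambda}_k = \widetilde{w}_k + 2\sum_{i=1}^{N_\alpha} w_{ik}$. This acceptance ratio involves evaluating the multivariate exponent (33).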
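For numerical stability, the Gamma proposal (35) and the acceptance ratio are usually evaluated in log space. The sketch below is an illustration under stated assumptions rather than the authors' code: it assumes a Gamma($a_\alpha$, $b_\alpha$) prior on $\alpha$ (as implied by the shape and rate in (35)), that the Lévy intensity enters through the product over nodes as in the conditional density of Step 2, and that the values of $\psi$ and of $\log\rho$ are computed elsewhere. All function and argument names are hypothetical.

```python
import numpy as np

def propose_alpha(a_alpha, b_alpha, n_nodes, psi_fwd, rng):
    """Draw alpha from Gamma(shape=a_alpha + N_alpha, rate=b_alpha + psi_fwd)."""
    # NumPy parameterises the gamma distribution by scale = 1 / rate.
    return rng.gamma(shape=a_alpha + n_nodes, scale=1.0 / (b_alpha + psi_fwd))

def log_acceptance_ratio(log_prior_ratio, log_q_ratio, log_rho_new, log_rho_old,
                         psi_rev, psi_fwd, a_alpha, b_alpha, n_nodes,
                         w_tot, w_tot_new):
    """Log of the MH acceptance ratio for the joint (total masses, phi, alpha) move.

    log_prior_ratio : log p(phi_new) - log p(phi)
    log_q_ratio     : log q(phi | phi_new) - log q(phi_new | phi)
    log_rho_new/old : per-node log rho(w_i1..w_ip; phi_new) and (...; phi)
    psi_rev         : psi(lambda_new; phi);  psi_fwd : psi(lambda; phi_new)
    w_tot, w_tot_new: current and proposed total masses (length p)
    """
    w_tot = np.asarray(w_tot, dtype=float)
    w_tot_new = np.asarray(w_tot_new, dtype=float)
    return (log_prior_ratio + log_q_ratio
            + np.sum(log_rho_new) - np.sum(log_rho_old)
            + (a_alpha + n_nodes) * (np.log(b_alpha + psi_rev) - np.log(b_alpha + psi_fwd))
            + np.sum(w_tot ** 2 - w_tot_new ** 2))
```

The move is then accepted when `np.log(rng.uniform())` is smaller than the returned value.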
In the general case, Algorithm 1 thus requires the ability to
(a) evaluate pointwise the Lévy intensity $\rho$, and potentially differentiate it,
(b) evaluate pointwise the Laplace exponent (33) and
(c) sample from the exponentially tilted distribution (36).
Regarding point (c), the random variable with pdf (36) has the same distribution as the random vector $(W'_1([0, \alpha]), \ldots, W'_p([0, \alpha]))$ where $(W'_1, \ldots, W'_p) \sim \mathrm{CRM}(\rho', \lambda)$ and $\rho'$ is an exponentially tilted version of $\rho$:
$$\rho'(w_1, \ldots, w_p) = e^{-\sum_k \lambda_k w_k} \rho(w_1, \ldots, w_p). \tag{37}$$
By considering an approximate tilted intensity $\rho'_\varepsilon(w_1, \ldots, w_p)$, one can approximately sample from (36) by simulating points from a Poisson process with mean measure $\alpha \rho'_\varepsilon(w_1, \ldots, w_p)$ and summing them up.
4.3. Markov chain Monte Carlo algorithm: Compound CRMs
The hierarchical construction of CCRMs allows several simplifications of the algorithm described in the previous section. Using the construction $w_{ik} = \beta_{ik} w_{i0}$, where the points $(w_{i0}, \beta_{i1}, \ldots, \beta_{ip})_{i=1,2,\ldots}$ have Lévy measure (17), we aim at approximating the posterior
$$p\big((w_{10}, \ldots, w_{N_\alpha 0}), (\beta_{1k}, \ldots, \beta_{N_\alpha k}, w_k)_{k=1,\ldots,p}, \phi, \alpha \mid (z_{ij})_{1\le i,j\le N_\alpha}\big). \tag{38}$$
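Conditional on the latent count variables defined in (31), we have the following conditional characterization, similar to (30):
$$\begin{aligned}
p\big((w_{10}, \ldots, w_{N_\alpha 0}), (\beta_{1k}, \ldots, \beta_{N_\alpha k}, w_k)_{k=1,\ldots,p} &\mid (n_{ijk})_{1\le i,j\le N_\alpha;\,k=1,\ldots,p}, \phi, \alpha\big) \\
&\propto \left[\prod_{i=1}^{N_\alpha} w_{i0}^{m_i} \prod_{k=1}^{p} \beta_{ik}^{m_{ik}}\right] e^{-\sum_{k=1}^{p}\left(w_k + \sum_{i=1}^{N_\alpha} w_{ik}\right)^2 - \sum_{i=1}^{N_\alpha} w_{i0}\left(\sum_{k=1}^{p} \gamma_k \beta_{ik}\right)} \\
&\quad \times \left[\prod_{i=1}^{N_\alpha} f(\beta_{i1}, \ldots, \beta_{ip}; \phi)\, \rho_0(w_{i0}; \phi)\right] \alpha^{N_\alpha} g_\alpha(w_1, \ldots, w_p; \phi)
\end{aligned} \tag{39}$$
where $m_i = \sum_{k=1}^{p} m_{ik}$ and $f$ and $\rho_0$ are densities of $F$ and $\rho_0$ with respect to the Lebesgue measure.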
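If $f$ and $\rho_0$ are differentiable, one can use a HMC update for Step 1 of Algorithm 1. In particular, when they take the form (18) and (19), we obtain the following simple expressions for the gradient:
$$\frac{\partial U(q)}{\partial(\log w_{i0})} = m_i - \sigma - w_{i0}\left(\tau + 2\sum_{k=1}^{p} \beta_{ik}\left(w_k + \sum_{j=1}^{N_\alpha} w_{j0}\beta_{jk}\right)\right), \quad i = 1, \ldots, N_\alpha,$$
$$\frac{\partial U(q)}{\partial(\log \beta_{ik})} = m_{ik} + a_k - \beta_{ik}\left(b_k + 2w_{i0}\left(w_k + \sum_{j=1}^{N_\alpha} w_{j0}\beta_{jk}\right)\right), \quad i = 1, \ldots, N_\alpha,\ k = 1, \ldots, p,$$
where $U(q) = \log p(q \mid \text{rest})$ with $q = (\log w_{i0}, \log \beta_{i1}, \ldots, \log \beta_{ip})_{i=1,\ldots,N_\alpha}$.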
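To make the role of the gradient of $U$ concrete, here is a generic leapfrog HMC update written in Python. It is a textbook sketch, not the authors' implementation: `log_post` and `grad_log_post` are hypothetical user-supplied callables returning $U(q)$ and its gradient for a flattened parameter vector `q`, and a unit mass matrix is assumed.

```python
import numpy as np

def hmc_update(q, log_post, grad_log_post, step_size, n_leapfrog, rng):
    """One HMC update of q targeting exp(log_post); returns (new_q, accepted)."""
    p0 = rng.standard_normal(q.shape)                  # resample the momentum
    q_new = q.copy()
    p = p0 + 0.5 * step_size * grad_log_post(q_new)    # initial half step
    for step in range(n_leapfrog):
        q_new = q_new + step_size * p                  # full position step
        if step < n_leapfrog - 1:
            p = p + step_size * grad_log_post(q_new)   # full momentum step
    p = p + 0.5 * step_size * grad_log_post(q_new)     # final half step
    # Difference of (log target - kinetic energy) between proposal and current state.
    log_ratio = (log_post(q_new) - 0.5 * p @ p) - (log_post(q) - 0.5 * p0 @ p0)
    if np.log(rng.uniform()) < log_ratio:
        return q_new, True
    return q, False

# Usage on a toy standard-normal target (not the graph model).
rng = np.random.default_rng(0)
q = np.zeros(3)
q, accepted = hmc_update(q, lambda x: -0.5 * x @ x, lambda x: -x, 0.1, 10, rng)
```

Working with $\log w_{i0}$ and $\log \beta_{ik}$ keeps the sampler on an unconstrained space, which is why the gradients above are taken with respect to the log-parameters.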
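Regarding Step 2 of Algorithm 1, the Laplace exponent for CCRM takes the simple form
$$\psi(t_1, \ldots, t_p) = \int_0^\infty \left[M(-w_0\gamma_1, \ldots, -w_0\gamma_p) - M(-w_0(t_1 + \gamma_1), \ldots, -w_0(t_p + \gamma_p))\right] \rho_0(dw_0) \tag{40}$$
which only requires evaluating a one-dimensional integral, whatever the number $p$ of communities, and this can be done numerically. For the specific model defined by (18) and (19), we obtain
$$\psi(t_1, \ldots, t_p) = \frac{1}{\Gamma(1 - \sigma)} \int_0^\infty \left[1 - \prod_{k=1}^{p}\left(1 + \frac{w_0 t_k}{b_k + w_0\gamma_k}\right)^{-a_k}\right] \left[\prod_{k=1}^{p}\left(1 + \frac{w_0\gamma_k}{b_k}\right)^{-a_k}\right] w_0^{-1-\sigma} e^{-w_0\tau}\, dw_0.$$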
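The one-dimensional integral for the model defined by (18) and (19) can be approximated with a simple quadrature rule. The Python sketch below is an illustration under assumptions: it uses a trapezoidal rule on a log-spaced grid (the change of variable $w_0 = e^u$ handles the singularity of $w_0^{-1-\sigma}$ at zero), the integration bounds and grid size are arbitrary choices, and the function and argument names are hypothetical.

```python
from math import gamma as gamma_fn
import numpy as np

def laplace_exponent_ccrm(t, a, b, gam, sigma, tau, n_grid=4000):
    """Numerically evaluate psi(t_1..t_p) for gamma scores and a GGP base measure."""
    t, a, b, gam = (np.asarray(v, dtype=float) for v in (t, a, b, gam))
    # Log-spaced grid in w0; bounds chosen so the neglected tails are tiny.
    u = np.linspace(np.log(1e-12), np.log(60.0 / tau), n_grid)
    w = np.exp(u)
    w0 = w[:, None]                                    # shape (n_grid, 1)
    # 1 - prod_k (1 + w0 t_k / (b_k + w0 gam_k))^(-a_k), via expm1/log1p for accuracy.
    first = -np.expm1(-np.sum(a * np.log1p(w0 * t / (b + w0 * gam)), axis=1))
    # prod_k (1 + w0 gam_k / b_k)^(-a_k)
    second = np.exp(-np.sum(a * np.log1p(w0 * gam / b), axis=1))
    # w0^(-1-sigma) e^(-tau w0) times the Jacobian w0 of the substitution w0 = exp(u).
    integrand = first * second * w ** (-sigma) * np.exp(-tau * w)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(u))
    return integral / gamma_fn(1.0 - sigma)

psi = laplace_exponent_ccrm(t=[1.0, 2.0], a=[0.2, 0.2], b=[0.5, 0.5],
                            gam=[0.0, 0.0], sigma=0.2, tau=1.0)
print(psi)
```

The cost of one evaluation is linear in $p$ and in the grid size, in line with the remark that only a one-dimensional integral is needed whatever the number of communities.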
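Finally, we need to sample total masses $(w_1, \ldots, w_p)$ from (36), and this can be done by simulating points $(w_{i0}, \beta_{i1}, \ldots, \beta_{ip})_{i=1,2,\ldots}$ from a Poisson process with exponentially tilted Lévy intensity
$$\alpha\, e^{-w_0 \sum_{k=1}^{p} (\gamma_k + \lambda_k)\beta_k} f(\beta_1, \ldots, \beta_p)\, \rho_0(w_0) \tag{41}$$
and summing up the weights $w_k = \sum_{i=1,2,\ldots} w_{i0}\beta_{ik}$ for $k = 1, \ldots, p$. For infinite-activity CRMs, this is not feasible, and we suggest resorting to the approximation of Cohen and Rosinski (2007). More precisely, we write
$$(w_1, \ldots, w_p) = X_\varepsilon + X^\varepsilon$$
where the random vectors $X_\varepsilon \in \mathbb{R}_+^p$ and $X^\varepsilon \in \mathbb{R}_+^p$ are defined as $X_\varepsilon = \sum_{i \mid w_{i0} \le \varepsilon} w_{i0}(\beta_{i1}, \ldots, \beta_{ip})$ and $X^\varepsilon = \sum_{i \mid w_{i0} > \varepsilon} w_{i0}(\beta_{i1}, \ldots, \beta_{ip})$. We can sample a realization of the random vector $X^\varepsilon$ exactly by simulating the points of a Poisson process with mean intensity
$$\alpha\, e^{-w_0 \sum_{k=1}^{p} (\gamma_k + \lambda_k)\beta_k} f(\beta_1, \ldots, \beta_p)\, \rho_0(w_0)\, \mathbb{1}_{w_0 > \varepsilon} \tag{42}$$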
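See Section 3.3 and Appendix C for details. The positive random vector $X_\varepsilon$ is approximated by a truncated Gaussian random vector with mean $\mu_\varepsilon$ and variance $\Sigma_\varepsilon$ such that
$$\mu_\varepsilon = \alpha \int_{\mathbb{R}_+^p} w_{1:p}\, \rho_\varepsilon(dw_1, \ldots, dw_p)$$
$$\Sigma_\varepsilon = \alpha \int_{\mathbb{R}_+^p} w_{1:p} w_{1:p}^T\, \rho_\varepsilon(dw_1, \ldots, dw_p)$$
where
$$\rho_\varepsilon(dw_1, \ldots, dw_p) = e^{-\sum_{k=1}^{p} (\gamma_k + \lambda_k) w_k} \int_0^\varepsilon w_0^{-p}\, F\left(\frac{dw_1}{w_0}, \ldots, \frac{dw_p}{w_0}\right) \rho_0(dw_0).$$
Note that $\mu_\varepsilon$ and $\Sigma_\varepsilon$ can both be expressed as one-dimensional integrals using the gradient and Hessian of the moment generating function $M$ of $F$. Theorem 6 in Appendix E, which is an adaptation of the results of Cohen and Rosinski (2007) to CCRM, gives the conditions on the parameters of CCRM under which
$$\Sigma_\varepsilon^{-1/2}(X_\varepsilon - \mu_\varepsilon) \xrightarrow{d} \mathcal{N}(0, I_p) \quad \text{as } \varepsilon \to 0$$
and thus the approximation is asymptotically valid. The Gaussian approximation is in particular asymptotically valid for the CCRM defined by (18) and (19) when $\sigma \in (0, 1)$, hence is valid for all infinite-activity cases except $\sigma = 0$.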
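One simple way to draw the Gaussian approximation of the small-jump part, restricted to positive values, is rejection sampling. The Python sketch below is an assumption about how this step could be done, not a description of the authors' code: the mean vector and covariance matrix are taken as given inputs, and the function name and the `max_tries` safeguard are hypothetical.

```python
import numpy as np

def sample_positive_gaussian(mu, cov, rng, max_tries=10_000):
    """Draw from N(mu, cov) restricted to the positive orthant by rejection."""
    mu = np.asarray(mu, dtype=float)
    for _ in range(max_tries):
        x = rng.multivariate_normal(mu, cov)
        if np.all(x > 0.0):          # keep only draws with all components positive
            return x
    raise RuntimeError("no positive draw: too little Gaussian mass on the positive orthant")

rng = np.random.default_rng(0)
x_small = sample_positive_gaussian([1.0, 2.0], 0.01 * np.eye(2), rng)
```

Rejection is cheap when the mean is several standard deviations away from zero; otherwise a dedicated truncated-normal sampler would be preferable.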
Note that due to the Gaussian approximation in the proposal distribution for the total masses $(w_1, \ldots, w_p)$, Algorithm 1 does not actually admit the posterior distribution (38) as invariant distribution, and is an approximation of an exact MCMC algorithm targeting this distribution. We observe in the experimental section that this approximation provides very reasonable results for the examples considered.
5. Experiments
5.1. Simulated data
We first study the convergence of the MCMC algorithm on synthetic data simulated from the CCRM based graph model described in Section 2, where $F$ and $\rho_0$ take the form (18) and (19). We generate an undirected graph with $p = 2$ communities and parameters $\alpha = 200$, $\sigma = 0.2$, $\tau = 1$, $b_k = b = \frac{1}{p}$, $a_k = a = 0.2$ and $\gamma_k = \gamma = 0$. The sampled graph has 1121 nodes and 12,180 edges. For the inference, we consider that $b$ and $\gamma$ are known and we assume a vague prior Gamma(0.01, 0.01) on the unknown parameters $\alpha$ and $\phi = (1 - \sigma, \tau, a)$. We run 3 parallel MCMC chains with different initial values. Each chain starts with 10,000 iterations using our model with only one community where the scores $\beta$ are fixed to 1, which is equivalent to the model of Caron and Fox (2014). We then run 200,000 iterations using our model with $p$ communities. We use $\varepsilon = 10^{-3}$ as a truncation level for simulating $w_{1:p}$ and $L = 10$ leapfrog steps for the HMC. The stepsizes of both the HMC and the random walk MH on $(\log(1 - \sigma), \log \tau, \log a)$ are adapted during the first 50,000 iterations so as to target acceptance ratios of 0.65 and 0.23 respectively. The computations take around 5h20 using Matlab on a standard desktop computer. Trace plots of the parameters $\log \alpha$, $\sigma$, $\tau$, $a$ and $\bar{w} = \frac{1}{p}\sum_{k=1}^{p} w_k$ and histograms based on the last 50,000 iterations are given in Figure 6. Posterior samples clearly converge around the true parameter values. Choosing a threshold value of $10^{-3}$ does not lead to any noticeable change in the MCMC histograms, suggesting that the target distribution of our approximate MCMC is very close to the posterior distribution of interest.
[Figure 6: five panels, (a) $\log \alpha$, (b) $\sigma$, (c) $\tau$, (d) $a$ and (e) $\bar{w}$. Left: trace plots against MCMC iterations ($\times 10^5$) for Chains 1 to 3, with the true value marked. Right: histograms (number of MCMC samples), with the true value marked.]
Fig 6. MCMC trace plots (left) and histograms (right) of parameters (a) $\log \alpha$, (b) $\sigma$, (c) $\tau$, (d) $a$ and (e) $\bar{w} = \frac{1}{p}\sum_{k=1}^{p} w_k$ for a graph generated with parameters $p = 2$, $\alpha = 200$, $\sigma = 0.2$, $\tau = 1$, $b = \frac{1}{p}$, $a = 0.2$ and $\gamma = 0$.
Our model is able to accurately recover the mean parameters of both low and high degree nodes and
to provide reasonable credible intervals, as shown in Figure 7(a-b). By generating 5000 graphs from the posterior predictive we assess that our model fits the empirical power-law degree distribution of the sparse generated graph, as shown in Figure 7(c). We demonstrate the interest of our nonparametric approach by comparing these results to the ones obtained with the parametric version of our model. To achieve this, we fix $w_k = 0$ and force the model to lie in the finite-activity domain by assuming $\sigma \in (-\infty, 0)$ and using the prior distribution $-\sigma \sim \mathrm{Gamma}(0.01, 0.01)$. Note that in this case, the model is equivalent to that of Zhou (2015). As shown in Figure 8(a-b), the parametric model is able to recover the mean parameters of nodes with high degrees, and credible intervals are similar to those obtained with the full model; however, it fails to provide reasonable credible intervals for nodes with low degree. In addition, as shown in Figure 8(c), the posterior predictive degree distribution does not fit the data, illustrating the inability of this parametric model to capture power-law behaviour.
[Figure 7: three panels. (a) 50 nodes with highest degrees and (b) 50 nodes with lowest degrees: mean sociability parameters against index of node (sorted by decreasing degree), with 95% credible intervals and true values. (c) Degree distribution on log-log axes (Distribution against Degree), with data and 95% posterior predictive interval.]
Fig 7. 95% posterior credible intervals and true values of (a) the mean parameters $\bar{w}_i = \frac{1}{p}\sum_{k=1}^{p} w_{ik}$ of the 50 nodes with highest degrees and (b) the log mean parameters $\log \bar{w}_i$ of the 50 nodes with lowest degrees. (c) Empirical degree distribution and 95% posterior predictive credible interval. Results obtained for a graph generated with parameters $p = 2$, $\alpha = 200$, $\sigma = 0.2$, $\tau = 1$, $b = \frac{1}{p}$, $a = 0.2$ and $\gamma = 0$.
[Figure 8: same three panels as Figure 7. (a) 50 nodes with highest degrees, (b) 50 nodes with lowest degrees, (c) degree distribution.]
Fig 8. See Figure 7. Results obtained on the same generated graph by inferring a finite-activity model with $w_k = 0$ and $\sigma < 0$.
5.2. Real-world graphs
We now apply our methods to learn the latent communities of two real-world undirected simple graphs. The first network to be considered, the polblogs network (Adamic and Glance, 2005), is the network of the American political blogosphere in February 2005³. Two blogs are considered as connected if there is at least one hyperlink from one blog to the other. Additional information on the political leaning of each blog (left/right) is also available. The second network, named USairport, is the network of connections between US airports in 2010⁴.
³ http://www.cise.ufl.edu/research/sparse/matrices/Newman/polblogs
Table 1
Size of the networks, number of communities and computational time.
Name        Nb nodes   Nb edges   Nb communities p   Time
polblogs    1224       33,430     2                  50m
USairport   1867       44,304     4                  5h20m
The sizes of the different networks are given in Table 1. We take $\gamma_k = 0$ as known and we assume a vague prior Gamma(0.01, 0.01) on the unknown parameters $\alpha$, $1 - \sigma$, $\tau$, $a_k$ and $b_k$. We take $p = 2$ communities for polblogs and $p = 4$ communities for USairport. We run 3 parallel MCMC chains, each with 10,000 + 200,000 iterations, using the same procedure as used for the simulated data; see Section 5.1. Computation times are reported in Table 1. The simulation of $w_{1:p}$ requires more computational time when $\sigma \ge 0$ (infinite-activity case). This explains the larger computation times for USairport compared to polblogs.
[Figure 9: two log-log panels, (a) polblogs and (b) USairport, plotting Distribution against Degree; each shows the data and the 95% posterior predictive interval.]
Fig 9. Empirical degree distribution (red) and posterior predictive (blue) of the (a) polblogs and (b) USairport networks.
We interpret the communities based on the minimum Bayes risk point estimate where the cost function is a permutation-invariant absolute loss on the weights $w = (w_{ik})_{i=1,\ldots,N_\alpha;\,k=1,\ldots,p}$. Let $S_p$ be the set of permutations of $\{1, \ldots, p\}$ and consider the cost function
$$C(w, w^\star) = \min_{\pi \in S_p} \left[\sum_{k=1}^{p} \sum_{i=1}^{N_\alpha} \left|w_{i\pi(k)} - w^\star_{ik}\right| + \sum_{k=1}^{p} \left|w_{\pi(k)} - w^\star_{k}\right|\right]$$
whose evaluation requires solving a combinatorial optimization problem in $O(p^3)$ using the Hungarian method. We therefore want to solve
$$\widehat{w} = \arg\min_{w^\star} E[C(w, w^\star) \mid Z]$$
where $E[C(w, w^\star) \mid Z] \simeq \frac{1}{N}\sum_{t=1}^{N} C(w^{(t)}, w^\star)$ and $(w^{(t)})_{t=1,\ldots,N}$ are from the MCMC output. For simplicity, we limit the search of $\widehat{w}$ to the set of MCMC samples, giving
$$\widehat{w} = \arg\min_{w^\star \in \{w^{(1)}, \ldots, w^{(N)}\}} \frac{1}{N}\sum_{t=1}^{N} C(w^{(t)}, w^\star).$$
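The restricted search over MCMC samples can be sketched as follows in Python. This is an illustrative sketch under assumptions: it enumerates all permutations by brute force instead of using the Hungarian method (adequate for the small values of $p$ used here, but factorial in $p$), it stores each MCMC sample as a pair of arrays (node weights of shape $N_\alpha \times p$, total masses of length $p$), and all names are hypothetical.

```python
from itertools import permutations
import numpy as np

def perm_invariant_cost(w, w_tot, w_star, w_tot_star):
    """Absolute loss between (w, w_tot) and (w_star, w_tot_star), minimised over relabellings."""
    p = w.shape[1]
    best = np.inf
    for perm in permutations(range(p)):
        idx = list(perm)
        cost = (np.abs(w[:, idx] - w_star).sum()
                + np.abs(w_tot[idx] - w_tot_star).sum())
        best = min(best, cost)
    return best

def min_bayes_risk_index(samples):
    """Index of the MCMC sample minimising the Monte Carlo estimate of the expected cost."""
    risks = []
    for w_star, w_tot_star in samples:                 # candidate point estimate
        costs = [perm_invariant_cost(w, w_tot, w_star, w_tot_star)
                 for w, w_tot in samples]              # average over the MCMC output
        risks.append(np.mean(costs))
    return int(np.argmin(risks))

# Toy usage: the second sample is a relabelled copy of the first.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
t = np.array([0.5, 0.3])
samples = [(A, t), (A[:, ::-1], t[::-1]), (A + 1.0, t)]
best_index = min_bayes_risk_index(samples)
```

The search costs $O(N^2)$ cost evaluations for $N$ retained samples, so in practice one would typically thin the MCMC output first.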
⁴ http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=292

Table 2
Nodes with highest weight in each community for the polblogs network. Blog URLs are followed by known political leaning: (L) for left-wing and (R) for right-wing.

Community 1: “Liberal”           Community 2: “Conservative”
dailykos.com (L)                 instapundit.com (R)
atrios.blogspot.com (L)          blogsforbush.com (R)
talkingpointsmemo.com (L)        powerlineblog.com (R)
washingtonmonthly.com (L)        drudgereport.com (R)
liberaloasis.com (L)             littlegreenfootballs.com/weblog (R)
talkleft.com (L)                 michellemalkin.com (R)
digbysblog.blogspot.com (L)      lashawnbarber.com (R)
newleftblogs.blogspot.com (L)    wizbangblog.com (R)
politicalstrategy.org (L)        hughhewitt.com (R)
juancole.com (L)                 truthlaidbear.com (R)

Table 3
Nodes with highest weights in each community for the USairport network.

Community 1: “Hub”   Community 2: “East”   Community 3: “West”   Community 4: “Alaska”
Miami, FL            Cleveland, OH         Denver, CO            Anchorage, AK
New York, NY         Detroit, MI           Las Vegas, NV         Fairbanks, AK
Newark, NJ           Nashville, TN         Los Angeles, CA       Bethel, AK
Los Angeles, CA      Chicago, IL           Salt Lake City, UT    St. Mary’s, AK
Atlanta, GA          Knoxville, TN         Seattle, WA           King Salmon, AK
Washington, DC       Atlanta, GA           Burbank, CA           McGrath, AK
Chicago, IL          Louisville, KY        Phoenix, AZ           Unalakleet, AK
Boston, MA           Indianapolis, IN      Oakland, CA           Galena, AK
Houston, TX          Memphis, TN           Portland, OR          Aniak, AK
Orlando, FL          Charlotte, NC         Albuquerque, NM       Kotzebue, AK

Table 2 reports the nodes with highest weights in each community for the polblogs network. Figure 10 also shows the weight associated to each of the two communities alongside the true left/right class for each blog. The two learned communities, which can be interpreted as “Liberal” and “Conservative”, clearly recover the political leaning of the blogs. Figure 12 shows the adjacency matrices obtained by reordering the nodes by community membership, where each node is assigned to the community whose weight is maximum, clearly showing the block-structure of this network. The obtained clustering yields a 93.95% accuracy when compared to the ground truth classification. Figure 11(a) shows the relative community proportions for a subset of the blogs. dailykos.com and washingtonmonthly.com are clearly described as liberal while blogsforbush.com, instapundit.com and drudgereport.com are clearly conservative. Other more moderate blogs such as danieldrezner.com/blog and andrewsullivan.com have more balanced values in both communities. Figure 9(a) shows that the posterior predictive degree distribution provides a good fit to the data.
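The hard assignment used for the reordered adjacency matrices and for the accuracy figure amounts to an argmax over the estimated weights. The Python sketch below illustrates it on toy data; it assumes that accuracy is computed after matching learned community indices to ground-truth labels by the best relabelling, which is a choice made for this sketch rather than a procedure stated in the text, and the function names are hypothetical.

```python
from itertools import permutations
import numpy as np

def assign_communities(w):
    """Assign each node (row of w) to the community with maximum weight."""
    return np.argmax(w, axis=1)

def clustering_accuracy(assigned, truth, n_communities):
    """Fraction of nodes correctly classified under the best relabelling of communities."""
    assigned = np.asarray(assigned)
    truth = np.asarray(truth)
    best = 0.0
    for perm in permutations(range(n_communities)):
        relabelled = np.asarray(perm)[assigned]        # map learned index -> label
        best = max(best, float(np.mean(relabelled == truth)))
    return best

# Toy usage with 4 nodes and 2 communities (not the polblogs data).
w_hat = np.array([[2.0, 0.1], [0.2, 1.0], [3.0, 0.5], [0.1, 0.4]])
labels = assign_communities(w_hat)
accuracy = clustering_accuracy(labels, [1, 0, 1, 1], n_communities=2)
```

Nodes with balanced weights, such as the moderate blogs mentioned above, are the ones for which this hard assignment discards the most information.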
For USairport, the four learned communities can also be easily interpreted, as seen in Table 3. The
first community, labeled “Hub”, represents highly connected airports with no preferred location, while the
three others, labeled “East”, “West” and “Alaska”, are communities based on the location of the airport.
In Figure 11(b), we can see that some airports have a strong level of affiliation in a single community: New
York and Miami for “Hub”, Lansing for “East”, Seattle for “West” and Bethel and Anchorage for “Alaska”.
Other airports have significant weights in different communities: Raleigh/Durham and Los Angeles are hubs
with strong regional connections, Nashville and Minneapolis share a significant number of connections with
both East and West of the USA. Anchorage has a significant “Hub” weight, while most airports in Alaska are
disconnected from the rest of the world as can be seen in Figure 12(b). “Alaska” appears as a separate block
while substantial overlaps are observed between the “Hub”, “East” and “West” communities. Figure 9(b)
shows that the posterior predictive degree distribution also provides a good fit to the data.
Acknowledgments
The authors thank George Deligiannidis for pointing out the article of Asmussen and Rosiński (2001). FC
acknowledges the support of the European Commission under the Marie Curie Intra-European Fellowship
Programme. Part of this work has been supported by the BNPSI ANR project no ANR-13-BS-03-0006-01.
[Figure residue: plot of the weight w1 (axis range 0 to 4) with one tick label per blog URL of the polblogs network.]