On Modularity Clustering

Ulrik Brandes1, Daniel Delling2, Marco Gaertler2, Robert Görke2,

Martin Hoefer1, Zoran Nikoloski3, Dorothea Wagner2

Abstract—Modularity is a recently introduced quality measure

for graph clusterings. It has immediately received considerable

attention in several disciplines, and in particular in the complex

systems literature, although its properties are not well under-

stood. We study the problem of finding clusterings with maximum

modularity, thus providing theoretical foundations for past and

present work based on this measure. More precisely, we prove

the conjectured hardness of maximizing modularity both in the

general case and with the restriction to cuts, and give an Integer

Linear Programming formulation. This is complemented by first

insights into the behavior and performance of the commonly

applied greedy agglomerative approach.

Index Terms—Graph Clustering, Graph Partitioning, Modu-

larity, Community Structure, Greedy Algorithm

I. INTRODUCTION

Graph clustering is a fundamental problem in the analysis of

relational data. Studied for decades and applied to many settings,

it is now popularly referred to as the problem of partitioning

networks into communities. In this line of research, a novel graph

clustering index called modularity has been proposed recently [1].

The rapidly growing interest in this measure prompted a series

of follow-up studies on various applications and possible adjust-

ments (see, e.g., [2], [3], [4], [5], [6]). Moreover, an array of

heuristic algorithms has been proposed to optimize modularity.

These are based on a greedy agglomeration [7], [8], on spectral

division [9], [10], simulated annealing [11], [12], or extremal

optimization [13] to name but a few prominent examples. While

these studies often provide plausibility arguments in favor of the

resulting partitions, we know of only one attempt to characterize

properties of clusterings with maximum modularity [2]. In partic-

ular, none of the proposed algorithms has been shown to produce

optimal partitions with respect to modularity.

In this paper we study the problem of finding clusterings with

maximum modularity, thus providing theoretical foundations for

past and present work based on this measure. More precisely,

we prove the conjectured hardness of maximizing modularity

both in the general case and the restriction to cuts, and give an

integer linear programming formulation to facilitate optimization

without enumeration of all clusterings. Since the most commonly

employed heuristic to optimize modularity is based on greedy

agglomeration, we investigate its worst-case behavior. In fact, we

give a graph family for which the greedy approach yields an approximation factor no better than two. In addition, our examples indicate that the quality of greedy clusterings may heavily depend on the tie-breaking strategy utilized. In fact, in the worst case, no approximation factor can be provided. These performance studies are concluded by partitioning some previously considered networks optimally, which does yield further insight.

This work was partially supported by the DFG under grants BR 2158/2-3, WA 654/14-3, Research Training Group 1042 "Explorative Analysis and Visualization of Large Information Spaces" and by EU under grant DELIS (contract no. 001907).

1 Department of Computer & Information Science, University of Konstanz, {brandes,hoefer}@inf.uni-konstanz.de

2 Faculty of Informatics, Universität Karlsruhe (TH), {delling,gaertler,rgoerke,wagner}@ira.uka.de

3 Max-Planck Institute for Molecular Plant Physiology, Bioinformatics Group, nikoloski@mpimp-golm.mpg.de

This paper is organized as follows. Section II briefly introduces preliminaries, formulations of modularity, and an ILP formulation of the problem. Basic and counterintuitive properties of modularity are observed in Section III. Our NP-completeness proofs are given in Section IV, followed by an analysis of the greedy approach in Section V. The theoretical investigation is extended by characterizations of the optimum clusterings for cliques and cycles in Section VI. Our work is concluded by revisiting examples from previous work in Section VII and a brief discussion in Section VIII.

II. PRELIMINARIES

Throughout this paper, we will use the notation of [14]. More precisely, we assume that $G = (V,E)$ is an undirected connected graph with $n := |V|$ vertices and $m := |E|$ edges. Denote by $\mathcal{C} = \{C_1,\dots,C_k\}$ a partition of $V$. We call $\mathcal{C}$ a clustering of $G$ and the $C_i$, which are required to be non-empty, clusters; $\mathcal{C}$ is called trivial if either $k = 1$ or $k = n$. We denote the set of all possible clusterings of a graph $G$ with $\mathcal{A}(G)$. In the following, we often identify a cluster $C_i$ with the induced subgraph of $G$, i.e., the graph $G[C_i] := (C_i, E(C_i))$, where $E(C_i) := \{\{v,w\} \in E : v,w \in C_i\}$. Then $E(\mathcal{C}) := \bigcup_{i=1}^{k} E(C_i)$ is the set of intra-cluster edges and $E \setminus E(\mathcal{C})$ the set of inter-cluster edges. The number of intra-cluster edges is denoted by $m(\mathcal{C})$ and the number of inter-cluster edges by $\overline{m}(\mathcal{C})$. The set of edges that have one end-node in $C_i$ and the other end-node in $C_j$ is denoted by $E(C_i,C_j)$.

A. Definition of Modularity

Modularity is a quality index for clusterings. Given a simple graph $G = (V,E)$, we follow [1] and define the modularity $q(\mathcal{C})$ of a clustering $\mathcal{C}$ as

\[
q(\mathcal{C}) := \sum_{C \in \mathcal{C}} \left[ \frac{|E(C)|}{m} - \left( \frac{|E(C)| + \sum_{C' \in \mathcal{C}} |E(C,C')|}{2m} \right)^{2} \right]. \tag{1}
\]

Note that $C'$ ranges over all clusters, so that edges in $E(C)$ are counted twice in the squared expression. This is to adjust proportions, since edges in $E(C,C')$, $C \neq C'$, are counted twice as well, once for each ordering of the arguments. Note that we can rewrite Equation (1) into the more convenient form

\[
q(\mathcal{C}) = \sum_{C \in \mathcal{C}} \left[ \frac{|E(C)|}{m} - \left( \frac{\sum_{v \in C} \deg(v)}{2m} \right)^{2} \right]. \tag{2}
\]

This reveals an inherent trade-off: To maximize the first term,

many edges should be contained in clusters, whereas the mini-

mization of the second term is achieved by splitting the graph

into many clusters with small total degrees each. Note that the

first term |E(C)|/m is also known as coverage [14].
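The trade-off between coverage and squared degree sums can be made concrete with a small sketch that evaluates Equation (2) directly. The function name `modularity`, the edge-list representation, and the example graph (two triangles joined by one edge) are assumptions made for illustration only; exact rational arithmetic is used so that values can be compared without rounding.

```python
from fractions import Fraction


def modularity(edges, clustering):
    """Evaluate Equation (2) for a simple undirected graph.

    edges      -- list of (u, v) pairs, one entry per undirected edge
    clustering -- iterable of node collections forming a partition
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    q = Fraction(0)
    for cluster in clustering:
        cluster = set(cluster)
        # First term: fraction of edges inside the cluster (coverage share).
        intra = sum(1 for u, v in edges if u in cluster and v in cluster)
        # Second term: squared share of the total degree held by the cluster.
        deg_sum = sum(degree.get(v, 0) for v in cluster)
        q += Fraction(intra, m) - Fraction(deg_sum, 2 * m) ** 2
    return q


# Illustrative graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
one_cluster = modularity(edges, [{0, 1, 2, 3, 4, 5}])
print(two_triangles, one_cluster)
```

In this example the single cluster has full coverage but also the full degree sum, so both terms cancel, whereas splitting along the bridge gives up one edge of coverage and halves the degree share of each cluster.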

B. Maximizing Modularity via Integer Linear Programming

The problem of maximizing modularity can be cast into a very simple and intuitive integer linear program (ILP). Given a graph $G = (V,E)$ with $n := |V|$ nodes, we define $n^2$ decision variables $X_{uv} \in \{0,1\}$, one for every pair of nodes $u,v \in V$. The key idea is that these variables can be interpreted as an equivalence relation (over $V$) and thus form a clustering. In order to ensure consistency, we need the following constraints, which guarantee

reflexivity: $\forall u:\; X_{uu} = 1$,

symmetry: $\forall u,v:\; X_{uv} = X_{vu}$, and

transitivity: $\forall u,v,w:$
\[
\begin{aligned}
X_{uv} + X_{vw} - 2 \cdot X_{uw} &\le 1 \\
X_{uw} + X_{uv} - 2 \cdot X_{vw} &\le 1 \\
X_{vw} + X_{uw} - 2 \cdot X_{uv} &\le 1 .
\end{aligned}
\]

The objective function of modularity then becomes

\[
\frac{1}{2m} \sum_{(u,v) \in V^2} \left( E_{uv} - \frac{\deg(u)\deg(v)}{2m} \right) X_{uv} ,
\quad \text{with} \quad
E_{uv} =
\begin{cases}
1, & \text{if } (u,v) \in E \\
0, & \text{otherwise.}
\end{cases}
\]

Note that this ILP can be simplified by pruning redundant variables and constraints, leaving only $\binom{n}{2}$ variables and $\binom{n}{3}$ constraints.
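The correspondence between clusterings and feasible ILP solutions can be checked on a small instance without any solver. The following sketch, under the assumption of an illustrative graph and hypothetical helper names (`relation_from_clustering`, `is_equivalence`, `ilp_objective`), builds the 0/1 variables from a clustering, tests the three constraint families, and evaluates the objective function.

```python
from fractions import Fraction
from itertools import product


def relation_from_clustering(nodes, clustering):
    """X[u, v] = 1 exactly when u and v share a cluster."""
    label = {v: i for i, cluster in enumerate(clustering) for v in cluster}
    return {(u, v): int(label[u] == label[v])
            for u, v in product(nodes, repeat=2)}


def is_equivalence(nodes, X):
    """Check reflexivity, symmetry and the transitivity inequalities."""
    reflexive = all(X[u, u] == 1 for u in nodes)
    symmetric = all(X[u, v] == X[v, u] for u, v in product(nodes, repeat=2))
    # Ranging over all ordered triples covers the three inequalities.
    transitive = all(X[u, v] + X[v, w] - 2 * X[u, w] <= 1
                     for u, v, w in product(nodes, repeat=3))
    return reflexive and symmetric and transitive


def ilp_objective(nodes, edges, X):
    """Objective value of the ILP for a given 0/1 assignment X."""
    m = len(edges)
    adjacent = {frozenset(e) for e in edges}
    deg = {v: sum(1 for e in edges if v in e) for v in nodes}
    total = Fraction(0)
    for u, v in product(nodes, repeat=2):
        e_uv = 1 if frozenset((u, v)) in adjacent else 0
        total += (e_uv - Fraction(deg[u] * deg[v], 2 * m)) * X[u, v]
    return total / (2 * m)


# Illustrative graph: two triangles joined by the single edge (2, 3).
nodes = list(range(6))
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
X = relation_from_clustering(nodes, [{0, 1, 2}, {3, 4, 5}])
objective = ilp_objective(nodes, edges, X)
print(is_equivalence(nodes, X), objective)
```

The sum over ordered pairs counts every intra-cluster edge twice and every product deg(u)deg(v) within a cluster once per ordering, which is why the prefactor 1/(2m) reproduces Equation (2).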

III. FUNDAMENTAL OBSERVATIONS

In the following, we identify basic structural properties that

clusterings with maximum modularity fulfill. We first focus on

the range of modularity, for which Lemma 3.1 gives the lower

and upper bound.

Lemma 3.1: Let G be an undirected and unweighted graph

and C ∈ A(G). Then −1/2 ≤ q(C) ≤ 1 holds.

Proof: Let $m_i = |E(C)|$ be the number of edges inside cluster $C$ and $m_e = \sum_{C \neq C' \in \mathcal{C}} |E(C,C')|$ be the number of edges having exactly one end-node in $C$. Then the contribution of $C$ to $q(\mathcal{C})$ is:

\[
\frac{m_i}{m} - \left( \frac{m_i}{m} + \frac{m_e}{2m} \right)^{2} .
\]

This expression is strictly decreasing in $m_e$ and, when varying $m_i$, the only maximum point is at $m_i = (m - m_e)/2$. The contribution of a cluster is minimized when $m_i$ is zero and $m_e$ is as large as possible. Suppose now $m_i = 0$. Using the inequality $(a + b)^2 \ge a^2 + b^2$ for all non-negative numbers $a$ and $b$, modularity has a minimum score for two clusters where all edges are inter-cluster edges. The upper bound is obvious from our reformulation in Equation (2), and has been observed previously [2], [3], [15]. It can only be actually attained in the specific case of a graph with no edges, where coverage is defined to be 1.

As a result, any bipartite graph $K_{a,b}$ with the canonical clustering $\mathcal{C} = \{C_a, C_b\}$ yields the minimum modularity of $-1/2$. The following four results characterize the structure of a clustering with maximum modularity.

Corollary 3.2: Isolated nodes have no impact on modularity.

Corollary 3.2 directly follows from the fact that modularity

depends on edges and degrees, thus, an isolated node does not

contribute, regardless of its association to a cluster. Therefore, we

exclude isolated nodes from further consideration in this work,

i.e., all nodes are assumed to be of degree greater than zero.

Lemma 3.3: A clustering with maximum modularity has no

cluster that consists of a single node with degree 1.

Proof: Suppose for contradiction that there is a clustering $\mathcal{C}$ with a cluster $C_v = \{v\}$ and $\deg(v) = 1$. Consider a cluster $C_u$ that contains the neighbor node $u$. Suppose there are a number of $m_i$ intra-cluster edges in $C_u$ and $m_e$ inter-cluster edges connecting $C_u$ to other clusters. Together these clusters add

\[
\frac{m_i}{m} - \frac{(2m_i + m_e)^2 + 1}{4m^2}
\]

to $q(\mathcal{C})$. Merging $C_v$ with $C_u$ results in a new contribution of

\[
\frac{m_i + 1}{m} - \frac{(2m_i + m_e + 1)^2}{4m^2} .
\]

The merge yields an increase of

\[
\frac{1}{m} - \frac{2m_i + m_e}{2m^2} > 0
\]

in modularity, because $m_i + m_e \le m$ and $m_e \ge 1$. This proves the lemma.

Lemma 3.4: There is always a clustering with maximum mod-

ularity, in which each cluster consists of a connected subgraph.

Proof: Consider for contradiction a clustering $\mathcal{C}$ with a cluster $C$ of $m_i$ intra- and $m_e$ inter-cluster edges that consists of a set of more than one connected subgraph. The subgraphs in $C$ do not have to be disconnected in $G$, they are only disconnected when we consider the edges $E(C)$. Cluster $C$ adds

\[
\frac{m_i}{m} - \frac{(2m_i + m_e)^2}{4m^2}
\]

to $q(\mathcal{C})$. Now suppose we create a new clustering $\mathcal{C}'$ by splitting $C$ into two new clusters. Let one cluster $C_v$ consist of the component including node $v$, i.e. all nodes which can be reached from node $v$ with a path running only through nodes of $C$, i.e. $C_v = \bigcup_{i=1}^{\infty} C_v^{i}$, where $C_v^{i} = \{ w \mid \exists (w,w_i) \in E(C) \text{ with } w_i \in C_v^{i-1} \}$ and $C_v^{0} = \{v\}$. The other nonempty cluster is given by $C - C_v$. Let $C_v$ have $m_i^v$ intra- and $m_e^v$ inter-cluster edges. Together the new clusters add

\[
\frac{m_i}{m} - \frac{(2m_i^v + m_e^v)^2 + \bigl(2(m_i - m_i^v) + m_e - m_e^v\bigr)^2}{4m^2}
\]

to $q(\mathcal{C}')$. For $a,b \ge 0$ obviously $a^2 + b^2 \le (a + b)^2$, and hence $q(\mathcal{C}') \ge q(\mathcal{C})$.

Corollary 3.5: A clustering of maximum modularity does not include disconnected clusters.

Corollary 3.5 directly follows from Lemma 3.4 and from the exclusion of isolated nodes. Thus, the search for an optimum can be restricted to clusterings, in which clusters are connected subgraphs and there are no clusters consisting of nodes with degree 1.

A. Counterintuitive Behavior

In the last section, we listed some intuitive properties like connectivity within clusters for clusterings of maximum modularity. However, due to the enforced balance between coverage and the sums of squared cluster degrees, counter-intuitive situations arise. These are non-locality, scaling behavior, and sensitivity to satellites.

Fig. 1. (a,b) Non-local behavior; (c) a clique K3 with leaves; (d) scaling behavior. Clusters are represented by colours.

a) Non-Locality: At first view, modularity seems to be a local quality measure. Recalling Equation (2), each cluster contributes separately. However, the example presented in Figures 1(a) and 1(b) exhibits a typical non-local behavior. In these figures, clusters are represented by color. By adding an additional node connected to the leftmost node, the optimal clustering is altered completely. According to Lemma 3.3 the additional node has to be clustered together with the leftmost node. This leads to a shift of the rightmost black node from the black cluster to the white cluster, although locally its neighborhood structure has not changed.

b) Sensitivity to Satellites: A clique with leaves is a graph

of 2n nodes that consists of a clique Kn and n leaf nodes of

degree one, such that each node of the clique is connected to

exactly one leaf node. For a clique we show in Section VI that

the trivial clustering with k = 1 has maximum modularity. For

a clique with leaves, however, the optimal clustering changes to

k = n clusters, in which each cluster consists of a connected pair

of leaf and clique nodes. Figure 1(c) shows an example.
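For the smallest instance shown in Figure 1(c), the claim about cliques with leaves can be checked by exhaustive enumeration. The sketch below is an illustration only: the helper names `modularity`, `partitions` and `clique_with_leaves` and the node numbering (clique nodes 0..n-1, leaf of node i is n+i) are assumptions, and the brute force over all set partitions is feasible only for very small graphs.

```python
from fractions import Fraction


def modularity(edges, clustering):
    """Equation (2): coverage minus squared degree shares, per cluster."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = Fraction(0)
    for cluster in clustering:
        cluster = set(cluster)
        intra = sum(1 for u, v in edges if u in cluster and v in cluster)
        deg_sum = sum(degree[v] for v in cluster)
        q += Fraction(intra, m) - Fraction(deg_sum, 2 * m) ** 2
    return q


def partitions(items):
    """Yield every set partition of the list `items` as a list of lists."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part


def clique_with_leaves(n):
    """Clique on nodes 0..n-1; node i additionally gets the leaf n+i."""
    clique = [(i, j) for i in range(n) for j in range(i + 1, n)]
    leaves = [(i, n + i) for i in range(n)]
    return clique + leaves


edges = clique_with_leaves(3)
best = max(partitions(list(range(6))), key=lambda p: modularity(edges, p))
best_q = modularity(edges, best)
print(sorted(sorted(c) for c in best), best_q)
```

For n = 3 the enumeration covers 203 partitions; the clustering into three clique-leaf pairs is the unique maximizer, while the single cluster containing everything scores zero.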

c) Scaling Behavior: Figures 1(c) and 1(d) display the

scaling behavior of modularity. By simply doubling the graph pre-

sented in Figure 1(c), the optimal clustering is altered completely.

While in Figure 1(c) we obtain three clusters each consisting of

the minor K2, the clustering with maximum modularity of the

graph in Figure 1(d) consists of two clusters, each being a graph

equal to the one in Figure 1(c).

This behavior is in line with the previous observations in [2],

[4] that size and structure of clusters in the optimum clustering

depend on the total number of links in the network. Hence,

clusters that are identified in smaller graphs might be combined

to a larger cluster in an optimum clustering of a larger graph.

The formulation of Equation (2) mathematically explains this

observation as modularity optimization strives to optimize the

trade-off between coverage and degree sums. This provides a

rigorous understanding of the observations made in [2], [4].

IV. NP-COMPLETENESS

It has been conjectured that maximizing modularity is hard [8],

but no formal proof was provided to date. We next show that

the decision version of modularity maximization is indeed NP-complete.

Fig. 2. An example graph G(A) for the instance A = {2,2,2,2,3,3} of 3-PARTITION. Node labels indicate the corresponding numbers $a_i \in A$.

Problem 1 (MODULARITY): Given a graph G and a number

K, is there a clustering C of G, for which q(C) ≥ K?

Note that we may ignore the fact that, in principle, $K$ could be a real number in the range $[-1/2, 1]$, because $4m^2 \cdot q(\mathcal{C})$ is integer for every partition $\mathcal{C}$ of $G$ and polynomially bounded in the size of $G$. Our hardness result for MODULARITY is based on a transformation from the following decision problem.

Problem 2 (3-PARTITION): Given $3k$ positive integer numbers $a_1,\dots,a_{3k}$ such that the sum $\sum_{i=1}^{3k} a_i = kb$ and $b/4 < a_i < b/2$ for an integer $b$ and for all $i = 1,\dots,3k$, is there a partition of these numbers into $k$ sets, such that the numbers in each set sum up to $b$?

We show that an instance $A = \{a_1,\dots,a_{3k}\}$ of 3-PARTITION can be transformed into an instance $(G(A),K(A))$ of MODULARITY, such that $G(A)$ has a clustering with modularity at least $K(A)$, if and only if $a_1,\dots,a_{3k}$ can be partitioned into $k$ sets of sum $b = 1/k \cdot \sum_{i=1}^{3k} a_i$ each.

It is crucial that 3-PARTITION is strongly NP-complete [16], i.e. the problem remains NP-complete even if the input is represented in unary coding. This implies that no algorithm can decide the problem in time polynomial even in the sum of the input values, unless P = NP. More importantly, it implies that our transformation need only be pseudo-polynomial.

The reduction is defined as follows. Given an instance $A$ of 3-PARTITION, construct a graph $G(A)$ with $k$ cliques (completely connected subgraphs) $H_1,\dots,H_k$ of size $a = \sum_{i=1}^{3k} a_i$ each. For each element $a_i \in A$ we introduce a single element node, and connect it to $a_i$ nodes in each of the $k$ cliques in such a way that each clique member is connected to exactly one element node. It is easy to see that each clique node then has degree $a$ and the element node corresponding to element $a_i \in A$ has degree $k a_i$. The number of edges in $G(A)$ is $m = k/2 \cdot a(a + 1)$. See Figure 2 for an example. Note that the size of $G(A)$ is polynomial in the unary coding size of $A$, so that our transformation is indeed pseudo-polynomial.

Before specifying bound $K(A)$ for the instance of MODULARITY, we will show three properties of maximum modularity clusterings of $G(A)$. Together these properties establish the desired characterization of solutions for 3-PARTITION by solutions for MODULARITY.
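The construction of G(A) can be sketched in a few lines. The function name `build_reduction_graph`, the tuple-based node labels, and the rule of handing out clique members to element nodes in index order are illustrative assumptions; any assignment that gives each clique member exactly one element neighbor would do.

```python
def build_reduction_graph(A):
    """Build G(A) for a 3-PARTITION instance A (list of 3k positive integers).

    Returns (edges, cliques, elements): the edge list, one member list per
    clique H_1..H_k, and the list of element nodes.
    """
    assert len(A) % 3 == 0, "a 3-PARTITION instance has 3k numbers"
    k = len(A) // 3
    a = sum(A)

    elements = [("element", i) for i in range(len(A))]
    edges, cliques = [], []
    for h in range(k):
        members = [("clique", h, t) for t in range(a)]
        cliques.append(members)
        # All edges inside the clique H_h of size a.
        edges += [(members[s], members[t])
                  for s in range(a) for t in range(s + 1, a)]
        # Element node i gets a_i neighbors in H_h; since the a_i sum to a,
        # every clique member receives exactly one element neighbor.
        t = 0
        for i, a_i in enumerate(A):
            for _ in range(a_i):
                edges.append((elements[i], members[t]))
                t += 1
    return edges, cliques, elements


# The instance of Figure 2.
A = [2, 2, 2, 2, 3, 3]
edges, cliques, elements = build_reduction_graph(A)

degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

k, a = len(A) // 3, sum(A)
print(len(edges), k * a * (a + 1) // 2)
```

For this instance k = 2 and a = 14, so the graph has 34 nodes, and the edge count and node degrees can be compared against the formulas stated in the text.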

Lemma 4.1: In a maximum modularity clustering of $G(A)$, none of the cliques $H_1,\dots,H_k$ is split.

We prove the lemma by showing that every clustering that violates

the above condition can be modified in order to strictly improve

modularity.

Proof: We consider a clustering $\mathcal{C}$ that splits a clique $H \in \{H_1,\dots,H_k\}$ into different clusters and then show how to obtain a clustering with strictly higher modularity. Suppose that $C_1,\dots,C_r \in \mathcal{C}$, $r > 1$, are the clusters that contain nodes of $H$. For $i = 1,\dots,r$ we denote by $n_i$ the number of nodes of $H$ contained in cluster $C_i$, by $m_i = |E(C_i)|$ the number of edges between nodes in $C_i$, by $f_i$ the number of edges between nodes of $H$ in $C_i$ and element nodes in $C_i$, and by $d_i$ the sum of degrees of all nodes in $C_i$. The contribution of $C_1,\dots,C_r$ to $q(\mathcal{C})$ is

\[
\frac{1}{m} \sum_{i=1}^{r} m_i - \frac{1}{4m^2} \sum_{i=1}^{r} d_i^2 .
\]

Now suppose we create a clustering $\mathcal{C}'$ by rearranging the nodes in $C_1,\dots,C_r$ into clusters $C', C_1',\dots,C_r'$, such that $C'$ contains exactly the nodes of clique $H$, and each $C_i'$, $1 \le i \le r$, the remaining elements of $C_i$ (if any). In this new clustering the number of covered edges reduces by $\sum_{i=1}^{r} f_i$, because all nodes from $H$ are removed from the clusters $C_i'$. This labels the edges connecting the clique nodes to other non-clique nodes of $C_i$ as inter-cluster edges. For $H$ itself there are $\sum_{i=1}^{r} \sum_{j=i+1}^{r} n_i n_j$ edges that are now additionally covered due to the creation of cluster $C'$. In terms of degrees the new cluster $C'$ contains $a$ nodes of degree $a$. The sums for the remaining clusters $C_i'$ are reduced by the degrees of the clique nodes, as these nodes are now in $C'$. So the contribution of these clusters to $q(\mathcal{C}')$ is given by

\[
\frac{1}{m} \sum_{i=1}^{r} \Bigl( m_i + \sum_{j=i+1}^{r} n_i n_j - f_i \Bigr)
- \frac{1}{4m^2} \Bigl( a^4 + \sum_{i=1}^{r} (d_i - n_i a)^2 \Bigr) .
\]

Setting $\Delta := q(\mathcal{C}') - q(\mathcal{C})$, we obtain

\[
\begin{aligned}
\Delta &= \frac{1}{m} \sum_{i=1}^{r} \Bigl( \sum_{j=i+1}^{r} n_i n_j - f_i \Bigr)
 + \frac{1}{4m^2} \Bigl( \sum_{i=1}^{r} \bigl( 2 d_i n_i a - n_i^2 a^2 \bigr) - a^4 \Bigr) \\
&= \frac{1}{4m^2} \Bigl( 4m \sum_{i=1}^{r} \sum_{j=i+1}^{r} n_i n_j - 4m \sum_{i=1}^{r} f_i
 + \sum_{i=1}^{r} n_i \bigl( 2 d_i a - n_i a^2 \bigr) - a^4 \Bigr) .
\end{aligned}
\]

Using the equation that $2 \sum_{i=1}^{r} \sum_{j=i+1}^{r} n_i n_j = \sum_{i=1}^{r} \sum_{j \ne i} n_i n_j$, substituting $m = \frac{k}{2} a(a + 1)$ and rearranging terms we get

\[
\begin{aligned}
\Delta &= \frac{a}{4m^2} \Bigl( -a^3 - 2k(a + 1) \sum_{i=1}^{r} f_i
 + \sum_{i=1}^{r} n_i \Bigl( 2 d_i - n_i a + k(a + 1) \sum_{j \ne i} n_j \Bigr) \Bigr) \\
&\ge \frac{a}{4m^2} \Bigl( -a^3 - 2k(a + 1) \sum_{i=1}^{r} f_i
 + \sum_{i=1}^{r} n_i \Bigl( n_i a + 2k f_i + k(a + 1) \sum_{j \ne i} n_j \Bigr) \Bigr) .
\end{aligned}
\]

For the last inequality we use the fact that $d_i \ge n_i a + k f_i$. This inequality holds because $C_i$ contains at least the $n_i$ nodes of degree $a$ from the clique $H$. In addition, it contains both the clique and element nodes for each edge counted in $f_i$. For each such edge there are $k - 1$ other edges connecting the element node to the $k - 1$ other cliques. Hence, we get a contribution of $k f_i$ in the degrees of the element nodes. Combining the terms $n_i$ and one of the terms $\sum_{j \ne i} n_j$ we obtain

\[
\begin{aligned}
\Delta &\ge \frac{a}{4m^2} \Bigl( -a^3 - 2k(a + 1) \sum_{i=1}^{r} f_i
 + \sum_{i=1}^{r} n_i \Bigl( a \sum_{j=1}^{r} n_j + 2k f_i + ((k - 1)a + k) \sum_{j \ne i} n_j \Bigr) \Bigr) \\
&= \frac{a}{4m^2} \Bigl( -2k(a + 1) \sum_{i=1}^{r} f_i
 + \sum_{i=1}^{r} n_i \Bigl( 2k f_i + ((k - 1)a + k) \sum_{j \ne i} n_j \Bigr) \Bigr) \\
&= \frac{a}{4m^2} \Bigl( \sum_{i=1}^{r} 2k f_i (n_i - a - 1)
 + ((k - 1)a + k) \sum_{i=1}^{r} \sum_{j \ne i} n_i n_j \Bigr) \\
&\ge \frac{a}{4m^2} \Bigl( \sum_{i=1}^{r} 2k n_i (n_i - a - 1)
 + ((k - 1)a + k) \sum_{i=1}^{r} \sum_{j \ne i} n_i n_j \Bigr) .
\end{aligned}
\]

For the last step we note that $n_i \le a - 1$ and $n_i - a - 1 < 0$ for all $i = 1,\dots,r$. So increasing $f_i$ decreases the modularity difference. For each node of $H$ there is at most one edge to a node not in $H$, and thus $f_i \le n_i$.

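By rearranging terms and using the inequality $a \ge 3k$ we get

\[
\begin{aligned}
\Delta &\ge \frac{a}{4m^2} \sum_{i=1}^{r} n_i \Bigl( 2k(n_i - a - 1) + ((k - 1)a + k) \sum_{j \ne i} n_j \Bigr) \\
&= \frac{a}{4m^2} \sum_{i=1}^{r} n_i \Bigl( -2k + ((k - 1)a - k) \sum_{j \ne i} n_j \Bigr) \\
&\ge \frac{a}{4m^2} \bigl( (k - 1)a - 3k \bigr) \sum_{i=1}^{r} \sum_{j \ne i} n_i n_j \\
&\ge \frac{3k^2}{4m^2} (3k - 6) \sum_{i=1}^{r} \sum_{j \ne i} n_i n_j .
\end{aligned}
\]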

As we can assume k > 2 for all relevant instances of 3-

PARTITION, we obtain ∆ > 0. This shows that any clustering can

be improved by merging each clique completely into a cluster.

Next, we observe that the optimum clustering places at most one

clique completely into a single cluster.

Lemma 4.2: In a maximum modularity clustering of G(A),

every cluster contains at most one of the cliques H1,...,Hk.

Proof: Consider a maximum modularity clustering.

Lemma 4.1 shows that each of the k cliques H1,...,Hk is

entirely contained in one cluster. Assume that there is a cluster

C which contains at least two of the cliques. If C does not

contain any element nodes, then the cliques form disconnected

components in the cluster. In this case it is easy to see that the

clustering can be improved by splitting C into distinct clusters,

one for each clique. In this way we keep the number of edges

within clusters the same, however, we reduce the squared degree

sums of clusters.

Otherwise, we assume $C$ contains $l > 1$ cliques completely and in addition some element nodes of elements $a_j$ with $j \in J \subseteq \{1,\dots,3k\}$. Note that inside the $l$ cliques $l\,a(a - 1)/2$ edges are covered. In addition, for every element node corresponding to an element $a_j$ there are $l a_j$ edges included. The degree sum of the cluster is given by the $l a$ clique nodes of degree $a$ and some number of element nodes of degree $k a_j$. The contribution of $C$ to $q(\mathcal{C})$ is thus given by

\[
\frac{1}{m} \Bigl( \frac{l}{2} a(a - 1) + l \sum_{j \in J} a_j \Bigr)
- \frac{1}{4m^2} \Bigl( l a^2 + k \sum_{j \in J} a_j \Bigr)^{2} .
\]

Now suppose we create $\mathcal{C}'$ by splitting $C$ into $C_1'$ and $C_2'$ such that $C_1'$ completely contains a single clique $H$. This leaves the number of edges covered within the cliques the same, however, all edges from $H$ to the included element nodes eventually drop out. The degree sum of $C_1'$ is exactly $a^2$, and so the contribution of $C_1'$ and $C_2'$ to $q(\mathcal{C}')$ is given by

\[
\frac{1}{m} \Bigl( \frac{l}{2} a(a - 1) + (l - 1) \sum_{j \in J} a_j \Bigr)
- \frac{1}{4m^2} \Bigl( \Bigl( (l - 1) a^2 + k \sum_{j \in J} a_j \Bigr)^{2} + a^4 \Bigr) .
\]

Considering the difference we note that

\[
\begin{aligned}
q(\mathcal{C}') - q(\mathcal{C})
&= -\frac{1}{m} \sum_{j \in J} a_j
 + \frac{1}{4m^2} \Bigl( (2l - 1) a^4 + 2k a^2 \sum_{j \in J} a_j - a^4 \Bigr) \\
&= \frac{2(l - 1) a^4 + 2k a^2 \sum_{j \in J} a_j - 4m \sum_{j \in J} a_j}{4m^2} \\
&= \frac{2(l - 1) a^4 - 2k a \sum_{j \in J} a_j}{4m^2} \\
&\ge \frac{9k^3}{2m^2} (9k - 1) \\
&> 0 ,
\end{aligned}
\]

as $k > 0$ for all instances of 3-PARTITION.

Since the clustering is improved in every case, it is not optimal.

This is a contradiction.

The previous two lemmas show that any clustering can be

strictly improved to a clustering that contains k clique clusters,

such that each one completely contains one of the cliques

H1,...,Hk (possibly plus some additional element nodes). In

particular, this must hold for the optimum clustering as well. Now

that we know how the cliques are clustered we turn to the element

nodes.

As they are not directly connected, it is never optimal to create a

cluster consisting only of element nodes. Splitting such a cluster

into singleton clusters, one for each element node, reduces the

squared degree sums but keeps the edge coverage at the same

value. Hence, such a split yields a clustering with strictly higher

modularity. The next lemma shows that we can further strictly

improve the modularity of a clustering with a singleton cluster of

an element node by joining it with one of the clique clusters.

Lemma 4.3: In a maximum modularity clustering of G(A),

there is no cluster composed of element nodes only.

Proof: Consider a clustering $\mathcal{C}$ of maximum modularity and suppose that there is an element node $v_i$ corresponding to the element $a_i$, which is not part of any clique cluster. As argued above we can improve such a clustering by creating a singleton cluster $C = \{v_i\}$. Suppose $C_{\min}$ is the clique cluster, for which the sum of degrees is minimal. We know that $C_{\min}$ contains all nodes from a clique $H$ and possibly some other element nodes for elements $a_j$ with $j \in J$ for some index set $J$. The cluster $C_{\min}$ covers all $a(a - 1)/2$ edges within $H$ and $\sum_{j \in J} a_j$ edges to element nodes. The degree sum is $a^2$ for clique nodes and $k \sum_{j \in J} a_j$ for element nodes. As $C$ is a singleton cluster, it covers no edges and the degree sum is $k a_i$. This yields a contribution of $C$ and $C_{\min}$ to $q(\mathcal{C})$ of

\[
\frac{1}{m} \Bigl( \frac{a(a - 1)}{2} + \sum_{j \in J} a_j \Bigr)
- \frac{1}{4m^2} \Bigl( \Bigl( a^2 + k \sum_{j \in J} a_j \Bigr)^{2} + k^2 a_i^2 \Bigr) .
\]

Again, we create a different clustering $\mathcal{C}'$ by joining $C$ and $C_{\min}$ to a new cluster $C'$. This increases the edge coverage by $a_i$. The new cluster $C'$ has the sum of degrees of both previous clusters. The contribution of $C'$ to $q(\mathcal{C}')$ is given by

\[
\frac{1}{m} \Bigl( \frac{a(a - 1)}{2} + a_i + \sum_{j \in J} a_j \Bigr)
- \frac{1}{4m^2} \Bigl( a^2 + k a_i + k \sum_{j \in J} a_j \Bigr)^{2} ,
\]

so that

\[
\begin{aligned}
q(\mathcal{C}') - q(\mathcal{C})
&= \frac{a_i}{m} - \frac{1}{4m^2} \Bigl( 2k a^2 a_i + 2k^2 a_i \sum_{j \in J} a_j \Bigr) \\
&= \frac{1}{4m^2} \Bigl( 2k a(a + 1) a_i - 2k a^2 a_i - 2k^2 a_i \sum_{j \in J} a_j \Bigr) \\
&= \frac{a_i}{4m^2} \Bigl( 2k a - 2k^2 \sum_{j \in J} a_j \Bigr) .
\end{aligned}
\]

At this point recall that $C_{\min}$ is the clique cluster with the minimum degree sum. For this cluster the elements corresponding to included element nodes can never sum to more than $a/k$. In particular, as $v_i$ is not part of any clique cluster, the elements of nodes in $C_{\min}$ can never sum to more than $(a - a_i)/k$. Thus,

\[
\sum_{j \in J} a_j \le \frac{1}{k}(a - a_i) < \frac{1}{k} a ,
\]

and so $q(\mathcal{C}') - q(\mathcal{C}) > 0$. This contradicts the assumption that $\mathcal{C}$ is optimal.

We have shown that for the graphs $G(A)$ the clustering of maximum modularity consists of exactly $k$ clique clusters, and each element node belongs to exactly one of the clique clusters. Combining the above results, we now state our main result:

Theorem 4.4: MODULARITY is strongly NP-complete.

Proof: For a given clustering $\mathcal{C}$ of $G(A)$ we can check in polynomial time whether $q(\mathcal{C}) \ge K(A)$, so clearly MODULARITY $\in$ NP.

For NP-completeness we transform an instance $A = \{a_1,\dots,a_{3k}\}$ of 3-PARTITION into an instance $(G(A),K(A))$ of MODULARITY. We have already outlined the construction of the graph $G(A)$ above. For the correct parameter $K(A)$ we consider a clustering in $G(A)$ with the properties derived in the previous lemmas, i.e., a clustering with exactly $k$ clique clusters. Any such clustering yields exactly $(k - 1)a$ inter-cluster edges, so the edge coverage is given by

\[
\sum_{C \in \mathcal{C}} \frac{|E(C)|}{m} = \frac{m - (k - 1)a}{m}
= 1 - \frac{2(k - 1)a}{k a(a + 1)} = 1 - \frac{2k - 2}{k(a + 1)} .
\]

Hence, the clustering $\mathcal{C} = (C_1,\dots,C_k)$ with maximum modularity must minimize $\deg(C_1)^2 + \deg(C_2)^2 + \dots + \deg(C_k)^2$. This requires a distribution of the element nodes between the clusters which is as even as possible with respect to the sum of degrees per cluster. In the optimum case we can assign to each cluster element nodes corresponding to elements that sum to $b = 1/k \cdot a$. In this case the sum of degrees of element nodes in each clique cluster is equal to $k \cdot 1/k \cdot a = a$. This yields $\deg(C_i) = a^2 + a$ for each clique cluster $C_i$, $i = 1,\dots,k$, and gives

\[
\deg(C_1)^2 + \dots + \deg(C_k)^2 \ge k(a^2 + a)^2 = k a^2 (a + 1)^2 .
\]

Equality holds only in the case, in which an assignment of $b$ to each cluster is possible. Hence, if there is a clustering $\mathcal{C}$ with $q(\mathcal{C})$ of at least

\[
K(A) = 1 - \frac{2k - 2}{k(a + 1)} - \frac{k a^2 (a + 1)^2}{k^2 a^2 (a + 1)^2}
= \frac{(k - 1)(a - 1)}{k(a + 1)}
\]

then we know that this clustering must split the element nodes

perfectly to the k clique clusters. As each element node is

contained in exactly one cluster, this yields a solution for the

instance of 3-PARTITION. With this choice of K(A) the instance

(G(A),K(A)) of MODULARITY is satisfiable only if the instance

A of 3-PARTITION is satisfiable.

Conversely, suppose the instance for 3-PARTITION is satisfiable.

Then there is a partition into k sets such that the sum over each

set is 1/k · a. If we cluster the corresponding graph by joining

the element nodes of each set with a different clique, we get

a clustering of modularity K(A). This shows that the instance

(G(A),K(A)) of MODULARITY is satisfiable if the instance A

of 3-PARTITION is satisfiable. This completes the reduction and

proves the theorem.
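The threshold K(A) can be checked numerically on the instance of Figure 2. The sketch below is an illustration under stated assumptions: the helper names, the node labels, and the index-order wiring of element nodes to clique members are hypothetical choices, and the two groupings compared are simply one valid 3-partition ({2,2,3} twice) and one unbalanced split.

```python
from fractions import Fraction


def build_reduction_graph(A):
    """G(A): k cliques of size a = sum(A), plus one element node per a_i."""
    k, a = len(A) // 3, sum(A)
    edges, cliques = [], []
    for h in range(k):
        members = [("clique", h, t) for t in range(a)]
        cliques.append(members)
        edges += [(members[s], members[t])
                  for s in range(a) for t in range(s + 1, a)]
        t = 0
        for i, a_i in enumerate(A):
            for _ in range(a_i):  # a_i neighbors of element i in this clique
                edges.append((("element", i), members[t]))
                t += 1
    return edges, cliques


def modularity(edges, clustering):
    """Equation (2), evaluated with exact rational arithmetic."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = Fraction(0)
    for cluster in clustering:
        intra = sum(1 for u, v in edges if u in cluster and v in cluster)
        deg_sum = sum(degree[v] for v in cluster)
        q += Fraction(intra, m) - Fraction(deg_sum, 2 * m) ** 2
    return q


def clique_clustering(cliques, groups):
    """Cluster h = clique H_h plus the element nodes indexed by groups[h]."""
    return [set(cliques[h]) | {("element", i) for i in groups[h]}
            for h in range(len(cliques))]


A = [2, 2, 2, 2, 3, 3]
k, a = len(A) // 3, sum(A)
edges, cliques = build_reduction_graph(A)

K_A = Fraction((k - 1) * (a - 1), k * (a + 1))
# Balanced grouping: element sums 7 and 7 (a solution of 3-PARTITION).
q_balanced = modularity(edges, clique_clustering(cliques, [[0, 1, 4], [2, 3, 5]]))
# Unbalanced grouping: element sums 6 and 8.
q_unbalanced = modularity(edges, clique_clustering(cliques, [[0, 1, 2], [3, 4, 5]]))
print(K_A, q_balanced, q_unbalanced)
```

Both groupings have the same coverage, so the difference comes only from the squared degree sums, which is the quantity the proof of Theorem 4.4 asks to minimize.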

This result naturally holds also for the straightforward gen-

eralization of maximizing modularity in weighted graphs [17].

Instead of using the numbers of edges the definition of modularity

employs the sum of edge weights for edges within clusters,

between clusters and in the total graph.

A. Special Case: Modularity with Bounded Number of Clusters

A common clustering approach is based on iteratively identi-

fying cuts with respect to some quality measures, see for exam-

ple [18], [19], [20]. The general problem being NP-complete, we

now complete our hardness results by proving that the restricted

optimization problem is hard as well. More precisely, we consider

the two problems of computing the clustering with maximum

modularity that splits the graph into exactly or at most two

clusters. Although these are two different problems, our hardness

result will hold for both versions, hence, we define the problem

cumulatively.

Problem 3 (k-MODULARITY): Given a graph G and a number

K, is there a clustering C of G into exactly/at most k clusters,

for which q(C) ≥ K?

We provide a proof using a reduction that is similar to the one

given recently for showing the hardness of the MinDisAgree[2]

problem of correlation clustering [21]. We use the problem MIN-

IMUM BISECTION FOR CUBIC GRAPHS (MB3) for the reduction:

Problem 4 (MINIMUM BISECTION FOR CUBIC GRAPHS):

Given a 3-regular graph G with n nodes and an integer c, is

there a clustering into two clusters of n/2 nodes each such that

it cuts at most c edges?

This problem has been shown to be strongly NP-complete in [22].

We construct an instance of 2-MODULARITY from an instance of

MB3 as follows. For each node v from the graph G = (V,E) we

attach n−1 new nodes and construct an n-clique. We denote these

cliques as cliq(v) and refer to them as node clique for v ∈ V .

Hence, in total we construct n different new cliques, and after

this transformation each node from the original graph has degree

n+2. Note that a cubic graph with n nodes has exactly 1.5n edges.

In our adjusted graph there are exactly m = (n(n − 1) + 3)n/2

edges.

We will show that an optimum clustering of 2-MODULARITY in the adjusted graph, denoted by $\mathcal{C}^*$, has exactly

two clusters. Furthermore, such a clustering corresponds to a

minimum bisection of the underlying MB3 instance. In particular,

we give a bound K such that the MB3 instance has a bisection

cut of size at most c if and only if the corresponding graph has

2-modularity at least K.

We begin by noting that there is always a clustering $\mathcal{C}$ with $q(\mathcal{C}) > 0$. Hence, $\mathcal{C}^*$ must have exactly two clusters, as no more than two clusters are allowed. This serves to show that our proof works for both versions of 2-modularity, in which at most or exactly two clusters must be found.

Lemma 4.5: For every graph constructed from an MB3 instance, there exists a clustering $\mathcal{C} = \{C_1,C_2\}$ such that $q(\mathcal{C}) > 0$. In particular, the clustering $\mathcal{C}^*$ has two clusters.

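Proof: Consider the following partition into two clusters. We pick the nodes of cliq(v) for some $v \in V$ as $C_1$ and the remaining graph as $C_2$. Then

\[
\begin{aligned}
q(\mathcal{C}) &= 1 - \frac{3}{m} - \frac{(n(n - 1) + 3)^2 + \bigl((n - 1)(n(n - 1) + 3)\bigr)^2}{4m^2} \\
&= \frac{2}{n} - \frac{2}{n^2} - \frac{3}{m}
 = \frac{2n - 2}{n^2} - \frac{3}{m} \\
&> 0 ,
\end{aligned}
\]

as $n \ge 4$ for every cubic graph. Hence $q(\mathcal{C}) > 0$ and the lemma follows.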
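The construction of the adjusted graph and the value computed in the proof of Lemma 4.5 can be reproduced on the smallest cubic graph, the complete graph K4. In the sketch below the helper names `attach_node_cliques` and `modularity` and the node labels are illustrative assumptions; original nodes keep their integer labels and the n-1 new nodes attached to node v are labeled ("new", v, t).

```python
from fractions import Fraction


def attach_node_cliques(n, cubic_edges):
    """Attach n-1 new nodes to every original node v and form cliq(v)."""
    edges = list(cubic_edges)
    cliq = {}
    for v in range(n):
        members = [v] + [("new", v, t) for t in range(n - 1)]
        cliq[v] = members
        edges += [(members[s], members[t])
                  for s in range(n) for t in range(s + 1, n)]
    return edges, cliq


def modularity(edges, clustering):
    """Equation (2), evaluated with exact rational arithmetic."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = Fraction(0)
    for cluster in clustering:
        intra = sum(1 for u, v in edges if u in cluster and v in cluster)
        deg_sum = sum(degree[v] for v in cluster)
        q += Fraction(intra, m) - Fraction(deg_sum, 2 * m) ** 2
    return q


# K4 is 3-regular on n = 4 nodes.
n = 4
cubic_edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
edges, cliq = attach_node_cliques(n, cubic_edges)
m = len(edges)

# Clustering from the proof: one node clique versus the rest of the graph.
C1 = set(cliq[0])
C2 = {v for members in cliq.values() for v in members} - C1
q = modularity(edges, [C1, C2])
closed_form = Fraction(2, n) - Fraction(2, n * n) - Fraction(3, m)
print(m, q, closed_form)
```

The edge count agrees with m = (n(n-1)+3)n/2, and the directly computed modularity coincides with the closed form 2/n - 2/n^2 - 3/m from the proof.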

Next, we show that in an optimum clustering, all the nodes of

one node clique cliq(v) are located in one cluster:

Lemma 4.6: For every node $v \in V$ there exists a cluster $C \in \mathcal{C}^*$ such that cliq(v) $\subseteq C$.

Proof: For contradiction we assume that a node clique cliq(v) for some v ∈ V is split between the two clusters C1 and C2 of the clustering C = {C1, C2}. Let ki := |Ci ∩ cliq(v)| be the number of its nodes located in the corresponding cluster, with 1 ≤ ki ≤ n − 1. Note that k2 = n − k1. In addition, we denote the sum of node degrees in both clusters excluding nodes from cliq(v) by d1 and d2:

\[
d_i = \sum_{u \in C_i,\; u \notin \mathrm{cliq}(v)} \deg(u).
\]

Without loss of generality assume that d1 ≥ d2. Finally, we denote by m′ the number of edges covered by the clusters C1 and C2. We define a new clustering C′ as {C1 \ cliq(v), C2 ∪ cliq(v)} and denote the difference of the modularity as ∆ := q(C′) − q(C). We distinguish two cases depending on the cluster in which the node v was located with respect to C.

In the first case v ∈ C2 and we obtain:

\[
q(\mathcal{C}) = \frac{m'}{m} - \frac{(d_1 + k_1(n-1))^2}{4m^2} - \frac{(d_2 + (n-k_1)(n-1) + 3)^2}{4m^2},
\]
\[
q(\mathcal{C}') = \frac{m' + k_1(n-k_1)}{m} - \frac{d_1^2}{4m^2} - \frac{(d_2 + n(n-1) + 3)^2}{4m^2},
\]
and
\[
\Delta = \frac{k_1(n-k_1)}{m} + \frac{(d_1 + k_1(n-1))^2}{4m^2} + \frac{(d_2 + (n-k_1)(n-1) + 3)^2}{4m^2} - \frac{d_1^2 + (d_2 + n(n-1) + 3)^2}{4m^2}.
\]

We simplify the expression of ∆ as follows:

\[
\begin{aligned}
\Delta &= \frac{1}{4m^2}\Big( 4mk_1(n-k_1) - d_1^2 - (d_2 + n(n-1) + 3)^2 + (d_1 + k_1(n-1))^2 + (d_2 + (n-k_1)(n-1) + 3)^2 \Big) \\
&= \frac{1}{4m^2}\Big( 4mk_1(n-k_1) + (2k_1^2 - 2nk_1)(n-1)^2 - 6k_1(n-1) + 2(d_1 - d_2)k_1(n-1) \Big) \\
&\ge \frac{k_1}{4m^2}\Big( 4m(n-k_1) - 2(n-k_1)(n-1)^2 - 6(n-1) \Big).
\end{aligned}
\]

The last step uses the assumption d1 ≥ d2. We can bound the expression in the bracket in the following way by using 1 ≤ k1 ≤ n − 1:

\[
(n-k_1)\big(4m - 2(n-1)^2\big) - 6(n-1) \;\ge\; (n-k_1)\underbrace{\big(4m - 2(n-1)^2 - 6(n-1)\big)}_{=:B} \tag{3}
\]

and, thus, it remains to show that B > 0. By filling in the value of m and using the facts that 2n²(n − 1) > 2(n − 1)² and 6n > 6(n − 1) for all n ≥ 4, we obtain B > 0 and thus modularity strictly improves if all nodes of cliq(v) are moved to C2.

In the second case the node v ∈ C1 and we get the following equations:

\[
q(\mathcal{C}) = \frac{m'}{m} - \frac{(d_1 + k_1(n-1) + 3)^2}{4m^2} - \frac{(d_2 + (n-k_1)(n-1))^2}{4m^2},
\]
\[
q(\mathcal{C}') = \frac{m' + k_1(n-k_1)}{m} - \frac{d_1^2}{4m^2} - \frac{(d_2 + n(n-1) + 3)^2}{4m^2},
\]
and
\[
\Delta = \frac{k_1(n-k_1)}{m} + \frac{(d_1 + k_1(n-1) + 3)^2}{4m^2} + \frac{(d_2 + (n-k_1)(n-1))^2}{4m^2} - \frac{d_1^2 + (d_2 + n(n-1) + 3)^2}{4m^2}.
\]

We simplify the expression of ∆ as follows:

\[
\begin{aligned}
4m^2 \Delta &= 4mk_1(n-k_1) + (2k_1^2 - 2nk_1)(n-1)^2 - 6(n-k_1)(n-1) + 2(d_1 - d_2)(k_1(n-1) + 3) \\
&\ge 4mk_1(n-k_1) - 2k_1(n-k_1)(n-1)^2 - 6(n-k_1)(n-1) \\
&= (n-k_1)\big(4mk_1 - 2k_1(n-1)^2 - 6(n-1)\big).
\end{aligned}
\]

Recalling that 1 ≤ k1 ≤ n − 1 and filling in the value of m, we obtain

\[
4mk_1 - 2k_1(n-1)^2 - 6(n-1) = 2k_1\big(n^2(n-1) - (n-1)^2\big) + 6nk_1 - 6(n-1) > 0 ,
\]

which holds for all k1 ≥ 1 and n ≥ 4. Also in this case, modularity strictly improves if all nodes of cliq(v) are moved to C2.

The final lemma before defining the appropriate input param-

eter K for the 2-MODULARITY and thus proving the correspon-

dence between the two problems shows that the clusters in the

optimum clusterings have the same size.


Lemma 4.7: In C∗, each cluster contains exactly n/2 complete node cliques.

Proof: Suppose for contradiction that one cluster C1 has l1 < n/2 cliques. For completeness of presentation we use m′ to denote the unknown (and irrelevant) number of edges covered by the clusters. The modularity of the clustering is given in Equation (4).

\[
q(\mathcal{C}^*) = \frac{m'}{m} - \frac{l_1^2\,(n(n-1)+3)^2}{4m^2} - \frac{(n-l_1)^2\,(n(n-1)+3)^2}{4m^2} \tag{4}
\]

We create a new clustering C′ by transferring a complete node clique from cluster C2 to cluster C1. As the graph G is 3-regular, we lose at most 3 edges in the coverage part of modularity:

\[
q(\mathcal{C}') \ge \frac{m' - 3}{m} - \frac{(l_1+1)^2\,(n(n-1)+3)^2}{4m^2} - \frac{(n-l_1-1)^2\,(n(n-1)+3)^2}{4m^2}. \tag{5}
\]

We can bound the difference in the following way:

\[
\begin{aligned}
q(\mathcal{C}') - q(\mathcal{C}^*) &\ge -\frac{3}{m} + \frac{\big(l_1^2 + (n-l_1)^2 - (l_1+1)^2 - (n-l_1-1)^2\big)(n(n-1)+3)^2}{4m^2} \\
&= -\frac{3}{m} + \frac{2n - 4l_1 - 2}{n^2} \\
&\ge -\frac{3}{m} + \frac{2}{n^2} = \frac{2}{n^2} - \frac{6}{n^3 - n^2 + 3n} > 0 ,
\end{aligned}
\]

for all n ≥ 4. The analysis uses the fact that we can assume n to be an even number, so l1 ≤ n/2 − 1 and thus 4l1 ≤ 2n − 4.

This shows that we can improve every clustering by balancing the number of complete node cliques in the clusters – independent of the loss in edge coverage.

Finally, we can state the theorem about the complexity of 2-MODULARITY:

Theorem 4.8: 2-MODULARITY is strongly NP-complete.

Proof: Let (G, c) be an instance of MINIMUM BISECTION FOR CUBIC GRAPHS, then we construct a new graph G′ as stated above and define K := 1/2 − c/m.

As we have shown in Lemma 4.7 that each cluster of C∗, an optimum clustering of G′ with respect to 2-MODULARITY, has exactly n/2 complete node cliques, the sum of degrees in each cluster is exactly m. Thus, it is easy to see that if the clustering C∗ meets the following inequality

\[
q(\mathcal{C}^*) \ge 1 - \frac{c}{m} - \frac{2m^2}{4m^2} = \frac{1}{2} - \frac{c}{m} = K ,
\]

then the number of inter-cluster edges can be at most c. Thus the clustering C∗ induces a balanced cut in G with at most c cut edges.

This proof is particularly interesting as it highlights that maximizing modularity in general is hard due to the hardness of minimizing the squared degree sums on the one hand, whereas in the case of two clusters this is due to the hardness of minimizing the edge cut.

V. THE GREEDY ALGORITHM

In contrast to the abovementioned iterative cutting strategy,

another commonly used approach to find clusterings with good

quality scores is based on greedy agglomeration [14], [23]. In the

case of modularity, this approach is particularly widespread [7],

[8].

Algorithm 1: GREEDY ALGORITHM FOR MAXIMIZING MODULARITY

Input: graph G = (V,E)
Output: clustering C of G

C ← singletons
initialize matrix ∆
while |C| > 1 do
    find {i,j} such that ∆i,j is the maximum entry in the matrix ∆
    merge clusters i and j
    update ∆
return clustering with highest modularity
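Algorithm 1 can be sketched in Python as follows. This is a deliberately naive illustration under our own assumptions, not the efficient implementation of [8]: it recomputes the gain of every cluster pair in each round, breaks ties by taking the first maximum in sorted order, assumes a simple graph with at least one edge and sortable node labels, and uses exact fractions so that ties are not disturbed by rounding. The name greedy_modularity is ours. The gain of merging clusters i and j is e_ij/m − d_i·d_j/(2m²), where e_ij counts the edges between them and d_i, d_j are their degree sums.

```python
from fractions import Fraction
from itertools import combinations


def greedy_modularity(nodes, edges):
    """Naive greedy agglomeration; returns (best modularity, best clustering)."""
    m = len(edges)
    clusters = {v: {v} for v in nodes}          # cluster id -> member nodes
    deg = {v: 0 for v in nodes}                 # cluster id -> degree sum
    between = {}                                # frozenset({i, j}) -> edge count
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        key = frozenset((u, v))
        between[key] = between.get(key, 0) + 1

    def delta(i, j):
        """Change in modularity caused by merging clusters i and j."""
        e_ij = between.get(frozenset((i, j)), 0)
        return Fraction(e_ij, m) - Fraction(deg[i] * deg[j], 2 * m * m)

    # Modularity of the singleton clustering: no edge covered, only the penalty.
    q = -Fraction(sum(d * d for d in deg.values()), 4 * m * m)
    best_q, best = q, [set(c) for c in clusters.values()]

    while len(clusters) > 1:
        # First pair (in sorted order) attaining the maximum entry of Delta.
        i, j = max(combinations(sorted(clusters), 2), key=lambda p: delta(*p))
        q += delta(i, j)
        clusters[i] |= clusters.pop(j)          # merge j into i
        deg[i] += deg.pop(j)
        for k in clusters:                      # redirect edges of j to i
            if k != i:
                moved = between.pop(frozenset((j, k)), 0)
                if moved:
                    key = frozenset((i, k))
                    between[key] = between.get(key, 0) + moved
        between.pop(frozenset((i, j)), None)    # these edges are now internal
        if q > best_q:
            best_q, best = q, [set(c) for c in clusters.values()]
    return best_q, best


# Two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
q, clustering = greedy_modularity(range(6), edges)
```

On this small example the sketch returns the two triangles as clusters with modularity 5/14. Because every round scans all pairs, the running time is roughly cubic in the number of nodes; the sketch is meant to make the merge rule concrete, not to be fast.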

The greedy algorithm starts with the singleton clustering and iteratively merges those two clusters that yield a clustering with the best modularity, i.e., the largest increase or the smallest decrease is chosen. After n − 1 merges the clustering that achieved the highest modularity is returned. The algorithm maintains a symmetric matrix ∆ with entries ∆i,j := q(Ci,j) − q(C), where C is the current clustering and Ci,j is obtained from C by merging clusters Ci and Cj. Note that there can be several pairs i and j such that ∆i,j is the maximum; in these cases the algorithm selects an arbitrary pair. The pseudo-code for the greedy algorithm is given in Algorithm 1. An efficient implementation using sophisticated data-structures requires O(n² log n) runtime. Note that n − 1 iterations is an upper bound and one can terminate the algorithm when the matrix ∆ contains only non-positive entries. We call this property single-peakedness; it is proven in [8].

Since it is NP-hard to maximize modularity in general graphs, it is unlikely that this greedy algorithm is optimal. In fact, we sketch a graph family, where the above greedy algorithm has an approximation factor of 2, asymptotically. In order to prove this statement, we introduce a general construction scheme given in Definition 5.2. Furthermore, we point out instances where a specific way of breaking ties of merges yields a clustering with modularity of 0, while the optimum clustering has a strictly positive score.

Modularity is defined such that it takes values in the interval [−1/2,1] for any graph and any clustering. In particular the modularity of a trivial clustering placing all vertices into a single cluster has a value of 0. We use this technical peculiarity to show that the greedy algorithm has an unbounded approximation ratio.

Theorem 5.1: There is no finite approximation factor for the greedy algorithm for finding clusterings with maximum modularity.

Proof: We present a class of graphs, on which the algorithm obtains a clustering of value 0, but for which the optimum clustering has value close to 1/2. A graph G of this class is given by two cliques (V1,E1) and (V2,E2) of size |V1| = |V2| = n/2, and n/2 matching edges Em connecting each vertex from V1 to exactly one vertex in V2 and vice versa. See Figure 3 for an example with n = 14. Note that we can define modularity by associating weights w(u,v) with every existing and non-existing edge in G as follows:

\[
w(u,v) = \frac{E_{uv}}{2m} - \frac{\deg(u)\deg(v)}{4m^2} ,
\]

where Euv = 1 if (u,v) ∈ E and 0 otherwise. The modularity of a clustering C is then derived by summing the weights of the edges covered by C:

\[
q(\mathcal{C}) = \sum_{C \in \mathcal{C}} \sum_{u,v \in C} w(u,v).
\]

Note that in this formula we have to count twice the weight for each edge between different vertices u and v (once for every ordering) and once the weight for a non-existing self-loop for every vertex u. Thus, the change of modularity by merging two clusters is given by twice the sum of weights between the clusters.

Fig. 3. (a) Clustering with modularity 0; (b) Clustering with modularity close to 1/2.

Now consider a run of the greedy algorithm on the graph of Figure 3. Note that the graph is n/2-regular, and thus has m = n²/4 edges. Each existing edge gets a weight of 2/n² − 1/n² = 1/n², while every non-existing edge receives a weight of −1/n². As the self-loop is counted by every clustering, the initial trivial singleton clustering has modularity value of −1/n. In the first step each cluster merge along any existing edge results in an increase of 2/n². Of all these equivalent possibilities we suppose the algorithm chooses to merge along an edge from Em to create a cluster C′. In the second step merging a vertex with C′ results in a change of 0, because one existing and one non-existing edge would be included. Every other merge along an existing edge still has value 2/n². We suppose the algorithm again chooses to merge two singleton clusters along an edge from Em, creating a cluster C″. Afterwards observe that merging clusters C′ and C″ yields a change of 0, because two existing and two non-existing edges would be included. Thus, it is again optimal to merge two singleton clusters along an existing edge. If the algorithm continues to merge singleton clusters along the edges from Em, it will in each iteration make an optimal merge resulting in a strictly positive increase in modularity. After n/2 steps it has constructed a clustering C of the type depicted in Figure 3(a). C consists of one cluster for the vertices of each edge of Em and has a modularity value of

\[
q(\mathcal{C}) = \frac{2}{n} - \frac{n}{2} \cdot \frac{4n^2}{n^4} = 0.
\]

Due to the single-peakedness of the problem [8] all following cluster merges can never increase this value, hence the algorithm will return a clustering of value 0.

On the other hand consider a clustering C∗ = {C1, C2} with two clusters, one for each clique C1 = V1 and C2 = V2 (see Figure 3(b)). This clustering has a modularity of

\[
q(\mathcal{C}^*) = \frac{n(n-2)}{n^2} - 2 \cdot \frac{4n^2}{16n^2} = \frac{1}{2} - \frac{2}{n}.
\]

This shows that the approximation ratio of the greedy algorithm

can be infinitely large, because no finite approximation factor can

outweigh a value of 0 with one strictly greater than 0.
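The three modularity values used in the proof of Theorem 5.1 can be reproduced directly for the graph of Figure 3 with n = 14. The sketch below is only a numerical check under our own assumptions: the helper names modularity and two_cliques_with_matching are hypothetical, nodes 0..n/2−1 form one clique, nodes n/2..n−1 the other, and node u is matched with node u + n/2. Modularity is evaluated as coverage minus the sum of squared cluster degree sums divided by 4m².

```python
from fractions import Fraction


def modularity(edges, clusters):
    """Coverage minus the squared cluster degree sums over 4 m^2."""
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    label = {v: i for i, c in enumerate(clusters) for v in c}
    covered = sum(1 for u, v in edges if label[u] == label[v])
    penalty = sum(sum(deg.get(v, 0) for v in c) ** 2 for c in clusters)
    return Fraction(covered, m) - Fraction(penalty, 4 * m * m)


def two_cliques_with_matching(n):
    """Two cliques on n/2 nodes each, joined by a perfect matching (n even)."""
    half = n // 2
    left, right = list(range(half)), list(range(half, n))
    edges = [(u, v) for side in (left, right)
             for i, u in enumerate(side) for v in side[i + 1:]]
    edges += [(u, u + half) for u in left]      # the matching edges E_m
    return left, right, edges


n = 14
left, right, edges = two_cliques_with_matching(n)
singletons = [{v} for v in range(n)]
matching_clusters = [{u, u + n // 2} for u in left]    # Figure 3(a)
clique_clusters = [set(left), set(right)]              # Figure 3(b)

q_single = modularity(edges, singletons)               # -1/n
q_matching = modularity(edges, matching_clusters)      # 0
q_cliques = modularity(edges, clique_clusters)         # 1/2 - 2/n
```

The graph has n²/4 = 49 edges, and the three clusterings evaluate to −1/14, 0 and 5/14, matching the closed forms −1/n, 0 and 1/2 − 2/n.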

The key observation is that the proof considers a worst-case scenario in the sense that greedy is in each iteration supposed to pick exactly the "worst" merge choice of several equivalently attractive alternatives. If greedy chooses in an early iteration to merge along an edge from E1 or E2, the resulting clustering will be significantly better. As mentioned earlier, this negative result is due to the formulation of modularity, which yields values from the interval [−1/2,1]. For instance, under a linear remapping of the range of modularity to the interval [0,1], the greedy algorithm yields a value of 1/3 compared to the new optimum score of 2/3. In this case the approximation factor would be 2.

Next, we provide a decreased lower bound for a different class of graphs, this time without any assumptions on the random choices of the algorithm.

Definition 5.2: Let G = (V,E) and H = (V′,E′) be two non-empty, simple, undirected, and unweighted graphs and let u ∈ V′ be a node. The product G ⋆_u H is defined as the graph (V″,E″) with the nodeset V″ := V ∪ V × V′ and the edgeset E″ := E ∪ E″_c ∪ E″_H where

\[
E''_c := \big\{ \{v,(v,u)\} \;\big|\; v \in V \big\}
\quad\text{and}\quad
E''_H := \big\{ \{(v,v'),(v,w')\} \;\big|\; v \in V,\; v',w' \in V',\; \{v',w'\} \in E' \big\}.
\]

An example is given in Figure 4. The product G ⋆_u H is a graph that contains G and for each node v of G a copy Hv of H. For each copy the node in Hv corresponding to u ∈ H is connected to v. We use the notation (v,w′) to refer to the copy of node w′ of H, which is located in Hv. In the following we consider only a special case: Let n ≥ 2 be an integer, H = (V′,E′) be an undirected and connected graph with at least two nodes, and u ∈ V′ an arbitrary but fixed node. We denote by C^g_k the clustering obtained with the greedy algorithm applied to Kn ⋆_u H starting from singletons and performing at most k steps that all have a positive increase in modularity. Furthermore, let m be the number of edges in Kn ⋆_u H. Based on the merging policy of the greedy algorithm we can characterize the final clustering C^g_n. It has n clusters, each of which includes a vertex v of G and its copy of H.

Fig. 4. The graph K4 ⋆_u P1.

Theorem 5.3: Let n ≥ 2 be an integer and H = (V′,E′) be an undirected and connected graph with at least two nodes. If 2|E′| + 1 < n then the greedy algorithm returns the clustering Cg := { {v} ∪ {v} × V′ | v ∈ V } for Kn ⋆_u H (for any fixed u ∈ H). This clustering has a modularity score of

\[
4m^2 \cdot q(\mathcal{C}^g) = 4m\,\big((|E'| + 1) \cdot n\big) - n\,\big(2|E'| + 1 + n\big)^2 .
\]

The proof of Theorem 5.3, which relies on the graph construction described above, is available from the authors or can alternatively be found in an associated technical report [24]. The next corollary reveals that the clustering, in which G and each copy of H form individual clusters, has a greater modularity score. We first observe an explicit expression for modularity.

Corollary 5.4: The clustering Cs is defined as Cs := {V} ∪ { {v} × V′ | v ∈ V } and, according to Equation (2), its modularity is

\[
4m^2 \cdot q(\mathcal{C}^s) = 4m\left( |E'|\,n + \binom{n}{2} \right) - n\,\big(2|E'| + 1\big)^2 - \big(n \cdot (n - 1 + 1)\big)^2 .
\]

If n ≥ 2 and 2|E′| + 1 < n, then clustering Cs has higher modularity than Cg.

Theorem 5.5: The approximation factor of the greedy algo-

rithm for finding clusterings with maximum modularity is at least

2.

The quotient q(Cs)/q(Cg) asymptotically approaches 2 for n going to infinity on Kn ⋆_u H with H a path of length (1/2)√n.

The full proof of Theorem 5.5 is also available in [24].
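The closed forms of Theorem 5.3 and Corollary 5.4 can be compared with a direct evaluation of modularity on a small product graph, and the quotient of Theorem 5.5 can be evaluated for larger n. The following sketch rests on our own assumptions: H is a path with a given number of edges, u is its first node, and the helper names (modularity, product_with_path, closed_forms) are hypothetical. The edge count m = n(n − 1)/2 + n(|E′| + 1) follows from Definition 5.2.

```python
from fractions import Fraction
from itertools import combinations


def modularity(edges, clusters):
    """Coverage minus the squared cluster degree sums over 4 m^2."""
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    label = {v: i for i, c in enumerate(clusters) for v in c}
    covered = sum(1 for u, v in edges if label[u] == label[v])
    penalty = sum(sum(deg.get(v, 0) for v in c) ** 2 for c in clusters)
    return Fraction(covered, m) - Fraction(penalty, 4 * m * m)


def product_with_path(n, path_edges):
    """K_n product with a path H; u is taken to be the first path node."""
    V = list(range(n))
    edges = list(combinations(V, 2))                      # E: the clique K_n
    edges += [(v, (v, 0)) for v in V]                     # connector edges
    edges += [((v, i), (v, i + 1)) for v in V for i in range(path_edges)]
    copies = [{(v, i) for i in range(path_edges + 1)} for v in V]
    return V, copies, edges


def closed_forms(n, e):
    """(q(C^g), q(C^s)) from Theorem 5.3 and Corollary 5.4, with e = |E'|."""
    m = n * (n - 1) // 2 + n * (e + 1)
    q_g = Fraction(4 * m * (e + 1) * n - n * (2 * e + 1 + n) ** 2, 4 * m * m)
    q_s = Fraction(4 * m * (e * n + n * (n - 1) // 2)
                   - n * (2 * e + 1) ** 2 - (n * n) ** 2, 4 * m * m)
    return q_g, q_s


# Direct check on the graph of Fig. 4 (n = 4, H a single edge).
V, copies, edges = product_with_path(4, 1)
greedy_clusters = [{v} | copies[v] for v in V]            # C^g
split_clusters = [set(V)] + copies                        # C^s
q_g, q_s = closed_forms(4, 1)
assert modularity(edges, greedy_clusters) == q_g
assert modularity(edges, split_clusters) == q_s

# Quotient for a larger instance: n = 10000 and a path with 50 edges.
big_g, big_s = closed_forms(10000, 50)
ratio = float(big_s / big_g)
```

For the graph of Fig. 4 the values are 9/28 for Cg and 67/196 for Cs. For n = 10000 with a path of 50 edges the quotient is already above 1.9 while staying below 2, in line with the asymptotic statement.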

VI. OPTIMALITY RESULTS

A. Characterization of Cliques and Cycles

In this section, we provide several results on the structure of

clusterings with maximum modularity for cliques and cycles. This

extends previous work, in particular [2], in which cycles and

cycles of cliques were used to reason about global properties of

modularity.

A first observation is that modularity can be simplified for

general d-regular graphs as follows.

Corollary 6.1: Let G = (V,E) be an unweighted d-regular graph and C = {C1,...,Ck} ∈ A(G). Then the following equality holds:

\[
q(\mathcal{C}) = \frac{|E(\mathcal{C})|}{dn/2} - \frac{1}{n^2} \sum_{i=1}^{k} |C_i|^2 . \tag{6}
\]

The correctness of the corollary can be read off the definition

given in Equation (2) and the fact that |E| = d|V |/2. Thus,

for regular graphs modularity only depends on cluster sizes and

coverage.

1) Cliques: We first deal with the case of complete graphs.

Corollary 6.2 provides a simplified formulation for modularity.

From this rewriting, the clustering with maximum modularity can

directly be obtained.

Corollary 6.2: Let Kn be a complete graph on n nodes and C := {C1,...,Ck} ∈ A(Kn). Then the following equality holds:

\[
q(\mathcal{C}) = -\frac{1}{n-1} + \frac{1}{n^2(n-1)} \sum_{i=1}^{k} |C_i|^2 . \tag{7}
\]

The simple proof of Corollary 6.2 can be found in the appendix. Thus, maximizing modularity is equivalent to maximizing the sum of the squares of cluster sizes. Using the general inequality (a + b)² ≥ a² + b² for non-negative real numbers, the clustering with maximum modularity is the 1-clustering. More precisely:

Theorem 6.3: Let k and n be integers, Kkn be the complete graph on k · n nodes and C a clustering such that each cluster contains exactly n elements. Then the following equality holds:

\[
q(\mathcal{C}) = \left( -1 + \frac{1}{k} \right) \cdot \frac{1}{kn - 1} .
\]

For fixed k > 1 and as n tends to infinity, modularity is always strictly negative, but tends to zero. Only for k = 1 modularity is zero and thus is the global maximum.

As Theorem 6.3 deals with one clique, the following corollary provides the optimal result for k disjoint cliques.

Corollary 6.4: The maximum modularity of a graph consisting of k disjoint cliques of size n is 1 − 1/k.

The corollary follows from the definition of modularity in

Equation (2). Corollary 6.4 gives a glimpse on how previous

approaches have succeeded to upper bound modularity as it was

pointed out in the context of Lemma 3.1.
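The statements on cliques can be checked against a direct evaluation of modularity. The sketch below is an illustration under our own assumptions (the helper names are hypothetical): it evaluates modularity as coverage minus squared cluster degree sums over 4m², and compares it with the closed form of Corollary 6.2, namely −1/(n − 1) + Σ|Ci|²/(n²(n − 1)), and with the value (−1 + 1/k)/(kn − 1) of Theorem 6.3.

```python
from fractions import Fraction
from itertools import combinations


def modularity(edges, clusters):
    """Coverage minus the squared cluster degree sums over 4 m^2."""
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    label = {v: i for i, c in enumerate(clusters) for v in c}
    covered = sum(1 for u, v in edges if label[u] == label[v])
    penalty = sum(sum(deg.get(v, 0) for v in c) ** 2 for c in clusters)
    return Fraction(covered, m) - Fraction(penalty, 4 * m * m)


def clique_formula(n, sizes):
    """Closed form of Corollary 6.2 for K_n and the given cluster sizes."""
    return Fraction(-1, n - 1) + Fraction(sum(s * s for s in sizes), n * n * (n - 1))


def equal_size_formula(k, n):
    """Theorem 6.3: K_{kn} split into k clusters of n nodes each."""
    return (Fraction(-1) + Fraction(1, k)) * Fraction(1, k * n - 1)


K6 = list(combinations(range(6), 2))                 # complete graph on 6 nodes
uneven = [{0, 1, 2}, {3, 4}, {5}]
halves = [{0, 1, 2}, {3, 4, 5}]
single = [set(range(6))]

q_uneven = modularity(K6, uneven)
q_halves = modularity(K6, halves)
q_single = modularity(K6, single)
```

On K6 the uneven clustering has modularity −11/90, the two halves give −1/10, and the single cluster gives 0, which is the maximum as stated after Theorem 6.3.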

2) Cycles: Next, we focus on simple cycles, i.e., connected

2-regular graphs. According to Equation (6), modularity can be

expressed as given in Equation (8), if each cluster is connected

which may safely be assumed (see Corollary 3.5).

\[
q(\mathcal{C}) = \frac{n - k}{n} - \frac{1}{n^2} \sum_{i=1}^{k} |C_i|^2 . \tag{8}
\]

In the following, we prove that clusterings with maximum mod-

ularity are balanced with respect to the number and the sizes of

clusters. First we characterize the distribution of cluster sizes for

clusterings with maximum modularity, fixing the number k of

clusters. For convenience, we minimize F := 1 − q(C), where

the argument of F is the distribution of the cluster sizes.

Proposition 6.5: Let k and n be integers, the set D(k) := { x ∈ N^k | x1 + ... + xk = n }, and the function F : D(k) → R defined as

\[
F(x) := \frac{k}{n} + \frac{1}{n^2} \sum_{i=1}^{k} x_i^2 \quad \text{for } x \in D^{(k)} .
\]

Then, F has a global minimum at x∗ with x∗_i = ⌊n/k⌋ for i = 1,...,k − r and x∗_i = ⌈n/k⌉ for i = k − r + 1,...,k, where 0 ≤ r < k and r ≡ n mod k.

Proposition 6.5 is based on the fact that, roughly speaking, evening out cluster sizes decreases F. We refer the reader to the appendix for the full proof. Due to the special structure of simple cycles, we can swap neighboring clusters without changing the modularity. Thus, we can safely assume that clusters are sorted according to their sizes, starting with the smallest element. Then x∗ is the only optimum. Evaluating F at x∗ leads to a term that only depends on k and n. Hence, we can characterize the clusterings with maximum modularity only with respect to the number of clusters. The function to be minimized is given in Lemma 6.6:

Lemma 6.6: Let Cn be a simple cycle with n nodes, h: [1,...,n] → R a function defined as

\[
h(x) := x \cdot n + n + \left\lfloor \frac{n}{x} \right\rfloor \left( 2n - x \cdot \left( 1 + \left\lfloor \frac{n}{x} \right\rfloor \right) \right) ,
\]

and k∗ be the argument of the global minimum of h. Then every clustering of Cn with maximum modularity has k∗ clusters.

The proof of Lemma 6.6 builds upon Proposition 6.5; it can be found in the appendix. Finally we obtain the characterization for clusterings with maximum modularity for simple cycles.

Theorem 6.7: Let n be an integer and Cn a simple cycle with n nodes. Then every clustering C with maximum modularity has k clusters of almost equal size, where

\[
k \in \left[ \frac{n}{\sqrt{n + \sqrt{n}}} - 1 ,\; \frac{1}{2} + \sqrt{\frac{1}{4} + n} \right] .
\]

Furthermore, there are only 3 possible values for k for sufficiently

large n.

The rather technical proof of Theorem 6.7 is based on the

monotonicity of h. This proof can also be found in the appendix.
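The characterization for cycles can be explored numerically. The sketch below assumes our reading of the function of Lemma 6.6, h(x) = x·n + n + ⌊n/x⌋(2n − x(1 + ⌊n/x⌋)), which equals n²·F(x∗) for k = x clusters of almost equal size, and of the interval of Theorem 6.7, [n/√(n + √n) − 1, 1/2 + √(1/4 + n)]. The function names are hypothetical, and on ties the first minimizer is returned.

```python
import math


def h(x, n):
    """n^2 * F evaluated at the balanced size vector with x clusters."""
    f = n // x
    return x * n + n + f * (2 * n - x * (1 + f))


def best_cluster_count(n):
    """Argument k* of the global minimum of h over 1..n (first one on ties)."""
    return min(range(1, n + 1), key=lambda x: h(x, n))


def theorem_6_7_interval(n):
    """Interval for the number of clusters as read from Theorem 6.7."""
    lower = n / math.sqrt(n + math.sqrt(n)) - 1
    upper = 0.5 + math.sqrt(0.25 + n)
    return lower, upper


def modularity_from_h(k, n):
    """q = 1 - F = 1 - h(k) / n^2 for the balanced clustering with k clusters."""
    return 1 - h(k, n) / (n * n)


k16 = best_cluster_count(16)      # 4 clusters of 4 nodes each
k100 = best_cluster_count(100)    # 10 clusters of 10 nodes each
```

For n = 16 the minimum is attained at k = 4 with modularity 0.5, and for n = 100 at k = 10 with modularity 0.8; both values lie inside the respective interval, roughly [2.6, 4.5] and [8.5, 10.5].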


VII. EXAMPLES REVISITED

Applying our results about maximizing modularity gained so

far, we revisit three example networks that were used in related

work [25], [26], [9]. More precisely, we compare published

greedy solutions with respective optima, thus revealing two pecu-

liarities of modularity. First, we illustrate a behavioral pattern of

the greedy merge strategy and, second, we relativize the quality

of the greedy approach.

The first instance is the karate club network of Zachary originally introduced in [25] and used for demonstration in [26]. The network models social interactions between members of a karate club. More precisely, it represents friendships between the members before the club split up due to an internal dispute. A representation of the network is given in Figure 5. The partition that has resulted from the split is given by the shape of the nodes, while the colors indicate the clustering calculated by the greedy algorithm and blocks refer to an optimum clustering maximizing modularity, which has been obtained by solving its associated ILP. The corresponding scores of modularity are 0.431 for the optimum clustering, 0.397 for the greedy clustering, and 0.383 for the clustering given by the split. Even though this is another example in which the greedy algorithm does not perform optimally, its score is comparatively good. Furthermore, the example shows one of the potential pitfalls the greedy algorithm can encounter: Due to the attempt to balance the squared sum of degrees (over the clusters), a node with large degree (white square) and one with small degree (white circle) are merged at an early stage. However, by the same argument, such a cluster is unlikely to be merged with another one. Thus, small clusters with skewed degree distributions occur.

Fig. 5. Karate club network of Zachary [25]. The different clusterings are coded as follows: blocks represent the optimum clustering (with respect to modularity), colors correspond to the greedy clustering, and shapes code the split that occurred in reality.

The second instance is a network of books on politics, compiled by V. Krebs and used for demonstration in [9]. The nodes represent books on American politics bought from Amazon.com and edges join pairs of books that are frequently purchased together. A representation of the network is given in Figure 6. The optimum clustering maximizing modularity is given by the shapes of nodes, the colors of nodes indicate a clustering calculated by the greedy algorithm and the blocks show a clustering calculated by Geometric MST Clustering (GMC), which is introduced in [27], using the geometric mean of coverage and performance, both of which are quality indices discussed in the same paper. The corresponding scores of modularity are 0.527 for the optimum clustering, 0.502 for the greedy clustering, and 0.510 for the GMC clustering. Similar to the first example, the greedy algorithm is suboptimal, but relatively close to the optimum. Interestingly, GMC outperforms the greedy algorithm although it does not consider modularity in its calculations. This illustrates the fact that there probably are many intuitive clusterings close to the optimum clustering that all have relatively similar values of modularity. In analogy to the first example, we observe the same merge-artifact, namely the two nodes represented as dark-grey triangles.

Fig. 6. The network of books on politics compiled by V. Krebs. The different clusterings are coded as follows: blocks represent the clustering calculated with GMC, colors correspond to the greedy clustering, and shapes code the optimum clustering (with respect to modularity).

Fig. 7. Social network of bottlenose dolphins introduced in [28] and clustered

in [29]. The different clusterings are coded as follows: blocks represent

the clustering with maximum modularity, colors represent the result of the

greedy clustering, and shapes code the community structure identified with

the iterative conductance cut algorithm presented in [20].

As a last example, Figure 7 reflects the social structure of

a family of bottlenose dolphins off the coast of New Zealand,

observed by Lusseau et al. [28], who logged frequent associations

between dolphins over a period of seven years. The clustering

with optimum modularity (blocks) achieves a modularity score

of 0.529 and, again, the greedy algorithm (colors) approaches

this value with 0.496. However, structurally the two clusterings

disagree on the two small clusters, whereas a clustering based

on iterative conductance cutting [20] (shapes) achieves the same

quality (0.492), but disagrees with the optimum only on the

smallest cluster and on the refinement of the leftmost cluster.

Summarizing, the three examples illustrate several interesting facts. First of all, an artificial pattern in the optimization process of the greedy algorithm is revealed: The early merge of two nodes, one with a high and one with a low degree, results in a cluster which will not be merged with another one later on. In general, this can prevent finding the optimum clustering. Nevertheless, the greedy algorithm performs relatively well on the given instances and is at most 10% off the optimum. However, applying other algorithms that do not optimize modularity, we observe that the obtained clusterings have similar scores. Thus, achieving good scores of modularity does not seem to be too hard on these instances. On the one hand, these clusterings roughly agree in terms of the overall structure; on the other hand, they differ in numbers of clusters and even feature artifacts such as small clusters of size one or two. Considering that all three examples exhibit significant community structure, we thus predict that there are many intuitive clusterings being structurally close (with respect to lattice structure) and that most suitable clustering algorithms probably identify one of them.
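The statement that the greedy algorithm stays within 10% of the optimum can be recomputed from the modularity scores reported for the three networks. The snippet below is a small arithmetic check only; the dictionary keys and the function name are our own labels.

```python
# (optimum, greedy) modularity scores as reported for the three examples.
scores = {
    "karate club": (0.431, 0.397),
    "political books": (0.527, 0.502),
    "dolphins": (0.529, 0.496),
}


def relative_gap(optimum, greedy):
    """Fraction by which the greedy score falls short of the optimum."""
    return 1 - greedy / optimum


gaps = {name: relative_gap(*pair) for name, pair in scores.items()}
worst = max(gaps, key=gaps.get)
```

The largest gap occurs for the karate club network, at just under 8%, while the other two instances stay below 7%.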

VIII. CONCLUSION

This paper represents the first formal assessment of the optimization of a popular clustering index known as modularity. We have

settled the open question about the complexity status of modular-

ity maximization by proving its NP-hardness, in particular, by

proving NP-completeness in the strong sense for the underlying

decision problem. We show that this even holds for the restricted

version with a bound of two on the number of clusters. This

justifies the further investigation of approximation algorithms and

heuristics, such as the widespread greedy approach. For the latter

we prove a first lower bound on the approximation factor. Our

analysis of the greedy algorithm also includes a brief comparison

with the optimum clustering which is calculated via ILP on

some real-world instances, thus encouraging a reconsideration of

previous results. Following is a list of the main results derived in

this paper.

• Modularity can be defined as a normalized tradeoff between

edges covered by clusters and squared cluster degree sums.

(see Equation (1))

• There is a formulation of modularity maximization as integer

linear program. (Section II-B)

• There is a clustering with maximum modularity without sin-

gleton clusters of degree 1 and without clusters representing

disconnected subgraphs. Isolated nodes have no impact on

modularity. (Corollary 3.2, Lemmata 3.3, 3.4)

• The clustering of maximum modularity changes in a global,

non-trivial fashion even for simplest graph perturbation.

(Section III-A)

• For any clustering C of any graph G the modularity value satisfies −1/2 ≤ q(C) < 1. (Lemma 3.1)

• Finding a clustering with maximum modularity is NP-hard,

both for the general case and when restricted to clusterings

with exactly or at most two clusters. (Theorems 4.4 and 4.8)

• With a worst tie-breaking strategy the greedy agglomeration

algorithm has no worst-case approximation factor (Theo-

rem 5.1), with an arbitrary tie-breaking strategy the worst-

case factor is at least 2. (Theorem 5.5)

• A clustering of maximum modularity for cliques of size n consists of a single cluster (Theorem 6.3), for cycles of size n of approximately √n clusters of size √n each. (Theorem 6.7)

For the future we plan an extended analysis and the development

of a clustering algorithm with provable performance guarantees.

The special properties of the measure, its popularity in application

domains and the absence of fundamental theoretical insights

hitherto, render further mathematically rigorous treatment of

modularity necessary.

REFERENCES

[1] M. E. J. Newman and M. Girvan, “Finding and evaluating community

structure in networks,” Physical Review E, vol. 69, no. 026113, 2004.

[Online]. Available: http://link.aps.org/abstract/PRE/v69/e026113

[2] S. Fortunato and M. Barthelemy, “Resolution Limit in Community

Detection.” Proceedings of the National Academy of Sciences, vol. 104,

no. 1, pp. 36–41, 2007.

[3] E. Ziv, M. Middendorf, and C. Wiggins, “Information-Theoretic Ap-

proach to Network Modularity.” Physical Review E, vol. 71, no. 046117,

2005.

[4] S. Muff, F. Rao, and A. Caflisch, “Local Modularity Measure for

Network Clusterizations.” Physical Review E, vol. 72, no. 056107, 2005.

[5] P. Fine, E. D. Paolo, and A. Philippides, “Spatially Constrained Net-

works and the Evolution of Modular Control Systems.” in 9th Intl.

Conference on the Simulation of Adaptive Behavior (SAB), 2006.

[6] M. Gaertler, R. Görke, and D. Wagner, “Significance-Driven Graph Clustering,” in Proceedings of the 3rd International Conference on Algorithmic Aspects in Information and Management (AAIM’07), ser. Lecture Notes in Computer Science. Springer-Verlag, June 2007, pp. 11–26.

[7] M. E. J. Newman, “Fast Algorithm for Detecting Community Structure in Networks,” Physical Review E, vol. 69, no. 066133, 2004.

[8] A. Clauset, M. E. J. Newman, and C. Moore, “Finding community structure in very large networks,” Physical Review E, vol. 70, no. 066111, 2004. [Online]. Available: http://link.aps.org/abstract/PRE/v70/e066111

[9] M. Newman, “Modularity and Community Structure in Networks.” in Proceedings of the National Academy of Sciences, 2005, pp. 8577–8582.

[10] S. White and P. Smyth, “A Spectral Clustering Approach to Finding Communities in Graph.” in SIAM Data Mining Conference, 2005.

[11] R. Guimerà, M. Sales-Pardo, and L. A. N. Amaral, “Modularity from Fluctuations in Random Graphs and Complex Networks.” Physical Review E, vol. 70, no. 025101, 2004.

[12] J. Reichardt and S. Bornholdt, “Statistical Mechanics of Community Detection.” Physical Review E, vol. 74, no. 016110, 2006.

[13] J. Duch and A. Arenas, “Community Detection in Complex Networks using Extremal Optimization.” Physical Review E, vol. 72, no. 027104, 2005.

[14] M. Gaertler, “Clustering,” in Network Analysis: Methodological Foundations, ser. Lecture Notes in Computer Science, U. Brandes and T. Erlebach, Eds. Springer-Verlag, February 2005, vol. 3418, pp. 178–215. [Online]. Available: http://springerlink.metapress.com/openurl.asp?genre=article&issn=0302-9743&volume=3418&spage=178

[15] L. Danon, A. Díaz-Guilera, J. Duch, and A. Arenas, “Comparing community structure identification,” Journal of Statistical Mechanics, 2005.

[16] M. R. Garey and D. S. Johnson, Computers and Intractability. A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, 1979.

[17] M. Newman, “Analysis of Weighted Networks,” Cornell University, Santa Fe Institute, University of Michigan, Tech. Rep., jul 2004.

[18] C. J. Alpert and A. B. Kahng, “Recent Directions in Netlist Partitioning: A Survey,” Integration: The VLSI Journal, vol. 19, no. 1-2, pp. 1–81, 1995. [Online]. Available: http://vlsicad.cs.ucla.edu/∼cheese/survey.html

[19] E. Hartuv and R. Shamir, “A Clustering Algorithm based on Graph Connectivity,” Information Processing Letters, vol. 76, no. 4-6, pp. 175–181, 2000. [Online]. Available: http://citeseer.nj.nec.com/hartuv99clustering.html

[20] S. Vempala, R. Kannan, and A. Vetta, “On Clusterings - Good, Bad and Spectral,” in Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science (FOCS’00), 2000, pp. 367–378.

[21] I. Giotis and V. Guruswami, “Correlation Clustering with a Fixed Number of Clusters,” in Proceedings of the 17th Annual ACM–SIAM Symposium on Discrete Algorithms (SODA’06), New York, NY, USA, 2006, pp. 1167–1176. [Online]. Available: http://portal.acm.org/citation.cfm?id=1109557.1109686#


[22] T. Bui, S. Chaudhuri, F. Leighton, and M. Sipser, “Graph bisection

algorithms with good average case behavior.” Combinatorica, vol. 7,

no. 2, pp. 171–191, 1987.

[23] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: a review,”

ACM Computing Surveys, vol. 31, no. 3, pp. 264–323, 1999.

[24] U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner, “On Modularity - NP-Completeness and Beyond,” ITI Wagner, Faculty of Informatics, Universität Karlsruhe (TH), Tech. Rep. 2006-19, 2006.

[25] W. W. Zachary, “An Information Flow Model for Conflict and Fission

in Small Groups,” Journal of Anthropological Research, vol. 33, pp.

452–473, 1977.

[26] M. E. J. Newman and M. Girvan, “Mixing Patterns and Community

Structure in Networks,” in Statistical Mechanics of Complex Networks,

ser. Lecture Notes in Physics, R. Pastor-Satorras, M. Rubi, and A. Diaz-

Guilera, Eds. Springer-Verlag, 2003, vol. 625, pp. 66–87.

[27] D. Delling, M. Gaertler, R. G¨ orke, and D. Wagner, “Experiments on

Comparing Graph Clusterings,” ITI Wagner, Faculty of Informatics,

Universit¨ at Karlsruhe (TH), Tech. Rep. 2006-16, 2006.

[28] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M.

Dawson, “The bottlenose dolphin community of Doubtful Sound features

a large proportion of long-lasting associations,” Behavioral Ecology and

Sociobiology, vol. 54, no. 4, pp. 396–405, 2003.

[29] M.E. J. Newmanand M.

community structure in networks,” August 2003. [Online]. Available:

http://arxiv.org/abs/cond-mat/0308217

Girvan, “Finding andevaluating

APPENDIX

Proof: [of Corollary 6.2] Coverage of C can be expressed in

terms of cluster sizes as follows:

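$$
\begin{aligned}
|E(\mathcal{C})| &= \binom{n}{2} - \sum_{i=1}^{k} \sum_{j>i} |C_i| \cdot |C_j| \\
&= \binom{n}{2} - \frac{1}{2} \sum_{i=1}^{k} \sum_{j \neq i} |C_i| \cdot |C_j| \\
&= \binom{n}{2} - \frac{1}{2} \sum_{i=1}^{k} |C_i| \cdot \sum_{j \neq i} |C_j| \\
&= \binom{n}{2} - \frac{1}{2} \sum_{i=1}^{k} |C_i| \cdot (n - |C_i|) \\
&= \binom{n}{2} - \frac{1}{2} \Bigl( n^2 - \sum_{i=1}^{k} |C_i|^2 \Bigr) \\
&= -\frac{n}{2} + \frac{1}{2} \sum_{i=1}^{k} |C_i|^2 .
\end{aligned}
$$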

Thus, we obtain

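$$
\begin{aligned}
q(\mathcal{C}) &= -\frac{1}{n-1} + \frac{1}{n(n-1)} \sum_{i=1}^{k} |C_i|^2 - \frac{1}{n^2} \sum_{i=1}^{k} |C_i|^2 \\
&= -\frac{1}{n-1} + \frac{1}{n^2 \cdot (n-1)} \sum_{i=1}^{k} |C_i|^2 ,
\end{aligned}
$$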

which proves the equation.
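The closed form can be cross-checked numerically. The following is a minimal sketch, assuming the usual definition of modularity as the sum over clusters of $|E(C)|/m - (\deg(C)/2m)^2$, where $\deg(C)$ is the sum of node degrees in $C$. It builds a complete graph explicitly, evaluates modularity directly, and compares the result with the expression above. The function names and the example cluster sizes are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import combinations


def modularity(edges, clusters):
    """Direct evaluation: sum over clusters of m_C/m - (deg_C/(2m))^2."""
    m = len(edges)
    label = {v: idx for idx, cluster in enumerate(clusters) for v in cluster}
    intra = [0] * len(clusters)        # edges with both ends in the cluster
    degree_sum = [0] * len(clusters)   # sum of node degrees per cluster
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(
        Fraction(intra[i], m) - Fraction(degree_sum[i], 2 * m) ** 2
        for i in range(len(clusters))
    )


def clique_closed_form(sizes):
    """-1/(n-1) + sum(|C_i|^2) / (n^2 (n-1)) for a clique on n nodes."""
    n = sum(sizes)
    return Fraction(-1, n - 1) + Fraction(sum(c * c for c in sizes), n * n * (n - 1))


def clusters_from_sizes(sizes):
    """Assign consecutive node ids 0..n-1 to clusters of the given sizes."""
    clusters, start = [], 0
    for c in sizes:
        clusters.append(list(range(start, start + c)))
        start += c
    return clusters


sizes = [3, 2, 2]                      # hypothetical example clustering
n = sum(sizes)
edges = list(combinations(range(n), 2))  # complete graph K_n
direct = modularity(edges, clusters_from_sizes(sizes))
closed = clique_closed_form(sizes)
print(direct, closed)
```

Exact rational arithmetic is used so that the two values can be compared for equality rather than up to floating-point error.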

Proof: [of Proposition 6.5] Since k and n are given, minimizing F is equivalent to minimizing $\sum_i x_i^2$. Thus let us rewrite this term:

$$
\begin{aligned}
\sum_{i=1}^{k} \Bigl( x_i - \frac{n}{k} \Bigr)^2
&= \sum_{i=1}^{k} x_i^2 - \frac{2n}{k} \sum_{i=1}^{k} x_i + k \cdot \Bigl( \frac{n}{k} \Bigr)^2
 = \sum_{i=1}^{k} x_i^2 - \frac{2n^2}{k} + \frac{n^2}{k} \\
\iff \quad \sum_{i=1}^{k} x_i^2
&= \underbrace{\sum_{i=1}^{k} \Bigl( x_i - \frac{n}{k} \Bigr)^2}_{=:\, h(x)} + \frac{n^2}{k}
\end{aligned}
$$

Thus minimizing F is equivalent to minimizing h. If r is 0, then $h(x^*) = 0$. For every other vector y the function h is strictly positive, since at least one summand is positive. Thus $x^*$ is a global optimum.

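Let r > 0. First, we show that every vector $x \in D^{(k)}$ that is close to $(\frac{n}{k}, \dots, \frac{n}{k})$ has (in principle) the form of $x^*$. Let $x \in D^{(k)} \cap [\lfloor \frac{n}{k} \rfloor, \lceil \frac{n}{k} \rceil]^k$; then it is easy to verify that there are k − r entries that have value $\lfloor \frac{n}{k} \rfloor$ and the remaining r entries have value $\lceil \frac{n}{k} \rceil$. Any ‘shift of one unit’ between two variables having the same value increases the corresponding cost: Let $\varepsilon := \lceil \frac{n}{k} \rceil - \frac{n}{k}$ and $x_i = x_j = \lceil \frac{n}{k} \rceil$. Replacing $x_i$ with $\lfloor \frac{n}{k} \rfloor$ and $x_j$ with $\lceil \frac{n}{k} \rceil + 1$ changes h by $(\varepsilon - 1)^2 + (\varepsilon + 1)^2 - 2\varepsilon^2 = 2 > 0$. Similarly, in the case of $x_i = x_j = \lfloor \frac{n}{k} \rfloor$, the reassignment $x_i = \lceil \frac{n}{k} \rceil$ and $x_j = \lfloor \frac{n}{k} \rfloor - 1$ causes an increase of h by 2 > 0.

Finally, we show that any vector of $D^{(k)}$ can be reached from $x^*$ by ‘shifting one unit’ between variables. Let $x \in D^{(k)}$ and, without loss of generality, assume that $x_i \le x_{i+1}$ for all i. We define a sequence of elements in $D^{(k)}$ as follows:

1) $x^{(0)} := x^*$

2) if $x^{(i)} \neq x$, define $x^{(i+1)}$ as follows

$$
x_j^{(i+1)} :=
\begin{cases}
x_j^{(i)} - 1 & \text{if } j = \min\{\ell \mid x_\ell^{(i)} > x_\ell\} =: L \\
x_j^{(i)} + 1 & \text{if } j = \max\{\ell \mid x_\ell^{(i)} < x_\ell\} =: L' \\
x_j^{(i)} & \text{otherwise}
\end{cases}
$$

Note that all obtained vectors $x^{(i)}$ are elements of $D^{(k)}$ and meet the condition of $x_j^{(i)} \le x_{j+1}^{(i)}$. Furthermore, we gain the following formula for the cost:

$$
\sum_j \bigl( x_j^{(i+1)} \bigr)^2 = \sum_j \bigl( x_j^{(i)} \bigr)^2 + 2 \bigl( x_{L'}^{(i)} - x_L^{(i)} + 1 \bigr) .
$$

Since $L < L'$, one obtains $x_{L'}^{(i)} \ge x_L^{(i)}$. Thus $x^*$ is a global optimum in $D^{(k)}$.

Proof: [of Lemma 6.6] Note that $h(k) = F(x^*)$, where F is the function of Proposition 6.5 with the given k. Consider first the following equations:

$$
\begin{aligned}
\sum_{i=1}^{k} (x_i^*)^2
&= (k - r) \cdot \Bigl\lfloor \frac{n}{k} \Bigr\rfloor^2 + r \cdot \Bigl\lceil \frac{n}{k} \Bigr\rceil^2 \\
&= (k - r) \frac{(n - r)^2}{k^2} + r \Bigl( \frac{n - r}{k} + 1 \Bigr)^2 \\
&= \frac{n - r}{k} \bigl( (n - r) + 2r \bigr) + r = \frac{n^2 - r^2}{k} + r \\
&= \frac{1}{k} \Bigl( n^2 - \Bigl( n - k \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigr)^2 \Bigr) + n - k \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \\
&= 2n \Bigl\lfloor \frac{n}{k} \Bigr\rfloor - k \Bigl\lfloor \frac{n}{k} \Bigr\rfloor^2 + n - k \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \\
&= \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigl( 2n - k \Bigl( 1 + \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigr) \Bigr) + n .
\end{aligned}
$$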
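Both the optimality of the balanced vector and the closed form for its sum of squares can be checked by brute force for small instances. The sketch below assumes that $D^{(k)}$ is the set of k-tuples of positive integers summing to n (cluster sizes), which is not restated in this appendix; the helper names are illustrative.

```python
def balanced_vector(n, k):
    """Candidate optimum x*: r entries ceil(n/k) and k - r entries floor(n/k)."""
    f, r = divmod(n, k)
    return [f + 1] * r + [f] * (k - r)


def sorted_vectors(n, k, smallest=1):
    """Yield every non-decreasing k-tuple of integers >= smallest summing to n."""
    if k == 1:
        if n >= smallest:
            yield (n,)
        return
    for first in range(smallest, n // k + 1):
        for rest in sorted_vectors(n - first, k - 1, first):
            yield (first,) + rest


def sum_of_squares(x):
    return sum(v * v for v in x)


def closed_form(n, k):
    """floor(n/k) * (2n - k * (1 + floor(n/k))) + n, the last line of the chain."""
    f = n // k
    return f * (2 * n - k * (1 + f)) + n


mismatches = []
for n in range(1, 15):
    for k in range(1, n + 1):
        brute = min(sum_of_squares(x) for x in sorted_vectors(n, k))
        balanced = sum_of_squares(balanced_vector(n, k))
        if not (brute == balanced == closed_form(n, k)):
            mismatches.append((n, k))

print(mismatches)  # an empty list means all three values agree
```

Enumerating only non-decreasing tuples is sufficient here because the sum of squares does not depend on the order of the entries.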

Maximizing modularity is equivalent to minimizing the expression $k/n + \frac{1}{n^2} \sum_i x_i^2$ for $(x_i) \in \bigcup_{j=1}^{n} D^{(j)}$. Note that every vector $(x_i)$ can be realized as a clustering with connected clusters. Since we have characterized the global minima for fixed k, it is sufficient to find the global minima by varying k.

Proof: [of Theorem 6.7] First, we show that the function h can be bounded by the inequalities given in (9) and is monotonically increasing (decreasing) for certain choices of k.

$$
kn + \frac{n^2}{k} \;\le\; h(k) \;\le\; kn + \frac{n^2}{k} + \frac{k}{4} . \tag{9}
$$

In order to verify the Inequalities (9), let $\varepsilon_k$ be defined as $n/k - \lfloor n/k \rfloor$ (≥ 0). Then the definition of h can be rewritten as follows:

$$
\begin{aligned}
h(k) &= kn + n + \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigl( 2n - \Bigl( 1 + \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigr) k \Bigr) \\
&= kn + n + \Bigl( \frac{n}{k} - \varepsilon_k \Bigr) \Bigl( 2n - \Bigl( 1 + \frac{n}{k} - \varepsilon_k \Bigr) k \Bigr) \\
&= kn + n + \frac{2n^2}{k} - (1 - \varepsilon_k) n - \frac{n^2}{k} - 2n\varepsilon_k + (1 - \varepsilon_k) k \varepsilon_k + n \varepsilon_k \\
&= kn + \frac{n^2}{k} + (1 - \varepsilon_k) \varepsilon_k k .
\end{aligned}
$$

Replacing the term $(1 - \varepsilon_k) \varepsilon_k k$ by a lower (upper) bound of 0 (k/4) proves the given statements.

Second, the function h is monotonically increasing for $k \ge \frac{1}{2} + \sqrt{\frac{1}{4} + n}$ and monotonically decreasing for $k \le n / \sqrt{n + \sqrt{n}} - 1$. In order to prove the first part, it is sufficient to show that h(k) ≤ h(k + 1) for every suitable k.

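$$
\begin{aligned}
h(k + 1) - h(k)
&= (k + 1) n + n + \Bigl\lfloor \frac{n}{k+1} \Bigr\rfloor \Bigl( 2n - \Bigl( 1 + \Bigl\lfloor \frac{n}{k+1} \Bigr\rfloor \Bigr) (k + 1) \Bigr)
 - kn - n - \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigl( 2n - \Bigl( 1 + \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigr) k \Bigr) \\
&= n + 2n \Bigl( \Bigl\lfloor \frac{n}{k+1} \Bigr\rfloor - \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigr)
 - (k + 1) \Bigl( 1 + \Bigl\lfloor \frac{n}{k+1} \Bigr\rfloor \Bigr) \Bigl\lfloor \frac{n}{k+1} \Bigr\rfloor
 + k \Bigl( 1 + \Bigl\lfloor \frac{n}{k} \Bigr\rfloor \Bigr) \Bigl\lfloor \frac{n}{k} \Bigr\rfloor
\end{aligned}
$$

Since $\lfloor \cdot \rfloor$ is integer-valued and $\frac{n}{k} - \frac{n}{k+1} = \frac{n}{k(k+1)} \le 1$ for the values of k considered, we have $\lfloor \frac{n}{k} \rfloor - \lfloor \frac{n}{k+1} \rfloor \in \{0, 1\}$, and one obtains:

$$
h(k + 1) - h(k) =
\begin{cases}
n - \lfloor \frac{n}{k} \rfloor^2 - \lfloor \frac{n}{k} \rfloor & \text{if } \lfloor \frac{n}{k} \rfloor = \lfloor \frac{n}{k+1} \rfloor \\[4pt]
-n - \lfloor \frac{n}{k} \rfloor^2 + \lfloor \frac{n}{k} \rfloor + 2k \lfloor \frac{n}{k} \rfloor & \text{otherwise}
\end{cases}
\tag{10}
$$

In the second case, $\lfloor \frac{n}{k+1} \rfloor = \lfloor \frac{n}{k} \rfloor - 1$ gives $n < (k + 1) \lfloor \frac{n}{k} \rfloor$, and hence $-n - \lfloor \frac{n}{k} \rfloor^2 + \lfloor \frac{n}{k} \rfloor + 2k \lfloor \frac{n}{k} \rfloor > n - \lfloor \frac{n}{k} \rfloor^2 - \lfloor \frac{n}{k} \rfloor$. It is therefore sufficient to show that $n - \lfloor \frac{n}{k} \rfloor^2 - \lfloor \frac{n}{k} \rfloor \ge 0$. This inequality is fulfilled if $n - (n/k)^2 - n/k \ge 0$. Solving the quadratic equation leads to $k \ge \frac{1}{2} + \sqrt{\frac{1}{4} + n}$.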

Using the above bound, for the second part, it is sufficient to show that

$$
kn + \frac{n^2}{k} - (k + 1) n - \frac{n^2}{k + 1} - \frac{k + 1}{4} \;\ge\; 0 , \tag{11}
$$

since this implies that the upper bound of h(k + 1) is smaller than (the lower bound of) h(k). One can rewrite the left side of Inequality (11) as:

$$
kn + \frac{n^2}{k} - (k + 1) n - \frac{n^2}{k + 1} - \frac{k + 1}{4} = -n + \frac{n^2}{k(k + 1)} - \frac{k + 1}{4} .
$$

Since h(k) − h(k+1) is monotonically decreasing for $0 \le k \le \sqrt{n}$, it is sufficient to show that h(k) − h(k+1) is non-negative for the maximum value of k. We show that the lower bound $h^-(k) := -n + n^2/(k + 1)^2 - (k + 1)/4$ is non-negative.

$$
\begin{aligned}
h^-\Bigl( \frac{n}{\sqrt{n + \sqrt{n}}} - 1 \Bigr)
&= -n + \frac{n^2 (n + \sqrt{n})}{n^2} - \frac{n}{4 \sqrt{n + \sqrt{n}}} \\
&= \sqrt{n} - \underbrace{\frac{n}{4 \sqrt{n + \sqrt{n}}}}_{\le \frac{1}{4} \sqrt{n}} \;\ge\; 0
\end{aligned}
$$

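Summarizing, the number of clusters k (of an optimum clustering) can only be contained in the given interval, since outside the function h is either monotonically increasing or decreasing. The length of the interval is less than

$$
\frac{1}{2} + \underbrace{\sqrt{\frac{1}{4} + n} - \frac{n}{\sqrt{n + \sqrt{n}}}}_{=:\, \ell(n)} + 1 .
$$

The function $\ell(n)$ can be rewritten as follows:

$$
\begin{aligned}
\ell(n) &= \frac{\sqrt{\bigl( \frac{1}{4} + n \bigr) \bigl( n + \sqrt{n} \bigr)} - n}{\sqrt{n + \sqrt{n}}} \\
&\le \frac{n + \frac{1 + \varepsilon}{2} \sqrt{n} - n}{\sqrt{n + \sqrt{n}}} \qquad (12) \\
&= \frac{1 + \varepsilon}{2} \sqrt{\frac{n}{n + \sqrt{n}}} \;\le\; \frac{1 + \varepsilon}{2} ,
\end{aligned}
$$

for every positive ε. Inequality (12) is due to the fact that

$$
\Bigl( \frac{1}{4} + n \Bigr) \bigl( n + \sqrt{n} \bigr)
= n^2 + n \sqrt{n} + \frac{1}{4} \bigl( n + \sqrt{n} \bigr)
\;\le\; n^2 + 2 \, \frac{1 + \varepsilon}{2} \, n \sqrt{n} + \frac{(1 + \varepsilon)^2}{4} \, n
= \Bigl( n + \frac{1 + \varepsilon}{2} \sqrt{n} \Bigr)^2 ,
$$

for sufficiently large n.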
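The interval for the optimal number of clusters can also be checked empirically. The following sketch assumes $h(k) = kn + n + \lfloor n/k \rfloor (2n - (1 + \lfloor n/k \rfloor) k)$, the form used in the proof of Theorem 6.7, and takes the two monotonicity thresholds from that proof as the interval endpoints. It enumerates all minimizers of h for small n and records any that fall outside the interval. The range of n and the floating-point tolerance are arbitrary illustrative choices.

```python
import math


def h(n, k):
    """h(k) = kn + n + floor(n/k) * (2n - (1 + floor(n/k)) * k)."""
    f = n // k
    return k * n + n + f * (2 * n - (1 + f) * k)


def minimizers(n):
    """All k in 1..n at which h attains its minimum."""
    values = {k: h(n, k) for k in range(1, n + 1)}
    best = min(values.values())
    return [k for k, v in values.items() if v == best]


def interval(n):
    """Endpoints taken from the two monotonicity thresholds in the proof."""
    lower = n / math.sqrt(n + math.sqrt(n)) - 1
    upper = 0.5 + math.sqrt(0.25 + n)
    return lower, upper


TOLERANCE = 1e-9  # guards against rounding when a minimizer hits an endpoint
outside = []
for n in range(1, 301):
    lower, upper = interval(n)
    for k in minimizers(n):
        if not (lower - TOLERANCE <= k <= upper + TOLERANCE):
            outside.append((n, k))

print(outside)  # an empty list means every minimizer lies in the interval
```

Ties occur when n = k(k + 1); in that case both k and k + 1 are minimizers, and k + 1 coincides with the upper endpoint, which is why a small tolerance is applied.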

Ulrik Brandes is a full professor of computer

science at the University of Konstanz. He graduated

from RWTH Aachen in 1994, obtained his PhD and

Habilitation from the University of Konstanz in 1999

and 2002, and was an associate professor at the Uni-

versity of Passau until 2003. His research interests

include algorithmic graph theory, graph drawing,

social network analysis, information visualization,

and algorithm engineering.


Daniel Delling studied computer science at Universität Karlsruhe, Germany, and received his Diplom in 2006. Since then he has been a Ph.D. student at the chair of Algorithmics I of the Faculty of Informatics in Karlsruhe. His research focuses on computations of shortest paths in large, dynamic, time-dependent graphs and on graph clustering.

Marco Gaertler studied mathematics at Universität Konstanz, Germany, and received his Diplom in 2002. In 2007 he received his Ph.D. in computer science at the chair of Algorithmics I of the Faculty of Informatics in Karlsruhe. As a postdoc, his research focuses on the clustering of graphs both in a static and in an evolving environment and on the analytic visualization of networks.

Robert Görke studied technical mathematics at Universität Karlsruhe, Germany, and received his Diplom in 2005. Since then he has been a Ph.D. student at the chair of Algorithmics I of the Faculty of Informatics in Karlsruhe. His research focuses on the clustering of graphs both in a static and in an evolving environment and on the analytic visualization of networks.

Martin Hoefer studied computer science at Technische Universität Clausthal, Germany, and received his Diplom in 2004. Since then he has been a Ph.D. student at the chair of Algorithmics at the University of Konstanz. His research centers around efficient graph algorithms, graph theory, algorithmic game theory, and combinatorial optimization.

Zoran Nikoloski received a B.S. degree in Computer Science from Graceland University, Lamoni, IA, USA, in 2001, and a Ph.D. degree in Computer Science from the University of Central Florida, Orlando, FL, USA, in 2005. He worked as a postdoctoral researcher at the Department of Applied Mathematics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic, from 2004 to 2007. He is currently a postdoctoral researcher at the Max-Planck Institute for Molecular Plant Physiology, Potsdam-Golm, and the Institute for Biochemistry and Biology, University of Potsdam, Potsdam, Germany.

Dorothea Wagner received the Diplom and Ph.D. degrees in mathematics from the Rheinisch-Westfälische Technische Hochschule at Aachen, Germany, in 1983 and 1986, respectively. In 1992, she received the Habilitation degree from the Department for Mathematics of the Technische Universität Berlin. Until 2003 she was a full professor of computer science at the University of Konstanz, and since then has held this position at Universität Karlsruhe. Her research interests include discrete optimization, graph algorithms, and algorithm engineering. Alongside numerous positions on editorial boards, she has been vice president of the German research association (DFG) since 2007.