Article

On Modularity Clustering

Univ. of Konstanz, Konstanz
IEEE Transactions on Knowledge and Data Engineering (Impact Factor: 1.82). 03/2008; DOI: 10.1109/TKDE.2007.190689
Source: IEEE Xplore

ABSTRACT Modularity is a recently introduced quality measure for graph clusterings. It has immediately received considerable attention in several disciplines, particularly in the complex systems literature, although its properties are not well understood. We study the problem of finding clusterings with maximum modularity, thus providing theoretical foundations for past and present work based on this measure. More precisely, we prove the conjectured hardness of maximizing modularity both in the general case and with the restriction to cuts and give an Integer Linear Programming formulation. This is complemented by first insights into the behavior and performance of the commonly applied greedy agglomerative approach.
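
For orientation, the measure discussed in the abstract can be sketched in a few lines. The abstract itself does not state the formula, so the sketch below assumes the standard Newman-Girvan definition for an undirected, unweighted graph with m edges: each cluster contributes its fraction of intra-cluster edges minus the squared fraction of edge endpoints (degree sum divided by 2m) that fall in it. The function name `modularity` and the `cluster_of` mapping are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Modularity of a clustering (assumed Newman-Girvan definition).

    edges      -- list of (u, v) pairs of an undirected, unweighted graph
    cluster_of -- dict mapping every node to a cluster label (hypothetical name)
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    intra = defaultdict(int)   # number of edges with both endpoints in cluster c
    degree = defaultdict(int)  # sum of node degrees over cluster c

    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1

    # Coverage of each cluster minus its expected coverage under the null model.
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "all" for node in range(6)}

print(modularity(edges, two_clusters))
print(modularity(edges, one_cluster))
```

Under this assumed definition, placing every node in a single cluster always yields a value of zero, while the split into the two triangles scores higher. Maximizing this quantity over all clusterings is the problem whose hardness the abstract refers to.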

  • ABSTRACT: Graph partitioning is a fundamental problem in parallel computing for big graph data. Many graph partitioning algorithms have been proposed for various applications, such as matrix computations and PageRank, but none has paid attention to random walks. Random walks are a widely used method for exploring graph structure in many fields. The challenges of graph partitioning for random walks include the large number of communications between partitions, heavy replication of vertices, and unbalanced partitions. In this paper, we propose a feasible graph partitioning framework for random walks implemented by parallel computing on big graphs. The framework is based on two optimization functions that reduce bandwidth, memory, and storage cost under the condition that load balance is guaranteed. Within this framework, several greedy graph partitioning algorithms are proposed. We also propose five metrics from different perspectives to evaluate the performance of these algorithms. Experiments on real-world big graph data sets show that the algorithms in the framework can solve the graph partitioning problem for random walks for different needs; e.g., the best result is an improvement of more than 70 times in reducing the number of communications.
    12/2014;
  • ABSTRACT: Networks are a convenient way to represent complex systems of interacting entities. Many networks contain "communities" of nodes that are more densely connected to each other than to nodes in the rest of the network. In this paper, we investigate the detection of communities in temporal networks represented as multilayer networks. As a focal example, we study time-dependent financial-asset correlation networks. We first argue that the use of the "modularity" quality function (which is defined by comparing edge weights in an observed network to expected edge weights in a "null network") is application-dependent. We differentiate between "null networks" and "null models" in our discussion of modularity maximization, and we highlight that the same null network can correspond to different null models. We then investigate a multilayer modularity-maximization problem to identify communities in temporal networks. Our multilayer analysis depends only on the form of the maximization problem and not on the specific quality function that one chooses. We introduce a diagnostic to measure persistence of community structure in a multilayer network partition. We prove several results that describe how the multilayer maximization problem measures a trade-off between static community structure within layers and higher values of persistence across layers. We also discuss some implementation issues that the popular "Louvain" heuristic faces with temporal multilayer networks and suggest ways to mitigate them.
    12/2014;
  • ABSTRACT: Community detection has attracted increasing attention during the past decade, and many algorithms have been proposed to find the underlying community structure in a given network. Many of these algorithms are based on modularity maximization, and these methods suffer from the resolution limit. In order to detect the underlying cluster structure, we propose a new convex formulation to decompose a partially observed adjacency matrix of a network into low-rank and sparse components. In such a decomposition, the low-rank component encodes the cluster structure under certain assumptions. We also devise an alternating direction method of multipliers with an increasing penalty sequence to solve this problem, and compare it with the Louvain method, which maximizes the modularity, on synthetic randomly generated networks. Numerical results show that our method outperforms the Louvain method on the randomly generated networks when the variance among cluster sizes increases. Moreover, empirical results also demonstrate that our formulation is indeed tighter than the robust PCA formulation and is able to find the true clustering when the robust PCA formulation fails.
    10/2014;
