Inferring Community-driven Structure in Complex Networks
Abstract and Figures
Despite a long tradition in the study of graphs and relational data, for decades the analysis of complex networks was limited by difficulties in data collection and by computational burdens. The advent of new technologies in the life sciences, as well as in daily life, has shed light on the many interconnections that our world features, from friendships and collaborations between individuals or organizations to functional couplings between cellular molecules. This has greatly facilitated the collection of relational data, fostering an unprecedented interest in network science. Understanding the relations encoded in complex networks, however, remains a challenging task, and statistical methods that help to summarize and simplify complex networks are needed. In this thesis we show that one can often gain deep insight into a network by focusing on communities, i.e. clusters of nodes, and on the relations that exist between them. We begin by presenting NEAT, a network-based test that makes it possible to assess relations between gene sets in a gene interaction network. NEAT extends traditional gene enrichment analysis tests by incorporating information on interactions between genes, and it overcomes some limitations of existing network enrichment analysis approaches. Then, we propose two extended stochastic blockmodels that make it possible to infer the relations between communities from the relations between pairs of individuals in a social network. We advocate the use of penalized inference to estimate these models, with the aim of deriving a sparse reduced graph between communities. Applying these models to bill cosponsorship networks in the Italian Chamber of Deputies allows us to reconstruct the pattern of collaborations between Italian political parties from 2001 to 2015. Finally, we propose a novel clustering strategy for sequences of graphs, based on mixtures of generalized linear models. We show that the proposed clustering method is not only capable of retrieving subpopulations of networks within a cross-sectional or longitudinal sequence of networks, but also allows one to characterize them directly by considering each of the components that form the mixture model.
Figures - uploaded by Mirko Signorelli