Chapter

Bayesian Nonparametrics for Sparse Dynamic Networks


Abstract

In this paper we propose a Bayesian nonparametric approach to modelling sparse time-varying networks. A positive parameter is associated with each node of a network, which models the sociability of that node. Sociabilities are assumed to evolve over time, and are modelled via a dynamic point process model. The model is able to capture long-term evolution of the sociabilities. Moreover, it yields sparse graphs, where the number of edges grows subquadratically with the number of nodes. The evolution of the sociabilities is described by a tractable time-varying generalised gamma process. We provide some theoretical insights into the model and apply it to three datasets: a simulated network, a network of hyperlinks between communities on Reddit, and a network of co-occurrences of words in Reuters news articles after the September 11th attacks.

Keywords: Bayesian nonparametrics, Poisson random measures, Networks, Random graphs, Sparsity, Point processes


... Vertex popularity models (Caron and Fox, 2017; Cai et al., 2016; Crane and Dempsey, 2016a; Palla et al., 2016; Herlau and Schmidt, 2016; Williamson, 2016) are a simple yet powerful class of network models. There are a number of different versions, but all share the common feature that each vertex is associated with a nonnegative weight representing how likely it is to take part in an edge. ...
... These have appeared in previous work as "graph frequency models" (Cai et al., 2016) or left unnamed, and the weights $w_k$ are occasionally referred to as "sociability parameters" (Caron and Fox, 2017; Palla et al., 2016). ...
Preprint
Trait allocations are a class of combinatorial structures in which data may belong to multiple groups and may have different levels of belonging in each group. Often the data are also exchangeable, i.e., their joint distribution is invariant to reordering. In clustering---a special case of trait allocation---exchangeability implies the existence of both a de Finetti representation and an exchangeable partition probability function (EPPF), distributional representations useful for computational and theoretical purposes. In this work, we develop the analogous de Finetti representation and exchangeable trait probability function (ETPF) for trait allocations, along with a characterization of all trait allocations with an ETPF. Unlike previous feature allocation characterizations, our proofs fully capture single-occurrence "dust" groups. We further introduce a novel constrained version of the ETPF that we use to establish an intuitive connection between the probability functions for clustering, feature allocations, and trait allocations. As an application of our general theory, we characterize the distribution of all edge-exchangeable graphs, a class of recently-developed models that captures realistic sparse graph sequences.
... The vertex popularity model (Caron and Fox, 2014; Cai et al., 2016; Crane and Dempsey, 2016a; Palla et al., 2016; Herlau and Schmidt, 2016) is a simple yet powerful network model. In the vertex popularity model, all (potentially infinitely many) vertices $k \in \mathbb{N}$ are associated with a weight $w_k \in (0, 1)$ such that $\sum_k w_k < \infty$, and we sample an edge between vertex $k$ and $\ell$ with probability proportional to $w_k w_\ell$. ...
... This has appeared in previous work under the name "graph frequency model" (Cai et al., 2016) or left unnamed, and the weights $w_k$ are occasionally referred to as "sociability parameters" (Caron and Fox, 2014; Palla et al., 2016). ...
Article
Clustering requires placing data into mutually exclusive groups, while feature allocations allow each datum to exhibit binary membership in multiple groups. But often, data points can not only belong to multiple groups but have different levels of belonging in each group. We refer to the corresponding relaxation of these combinatorial structures as a "trait allocation." The exchangeable partition probability function (EPPF) allows for practical inference in clustering models, and the Kingman paintbox provides a representation for clustering that allows us to study all exchangeable clustering models at once. We provide the analogous exchangeable trait probability function (ETPF) and paintbox representation for trait allocations, along with a characterization of all trait allocations with an ETPF. Our proofs avoid the unnecessary auxiliary randomness of previous specialized constructions and---unlike previous feature allocation characterizations---fully capture single-occurrence "dust" groups. We further introduce a novel constrained version of the ETPF that we use to establish the first direct connection between the probability functions for clustering, feature allocations, and trait allocations. As an application of our general theory, we characterize the distribution of all edge-exchangeable graphs, a recently-developed model that captures realistic sparse graph sequences.
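The vertex popularity model quoted in the excerpt above is one example of an edge-exchangeable graph, and its sampling rule is easy to illustrate. The following is a minimal sketch only, assuming a finite truncation to 20 vertices and geometrically decaying weights; the function name, the weights, and the truncation are illustrative choices, not taken from the cited works.

```python
import random


def sample_vertex_popularity_edges(weights, n_edges, seed=0):
    """Draw n_edges edges whose endpoints (k, l) occur with probability
    proportional to weights[k] * weights[l] (finite truncation, assumed)."""
    rng = random.Random(seed)
    vertices = list(range(len(weights)))
    edges = []
    for _ in range(n_edges):
        # Picking the two endpoints independently, each with probability
        # proportional to its weight, makes the ordered pair (k, l) occur
        # with probability proportional to w_k * w_l. Self-loops (k == l)
        # are allowed in this sketch.
        k, l = rng.choices(vertices, weights=weights, k=2)
        edges.append((k, l))
    return edges


# Hypothetical summable weights in (0, 1): w_k = 2^-(k+1).
weights = [0.5 ** (k + 1) for k in range(20)]
edges = sample_vertex_popularity_edges(weights, n_edges=50)

# Only vertices that take part in at least one edge appear in the graph.
active = {v for edge in edges for v in edge}
print(f"{len(edges)} edges over {len(active)} active vertices")
```

Because low-weight vertices rarely appear in any edge, the set of active vertices grows slowly as more edges are drawn, which is the mechanism behind the sparsity these models aim for.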
... and Miele (2017)) and related nonparametric graphon-based methods (Pensky (2019)) as well as nonparametric methods for dynamic link prediction (Sarkar, Chakrabarti and Jordan (2014)) and methods from Bayesian nonparametrics (Palla, Caron and Teh (2016)). Other related work includes sparse graphical models that can take account of different time points (Kalaitzis et al. (2013)). ...
Article
We propose a novel way of modelling time-varying networks, by inducing two-way sparsity on local models of node connectivity. This two-way sparsity separately promotes sparsity across time and sparsity across variables (within time). Separation of these two types of sparsity is achieved through a novel prior structure, which draws on ideas from the Bayesian lasso and from copula modelling. We provide an efficient implementation of the proposed model via a Gibbs sampler, and we apply the model to data from neural development. In doing so, we demonstrate that the proposed model is able to identify changes in genomic network structure that match current biological knowledge. Such changes in genomic network structure can then be used by neuro-biologists to identify potential targets for further experimental investigation.
... In statistics, this work covers methods based on Markov processes [Crane et al., 2016], on dynamic Erdős-Rényi graphs [Rosengren and Trapman, 2016], and on sparse regression methods [Kolar et al., 2010]. It also includes work on dynamic community structure [Zhang et al., 2012] and on methods extending the stochastic block model [Xu and Hero III, 2013, Matias and Miele, 2016] and related non-parametric graphon-based methods [Pensky, 2016], as well as non-parametric methods for dynamic link prediction [Sarkar et al., 2014] and methods from Bayesian nonparametrics [Palla et al., 2016]. Other related work includes sparse graphical models which can account for samples/observations taken at different time-points [Kalaitzis et al., 2013]. ...
Article
Network models have become an important topic in modern statistics, and the evolution of network structure over time is an important new area of study, relevant to a range of applications. An important application of statistical network modelling is in genomics: network models are a natural way to describe and analyse patterns of interactions between genes and their products. However, whilst network models are well established in genomics, historically these models have mostly been static network models, ignoring the dynamic nature of genomic processes. In this work, we propose a model to infer dynamic genomic network structure, based on single-cell measurements of gene-expression counts. Our model draws on ideas from the Bayesian lasso and from copula modelling, and is implemented efficiently by combining Gibbs- and slice-sampling techniques. We apply the modelling to data from neural development, and infer changes in network structure which match current biological knowledge, as well as discovering novel network structures which identify potential targets for further experimental investigation by neuro-biologists.
... The continuous time extension of the DTM model, called continuous time dynamic topic model (cDTM), in which the timestamp of each document is considered. Other approaches to model the temporal evolution of text datasets represent word co-occurrences as a dynamic graph over time [Palla et al., 2016]. ...
Thesis
Most current recommendation systems are based on ratings (i.e. numbers between 0 and 5) and try to suggest a content (movie, restaurant...) to a user. These systems usually allow users to provide a text review for this content in addition to ratings. It is hard to extract useful information from raw text while a rating does not contain much information on the content and the user. In this thesis, we tackle the problem of suggesting personalized readable text to users to help them make a quick decision about a content. More specifically, we first build a topic model that predicts personalized movie description from text reviews. Our model extracts distinct qualitative (i.e., which convey opinion) and descriptive topics by combining text reviews and movie ratings in a joint probabilistic model. We evaluate our model on an IMDB dataset and illustrate its performance through comparison of topics. We then study parameter inference in large-scale latent variable models, that include most topic models. We propose a unified treatment of online inference for latent variable models from a non-canonical exponential family, and draw explicit links between several previously proposed frequentist or Bayesian methods. We also propose a novel inference method for the frequentist estimation of parameters, that adapts MCMC methods to online inference of latent variable models with the proper use of local Gibbs sampling. For the specific latent Dirichlet allocation topic model, we provide an extensive set of experiments and comparisons with existing work, where our new approach outperforms all previously proposed methods. Finally, we propose a new class of determinantal point processes (DPPs) which can be manipulated for inference and parameter learning in potentially sublinear time in the number of items. This class, based on a specific low-rank factorization of the marginal kernel, is particularly suited to a subclass of continuous DPPs and DPPs defined on exponentially many items.
We apply this new class to modelling text documents as sampling a DPP of sentences, and propose a conditional maximum likelihood formulation to model topic proportions, which is made possible with no approximation for our class of DPPs. We present an application to document summarization with a DPP on 2 to the power 500 items, where the summaries are composed of readable sentences.
Thesis
We propose two novel approaches for recommender systems and networks. In the first part, we first give an overview of recommender systems before concentrating on low-rank approaches to matrix completion. Building on a probabilistic approach, we propose novel penalty functions on the singular values of the low-rank matrix. By exploiting a mixture model representation of this penalty, we show that a suitably chosen set of latent variables makes it possible to derive an expectation-maximization algorithm to obtain a maximum a posteriori estimate of the completed low-rank matrix. The resulting algorithm is an iterative soft-thresholding algorithm that iteratively adapts the shrinkage coefficients associated with the singular values. The algorithm is simple to implement and can scale to large matrices. We provide numerical comparisons between our approach and recent alternatives, showing the interest of the proposed approach for low-rank matrix completion. In the second part, we first present some background on Bayesian nonparametrics, in particular on completely random measures and their multivariate extension, compound completely random measures. We then propose a novel statistical model for sparse networks with overlapping community structure. The model is based on representing the graph as an exchangeable point process, and naturally generalizes existing probabilistic models with overlapping block structure to the sparse regime. Our construction builds on vectors of completely random measures, and has interpretable parameters, each node being assigned a vector representing its level of affiliation to some latent communities.
We develop methods for simulating this class of random graphs, as well as for performing posterior inference. We show that the proposed approach can recover interpretable structure from two real-world networks and can handle graphs with thousands of nodes and tens of thousands of edges.
Article
Users organize themselves into communities on web platforms. These communities can interact with one another, often leading to conflicts and toxic interactions. However, little is known about the mechanisms of interactions between communities and how they impact users. Here we study intercommunity interactions across 36,000 communities on Reddit, examining cases where users of one community are mobilized by negative sentiment to comment in another community. We show that such conflicts tend to be initiated by a handful of communities---less than 1% of communities start 74% of conflicts. While conflicts tend to be initiated by highly active community members, they are carried out by significantly less active members. We find that conflicts are marked by formation of echo chambers, where users primarily talk to other users from their own community. In the long-term, conflicts have adverse effects and reduce the overall activity of users in the targeted communities. Our analysis of user interactions also suggests strategies for mitigating the negative impact of conflicts---such as increasing direct engagement between attackers and defenders. Further, we accurately predict whether a conflict will occur by creating a novel LSTM model that combines graph embeddings, user, community, and text features. This model can be used to create an early-warning system for community moderators to prevent conflicts. Altogether, this work presents a data-driven view of community interactions and conflict, and paves the way towards healthier online communities.
Article
We propose a dynamic edge exchangeable network model that can capture sparse connections observed in real temporal networks, in contrast to existing models which are dense. The model achieved superior link prediction accuracy on multiple data sets when compared to a dynamic variant of the blockmodel, and is able to extract interpretable time-varying community structures from the data. In addition to sparsity, the model accounts for the effect of social influence on vertices' future behaviours. Compared to the dynamic blockmodels, our model has a smaller latent space. The compact latent space requires a smaller number of parameters to be estimated in variational inference and results in a computationally friendly inference algorithm.
Article
Statistical network modelling has focused on representing the graph as a discrete structure, namely the adjacency matrix. When assuming exchangeability of this array—which can aid in modelling, computations and theoretical analysis—the Aldous–Hoover theorem informs us that the graph is necessarily either dense or empty. We instead consider representing the graph as an exchangeable random measure and appeal to the Kallenberg representation theorem for this object. We explore using completely random measures (CRMs) to define the exchangeable random measure, and we show how our CRM construction enables us to achieve sparse graphs while maintaining the attractive properties of exchangeability. We relate the sparsity of the graph to the Lévy measure defining the CRM. For a specific choice of CRM, our graphs can be tuned from dense to sparse on the basis of a single parameter. We present a scalable Hamiltonian Monte Carlo algorithm for posterior inference, which we use to analyse network properties in a range of real data sets, including networks with hundreds of thousands of nodes and millions of edges.
Article
Many statistical methods for network data parameterize the edge-probability by attributing latent traits to the vertices such as block structure and assume exchangeability in the sense of the Aldous-Hoover representation theorem. Empirical studies of networks indicate that many large, real-world networks have a power-law distribution of the vertices, which in turn implies that the number of edges scales slower than quadratically in the number of vertices. These assumptions are fundamentally irreconcilable as the Aldous-Hoover theorem implies quadratic scaling of the number of edges. Recently Caron and Fox (2014) proposed the use of a different notion of exchangeability due to Kallenberg (2009) and obtained a network model which admits power-law behaviour while retaining desirable statistical properties; however, this model does not capture latent vertex traits such as block-structure. In this work we re-introduce the use of block-structure for network modelling in the new setting and thereby obtain a model which admits the inference of block-structure and edge inhomogeneity. We derive a simple expression for the likelihood and an efficient sampling method. The obtained model is not significantly more difficult to implement than existing methods and performs well on real network datasets.
Article
The stable distribution, in its many parametrizations, is central to many stochastic processes. Many random variables that occur in the study of Lévy processes are related to it. Good progress has been made recently for simulating various quantities related to the stable law. In this note, we survey exact random variate generators for these distributions. Many distributional identities are also reviewed.
Article
In this paper we propose a Bayesian nonparametric model for clustering partial ranking data. We start by developing a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a completely random measure. We characterise the posterior distribution given data, and derive a simple and effective Gibbs sampler for posterior simulation. We then develop a Dirichlet process mixture extension of our model and apply it to investigate the clustering of preferences for college degree programmes amongst Irish secondary school graduates. The existence of clusters of applicants who have similar preferences for degree programmes is established and we determine that subject matter and geographical location of the third level institution characterise these clusters.
Article
Significant efforts have gone into the development of statistical models for analyzing data in the form of networks, such as social networks. Most existing work has focused on modeling static networks, which represent either a single time snapshot or an aggregate view over time. There has been recent interest in statistical modeling of dynamic networks, which are observed at multiple points in time and offer a richer representation of many complex phenomena. In this paper, we present a state-space model for dynamic networks that extends the well-known stochastic blockmodel for static networks to the dynamic setting. We fit the model in a near-optimal manner using an extended Kalman filter (EKF) augmented with a local search. We demonstrate that the EKF-based algorithm performs competitively with a state-of-the-art algorithm based on Markov chain Monte Carlo sampling but is significantly less computationally demanding.
Article
Relational data, like graphs, networks, and matrices, is often dynamic, where the relational structure evolves over time. A fundamental problem in the analysis of time-varying network data is to extract a summary of the common structure and the dynamics of the underlying relations between the entities. Here we build on the intuition that changes in the network structure are driven by the dynamics at the level of groups of nodes. We propose a nonparametric multi-group membership model for dynamic networks. Our model contains three main components: We model the birth and death of individual groups with respect to the dynamics of the network structure via a distance dependent Indian Buffet Process. We capture the evolution of individual node group memberships via a Factorial Hidden Markov model. And, we explain the dynamics of the network structure by explicitly modeling the connectivity structure of groups. We demonstrate our model's capability of identifying the dynamics of latent groups in a number of different types of network data. Experimental results show that our model provides improved predictive performance over existing dynamic network models on future network forecasting and missing link prediction.
Article
The distributional properties of the duration of a recurrent Bessel process straddling an independent exponential time are studied in detail. Although our study may be considered as a particular case of M. Winkel's in [Wink], the infinite divisibility structure of these Bessel durations is particularly rich and we develop algebraic properties for a family of random variables arising from the Lévy measures of these durations.
Article
A random measure $\xi$ on $[0,1]^2$, $[0,1] \times \mathbb{R}_+$ or $\mathbb{R}_+^2$ is said to be separately exchangeable if its distribution is invariant under arbitrary Lebesgue measure-preserving transformations in the two coordinates, and jointly exchangeable if $\xi$ is defined on $[0,1]^2$ or $\mathbb{R}_+^2$ and its distribution is invariant under mappings by a common measure-preserving transformation in both directions. In each case, we derive a general representation of $\xi$ in terms of independent Poisson processes and i.i.d. random variables.
Article
This article discusses the usage of a partition-based Fubini calculus for Poisson processes. The approach is an amplification of Bayesian techniques developed in Lo and Weng for gamma/Dirichlet processes. Applications to models are considered which all fall within an inhomogeneous spatial extension of the size-biased framework used in Perman, Pitman and Yor. Among the results, an explicit partition-based calculus is developed for such models, which also includes a series of important exponential change of measure formulae. These results are applied to obtain results for Lévy-Cox models, identities related to the two-parameter Poisson-Dirichlet process and other processes, generalisations of the Markov-Krein correspondence, and a calculus for extended Neutral to the Right processes, among other things.
Article
This paper investigates properties of the class of graphs based on exchangeable point processes. We provide asymptotic expressions for the number of edges, number of nodes, and degree distributions, identifying four regimes: (i) a dense regime, (ii) a sparse, almost dense regime, (iii) a sparse regime with power-law behaviour, and (iv) an almost extremely sparse regime. We show that, under mild assumptions, both the global and local clustering coefficients converge to constants which may or may not be the same. We also derive a central limit theorem for subgraph counts and for the number of nodes. Finally, we propose a class of models within this framework where one can separately control the latent structure and the global sparsity/power-law properties of the graph.
Article
We propose a novel class of network models for temporal dyadic interaction data. Our goal is to capture a number of important features often observed in social interactions: sparsity, degree heterogeneity, community structure and reciprocity. We propose a family of models based on self-exciting Hawkes point processes in which events depend on the history of the process. The key component is the conditional intensity function of the Hawkes Process, which captures the fact that interactions may arise as a response to past interactions (reciprocity), or due to shared interests between individuals (community structure). In order to capture the sparsity and degree heterogeneity, the base (non time dependent) part of the intensity function builds on compound random measures following Todeschini et al. (2016). We conduct experiments on a variety of real-world temporal interaction data and show that the proposed model outperforms many competing approaches for link prediction, and leads to interpretable parameters.
Article
Abstract: We propose a novel statistical model for sparse networks with overlapping community structure. The model is based on representing the graph as an exchangeable point process, and naturally generalizes existing probabilistic models with overlapping block-structure to the sparse regime. Our construction builds on vectors of completely random measures, and has interpretable parameters, each node being assigned a vector representing its level of affiliation to some latent communities. We develop methods for simulating this class of random graphs, as well as to perform posterior inference. We show that the proposed approach can recover interpretable structure from two real-world networks and can handle graphs with thousands of nodes and tens of thousands of edges.
Article
Many modern network datasets arise from processes of interactions in a population, such as phone calls, email exchanges, co-authorships, and professional collaborations. In such interaction networks, the edges comprise the fundamental statistical units, making a framework for edge-labeled networks more appropriate for statistical analysis. In this context we initiate the study of edge exchangeable network models and explore their basic statistical properties. Several theoretical and practical features make edge exchangeable models better suited to many applications in network analysis than more common vertex-centric approaches. In particular, edge exchangeable models allow for sparse structure and power law degree distributions, both of which are widely observed empirical properties that cannot be handled naturally by more conventional approaches. Our discussion culminates in the Hollywood model, which we identify here as the canonical family of edge exchangeable distributions. The Hollywood model is computationally tractable, admits a clear interpretation, exhibits good theoretical properties, and performs reasonably well in estimation and prediction as we demonstrate on real network datasets. As a generalization of the Hollywood model, we further identify the vertex components model as a nonparametric subclass of models with a convenient stick breaking construction.
Article
Many data sets can be represented as a sequence of interactions between entities, for example communications between individuals in a social network, protein-protein interactions or DNA-protein interactions in a biological context, or vehicles' journeys between cities. In these contexts, there is often interest in making predictions about future interactions, such as who will message whom. A popular approach to network modeling in a Bayesian context is to assume that the observed interactions can be explained in terms of some latent structure. For example, traffic patterns might be explained by the size and importance of cities, and social network interactions might be explained by the social groups and interests of individuals. Unfortunately, while elucidating this structure can be useful, it often does not directly translate into an effective predictive tool. Further, many existing approaches are not appropriate for sparse networks, a class that includes many interesting real-world situations. In this paper, we develop models for sparse networks that combine structure elucidation with predictive performance. We use a Bayesian nonparametric approach, which allows us to predict interactions with entities outside our training set, and allows both the latent dimensionality of the model and the number of nodes in the network to grow in expectation as we see more data. We demonstrate that we can capture latent structure while maintaining predictive power, and discuss possible extensions.
Article
In a recent paper, Caron and Fox suggest a probabilistic model for sparse graphs which are exchangeable when associating each vertex with a time parameter in $\mathbb{R}_+$. Here we show that by generalizing the classical definition of graphons as functions over probability spaces to functions over $\sigma$-finite measure spaces, we can model a large family of exchangeable graphs, including the Caron-Fox graphs and the traditional exchangeable dense graphs as special cases. Explicitly, modelling the underlying space of features by a $\sigma$-finite measure space $(S,\mathcal{S},\mu)$ and the connection probabilities by an integrable function $W\colon S\times S\to [0,1]$, we construct a random family $(G_t)_{t\geq 0}$ of growing graphs such that the vertices of $G_t$ are given by a Poisson point process on $S$ with intensity $t\mu$, with two points $x,y$ of the point process connected with probability $W(x,y)$. We call such a random family a graphon process. We prove that a graphon process has convergent subgraph frequencies (with possibly infinite limits) and that, in the natural extension of the cut metric to our setting, the sequence converges to the generating graphon. We also show that the underlying graphon is identifiable only as an equivalence class over graphons with cut distance zero. More generally, we study metric convergence for arbitrary (not necessarily random) sequences of graphs, and show that a sequence of graphs has a convergent subsequence if and only if it has a subsequence satisfying a property we call uniform regularity of tails. Finally, we prove that every graphon is equivalent to a graphon on $\mathbb{R}_+$ equipped with Lebesgue measure.
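The construction described in this abstract (Poisson-distributed vertex features, then independent edges with probability given by W) can be illustrated with a short simulation. This is a rough sketch under stated assumptions, not the authors' procedure: the feature space is taken to be the positive half-line with Lebesgue measure, the graphon `W(x, y) = exp(-(x + y))` is a hypothetical integrable choice, and the point process is truncated at `x_max`, which only approximates the infinite process.

```python
import math
import random


def sample_graphon_process(t, W, x_max, seed=0):
    """Simulate one graph G_t on the truncated feature space [0, x_max].

    Assumptions (not from the source): Lebesgue base measure on the
    half-line, and truncation at x_max as an approximation.
    """
    rng = random.Random(seed)

    # Homogeneous Poisson point process with intensity t on [0, x_max],
    # generated through independent exponential gaps with rate t.
    points = []
    x = rng.expovariate(t)
    while x <= x_max:
        points.append(x)
        x += rng.expovariate(t)

    # Connect each pair of points independently with probability W(x, y).
    edges = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if rng.random() < W(points[i], points[j]):
                edges.append((i, j))
    return points, edges


def W(x, y):
    # Hypothetical integrable graphon on the positive quadrant.
    return math.exp(-(x + y))


points, edges = sample_graphon_process(t=5.0, W=W, x_max=10.0)
connected = {v for edge in edges for v in edge}
print(f"{len(points)} candidate vertices, {len(connected)} with an edge, "
      f"{len(edges)} edges")
```

Points with large feature values almost never connect under this choice of W, so most candidate vertices stay isolated; counting only the vertices that have at least one edge is one common way to read off the resulting graph.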
Article
A known failing of many popular random graph models is that the Aldous-Hoover Theorem guarantees these graphs are dense with probability one; that is, the number of edges grows quadratically with the number of nodes. This behavior is considered unrealistic in observed graphs. We define a notion of edge exchangeability for random graphs in contrast to the established notion of infinite exchangeability for random graphs --- which has traditionally relied on exchangeability of nodes (rather than edges) in a graph. We show that, unlike node exchangeability, edge exchangeability encompasses models that are known to provide a projective sequence of random graphs that circumvent the Aldous-Hoover Theorem and exhibit sparsity, i.e., sub-quadratic growth of the number of edges with the number of nodes. We show how edge-exchangeability of graphs relates naturally to existing notions of exchangeability from clustering (a.k.a. partitions) and other familiar combinatorial structures.
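The dense and sparse regimes contrasted in this abstract can be written compactly. As a sketch, with $|V_n|$ and $|E_n|$ denoting the number of nodes and edges of the $n$-th graph in a sequence (notation assumed here for illustration, not taken from the abstract):

```latex
\text{dense: } |E_n| = \Theta\!\left(|V_n|^2\right),
\qquad
\text{sparse: } |E_n| = o\!\left(|V_n|^2\right)
\quad \text{as } n \to \infty .
```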
Article
We introduce a class of random graphs that we argue meets many of the desiderata one would demand of a model to serve as the foundation for a statistical analysis of real-world networks. The class of random graphs is defined by a probabilistic symmetry: invariance of the distribution of each graph to arbitrary relabelings of its vertices. In particular, following Caron and Fox, we interpret a symmetric simple point process on $\mathbb{R}_+^2$ as the edge set of a random graph, and formalize the probabilistic symmetry as joint exchangeability of the point process. We give a representation theorem for the class of random graphs satisfying this symmetry via a straightforward specialization of Kallenberg's representation theorem for jointly exchangeable random measures on $\mathbb{R}_+^2$. The distribution of every such random graph is characterized by three (potentially random) components: a nonnegative real $I \in \mathbb{R}_+$, an integrable function $S: \mathbb{R}_+ \to \mathbb{R}_+$, and a symmetric measurable function $W: \mathbb{R}_+^2 \to [0,1]$ that satisfies several weak integrability conditions. We call the triple $(I,S,W)$ a graphex, in analogy to graphons, which characterize the (dense) exchangeable graphs on $\mathbb{N}$. Indeed, the model we introduce here contains the exchangeable graphs as a special case, as well as the "sparse exchangeable" model of Caron and Fox. We study the structure of these random graphs, and show that they can give rise to interesting structure, including sparse graph sequences. We give explicit equations for expectations of certain graph statistics, as well as the limiting degree distribution. We also show that certain families of graphexes give rise to random graphs that, asymptotically, contain an arbitrarily large fraction of the vertices in a single connected component.
Article
Current Bayesian models for dynamic social network data have focused on modelling the influence of evolving unobserved structure on observed social interactions. However, an understanding of how observed social relationships from the past affect future unobserved structure in the network has been neglected. In this paper, we introduce a new probabilistic model for capturing this phenomenon, which we call latent feature propagation, in social networks. We demonstrate our model's capability for inferring such latent structure in varying types of social network datasets, and experimental studies show this structure achieves higher predictive performance on link prediction and forecasting tasks.
Article
The natural habitat of most Bayesian methods is data represented by exchangeable sequences of observations, for which de Finetti's theorem provides the theoretical foundation. Dirichlet process clustering, Gaussian process regression, and many other parametric and nonparametric Bayesian models fall within the remit of this framework; many problems arising in modern data analysis do not. This expository paper provides an introduction to Bayesian models of graphs, matrices, and other data that can be modeled by random structures. We describe results in probability theory that generalize de Finetti's theorem to such data and discuss the relevance of these results to nonparametric Bayesian modeling. With the basic ideas in place, we survey example models available in the literature; applications of such models include collaborative filtering, link prediction, and graph and network analysis. We also highlight connections to recent developments in graph theory and probability, and sketch the more general mathematical foundation of Bayesian methods for other types of data beyond sequences and arrays.
Article
We develop a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a gamma process. We derive a posterior characterization and a simple and effective Gibbs sampler for posterior simulation. We develop a time-varying extension of our model, and apply it to the New York Times lists of weekly bestselling books.
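For orientation, the finite Plackett-Luce model that this abstract extends can be sketched as sequential sampling without replacement: at each step an item is chosen with probability proportional to its positive weight among the items not yet ranked. The function name and the example weights below are hypothetical, and the sketch covers only the classical finite model, not the nonparametric extension or the Gibbs sampler described in the paper.

```python
import random


def sample_plackett_luce(weights, rng):
    """Draw one full ranking from a finite Plackett-Luce model.

    `weights` maps each item to a positive weight; items are picked one at a time,
    each with probability proportional to its weight among the remaining items.
    """
    remaining = dict(weights)
    ranking = []
    while remaining:
        items = list(remaining)
        chosen = rng.choices(items, weights=[remaining[i] for i in items], k=1)[0]
        ranking.append(chosen)
        del remaining[chosen]
    return ranking


rng = random.Random(1)
example_weights = {"a": 5.0, "b": 2.0, "c": 1.0, "d": 0.5}  # illustrative values only
ranking = sample_plackett_luce(example_weights, rng)
print(ranking)
```

Items with larger weights tend to appear earlier in the ranking, but any ordering has positive probability.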
Article
Consider an array of random variables $(X_{i,j})$, $1 \le i,j < \infty$, such that permutations of rows or of columns do not alter the distribution of the array. We show that such an array may be represented as functions $f(\alpha, \xi_i, \eta_j, \lambda_{i,j})$ of underlying i.i.d. random variables. This result may be useful in characterizing arrays with additional structure. For example, we characterize random matrices whose distribution is invariant under orthogonal rotation, confirming a conjecture of Dawid.
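The representation can be made concrete with a small simulation: draw one global variable, one variable per row, one per column, and one per entry, all i.i.d. uniform, and apply a fixed function f entrywise. The sketch below assumes uniform variables on [0, 1] and uses a hypothetical binary f chosen purely for illustration; it shows a finite block of such an array and is not drawn from the paper.

```python
import random


def sample_exchangeable_array(n_rows, n_cols, f, rng):
    """Build an n_rows x n_cols block of a row-column exchangeable array as
    X[i][j] = f(alpha, xi[i], eta[j], lam_ij) with i.i.d. uniform inputs."""
    alpha = rng.random()                         # shared by every entry
    xi = [rng.random() for _ in range(n_rows)]   # one variable per row
    eta = [rng.random() for _ in range(n_cols)]  # one variable per column
    return [
        [f(alpha, xi[i], eta[j], rng.random()) for j in range(n_cols)]  # fresh lam per entry
        for i in range(n_rows)
    ]


def f_example(alpha, xi, eta, lam):
    """Hypothetical f: a binary entry equal to 1 with probability alpha * xi * eta."""
    return 1 if lam < alpha * xi * eta else 0


rng = random.Random(0)
X = sample_exchangeable_array(4, 5, f_example, rng)
for row in X:
    print(row)
```

Because the row variables are i.i.d. and the column variables are i.i.d., permuting rows or columns leaves the distribution of the array unchanged.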
Conference Paper
In a dynamic social or biological environment, interactions between the underlying actors can undergo large and systematic changes. Each actor can assume multiple roles and their degrees of affiliation to these roles can also exhibit rich temporal phenomena. We propose a state space mixed membership stochastic blockmodel which can track across time the evolving roles of the actors. We also derive an efficient variational inference procedure for our model, and apply it to the Enron email networks, and rewiring gene regulatory networks of yeast. In both cases, our model reveals interesting dynamical roles of the actors.
Article
Real-world relational data sets, such as social networks, often involve measurements over time. We propose a Bayesian nonparametric latent feature model for such data, where the latent features for each actor in the network evolve according to a Markov process, extending recent work on similar models for static networks. We show how the number of features and their trajectories for each actor can be inferred simultaneously and demonstrate the utility of this model on prediction tasks using synthetic and real-world data.
Article
When making probabilistic models for survival times, one should consider the fact that individuals are heterogeneous. The observed changes in population intensities (or hazard rates) over time are a mixed result of two influences: on the one hand, the actual changes in the individual hazards, and, on the other hand, the selection due to high-risk individuals leaving the risk group early. I will consider the common multiplicative model for heterogeneity, but with the new feature that the random proportionality factor has a compound Poisson distribution. This distribution is studied in some detail. It is pointed out how its application to the survival situation extends a model of Hougaard, inheriting several nice properties. One important feature of the model is that it yields a subgroup of zero susceptibility, which "survives forever." This is a relevant model in medicine and demography. Two examples are given where the model is fitted to data concerning marriage rates and fertility.
Article
A parametric family of completely random measures, which includes gamma random measures, positive stable random measures as well as inverse Gaussian measures, is defined. In order to develop models for clustered point patterns with dependencies between points, the family is used in a shot-noise construction as intensity measures for Cox processes. The resulting Cox processes are of Poisson cluster process type and include Poisson processes and ordinary Neyman-Scott processes. We show characteristics of the completely random measures, illustrated by simulations, and derive moment and mixing properties for the shot-noise random measures. Finally statistical inference for shot-noise Cox processes is considered and some results on nearest-neighbour Markov properties are given.
Article
One of the main research areas in Bayesian Nonparametrics is the proposal and study of priors which generalize the Dirichlet process. In this paper, we provide a comprehensive Bayesian non-parametric analysis of random probabilities which are obtained by normalizing random measures with independent increments (NRMI). Special cases of these priors have already been shown to be useful for statistical applications such as mixture models and species sampling problems. However, in order to fully exploit these priors, the derivation of the posterior distribution of NRMIs is crucial: here we achieve this goal and, indeed, provide explicit and tractable expressions suitable for practical implementation. The posterior distribution of an NRMI turns out to be a mixture with respect to the distribution of a specific latent variable. The analysis is completed by the derivation of the corresponding predictive distributions and by a thorough investigation of the marginal structure. These results allow us to derive a generalized Blackwell-MacQueen sampling scheme, which is then adapted to also cover mixture models driven by general NRMIs. Copyright (c) 2008 Board of the Foundation of the Scandinavian Journal of Statistics.
Article
In a dynamic social or biological environment, the interactions between the underlying actors can undergo large and systematic changes. The latent roles or membership of the actors as determined by these dynamic links will also exhibit rich temporal phenomena, assuming a distinct role at one point while leaning more towards a second role at another point. To capture this dynamic mixed membership in rewiring networks, we propose a state space mixed membership stochastic blockmodel which embeds an actor into a latent space and tracks its mixed membership in the latent space across time. We derived efficient approximate learning and inference algorithms for our model, and applied the learned models to analyze a social network between monks, and a rewiring gene interaction network of Drosophila melanogaster collected during its full life cycle. In both cases, our model reveals interesting patterns of the dynamic roles of the actors.
Article
Basic results on stochastic differential equations in Hilbert and Banach space, linear stochastic evolution equations and some classes of nonlinear stochastic evolution equations are reviewed. The emphasis is on equations relevant to the study of spacetime stochastic processes. In particular the class of measure processes, the continuous analogs of spacetime population processes, is studied in detail.
Article
The paper deals with the problem of determining the number of components in a mixture model. We take a Bayesian non-parametric approach and adopt a hierarchical model with a suitable non-parametric prior for the latent structure. A commonly used model for such a problem is the mixture of Dirichlet process model. Here, we replace the Dirichlet process with a more general non-parametric prior obtained from a generalized gamma process. The basic feature of this model is that it yields a partition structure for the latent variables which is of Gibbs type. This relates to the well-known (exchangeable) product partition models. If compared with the usual mixture of Dirichlet process model the advantage of the generalization that we are examining relies on the availability of an additional parameter "σ" belonging to the interval (0,1): it is shown that such a parameter greatly influences the clustering behaviour of the model. A value of "σ" that is close to 1 generates a large number of clusters, most of which are of small size. Then, a reinforcement mechanism which is driven by "σ" acts on the mass allocation by penalizing clusters of small size and favouring those few groups containing a large number of elements. These features turn out to be very useful in the context of mixture modelling. Since it is difficult to specify "a priori" the reinforcement rate, it is reasonable to specify a prior for "σ". Hence, the strength of the reinforcement mechanism is controlled by the data. Copyright 2007 Royal Statistical Society.
Article
Here we present a novel method for modeling stationary time series. Our approach is to construct the model with a specified marginal family and build the dependence structure around it. We show that the resulting time series is linear with a simple autocorrelation structure. We construct models that parallel existing structures, namely state-space models, autoregressive conditional heteroscedasticity (ARCH) models, and generalized ARCH models. We use Bayesian techniques to estimate the resulting models. We also demonstrate that the models perform well compared with competing methods for the applications considered, count models and volatility models.
Bayesian logistic Gaussian process models for dynamic networks
  • D Durante
  • D Dunson
Dynamic network model from partial observations
  • E Ghalebi
  • B Mirzasoleiman
  • R Grosu
  • J Leskovec
A nonparametric Bayesian model for sparse temporal multigraphs
  • E Ghalebi
  • H Mahyar
  • R Grosu
  • G W Taylor
  • S A Williamson
Finite-dimensional BFRY priors and variational Bayesian inference for power law models
  • J Lee
  • L F James
  • S Choi
Random probability measures derived from increasing additive processes and their application to Bayesian statistics
  • I Prünster
Relations on probability spaces and arrays of random variables. Preprint, Institute for Advanced Study
  • D N Hoover