Article

Modelling sparsity, heterogeneity, reciprocity and community structure in temporal interaction data

Abstract

We propose a novel class of network models for temporal dyadic interaction data. Our goal is to capture a number of important features often observed in social interactions: sparsity, degree heterogeneity, community structure and reciprocity. We propose a family of models based on self-exciting Hawkes point processes in which events depend on the history of the process. The key component is the conditional intensity function of the Hawkes process, which captures the fact that interactions may arise as a response to past interactions (reciprocity), or due to shared interests between individuals (community structure). To capture sparsity and degree heterogeneity, the base (time-independent) part of the intensity function builds on compound random measures following Todeschini et al. (2016). We conduct experiments on a variety of real-world temporal interaction data and show that the proposed model outperforms many competing approaches for link prediction, and leads to interpretable parameters.
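The structure of the conditional intensity described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an exponential triggering kernel with jump size `eta` and decay rate `delta`, and a base rate given by the inner product of non-negative community affiliation weights; the function names and parameters are hypothetical.

```python
import math


def base_rate_from_affiliations(w_i, w_j):
    """Time-independent base rate for the directed pair (i, j).

    Assumption: each node carries one non-negative weight per latent
    community, and the base rate is the inner product of the two weight
    vectors, so shared community interests raise the rate.
    """
    return sum(a * b for a, b in zip(w_i, w_j))


def reciprocal_intensity(t, base_rate, reverse_event_times, eta, delta):
    """Conditional intensity of events i -> j at time t.

    Past events in the opposite direction (j -> i) excite the process
    through an exponential kernel (assumed form), which models reciprocity.
    """
    excitation = sum(
        eta * math.exp(-delta * (t - s))
        for s in reverse_event_times
        if s < t  # only events strictly in the past contribute
    )
    return base_rate + excitation


# Example: two communities, one earlier message from j to i at time 0.0.
mu_ij = base_rate_from_affiliations([0.2, 1.0], [0.1, 0.5])
rate_now = reciprocal_intensity(1.0, mu_ij, [0.0], eta=2.0, delta=1.0)
```

With no past events in the reverse direction the intensity reduces to the base rate, so in this sketch reciprocity only ever adds to the rate.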

... Due to their flexibility and mathematical tractability, Hawkes processes have been extensively used in the literature in a series of applications. They have modelled, among others, neural activity, earthquakes (Ogata 1988), violence (Loeffler & Flaxman 2018, Holbrook et al. 2021) and social interactions (Miscouridou et al. 2018). ...
... Some studies do provide nonparametric approaches for the background rate: Lewis & Mohler (2011) provide an estimation procedure for the background and kernel of the Hawkes process when no parametric form is assumed for either of the two. Miscouridou et al. (2018) use a nonparametric prior based on completely random measures to construct the discrete background rate for the Hawkes processes that build directed networks. Other recent approaches use neural networks to estimate the rate (Omi et al. 2019). ...
... It is a nonnegative function with initial nonzero value that captures the underlying patterns in space and time that encourage the clustering of events in those time and space locations. It often takes the form of a constant for simplicity, or a parametric form such as the periodic one assumed in Unwin et al. (2021), or it can even have a nonparametric prior constructed on random measures as in Miscouridou et al. (2018). As explained in more detail below, we assume a log-Gaussian process prior on µ(t, s). ...
Preprint
Full-text available
Hawkes processes are point process models that have been used to capture self-excitatory behavior in social interactions, neural activity, earthquakes and viral epidemics. They can model the occurrence of the times and locations of events. Here we develop a new class of spatiotemporal Hawkes processes that can capture both triggering and clustering behavior and we provide an efficient method for performing inference. We use a log-Gaussian Cox process (LGCP) as prior for the background rate of the Hawkes process which gives arbitrary flexibility to capture a wide range of underlying background effects (for infectious diseases these are called endemic effects). The Hawkes process and LGCP are computationally expensive due to the former having a likelihood with quadratic complexity in the number of observations and the latter involving inversion of the precision matrix which is cubic in observations. Here we propose a novel approach to perform MCMC sampling for our Hawkes process with LGCP background, using pre-trained Gaussian Process generators which provide direct and cheap access to samples during inference. We show the efficacy and flexibility of our approach in experiments on simulated data and use our methods to uncover the trends in a dataset of reported crimes in the US.
... There has been significant recent interest in generative models for timestamped relational event data. Such models typically combine a Temporal Point Process (TPP) model such as a Hawkes process (Laub et al., 2015) for event times with a latent variable network model such as a Stochastic Block Model (SBM) (Nowicki & Snijders, 2001) for the sender and receiver of the event (Blundell et al., 2012; DuBois et al., 2013; Yang et al., 2017; Miscouridou et al., 2018; Matias et al., 2018; Junuthula et al., 2019; Arastuie et al., 2020). We call such models continuous-time network models because they provide probabilities of observing events between nodes at arbitrary times. ...
... The latent variable representations are often inspired by generative models for static networks such as latent space models (Hoff et al., 2002) and stochastic block models (Holland et al., 1983). Continuous-time network models have been built with continuous latent space representations (Yang et al., 2017) and latent block or community representations (Blundell et al., 2012; DuBois et al., 2013; Xin et al., 2017; Matias et al., 2018; Miscouridou et al., 2018; Corneli et al., 2018; Junuthula et al., 2019; Arastuie et al., 2020). ...
... The CHIP (Arastuie et al., 2020) model uses a univariate Hawkes process to model self excitation for each node pair, with node pairs in the same community pair sharing parameters. Bivariate Hawkes process models (Blundell et al., 2012;Yang et al., 2017;Miscouridou et al., 2018) allow events i → j, which we denote by the directed pair (i, j), to influence the probability of events (j, i). This encourages reciprocal events, which are commonly seen in email and messaging networks, where a reciprocal event typically denotes a user replying to a message. ...
Preprint
The stochastic block model (SBM) is one of the most widely used generative models for network data. Many continuous-time dynamic network models are built upon the same assumption as the SBM: edges or events between all pairs of nodes are conditionally independent given the block or community memberships, which prevents them from reproducing higher-order motifs such as triangles that are commonly observed in real networks. We propose the multivariate community Hawkes (MULCH) model, an extremely flexible community-based model for continuous-time networks that introduces dependence between node pairs using structured multivariate Hawkes processes. We fit the model using a spectral clustering and likelihood-based local refinement procedure. We find that our proposed MULCH model is far more accurate than existing models both for predictive and generative tasks.
... Specifically, Hawkes processes are well-fitted to model such reciprocating behaviors in temporal interactions. To further capture the underlying community structure, some recent works (Blundell et al., 2012;DuBois et al., 2013;Linderman et al., 2014;Yang et al., 2017;Miscouridou et al., 2018) attempt to hybridize statistical models for static networks with Hawkes processes to model both implicit social structure and reciprocity among entities. The Hawkes stochastic block models (Hawkes-SBMs) (Blundell et al., 2012;Junuthula et al., 2019;Arastuie et al., 2019) characterize the interaction dynamics between groups of individuals using mutually-exciting Hawkes processes. ...
... The Hawkes stochastic block models (Hawkes-SBMs) (Blundell et al., 2012; Junuthula et al., 2019; Arastuie et al., 2019) characterize the interaction dynamics between groups of individuals using mutually-exciting Hawkes processes. To further capture the reciprocity between each pair of individuals, Miscouridou et al. (2018) propose to model pairwise reciprocating dynamics by letting the base intensities depend on the underlying community structure. The bottom left graph shows the underlying community structure. ...
... Despite having many attractive properties, the Hawkes-CCRM (Miscouridou et al., 2018) is restrictive in that the reciprocity in all the interactions is captured via the same triggering kernel, and thus it cannot account for the differences in interaction dynamics across individuals. For example, an employee may reply to emails from his/her department more quickly than to non-urgent emails from outside. ...
Preprint
We propose a novel probabilistic framework to model continuous-time interaction events data. Our goal is to infer the implicit community structure underlying the temporal interactions among entities, and also to exploit how the community structure influences the interaction dynamics among these nodes. To this end, we model the reciprocating interactions between individuals using mutually-exciting Hawkes processes. The base rate of the Hawkes process for each pair of individuals is built upon the latent representations inferred using the hierarchical gamma process edge partition model (HGaP-EPM). In particular, our model allows the interaction dynamics between each pair of individuals to be modulated by their respective affiliated communities. Moreover, our model can flexibly incorporate the auxiliary individuals' attributes, or covariates associated with interaction events. Efficient Gibbs sampling and Expectation-Maximization algorithms are developed to perform inference via a Pólya-Gamma data augmentation strategy. Experimental results on real-world datasets demonstrate that our model not only achieves competitive performance for temporal link prediction compared with state-of-the-art methods, but also discovers interpretable latent structure behind the observed temporal interactions.
... Given the current estimate of the cluster assignments, the conditional intensities are then estimated using a non-parametric M-step, consisting of either a histogram or kernel based estimate. A similar model has been proposed elsewhere (Miscouridou et al. 2018), where edge exchangeable models for binary graphs are extended to this setting. Here, the baseline of a Hawkes process encodes the affiliation of each node to the K latent communities, with a common exponential kernel for all interactions. ...
... For each of these networks, we fix K , the number of communities, based on knowledge of the network structure, as we aim to compare link prediction for a given K . We use K as considered elsewhere for these examples (Miscouridou et al. 2018). We partition the events into training and test periods which contain 85% and 15% of events respectively. ...
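The 85%/15% partition mentioned in this excerpt could be sketched as below. The sketch assumes the split is chronological (the earliest events form the training period) and that each event is a `(time, sender, receiver)` tuple. The function name and data layout are hypothetical, not taken from the cited work.

```python
def split_events_by_time(events, train_fraction=0.85):
    """Split timestamped events into an early training set and a late test set.

    events: iterable of (time, sender, receiver) tuples (assumed layout).
    The training set holds the earliest `train_fraction` of events, with the
    count rounded to the nearest integer.
    """
    ordered = sorted(events, key=lambda event: event[0])
    n_train = int(round(train_fraction * len(ordered)))
    return ordered[:n_train], ordered[n_train:]


# Example with 20 synthetic events between nodes 0..4.
toy_events = [(float(k), k % 5, (k + 1) % 5) for k in range(20)]
train_events, test_events = split_events_by_time(toy_events)
```

Splitting by event count rather than by a fixed calendar cut-off keeps the proportions exact even when events arrive in bursts.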
Article
Full-text available
A common goal in network modeling is to uncover the latent community structure present among nodes. For many real-world networks, the true connections consist of events arriving as streams, which are then aggregated to form edges, ignoring the dynamic temporal component. A natural way to take account of these temporal dynamics of interactions is to use point processes as the foundation of network models for community detection. Computational complexity hampers the scalability of such approaches to large sparse networks. To circumvent this challenge, we propose a fast online variational inference algorithm for estimating the latent structure underlying dynamic event arrivals on a network, using continuous-time point process latent network models. We describe this procedure for network models capturing community structure. This structure can be learned as new events are observed on the network, updating the inferred community assignments. We investigate the theoretical properties of such an inference scheme, and provide regret bounds on the loss function of this procedure. The proposed inference procedure is then thoroughly compared, using both simulation studies and real data, to non-online variants. We demonstrate that online inference can obtain comparable performance, in terms of community recovery, to non-online variants, while realising computational gains. Our proposed inference framework can also be readily modified to incorporate other popular network structures.
... Combining high heterogeneity with sparse connectivity results in modular structures (Mukherjee and Hill, 2011;Miscouridou et al., 2018), and the highly modular structure of the BNN shows the same set of advantages as sparse connectivity. The modular . ...
Article
Full-text available
Although it may appear infeasible and impractical, building artificial intelligence (AI) using a bottom-up approach based on the understanding of neuroscience is straightforward. The lack of a generalized governing principle for biological neural networks (BNNs) forces us to address this problem by converting piecemeal information on the diverse features of neurons, synapses, and neural circuits into AI. In this review, we described recent attempts to build a biologically plausible neural network by following neuroscientifically similar strategies of neural network optimization or by implanting the outcome of the optimization, such as the properties of single computational units and the characteristics of the network architecture. In addition, we proposed a formalism of the relationship between the set of objectives that neural networks attempt to achieve, and neural network classes categorized by how closely their architectural features resemble those of BNN. This formalism is expected to define the potential roles of top-down and bottom-up approaches for building a biologically plausible neural network and offer a map helping the navigation of the gap between neuroscience and AI engineering.
... Efforts at theoretically understanding the emergence of reciprocal interactions in temporal communication data include Bayesian inference via network models of Hawkes processes [69,70] and stochastic blockmodeling of relational event data [71] in both directed [72] and temporal [73] networks. When posed as a machine learning task, the identification of reciprocal interactions has also been applied to the prediction of online extremism in Twitter [74]. ...
Article
Full-text available
Human communication, the essence of collective social phenomena ranging from small-scale organizations to worldwide online platforms, features intense reciprocal interactions between members in order to achieve stability, cohesion, and cooperation in social networks. While high levels of reciprocity are well known in aggregated communication data, temporal patterns of reciprocal information exchange have received far less attention. Here we propose measures of reciprocity based on the time ordering of interactions and explore them in data from multiple communication channels, including calls, messaging and social media. By separating each channel into reciprocal and non-reciprocal temporal networks, we find persistent trends that point to the distinct roles of one-to-one exchange versus information broadcast. We implement several null models of communication activity, which identify memory, a higher tendency to repeat interactions with past contacts, as a key source of temporal reciprocity. When adding memory to a model of activity-driven, time-varying networks, we reproduce the levels of temporal reciprocity seen in empirical data. Our work adds to the theoretical understanding of the emergence of reciprocity in human communication systems, hinting at the mechanisms behind the formation of norms in social exchange and large-scale cooperation.
... However, classical MEPPs assume that the network between the member processes is directed with constant weights, which leads to an undesirably large incidence matrix and the incapability of capturing any change in these links. To model a dynamic network structure, Miscouridou et al. [2018] formulated the connections using completely random measures that promote model sparsity. Although it has a relatively small parameter space, this model requires users to fine-tune many hyper-parameters. ...
Preprint
Full-text available
We develop a Spatio-TEMporal Mutually Exciting point process with Dynamic network (STEMMED), i.e., a point process network wherein each node models a unique community-drug event stream with a dynamic mutually-exciting structure, accounting for influences from other nodes. We show that STEMMED can be decomposed node-by-node, suggesting a tractable distributed learning procedure. Simulation shows that this learning algorithm can accurately recover known parameters of STEMMED, especially for small networks and long data-horizons. Next, we turn this node-by-node decomposition into an online cooperative multi-period forecasting framework, which is asymptotically robust to operational errors, to facilitate Opioid-related overdose death (OOD) trends forecasting among neighboring communities. In our numerical study, we parameterize STEMMED using individual-level OOD data and county-level demographics in Massachusetts. For any node, we observe that OODs within the same drug class from nearby locations have the greatest influence on future OOD trends. Furthermore, the expected proportion of OODs triggered by historical events varies greatly across counties, ranging between 30%-70%. Finally, in a practical online forecasting setting, STEMMED-based cooperative framework reduces prediction error by 60% on average, compared to well-established forecasting models. Leveraging the growing abundance of public health surveillance data, STEMMED can provide accurate forecasts of local OOD trends and highlight complex interactions between OODs across communities and drug types. Moreover, STEMMED enhances synergies between local and federal government entities, which is critical to designing impactful policy interventions.
... Efforts at theoretically understanding the emergence of reciprocal interactions in temporal communication data include Bayesian inference via network models of Hawkes processes [68,69] and stochastic blockmodeling of relational event data [70] in both directed [71] and temporal [72] networks. When posed as a machine learning task, the identification of reciprocal interactions has also been applied to the prediction of online extremism in Twitter [73]. ...
Preprint
Full-text available
Human communication, the essence of collective social phenomena ranging from small-scale organizations to worldwide online platforms, features intense reciprocal interactions between members in order to achieve stability, cohesion, and cooperation in social networks. While high levels of reciprocity are well known in aggregated communication data, temporal patterns of reciprocal information exchange have received far less attention. Here we propose measures of reciprocity based on the time ordering of interactions and explore them in data from multiple communication channels, including calls, messaging and social media. By separating each channel into reciprocal and non-reciprocal temporal networks, we find persistent trends that point to the distinct roles of one-on-one exchange versus information broadcast. We implement several null models of communication activity, which identify memory, a higher tendency to repeat interactions with past contacts, as a key source of reciprocity. When adding memory to a model of activity-driven, time-varying networks, we reproduce the levels of reciprocity seen in empirical data. Our work adds to the theoretical understanding of the emergence of reciprocity in human communication systems, hinting at the mechanisms behind the formation of norms in social exchange and large-scale cooperation.
... Yu et al. (2020) separate the learning of the structure and the estimation of the non-zero parameters. Similarly, Miscouridou et al. (2018) use elaborate hierarchical structures and prior specifications to uncover sparsity, heterogeneity, reciprocity, and community structure in their multivariate Bayesian estimation procedure. Both methods only provide an approximation to the parameters (or their posterior distributions). ...
Preprint
Full-text available
Hawkes processes are point processes that model data where events occur in clusters through the self-exciting property of the intensity function. We consider a multivariate setting where multiple dimensions can influence each other, with an intensity function that allows for excitation and inhibition, both within and across dimensions. We discuss how such a model can be implemented and highlight challenges in the estimation procedure induced by a potentially negative intensity function. Furthermore, we introduce a new, stronger condition for stability that encompasses current approaches established in the literature. Finally, we examine the total number of offspring to reparametrise the model and subsequently use Normal and sparsity-inducing priors in a Bayesian estimation procedure on simulated data.
Chapter
In this paper we propose a Bayesian nonparametric approach to modelling sparse time-varying networks. A positive parameter is associated to each node of a network, which models the sociability of that node. Sociabilities are assumed to evolve over time, and are modelled via a dynamic point process model. The model is able to capture long term evolution of the sociabilities. Moreover, it yields sparse graphs, where the number of edges grows subquadratically with the number of nodes. The evolution of the sociabilities is described by a tractable time-varying generalised gamma process. We provide some theoretical insights into the model and apply it to three datasets: a simulated network, a network of hyperlinks between communities on Reddit, and a network of co-occurrences of words in Reuters news articles after the September 11th attacks.
Keywords: Bayesian nonparametrics; Poisson random measures; Networks; Random graphs; Sparsity; Point processes
Article
Full-text available
A new class of models for dynamic networks is proposed, called mutually exciting point process graphs (MEG). MEG is a scalable network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection when assessing the significance of future events, including previously unobserved connections between nodes. The model combines mutually exciting point processes to estimate dependencies between events and latent space models to infer relationships between the nodes. The intensity functions for each network edge are characterized exclusively by node-specific parameters, which allows information to be shared across the network. This construction enables estimation of intensities even for unobserved edges, which is particularly important in real world applications, such as computer networks arising in cyber-security. A recursive form of the log-likelihood function for MEG is obtained, which is used to derive fast inferential procedures via modern gradient ascent algorithms. An alternative EM algorithm is also derived. The model and algorithms are tested on simulated graphs and real world datasets, demonstrating excellent performance. Supplementary materials for this article are available online.
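To illustrate why a recursive form of the log-likelihood matters, the sketch below uses the standard univariate Hawkes process with an exponential kernel, with intensity `mu + alpha * sum(exp(-beta * (t - t_k)))` over past events. This is not the MEG model itself, and the function names are hypothetical. It compares the recursion, which is linear in the number of events, against the direct double sum, which is quadratic.

```python
import math


def hawkes_loglik_recursive(times, T, mu, alpha, beta):
    """Log-likelihood on [0, T] in O(n), for sorted event times.

    Uses A_i = exp(-beta * (t_i - t_{i-1})) * (1 + A_{i-1}), with A_1 = 0,
    so that the intensity at t_i equals mu + alpha * A_i.
    """
    loglik = -mu * T  # integral of the constant background rate
    A = 0.0
    prev = None
    for t in times:
        if prev is not None:
            A = math.exp(-beta * (t - prev)) * (1.0 + A)
        loglik += math.log(mu + alpha * A)
        # integral over (t, T] of the kernel triggered by the event at t
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (T - t)))
        prev = t
    return loglik


def hawkes_loglik_naive(times, T, mu, alpha, beta):
    """Same quantity computed with the direct O(n^2) double sum."""
    loglik = -mu * T
    for i, t in enumerate(times):
        excitation = sum(math.exp(-beta * (t - s)) for s in times[:i])
        loglik += math.log(mu + alpha * excitation)
        loglik -= (alpha / beta) * (1.0 - math.exp(-beta * (T - t)))
    return loglik


event_times = [0.5, 1.0, 2.5, 2.7]
fast = hawkes_loglik_recursive(event_times, T=3.0, mu=0.3, alpha=0.5, beta=1.2)
slow = hawkes_loglik_naive(event_times, T=3.0, mu=0.3, alpha=0.5, beta=1.2)
```

Both functions return the same value up to floating-point error. The recursion depends on the exponential form of the kernel and does not carry over to arbitrary kernels.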
Article
Continuous-time interaction data is usually generated in a time-evolving environment. Hawkes processes (HP) are commonly used mechanisms for the analysis of such data. However, typical model implementations (such as stochastic block models) assume that the exogenous (background) interaction rate is constant, and so they are limited in their ability to adequately describe any complex time-evolution in the background rate of a process. In this paper, we introduce a stochastic exogenous rate Hawkes process (SE-HP) which is able to learn time variations in the exogenous rate. The model affiliates each node with a piecewise-constant membership distribution with an unknown number of changepoint locations, and allows these distributions to be related to the membership distributions of interacting nodes. The time-varying background rate function is derived through combinations of these membership functions. We introduce a stochastic gradient MCMC algorithm for efficient, scalable inference. The performance of the SE-HP is explored on real world, continuous-time interaction datasets, where we demonstrate that the SE-HP strongly outperforms comparable state-of-the-art methods.