Conference Paper

Finding Hierarchy in Directed Online Social Networks

Abstract and Figures

Social hierarchy and stratification among humans is a well-studied concept in sociology. The popularity of online social networks presents an opportunity to study social hierarchy for different types of networks and at different scales. We adopt the premise that people form connections in a social network based on their perceived social hierarchy; as a result, the edge directions in directed social networks can be leveraged to infer hierarchy. In this paper, we define a measure of hierarchy in a directed online social network, and present an efficient algorithm to compute this measure. We validate our measure using ground truth, including Wikipedia notability scores. We use this measure to study hierarchy in several directed online social networks including Twitter, Delicious, YouTube, Flickr, LiveJournal, and curated lists of several categories of people based on different occupations and different organizations. Our experiments on different online social networks show how hierarchy emerges as we increase the size of the network. This is in contrast to random graphs, where the hierarchy decreases as the network size increases. Further, we show that the degree of stratification in a network increases very slowly as we increase the size of the graph.
... However, in real societies, individuals in human networks are commonly divided into a hierarchical arrangement based on different attributes such as importance, wealth, knowledge and power. This phenomenon is known as social stratification [3], and stratification along economic or class-based lines has been one of the most important topics of study in the modern social sciences [4]. Social stratification, as well as its counterpart social mobility, governs the trajectories of people's lives, including the extent of prejudice that they face [5], their careers and occupations [6] and the likelihood that they will experience violence [7]. ...
... For instance, the capability of individuals to be upwardly mobile can be estimated by examining their connections to higher status individuals in networks [28]. In other words, networks provide an opportunity to study the emergence of social stratification [3], which can help to understand how decisions of individuals can lead to a socially stratified network. ...
... For example, it is conventional in economic analysis of Western societies to define a lower, middle and upper class [38]. However, in other networks, such as interactions in meetings or conferences [3], it is possible that neither the class boundaries nor even the number of classes is known ahead of time. ...
Article
It has been observed that real-world social networks often exhibit stratification along economic or other lines, with consequences for class mobility and access to opportunities. With the rise in human interaction data and extensive use of online social networks, the structure of social networks (representing connections between individuals) can be used for measuring stratification. However, although stratification has been studied extensively in the social sciences, there is no single, generally applicable metric for measuring the level of stratification in a network. In this work, we first propose the novel Stratification Assortativity (StA) metric, which measures the extent to which a network is stratified into different tiers. Then, we use the StA metric to perform an in-depth analysis of the stratification of five co-authorship networks. We examine the evolution of these networks over 50 years and show that these fields demonstrate an increasing level of stratification over time, and, correspondingly, the trajectory of a researcher’s career is increasingly correlated with her entry point into the network.
... For instance, the capability of individuals to be upwardly mobile can be estimated by examining their connections to higher status individuals in networks [28]. In other words, networks provide an opportunity to study the emergence of social stratification [3], which can help to understand how decisions of individuals can lead to a socially stratified network. Some empirical analyses on networks have examined social mobility as a proxy for stratification or, if network connections are known, individually examine inter-class connections between predefined classes [10]. ...
... For example, it is conventional in economic analysis of Western societies to define a lower, middle, and upper class [38]. However, in other networks, such as interactions in meetings or conferences [3], it is possible that neither the class boundaries nor even the number of classes is known ahead of time. ...
Preprint
It has been observed that real-world social networks often exhibit stratification along economic or other lines, with consequences for class mobility and access to opportunities. With the rise in human interaction data and extensive use of online social networks, the structure of social networks (representing connections between individuals) can be used for measuring stratification. However, although stratification has been studied extensively in the social sciences, there is no single, generally applicable metric for measuring the level of stratification in a network. In this work, we first propose the novel Stratification Assortativity (StA) metric, which measures the extent to which a network is stratified into different tiers. Then, we use the StA metric to perform an in-depth analysis of the stratification of five co-authorship networks. We examine the evolution of these networks over 50 years and show that these fields demonstrate an increasing level of stratification over time, and, correspondingly, the trajectory of a researcher's career is increasingly correlated with her entry point into the network.
... Maiya and Berger-Wolf (2009) use a distance-based approach that assumes interactions are most common between supervisors and their direct reports, as well as between organizational peers. Gupte et al. (2011) define "agony" as the difference in rank plus one for communications that are directed from a lower-ranked to a higher-ranked employee, and infer the organizational hierarchy that minimizes agony. While the hierarchical random graph model from Clauset et al. (2008) infers a hierarchy specifically given a network input, it does not produce a spanning tree, i.e. a tree with a vertex set equal to that of the input graph, and therefore its objective is different than the aforementioned methods. ...
... SRD is a measure of the total distance travelled up or down the organizational tree to get from u to v, which is zero if u and v have the same level, is positive if u is lower than v in the organization, and is negative otherwise. SRD is close to the definition of "agony" described in Gupte et al. (2011): Agony(u, v) = max{SRD(v, u) + 1, 0}. Finally, DRD is the reporting distance signed by whether u is higher or lower than v in the organization. ...
... Maiya and Berger-Wolf (2009) propose a distance-based tree reconstruction model: "as the distance between individuals within a hierarchy grows, we assume the probability of interaction decays." Gupte et al. (2011) propose a tree reconstruction method that minimizes agony in the communication network, based on the idea that "when people connect to other people who are lower in the hierarchy, this causes them social agony" and thus "higher rank nodes are less likely to connect to lower rank nodes". As a benchmark method for tree reconstruction, we compute the minimum spanning tree of the communication network (Prim, 1957). ...
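The agony idea quoted in these excerpts can be illustrated with a minimal sketch. It assumes each node carries an integer rank (larger means higher in the hierarchy) and that a directed edge u -> v costs max(rank[u] - rank[v] + 1, 0), so only edges that fail to point strictly upward are penalized. The function names and the toy network are hypothetical, and the search for the rank assignment that minimizes total agony is not shown.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def edge_agony(rank_u: int, rank_v: int) -> int:
    """Agony of a directed edge u -> v under the assumed convention."""
    # An edge pointing strictly upward (rank_u < rank_v) costs nothing;
    # otherwise the cost is the rank difference plus one.
    return max(rank_u - rank_v + 1, 0)


def total_agony(edges: Iterable[Edge], rank: Dict[str, int]) -> int:
    """Sum of edge agonies for one fixed rank assignment."""
    return sum(edge_agony(rank[u], rank[v]) for u, v in edges)


# Hypothetical toy network: a and b point up to c, and c points back down to a.
edges = [("a", "c"), ("b", "c"), ("c", "a")]
rank = {"a": 0, "b": 0, "c": 1}
print(total_agony(edges, rank))
```

Here only the downward edge c -> a contributes, since the two upward edges have zero agony under this convention.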
Preprint
Most businesses impose a supervisory hierarchy on employees to facilitate management, decision-making, and collaboration. In contrast, routine inter-employee communication patterns within workplaces tend to emerge more naturally, as a consequence of both supervisory relationships and the needs of the organization. Scholars of organizational management have proposed theories relating organizational trees to communication dynamics and measures of business performance. Separately, network scientists have studied the topological structure of communication patterns in different types of organizations. However, the nature of the relationship between a formal organizational structure and emergent communications between employees remains unclear. In this paper, we study associations between organizational hierarchy and communication dynamics among approximately 200,000 employees of a large software company in May 2019. We propose new measures of communication reciprocity and new shortest-path distances for trees to characterize the frequency of messages passed up, down, and across the organizational hierarchy. By dividing the organization into 88 teams -- organizational trees rooted at the senior leadership level -- we identify distinct communication network structures within and between teams. These structures are related to the function of these teams within the company, including sales, marketing, engineering, and research. We discuss the relationship of routine employee communication patterns to supervisory hierarchies in this company, and empirically evaluate several theories of organizational management and performance.
... Given the above objective functions, we formalize our problem in Def 5, which is known as the Agony (Gupte et al. 2011) model and is widely used to find a hierarchy from a directed graph. The good news is that the Agony problem is in P and many efficient solutions are available (Tatti 2014) when the weights are natural numbers. ...
... Hierarchy Generation in Directed Graphs. Extracting a DAG subgraph from a graph, or finding an estimated hierarchy, has been studied in various contexts. One example is the study of hierarchy in social networks (Clauset, Moore, and Newman 2008; Gupte et al. 2011; Henderson et al. 2012; Maiya and Berger-Wolf 2009). Identifying high-level nodes in a social network (Cherkassky and Goldberg 1999; Even et al. 1998; Jameson, Appleby, and Freeman 1999) is useful for finding influential people. ...
Article
Knowledge bases (KBs) play an important role in artificial intelligence. Much effort has been devoted to constructing web-scale knowledge bases, both manually and automatically. Compared with manually constructed KBs, automatically constructed KBs are broader but noisier. In this paper, we study the problem of improving the quality of automatically constructed web-scale knowledge bases, in particular, lexical taxonomies of isA relationships. We find that these taxonomies usually contain cycles, which are often introduced by incorrect isA relations. Inspired by this observation, we introduce two kinds of models to detect incorrect isA relations from cycles. The first one eliminates cycles by extracting directed acyclic graphs, and the other one eliminates cycles by grouping nodes into different levels. We implement our models on Probase, a state-of-the-art, automatically constructed, web-scale taxonomy. After processing tens of millions of relations, our models eliminate 74 thousand wrong relations with 91% accuracy.
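To make the link between cycles and suspect isA relations concrete, here is a minimal sketch that is not the paper's model: a plain depth-first search that flags back edges, whose removal leaves a directed acyclic graph. The function name and the toy taxonomy are hypothetical, and which edge of a cycle gets flagged depends on traversal order, so this only illustrates the idea of extracting a DAG.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def find_back_edges(edges: List[Edge]) -> List[Edge]:
    """Return edges that close a cycle during a depth-first traversal."""
    graph: Dict[str, List[str]] = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {n: WHITE for n in graph}
    back: List[Edge] = []

    def visit(u: str) -> None:
        # Recursive for brevity; a large taxonomy would need an explicit stack.
        color[u] = GRAY
        for v in graph[u]:
            if color[v] == GRAY:
                back.append((u, v))  # v is an ancestor of u, so u -> v closes a cycle
            elif color[v] == WHITE:
                visit(v)
        color[u] = BLACK

    for n in sorted(graph):  # sorted start order keeps the result deterministic
        if color[n] == WHITE:
            visit(n)
    return back


# Hypothetical isA relations (child, parent) containing one cycle.
is_a = [("apple", "fruit"), ("fruit", "food"), ("food", "apple"), ("pear", "fruit")]
back_edges = find_back_edges(is_a)
dag = [e for e in is_a if e not in back_edges]
print(back_edges, dag)
```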
... In this work, we document a much more general mechanism for the nucleation and growth of transient bubbles. It is based on the fact that social influence is typically directed and hierarchical [23][24][25]. Indeed, in our example of financial markets, the influence of a famous investor on a retail investor is likely much larger than the other way around. ...
Preprint
We present a generic new mechanism for the emergence of collective exuberance among interacting agents in a general class of Ising-like models that have a long history in social sciences and economics. The mechanism relies on the recognition that socio-economic networks are intrinsically non-symmetric and hierarchically organized, which is represented as a non-normal adjacency matrix. Such non-normal networks lead to transient explosive growth in a generic domain of control parameters, in particular in the subcritical regime. Contrary to previous models, here the coordination of opinions and actions and the associated global macroscopic order do not require the fine-tuning close to a critical point. This is illustrated in the context of financial markets theoretically, numerically via agent-based simulations and empirically through the analysis of so-called meme stocks. It is shown that the size of the bubble is directly controlled through the Kreiss constant which measures the degree of non-normality in the network. This mapping improves conceptually and operationally on existing methods aimed at anticipating critical phase transitions, which do not take into consideration the ubiquitous non-normality of complex system dynamics. Our mechanism thus provides a general alternative to the previous understanding of instabilities in a large class of complex systems, ranging from ecological systems to social opinion dynamics and financial markets.
... Our approach can be compared to previous work in the literature in some important ways. There are several methods that extract relative rankings between the nodes of a network, based on spectral node centrality [10][11][12][13], minimum violation ranking [14][15][16][17], random utility models [18][19][20], and latent space models [21][22][23][24]. The most central difference between these methods and the one presented in this work is that none of them attempt to simultaneously detect community structure, or include degree-correction. ...
Preprint
We develop a method to infer community structure in directed networks where the groups are ordered in a latent one-dimensional hierarchy that determines the preferred edge direction. Our nonparametric Bayesian approach is based on a modification of the stochastic block model (SBM), which can take advantage of rank alignment and coherence to produce parsimonious descriptions of networks that combine ordered hierarchies with arbitrary mixing patterns between groups. Since our model also includes directed degree correction, we can use it to distinguish non-local hierarchical structure from local in- and out-degree imbalance -- thus removing a source of conflation present in most ranking methods. We also demonstrate how we can reliably compare with the results obtained with the unordered SBM variant to determine whether a hierarchical ordering is statistically warranted in the first place. We illustrate the application of our method on a wide variety of empirical networks across several domains.
Chapter
In a nutshell, the chapter travels through time with the Feedback Arc Set problem. The review starts from the first paper, moves through a number of periods and research results, and ends with recent achievements, including a quantum computing algorithm. This chapter, more than any other part of the book, provides a historical compendium of techniques in Computer Science.
Article
Community detection and hierarchy extraction are usually thought of as separate inference tasks on networks. Considering only one of the two when studying real-world data can be an oversimplification. In this work, we present a generative model based on an interplay between community and hierarchical structures. It assumes that each node has a preference in the interaction mechanism and nodes with the same preference are more likely to interact, while heterogeneous interactions are still allowed. The sparsity of the network is exploited for implementing a more efficient algorithm. We demonstrate our method on synthetic and real-world data and compare performance with two standard approaches for community detection and ranking extraction. We find that the algorithm accurately retrieves the overall node’s preference in different scenarios, and we show that it can distinguish small subsets of nodes that behave differently than the majority. As a consequence, the model can recognize whether a network has an overall preferred interaction mechanism. This is relevant in situations where there is no clear “a priori” information about what structure explains the observed network datasets well. Our model allows practitioners to learn this automatically from the data.
Article
Despite its increasing role in communication, the world wide web remains the least controlled medium: any individual or institution can create websites with an unrestricted number of documents and links. While great efforts are made to map and characterize the Internet's infrastructure, little is known about the topology of the web. Here we take a first step to fill this gap: we use local connectivity measurements to construct a topological model of the world wide web, allowing us to explore and characterize its large scale properties.
Article
This paper provides a novel algorithm for automatically extracting social hierarchy data from electronic communication behavior. The algorithm is based on data mining user behaviors to automatically analyze and catalog patterns of communication between entities in an email collection to extract social standing. The advantage of such automatic methods is that they extract relevancy between hierarchy levels and are dynamic over time. We illustrate the algorithms on real-world data using the Enron corporation's email archive. The results show great promise when compared to the corporation's work chart and judicial proceedings analyzing the major players.
Article
The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of "authoritative" information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
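The relationship between authoritative pages and hub pages described above can be sketched as a mutually reinforcing iteration. The abstract does not spell out the update rule, so this sketch assumes the commonly described form: a page's authority score is the sum of the hub scores of pages linking to it, a page's hub score is the sum of the authority scores of pages it links to, and both vectors are rescaled to unit length each round. The function names, iteration count, and toy link graph are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def _normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = sum(x * x for x in scores.values()) ** 0.5
    return {n: x / norm for n, x in scores.items()} if norm else scores


def hits(edges: List[Edge], iterations: int = 50) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (hub, authority) scores after a fixed number of update rounds."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: add up the hub scores of pages that link in.
        new_auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_auth[v] += hub[u]
        auth = _normalize(new_auth)
        # Hub update: add up the authority scores of pages that are linked to.
        new_hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_hub[u] += auth[v]
        hub = _normalize(new_hub)
    return hub, auth


# Hypothetical link graph: h1 links to p1 and p2, h2 links only to p1.
links = [("h1", "p1"), ("h1", "p2"), ("h2", "p1")]
hub, auth = hits(links)
print(max(auth, key=auth.get), max(hub, key=hub.get))
```

In this toy graph p1 receives links from both hub pages and h1 points to both target pages, so they end up with the top authority and hub scores respectively. Repeating the two updates amounts to power iteration, which is where the connection to eigenvectors of matrices derived from the link graph comes from.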
Article
The common practice of ranking a group of animals in the closest possible order to a linear dominance hierarchy assumes that dominance among those animals is generally transitive. In fact, analysis of groups in which dominance relationships are random shows that this method has a surprisingly high probability of producing an apparently linear or near-linear hierarchy by chance. As such, the existence of transitive dominance should be tested before it is used in ranking. A suitable statistical test is described here. Chance may also contribute to the linear appearance of hierarchies based on other factors.