Article

Temporal Motifs Reveal the Dynamics of Editor Interactions in Wikipedia


Abstract

Wikipedia is a collaborative setting with both combative and cooperative editing. We propose a new method for investigating the types of editor interactions using a novel representation of Wikipedia's revision history as a temporal, bipartite network with multiple node and edge types for users and revisions. From this representation we identify significant author interactions as network motifs and show how the motif types capture important, diverse editing behaviors. Two experiments demonstrate the further benefit of motifs. First, we demonstrate significant performance improvement over a purely revision-based analysis in classifying pages as combative or cooperative by using motifs; second, we use motifs as a basis for analyzing trends in the dynamics of editor behavior to explain Wikipedia's content growth.


... Because of the controversial nature of some topics, the narrative in a Wikipedia article may contain misleading information that stops a neutral consensus emerging. Prior work in this area has established insights such as the predictability of controversy from editor behaviour [20], such as deletions, reversions, and statistics from the collaboration network, prediction of article quality taking insights from multiple models [29], and interactions between users, bots, admin and pages [9]. A number of different types of network have also been developed to assess Wikipedia articles, including collaboration networks [6] that capture the positive or negative relationship between editors, edit networks that capture "undoing" of edits by a third party [12] and affiliation networks [10]. ...
... To achieve this at scale, and in contrast to previous literature [9,20,22], we assess a relatively large sample of over 21,000 Wikipedia articles by determining their subgraph ratio profiles. Each such profile represents the under- and over-representation of induced triads in the revision network of a Wikipedia article using 13 dimensions of connected triads, while also normalising for differences in network size. ...
... The associated revision log for Wikipedia articles has been shown to provide a basis to examine potential controversy through examining the collaborative behaviour of individual editors within an article [23] or across multiple articles [28]. An article's revision log identifies the structure underlying temporal interactions [27], and provides insight into how articles and contributors' habits may evolve over time [9]. Features from the aggregation of this, such as number of edits, revision, and previous version restorations have been shown to correlate (e.g., [23]). ...
Preprint
Full-text available
Wikipedia serves as a good example of how editors collaborate to form and maintain an article. The relationship between editors, derived from their sequence of editing activity, results in a directed network structure called the revision network, that potentially holds valuable insights into editing activity. In this paper we create revision networks to assess differences between controversial and non-controversial articles, as labelled by Wikipedia. Originating from complex networks, we apply motif analysis, which determines the under or over-representation of induced sub-structures, in this case triads of editors. We analyse 21,631 Wikipedia articles in this way, and use principal component analysis to consider the relationship between their motif subgraph ratio profiles. Results show that a small number of induced triads play an important role in characterising relationships between editors, with controversial articles having a tendency to cluster. This provides useful insight into editing behaviour and interaction capturing counter-narratives, without recourse to semantic analysis. It also provides a potentially useful feature for future prediction of controversial Wikipedia articles.
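The abstract above does not say how a subgraph ratio profile is computed. As a minimal sketch, assuming the commonly used normalisation in which each motif count is compared with its mean count in randomised networks and the resulting vector is scaled to unit length, a profile could be derived as follows. The function name, the `eps` smoothing constant, and the example counts are illustrative assumptions, not taken from the paper.

```python
import math


def subgraph_ratio_profile(real_counts, random_mean_counts, eps=4.0):
    """Return a unit-length vector of over/under-representation scores.

    real_counts[i]        : count of triad type i in the observed network
    random_mean_counts[i] : mean count of triad type i in randomised networks
    eps                   : assumed smoothing constant that damps rare triads
    """
    if len(real_counts) != len(random_mean_counts):
        raise ValueError("count vectors must have the same length")

    # Relative difference per triad type, bounded between -1 and 1.
    deltas = [
        (real - rand) / (real + rand + eps)
        for real, rand in zip(real_counts, random_mean_counts)
    ]

    # Scale to unit length so networks of different size are comparable.
    norm = math.sqrt(sum(d * d for d in deltas))
    if norm == 0.0:
        return [0.0] * len(deltas)
    return [d / norm for d in deltas]


# Hypothetical counts for three triad types (a real profile would have 13).
profile = subgraph_ratio_profile([10, 0, 5], [0, 10, 5])
```

Positive entries mark triads that are over-represented relative to the null model and negative entries mark under-represented ones; the unit-length scaling is what makes profiles from articles of different size comparable.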
... In the simplest case of the static-temporal approach, static motifs (as defined above) are counted in each snapshot and then their counts are compared across the snapshots [27]. To overcome this approach's limitation of ignoring any motif relationships between different snapshots, the notion of static network motifs has been extended into several notions of temporal motifs [28], [29], [30], [31], [32]. However, each of these existing temporal motif-based approaches suffers from at least three of the following drawbacks: 1. ...
... However, each of these existing temporal motif-based approaches suffers from at least three of the following drawbacks: 1. They can only deal with motif structures of limited complexity, such as small motifs or simple topologies (e.g., linear paths) [28], [29], [32], which limits their practical usefulness to capture complex network structure in detail. 2. They pose additional constraints, such as limiting the number of events (temporal edges) a node can participate in at a given time point [30], [31]. ...
... 3. They allow for obtaining the motif-based topological "signature" of the entire network only but not of each individual node [27], [28], [29], [32], [30], [31], whereas the latter is very useful when aiming to link the network topological position of a node to its function via e.g., network alignment or clustering (see the above discussion on static graphlets). ...
Article
With the increasing availability of temporal real-world networks, how can these data be studied efficiently? One can model a temporal network as a single aggregate static network, or as a series of time-specific snapshots, each being an aggregate static network over the corresponding time window. Then, one can use established methods for static analysis on the resulting aggregate network(s), but in the process one loses valuable temporal information, either completely or at the interface between different snapshots, respectively. Here, we develop a novel approach for studying a temporal network more explicitly, by capturing inter-snapshot relationships. We base our methodology on well-established graphlets (subgraphs), which have been proven in numerous contexts in static network research. We develop new theory to allow for graphlet-based analyses of temporal networks. Our new notion of dynamic graphlets is different from existing dynamic network approaches that are based on temporal motifs (statistically significant subgraphs). The latter have limitations: their results depend on the choice of a null network model that is required to evaluate the significance of a subgraph, and choosing a good null model is non-trivial. Our dynamic graphlets overcome the limitations of the temporal motifs. Also, when we aim to characterize the structure and function of an entire temporal network or of individual nodes, our dynamic graphlets outperform the static graphlets. Clearly, accounting for temporal information helps. We apply dynamic graphlets to temporal age-specific molecular network data to deepen our limited knowledge about human aging. http://www.nd.edu/∼cone/DG. tmilenko@nd.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
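The two static modelling choices described above, a single aggregate network versus a series of snapshots, can be illustrated with a small sketch. It assumes events are given as `(source, target, time)` tuples and that snapshots use fixed-width time windows; both the representation and the function names are assumptions made for illustration and do not implement the dynamic graphlets themselves.

```python
from collections import defaultdict


def aggregate(events):
    """Collapse all (u, v, t) events into one static edge set."""
    return {(u, v) for u, v, _ in events}


def snapshots(events, window):
    """Group (u, v, t) events into consecutive windows of equal width.

    Returns a dict mapping the window index to the static edge set
    observed in that window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buckets = defaultdict(set)
    for u, v, t in events:
        buckets[int(t // window)].add((u, v))
    return dict(sorted(buckets.items()))


# Hypothetical event list: the edge a->b occurs twice, at times 0 and 12.
events = [("a", "b", 0), ("b", "c", 5), ("a", "b", 12)]
static_view = aggregate(events)          # order and repetition are lost
windowed_view = snapshots(events, 10)    # order is kept only across windows
```

The aggregate view cannot distinguish this event list from one in which the edges occur in a different order, and the snapshot view keeps ordering only between windows, not within them. This is the loss of temporal information that the abstract refers to.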
... The network motifs of a network are connectivity patterns that occur far more often in it than in randomized networks with the same degree distribution [12]. Several researchers studied the network motifs of an entire graph extracted from Wikipedia [10], [32], [35]. However, to the best of our knowledge, no attempt has been made to analyze the network motifs of a domain-specific WAG. ...
... 2) The sentences from the plain text were acquired with the sentence splitter provided by the Stanford CoreNLP library. The sentences were then cleaned by removing duplicates and very short or incomplete sentences. A total of 604,708 sentences were obtained. ...
Article
Full-text available
Discovering hyponym relations among domain-specific terms is a fundamental task in taxonomy learning. However, the great diversity of various domain corpora and the lack of labeled training sets make this task very challenging for conventional methods that are based on text content. The hyperlink structure of Wikipedia article pages was found to contain recurring network motifs in this study, indicating the probability of a hyperlink being a hyponym hyperlink. Hence, a novel hyponym relation extraction approach based on the network motifs was proposed. This approach automatically constructs motif-based features from the hyperlink structures of a domain; every hyperlink is mapped to a 13-dimensional feature vector based on the 13 types of three-node motifs. The approach extracts structural information from Wikipedia and heuristically creates a labeled training set. Classification models were determined from the training sets for hyponym relation extraction. Two experiments were conducted to validate our approach based on seven domain-specific datasets. The first experiment verified the effectiveness of the motif-based features. The second experiment showed that the approach performs better than the approach based on lexico-syntactic patterns and achieves comparable result to the approach based on textual features. Experimental results show fairly good domain scalability of the proposed approach.
... Largely speaking, this research has employed two primary approaches for capturing system dynamics: (1) "aggregating" sequence logs to represent a particular process, creating multiple "snapshots" of the process over time [63], or (2) "collapsing" sequence logs to calculate a pair-wise relationship between entities (e.g. affiliation between users), often forming a network of these entities and using social network analysis (SNA) techniques to study the structure of the network [37,39,30]. While our interest is analyzing the sequences of production and coordination activities that make up work routines, this prior work aggregates and collapses event sequences, thus obscuring the sequential ordering and structures in longitudinal records of co-production activities. ...
... Following calls to analyze the micro-level processes of peer production systems [44] and despite the reliance on event logs to understand how work processes and roles are structured over time in prior research [29], these analyses have largely ignored the role of detailed sequences in log data and overlooked the temporal and artifact-related dependencies. Prior research has investigated co-production dynamics using either aggregated sequence logs to represent a particular process, creating multiple "snapshots" of the process over time, or "collapsed" sequence logs to calculate a pair-wise relationship between entities, forming a network of these entities and using social network analysis techniques to study the structure of the network [37,39,30]. Less relevant to our investigation are prior works that employed information visualization techniques (rather than a formal knowledge representation) to depict the dynamics of collaborative work [18,59,65]. ...
Conference Paper
Research into socio-technical systems like Wikipedia has overlooked important structural patterns in the coordination of distributed work. This paper argues for a conceptual reorientation towards sequences as a fundamental unit of analysis for understanding work routines in online knowledge collaboration. We outline a research agenda for researchers in computer-supported cooperative work (CSCW) to understand the relationships, patterns, antecedents, and consequences of sequential behavior using methods already developed in fields like bio-informatics. Using a data set of 37,515 revisions from 16,616 unique editors to 96 Wikipedia articles as a case study, we analyze the prevalence and significance of different sequences of editing patterns. We illustrate the mixed method potential of sequence approaches by interpreting the frequent patterns as general classes of behavioral motifs. We conclude by discussing the methodological opportunities for using sequence analysis for expanding existing approaches to analyzing and theorizing about co-production routines in on-line knowledge collaboration.
... Zhao et al. [35] studied the temporal annotations (e.g., timestamps and duration) of historical communication graphlets in social networks using communication motifs and their maximum flow. Jurgens and Lu [36] proposed a method to represent Wikipedia revision history as a temporal bipartite graph of editor interactions, which identifies significant author interactions as network motifs of diverse editing behaviours. Yan and Guo [37] proposed self-organizing robots based on evolving gene regulatory networks (GRNs) that generate evolving network motifs for autonomous path detection in unknown environments. ...
... Alternatively, we aim to compare time-varying complexities (SSCs) and ESC of two evolving systems modelled as evolving network series. Possible applications of our work are to study the complexity of domains like social media [34], communication motifs [35], natural language processing [36], robot path [37], and software systems [38]. ...
Article
Full-text available
The era of computational intelligence has led to various kinds of systems that evolve. Usually, an evolving system contains evolving interconnected entities (or components) that make evolving networks for the system State Series $SS = \{S_1, S_2, \ldots, S_N\}$ created over time, where $S_i$ represents the $i$th system state. In this paper, we introduce an approach for mining Network Evolution Subgraphs such as Network Evolution Graphlets (NEGs) and Network Evolution Motifs (NEMs) from a set of evolving networks. We used the graphlet information of a state to calculate System State Complexity (SSC). The System State Complexities (SSCs) represent time-varying complexities of multiple states. Additionally, we also used the NEG information to calculate Evolving System Complexity (ESC) for a state series over time. We proposed an algorithm named System Network Complexity (SNC) for mining NEGs, SSCs, and ESC, which analyzes a pre-evolved state series of an evolving system. We prototyped the technique as a tool named SNC-Tool, which is applied to six real-world evolving systems collected from open-internet repositories of four different domains: software system, natural language system, retail market basket system, and IMDb movie genres system. This is demonstrated as experimentation reports containing the retrieved NEGs, NEMs, SSCs, and ESC for each evolving system.
... A derivative of the TEG also plays a crucial role in the calculation of higher order temporal motifs [16]. The analysis of temporal motifs has previously uncovered various behaviours of individuals when applied to a number of different temporal networks [17][18][19][20]. ...
Article
Full-text available
Temporal networks are increasingly being used to model the interactions of complex systems. Most studies require the temporal aggregation of edges (or events) into discrete time steps to perform analysis. In this article we describe a static, lossless, and unique representation of a temporal network, the temporal event graph (TEG). The TEG describes the temporal network in terms of both the inter-event time and two-event temporal motif distributions. By considering these distributions in unison we provide a new method to characterise the behaviour of individuals and collectives in temporal networks as well as providing a natural decomposition of the network. We illustrate the utility of the TEG by providing examples on both synthetic and real temporal networks.
... Again, this type of activity analysis has been executed for multiple platforms. For instance, Jurgens and Lu [14] analyse temporal patterns in edits to Wikipedia articles. Their analysis reveals motif instances in the edit-patterns to pages. ...
Conference Paper
Full-text available
In this paper, we study activity on the microblogging platform Twitter. We analyse two separate aspects of activity on Twitter. First, we analyse the daily and weekly number of posts, through which we find clear circadian (daily) patterns emerging in the use of Twitter for multiple languages. We see that both the number of tweets and the daily and weekly activity patterns differ between languages. Second, we analyse the progression of individual tweets through retweets in the Twittersphere. We find that the size of these progressions follow a power-law distribution. Furthermore, we build an algorithm to analyse the actual structure of the progressions and use this algorithm on a limited set of tweets. We find that retweet trees show a star-like structure.
... Traditional studies of user behavior in online social communities have largely employed two approaches: some research studies aggregate sequence logs and create snapshots of the network and investigate various phenomena by using the aggregated information. Other studies define a tie between users based on statistics from the sequence data and use social network analysis to answer specific structural research questions (Keegan et al., 2015; Van Der Aalst et al., 2005; Jurgens and Lu, 2012). There are some studies that take account of temporal features in the data; however, they do not specifically investigate sequences of actions (Eagle and Pentland, 2009; Wang et al., 2015). ...
Preprint
The degree to which individuals can exert influence on propagation of information and opinion dynamics in online communities is highly dependent on their social status. Therefore, there is a high demand for identifying influential users in a community by predicting their social position in that community. Moreover, understanding how people with various social status behave, can shed light on the dynamics of interaction in social networks. In this paper, I study an evolving online social network originated from an online community for university students and I tackle the problem of forecasting users' social status, represented as their PageRank, based on frequency of recurring temporal sequences of observed behavior, i.e. behavioral motifs. I show that individuals with different values of PageRank exhibit different behavior even in early weeks since the online community's inception and it is possible to forecast future PageRank values given frequency of behavioral motifs with high accuracy.
... Wikipedia, which is one of the most popular social media networks, has been studied extensively. For example, Hu et al. [4] analysed and predicted user collaborations, Leskovec et al. [10] investigated the promotion process from the point of view of the voters engaged in group decision-making, and Jurgens et al. [6] investigated trends of editor behaviour. The page links structure has also been studied. ...
Conference Paper
In this paper we propose a framework for analysing the structure of a large-scale social media network, a topic of significant recent interest. Our study is focused on the Wikipedia category network, where nodes correspond to Wikipedia categories and edges connect two nodes if the nodes share at least one common page within the Wikipedia network. Moreover, each edge is given a weight that corresponds to the number of pages shared between the two categories that it connects. We study the structure of category clusters within the three complete English Wikipedia category networks from 2010 to 2012. We observe that category clusters appear in the form of well-connected components that are naturally clustered together. For each dataset we obtain a graph, which we call the t-filtered category graph, by retaining just a single edge linking each pair of categories for which the weight of the edge exceeds some specified threshold t. Our framework exploits this graph structure and identifies connected components within the t-filtered category graph. We studied the large-scale structural properties of the three Wikipedia category networks using the proposed approach. We found that the number of categories, the number of clusters of size two, and the size of the largest cluster within the graph all appear to follow power laws in the threshold t. Furthermore, for each network we found the value of the threshold t for which increasing the threshold to t + 1 caused the “giant” largest cluster to diffuse into two or more smaller clusters of significant size and studied the semantics behind this diffusion.
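The construction described in this abstract (weight each category pair by the number of shared pages, keep only edges whose weight exceeds a threshold t, then read off connected components) can be sketched as below. The input format (a mapping from page to its categories), the function names, and the choice to report unlinked categories as singleton clusters are assumptions for illustration; the paper's own implementation is not shown in the abstract.

```python
from collections import Counter
from itertools import combinations


def category_edge_weights(page_categories):
    """Weight each category pair by the number of pages they share."""
    weights = Counter()
    for categories in page_categories.values():
        for a, b in combinations(sorted(set(categories)), 2):
            weights[(a, b)] += 1
    return weights


def t_filtered_components(page_categories, t):
    """Connected components of the graph that keeps edges with weight > t."""
    weights = category_edge_weights(page_categories)
    parent = {c: c for cats in page_categories.values() for c in cats}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), weight in weights.items():
        if weight > t:
            parent[find(a)] = find(b)

    groups = {}
    for category in parent:
        groups.setdefault(find(category), set()).add(category)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Hypothetical data: categories A and B share two pages, B and C share one.
pages = {"p1": ["A", "B"], "p2": ["A", "B"], "p3": ["B", "C"]}
clusters_t0 = t_filtered_components(pages, 0)
clusters_t1 = t_filtered_components(pages, 1)
```

In this toy example, raising the threshold from 0 to 1 splits the single cluster into two, a small-scale version of the break-up of the largest cluster that the abstract studies as t increases.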
... This approach examines the network of how contributors (nodes) contribute to a particular article (ties). A considerable amount of edit network research examines the role of conflict in the collaboration process, which results in lower quality articles (Brandes et al. 2009, Maniu et al. 2011, Jurgens and Lu 2012). Others conceptualize the collaboration network differently, examining the network between contributors (nodes) through their interactions with one another on talk pages (ties), which we call the talk page network (Massa 2011). ...
Article
The 15-year history of collaboration on Wikipedia offers insight into how peer production communities create knowledge. In this research, we combine disparate content and collaboration approaches through a social network analysis approach known as an affiliation network. It captures both how knowledge is transferred in a peer production network and also the underlying skills possessed by its contributors in a single methodological approach. We test this approach on the Wikipedia articles dedicated to medical information developed in a subcommunity known as a WikiProject. Overall, we find that the position of an article in the affiliation network is associated with the quality of the article. We further investigate information quality through additional qualitative and quantitative approaches including expert coders using medical students, crowdsourcing using Amazon Mechanical Turk, and visualization using network graphs. A review by fourth-year medical students indicates that the Wikipedia quality rating is a reliable measure of information quality. Amazon Mechanical Turk ratings, however, are a less reliable measure of information quality, reflecting observable content characteristics such as article length and the number of references.
... Recently, the temporal motif is no longer limited by the snapshot, but has been extended to the network motif with time attribute [26]. According to this idea, Kovanen et al. [9] presented the definition of temporal motif which was widely used in Wikipedia network [27], combat system of systems (SoS) coordination [28] and mobile cohesive groups [8]. Paranjape et al. [29] also defined δ-temporal motifs and proposed the counting algorithms. ...
Article
Temporal network is a basic tool for representing complex systems, such as communication networks and social networks; besides the temporal motif (TM) plays an important role in the analysis of temporal networks. Without considering the temporal information, most existing motif mining methods focus on static networks and are not suitable for mining temporal motifs. In this paper, we study the problem of temporal motif mining for the temporal network. To formulate the problem, we define the temporal motif as a frequently connected subgraph that has a similar sequence of information flows. Moreover, an efficient algorithm called TM-Miner is proposed. Based on the time first search (TFS) algorithm, the TM-Miner builds a canonical labeling system that uses a new lexicographic order and maps the temporal graph to the unique minimum TFS code. By utilizing the canonical labeling system, the computational cost of temporal graph isomorphism is reduced and the efficiency of the algorithm is improved. Finally, we evaluate the performance of the TM-Miner algorithm in real datasets and extensive experiments demonstrate that it is faster than the existing algorithms.
... Network Motifs Applications: In [8], the authors represent star motifs of an article with different types of editors (registered, anonymous, bots and administrators) and distinct revision types (add, delete, edit and revert). Applying these motifs, they classified pages as combative or cooperative and analysed the dynamics of editor behavior to study the growth of Wikipedia. ...
Chapter
Full-text available
Wikipedia is a multilingual encyclopedia that works on the idea of virtual collaboration. Initially, its contents such as articles, editors and edits grow exponentially. Further growth analysis of Wikipedia shows slowdown or saturation in its contents. In this paper, we investigate whether two essential characteristics of Wikipedia, collaboration and cohesiveness, also encounter the phenomenon of slowdown or saturation with time. Collaboration in Wikipedia is the process where two or more editors edit together to complete a common article. Cohesiveness is the extent to which a group of editors stays together for mutual interest. We employ the concept of network motifs to investigate saturation in these two considered characteristics of Wikipedia. We consider star motifs of articles with the average number of edits to study the growth of collaboration and $2 \times 2$ complete bicliques or "butterfly" motifs to interpret the change in the cohesiveness of Wikipedia. We present the change in the count of the mentioned network motifs for the top 22 languages of Wikipedia up to May 2019. We observe saturation in collaboration but a linear or sudden rise in cohesiveness in most of the languages of Wikipedia. We therefore notice that, although the contents of Wikipedia encounter natural limits of growth, the activities of editors are still improving with time.
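A butterfly in an editor-article bipartite graph is a pair of editors who both edited the same pair of articles. As a rough sketch of how such motifs can be counted, assuming the input is a mapping from article to its editors (an illustrative format, not the chapter's), one can count shared articles per editor pair and sum the number of article pairs each editor pair has in common. The function name is hypothetical.

```python
from itertools import combinations
from math import comb


def count_butterflies(article_editors):
    """Count 2 x 2 complete bicliques in an editor-article bipartite graph.

    article_editors maps each article to an iterable of its editors.
    Repeated edits by the same editor to one article count as one edge.
    """
    editor_articles = {}
    for article, editors in article_editors.items():
        for editor in set(editors):
            editor_articles.setdefault(editor, set()).add(article)

    total = 0
    for e1, e2 in combinations(editor_articles, 2):
        shared = len(editor_articles[e1] & editor_articles[e2])
        # Every pair of shared articles closes one butterfly for (e1, e2).
        total += comb(shared, 2)
    return total


# Hypothetical data: three editors who all edited the same two articles.
edits = {"X": ["a", "b", "c"], "Y": ["a", "b", "c"]}
butterflies = count_butterflies(edits)
```

This pairwise version is quadratic in the number of editors, so it is only meant to make the definition concrete; counting at the scale of a full Wikipedia language edition would need a more efficient algorithm.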
... In this network an edge between an author and an article is equivalent to a record in the history dump file, thus there may be multiple edges between the same two nodes. Edge weights as defined in other works [11], [14], [28] were not computed. We constructed two slightly different versions of this network. ...
Conference Paper
Full-text available
The collaboration of Wikipedia editors is well researched, covered by scientific works of many different fields. There is a growing interest to implement recommender systems that guide inexperienced editors to projects which fit their interests in certain topical domains. Although there have been numerous studies focusing on editing behavior in Wikipedia the role of topical domains in this regard is still unclear. In particular, topical aspects of co-authorship are generally neglected. In this paper, we want to determine by which criteria editors usually choose articles they want to contribute to. We analyzed three different language editions of Wikipedia (Vietnamese, Hebrew, and Serbo-Croatian) by building social networks and running community detection algorithms on them, i.e. editors are grouped based on their shared involvement in Wikipedia articles using social network analysis techniques. Then, we related this to the topical domains of these articles based on Wikipedia’s user defined category network. Our results demonstrated that communities in Wikipedia tend to edit articles with a higher than average topical relatedness. But the significance and quality of these results vary considerably in the different language versions of Wikipedia. Topical relations between contributors and articles are a complex matter and influenced by a number of different factors, e.g. by culture.
... In this paper, we analyze Genius in the context of other well-studied crowdsourced information sites, such as Stack Overflow [30,32,36], Quora [21,39] Yahoo Answers [1], and Wikipedia [7,23]. The temporal dynamics of user activity on such sites have been studied in several contexts [4,5,19,27,29]. ...
Preprint
Many platforms collect crowdsourced information primarily from volunteers. As this type of knowledge curation has become widespread, contribution formats vary substantially and are driven by diverse processes across differing platforms. Thus, models for one platform are not necessarily applicable to others. Here, we study the temporal dynamics of Genius, a platform primarily designed for user-contributed annotations of song lyrics. A unique aspect of Genius is that the annotations are extremely local (an annotated lyric may just be a few lines of a song) but also highly related, e.g., by song, album, artist, or genre. We analyze several dynamical processes associated with lyric annotations and their edits, which differ substantially from models for other platforms. For example, expertise on song annotations follows a "U shape" where experts are both early and late contributors with non-experts contributing intermediately; we develop a user utility model that captures such behavior. We also find several contribution traits appearing early in a user's lifespan of contributions that distinguish (eventual) experts from non-experts. Combining our findings, we develop a model for early prediction of user expertise.
... Communication patterns can be identified using several graph concepts: from motifs [28,29] to graphlets [23], from subgraphs [37] to temporal greedy walks [38]. In particular, the study of temporal motifs has attracted a lot of interest and showed how motifs can be helpful to understand particular characteristics of the human behaviour, such as homophily [9,42], mobility [39], preferences [35], analysis of trends [26], but also human brain [3,11], stock prediction [25], weather prediction [31] and many other fields. Today, one of the main problems to understand communication patterns with current proposals is the use of fixed "size" structures [23]. ...
Article
In the last decades, temporal networks played a key role in modelling, understanding, and analysing the properties of dynamic systems where individuals and events vary in time. Of paramount importance is the representation and the analysis of Social Media, in particular Social Networks and Online Communities, through temporal networks, due to their intrinsic dynamism (social ties, online/offline status, users’ interactions, etc.). The identification of recurrent patterns in Online Communities, and in particular in Online Social Groups, is an important challenge which can reveal information concerning the structure of the social network, but also patterns of interactions, trending topics, and so on. Different works have already investigated pattern detection in several scenarios by focusing mainly on identifying the occurrences of fixed and well-known motifs (mostly, triads) or more flexible subgraphs. In this paper, we present the concept of Incremental Communication Patterns, which lie in between motifs, from which they inherit the meaningfulness of the identified structure, and subgraphs, from which they inherit the possibility to be extended as needed. We formally define the Incremental Communication Patterns and exploit them to investigate the interaction patterns occurring in a real dataset consisting of 17 Online Social Groups taken from the list of Facebook groups. The results of our experimental analysis uncover interesting aspects of the interaction patterns occurring in social groups and reveal that Incremental Communication Patterns are able to capture the roles of the users within the groups.
... The MR method that looks at mutual reverts in the revision history as a sign of edit wars is an example of these methods. At a more advanced level, these edit patterns are modeled by network motifs in a recent work [11], where the network motifs are defined by considering the network of editors and articles over each three consecutive versions. The frequencies of different network motif types (more than 39,000 different types) over the entire revision history of articles are extracted as feature vectors and different edit patterns are learned for controversial and non-controversial articles. ...
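To make the idea of mutual reverts concrete, here is a minimal sketch that finds pairs of editors who have reverted each other in a single article's history. It is not the MR method as published: the revert rule used here (a revision whose text is identical to an earlier revision undoes everything in between), the function name `mutual_reverts`, and the `(editor, text)` input format are assumptions chosen for illustration.

```python
import hashlib


def mutual_reverts(revisions):
    """Return the pairs of editors who have reverted each other.

    `revisions` is a chronological list of (editor, text) tuples. A revision
    is treated as a revert when its text matches an earlier revision; the
    authors of all revisions in between count as reverted (assumed rule).
    """
    last_seen = {}        # content hash -> index of latest revision with it
    reverted_by = set()   # directed (reverter, reverted) pairs
    for i, (editor, text) in enumerate(revisions):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in last_seen:
            # Everything after the restored revision was undone.
            for victim, _ in revisions[last_seen[digest] + 1:i]:
                if victim != editor:
                    reverted_by.add((editor, victim))
        last_seen[digest] = i

    # Keep only pairs where the revert relation holds in both directions.
    return {frozenset(p) for p in reverted_by if (p[1], p[0]) in reverted_by}


# Hypothetical toy history: ann and bob undo each other, cat just edits.
history = [("ann", "v1"), ("bob", "v2"), ("ann", "v1"),
           ("bob", "v2"), ("cat", "v3")]
print(mutual_reverts(history))
```

Hashing the revision text keeps the lookup table small for long articles; comparing full texts directly would give the same result on this toy input.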
Conference Paper
Wikipedia articles are the result of the collaborative editing of a diverse group of anonymous volunteer editors, who are passionate and knowledgeable about specific topics. One can argue that this plurality of perspectives leads to broader coverage of the topic, thus benefitting the reader. On the other hand, differences among editors on polarizing topics can lead to controversial or questionable content, where facts and arguments are presented and discussed to support a particular point of view. Controversial articles are manually tagged by Wikipedia editors, and span many interesting and popular topics, such as religion, history, and politics, to name a few. Recent works have been proposed on automatically identifying controversy within unmarked articles. However, to date, no systematic comparison of these efforts has been made. This is in part because the various methods are evaluated using different criteria and on different sets of articles by different authors, making it hard for anyone to verify the efficacy and compare all alternatives. We provide a first attempt at bridging this gap. We compare five different methods for modelling and identifying controversy, and discuss some of the unique difficulties and opportunities inherent to the way Wikipedia is produced.
Article
Many types of social media metadata come in forms of temporal networks, networks where we have information about not only who is in contact with whom but also when contacts happen. In this paper, we review methods to analyze temporal networks developed in the last few years applied to social media data. These methods seek to identify important spreaders and, in more generality, how the temporal and topological structure of interaction affects spreading processes.
Article
This paper addresses the need to characterize system instability toward critical transitions in complex systems. We propose a novel information dynamic spectrum framework and a probabilistic light cone method to automate the analysis. Our framework uniquely investigates heterogeneously networked dynamical systems with transient directional influences, which subsumes unidirectional diffusion dynamics. When the observed instability of a system deviates from the prediction, the method automatically indicates the approach of an upcoming critical transition. We provide several demonstrations in engineering, economics, and social systems. The results suggest that early detection of critical transitions of synchronization, sudden collapse, and exponential growth is possible.
Article
One of the challenges in network data analysis is the determination of the most informative perspective on the network to use in analysis. This is particularly an issue when the network is dynamic and is defined by events that occur over time. We present an example of such a scenario in the analysis of edit networks in Wikipedia—the networks of editors interacting on Wikipedia pages. We propose the prediction of article quality as a task that allows us to quantify the informativeness of alternative network views. We present three fundamentally different views on the data that attempt to capture structural and temporal aspects of the edit networks. We demonstrate that each view captures information that is unique to that view and propose a strategy for integrating the different sources of information.
Article
To understand large, connected systems we cannot only zoom into the details. We also need to see the large-scale features from afar. One way to take a step back and get the whole picture is to model the systems as a network. However, many systems are not static, but consist of contacts that are off and on as time progresses. This chapter is an introduction to the mathematical and computational modeling of such systems, and thus an introduction to the rest of the book. We will cover some of the earlier developments that form the foundation for the more specialized topics of the other chapters.
Article
The structure of complex networks can be characterized by counting and analysing network motifs. Motifs are small graph structures that occur repeatedly in a network, such as triangles or chains. Recent work has generalized motifs to temporal and dynamic network data. However, existing techniques do not generalize to sequential or trajectory data, which represent entities moving through the nodes of a network, such as passengers moving through transportation networks. The unit of observation in these data is fundamentally different since we analyse observations of trajectories (e.g. a trip from airport A to airport C through airport B), rather than independent observations of edges or snapshots of graphs over time. In this work, we define sequential motifs in trajectory data, which are small, directed and sequence-ordered graphs corresponding to patterns in observed sequences. We draw a connection between the counting and analysis of sequential motifs and Higher-Order Network (HON) models. We show that by mapping edges of a HON, specifically a $k$th-order DeBruijn graph, to sequential motifs, we can count and evaluate their importance in observed data. We test our methodology with two datasets: (1) passengers navigating an airport network and (2) people navigating the Wikipedia article network. We find that the most prevalent and important sequential motifs correspond to intuitive patterns of traversal in the real systems and show empirically that the heterogeneity of edge weights in an observed higher-order DeBruijn graph has implications for the distributions of sequential motifs we expect to see across our null models.
Article
Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines.
Conference Paper
This paper addresses the need to predict system instability toward critical transitions occurring in complex systems. A novel information dynamic spectrum framework and a method for automated prediction of system trajectories are proposed. Our framework goes beyond unidirectional diffusion dynamics to investigate heterogeneously networked dynamical systems with transient directional influence dynamics. Our method automatically analyzes the input time series of system instability to predict the instability trajectories toward critical transitions.
Conference Paper
A fundamental problem in behavioral analysis of human interactions is to understand how communications unfold. In this paper, we study this problem by mining communication motifs from dynamic interaction networks. A communication motif is a recurring subgraph that has a similar sequence of information flow. Mining communication motifs requires us to explore the exponential subgraph search space where existing techniques fail to scale. To tackle this scalability bottleneck, we develop a technique called COMMIT. COMMIT converts a dynamic graph into a database of sequences. Through careful analysis in the sequence space, only a small portion of the exponential search space is accessed to identify regions embedding communication motifs. Extensive experiments on three different social networks show COMMIT to be up to two orders of magnitude faster than baseline techniques. Furthermore, qualitative analysis demonstrates communication motifs to be effective in characterizing the recurring patterns of interactions while also revealing the role that the underlying social network plays in shaping human behavior.
Conference Paper
Determining the occurrence of motifs yields profound insight for many biological systems, like metabolic, protein-protein interaction, and protein structure networks. Meaningful spatial protein-structure motifs include enzyme active sites and ligand-binding sites which are essential for function, shape, and performance of an enzyme. Analyzing their dynamics over time leads to a better understanding of underlying properties and processes. In this work, we present StreaM, a stream-based algorithm for counting undirected 4-vertex motifs in dynamic graphs. We evaluate StreaM against the four predominant approaches from the current state of the art on generated and real-world datasets, a simulation of a highly dynamic enzyme. For this case, we show that StreaM is capable of capturing essential molecular protein dynamics and thereby provides a powerful method for evaluating large molecular dynamics trajectories. Compared to related work, our approach achieves speedups of up to 2,300 times on real-world datasets.
Conference Paper
Social media have become a popular platform for people to share their opinions and emotions. Analyzing opinions that are posted on the web is very important since they influence future decisions of organizations and people. Comparative opinion mining is a subfield of opinion mining that deals with identifying and extracting information that is expressed in a comparative form. Because a huge amount of opinions is posted online every day, analyzing comparative opinions from a temporal perspective is an important application that needs to be explored. This study introduces the idea of integrating temporal elements in comparative opinion mining. Different types of results can be obtained from the temporal analysis, including trend analysis, competitive analysis, and burst detection. In our study we show that temporal analysis of comparative opinion mining provides more current and relevant information to users compared to standard opinion mining.
Conference Paper
This study investigates the behaviour of Ukrainian, Russian and English Wikipedia contributors in terms of their attention management, which Pierre Lévy casts as the initial stage of personal knowledge management. We analyse the salience of the Ukrainian crisis of 2013-14 as a topic of public discussion on the national, regional and international level, as well as the changing intensity of discussions between Ukrainian-speaking, Russian-speaking and English-speaking communities of Wikipedia contributors. We propose a meta-driven methodology to identify and track multi-faceted topics of public discussion rather than individual articles, which is common practice in Wikipedia scholarship. We develop a ‘discussion intensity’ metric to trace the salience of topics related to the Ukrainian crisis among Wikipedia contributors over time and to detect which aspects of this topic fuel discussions and direct attention. This method allows for a comparison across different language versions of Wikipedia and enables the identification of major differences in the attention management of different communities of Wikipedia creators and the role of the encyclopaedia in the development of collective knowledge. We observe three distinct patterns of collective attention management, which we characterize as intense attention, dispersed attention, and focused attention.
Conference Paper
Recent research has discovered the importance of informal roles in peer online collaboration. These roles reflect prototypical activity patterns of contributors such as different editing activities in writing communities. While previous work has analyzed the dynamics of contributors within single communities, so far, the relationship between individuals' roles and interaction among contributors remains unclear. This is a severe drawback given that collaboration is one of the driving forces in online communities. In this study, we use a network-based approach to combine information about individuals' roles and their interaction over time. We measure the impact of recurring subgraphs in co-author networks, so-called motifs, on the overall quality of the resulting collaborative product. Doing so allows us to measure the effect of collaboration over mere isolated contributions by individuals. Our findings indicate that there are indeed consistent positive implications of certain patterns that cannot be detected when looking at contributions in isolation, e.g. we found shared positive effects of contributors that specialize in content quality over quantity. The empirical results presented in this work are based on a study of several online writing communities, namely wikis from Wikia and Wikipedia.
Article
Current theories struggle to explain how participants in peer-production self-organize to produce high-quality knowledge in the absence of formal coordination mechanisms. The literature traditionally holds that norms, policies, and roles make coordination possible. However, peer-production is largely free from workflow constraints and most peer-production communities do not allocate or assign tasks. Yet, scholars have suggested that ordered work sequences can emerge in such settings. We refer to sequences of activities that emerge organically as components of "emergent routines". The volunteer nature of peer-production, coupled with high degrees of turnover, makes learning and coordination difficult, calling into question the extent to which emergent routines could be ingrained in the community. The objective of this paper is to characterize the work sequences that organically emerge in peer-production, as well as to understand the temporal dynamics of these emergent routine components. We center our empirical investigation on the peer-production of a set of 1,000 Wikipedia articles. Using a dataset of labelled wiki work, we employ Variable-Length Markov Chains (VLMC) to identify sequences of activities exhibiting structural dependence, cluster the sequences to identify components of emergent routines, and then track their prevalence over time. We find that work is organized according to several routine components and that the prevalence of these components changes over time.
Preprint
Investigating the frequency and distribution of small subgraphs with a few nodes/edges, i.e., motifs, is an effective analysis method for static networks. Motif-driven analysis is also useful for temporal networks where the number of motifs is significantly larger due to the additional temporal information on edges. This variety makes it challenging to design a temporal motif model that can consider all aspects of temporality. In the literature, previous works have introduced various models that handle different characteristics. In this work, we compare the existing temporal motif models, and evaluate the facets of temporal networks that are overlooked in the literature. We first survey four temporal motif models and highlight their differences. Then, we evaluate the advantages and limitations of these models with respect to the temporal inducedness and timing constraints. In addition, we suggest a new lens, event pairs, to study temporal motifs to investigate cause-effect relations. We believe that our comparative survey and extensive evaluation will catalyze the research on temporal network motif models.
Preprint
Patents are intellectual properties that reflect innovative activities of companies and organizations. The literature is rich with studies that analyze the citations among the patents and the collaboration relations among companies that own the patents. However, the adversarial relations between the patent owners are not as well investigated. One proxy to model such relations is the patent opposition, which is a legal activity in which a company challenges the validity of a patent. Characterizing the patent oppositions, collaborations, and the interplay between them can help better understand the companies' business strategies. Temporality matters in this context as the order and frequency of oppositions and collaborations characterize their interplay. In this study, we construct a two-layer temporal network to model the patent oppositions and collaborations among the companies. We utilize temporal motifs to analyze the oppositions and collaborations from structural and temporal perspectives. We first characterize the frequent motifs in patent oppositions and investigate how often the companies of different sizes attack other companies. We show that large companies tend to engage in opposition with multiple companies. Then we analyze the temporal interplay between collaborations and oppositions. We find that two adversarial companies are more likely to collaborate in the future than two collaborating companies are to oppose each other in the future.
Article
Wikipedia has become an immensely popular crowd-sourced encyclopedia for information dissemination on numerous versatile topics in the form of subscription-free content. It allows anyone to contribute so that the articles remain comprehensive and updated. For enrichment of content without compromising standards, the Wikipedia community enumerates a detailed set of guidelines, which should be followed. Based on these, articles are categorized into several quality classes by the Wikipedia editors with increasing adherence to guidelines. This quality assessment task by editors is laborious and demands platform expertise. As a first objective, in this paper, we study the evolution of a Wikipedia article with respect to such quality scales. Our results show novel non-intuitive patterns emerging from this exploration. As a second objective we attempt to develop an automated data-driven approach for the detection of the early signals influencing the quality change of articles. We posit this as a change point detection problem whereby we represent an article as a time series of consecutive revisions and encode every revision by a set of intuitive features. Finally, various change point detection algorithms are used to efficiently and accurately detect the future change points. We also perform various ablation studies to understand which groups of features are most effective in identifying the change points. To the best of our knowledge, this is the first work that rigorously explores the English Wikipedia article quality life cycle from the perspective of quality indicators and provides a novel unsupervised page-level approach to detect quality switches, which can help in automatic content monitoring in Wikipedia, thus contributing significantly to the CSCW community.
Article
We assess the potential of network motif profiles to characterize ego-networks in much the same way that a bag-of-words strategy allows text documents to be compared in a vector space framework. This is potentially valuable as a generic strategy for comparing nodes in a network in terms of the network structure in which they are embedded. In this paper, we consider the computational challenges and model selection decisions involved in network motif profiling. We also present three case studies concerning the analysis of Wikipedia edit networks, YouTube spam campaigns, and peer-to-peer lending in the Prosper marketplace.
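The bag-of-words analogy in this abstract can be sketched as follows: each ego-network is summarised by a vector of motif counts, and two such profiles are compared in that vector space. The abstract does not say which similarity measure is used; cosine similarity, the function name `profile_similarity`, and the motif labels in the toy data are assumptions of this example.

```python
from math import sqrt


def profile_similarity(profile_a, profile_b):
    """Cosine similarity between two motif-count profiles.

    Each profile maps a motif label to how often that motif occurs in a
    node's ego-network. Missing labels are treated as zero counts.
    """
    labels = set(profile_a) | set(profile_b)
    dot = sum(profile_a.get(m, 0) * profile_b.get(m, 0) for m in labels)
    norm_a = sqrt(sum(v * v for v in profile_a.values()))
    norm_b = sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is not similar to anything
    return dot / (norm_a * norm_b)


# Hypothetical motif counts for two ego-networks.
page_x = {"triangle": 4, "open_triad": 10, "star": 1}
page_y = {"triangle": 8, "open_triad": 20, "star": 2}
print(profile_similarity(page_x, page_y))
```

Because cosine similarity ignores vector length, `page_y` (every count doubled) is treated as having the same profile shape as `page_x`, which is one simple way to reduce the effect of network size.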
Conference Paper
The study of collaboration patterns in wikis can help shed light on the process of content creation by online communities. To turn a wiki's revision history into a collaboration network, we propose an algorithm that identifies as authors of a page the users who provided most of its relevant content, measured in terms of quantity and of acceptance by the community. The scalability of this approach allows us to study the English Wikipedia community as a co-authorship network. We find evidence of the presence of a nucleus of very active contributors, who seem to spread over the whole wiki, and to interact preferentially with inexperienced users. The fundamental role played by this elite is witnessed by the growing centrality of sociometric stars in the network. Isolating the community active around a category, it is possible to study its specific dynamics and most influential authors.
Conference Paper
The success of Wikipedia and the relatively high quality of its articles seem to contradict conventional wisdom. Recent studies have begun shedding light on the processes contributing to Wikipedia's success, highlighting the role of coordination and contribution inequality. In this study, we expand on these works in two ways. First, we make a distinction between global (Wikipedia-wide) and local (article-specific) inequality and investigate both constructs. Second, we explore both direct and indirect effects of these inequalities, exposing the intricate relationships between global inequality, local inequality, coordination, and article quality. We tested our hypotheses on a sample of Wikipedia articles using structural equation modeling and found that global inequality exerts a significant positive impact on article quality, while the effect of local inequality is indirect and is mediated by coordination.
Conference Paper
Wikipedia, a wiki-based encyclopedia, has become one of the most successful experiments in collaborative knowledge building on the Internet. As Wikipedia continues to grow, the potential for conflict and the need for coordination increase as well. This article examines the growth of such non-direct work and describes the development of tools to characterize conflict and coordination costs in Wikipedia. The results may inform the design of new collaborative knowledge systems. Author Keywords: Wikipedia, wiki, collaboration, conflict, user model, Web-based interaction, visualization.
Conference Paper
Prior research on Wikipedia has characterized the growth in content and editors as being fundamentally exponential in nature, extrapolating current trends into the future. We show that recent editing activity suggests that Wikipedia growth has slowed, and perhaps plateaued, indicating that it may have come up against its limits to growth. We measure growth, population shifts, and patterns of editor and administrator activities, contrasting these against past results where possible. Both the rate of page growth and the rate of editor growth have declined. As growth has declined, there are indicators of increased coordination and overhead costs, exclusion of newcomers, and resistance to new edits. We discuss some possible explanations for these new developments in Wikipedia including decreased opportunities for sharing existing knowledge and increased bureaucratic stress on the socio-technical system itself. The existing trends of exponential growth in digital technologies were the basis for Kurzweil's (17) argument that biological evolution and technological evolution follow a law of accelerating returns (i.e., exponential or even super-exponential growth). This led to the notion of the "Singularity": a point in the near future when technological change becomes "so rapid and profound that it represents a rupture in the fabric of human history." We argue that Wikipedia, one of the world's largest knowledge aggregators, does indeed mirror the growth of natural populations, but, following Darwin (7), we suggest that this growth becomes increasingly constrained and limited, and under those conditions there will be increased evidence of competition and dominance. In this paper, we present data that challenges the notion that Wikipedia exhibits unconstrained exponential growth in editor participation and contribution. We will show that growth has decreased substantially over the last two years, perhaps indicating some fundamental limiting constraints to growth.
In ecological systems, when unfettered population growth approaches natural limits (e.g., in available resources), one generally observes increased competition. For Wikipedia, we will examine the data for indicators of increased competition that would be expected as a growing population system comes up against limits to growth. We present data from Wikipedia addressing three different aspects over time: the global activity level, a detailed analysis of the edit rates of various editor classes, and the population shifts in editor classes.
Article
Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological–temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.
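The notion of grouping event sequences by both topology and temporal order can be illustrated with a deliberately simplified sketch that handles only pairs of events. It is not the algorithm from the cited paper: the pairing rule (two events share a node and lie within `delta_t` of each other), the relabelling of nodes by first appearance, and all names are assumptions, and the sketch skips the paper's additional requirement that the paired events be consecutive for the shared node.

```python
from collections import Counter


def two_event_motifs(events, delta_t):
    """Count classes of two-event sequences in a temporal network.

    `events` is a list of (time, source, target) tuples. Two events are
    paired when they share a node and the later one occurs within
    `delta_t` of the earlier one (a simplified adjacency rule).
    """
    events = sorted(events)
    counts = Counter()
    for i, (t1, u1, v1) in enumerate(events):
        for t2, u2, v2 in events[i + 1:]:
            if t2 - t1 > delta_t:
                break  # events are time-sorted, so the rest are too late
            if {u1, v1} & {u2, v2}:
                # Relabel nodes by order of first appearance so sequences
                # with the same topology and event order fall together.
                labels = {}
                for node in (u1, v1, u2, v2):
                    labels.setdefault(node, len(labels))
                counts[(labels[u1], labels[v1], labels[u2], labels[v2])] += 1
    return counts


# Hypothetical call log: (time, caller, callee).
calls = [(0, "a", "b"), (5, "b", "a"), (8, "b", "c"), (100, "c", "a")]
motifs = two_event_motifs(calls, delta_t=10)
print(motifs)
```

In this toy log the key `(0, 1, 1, 0)` is a returned call, `(0, 1, 1, 2)` is a call that is forwarded to a third party, and `(0, 1, 0, 2)` is one caller contacting two different people in a row. The call at time 100 is too late to pair with anything.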
Article
In this paper we give models and algorithms to describe and analyze the collaboration among authors of Wikipedia from a network analytical perspective. The edit network encodes who interacts how with whom when editing an article; it significantly extends previous network models that code author communities in Wikipedia. Several characteristics summarizing some aspects of the organization process and allowing the analyst to identify certain types of authors can be obtained from the edit network. Moreover, we propose several indicators characterizing the global network structure and methods to visualize edit networks. It is shown that the structural network indicators are correlated with quality labels of the associated Wikipedia articles.
Article
This paper assesses the content- and population-dynamics of a large sample of wikis, over a timespan of several months, in order to identify basic features that may predict or induce different types of fate. We analyze and discuss, in particular, the correlation of various macroscopic indicators, structural features and governance policies with specific growth patterns. While recent analyses of wiki dynamics have mostly focused on popular projects such as Wikipedia, we suggest research directions towards a more general theory of the dynamics of such communities.
Conference Paper
Good Wikipedia articles are authoritative sources due to the collaboration of a number of knowledgeable contributors. This is the many eyes idea. The edit network associated with a Wikipedia article can tell us something about its quality or authoritativeness. In this paper we explore the hypothesis that the characteristics of this edit network are predictive of the quality of the corresponding article's content. We characterize the edit network using a profile of network motifs and we show that this network motif profile is predictive of the Wikipedia quality classes assigned to articles by Wikipedia editors. We further show that the network motif profile can identify outlier articles particularly in the 'Featured Article' class, the highest Wikipedia quality class.
Conference Paper
Reverts are important to maintaining the quality of Wikipedia. They fix mistakes, repair vandalism, and help enforce policy. However, reverts can also be damaging, especially to the aspiring editor whose work they destroy. In this research we analyze 400,000 Wikipedia revisions to understand the effect that reverts had on editors. We seek to understand the extent to which they demotivate users, reducing the workforce of contributors, versus the extent to which they help users improve as encyclopedia editors. Overall we find that reverts are powerfully demotivating, but that their net influence is that more quality work is done in Wikipedia as a result of reverts than is lost by chasing editors away. However, we identify key conditions (most specifically, new editors being reverted by much more experienced editors) under which reverts are particularly damaging. We propose that reducing the damage from reverts might be one effective path for Wikipedia to solve the newcomer retention problem.
Article
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
Article
When the probability of measuring a particular value of some quantity varies inversely as a power of that value, the quantity is said to follow a power law, also known variously as Zipf's law or the Pareto distribution. Power laws appear widely in physics, biology, earth and planetary sciences, economics and finance, computer science, demography and the social sciences. For instance, the distributions of the sizes of cities, earthquakes, solar flares, moon craters, wars and people's personal fortunes all appear to follow power laws. The origin of power-law behaviour has been a topic of debate in the scientific community for more than a century. Here we review some of the empirical evidence for the existence of power-law forms and the theories proposed to explain them.
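To make the definition concrete, a quantity follows a power law when its density has the form p(x) = C x^(-α) for x ≥ x_min. The sketch below is a minimal illustration under that assumption: it draws synthetic samples by inverse-transform sampling and recovers the exponent with the standard maximum-likelihood estimator for the continuous case. The function names, the seed, and the parameter values are chosen for this example only and are not taken from the review.

```python
import math
import random

def sample_power_law(n, alpha, x_min, rng):
    """Draw n values with density proportional to x**(-alpha) for x >= x_min."""
    # Inverse-transform sampling: x = x_min * (1 - r) ** (-1 / (alpha - 1))
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def estimate_alpha(values, x_min):
    """Maximum-likelihood estimate: alpha = 1 + n / sum(ln(x_i / x_min))."""
    tail = [x for x in values if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

rng = random.Random(0)  # fixed seed so the run is repeatable
data = sample_power_law(100_000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat = estimate_alpha(data, x_min=1.0)
```

With this many samples the estimate should land close to the true exponent of 2.5. On real data the choice of `x_min` matters a great deal, because most empirical distributions follow a power law only in their tail.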
Sumi, R.; Yasseri, T.; Rung, A.; Kornai, A.; and Kertész, J. 2011. Edit wars in Wikipedia. In IEEE Third International Conference on Social Computing.
Viégas, F.; Wattenberg, M.; and Dave, K. 2004. Studying cooperation and conflict between authors with history flow visualizations. In Proceedings of the SIGCHI conference on Human factors in computing systems, 575–582. ACM.
Vuong, B.; Lim, E.; Sun, A.; Le, M.; and Lauw, H. 2008. On ranking controversies in Wikipedia: models and evaluation. In Proceedings of the international conference on Web search and web data mining, 171–182. ACM.
Cobb, L. 2009. Study of vandalism survival times.
Holme, P., and Saramäki, J. 2012. Temporal networks. Physics Reports. To appear.
Kittur, A.; Suh, B.; Pendleton, B.; and Chi, E. 2007. He says, she says: Conflict and coordination in Wikipedia. In Proceedings of the SIGCHI conference on Human factors in computing systems, 453–462. ACM.
Milo, R.; Shen-Orr, S.; Itzkovitz, S.; Kashtan, N.; Chklovskii, D.; and Alon, U. 2002. Network motifs: simple building blocks of complex networks. Science 298(5594):824.
Newman, M. 2005. Power laws, Pareto distributions and Zipf's law. Contemporary Physics 46(5):323–351.