Conference Paper · PDF Available

Harnessing the wisdom of crowds in Wikipedia: Quality through coordination


Abstract and Figures

Wikipedia's success is often attributed to the large numbers of contributors who improve the accuracy, completeness and clarity of articles while reducing bias. However, because of the coordination needed to write an article collaboratively, adding contributors is costly. We examined how the number of editors in Wikipedia and the coordination methods they use affect article quality. We distinguish between explicit coordination, in which editors plan the article through communication, and implicit coordination, in which a subset of editors structure the work by doing the majority of it. Adding more editors to an article improved article quality only when they used appropriate coordination techniques and was harmful when they did not. Implicit coordination through concentrating the work was more helpful when many editors contributed, but explicit coordination through communication was not. Both types of coordination improved quality more when an article was in a formative stage. These results demonstrate the critical importance of coordination in effectively harnessing the "wisdom of the crowd" in online production environments.
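The finding that implicit coordination works "through concentrating the work" lends itself to a simple quantitative reading: how unevenly are an article's edits distributed across its editors? A minimal sketch, using a Gini coefficient as a stand-in for the paper's concentration measure (both the metric choice and the edit counts below are illustrative assumptions, not the paper's exact operationalization):

```python
def gini(edit_counts):
    """Gini coefficient over per-editor edit counts.

    0.0 means edits are spread evenly across editors; values near 1.0
    mean a small subset of editors does most of the work (the kind of
    concentration the paper treats as implicit coordination).
    """
    counts = sorted(edit_counts)
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum((i + 1) * x for i, x in enumerate(counts))
    return (2 * weighted) / (n * total) - (n + 1) / n

# Work spread evenly across four editors -> no concentration
print(gini([5, 5, 5, 5]))    # 0.0
# One editor makes 97 of 100 edits -> high concentration
print(gini([1, 1, 1, 97]))   # ~0.72
```

An article where four editors each make five edits scores 0.0, while one where a single editor does nearly all the work scores about 0.72, making the "subset of editors structures the work by doing the majority of it" idea measurable.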
... These appear to undermine non-ownership principles, even if the cause of such non-observance of Wikipedia rules is not fully clear. Kittur & Kraut (2008) recognised the disparity between the open-access environment of Wikipedia and other peer production settings: "Management implications. Both the public perception and the ideology of Wikipedia are of a free and open environment, where everyone is equally eligible to contribute. However, even a free and open environment needs coordinati ...
Hypertextual in nature, the Web in its earliest form was technically limited and not capable of using the full richness of hypertext at that time. Despite subsequent advances in Web technology, some of the older hypertextual capabilities remain unrealised, and hypertext/media appears to be treated more as a technology than a medium. For a hypertext docuverse that holds changing information, such as a knowledge base, paying heed to its hypertextual structure aids the long-term health and sustainability of the knowledge it contains. Wikipedia is the world's largest public hypertext knowledge base. Constantly updated by humans and bots, it is an ever-changing knowledge store. Using Wikipedia as a context, this thesis investigates whether large collaborative hypertexts show signs of their contributors using deliberate hypertextual structure or are simply connecting ‘pages’ of digital content. The research also considers collaborative hypertexts in the context of social machines with regard to sustaining organisational knowledge as hypertext content. The results reveal under-use of processes available to sustain and improve an organisation’s docuverse and a gap in organisational roles and skill-sets to apply those processes.
... Task-related topics in the crowdsourcing context have been studied in the form of crowdsourcing contests (Zheng et al., 2011), human intelligence (Kurup et al., 2020), open-source software (Carillo et al., 2017), citizen science (Tinati et al., 2017), Wikipedia (Kittur & Kraut, 2008), cultural heritage crowdsourcing (Zhang et al., 2020), and other general online crowdsourcing contexts (Neto & Santos, 2018; Pee et al., 2018; Zhao & Zhu, 2016). Typical online marketplaces, such as Amazon Mechanical Turk and DesignCrowd, emphasize the efficiency of task completion, while the corresponding research focuses on task recommendation (Kurup et al., 2020; Yuen et al., 2015), task assignment (Ho & Vaughan, 2012), task composition and decomposition (Amer-Yahia et al., 2016; Jiang et al., 2020), and participant remuneration (Mao et al., 2013). ...
... In non-profit crowdsourcing platforms, such as citizen science projects and Wikipedia, registration is optional and volunteers can engage anonymously. Unlike studies on commercial crowdsourcing, task-related studies in non-profit crowdsourcing contexts have mainly focused on task design (Sprinks et al., 2017), virtual task rewarding (Cappa et al., 2018), task complexity and granularity (Nov et al., 2011), and task significance (Schroer & Hertel, 2009) to encourage volunteer engagement (Tinati et al., 2017), improve the quality of outcomes (Kittur & Kraut, 2008), and enrich scientific outputs (Phillips et al., 2018). Most relevant empirical studies are based on surveys, interviews, and quasi-experiments from a behavioral science perspective. ...
As the crowdsourcing approach is increasingly being used for digitizing cultural heritage artifacts, there is a rising need for volunteer engagement in such collaborative digital humanities projects. This study focuses on the less explored topic of imbalanced volunteer engagement (IVE); it refers to the fact that most volunteers tend to focus on only a small portion of tasks, making it challenging to sustain cultural heritage crowdsourcing (CHC) projects. Using a public dataset containing 145,168,535 items captured from the Australian Newspaper Digitisation Project, we utilized a machine learning-based causal inference approach to investigate the IVE problem by examining the causal relationships between task content characteristics and volunteer engagement. We used a directed acyclic graph (DAG) to represent the causal structure, obtaining a graph of 11 nodes and 16 edges. Specifically, four causes, including task category, word count, number of task lists, and whether the task was illustrated, directly affect IVE. We further discuss these findings from a theoretical perspective and suggest three propositions for alleviating the IVE problem: a) nudge-like intervention via a task list, b) subjectively (perceived) low task complexity, and c) attractive task presentation. This study contributes to the literature on volunteer engagement in the CHC context and sheds new light on the design and implementation of collaborative digital humanities projects.
... In online collaboration, there are two main threads of research in HCI and CSCW. One thread is about distributed teamwork with organized, stable, and long-lasting collaboration in contexts such as teleworking, software development, and education (e.g., [2,3,19,31,62]); another is about virtual teamwork with informal, unstable, and short-term collaboration in contexts such as gaming (e.g., [15,22,46]) and peer production communities (e.g., [39][40][41]). In the informal, unstable, and short-term collaborative process, many scholars have investigated the group characteristics of the volunteer workers and how these characteristics are related to group productivity and outcome quality (see meta-analysis by [1,51]). ...
... As the community grows, conflicts increase, and the costs of coordination, such as conflict resolution, consensus building, and community management, also increase [41]. Coordination mechanisms were not always effective for managing different conflicts in the team [40], and both implicit coordination through editor concentration and explicit coordination through communication can potentially improve article quality when used [39]. While most scholarship focuses on the editorial team and the visible outcome (though some have mentioned that editors can revert vandalism), little work explores the moderation team and the invisible content management, namely, how volunteer mods work as a team to get rid of harmful content, such as vandalism on Wikipedia and harassment on Twitch. ...
Full-text available
Volunteer moderators (mods) play significant roles in developing moderation standards and dealing with harmful content in their micro-communities. However, little work explores how volunteer mods work as a team. In line with prior work about understanding volunteer moderation, we interview 40 volunteer mods on Twitch - a leading live streaming platform. We identify how mods collaborate on tasks (off-stream coordination and preparation, in-stream real-time collaboration, and relationship building both off-stream and in-stream to reinforce collaboration) and how mods contribute to moderation standards (collaboratively working on the community rulebook and individually shaping community norms). We uncover how volunteer mods work as an effective team. We also discuss how the affordances of multi-modal communication and the informality of volunteer moderation contribute to task collaboration, standards development, and mods' roles and responsibilities.
... However, population size is a factor that can critically affect performance, as it is known to affect the worker collective's coordination costs. A larger population means a larger search space of available candidate teammates, and therefore more effort needed by the workers to assess suitable teammates (Kittur and Kraut, 2008). We simulate nine separate and increasing population sizes, starting from 20 workers per pool (our basic simulation setting) and going up to 100, to observe how the average best, worst, and median teamwork quality vary accordingly. ...
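The coordination-cost argument in this excerpt can be made concrete: even under the simplest assumption that each distinct pair of workers must be evaluated once, effort grows quadratically with pool size. A toy sketch (the pairwise cost model is my assumption, not the cited simulation's; the pool sizes mirror the 20-to-100 range quoted above):

```python
def pairwise_evaluations(n):
    """Number of distinct teammate pairs in a pool of n workers:
    n * (n - 1) / 2, a lower bound on exhaustive teammate assessment."""
    return n * (n - 1) // 2

# Nine increasing pool sizes, 20 to 100 workers, as in the excerpt.
for n in range(20, 101, 10):
    print(n, pairwise_evaluations(n))
```

Going from 20 to 100 workers raises the pair count from 190 to 4,950, a 26-fold increase for a 5-fold larger pool, which is one way to see why larger populations strain coordination.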
Full-text available
Modern crowdsourcing offers the potential to produce solutions for increasingly complex tasks requiring teamwork and collective labor. However, the vast scale of the crowd makes forming project teams an intractable problem to coordinate manually. To date, most crowdsourcing collaborative platforms rely on algorithms to automate team formation based on worker profiling data and task objectives. As a top-down strategy, algorithmic crowd team formation tends to alienate workers, causing poor collaboration, interpersonal clashes, and dissatisfaction. In this paper, we investigate different ways that crowd teams can be formed through three team formation models, namely bottom-up, top-down, and hybrid. By simulating an open collaboration scenario such as a hackathon, we observe that the bottom-up model forms the most competitive teams with the highest teamwork quality. Furthermore, we note that bottom-up approaches are particularly suitable for populations with high risk appetites (most workers being lenient toward exploring new team configurations) and high degrees of homophily (most workers preferring to work with similar teammates). Our study highlights the importance of integrating worker agency in algorithm-mediated team formation systems, especially in collaborative/competitive settings, and bears practical implications for large-scale crowdsourcing platforms.
Wikipedia has grown into an immensely popular crowd-sourced encyclopedia disseminating subscription-free content on numerous versatile topics. It allows anyone to contribute so that articles remain comprehensive and updated. To enrich content without compromising standards, the Wikipedia community enumerates a detailed set of guidelines that contributors should follow. Based on these, articles are categorized by Wikipedia editors into several quality classes with increasing adherence to the guidelines. This quality assessment task is laborious and demands platform expertise. As a first objective, in this paper we study the evolution of Wikipedia articles with respect to such quality scales; our results show novel, non-intuitive patterns emerging from this exploration. As a second objective, we develop an automated, data-driven approach for detecting the early signals that influence quality changes in articles. We posit this as a change point detection problem, whereby we represent an article as a time series of consecutive revisions and encode every revision by a set of intuitive features. Finally, various change point detection algorithms are used to efficiently and accurately detect future change points. We also perform ablation studies to understand which groups of features are most effective in identifying the change points. To the best of our knowledge, this is the first work that rigorously explores the English Wikipedia article quality life cycle from the perspective of quality indicators and provides a novel unsupervised page-level approach to detect quality switches, which can help in automatic content monitoring in Wikipedia, thus contributing significantly to the CSCW community.
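The change point framing above can be illustrated with a minimal single-split detector: pick the revision index that minimizes the summed squared deviation of the two segments it creates. This is a sketch, not the cited paper's algorithm, and the per-revision feature values below are invented:

```python
def best_split(series):
    """Return the index k (1 <= k < len(series)) that best splits the
    series into two segments with distinct means, plus the residual
    cost (sum of squared deviations within each segment)."""
    def sse(seg):
        m = sum(seg) / len(seg)
        return sum((x - m) ** 2 for x in seg)

    best_k, best_cost = None, float("inf")
    for k in range(1, len(series)):
        cost = sse(series[:k]) + sse(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

# A hypothetical per-revision feature (say, references added per
# revision) that jumps after revision 5 -- the detector splits there.
feature = [1, 2, 1, 2, 1, 9, 8, 9, 10, 9]
k, _ = best_split(feature)
print(k)  # 5
```

Real change point libraries generalize this idea to multiple change points and richer cost functions, but the core "represent revisions as a time series, find where its statistics shift" step is the same.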
Full-text available
In this paper, we investigate the process of scientific discovery using an under-exploited source of information: the Polymath projects. Polymath projects are an original attempt to solve a series of mathematical problems collectively in a collaborative online environment. To investigate the Polymath experiment, we analyze all the posts related to the projects that have resulted in a peer-reviewed publication. We focus in particular on the organization of scientific labor and on the innovations that result from the contributions of the different authors. We find that a high presence of occasional contributors increases the productivity of the most active users and the overall productivity of the forums (i.e., the number of posts grows super-linearly with the number of contributors). We argue that, in large-scale collaborations, serendipitous interaction between occasional contributors can be crucial to the scientific process, and individual contributions from occasional participants can open new directions of research.
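The super-linearity claim (posts growing faster than linearly with contributors) amounts to fitting a power law posts ≈ a · contributors^β and finding β > 1. A sketch on synthetic data via least squares in log-log space (the forum counts below are invented, generated with β = 1.5):

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(a) + beta * log(x);
    returns (a, beta). beta > 1 indicates super-linear growth."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    beta = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
            / sum((x - mx) ** 2 for x in lx))
    a = math.exp(my - beta * mx)
    return a, beta

# Synthetic forum data generated with beta = 1.5 (super-linear).
contributors = [10, 20, 40, 80]
posts = [round(2 * c ** 1.5) for c in contributors]
a, beta = fit_power_law(contributors, posts)
print(round(beta, 2))  # ~1.5, i.e. beta > 1: super-linear growth
```

On real forum data one would fit per-project post and contributor counts the same way; β recovered above 1 is what the excerpt means by "grows super-linearly".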
Toxic comment classification models are often found to be biased towards identity terms, i.e., terms characterizing a specific group of people such as “Muslim” and “black”. Such bias is commonly reflected in false positive predictions, i.e., non-toxic comments with identity terms. In this work, we propose a novel approach to debias the model in toxic comment classification, leveraging the notion of the subjectivity level of a comment and the presence of identity terms. We hypothesize that toxic comments containing identity terms are more likely to be expressions of subjective feelings or opinions. Therefore, the subjectivity level of a comment containing identity terms can be helpful for classifying toxic comments and mitigating the identity term bias. To implement this idea, we propose a model based on BERT and study two different methods of measuring the subjectivity level. The first method uses a lexicon-based tool. The second method is based on the idea of calculating the embedding similarity between a comment and a relevant Wikipedia text of the identity term in the comment. We thoroughly evaluate our method on an extensive collection of four datasets collected from different social media platforms. Our results show that: (1) our models that incorporate both features of subjectivity and identity terms consistently outperform strong SOTA baselines, with our best performing model achieving an improvement in F1 of 4.75% on a Twitter dataset; (2) our idea of measuring subjectivity based on the similarity to the relevant Wikipedia text is very effective for toxic comment classification, as our model using it achieved the best performance on 3 out of 4 datasets while obtaining comparable performance on the remaining dataset. We further test our method on RoBERTa to evaluate the generality of our method, and the results show the biggest improvement in F1 of up to 1.29% (on a dataset from a white supremacist online forum).
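The second subjectivity signal described above, as I read it, reduces to a cosine similarity between the comment's embedding and the embedding of the identity term's Wikipedia text. A minimal sketch; a real system would use BERT sentence embeddings, whereas the 3-dimensional vectors here are invented stand-ins:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors:
    dot(u, v) / (|u| * |v|), in [-1, 1] for real-valued embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

comment_vec = [0.9, 0.1, 0.3]  # toy embedding of the comment
wiki_vec = [0.8, 0.2, 0.4]     # toy embedding of the identity term's Wikipedia text

print(round(cosine_similarity(comment_vec, wiki_vec), 3))  # ~0.984
```

The resulting similarity score would then be fed to the classifier as an extra feature alongside the BERT representation, under the hypothesis that comments closer to the encyclopedic text are less likely to be subjective attacks.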
Presents a framework derived from expectancy theory for organizing the research on productivity loss among individuals combining their efforts into a common pool (i.e., the research on social loafing, free riding, and the sucker effect). Low productivity is characterized as a problem of low motivation arising when individuals perceive no value to contributing, perceive no contingency between their contributions and achieving a desirable outcome, or perceive the costs of contributing to be excessive. Three broad categories of solutions, corresponding to each of the 3 sources of low productivity, are discussed: (1) providing incentives for contributing, (2) making contributions indispensable, and (3) decreasing the cost of contributing. Each of these solutions is examined, and directions for future research and the application of this framework to social dilemmas are discussed.
The Internet has fostered an unconventional and powerful style of collaboration: "wiki" web sites, where every visitor has the power to become an editor. In this paper we investigate the dynamics of Wikipedia, a prominent, thriving wiki. We make three contributions. First, we introduce a new exploratory data analysis tool, the history flow visualization, which is effective in revealing patterns within the wiki context and which we believe will be useful in other collaborative situations as well. Second, we discuss several collaboration patterns highlighted by this visualization tool and corroborate them with statistical analysis. Third, we discuss the implications of these patterns for the design and governance of online collaborative social spaces. We focus on the relevance of authorship, the value of community surveillance in ameliorating antisocial behavior, and how authors with competing perspectives negotiate their differences.
According to its proponents, open source style software development has the capacity to compete successfully, and perhaps in many cases displace, traditional commercial development methods. In order to begin investigating such claims, we examine data from two major open source projects, the Apache web server and the Mozilla browser. By using email archives of source code change history and problem reports we quantify aspects of developer participation, core team size, code ownership, productivity, defect density, and problem resolution intervals for these OSS projects. We develop several hypotheses by comparing the Apache project with several commercial projects. We then test and refine several of these hypotheses, based on an analysis of Mozilla data. We conclude with thoughts about the prospects for high-performance commercial/open source process hybrids.
The book, The Mythical Man-Month, Addison-Wesley, 1975 (excerpted in Datamation, December 1974), gathers some of the published data about software engineering and mixes it with the assertion of a lot of personal opinions. In this presentation, the author will list some of the assertions and invite dispute or support from the audience. This is intended as a public discussion of the published book, not a regular paper.