Abstract
Reverts are important to maintaining the quality of Wikipedia. They fix mistakes, repair vandalism, and help enforce policy. However, reverts can also be damaging, especially to the aspiring editor whose work they destroy. In this research we analyze 400,000 Wikipedia revisions to understand the effect that reverts had on editors. We seek to understand the extent to which they demotivate users, reducing the workforce of contributors, versus the extent to which they help users improve as encyclopedia editors. Overall we find that reverts are powerfully demotivating, but that their net influence is that more quality work is done in Wikipedia as a result of reverts than is lost by chasing editors away. However, we identify key conditions -- most specifically, new editors being reverted by much more experienced editors -- under which reverts are particularly damaging. We propose that reducing the damage from reverts might be one effective path for Wikipedia to solve the newcomer retention problem.
... Work from this community manager perspective typically measures either aspects of the technology or community-level measures of activity in order to predict community-level outcomes. For example, researchers have found that the quality of information and content in a community substantially matters in encouraging continuous participation [47]; that group-level norms, such as those for how to deal with newcomers, can help attract and retain new members [28,36,51]; that social and technical mechanisms for managing and moderating a community can affect a community's long-term ability to thrive and overcome challenges [28,34,43,51]; and that meta-characteristics of a group (such as size, activity levels, or network structures) can help predict "success" [16][17][18]. In this research, community growth is typically taken as a goal, and often appears as a dependent variable in regression equations measuring the effect of interventions or variations in community features on community success [4,9,18,33,35,60,68]. ...
... Derived from a rich body of research, these design mechanisms take up a top-down "social engineering" approach, wherein the basic assumption is that community managers can make design choices that will shape the community's success. This and similar work often directly or indirectly proposes at least one of the following three goals: (1) increasing the number of community members by attracting new members, crafting early experiences, and socializing newcomers [4,5,14,27,28,36,51] (2) retaining existing community members via strategies to increase individuals' commitment to a community [23]; and (3) increasing contributions by and interactions amongst community members [3,44]. ...
Many benefits of online communities---such as obtaining new information, opportunities, and social connections---increase with size. Thus, a ``successful'' online community often evokes an image of hundreds of thousands of users, and practitioners and researchers alike have sought to devise methods to achieve growth and thereby, success. On the other hand, small online communities exist in droves and many persist in their smallness over time. Turning to the highly popular discussion website Reddit, which is made up of hundreds of thousands of communities, we conducted a qualitative interview study examining how and why people participate in these persistently small communities, in order to understand why these communities exist when popular approaches would assume them to be failures. Drawing from twenty interviews, this paper makes several contributions: we describe how small communities provide unique informational and interactional spaces for participants, who are drawn by the hyperspecific aspects of the community; we find that small communities do not promote strong dyadic interpersonal relationships but rather promote group-based identity; and we highlight how participation in small communities is part of a broader, ongoing strategy to curate participants' online experience. We argue that online communities can be seen as nested niches: parts of an embedded, complex, symbiotic socio-informational ecosystem. We suggest ways that social computing research could benefit from more deliberate considerations of interdependence between diverse scales of online community sizes.
... The survival of open community content platforms depends on continued contributions by their members. Churn, i.e. the desertion of contributors, is a major concern in many online platforms [1,2,3,4]. The highest churn rate appears after a contributor's single (first and last) contribution. ...
... These motivations are reflected in Stack Exchange's internal annual survey. Contribution motivation categories such as reputation building, relationship building, and self-discovery are regulated by motivational affordance mechanisms such as points [8,16], badges [17] and feedback [10,18]. In this study, we hope to contribute to the understanding of the influence of feedback on newcomer motivation to continue contributing to CQA services. ...
... FIT assumes that "attention is limited and therefore only feedback-standard gaps that receive attention actively participate in behavior regulation". The first feedback a newcomer receives warrants high attention [1,23]. ...
... shown that public feedback mechanisms can have unintended long-term consequences. Strongly negative contribution feedback may be discouraging to newcomers (Halfaker et al., 2011), and public peer ratings and commenting systems can yield harmful feedback effects affecting contributor behaviour (Cheng et al., 2014; Michael and Otterbacher, 2014). Such effects may arise in part because public peer feedback fosters a desire to build reputation and social standing (Tausczik and Pennebaker, 2012; Parnell et al., 2011). ...
... In volunteer communities, constraints in the capacity and ability of participants may pose further barriers to sustained participation. Newcomers can be demotivated more easily by strongly negative feedback, compared to participants who are more experienced (Halfaker et al., 2011; Zhu et al., 2013; Wohn, 2015). It may be possible to address this difference with targeted interventions to improve self-efficacy, for example by nurturing and educating new contributors, and by fostering a belief that participation is possible and will be welcomed (Bishop, 2007). ...
... Positive and social feedback can increase participant motivation (Zhu et al., 2013); however, strongly negative feedback can harm motivation. On Wikipedia, reverts to article modifications can improve article quality, but they are powerfully demotivating to newcomers (Halfaker et al., 2011). As a consequence, it may be advisable to provide more nuanced feedback, rather than an outright rejection of an entire contribution. ...
Organisers of large crowdsourcing initiatives need to consider how to produce outcomes, but also how to build volunteer capacity. Central concerns include the impact of the first-time contributor experience, and the interplay of different modes of participation in larger organisations that host multiple strands of activity. How can volunteer capacity be built proactively, so that trained volunteers are available when needed? How important are opportunities for social encounter, either online or in person? We present four empirical studies of the Humanitarian OpenStreetMap Team (HOT), a novel setting where thousands of volunteers produce maps to support humanitarian aid. Its diversity of settings and activities provides an opportunity to observe the effects of different coordination practices within a single organisation. Participation is online and open to all; however, volunteers need to learn specialist tools and workflows. To support newcomers, HOT organises offline events to learn the practice under expert guidance. Our research is motivated by a dual aim: first, to produce empirical evaluations of novel practices, informed by existing community concerns; second, to revisit existing theories in social and behavioural science through the lens of this novel setting. We use statistical methods to observe the activity and retention of HOT volunteers. The full HOT contribution history is our primary source of empirical evidence, covering multiple years of activity. We can demonstrate that coordination practices have a marked impact on contributor retention. Complex task designs can be a deterrent, while social contribution settings and peer feedback are associated with a significant increase in newcomer retention. We further find that event-centric campaigns can be significant recruiting and reactivation events, though this is not guaranteed. Our analytical methods provide a means of interpreting key differences in outcomes.
We relate our findings to comparable settings, and close with a discussion of the theoretical and practical implications.
... Numerous studies carried out in the past [1,16] show tremendous initial growth in Wikipedia, with an exponential increase in the number of articles, editors, edits and views. Further, researchers in [6,7,9,15] found that growth in its contents is slowing down or approaching saturation. Some of the possible reasons reported for the saturation are unfriendly behaviour towards newly joined editors, whose edits are likely to be reverted by experienced editors [7]; increased overhead costs of coordination and production; and the possibility that Wikipedia has reached the natural limits of growth [6]. This phenomenon of slowdown or saturation in growth is also discussed in Wikipedia itself. ...
... In several research papers [6,7,9,15], researchers noticed slowdown or saturation in the contents of Wikipedia after its enormous initial growth. ...
Wikipedia is a multilingual encyclopedia that works on the idea of virtual collaboration. Initially, its contents, such as articles, editors and edits, grew exponentially. Further growth analysis of Wikipedia shows slowdown or saturation in its contents. In this paper, we investigate whether two essential characteristics of Wikipedia, collaboration and cohesiveness, also encounter the phenomenon of slowdown or saturation with time. Collaboration in Wikipedia is the process where two or more editors edit together to complete a common article. Cohesiveness is the extent to which a group of editors stays together for mutual interest. We employ the concept of network motifs to investigate saturation in these two characteristics of Wikipedia. We consider star motifs of articles with the average number of edits to study the growth of collaboration, and 2 \(\times \) 2 complete bicliques or “butterfly” motifs to interpret the change in the cohesiveness of Wikipedia. We present the change in the count of these network motifs for the top 22 languages of Wikipedia up to May 2019. We observe saturation in collaboration, but a linear or sudden rise in cohesiveness in most of the languages of Wikipedia. We therefore note that, although the contents of Wikipedia encounter natural limits of growth, the activities of editors are still improving with time.
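To make the "butterfly" motif concrete: in the bipartite editor-article graph, a butterfly is two editors who have both edited the same two articles. The abstract above does not give the counting procedure, so the following is only a minimal sketch under that reading; the function and variable names are our own, and the paper's actual method may differ.

```python
from itertools import combinations
from collections import defaultdict

def count_butterflies(edits):
    """Count 2x2 complete bicliques ("butterflies") in a bipartite graph.

    edits: iterable of (editor, article) pairs. A butterfly is a pair of
    editors together with a pair of articles such that both editors have
    edited both articles.
    """
    articles_by_editor = defaultdict(set)
    for editor, article in set(edits):  # de-duplicate repeated edits
        articles_by_editor[editor].add(article)
    total = 0
    for u, v in combinations(articles_by_editor, 2):
        # c articles shared by this editor pair yield C(c, 2) butterflies.
        c = len(articles_by_editor[u] & articles_by_editor[v])
        total += c * (c - 1) // 2
    return total
```

The pairwise loop is quadratic in the number of editors, so a real analysis over full Wikipedia histories would need the specialized butterfly-counting algorithms from the bipartite-network literature, but the definition is the same.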
... But when discrimination gets encoded into automatic decision-making at Wikimedia, this aggravates the problem. For example, it has been previously found that new contributors whose edits are automatically reverted are much more likely to withdraw from the project [21,22,50]. ...
... Berk et al. [4], Hardt et al. [23], and Dwork et al. [16] discuss different notions of fairness, including equality of opportunity and statistical parity. More closely related to our work, Halfaker et al. [21,22] and Schneider et al. [50] find that newcomer retention at Wikimedia projects is severely affected by overzealous reversion of their edits. They pass part of the blame to automatic vandalism detectors, but propose no remedies. ...
... The discriminatory nature of Wikipedia's damage control system has been previously shown [21,22,50]: the rise of (semi-)automatic reviewing tools caused more newcomer contributions to be considered damaging, severely affecting retention. Although policies have been adjusted to prevent such discrimination, the vandalism detection models have not been redesigned. ...
Crowdsourced knowledge bases like Wikidata suffer from low-quality edits and vandalism, employing machine learning-based approaches to detect both kinds of damage. We reveal that state-of-the-art detection approaches discriminate anonymous and new users: benign edits from these users receive much higher vandalism scores than benign edits from older ones, causing newcomers to abandon the project prematurely. We address this problem for the first time by analyzing and measuring the sources of bias, and by developing a new vandalism detection model that avoids them. Our model FAIR-S reduces the bias ratio of the state-of-the-art vandalism detector WDVD from 310.7 to only 11.9 while maintaining high predictive performance at 0.963 ROC-AUC and 0.316 PR-AUC.
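The "bias ratio" in the abstract above is not defined in this excerpt, so the sketch below captures only one plausible reading: the mean vandalism score a classifier assigns to benign edits from anonymous/new users, divided by the mean score it assigns to benign edits from established users. All names are our own, and the paper's exact formula may differ.

```python
def bias_ratio(scores, groups, labels):
    """Hypothetical bias-ratio measure for a vandalism classifier.

    scores: classifier vandalism scores, one per edit.
    groups: "new" (anonymous/new user) or "established", one per edit.
    labels: ground truth, 0 = benign, 1 = vandalism.

    Returns mean score on benign "new" edits divided by mean score on
    benign "established" edits; 1.0 would indicate no score disparity.
    """
    benign_new = [s for s, g, y in zip(scores, groups, labels)
                  if y == 0 and g == "new"]
    benign_est = [s for s, g, y in zip(scores, groups, labels)
                  if y == 0 and g == "established"]
    return (sum(benign_new) / len(benign_new)) / (sum(benign_est) / len(benign_est))
```

Under this reading, the paper's reported improvement corresponds to driving such a ratio from roughly 310.7 down toward 11.9 while preserving ROC-AUC.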
... Importantly, we uncover the collaborative dynamics within an article that lead to newcomers' continued participation after the shock. Much of the work on participation of newcomers in Wikipedia has focused on platform newcomers, users who are new to Wikipedia, and on the retention of such newcomers (Halfaker, Kittur, and Riedl 2011; Halfaker et al. 2013; Faulkner, Walling, and Pinchuk 2012; Mesgari et al. 2015; Suh et al. 2009; Li and Farzan 2018; Robert and Romero 2015; Chen, Ren, and Riedl 2010; Ransbotham and Kane 2011). In contrast, our study considers newcomers to an article who may have some experience editing other articles in Wikipedia. ...
... In contrast, our study considers newcomers to an article who may have some experience editing other articles in Wikipedia. However, as we will show, article newcomers during shocks have significantly lower Wikipedia experience than incumbents, suggesting that some of the same challenges in retaining platform newcomers (Halfaker, Kittur, and Riedl 2011) will apply to article newcomers as well. Our study also goes beyond measuring whether an article newcomer returns to the article (retention) and measures their participation by how much they contribute relative to the contributions of existing editors. ...
... These studies focus on various approaches to attracting and keeping new platform members (Halfaker et al. 2013; Morgan and Halfaker 2018; Schneider, Gelley, and Halfaker 2014; Li and Farzan 2018). An example is Halfaker et al. (2011), which examined the impact of platform policies and norms on the retention of new members. Perhaps more closely related to our study, Li and Farzan (2018) studied the behavior of editors that join Wikipedia during three current events. ...
User participation is vital to the success of collaborative crowdsourcing platforms such as Wikipedia. Previously, user participation has been studied during "normal times". However, less is known about participation following shocks that draw attention to an article. Such events can be recruiting opportunities due to increased attention, but can also pose a threat to the quality and control of the article and drive away newcomers. We study the collaborative dynamics of Wikipedia articles after shocks generated by drastic increases in attention, as indicated by data from Google Trends. We find that participation following such events is indeed different from participation during normal times: both newcomers and incumbents participate at higher rates during shocks. We also identify collaboration dynamics that mediate the effects of shocks on continued participation after the shock. The impact of shocks on participation is mediated by the amount of negative feedback given to newcomers in the form of reverted edits and the amount of coordination editors engage in through edits of the article's talk page.
... Although it also serves to mitigate damage, removing content is a common form of sanctioning because it communicates that an action was inappropriate [63]. Halfaker et al. [29] show that removing content is an effective sanction and results in higher-quality subsequent contributions by the reverted contributor in Wikipedia. Similarly, Srinivasan et al. [71] found that people whose comments were removed from Reddit were less likely to violate norms in the future. ...
... Although the goal of most sanctioning is to steer participants toward more productive types of behavior, the effect is often simply to deter participation. This can be particularly problematic with well-meaning newcomers, who often violate norms because they have not yet learned the ropes [1,26,29]. ...
... Our outcome variable for answering RQ1 must capture sanctioning in Wikipedia. Following a large body of other social computing research, we measure sanctions as identity reverts [e.g., 26,29,63,73]. Identity reverts occur when a user undoes another user's edit by restoring a page to an earlier state and are measured by comparing hashes of page revisions [29]. That said, identity reverts are an imperfect measure of sanctioning. ...
Online community moderators often rely on social signals like whether or not a user has an account or a profile page as clues that users are likely to cause problems. Reliance on these clues may lead to "over-profiling" bias when moderators focus on these signals but overlook misbehavior by others. We propose that algorithmic flagging systems deployed to improve efficiency of moderation work can also make moderation actions more fair to these users by reducing reliance on social signals and making norm violations by everyone else more visible. We analyze moderator behavior in Wikipedia as mediated by a system called RCFilters that displays social signals and algorithmic flags and to estimate the causal effect of being flagged on moderator actions. We show that algorithmically flagged edits are reverted more often, especially edits by established editors with positive social signals, and that flagging decreases the likelihood that moderation actions will be undone. Our results suggest that algorithmic flagging systems can lead to increased fairness but that the relationship is complex and contingent.
... Since crowdfunding is enabled by online platforms and relies on contributions from many participants, it can be regarded as a special case of peer production or crowdsourcing, where contributors bring campaign proposals and personal finances instead of ideas, opinions, or effort [22]. This allows us to leverage the rich literature about contributor retention in peer production and crowdwork that covers platforms like Wikipedia [23,24], Wikia [48], OpenStreetMap [15], Q&A sites [16,43,54], forums [37], newsgroups [7,31], and social media sites [34,53]. ...
... The shared theme of this research is the retention of newcomers, which is strongly affected by their experience during the first contributions on the platform [24,31,48]. Current performance, as captured by the frequency, speed, and overall number of contributions, has a positive impact on retention of novice [15,16,37,43] and continuing contributors [53]. ...
... Instead of allowing multiple pending investments, we require investors to decide about new contributions only after they find out whether their previous investment was successful or not. This assumption is supported by research that found previous activity to impact user retention on various platforms [24,54]. To formalize the intuition that the behavior of individuals changes in response to the outcome of their actions, we thus assume that decisions about ongoing participation on the platform are made successively after learning about the success of the prior investment [8]. ...
Crowdfunding platforms promise to disrupt investing as they bypass traditional financial institutions through peer-to-peer transactions. To stay functional, these platforms require a supply of investors who are willing to contribute to campaigns. Yet, little is known about the retention of investors in this setting. Using four years of data from a leading equity crowdfunding platform, we empirically study the length and success of investor activity on the platform. We analyze temporal variations in these outcomes and explain patterns using statistical modeling. Our models are based on information about user's past and current investment decisions, i.e., content-based and structural similarities between the campaigns they invest in. We uncover the role of past successes and diversity of investment decisions for novice vs. serial investors. Our results inform potential strategies for increasing the retention of investors and improving their decisions on crowdfunding platforms.
... Our second hypothesis (H2) looks at low-quality edits, which we measure by counting the number of edits that are subsequently removed (reverted) in their entirety. We adopt the simplest and most widely used approach to detecting reverts by focusing on what are called "identity reverts" (Halfaker et al., 2011; Piskorski & Gorbatâi, 2017; Priedhorsky et al., 2007). Using the identity revert approach, a contribution A is considered reverted if, and only if, a second user's contribution B returns the page to a state that is identical to its state before edit A. ...
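The identity-revert rule quoted above lends itself to a direct implementation: hash every revision's full text, and treat an edit as reverted when a later edit by a different user restores an earlier page state. This is a minimal sketch of that idea under the stated definition; the function names are ours, not those of any cited tool.

```python
import hashlib

def revision_hash(text: str) -> str:
    # Identical hashes mean identical page states.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def find_identity_reverts(revisions):
    """Return indices of identity-reverted revisions.

    revisions: ordered list of (editor, full_text) pairs for one page.
    A revision k is reverted if a later revision i by a different editor
    restores the page to a state that existed before revision k.
    """
    hashes = [revision_hash(text) for _, text in revisions]
    seen = {}       # hash -> latest index where that page state occurred
    reverted = set()
    for i, h in enumerate(hashes):
        if h in seen:
            # Revision i restores the state at index seen[h]; everything
            # strictly in between was undone, unless it is a self-revert.
            for k in range(seen[h] + 1, i):
                if revisions[k][0] != revisions[i][0]:
                    reverted.add(k)
        seen[h] = i
    return sorted(reverted)
```

In practice one would fetch revision checksums rather than hashing full texts locally, but the detection logic is the same.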
... Wikis that attract new contributors and generate more nonreverted contributions or contributions that last longer are thriving wikis. These measures are also consistent with the theoretical predictions we develop above as well as prior empirical approaches to modeling attempts to build public information goods in online communities (Cheshire & Antin, 2008;Halfaker et al., 2011;Schweik & English, 2012;Zhang & Zhu, 2011). Differentiating between contributions of low (H2) and high (H3) quality allows us to consider the impact of the intervention in a more holistic fashion than much of this earlier work. ...
Online communities, like Wikipedia, produce valuable public information goods. Whereas some of these communities require would-be contributors to create accounts, many do not. Does this requirement catalyze cooperation or inhibit participation? Prior research provides divergent predictions but little causal evidence. We conduct an empirical test using longitudinal data from 136 natural experiments where would-be contributors to wikis were suddenly required to log in to contribute. Requiring accounts leads to a small increase in account creation, but reduces both high- and low-quality contributions from registered and unregistered participants. Although the change deters a large portion of low-quality participation, the vast majority of deterred contributions are of higher quality. We conclude that requiring accounts introduces an undertheorized tradeoff for public goods production in interactive communication systems.
... illustrating that IT might also be used to channel and control the innovative contributions of selected experts (Halfaker, Kittur, & Riedl, 2011; Shaikh & Vaast, 2016). ...
... Our findings on the diverse types of community BUI importantly extend existing knowledge on the interplay between member(s) and technology, suggesting the channeling role of IT in community innovation (Halfaker, Kittur, & Riedl, 2011; Shaikh & Vaast, 2016). We argue that such dynamics are typical for communities with core-periphery structures, diverse member backgrounds, and a dominant need for specific IT services. ...
This paper examines how innovative uses of IT artifacts and their repurposing to fulfill emerging or unsatisfied user needs (bottom-up innovation, BUI) develop in community settings. Based on a longitudinal analysis of “HomeNets,” communities that have developed residential internet access in Belarus over a 20-year period, we illustrate that the development of community BUI is driven not only by the needs of the innovating members. Instead, community BUI development emerges from the interplay between the innovating members’ community context and technology, as well as from the interplay between the BUI technology and context. We demonstrate how these dynamics trigger community BUI development that goes beyond the needs and expectations of the innovating actors and impacts community evolution and long-term survival. Based on our findings, we develop a model of community BUI development. We discuss the theoretical implications of our findings, highlighting the role of technology and context in community BUI and its processual unfolding beyond the needs and intentions of the innovating members.
... Although automated tools support rapid restoration of the digital work to its original state, often within minutes [2], automated vandal fighting is not without costs. It requires development, invocation, and maintenance of tools, and it leads to impersonal messages and improper rejection of contributions, which has been implicated in a decline of well-intentioned newcomers [15], a trend which requires substantial effort to remediate [16,25]. The digital nature of online vandalism does not mean it is victimless: besides offending and burdening hard-working creators, maintainers, and moderators [11], the general public suffers as well. ...
... Identifiability and effort, together with exclusionary policies that affect some vandals, offer some explanation for both the rates of vandalism and types of vandalism perpetrated. Interventions that target these factors independently may have unintended consequences and could deter newcomers and valuable casual contributions [15,16]. ...
What factors influence the decision to vandalize? Although the harm is clear, the benefit to the vandal is less clear. In many cases, the thing being damaged may itself be something the vandal uses or enjoys. Vandalism holds communicative value: perhaps to the vandal themselves, to some audience at whom the vandalism is aimed, and to the general public. Viewing vandals as rational community participants despite their antinormative behavior offers the possibility of engaging with or countering their choices in novel ways. Rational choice theory (RCT) as applied in value expectancy theory (VET) offers a strategy for characterizing behaviors in a framework of rational choices, and begins with the supposition that subject to some weighting of personal preferences and constraints, individuals maximize their own utility by committing acts of vandalism. This study applies the framework of RCT and VET to gain insight into vandals' preferences and constraints. Using a mixed-methods analysis of Wikipedia, I combine social computing and criminological perspectives on vandalism to propose an ontology of vandalism for online content communities. I use this ontology to categorize 141 instances of vandalism and find that the character of vandalistic acts varies by vandals' relative identifiability, policy history with Wikipedia, and the effort required to vandalize.
... Research within HCI, and computing more broadly, is yet to engage critically and meaningfully with the notion of aspirations. A scrutiny of the literature in the Association for Computing Machinery's Digital Library (ACM DL) reveals that, with 340 results (including some duplicates), the word aspiration has been used to refer to a wide range of abstract concepts such as needs, desires, hopes, and wants (e.g., [6,12,38]), the many terms one might use towards imagining short- and/or long-term futures. If we look at who aspires, we end up with individuals (e.g., women, students, designers [6,12]), organizations (e.g., non-profits, government agencies [24,48]), disciplines (e.g., HCI, Ubicomp [11,17]), technologies (e.g., network algorithms, platforms [1,34]), and more. Some design research also articulates the idea of design as a way of thinking about aspirations, say, of a city [10], and aspirational personas [42]. ...
We present a case for aspirations-based design by drawing on a qualitative inquiry into the lives of young girls in rural West Bengal (India). These girls form a particularly vulnerable population, coming from an area known to be susceptible to sex trafficking and crimes against women. We leverage our findings to engage with Kentaro Toyama's call for greater attention to aspirations in designing technology for development [51]. We highlight the aspirations of and for these girls and reflect on the embedded, temporal, and mutable qualities of these aspirations. Finally, we examine how an aspirations-based design approach might factor these qualities into technology design. Although our analysis draws on empirical findings from rural/suburban India, the insights derived from this research are relevant for the process of designing technologies towards fulfillment of aspirations, more generally.
... As an online example, although Wikipedia's quality assessment systems can efficiently detect and revert low quality edits, research shows that they can also harm the motivation of well-meaning newcomers, still learning how to contribute [30]. When their first few edits were rudely reverted by algorithmic tools, Wikipedia newcomers left in droves [33], violating the community's "don't bite the newcomers" policy [1]. Unfortunately, low newcomer retention has hindered the overall growth of Wikipedia [54]. ...
... The Wikipedia literature has described how quality control processes can have adverse impacts on newcomers [33]. Interestingly, our participants suggest that algorithms like ORES can and should play a role in helping inexperienced editors, as well as other underrepresented editor groups, such as minorities and females. ...
On Wikipedia, sophisticated algorithmic tools are used to assess the quality of edits and take corrective actions. However, algorithms can fail to solve the problems they were designed for if they conflict with the values of communities who use them. In this study, we take a Value-Sensitive Algorithm Design approach to understanding a community-created and -maintained machine learning-based algorithm called the Objective Revision Evaluation System (ORES)---a quality prediction system used in numerous Wikipedia applications and contexts. Five major values converged across stakeholder groups that ORES (and its dependent applications) should: (1) reduce the effort of community maintenance, (2) maintain human judgement as the final authority, (3) support differing peoples' differing workflows, (4) encourage positive engagement with diverse editor groups, and (5) establish trustworthiness of people and algorithms within the community. We reveal tensions between these values and discuss implications for future research to improve algorithms like ORES.
... Wikipedia, however, is not without its problems. Wikipedia's guidelines instruct editors not to "bite the newbies," but they also proclaim, "there are no rules," and community norms tend to reward aggressive behaviors (e.g., [35,43,69]). As others have noted (e.g., [8,12,35]), these cultural contradictions not only make it difficult for newcomers to participate, but also create a labyrinth of spaces that are, at once, safe and unsafe depending upon who the user is and how they navigate passage. ...
... Although Wikipedia purports to be the encyclopedia "anyone can edit," as Ford & Wajcman observe, "not everyone does" [24]. ...
Wikipedia is one of the most successful online communities in history, yet it struggles to attract and retain women editors-a phenomenon known as the gender gap. We investigate this gap by focusing on the voices of experienced women Wikipedians. In this interview-based study (N=25), we identify a core theme among these voices: safety. We reveal how our participants perceive safety within their community, how they manage their safety both conceptually and physically, and how they act on this understanding to create safe spaces on and off Wikipedia. Our analysis shows Wikipedia functions as both a multidimensional and porous space encompassing a spectrum of safety. Navigating this space requires these women to employ sophisticated tactics related to identity management, boundary management, and emotion work. We conclude with a set of provocations to spur the design of future online environments that encourage equity, inclusivity, and safety for historically marginalized users.
... Wikipedia editors routinely remove or add text produced by others (Halfaker, Kittur & Riedl, 2011). The stakes are high: Wikipedia is read every day by millions of people worldwide and articles can influence social perception. ...
This study explores Wikipedia as a site for learning. It traces how people learn to become Wikipedia editors through engagement in an editathon, a training event for people who want to become volunteer editors. The study is original in its emphasis on the various types of knowledge editors acquire as they develop expertise. Determining the knowledge needed to contribute to Wikipedia is significant in terms of understanding Wikipedia as a site for learning. Data were gathered from nine participants who took part in an “editathon” event. The study used a rigorous methodology, combining quantitative social network analysis, documenting the online activity of participants as they created and edited Wikipedia pages, with qualitative interviews about participants’ lived experiences during the editathon. Conceptual and procedural knowledge are representative of the foundational knowledge needed to contribute to Wikipedia actively as an editor. However, these knowledge types on their own are not sufficient. Editors also develop socio-cultural and relational forms of knowledge to enable them to operate and problem-solve effectively. The relationship between the physical and the digital is important, since socio-cultural and relational knowledge are developed through active experimentation as editathon participants engage with physical objects to create the online wiki pages.
... For instance, dominant patriarchal structures in peer production systems like Wikipedia make it challenging for women editors to participate when they are constantly targeted by trolls or receiving unwanted sexual advances [63]; psychological constraints embedded in crowdfunding platforms like Kickstarter disadvantage introverted personality types [13]; and demands to improve reputation ratings in ridesharing services like Uber and Lyft result in added work for drivers to please their passengers [73]. The proactive socialization of online volunteers remains essential for retaining their engagement [31]. In building on this scholarship, we study how moderators of online communities engage in emotional labor in the context of identity work, and how such practices might serve as a way to sustain their communities. ...
We examine how and why Asian American and Pacific Islander (AAPI) moderators on Reddit shape the norms of their online communities through the analytic lens of emotional labor. We conduct interviews with 21 moderators who facilitate identity work discourse in AAPI subreddits and present a thematic analysis of their moderation practices. We report on their challenges to sustaining moderation, which include burning out from volunteer work, navigating hierarchical structures, and balancing unfulfilled expectations. We then describe strategies that moderators employ to manage emotional labor, which involve distancing away from drama, building solidarity from shared struggles, and integrating an ecology of tools for self-organized moderation. We provide recommendations for improving moderation in online communities centered around identity work and discuss implications of emotional labor in the design of Reddit and similar platforms.
... We used the web-based Wikipedia interface that Wikipedia contributors use, including the article history page, which allows anyone to navigate all edits to an article in chronological order. We kept an open mind and considered that vandalism on Wikipedia may reflect mistakes by inexperienced newbies [19,31]. To the extent that innocent or unwitting vandalism is part of joining a community of practice, anti-normative behavior may simply be a feature of the learning environment [9,23]. ...
By choice or by necessity, some contributors to commons-based peer production sites use privacy-protecting services to remain anonymous. As anonymity seekers, users of the Tor network have been cast both as ill-intentioned vandals and as vulnerable populations concerned with their privacy. In this study, we use a dataset drawn from a corpus of Tor edits to Wikipedia to uncover the character of Tor users' contributions. We build in-depth narrative descriptions of Tor users' actions and conduct a thematic analysis that places their editing activity into seven broad groups. We find that although their use of a privacy-protecting service marks them as unusual within Wikipedia, the character of many Tor users' contributions is in line with the expectations and norms of Wikipedia. However, our themes point to several important places where lack of trust promotes disorder, and to contributions where risks to contributors, service providers, and communities are unaligned.
... Later work found that in social production communities, rigid policies and norms as well as complex user interfaces reinforce a gender gap over time rather than reducing disparities [14]. While a stream of new users is a necessary part of the site's continued ability to thrive, basic site functionality like edit reverts has had a significant, ongoing demoralizing effect on newcomers [24]. ...
In this work we show that machine learning with natural language processing can accurately forecast the outcomes of group decision-making in online discussions. Specifically, we study Articles for Deletion, a Wikipedia forum for determining which content should be included on the site. Applying this model, we replicate several findings from prior work on the factors that predict debate outcomes; we then extend this prior work and present new avenues for study, particularly in the use of policy citation during discussion. Alongside these findings, we introduce a structured corpus and source code for analyzing over 400,000 deletion debates spanning Wikipedia's history, enabling future large-scale studies of group decision-making discourse.
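The forecasting setup described above can be sketched as a plain text-classification pipeline. The snippet below is a minimal illustration, not the paper's actual model: it trains an add-one-smoothed Naive Bayes classifier on a few hypothetical debate snippets and predicts a keep/delete outcome.

```python
# Minimal bag-of-words Naive Bayes sketch of deletion-debate outcome
# forecasting. The training snippets are hypothetical, not drawn from
# the paper's 400,000-debate corpus, and the paper's actual model may differ.
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(examples):
    """examples: (text, label) pairs -> per-label word counts and label counts."""
    counts = defaultdict(Counter)
    label_totals = Counter()
    for text, label in examples:
        counts[label].update(tokenize(text))
        label_totals[label] += 1
    return counts, label_totals

def predict(counts, label_totals, text):
    vocab = {w for c in counts.values() for w in c}
    total_docs = sum(label_totals.values())
    best_label, best_score = None, float("-inf")
    for label in label_totals:
        # log prior + add-one-smoothed log likelihoods
        score = math.log(label_totals[label] / total_docs)
        n_words = sum(counts[label].values())
        for w in tokenize(text):
            score += math.log((counts[label][w] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical debate snippets labeled with their outcomes.
debates = [
    ("delete fails notability no reliable sources", "delete"),
    ("delete non notable promotional spam", "delete"),
    ("keep meets notability guideline per wp:gng sources exist", "keep"),
    ("keep well sourced passes general notability", "keep"),
]
counts, priors = train(debates)
print(predict(counts, priors, "fails notability no reliable sources"))  # delete
```

A real system would add richer features (participant counts, policy citations, vote tallies), but the core idea of scoring a discussion's text against outcome-conditioned word distributions is the same.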
... Wikipedia has a long history of struggling to retain its newcomers; maintaining a large number of active participants is crucial for the community's long-term development [18,20,31,45]. New contributors to Wikipedia face both social and technical barriers. ...
Bots are playing an increasingly important role in the creation of knowledge in Wikipedia. In many cases, editors and bots form tightly knit teams. Humans develop bots, argue for their approval, and maintain them, performing tasks such as monitoring activity, merging similar bots, splitting complex bots, and turning off malfunctioning bots. Yet this is not the entire picture. Bots are designed to perform certain functions and can acquire new functionality over time. They play particular roles in the editing process. Understanding these roles is an important step towards understanding the ecosystem, and designing better bots and interfaces between bots and humans. This is important for understanding Wikipedia along with other kinds of work in which autonomous machines affect tasks performed by humans. In this study, we use unsupervised learning to build a nine category taxonomy of bots based on their functions in English Wikipedia. We then build a multi-class classifier to classify 1,601 bots based on labeled data. We discuss different bot activities, including their edit frequency, their working spaces, and their software evolution. We use a model to investigate how bots playing certain roles will have differential effects on human editors. In particular, we build on previous research on newcomers by studying the relationship between the roles bots play, the interactions they have with newcomers, and the ensuing survival rate of the newcomers.
... 3.2 Constructing barriers to contribution as a way of regulating access to a rival resource. There is also a very strict selection of contributions, which often has the effect of pushing out the first attempts of new contributors (Halfaker et al., 2011). The people interviewed for the GeoRezo project did not in fact emphasize the importance of the barrier to entry, even though they regretted the project's difficulty in attracting students. ...
... A new asker must learn the quality standards of the network and follow the guidelines. Collaborative editing in Wikipedia has been shown to have a demotivating effect, especially on new editors (Halfaker et al., 2011). ...
Social question‐and‐answer (Q&A) sites are platforms where users can freely ask, share, and rate knowledge. For the sustainable growth of social Q&A sites, maintaining askers is as critical as maintaining answerers. Based on motivational affordances theory and self‐determination theory, this study explores the influence of the design elements of social Q&A sites (i.e., upvotes, downvotes, edits, user profile, and comments) on the survival of new askers. In addition, the moderating effect of having an alternative experience is examined. Online data on 25,000 new askers from the top five Q&A sites in the Technology category of the Stack Exchange network are analyzed using logistic regression. The results show that the competency‐ and autonomy‐related design features of social Q&A sites motivate new askers to continue participating. Surprisingly, having an alternative experience shows a negative moderating effect, implying that alternative experiences increase switching costs in the Stack Exchange network. This study provides valuable insights for administrators of social Q&A sites as well as academics.
... A lack of transparency associated with decision-making led to some frustration in the community. In other open collaboration systems, such as Wikipedia, researchers have described the demotivating effect of having a contribution reverted or edited [45,58,59], especially when the justification is missing or inadequate [45]. Reducing the presence of gaming behaviors and improving transparency around decisions would help organizers decide what design problems to prioritize and could help community members feel more empowered in this process. ...
Many organizations have adopted design processes that integrate community voices to discover the real problems that communities face. Online discussion forums offer a familiar and flexible technology that can help facilitate discussion around problems and potential solutions. However, we lack understanding about what information community members share, how that information is structured, and how social interactions affect design processes at scale. This paper presents a mixed-methods analysis of Canvas, a learning management system, which enables users to contribute to the design of the platform by sharing and deliberating on problems and solutions in a discussion forum. We collected and analyzed 1412 ideas and 18,335 associated comments shared on the Canvas discussion forum. We found that the distributed nature of design information, the presence of duplicate ideas, and contributors' gaming behaviors made it difficult for the community to make sense of the design discussion. These gaming behaviors also constitute a new concern for participatory design research. Finally, we reflect on how Canvas community members contribute information to a shared design space and how future systems could more effectively coordinate community design efforts.
... As is evident from the guidelines, they are circuitous and often require experience to implement. Such strict policy adherence has also at times been a barrier to the onboarding of new editors on Wikipedia, which has contributed to the decline in newcomers over the past decade [132,55]. Since it is nontrivial to discern qualifying differences between articles manually, automated techniques using machine learning models have emerged. ...
AI-based systems and applications have pervaded everyday life, making decisions that have a momentous impact on individuals and society. With the staggering growth of online data, often termed the Online Infosphere, it has become paramount to monitor the infosphere to ensure social good, as AI-based decisions depend heavily on it. The goal of this survey is to provide a comprehensive review of some of the most important research areas related to the infosphere, focusing on the technical challenges and potential solutions. The survey also outlines some important future directions. We begin with discussions of the collaborative systems that have emerged within the infosphere, with a special thrust on Wikipedia. We then demonstrate how the infosphere has been instrumental in the growth of scientific citations and collaborations, thus fueling interdisciplinary research. Finally, we illustrate issues related to the governance of the infosphere, such as tackling (a) rising hateful and abusive behavior and (b) bias and discrimination in different online platforms and in news reporting.
... A rich tradition in HCI and CSCW has contributed to our understanding of how crowds can work together, build upon each other's explorations and insights, and work to achieve common goals -sometimes yielding better results than domain experts working in isolation. Past work has explored collective sensemaking across a range of contexts, including citizen science [17], image labelling [96], knowledge mapping and curation (e.g., on Wikipedia) [31,37,45,46], and social commerce [13,16]. Similarly, in everyday algorithm audits, users come together to collectively question, detect, hypothesize, and theorize problematic machine behaviors during their interactions with algorithmic systems. ...
A growing body of literature has proposed formal approaches to audit algorithmic systems for biased and harmful behaviors. While formal auditing approaches have been greatly impactful, they often suffer major blindspots, with critical issues surfacing only in the context of everyday use once systems are deployed. Recent years have seen many cases in which everyday users of algorithmic systems detect and raise awareness about harmful behaviors that they encounter in the course of their everyday interactions with these systems. However, to date little academic attention has been granted to these bottom-up, user-driven auditing processes. In this paper, we propose and explore the concept of everyday algorithm auditing, a process in which users detect, understand, and interrogate problematic machine behaviors via their day-to-day interactions with algorithmic systems. We argue that everyday users are powerful in surfacing problematic machine behaviors that may elude detection via more centrally-organized forms of auditing, regardless of users' knowledge about the underlying algorithms. We analyze several real-world cases of everyday algorithm auditing, drawing lessons from these cases for the design of future platforms and tools that facilitate such auditing behaviors. Finally, we discuss work that lies ahead, toward bridging the gaps between formal auditing approaches and the organic auditing behaviors that emerge in everyday use of algorithmic systems.
... Potential members reason prospectively about the community and whether they can imagine themselves as a part of it [2][3][4]. Over time, individuals will remain or leave depending on how existing members respond to them (e.g., do they bite the newbies or reach out to offer support?) or based on shifting perceptions of the community, other members, or their own role [12,22,23,35,37]. As part of these decisions, people may estimate the impact and importance of their contributions. ...
Why are online community sizes so extremely unequal? Most answers to this question have pointed to general mathematical processes drawn from physics like cumulative advantage. These explanations provide little insight into specific social dynamics or decisions that individuals make when joining and leaving communities. In addition, explanations in terms of cumulative advantage do not draw from the enormous body of social computing research that studies individual behavior. Our work bridges this divide by testing whether two influential social mechanisms used to explain community joining can also explain the distribution of community sizes. Using agent-based simulations, we evaluate how well individual-level processes of social exposure and decisions based on individual expected benefits reproduce empirical community size data from Reddit. Our simulations contribute to social computing theory by providing evidence that both processes together---but neither alone---generate realistic distributions of community sizes. Our results also illustrate the potential value of agent-based simulation to online community researchers to both evaluate and bridge individual and group-level theories.
... These studies have often defined conflict according to discrete forms of interaction and social order between users and the platform, and then analyzed the dynamics and occurrences of those modes [61,139,148]. For example, Wikipedia 'edit wars' -- high-frequency conflicts in which editors aggressively revert or add content -- are often cited as a cornerstone of Wikipedia conflict [67,148]. Other research has pointed to Wikipedia's culture of treating new editors as second-class citizens before they learn to apply Wikipedia's many complicated rules. ...
This paper investigates a hidden dimension of research with real world stakes: research subjects who care -- sometimes deeply -- about the topic of the research in which they participate. They manifest this care, we show, by managing how they are represented in the research process, by exercising politics in shaping knowledge production, and sometimes in experiencing trauma in the process. We draw first-hand reflections on participation in diversity research on Wikipedia, transforming participants from objects of study to active negotiators of research process. We depict how care, vulnerability, harm, and emotions shape ethnographic and qualitative data. We argue that, especially in reflexive cultures, research subjects are active agents with agendas, accountabilities, and political projects of their own. We propose ethics of care and collaboration to open up new possibilities for knowledge production and socio-technical intervention in HCI.
... The retention of newcomers is decreasing in open source communities [18,49]. It may be that as communities evolve, they create barriers to entry [19]. One such barrier is the cumulated community norms and practices. ...
Experienced members of online communities use discussion to familiarize newcomers with norms. These members use imperatives, a kind of directive speech act, to suggest a course of action. A method for automatically recognizing such imperatives is described here. The recognition performance of the algorithm is compared to that of human readers. In addition, to test and illustrate the technique, the imperatives in a sample of Wikipedia deletion discussions are extracted, analyzed, and discussed. The method may be used not only to understand a community's culture and practices but also to elicit information that is beneficial to the community's newcomers.
... Individual characteristics. Behavior of community members and their adherence to community norms has been shown to vary according to their level of involvement with the community [6,10,18,42]. We find that individuals who exhibit a high level of community involvement (e.g., have interacted with more users over a longer period of time) are less likely to abandon the community after a temporary block. ...
Community norm violations can impair constructive communication and collaboration online. As a defense mechanism, community moderators often address such transgressions by temporarily blocking the perpetrator. Such actions, however, come with the cost of potentially alienating community members. Given this tradeoff, it is essential to understand to what extent, and in which situations, this common moderation practice is effective in reinforcing community rules.
In this work, we introduce a computational framework for studying the future behavior of blocked users on Wikipedia. After their block expires, they can take several distinct paths: they can reform and adhere to the rules, but they can also recidivate, or straight-out abandon the community. We reveal that these trajectories are tied to factors rooted both in the characteristics of the blocked individual and in whether they perceived the block to be fair and justified. Based on these insights, we formulate a series of prediction tasks aiming to determine which of these paths a user is likely to take after being blocked for their first offense, and demonstrate the feasibility of these new tasks. Overall, this work builds towards a more nuanced approach to moderation by highlighting the tradeoffs that are in play.
... In so doing, we have associated the collectively generated explanations with certain shared values and beliefs upheld by the LoL community. However, researchers from CSCW have also critically reflected upon the notion of community, acknowledging that a particular online community could have peculiar norms that deviate from societal values [2], and that even successful online communities like Wikipedia might have already developed values and norms that marginalize newcomers and minority groups [10,41]. Elsewhere, scholars have criticized the gamer culture for being masculine and toxic [67], and LoL is notorious for its players' toxicity [61]. ...
Artificial intelligence (AI) has become prevalent in our everyday technologies and impacts both individuals and communities. The explainable AI (XAI) scholarship has explored the philosophical nature of explanation and technical explanations, which are usually driven by experts in lab settings and can be challenging for laypersons to understand. In addition, existing XAI research tends to focus on the individual level. Little is known about how people understand and explain AI-led decisions in the community context. Drawing from XAI and activity theory, a foundational HCI theory, we theorize how explanation is situated in a community's shared values, norms, knowledge, and practices, and how situated explanation mediates community-AI interaction. We then present a case study of AI-led moderation, where community members collectively develop explanations of AI-led decisions, most of which are automated punishments. Lastly, we discuss the implications of this framework at the intersection of CSCW, HCI, and XAI.
... According to [14], vandalism is strongly related to the contributor's profile, but it can be perpetrated by new users as well as experienced contributors. Besides, good-quality data contributed by inexperienced users should be recognized at their true value to ensure the quality of the work done on crowdsourcing platforms [33]. ...
Though Volunteered Geographic Information (VGI) has the advantage of providing free open spatial data, it is prone to vandalism, which may heavily decrease the quality of these data. Therefore, detecting vandalism in VGI may constitute a first way of assessing the data in order to improve their quality. This article explores the ability of supervised machine learning approaches to detect vandalism in OpenStreetMap (OSM) in an automated way. For this purpose, our work includes the construction of a corpus of vandalism data, given that no OSM vandalism corpus is available so far. Then, we investigate the ability of random forest methods to detect vandalism on the created corpus. Experimental results show that random forest classifiers perform well in detecting vandalism in the same geographical regions that were used for training the model but have more issues with vandalism detection in “unfamiliar regions”.
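A random-forest-style classifier of the kind the article evaluates can be sketched in miniature: bootstrap sampling plus single-feature decision stumps over random feature subsets, combined by majority vote. The edit features below (objects deleted, edit size, account age) are illustrative stand-ins, not the article's actual feature set or corpus.

```python
# Toy random-forest-style ensemble for vandalism detection on edit features.
# Feature names and data rows are hypothetical illustrations.
import random

def stump(X, y, feat_idxs):
    """Best single-feature threshold split over a random feature subset."""
    best = None
    for j in feat_idxs:
        for t in sorted({x[j] for x in X}):
            # predict 1 (vandalism) when feature value exceeds threshold
            acc = sum((x[j] > t) == bool(lbl) for x, lbl in zip(X, y)) / len(y)
            flip = acc < 0.5           # allow the opposite orientation too
            score = max(acc, 1 - acc)
            if best is None or score > best[0]:
                best = (score, j, t, flip)
    _, j, t, flip = best
    return lambda x: int((x[j] > t) != flip)

def forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]   # bootstrap sample
        feats = rng.sample(range(len(X[0])), k=2)              # random feature subset
        trees.append(stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return lambda x: int(sum(t(x) for t in trees) > n_trees / 2)  # majority vote

# Hypothetical edits: [objects_deleted, edit_size, account_age_days]
X = [[40, 500, 1], [55, 900, 0], [60, 700, 2], [2, 30, 400],
     [1, 10, 900], [3, 60, 300], [50, 800, 3], [0, 20, 700]]
y = [1, 1, 1, 0, 0, 0, 1, 0]   # 1 = vandalism
clf = forest(X, y)
print(clf([45, 600, 1]), clf([1, 25, 500]))  # mass deletion vs. routine edit
```

The "unfamiliar regions" finding is intuitive in this framing: thresholds learned from one region's edit distribution need not transfer to regions where typical edit sizes and account profiles differ.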
... As we note in our results, new participants' messages in our data matched these patterns; in this sample, new participants were much more likely than established members to ask questions and request information, though we find evidence that this was not necessarily a first step toward increased future participation. These cautious first steps are logical for new participants; the responses they receive to their first contributions may be read as major signals of whether they will be welcome on a platform [11]. ...
Knowledge bases are becoming a key asset leveraged for various types of applications on the Web, from search engines presenting ‘entity cards’ as the result of a query, to the use of structured data of knowledge bases to empower virtual personal assistants. Wikidata is an open general-interest knowledge base that is collaboratively developed and maintained by a community of thousands of volunteers. One of the major challenges faced in such a crowdsourcing project is to attain a high level of editor engagement. In order to intervene and encourage editors to be more committed to editing Wikidata, it is important to be able to predict at an early stage whether or not an editor will become an engaged editor. In this paper, we investigate this problem and study the evolution that editors with different levels of engagement exhibit in their editing behaviour over time. We measure an editor’s engagement in terms of (i) the volume of edits provided by the editor and (ii) their lifespan (i.e. the length of time for which an editor is present at Wikidata). The large-scale longitudinal data analysis that we perform covers Wikidata edits over almost 4 years. We monitor evolution on a session-by-session and monthly basis, observing the way the participation, the volume and the diversity of edits done by Wikidata editors change. Using the findings in our exploratory analysis, we define and implement prediction models that use these multiple evolution indicators.
This chapter considers the theory of social machines, from three perspectives. First, it looks at social machines as social; second, as machines; third, it takes the perspective of the data that fuels the machines. Looking at the sociality of social machines, the chapter considers various approaches to developing meaningful narratives around the operation of social machines, including prosopography, wayfaring and study of information tokens across platforms in transcendental information cascades. The issues surrounding the feedback loops of reflexivity are considered, as are the need for diversity and the possibility of so-called Mandevillian intelligence, where the collective intelligence of the group is enhanced, not degraded, by the imperfect reasoning of its participants. Looking at the mechanical aspects of social machines, the chapter considers the use of a formal process language the Lightweight Social Calculus (LSC) to map out the potential interactions between participants and technology support. The specification of shadow institutions using LSC is described, as is the use of a simplified diagrammatic calculus called Sociograms to allow the design of LSC specifications. From the data perspective, the chapter looks at annotation and provenance. In particular, it maps out a provenance methodology for keeping records about where data have come from, over which data scientists can reason. The chapter concludes with two examples of the use of provenance to understand social machines, and two examples of the use of social machines to create provenance records.
Purpose
This paper aims to explore the effect of participant composition and contribution behavior of the different types of participants on the quality of knowledge generation in online communities.
Design/methodology/approach
This study samples all the featured articles in Chinese Wikipedia and performs a Cox regression to reveal how participant composition and contribution behavior affect the quality of articles in different contexts.
Findings
The results show that an increase in the number of participants increases the possibility of either enhancing or reducing the article quality. In most cases, the greater the proportion of core members (people who frequently participate in editing), the higher the possibility of enhancing the article quality. Occasional participants’ editorial behavior hinders quality promotion, but this negative effect weakens when such editorial behavior becomes more frequent.
Practical implications
The findings help to better leverage the role of online communities in practice and to achieve knowledge collaboration in a more efficient manner. For example, an appropriate centralized organizational form should be established in online communities to improve the efficiency of crowd contributions, and it is worth developing mechanisms that encourage participants to edit articles frequently.
Originality/value
This study contributes to the research on the organizational forms of online communities by showing the effect of participant composition and behavior in the new form of organizing on knowledge generation. This study also contributes to the research on wisdom of crowds by revealing who in a group of participants, in what context, and by what means influence knowledge generation.
WikiTribune is a pilot news service, where evidence-based articles are co-created by professional journalists and a community of volunteers using an open and collaborative digital platform. The WikiTribune project is set within an evolving and dynamic media landscape, operating under principles of openness and transparency. It combines a commercial for-profit business model with an open collaborative mode of production with contributions from both paid professionals and unpaid volunteers. This descriptive case study captures the first 12-months of WikiTribune's operations to understand the challenges and opportunities within this hybrid model of production. We use the rich literature on Wikipedia to understand the WikiTribune case and to identify areas of convergence and divergence, as well as avenues for future research. Data was collected on news articles with a focus on the time it takes for an article to reach published status, the number and type of contributors typically involved, article activity and engagement levels, and the types of topics covered.
Attracting and retaining newcomers is critical and challenging for online production communities such as Wikipedia, both because volunteers need specialized training and are likely to leave before being integrated into the community. In response to these challenges, the Wikimedia Foundation started the Wiki Education Project (Wiki Ed), an online program in which college students edit Wikipedia articles as class assignments. The Wiki Ed program incorporates many components of institutional socialization, a process many conventional organizations successfully use to integrate new employees through formalized on-boarding practices. Research has not adequately investigated whether Wiki Ed and similar programs are effective ways to integrate volunteers in online communities, and, if so, the mechanisms involved. This paper evaluates the Wiki Ed program by comparing 16,819 student editors in 770 Wiki Ed classes with new editors who joined Wikipedia in the conventional way. The evaluation shows that the Wiki Ed students did more work, improved articles more, and were more committed to Wikipedia. For example, compared to new editors who joined Wikipedia in the conventional way they were twice as likely to still be editing Wikipedia a year after their Wiki Ed class was finished. Further, students in classrooms that encouraged joint activity, a key component of institutional socialization, produced better quality work than those in classrooms where students worked independently. These findings are consistent with an interpretation that the Wiki Ed program was successful because it incorporated elements of institutionalized socialization.
Peer production projects involve people in many tasks, from editing articles to analyzing datasets. To facilitate mastery of these practices, projects offer a number of learning resources, ranging from project-defined FAQs to individually-oriented search tools and communal discussion boards. However, it is not clear which project resources best support participant learning, overall and at different stages of engagement. We draw on Sørensen's framework of forms of presence to distinguish three types of engagement with learning resources: authoritative, agent-centered and communal. We assigned resources from the Gravity Spy citizen-science project to these three categories and analyzed trace data recording interactions with resources using a mixed-effects logistic regression with volunteer performance as an outcome variable. The findings suggest that engagement with authoritative resources (e.g., those constructed by project organizers) facilitates performance initially. However, as tasks become more difficult, volunteers seek and benefit from engagement with their own agent-centered resources and community-generated resources. These findings suggest a broader scope for the design of learning resources for peer production.
Modern research is inescapably digital, with data and publications most often created, analyzed, and stored electronically, using tools and methods expressed in software. While some of this software is general‐purpose office software, a great deal of it is developed specifically for research, often by researchers themselves. Research software is essential to progress in science, engineering, and all other fields, but it is often not developed, shared, or stored in a sustainable way. The following paper presents findings from an ethnography of two research software projects that have, over the last 10 years, cooperatively organized development efforts to produce important software enabling scientific breakthroughs in both astronomy and macromolecular modeling. The work of these two projects are framed in terms of James Carse's model of finite and infinite games. I argue that by incentivizing institutional governance that resembles the design of an infinite game, funding agencies can increase the sustainability of research software and improve various aspects of data‐driven scientific discovery.
Many open collaboration platforms (e.g., Wikipedia) have recently utilized automated algorithmic agents, called Bots, to solve the issue of stagnating user participation. However, understanding of how such a bot agent affects user activities in open collaboration is limited. In this study, we examine (i) whether bot intervention affects user participation and (ii) whether the impact of bot intervention could vary by characteristics of open collaboration works. We pursue our research goals by utilizing a rich dataset between 2005 and 2017 from an online open collaboration platform. The platform has recently utilized two types of unique bots that interact with human users in open collaboration. By employing a difference-in-differences approach, we show that bot intervention in open collaboration leads to unintended consequences as it significantly demotivates user participation. We further show that the negative effect of bot intervention is alleviated if more active users are engaged or if covered topics have a higher concentration in a collaboration work. This provides directions for businesses to resolve the unintended demotivation problem in utilizing bot agents.
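The difference-in-differences estimator behind studies like this one compares the change in an outcome for a treated group against the change for a control group over the same period. A minimal two-period sketch, assuming simple group means (the actual study would add covariates and fixed effects):

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-period difference-in-differences.

    Each argument is a list of an activity metric (e.g. monthly edits
    per user) for bot-intervened (treated) or untouched (control)
    users, before or after the intervention. Returns the estimated
    effect of the intervention on the metric.
    """
    mean = lambda xs: sum(xs) / len(xs)
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change
```

A negative estimate, as the study reports, would indicate that bot intervention reduces participation relative to the control group's trend.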
These days, user-generated content platforms such as social media, question-answering Websites, and open collaboration systems are a source of information for many. These platforms survive thanks to the pool of active contributors who generate content. As a consequence, they continuously face the problem of acquiring new users and retaining them on the platform.
Wikis are a type of collaborative repository system that enables users to create and edit shared content on the Web. The popularity and proliferation of Wikis have created a new set of challenges for trust research because the content in a Wiki can be contributed by a wide variety of users and can change rapidly. Nevertheless, most Wikis lack explicit trust management to help users decide how much they should trust an article or a fragment of an article. In this paper, we investigate the dynamic nature of revisions as we explore ways of utilizing revision history to develop an article fragment trust model. We use our model to compute trustworthiness of articles and article fragments. We also augment Wikis with a trust view layer with which users can visually identify text fragments of an article and view trust values computed by our model.
Socialization of newcomers is critical for both conventional and online groups. It helps groups perform effectively and helps newcomers develop commitment. However, little empirical research has investigated the impact of specific socialization tactics on newcomers' commitment to online groups. We examined WikiProjects, subgroups in Wikipedia organized around working on common topics or tasks. In study 1, we identified the seven socialization tactics used most frequently: invitations to join, welcome messages, requests to work on project-related tasks, offers of assistance, positive feedback on a new member's work, constructive criticism, and personal-related comments. In study 2, we examined their impact on newcomers' commitment to the project. Whereas most newcomers contributed fewer edits over time, the declines were slowed or reversed for those socialized with welcome messages, assistance, and constructive criticism. In contrast, invitations led to steeper declines in edits. These results suggest that different socialization tactics play different roles in socializing new members in online groups compared to offline ones.
Wikipedia's success is often attributed to the large numbers of contributors who improve the accuracy, completeness and clarity of articles while reducing bias. However, because of the coordination needed to write an article collaboratively, adding contributors is costly. We examined how the number of editors in Wikipedia and the coordination methods they use affect article quality. We distinguish between explicit coordination, in which editors plan the article through communication, and implicit coordination, in which a subset of editors structure the work by doing the majority of it. Adding more editors to an article improved article quality only when they used appropriate coordination techniques and was harmful when they did not. Implicit coordination through concentrating the work was more helpful when many editors contributed, but explicit coordination through communication was not. Both types of coordination improved quality more when an article was in a formative stage. These results demonstrate the critical importance of coordination in effectively harnessing the "wisdom of the crowd" in online production environments.
Online production groups have the potential to transform the way that knowledge is produced and disseminated. One of the most widely used forms of online production is the wiki, which has been used in domains ranging from science to education to enterprise. We examined the development of and interactions between coordination and conflict in a sample of 6811 wiki production groups. We investigated the influence of four coordination mechanisms: intra-article communication, inter-user communication, concentration of workgroup structure, and policy and procedures. We also examined the growth of conflict, finding the density of users in an information space to be a significant predictor. Finally, we analyzed the effectiveness of the four coordination mechanisms on managing conflict, finding differences in how each scaled to large numbers of contributors. Our results suggest that coordination mechanisms effective for managing conflict are not always the same as those effective for managing task quality, and that designers must take into account the social benefits of coordination mechanisms in addition to their production benefits.
In this paper, we examine the social roles of software tools in the English-language Wikipedia, specifically focusing on autonomous editing programs and assisted editing tools. This qualitative research builds on recent research in which we quantitatively demonstrate the growing prevalence of such software in recent years. Using trace ethnography, we show how these often-unofficial technologies have fundamentally transformed the nature of editing and administration in Wikipedia. Specifically, we analyze "vandal fighting" as an epistemic process of distributed cognition, highlighting the role of non-human actors in enabling a decentralized activity of collective intelligence. In all, this case shows that software programs are used for more than enforcing policies and standards. These tools enable coordinated yet decentralized action, independent of the specific norms currently in force.
Wikipedia is a wiki-based encyclopedia that has become one of the most popular collaborative on-line knowledge systems. As in any large collaborative system, as Wikipedia has grown, conflicts and coordination costs have increased dramatically. Visual analytic tools provide a mechanism for addressing these issues by enabling users to more quickly and effectively make sense of the status of a collaborative environment. In this paper we describe a model for identifying patterns of conflicts in Wikipedia articles. The model relies on users' editing history and the relationships between user edits, especially revisions that void previous edits, known as "reverts". Based on this model, we constructed Revert Graph, a tool that visualizes the overall conflict patterns between groups of users. It enables visual analysis of opinion groups and rapid interactive exploration of those relationships via detail drill- downs. We present user patterns and case studies that show the effectiveness of these techniques, and discuss how they could generalize to other systems.
The problem of identifying trustworthy information on the World Wide Web is becoming increasingly acute as new tools such as wikis and blogs simplify and democratize publication. Wikipedia is the most extraordinary example of this phenomenon and, although a few mechanisms have been put in place to improve contribution quality, trust in Wikipedia content quality has been seriously questioned. We thought that a deeper understanding of what in general defines high standards and expertise in domains related to Wikipedia - i.e. content quality in a collaborative environment - mapped onto Wikipedia elements would lead to a complete set of mechanisms to sustain trust in the Wikipedia context. Our evaluation conducted on about 8,000 articles, representing 65% of the overall Wikipedia editing activity, shows that the new trust evidence that we extracted from Wikipedia allows us to transparently and automatically compute trust values to isolate articles of great or low quality.
Wikipedia is a highly successful example of what mass collaboration in an informal peer review system can accomplish. In this paper, we examine the role that the quality of the contributions, the experience of the contributors and the ownership of the content play in the decisions over which contributions become part of Wikipedia and which ones are rejected by the community. We introduce and justify a versatile metric for automatically measuring the quality of a contribution. We find little evidence that experience helps contributors avoid rejection. In fact, as they gain experience, contributors are even more likely to have their work rejected. We also find strong evidence of ownership behaviors in practice despite the fact that ownership of content is discouraged within Wikipedia.
We examine the Information Quality aspects of Wikipedia. By a study of the discussion pages and other process-oriented pages within the Wikipedia project, it is possible to determine the information quality dimensions that participants in the editing process care about, how they talk about them, what tradeoffs they make between these dimensions and how the quality assessment and improvement process operates. This analysis helps in understanding how high quality is maintained in a project where anyone may participate with no prior vetting. It also carries implications for improving the quality of more conventional datasets.
Wikipedia, a wiki-based encyclopedia, has become one of the most successful experiments in collaborative knowledge building on the Internet. As Wikipedia continues to grow, the potential for conflict and the need for coordination increase as well. This article examines the growth of such non-direct work and describes the development of tools to characterize conflict and coordination costs in Wikipedia. The results may inform the design of new collaborative knowledge systems.
Territoriality, the expression of ownership towards an object, can emerge when social actors occupy a shared social space. In the case of Wikipedia, the prevailing cultural norm is one that warns against ownership of one's work. However, we observe the emergence of territoriality in online space with respect to a subset of articles that have been tagged with the Maintained template through a qualitative study of 15 editors who have self-designated as Maintainers. Our participants communicated ownership, demarcated boundaries and asserted their control over artifacts for the sake of quality by appropriating existing features of Wikipedia. We then suggest design strategies to support these behaviors in the proper context within collaborative authoring systems more generally.
Prior research on Wikipedia has characterized the growth in content and editors as being fundamentally exponential in nature, extrapolating current trends into the future. We show that recent editing activity suggests that Wikipedia growth has slowed, and perhaps plateaued, indicating that it may have come up against its limits to growth. We measure growth, population shifts, and patterns of editor and administrator activities, contrasting these against past results where possible. Both the rate of page growth and editor growth have declined. As growth has declined, there are indicators of increased coordination and overhead costs, exclusion of newcomers, and resistance to new edits. We discuss some possible explanations for these new developments in Wikipedia, including decreased opportunities for sharing existing knowledge and increased bureaucratic stress on the socio-technical system itself. The existing trends of exponential growth in digital technologies were the basis for Kurzweil's (17) argument that biological evolution and technological evolution follow a law of accelerating returns (i.e., exponential or even super-exponential growth). This led to the notion of the "Singularity": a point in the near future when technological change becomes "so rapid and profound that it represents a rupture in the fabric of human history." We argue that Wikipedia, one of the world's largest knowledge aggregators, does indeed mirror the growth of natural populations, but, following Darwin (7), we suggest that this growth becomes increasingly constrained and limited, and under those conditions there will be increased evidence of competition and dominance. In this paper, we present data that challenges the notion that Wikipedia exhibits unconstrained exponential growth in editor participation and contribution. We will show that growth has decreased substantially over the last two years, perhaps indicating some fundamental limiting constraints to growth.
In ecological systems, when unfettered population growth approaches natural limits (e.g., in available resources), one generally observes increased competition. For Wikipedia, we will examine the data for indicators of increased competition that would be expected as a growing population system comes up against limits to growth. We present data from Wikipedia addressing three different aspects over time: the global activity level, a detailed analysis of the edit rates of various editor classes, and the population shifts in editor classes.
Effective information quality analysis needs powerful yet easy ways to obtain metrics. The English version of Wikipedia provides an extremely interesting yet challenging case for the study of Information Quality dynamics at both macro and micro levels. We propose seven IQ metrics which can be evaluated automatically and test the set on a representative sample of Wikipedia content. The methodology of the metrics construction and the results of tests, along with a number of statistical characterizations of Wikipedia articles, their content construction, process metadata and social context are reported.
This paper assesses the content- and population-dynamics of a large sample of wikis, over a timespan of several months, in order to identify basic features that may predict or induce different types of fate. We analyze and discuss, in particular, the correlation of various macroscopic indicators, structural features and governance policies with specific growth patterns. While recent analyses of wiki dynamics have mostly focused on popular projects such as Wikipedia, we suggest research directions towards a more general theory of the dynamics of such communities, across a wide range of wikis at various stages of development.
Wikipedia is a large and rapidly growing Web-based collaborative authoring environment, where anyone on the Internet can create, modify, and delete pages about encyclopedic topics. A remarkable property of some Wikipedia pages is that they are written by up to thousands of authors who may have contradicting opinions. In this paper we show that a visual analysis of the who revises whom- network gives deep insight into controversies. We propose a set of analysis and visualization techniques that reveal the dominant authors of a page, the roles they play, and the alters they confront. Thereby we provide tools to understand how Wikipedia authors collaborate in the presence of controversy.
Wikipedia's brilliance and curse is that any user can edit any of the encyclopedia entries. We introduce the notion of the impact of an edit, measured by the number of times the edited version is viewed. Using several datasets, including recent logs of all article views, we show that an overwhelming majority of the viewed words were written by frequent editors and that this majority is increasing. Similarly, using the same impact measure, we show that the probability of a typical article view being damaged is small but increasing, and we present empirically grounded classes of damage. Finally, we make policy recommendations for Wikipedia and other wikis in light of these findings.
Wiki systems typically display article history as a linear sequence of revisions in chronological order. This representation hides deeper relationships among the revisions, such as which earlier revision provided most of the content for a later revision, or when a revision effectively reverses the changes made by a prior revision. These relationships are valuable in understanding what happened between editors in conflict over article content. We present methods for detecting when a revision discards the work of one or more other revisions, a means of visualizing these relationships in-line with existing history views, and a computational method for detecting discarded work. We show through a series of examples that these tools can aid mediators of wiki content disputes by making salient the structure of the ongoing conflict. Further, the computational tools provide a means of determining whether or not a revision has been accepted by the community of editors surrounding the article.
We examined wiki use in a range of enterprise settings. We found many thriving wikis, but they were a minority of the thousands for which we obtained data. Even an actively used wiki can disappoint some important stakeholders. Careful stakeholder analysis and education may be crucial to successful wiki deployment. We identify a range of success factors, sources of wiki abandonment, and approaches to addressing the challenges. Some of our observations may extend to other social media.
A number of studies have assessed the reliability of entries in Wikipedia at specific times. One important difference between Wikipedia and traditional media, however, is the dynamic nature of its entries. An entry assessed today might be substantially extended or reworked tomorrow. This study assesses the frequency with which small, inaccurate changes are quickly corrected.
The Internet has fostered an unconventional and powerful style of collaboration: "wiki" web sites, where every visitor has the power to become an editor. In this paper we investigate the dynamics of Wikipedia, a prominent, thriving wiki. We make three contributions. First, we introduce a new exploratory data analysis tool, the history flow visualization, which is effective in revealing patterns within the wiki context and which we believe will be useful in other collaborative situations as well. Second, we discuss several collaboration patterns highlighted by this visualization tool and corroborate them with statistical analysis. Third, we discuss the implications of these patterns for the design and governance of online collaborative social spaces. We focus on the relevance of authorship, the value of community surveillance in ameliorating antisocial behavior, and how authors with competing perspectives negotiate their differences.
Wikipedia represents an intriguing new publishing paradigm: can it be used to engage students in authentic collaborative writing activities? How can we design wiki publishing tools and curricula to support learning among student authors? We suggest that wiki publishing environments can create learning opportunities that address four dimensions of authenticity: personal, real world, disciplinary, and assessment. We have begun a series of design studies to investigate links between wiki publishing experiences and writing-to-learn. The results of an initial study in an undergraduate government course indicate that perceived audience plays an important role in helping students monitor the quality of writing; however, students' perception of audience on the Internet is not straightforward. This preliminary iteration resulted in several guidelines that are shaping efforts to design and implement new wiki publishing tools and curricula for students and teachers.
We present a content-driven reputation system for Wikipedia authors. In our system, authors gain reputation when the edits they perform to Wikipedia articles are preserved by subsequent authors, and they lose reputation when their edits are rolled back or undone in short order. Thus, author reputation is computed solely on the basis of content evolution; user-to-user comments or ratings are not used. The author reputation we compute could be used to flag new contributions from low-reputation authors, or it could be used to allow only authors with high reputation to contribute to controversial or critical pages. A reputation system for Wikipedia could also provide an incentive for high-quality contributions. We have implemented the proposed system, and we have used it to analyze the entire Italian and French Wikipedias, consisting of a total of 691,551 pages and 5,587,523 revisions. Our results show that our notion of reputation has good predictive value: changes performed by low-reputation authors have a significantly larger than average probability of having poor quality, as judged by human observers, and of being later undone, as measured by our algorithms.
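The core update rule of a content-driven reputation system of this kind can be sketched in a few lines. The constants and the asymmetric penalty below are illustrative assumptions, not the parameters of the published algorithm, which weights updates by how much of an author's text actually survives:

```python
def apply_revision_outcome(reputations, author, edit_survived,
                           gain=1.0, penalty=2.0):
    """Update an author's reputation from the fate of one revision.

    An edit preserved by subsequent authors earns `gain`; an edit
    rolled back or undone in short order costs `penalty`. Reputation
    is floored at zero. Returns the author's new reputation.
    """
    delta = gain if edit_survived else -penalty
    reputations[author] = max(0.0, reputations.get(author, 0.0) + delta)
    return reputations[author]
```

Replaying an article's revision history through such a rule yields per-author scores that could flag contributions from low-reputation authors for review.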
In a knowledge-based, networked economy, students leaving university need to have attained skills in collaborative and creative project-based work and to have developed critical, reflective practices. This paper outlines how a wiki can been used as part of social constructivist pedagogical practice which aims to develop advanced ICT literacies in university students. The paper describes the implementation of a wiki-based project as part of a subject in New Media Technologies at Queensland University of Technology. We discuss the strengths and challenges involved in using networked, collaborative learning strategies in institutional environments that still operate in traditional paradigms.
Wikipedia. Editor trends study. http://strategy.wikimedia.org/?oldid=80283, March 2011.
A. Halfaker, A. Kittur, R. Kraut, and J. Riedl. A jury of your peers: Quality, Experience and Ownership in Wikipedia. In WikiSym '09, pages 1–10, New York, NY, USA, 2009. ACM.
[11] S. J. Karau and K. D. Williams. Groups at work: Theory and research, chapter Understanding Individual Motivation in Groups: The Collective Effort Model, pages 113–141. Lawrence Erlbaum Associates, Inc., 2001.