Aaron Shaw’s research while affiliated with Northwestern University and other places

Publications (53)


The Introduction of README and CONTRIBUTING Files in Open Source Software Development
  • Preprint
  • File available

February 2025 · 5 Reads

Matthew Gaughan · Kaylea Champion · Sohyeon Hwang · Aaron Shaw

README and CONTRIBUTING files can serve as the first point of contact for potential contributors to free/libre and open source software (FLOSS) projects. Prominent open source software organizations such as Mozilla, GitHub, and the Linux Foundation advocate that projects provide community-focused and process-oriented documentation early to foster recruitment and activity. In this paper we investigate the introduction of these documents in FLOSS projects, including whether early documentation conforms to these recommendations or explains subsequent activity. We use a novel dataset of FLOSS projects packaged by the Debian GNU/Linux distribution and conduct a quantitative analysis to examine README (n=4226) and CONTRIBUTING (n=714) files when they are first published into projects' repositories. We find that projects create minimal READMEs proactively, but often publish CONTRIBUTING files following an influx of contributions. The initial versions of these files rarely focus on community development, instead containing descriptions of project procedure for library usage or code contribution. The findings suggest that FLOSS projects do not create documentation with community-building in mind, but rather favor brevity and standardized instructions.


Generative Agent Simulations of 1,000 People

November 2024 · 177 Reads · 10 Citations

Joon Sung Park · Carolyn Q. Zou · Aaron Shaw · [...] · Michael S. Bernstein

The promise of human behavioral simulation--general-purpose computational agents that replicate human behavior across domains--could enable broad applications in policymaking and social science. We present a novel agent architecture that simulates the attitudes and behaviors of 1,052 real individuals--applying large language models to qualitative interviews about their lives, then measuring how well these agents replicate the attitudes and behaviors of the individuals that they represent. The generative agents replicate participants' responses on the General Social Survey 85% as accurately as participants replicate their own answers two weeks later, and perform comparably in predicting personality traits and outcomes in experimental replications. Our architecture reduces accuracy biases across racial and ideological groups compared to agents given demographic descriptions. This work provides a foundation for new tools that can help investigate individual and collective behavior.
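
The "85% as accurately" figure in the abstract above reads as a ratio: agent accuracy divided by how consistently participants reproduce their own answers two weeks later. A minimal sketch of that ratio follows, assuming exact-match scoring on categorical items and assuming the first-wave answers serve as the reference. The function names and the sample answers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def match_rate(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of items on which two answer lists agree exactly."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("answer lists must be non-empty and equal in length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def normalized_accuracy(
    agent: Sequence[str], wave1: Sequence[str], wave2: Sequence[str]
) -> float:
    """Agent accuracy relative to the participant's own retest consistency.

    Assumption: wave1 is the reference for both the agent and the retest.
    """
    self_consistency = match_rate(wave2, wave1)
    if self_consistency == 0:
        raise ValueError("participant never repeated an answer; ratio undefined")
    return match_rate(agent, wave1) / self_consistency


# Made-up survey answers for one participant (illustration only).
wave1 = ["agree", "disagree", "agree", "neutral",
         "agree", "disagree", "neutral", "agree"]
wave2 = ["agree", "disagree", "agree", "neutral",
         "agree", "disagree", "agree", "neutral"]   # 6 of 8 repeat wave1
agent = ["agree", "disagree", "agree", "neutral",
         "agree", "neutral", "agree", "disagree"]   # 5 of 8 match wave1

print(f"normalized accuracy: {normalized_accuracy(agent, wave1, wave2):.2f}")
```

Dividing by the retest rate sets the ceiling at the participant's own consistency rather than at 100%, which is why a raw accuracy well below 85% can still correspond to a normalized value of 0.85.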


Adopting Third-party Bots for Managing Online Communities

April 2024 · 5 Reads · 4 Citations

Proceedings of the ACM on Human-Computer Interaction

Bots have become critical for managing online communities on platforms, especially to match the increasing technical sophistication of online harms. However, community leaders often adopt third-party bots, creating room for misalignment in their assumptions, expectations, and understandings (i.e., their technological frames) about them. On platforms where sharing bots can be extremely valuable, how community leaders can revise their frames about bots to more effectively adopt them is unclear. In this work, we conducted a qualitative interview study with 16 community leaders on Discord examining how they adopt third-party bots. We found that participants addressed challenges stemming from uncertainties about a bot's security, reliability, and fit through emergent social ecosystems. Formal and informal opportunities to discuss bots with others across communities enabled participants to revise their technological frames over time, closing gaps in bot-specific skills and knowledge. This social process of learning shifted participants' perspectives of the labor of bot adoption into something that was satisfying and fun, underscoring the value of collaborative and communal approaches to adopting bots. Finally, by shaping participants' mental models of the nature, value, and use of bots, social ecosystems also raise some practical tensions in how they support user creativity and customization in third-party bot use. Together, the social nature of adopting third-party bots in our interviews offers insight into how we can better support the sharing of valuable user-facing tools across online communities.


Fig. 1. Turbulent times for social media. A new study by Annie Chen et al. disentangles the relationships between online behavior and prior beliefs. The study confirms that platforms like YouTube can, and should, do much more to restrict the reach of extremist content to the dedicated audiences that seek it out. Photo by Adem AY on Unsplash
Social media, extremism, and radicalization

August 2023 · 37 Reads · 9 Citations

Science Advances

Fears that YouTube recommendations radicalize users are overblown, but social media still host and profit from dubious and extremist content.



Figure 1. Example of a thread from the talk page for "Darth Maladi" on the Wookieepedia wiki, dedicated to Star Wars information. Users discuss the article's topic and how to improve the article.
Figure 2. Scaled regression coefficients predicting the number of nonreverted words added in the first 700 edits. Polynomial control terms are excluded for clarity.
Figure 3. Scaled regression coefficients predicting the hazard of a wiki becoming inactive. Polynomial control terms are excluded for clarity.
Communication networks do not predict success in attempts at peer production

March 2023 · 49 Reads · 5 Citations

Journal of Computer-Mediated Communication

Although peer production has created valuable information goods like Wikipedia, the GNU/Linux operating system, and Reddit, the majority of attempts at peer production achieve very little. In work groups and teams, coordination and social integration—manifested via dense, integrative communication networks—predict success. We hypothesize that the conditions in which new peer production communities operate make communication problems common and make coordination and integration more difficult, and that variation in the structure of project communication networks will predict project success. In this article, we measure communication networks for 999 early-stage peer production wikis. We assess whether communities displaying network markers of coordination and social integration are more productive and long-lasting. Contrary to our expectations, we find a very weak relationship between communication structure and collaborative performance. We propose that technology may serve as a partial substitute for communication in coordinating work and integrating newcomers in peer production.


Participation inequality in the gig economy

June 2022 · 275 Reads · 23 Citations

In theory, the gig economy facilitates flexible, digitally mediated employment arrangements. Why do some people wind up doing gig work while others do not? We focus on how online participation inequalities, and Internet use experiences and skills, shape the composition of online gig workers. Specifically, we analyze a unique survey data set from a national sample of 1512 U.S. adults that includes information about background attributes and behaviors, detailed measures of Internet experiences and skills, as well as questions about whether study participants had completed specific steps necessary to becoming a task worker on two prominent gig economy platforms: Amazon Mechanical Turk and TaskRabbit. We use Bayesian regression to compare four stages of gig economy participation. Workers who participate in the gig economy tend to be younger, more highly educated, and more skilled Internet users. This implies that the gig economy increases labor market stratification and that digital participation inequalities compound labor inequalities.


Rules and Rule-Making in the Five Largest Wikipedias

May 2022 · 4 Reads · 11 Citations

Proceedings of the International AAAI Conference on Web and Social Media

The governance of many online communities relies on rules created by participants. However, prior work provides limited evidence about how these self-governance efforts compare and relate to one another across communities. Studies tend either to analyze communities as discrete entities or consider communities that coexist within a hierarchically-managed platform. In this paper, we investigate both comparative and relational dimensions of self-governance in similar communities. We use exhaustive trace data from the five largest language editions of Wikipedia over almost 20 years since their founding, and consider both patterns in rule-making and overlaps in rule sets. We find similar rule-making activity across the five communities that replicates and extends prior work on English language Wikipedia alone. However, we also find that these Wikipedias have increasingly unique rule sets, even as editing activity concentrates on rules shared between them. Self-governing communities aligned in key ways may share a common core of rules and rule-making practices as they develop and sustain institutional variations.



The Hidden Costs of Requiring Accounts: Quasi-Experimental Evidence From Peer Production

November 2021 · 27 Reads

Online communities, like Wikipedia, produce valuable public information goods. Whereas some of these communities require would-be contributors to create accounts, many do not. Does this requirement catalyze cooperation or inhibit participation? Prior research provides divergent predictions but little causal evidence. We conduct an empirical test using longitudinal data from 136 natural experiments where would-be contributors to wikis were suddenly required to log in to contribute. Requiring accounts leads to a small increase in account creation, but reduces both high- and low-quality contributions from registered and unregistered participants. Although the change deters a large portion of low-quality participation, the vast majority of deterred contributions are of higher quality. We conclude that requiring accounts introduces an undertheorized tradeoff for public goods production in interactive communication systems.


Citations (44)


... Recent work suggests that language models, such as GPT, can make human-like judgments across a number of domains (Dillion et al., 2023). Park et al. (2024) conducted a test with 1,052 real people, applying large linguistic models to qualitative interviews about their lives and then measuring the extent to which these agents reproduce the attitudes and behaviors of the people they represent and obtained an accuracy of 85%. In this study, we have used generative AI tools to model and experiment with different variables that may influence the acceptance of a CBDC. ...

Reference:

Evaluating the Acceptance of CBDCs: Experimental Research with Artificial Intelligence (AI) Generated Synthetic Response
Generative Agent Simulations of 1,000 People
  • Citing Preprint
  • November 2024

... where private messages are stored) are robustly secured. Prior work has shown that community governance requires users to have substantial time, resources, and expertise to govern effectively [48,61,119]; these challenges could discourage many from running an instance or rely on a default but ill-suited strategy. Meanwhile, how community members can anticipate the impact of governance practices on their privacy is unclear. ...

Adopting Third-party Bots for Managing Online Communities
  • Citing Article
  • April 2024

Proceedings of the ACM on Human-Computer Interaction

... In critical contexts like democratic processes, concerns about AIGC on social media focus primarily on its negative impact on the integrity of online information. Research warns against the threat of deepfakes (Campbell et al. 2022), coordinated information campaigns, and offensive speech targeting opposing viewpoints or vulnerable populations (Shaw 2023). However, we know little about the scale, scope, and influence of AI-generated content online. ...

Social media, extremism, and radicalization

Science Advances

... Studies of topical coverage on Wikipedia have addressed a range of fields, such as science, biographies, cultural heritage, mass culture, and current social affairs (Hill and Shaw, 2020; Reznik and Shatalov, 2016; Minguillón et al., 2017). However, there is no good, broad overview of Wikipedia's contribution to knowledge about literary works or printed works. ...

The Most Important Laboratory for Social Scientific and Computing Research in History

... Research highlights that humans reach better solutions in cooperative teams through mechanisms that increase the diversity of transient solutions, ultimately improving problem solving and innovation (Smaldino et al., 2024). While coordination and social integration are very important for the success of peer production, technology can also play a role in facilitating collaboration and project success, especially in new peer production communities (Foote et al., 2023). In addition, advances in cryptographic technology have led to the development of multi-party collaborative signature schemes, ensuring flexibility, security, and trust in collaborative signing scenarios (Tan et al., 2023). ...

Communication networks do not predict success in attempts at peer production

Journal of Computer-Mediated Communication

... Furthermore, the cultural context of a language edition [13,14] and its editor demographic [15] can also shape what is deemed relevant on Wikipedia [16]. Lastly, given the diversity of Wikipedia's many language editions, for example in size and popularity [17], rules [18], trends [19], knowledge propagation [20], article quality ratings [21], link structures [22], or topic representation [23,24] and categorization [25], one should also account for such differences when analyzing inequality on Wikipedia. ...

Rules and Rule-Making in the Five Largest Wikipedias
  • Citing Article
  • May 2022

Proceedings of the International AAAI Conference on Web and Social Media

... OpenStreetMap (OSM), for example, is an open-sourced map system that allows users to add location-related elements to it, and has been leveraged in commercial (e.g., MapBox) and non-profit projects [11,13,14,17]. Among the diverse sites of these VGI projects, campuses are among the most popular with researchers, who have assigned their participants tasks including seat-capacity/availability checks [5,8], queue-length estimation [8], environment and hygiene tracking [8], provision of lost-and-found information [9], and security patrolling [15], among others. However, a key limitation of these projects is that they have allowed their participants either to report static information (e.g., about facilities [11,13,16]) or to provide real-time status updates about specific locations (e.g. ...

Studying the Effects of Task Notification Policies on Participation and Outcomes in On-the-go Crowdsourcing
  • Citing Article
  • September 2016

Proceedings of the AAAI Conference on Human Computation and Crowdsourcing

... There has been much research, for example, in precarity and economic insecurity associated with the gig economy. It has been found, for example, that workers active in the digital gig economy around platforms such as Amazon Mechanical Turk and TaskRabbit tend to be younger, highly educated and skilful at using the internet (Shaw et al., 2023): this leads to participation inequality, as well as to a deepening of the already-existing digital divide (Lythreatis et al., 2022). Other existing inequalities, such as employment gender inequality, have also been found to be replicated on digital employment platforms (Vyas, 2021), while one of the oft-touted promises of platforms, that of providing more economic opportunities for those on the lowest incomes, does not seem to be realised in practice (Schor, 2017). ...

Participation inequality in the gig economy
  • Citing Article
  • June 2022

... However, Punjabi Wikipedia also shows a broader inclusion of international personalities, indicating a slightly wider scope in terms of content focus. This reflects different editorial priorities and perhaps varying user interests in each linguistic community (Khatri and Shaw, 2022). Overall, both Setswana and Punjabi Wikipedias are actively engaged in documenting the achievements and histories of prominent figures, although Punjabi Wikipedia includes a mix of notable lists alongside individual profiles. ...

The social embeddedness of peer production: A comparative qualitative analysis of three Indian language Wikipedia editions
  • Citing Conference Paper
  • April 2022

... As preregistered, study 4 participants were excluded if they did not complete the pre-study survey or if they filled out fewer than five surveys over the course of the study. We did not collect demographic information for the children in studies 1-3; however, general information about the demographics of MTurk workers can be found in Shaw and Hargittai's (2021) study. More female (vs. ...

Do the online activities of Amazon Mechanical Turk workers mirror those of the general population?: A comparison of two survey samples
  • Citing Article
  • January 2021

International Journal of Communication