Article

You Can't Stay Here: The Efficacy of Reddit's 2015 Ban Examined Through Hate Speech


Abstract

In 2015, Reddit closed several subreddits, foremost among them r/fatpeoplehate and r/CoonTown, due to violations of Reddit's anti-harassment policy. However, the effectiveness of banning as a moderation approach remains unclear: banning might diminish hateful behavior, or it may relocate such behavior to different parts of the site. We study the ban of r/fatpeoplehate and r/CoonTown in terms of its effect on both participating users and affected subreddits. Working from over 100M Reddit posts and comments, we generate hate speech lexicons to examine variations in hate speech usage via causal inference methods. We find that the ban worked for Reddit. More accounts than expected discontinued using the site; those that stayed drastically decreased their hate speech usage, by at least 80%. Though many subreddits saw an influx of r/fatpeoplehate and r/CoonTown "migrants," those subreddits saw no significant changes in hate speech usage. In other words, other subreddits did not inherit the problem. We conclude by reflecting on the apparent success of the ban, discussing implications for online moderation, Reddit, and internet communities more broadly.
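The abstract above describes a lexicon-based measurement of hate speech combined with causal inference over user activity before and after the ban. As a rough illustration only (not the authors' actual pipeline), the Python sketch below computes a single user's rate of hate-lexicon matches before and after the ban date; the lexicon entries, the date handling, and the data layout are placeholder assumptions.

```python
# Toy illustration of the measurement step described in the abstract: the rate of
# hate-lexicon matches in a user's comments before vs. after the ban. This is NOT
# the authors' pipeline; the lexicon entries and data layout are placeholders.
import re

BAN_DATE = "2015-06-10"  # approximate date of the r/fatpeoplehate ban; r/CoonTown was banned later that summer
hate_lexicon = {"examplehateterm1", "examplehateterm2"}  # placeholder lexicon entries

word_re = re.compile(r"[a-z']+")

def hate_rate(comments):
    """Fraction of tokens across the given comments that appear in the lexicon."""
    tokens = [t for c in comments for t in word_re.findall(c.lower())]
    return sum(t in hate_lexicon for t in tokens) / len(tokens) if tokens else 0.0

def pre_post_rates(user_comments):
    """user_comments: list of (iso_date, text) pairs for a single user."""
    pre = [text for date, text in user_comments if date < BAN_DATE]
    post = [text for date, text in user_comments if date >= BAN_DATE]
    return hate_rate(pre), hate_rate(post)

# In the study itself, per-user rates like these would be compared between treated
# (banned-subreddit) users and matched control users; here we just use toy data.
toy_user = [("2015-05-01", "some comment text"), ("2015-07-01", "another comment")]
print(pre_post_rates(toy_user))
```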


... in the targeted communities (Chandrasekharan et al. 2017, 2022). On the other hand, there are concerns about their unintended consequences; banned users often migrate to more radical, less regulated platforms, where their extremist views may intensify (Horta Ribeiro et al. 2021b) and spill over into mainstream platforms (Russo et al. 2023a,b). ...
... Notably, users participating in a banned subreddit keep their accounts. Prior research shows that quarantines reduce new user recruitment, though the effect is often modest (Chandrasekharan et al. 2017; Trujillo and Cresci 2022b), and that they do not significantly reduce existing users' toxicity (Chandrasekharan et al. 2022). Bans significantly reduce activity but can push users to other fringe platforms (Horta Ribeiro et al. 2021b), leading to spillover effects back onto mainstream platforms (Russo et al. 2023b; Schmitz, Muric, and Burghardt 2022; Russo Latona et al. 2024). ...
... Implications. The results of this study highlight the potential for platform-based moderation to facilitate positive outcomes beyond merely reducing harmful activity (Chandrasekharan et al. 2017, 2022). Specifically, banning fringe communities appears to have the unintended but beneficial effect of driving a small but meaningful proportion of users to recovery communities, potentially initiating their journey toward deradicalization. ...
Preprint
Full-text available
Online platforms have sanctioned individuals and communities associated with fringe movements linked to hate speech, violence, and terrorism, but can these sanctions contribute to the abandonment of these movements? Here, we investigate this question through the lens of exredpill, a recovery community on Reddit meant to help individuals leave movements within the Manosphere, a conglomerate of fringe Web based movements focused on men's issues. We conduct an observational study on the impact of sanctioning some of Reddit's largest Manosphere communities on the activity levels and user influx of exredpill, the largest associated recovery subreddit. We find that banning a related radical community positively affects participation in exredpill in the period following the ban. Yet, quarantining the community, a softer moderation intervention, yields no such effects. We show that the effect induced by banning a radical community is stronger than for some of the widely discussed real-world events related to the Manosphere and that moderation actions against the Manosphere do not cause a spike in toxicity or malicious activity in exredpill. Overall, our findings suggest that content moderation acts as a deradicalization catalyst.
... Online abuse, harassment, and toxic interactions have already been longstanding issues on centralized platforms, prompting various moderation strategies over time, ranging from algorithmic content filtering to user reporting systems (Jhaver et al. 2021). In centralized social networks, moderation is usually enforced through top-down and platform-driven mechanisms (Chandrasekharan et al. 2017). Decentralized platforms, where content moderation is often less centralized, rely more on user-driven moderation, with blocking actions serving as a tool for managing unwanted interactions (Zignani et al. 2018). ...
... The identification of effective strategies to determine the effect of moderation policies put in place by platforms has attracted the interest of researchers, with a growing body of research investigating the behavioral and ecological effects of moderation interventions over the years (Chandrasekharan et al. 2017; Gorwa et al. 2020; Jhaver et al. 2021; Morrow et al. 2022; Russo et al. 2023). Here we review past research analyzing the implementation and the effects of different types of platform moderation. ...
... For example, Reddit banned around 2,000 subreddits associated with hate speech and the QAnon conspiracy theory (Collins and Zadrozny 2020). Although community bans have demonstrably reduced the presence of harmful content on mainstream platforms (Chandrasekharan et al. 2017), their overall impact remains subject to debate. Deplatformed users are known to migrate to fringe platforms, which typically lack comprehensive and balanced moderation and can possibly exacerbate extremist behaviors (Horta Ribeiro et al. 2021; Zuckerman and Rajendra-Nicolucci 2021). ...
Preprint
Full-text available
Moderation and blocking behavior, both closely related to the mitigation of abuse and misinformation on social platforms, are fundamental mechanisms for maintaining healthy online communities. However, while centralized platforms typically employ top-down moderation, decentralized networks rely on users to self-regulate through mechanisms like blocking actions to safeguard their online experience. Given the novelty of the decentralized paradigm, addressing self-moderation is critical for understanding how community safety and user autonomy can be effectively balanced. This study examines user blocking on Bluesky, a decentralized social networking platform, providing a comprehensive analysis of over three months of user activity through the lens of blocking behaviour. We define profiles based on 86 features that describe user activity, content characteristics, and network interactions, addressing two primary questions: (1) Is the likelihood of a user being blocked inferable from their online behavior? and (2) What behavioral features are associated with an increased likelihood of being blocked? Our findings offer valuable insights and contribute a robust analytical framework to advance research in moderation on decentralized social networks.
... Research on RoastMe has discussed the potential for harmful roasts in contexts such as sexualization (Poppi & Dynel, 2021), mental health (Kasunic & Kaufman, 2018), and disability (Dynel & Poppi, 2019). Regarding the topic of race, Chandrasekharan et al. (2017) conducted a study on the banned subreddit r/CoonTown, a platform previously devoted to promoting violent hate speech against African Americans. The research indicated that users from this banned subreddit migrated to RoastMe. ...
... Despite several investigations into offensive humor on RoastMe (e.g., Dynel & Poppi, 2019;Kasunic & Kaufman, 2018;Poppi & Dynel, 2021), only a handful of studies have touched on racial humor in this context. Chandrasekharan et al. (2017) examined the racial humor in RoastMe while observing the process of subreddit user migration. Kasunic and Kaufman (2018) highlighted the discomfort experienced by a Black Reddit user in response to racial roasts during their interviews, despite their main focus being on the ambiguity faced by community members in discerning between innocuous and offensive roasts. ...
... This study examines conversations from the Reddit community of RoastMe, which is a dedicated space for individuals to submit themselves for humorous humiliation. This popular community has been the subject of several studies that have analyzed and questioned the norms and ethics of the subreddit (Chandrasekharan et al., 2017;Hitkul et al., 2020;Kasunic & Kaufman, 2018;Poppi & Dynel, 2021). This study collected and examined a year of posts and comments from this community to determine the extent to which roasters use racial stereotypes and associations when roasting posters. ...
Article
Racial stereotypes are harmful to those they apply to, leading to the totalization of identity and problematic interactions. Invocations of stereotypes are generally taboo in conversation but are often deemed permissible when couched in humor. This project uses text mining to explore racial stereotyping in a Reddit community focused on denigrative humor, r/RoastMe. Using the social identity model of deindividuation effects (SIDE) and priming theory, this work examines the prevalence and content of stereotypes relating to users’ racial identities. The results demonstrate that jokes directed toward non-White individuals often employ stereotypes of race and nationality, providing evidence for the effects of minimal cues in online interaction. Moreover, racialized current events were found to influence the frequency of relevant racial associations.
... Observational methods applied to social data. Recent studies show that quasi-causal methods can be applied to social media data to e.g., distill the outcomes of a given situation (Olteanu, Varol, and Kıcıman 2016), measure the impact of an intervention (Chandrasekharan et al. 2018), or estimate the effect of online social support (Cunha, Weber, and Pappa 2017). The application of these methods to social data, including propensity score matching (De Choudhury et al. 2016), difference-in-differences (Chandrasekharan et al. 2018), and instrumental variables (Zhang, Li, and Hong 2016), was found to reduce confounding biases. ...
... Recent studies show that quasi-causal methods can be applied to social media data to e.g., distill the outcomes of a given situation (Olteanu, Varol, and Kıcıman 2016), measure the impact of an intervention (Chandrasekharan et al. 2018), or estimate the effect of online social support (Cunha, Weber, and Pappa 2017). The application of these methods to social data, including propensity score matching (De Choudhury et al. 2016), difference-in-differences (Chandrasekharan et al. 2018), and instrumental variables (Zhang, Li, and Hong 2016), was found to reduce confounding biases. Chandrasekharan et al. (2018)'s work is closest to ours, as it employs techniques from the causal inference literature to quantify the impact of an intervention on hateful behavior on Reddit. ...
... The application of these methods to social data, including propensity score matching (De Choudhury et al. 2016), difference-in-differences (Chandrasekharan et al. 2018), and instrumental variables (Zhang, Li, and Hong 2016), was found to reduce confounding biases. Chandrasekharan et al. (2018)'s work is closest to ours, as it employs techniques from the causal inference literature to quantify the impact of an intervention on hateful behavior on Reddit. Yet, the intervention they study is platform-specific (a ban on an existing community on Reddit), whereas we look at the impact of external (non-platform-specific) events on both Reddit and Twitter. ...
Preprint
User-generated content online is shaped by many factors, including endogenous elements such as platform affordances and norms, as well as exogenous elements, in particular significant events. These impact what users say, how they say it, and when they say it. In this paper, we focus on quantifying the impact of violent events on various types of hate speech, from offensive and derogatory to intimidation and explicit calls for violence. We anchor this study in a series of attacks involving Arabs and Muslims as perpetrators or victims, occurring in Western countries, that have been covered extensively by news media. These attacks have fueled intense policy debates around immigration in various fora, including online media, which have been marred by racist prejudice and hateful speech. The focus of our research is to model the effect of the attacks on the volume and type of hateful speech on two social media platforms, Twitter and Reddit. Among other findings, we observe that extremist violence tends to lead to an increase in online hate speech, particularly on messages directly advocating violence. Our research has implications for the way in which hate speech online is monitored and suggests ways in which it could be fought.
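Several of the citation contexts above mention difference-in-differences among the quasi-causal methods applied to social media data. The snippet below is a minimal sketch of that setup using statsmodels; the toy DataFrame and its columns (treated, post, hate_rate) are illustrative assumptions, not data from any of the cited studies.

```python
# Minimal difference-in-differences sketch: the coefficient on treated:post
# estimates the intervention's effect on hate speech usage. Toy data only.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "treated":   [1, 1, 1, 1, 0, 0, 0, 0],   # 1 = user affected by the intervention
    "post":      [0, 1, 0, 1, 0, 1, 0, 1],   # 1 = observation after the intervention
    "hate_rate": [0.020, 0.004, 0.018, 0.003, 0.010, 0.009, 0.011, 0.010],
})

model = smf.ols("hate_rate ~ treated * post", data=df).fit()
print(model.summary())
```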
... Notorious examples are the ban that Donald Trump received in 2021 from Facebook and X (formerly Twitter) [13] and the deplatforming of three particularly toxic influencers from X [6]. Additionally, X removed accounts involved in coordinated inauthentic behavior [14] and Reddit permanently shut down different communities because of racism, sexism and hatefulness [15,16]. In June 2020, Reddit itself hosted one of the biggest deplatforming campaigns in the history of social media, The Great Ban, which resulted in around 2,000 subreddits being banned due to the ongoing spread of toxicity and hate speech. Among these are popular communities such as r/The_Donald and r/ChapoTrapHouse. ...
... Another body of work focused on assessing the effects of deplatforming in a subset of subreddits affected by the ban or in entirely different subreddits. Chandrasekharan et al. [15] and Saleem and Ruths [21] evaluated how Reddit's 2015 ban impacted two specific subreddits: r/fatpeoplehate and r/CoonTown. They found that a large portion of users abandoned the platform, while the remaining ones notably decreased their toxicity levels. ...
... Effectiveness of the moderation. In the current literature, the effectiveness of content moderation has typically been evaluated by examining the changes induced by these interventions in terms of activity and toxicity [15,6]. In this context, our study showed that a significant portion of toxic users left the platform, while those who remained exhibited a modest decrease in toxicity along with a notable decline in activity. ...
Preprint
Full-text available
In today's online environments, users experience harm and abuse on a daily basis. Therefore, content moderation is crucial to ensure their safety and well-being. However, the effectiveness of many moderation interventions is still uncertain. We evaluate the effectiveness of The Great Ban, one of the largest deplatforming interventions carried out by Reddit, which affected almost 2,000 communities. We analyze 53M comments shared by nearly 34K users, providing in-depth results on both the intended and unintended consequences of this ban. We found that 15.6% of the moderated users abandoned the platform while the remaining ones decreased their overall toxicity by 4.1%. Nonetheless, a subset of those users increased their toxicity by 70% after the intervention. In any case, increases in toxicity did not lead to marked increases in activity or engagement, meaning that the most toxic users had overall a limited impact. Our findings bring to light new insights on the effectiveness of deplatforming. Furthermore, they also contribute to informing future content moderation strategies.
... These interactions give rise to a range of behaviors, some of which result in negative outcomes, while others lead to positive ones, both of which can impact users and platforms [10,18,19,25,28,50]. Prior research has concentrated on detecting and examining undesirable behaviors including toxicity [9,11,42], hate speech [11], and personal attacks [54], and has employed empirical approaches to uncover norm violations at scale [12] and determine the prevalence of antisocial behavior [41]. Additional approaches include predicting conversational outcomes based on initial comments [3,59], analyzing the structure of toxic conversations [39,49], and assessing the resilience (i.e., the ability to bounce back) of online conversations following adverse events [36]. ...
... Chandrasekharan et al. [12] conducted a large-scale empirical study on norm violations across 100 subreddits to identify emergent norms, while Park et al. [41] examined the prevalence of antisocial behavior within these communities. Various studies have also looked at toxicity and hate speech [9,11,34], mental well-being [46-48], moderation outcomes [10,27], and forecasting conversational outcomes [3,13,36,38]. Our work aims to enhance the understanding of the kinds of behavior and content that online communities find desirable. ...
Preprint
Full-text available
A major task for moderators of online spaces is norm-setting, essentially creating shared norms for user behavior in their communities. Platform design principles emphasize the importance of highlighting norm-adhering examples and explicitly stating community norms. However, norms and values vary between communities and go beyond content-level attributes, making it challenging for platforms and researchers to provide automated ways to identify desirable behavior to be highlighted. Current automated approaches to detecting desirability are limited to measures of prosocial behavior, but we do not know whether these measures fully capture the spectrum of what communities value. In this paper, we use upvotes, which express community approval, as a proxy for desirability and conduct an analysis of highly-upvoted comments across 85 popular sub-communities on Reddit. Using a large language model, we extract values from these comments and compile 97 macro, meso, and micro values based on their frequency across communities. Furthermore, we find that existing computational models for measuring prosociality were inadequate to capture 86 of the values we extracted. Finally, we show that our approach can not only extract most of the qualitatively-identified values from prior taxonomies, but also uncover new values that are actually encouraged in practice. This work has implications for improving moderator understanding of their community values, motivates the need for nuanced models of desirability beyond prosocial measures, and provides a framework that can supplement qualitative work with larger-scale content analyses.
... Recent scholarship has demonstrated that interdependence among online communities is widespread, important to explaining success and failure, and likely to provide new insights for designing and managing communities (Cunha et al. 2019; Mitts, Pisharody, and Shapiro 2022; Chandrasekharan et al. 2017; Kairam, Wang, and Leskovec 2012; Tan and Lee 2015; Tan 2018; Vincent, Johnson, and Hecht 2018). Most relevant to this study are studies that adopt the theoretical lens of organizational ecology to understand competition and mutualism between overlapping online communities (Wang, Butler, and Ren 2012; Zhu, Kraut, and Kittur 2014). ...
... VAR models can only represent linear dynamics and competitive and mutualistic interactions that do not vary over time (Cenci, Sugihara, and Saavedra 2019); however, online communities inhabit dynamic environments and experience shocks such as an influx of newcomers and attention (Zhang et al. 2019; Kiene, Monroy-Hernández, and Hill 2016; Lin et al. 2017; Ratkiewicz et al. 2010). For example, policy changes and bans can influence related communities (Chandrasekharan et al. 2017; Ribeiro et al. 2021; Matias 2016). Therefore, this study uses nonlinear time-series analysis to investigate how ecological relationships in clusters of overlapping communities vary over time. ...
Preprint
Full-text available
Online communities are important organizational forms where members socialize and share information. Curiously, different online communities often overlap considerably in topic and membership. Recent research has investigated competition and mutualism among overlapping online communities through the lens of organizational ecology; however, it has not accounted for how the nonlinear dynamics of online attention may lead to episodic competition and mutualism. Neither has it explored the origins of competition and mutualism in the processes by which online communities select or adapt to their niches. This paper presents a large-scale study of 8,806 Reddit communities belonging to 1,919 clusters of high user overlap over a 5-year period. The method uses nonlinear time series methods to infer bursty, often short-lived ecological dynamics. Results reveal that mutualism episodes are longer lived and slightly more frequent than competition episodes. Next, it tests whether online communities find their niches by specializing to avoid competition using panel regression models. It finds that competitive ecological interactions lead to decreasing topic and user overlaps; however, changes that decrease such niche overlaps do not lead to mutualism. The discussion proposes that future designs may enable online community ecosystem management by informing online community leaders to organize "spin-off" communities or via feeds and recommendations.
... Since that time, the practice has grown along with the Internet and social media. The first major deplatforming event likely took place in 2015 when Reddit banned two "subreddits," one used for fat-shaming and one dedicated to violent hate speech against African Americans (Chandrasekharan et al., 2017). Since then, the tactic has increasingly been used against certain health-related speech and to curtail the spread of such speech by political actors.
... These efforts attempt to describe the effects on the same social medium where the deplatforming took place. One study (Chandrasekharan et al., 2017) found that after bans of the subreddits described above, more users than expected discontinued using the site, and those who stayed drastically reduced their hate speech usage. While some subscribers to the banned subreddits moved to other subreddits, there was no significant increase in hate speech frequency. ...
Article
Full-text available
Content moderation decisions can have variable impacts on the events and discourses they aim to regulate. This study analyzes Twitter data from before and after the removal of key Arizona Election Audit Twitter accounts in March of 2021. After collecting tweets that refer to the election audit in Arizona in this designated timeframe, a before/after comparison examines the structure of the networks, the volume of the participating population, and the themes of their discourse. Several significant changes are observed, including a drop in participation from accounts that were not deplatformed and a de-centralization of the Twitter network. Conspiracy theories remain in the discourse, but their themes become more diffuse, and their calls to action more abstract. Recruiting calls to join in on promoting and publicizing the audit mostly come to an end. The decision by Twitter to deplatform key election audit accounts appears to have greatly disrupted the hub structure at the center of the emergent network that formed as a response to the election audit. By intervening in the network, moderators successfully defused much of the Twitter-based participation in the Arizona Election Review of 2021. This instance demonstrates the efficacy of network-driven interventions in platform moderation, specifically for events or accounts that use social media to organize or encourage bad-faith attacks on civic institutions.
... The escalation of online hate speech presents a significant threat to individuals and society [23,75]. With the proliferation of social media, people now have access to a vast audience to disseminate harmful content that attacks individuals or groups based on their race [31,73,80], gender [35,49,124], religion [13,20,84], sexual orientation [33,34,46], or disability status [120,121,126]. ...
... Online hate is a pervasive and harmful phenomenon that affects individuals and society [23,59,99]. Researching people's perception of online hate posts is important for understanding the causes [29,60,129], consequences [74,119,123], and potential solutions to this problem [47,75]. ...
Preprint
Full-text available
This study investigates how online counterspeech, defined as direct responses to harmful online content with the intention of dissuading the perpetrator from further engaging in such behavior, is influenced by the match between a target of the hate speech and a counterspeech writer's identity. Using a sample of 458 English-speaking adults who responded to online hate speech posts covering race, gender, religion, sexual orientation, and disability status, our research reveals that the match between a hate post's topic and a counter-speaker's identity (topic-identity match, or TIM) shapes perceptions of hatefulness and experiences with counterspeech writing. Specifically, TIM significantly increases the perceived hatefulness of posts related to race and sexual orientation. TIM generally boosts counter-speakers' satisfaction and perceived effectiveness of their responses, and reduces the difficulty of crafting them, with the exception of gender-focused hate speech. In addition, counterspeech that displayed more empathy was longer, had a more positive tone, and was associated with higher ratings of effectiveness and perceptions of hatefulness. Prior experience with, and openness to, AI writing assistance tools like ChatGPT correlate negatively with perceived difficulty in writing online counterspeech. Overall, this study contributes insights into linguistic and identity-related factors shaping counterspeech on social media. The findings inform the development of supportive technologies and moderation strategies for promoting effective responses to online hate.
... These interventions can create unintended consequences, as individuals may seek open groups without restrictions 39. In the context of social contagions, online groups have been closed to curb the spread of hate speech, for instance banning certain subreddits on Reddit 40,41. While some users discontinue their usage of the platform, others relocate their activity to other groups. ...
Article
Full-text available
People organize in groups and contagions spread across them. A simple stochastic process, yet complex to model due to dynamical correlations within and between groups. Moreover, groups can evolve if agents join or leave in response to contagions. To address the lack of analytical models that account for dynamical correlations and adaptation in groups, we introduce the method of generalized approximate master equations. We first analyze how nonlinear contagions differ when driven by group-level or individual-level dynamics. We then study the characteristic levels of group activity that best describe the stochastic process and that optimize agents’ ability to adapt to it. Naturally lending itself to study adaptive hypergraphs, our method reveals how group structure unlocks new dynamical regimes and enables distinct suitable adaptation strategies. Our approach offers a highly accurate model of binary-state dynamics on hypergraphs, advances our understanding of contagion processes, and opens the study of adaptive group-structured systems.
... When individuals or groups violate platform rules, social media companies may resort to deplatforming as a means of enforcement [5,15,26,31]. Notably, in 2019 several influential figures faced removal from Facebook and Instagram due to involvement in organized hate and violence [12,23]. Subsequently, some of these individuals reported significant impacts on their follower count, reputation, and overall influence [21]. ...
Article
Full-text available
User migration across social media platforms has accelerated in response to changes in ownership, policy, and user preferences. This study examines the migration from X/Twitter to emerging alternate platforms such as Threads, Mastodon, and Truth Social. Using a large dataset from X/Twitter, we analyze the extent of user departures and their destination platforms. Additionally, we investigate whether a user’s follower count on X/Twitter correlates with their follower count on other platforms, assessing the transferability of audience size. Surprisingly, our findings indicate that users with larger followings on X/Twitter are more likely to migrate. Moreover, follower counts on X/Twitter are strongly correlated with those on Threads but not with those on Mastodon or Truth Social.
... However, it is worth noting that TW/CW mechanisms are limited in scope, as they apply to cases in which the poster's intent is not malicious. For example, toxic behaviors, hate speech, or online harassment are typically addressed through community-based or machine learning-based moderation approaches [14,43], as we cannot expect individuals exhibiting such behavior to use TW/CW voluntarily. In these cases, the content creator's harmful intent precludes the use of self-reported warnings. ...
Preprint
The prevalence of distressing content on social media raises concerns about users' mental well-being, prompting the use of trigger warnings (TW) and content warnings (CW). However, inconsistent implementation of TW/CW across platforms and the lack of standardized practices confuse users regarding these warnings. To better understand how users experienced and utilized these warnings, we conducted a semi-structured interview study with 15 general social media users. Our findings reveal challenges across three key stakeholders: viewers, who need to decide whether to engage with warning-labeled content; posters, who struggle with whether and how to apply TW/CW to the content; and platforms, whose design features shape the visibility and usability of warnings. While users generally expressed positive attitudes toward warnings, their understanding of TW/CW usage was limited. Based on these insights, we proposed a conceptual framework of the TW/CW mechanisms from multiple stakeholders' perspectives. Lastly, we further reflected on our findings and discussed the opportunities for social media platforms to enhance users' TW/CW experiences, fostering a more trauma-informed social media environment.
... Regardless, future work can build upon these models by including more factors about surrounding posts that are competing for the same finite number of spots on these feeds. Furthermore, future work could also employ more advanced statistical methods like a stochastic transitivity model (Johnson and Kuhn 2013) or a causal framework used in prior work (Chandrasekharan et al. 2017;Saha, Chandrasekharan, and De Choudhury 2019;Jhaver, Rathi, and Saha 2024). These causal approaches will help examine the causal relationship between the different factors that influence and are influenced by algorithmic ranking. ...
Preprint
Platforms are increasingly relying on algorithms to curate the content within users' social media feeds. However, the growing prominence of proprietary, algorithmically curated feeds has concealed what factors influence the presentation of content on social media feeds and how that presentation affects user behavior. This lack of transparency can be detrimental to users, from reducing users' agency over their content consumption to the propagation of misinformation and toxic content. To uncover details about how these feeds operate and influence user behavior, we conduct an empirical audit of Reddit's algorithmically curated trending feed called r/popular. Using 10K r/popular posts collected by taking snapshots of the feed over 11 months, we find that the total number of comments and recent activity (commenting and voting) helped posts remain on r/popular longer and climb the feed. Using over 1.5M snapshots, we examine how differing ranks on r/popular correlated with engagement. More specifically, we find that posts below rank 80 showed a sharp decline in activity compared to posts above, and that posts at the top of r/popular had a higher proportion of undesired comments than those lower down. Our findings highlight that the order in which content is ranked can influence the levels and types of user engagement within algorithmically curated feeds. This relationship between algorithmic rank and engagement highlights the extent to which algorithms employed by social media platforms essentially determine which content is prioritized and which is not. We conclude by discussing how content creators, consumers, and moderators on social media platforms can benefit from empirical audits aimed at improving transparency in algorithmically curated feeds.
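As a concrete illustration of one step in the snapshot-based audit described above, the sketch below estimates how long each post remained on the feed from periodic snapshots. The schema (post_id, snapshot_time, rank) is an assumption for illustration; the study's actual data format may differ.

```python
# Estimate time-on-feed per post from periodic feed snapshots (toy data).
import pandas as pd

snapshots = pd.DataFrame({
    "post_id": ["a", "a", "a", "b", "b"],
    "snapshot_time": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 01:00", "2023-01-01 02:00",
        "2023-01-01 00:00", "2023-01-01 01:00",
    ]),
    "rank": [5, 3, 2, 90, 95],  # position on the trending feed at each snapshot
})

# Time on feed: span between a post's first and last appearance in the snapshots.
time_on_feed = (
    snapshots.groupby("post_id")["snapshot_time"]
    .agg(lambda s: s.max() - s.min())
    .rename("time_on_feed")
)
print(time_on_feed)
```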
... On the platform Reddit, similar interventions are imposed on whole fora (called "subreddits"). One analysis of the banning of hate communities on Reddit found that overall hate speech on the platform decreased as a result [60]. Reddit also imposes soft-moderation on communities, called "quarantines," which warn users that the content of the community they are about to view contains potentially offensive material. ...
Article
Full-text available
Numerous studies have reported an increase in hate speech on X (formerly Twitter) in the months immediately following Elon Musk’s acquisition of the platform on October 27th, 2022; relatedly, despite Musk’s pledge to “defeat the spam bots,” a recent study reported no substantial change in the concentration of inauthentic accounts. However, it is not known whether any of these trends endured. We address this by examining material posted on X from the beginning of 2022 through June 2023, the period that includes Musk’s full tenure as CEO. We find that the increase in hate speech just before Musk bought X persisted until at least May of 2023, with the weekly rate of hate speech being approximately 50% higher than the months preceding his purchase, although this increase cannot be directly attributed to any policy at X. The increase is seen across multiple dimensions of hate, including racism, homophobia, and transphobia. Moreover, there is a doubling of hate post “likes,” indicating increased engagement with hate posts. In addition to measuring hate speech, we also measure the presence of inauthentic accounts on the platform; these accounts are often used in spam and malicious information campaigns. We find no reduction (and a possible increase) in activity by these users after Musk purchased X, which could point to further negative outcomes, such as the potential for scams, interference in elections, or harm to public health campaigns. Overall, the long-term increase in hate speech, and the prevalence of potentially inauthentic accounts, are concerning, as these factors can undermine safe and democratic online environments, and increase the risk of offline harms.
... To combat hate speech, a large body of prior research has demonstrated the capability to identify hate speech through technical means [21,33], human labor [15], or combinations of the two [36]. In efforts to improve automatic detection, some work highlighted the importance of context learning when facing implicit forms of hate speech [25] or text modification attacks with typos and nonhate words [31]. ...
Article
Full-text available
During times of crisis, heightened anxiety and fear create fertile ground for hate speech and misinformation, as people are more likely to fall for and be influenced by it. This paper looks into the interwoven relationship between anti-Asian hatred and COVID-19 misinformation amid the pandemic. By analyzing 785,798 Asian hate tweets and surveying 308 diverse participants, this empirical study explores how hateful content portrays the Asian community, including its truthfulness and targets, as well as what makes such portrayals harmful. We observed a high prevalence of misinformative hate speech that was lengthier, less emotional, and expressed more motivational drives than general hate speech. Overall, we found that anti-Asian rhetoric was characterized by an antagonism and inferiority framing, with misinformative hate underscoring antagonism and general hate emphasizing calls for action. Among all entities being explicitly criticized, China and the Chinese were constantly named to assign blame, with misinformative hate more likely to finger-point than general hate. Our survey results indicated that hateful messages with misinformation, demographic targeting, or divisive references were perceived as significantly more damaging. Individuals who placed less importance on free speech, had personal encounters with hate speech, or believed in the natural origin of COVID-19 were more likely to perceive higher severity. Taken together, this work highlights the distinct compositions of hate within misinformative hate speech that influences perceived harmfulness and adds to the complexity of defining and moderating harmful content. We discuss the implications for designing more context- and culture-sensitive counter-strategies and building more adaptive and explainable moderation approaches.
... Controversies have beset the platform before: some object to the existence of a subreddit, others object when it is removed. Reddit banned r/Creepshots and its ilk in 2012 [7], r/fatpeoplehate and other hate subs in 2015 (Chandrasekharan et al., 2017), and r/The_Donald in 2020 (Ribeiro et al., 2021). The latest controversy was different. ...
Article
Full-text available
Though there is robust literature on the history of the advice genre, Reddit is an unrecognized but significant medium for the genre. This lack of attention, in part, stems from the lack of a coherent timeline and framework for understanding the emergence of dozens of advice-related subreddits. Noting the challenges of Reddit historiography, I trace the development of the advice genre on the platform, using the metaphors of evolutionary and family trees. I make use of data dumps of early Reddit submissions and interviews with subreddit founders and moderators to plot the development of advice subreddits through the periods of subreddit explosion (2009–2010), the emergence of judgment subreddits (2011–2013; 2019–2021), and the rise of meta subreddits (2020–2023). Additionally, I specify a lexicon for understanding the relationships between subreddits using the metaphor of tree branches. For example, new subreddits might spawn, fork, or split relative to existing subreddits, and their content is cultivated by meta subreddits by way of filtration, compilation, and syndication.
... As tech companies mature, however, additional expectations, and an expectation for steady growth, are placed on them [55]. For example, as Reddit tried to raise a Series C funding round, prospective investors were concerned about the risk to the brand presented by Reddit's early tolerance of hate groups on the site (an artifact of earlier, high-growth VC interests) [17]. In the U.S., the SEC mandates that public tech firms like Alphabet, Meta, Apple, etc., release public, quarterly earnings reports [92]. ...
Preprint
Full-text available
This paper advances a theoretical argument about the role capital plays in structuring CHI research. We introduce the concept of technological capture to theorize the mechanism by which this happens. Using this concept, we decompose the effect on CHI into four broad forms: technological capture creates market-creating, market-expanding, market-aligned, and externality-reducing CHI research. We place different CHI subcommunities into these forms, arguing that many of their values are inherited from the capital underlying the field. Rather than a disciplinary- or conference-oriented conceptualization of the field, this work theorizes CHI as tightly coupled with capital via technological capture. The paper concludes by discussing some implications for CHI.
... A limited number of studies suggest that some interventions might be efficacious on some platforms [29-34]; however, most of these studies do not examine platforms with a sufficiently large user base to draw public scrutiny and the studies themselves show mixed results [31, 32, 35-41]. The most relevant example from prior work to our inquiry provides some evidence that Twitter reduced misinformation on the platform by suddenly removing 70,000 accounts, including the account of the sitting President of the United States, following political violence at the US Capitol on January 6, 2021 [39]. ...
Preprint
Full-text available
Users dissatisfied with exposure to objectionable online content have begun to migrate en masse to new social media platforms. These new platforms share architectural features with legacy platforms, but offer content moderation services that legacy platforms no longer employ at scale. Such migrations assume that moderation interventions, such as deplatforming and content removal, are efficacious; however, this claim has not been evaluated based on evidence. We therefore evaluated the efficacy of Twitter’s attempts to curtail vaccine misinformation during the COVID-19 pandemic. We found that vaccine skeptical accounts generated a larger share of tweets about vaccines, increased in virality, and became more misinformative after Twitter began removing content and accounts. We also found evidence that Twitter’s mass deplatforming of 70,000 accounts on January 8, 2021 preceded an increase in misinformation, calling into question the efficacy of these removals. Novel platforms that share Twitter’s architecture may therefore face similar moderation challenges.
... On the one hand, concerns about hateful content and the increased demand for content moderation have motivated extensive research on automated hate speech detection (e.g., Waseem and Hovy, 2016;Hanu and Unitary team, 2020;Hartvigsen et al., 2022;Bianchi et al., 2022) and the effectiveness of content moderation efforts (e.g., Chandrasekharan et al., 2017;Jhaver et al., 2021;Jiménez Durán, 2022;Beknazar-Yuzbashev et al., 2022;Jiménez Durán et al., 2022;Müller and Schwarz, 2022a). One upshot of this research is that algorithms are susceptible to false positives, often triggered by swear words or otherwise innocent words that happen to often be found in the context of hate speech (e.g., Attanasio et al., 2022). ...
Preprint
Full-text available
There is an ongoing debate about how to moderate toxic speech on social media and how content moderation affects online discourse. We propose and validate a methodology for measuring the content-moderation-induced distortions in online discourse using text embeddings from computational linguistics. We test our measure on a representative dataset of 5 million US political Tweets and find that removing toxic Tweets distorts online content. This finding is consistent across different embedding models, toxicity metrics, and samples. Importantly, we demonstrate that content-moderation-induced distortions are not caused by the toxic language. Instead, we show that, as a side effect, content moderation shifts the mean and variance of the embedding space, distorting the topic composition of online content. Finally, we propose an alternative approach to content moderation that uses generative Large Language Models to rephrase toxic Tweets to preserve their salvageable content rather than removing them entirely. We demonstrate that this rephrasing strategy reduces toxicity while minimizing distortions in online content.
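To make the embedding-space comparison described in this abstract more concrete, here is a simplified sketch of measuring how dropping high-toxicity items shifts the mean and variance of an embedded corpus. The inputs (a NumPy embedding matrix and externally computed toxicity scores) and the 0.8 threshold are assumptions, not the paper's exact procedure.

```python
# Measure how removing high-toxicity items shifts corpus-level embedding statistics.
import numpy as np

def distribution_shift(embeddings, toxicity, threshold=0.8):
    """Compare mean and variance of embeddings before and after dropping toxic items."""
    kept = embeddings[toxicity < threshold]
    mean_shift = np.linalg.norm(embeddings.mean(axis=0) - kept.mean(axis=0))
    var_shift = np.abs(embeddings.var(axis=0) - kept.var(axis=0)).mean()
    return mean_shift, var_shift

# Toy example: 100 posts with 16-dimensional embeddings and random toxicity scores.
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
tox = rng.uniform(size=100)
print(distribution_shift(emb, tox))
```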
... 2) Hate Terms lists (HTs-lists): We used the following six sets of HTs-lists, which cover the following hate contexts. [12] contains a Reddit word list from two subreddits, r/f*tpeoplehate and r/C**nTown, which forms a Reddit hate lexicon. ...
Preprint
Full-text available
Hate speech classification has become an important problem due to the spread of hate speech on social media platforms. For a given set of Hate Terms lists (HTs-lists) and Hate Speech data (HS-data), it is challenging to understand which hate term contributes the most to hate speech classification. This paper contributes two approaches to quantitatively measure and qualitatively visualise the relationship between co-occurring Hate Terms (HTs). Firstly, we propose an approach for the classification of hate speech by producing a Severe Hate Terms list (Severe HTs-list) from existing HTs-lists. To achieve our goal, we propose three metrics (Hatefulness, Relativeness, and Offensiveness) to measure the severity of HTs. These metrics help create an Inter-agreement HTs-list, which explains the contribution of an individual hate term toward hate speech classification. Then, we use the Offensiveness metric values of HTs above a proposed threshold, minimum Offense (minOffense), to generate a new Severe HTs-list. To evaluate our approach, we used three hate speech datasets and six hate terms lists. Our approach showed an improvement from 0.845 to 0.923 (best) compared to the baseline. Secondly, we also propose Stable Hate Rule (SHR) mining to provide ordered co-occurrence of various HTs with minimum Stability (minStab). The SHR mining detects frequently co-occurring HTs to form Stable Hate Rules and Concepts. These rules and concepts are used to visualise the graphs of Transitivities and Lattices formed by HTs.
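As a rough illustration of the thresholding step this abstract describes (keeping only terms whose Offensiveness exceeds minOffense), consider the sketch below. The term names and scores are placeholders; the paper derives its scores from the proposed Hatefulness, Relativeness, and Offensiveness metrics.

```python
# Keep only hate terms whose offensiveness score meets a minimum threshold (minOffense).
def severe_terms(offensiveness_scores, min_offense=0.5):
    """offensiveness_scores: dict mapping hate term -> offensiveness in [0, 1]."""
    return {t: s for t, s in offensiveness_scores.items() if s >= min_offense}

scores = {"term_a": 0.92, "term_b": 0.41, "term_c": 0.67}  # placeholder terms and scores
print(sorted(severe_terms(scores), key=scores.get, reverse=True))
```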
... The manual identification of malicious content can be achieved by expert review [71] or community reporting [49]. ...
Preprint
Full-text available
In the modern world, our cities and societies face several technological and societal challenges, such as rapid urbanization, global warming & climate change, the digital divide, and social inequalities, increasing the need for more sustainable cities and societies. Addressing these challenges requires a multifaceted approach involving all the stakeholders, sustainable planning, efficient resource management, innovative solutions, and modern technologies. Like other modern technologies, social media informatics also plays its part in developing more sustainable and resilient cities and societies. Despite its limitations, social media informatics has proven very effective in various sustainable cities and society applications. In this paper, we review and analyze the role of social media informatics in sustainable cities and society by providing a detailed overview of its applications, associated challenges, and potential solutions. This work is expected to provide a baseline for future research in the domain.
... Other researchers produced similar results, finding deplatforming to be an effective tool of platform governance for mainstream sites (Rauchfleisch & Kaiser, 2021). Reddit's decision to remove fat-shaming and racist subreddits in 2015, for example, was determined a success as offensive and hate-fuelled content apparently decreased (Chandrasekharan et al., 2017; Saleem & Ruths, 2018), while others have 'found that deplatforming significantly reduce[s] the popularity of many anti-social ideas associated with influencers' such as Yiannopoulos and Jones (Jhaver et al., 2021). Deplatforming has additionally been found to mitigate the spread of disinformation somewhat, though the sharing of content often continues via new accounts or new sharing streams and at times is more a disruption in the content stream than a deterrent or undoing (Bruns et al., 2021). ...
Preprint
The rapid rise of video content on platforms such as TikTok and YouTube has transformed information dissemination, but it has also facilitated the spread of harmful content, particularly hate videos. Despite significant efforts to combat hate speech, detecting these videos remains challenging due to their often implicit nature. Current detection methods primarily rely on unimodal approaches, which inadequately capture the complementary features across different modalities. While multimodal techniques offer a broader perspective, many fail to effectively integrate temporal dynamics and modality-wise interactions essential for identifying nuanced hate content. In this paper, we present CMFusion, an enhanced multimodal hate video detection model utilizing a novel Channel-wise and Modality-wise Fusion Mechanism. CMFusion first extracts features from text, audio, and video modalities using pre-trained models and then incorporates a temporal cross-attention mechanism to capture dependencies between video and audio streams. The learned features are then processed by channel-wise and modality-wise fusion modules to obtain informative representations of videos. Our extensive experiments on a real-world dataset demonstrate that CMFusion significantly outperforms five widely used baselines in terms of accuracy, precision, recall, and F1 score. Comprehensive ablation studies and parameter analyses further validate our design choices, highlighting the model's effectiveness in detecting hate videos. The source codes will be made publicly available at https://github.com/EvelynZ10/cmfusion.
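To ground the idea of modality-wise fusion mentioned in this abstract, here is a generic PyTorch sketch that weights and combines pooled text, audio, and video features for binary classification. It is not the CMFusion architecture: the feature dimensions, the gating scheme, and the omission of temporal cross-attention are simplifications and assumptions.

```python
# Generic modality-weighted fusion of pooled text/audio/video features (not CMFusion).
import torch
import torch.nn as nn

class ModalityWeightedFusion(nn.Module):
    def __init__(self, dims, hidden=256, num_classes=2):
        super().__init__()
        self.proj = nn.ModuleDict({m: nn.Linear(d, hidden) for m, d in dims.items()})
        self.gate = nn.Linear(hidden * len(dims), len(dims))  # one weight per modality
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, feats):
        # feats: dict mapping modality name -> (batch, dim) pooled features
        projected = [torch.relu(self.proj[m](feats[m])) for m in self.proj]
        stacked = torch.stack(projected, dim=1)                        # (batch, n_mod, hidden)
        weights = torch.softmax(self.gate(torch.cat(projected, dim=-1)), dim=-1)
        fused = (stacked * weights.unsqueeze(-1)).sum(dim=1)           # weighted sum over modalities
        return self.classifier(fused)

model = ModalityWeightedFusion({"text": 768, "audio": 128, "video": 512})
batch = {"text": torch.randn(4, 768), "audio": torch.randn(4, 128), "video": torch.randn(4, 512)}
print(model(batch).shape)  # torch.Size([4, 2])
```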
Article
Social media is an integral part of the journalism ecosystem for both reporting and distributing news content. One space in the social media landscape that is receiving significant new growth in the journalism sphere while having differences in its social media logic is Reddit. An organization that has dedicated resources to the usage of the platform, the Washington Post, is the focus of the current exploratory study to better understand how news brands are utilizing Reddit. This quantitative content analysis uses social media logic to examine the type of content that the Post shared and also considers the users' responses to that content. Findings show that, despite a supposed emphasis on community building for news brands, the Post more heavily focused on distributing its own content. Additional findings include the importance of posting harder news on the platform instead of softer news for both Reddit score and comments, and more.
Article
From politicians to podcast hosts, online platforms have systematically banned ("deplatformed") influential users for breaking platform guidelines. Previous inquiries on the effectiveness of this intervention are inconclusive because 1) they consider only a few deplatforming events; 2) they consider only overt engagement traces (e.g., likes and posts) but not passive engagement (e.g., views); 3) they do not consider all the potential places influencers impacted by the deplatforming event might migrate to. We address these limitations in a longitudinal, quasi-experimental study of 165 deplatforming events targeting 101 influencers. We identify deplatforming events through Reddit posts and then manually curate the data, ensuring the correctness of a large dataset of deplatforming events. Then, we link these events to Google Trends and Wikipedia page views, platform-agnostic measures of online attention that capture the general public's interest in specific influencers. Through a difference-in-differences approach, we find that deplatforming reduces online attention toward influencers. After 12 months, we estimate that online attention toward deplatformed influencers is reduced by -63% (95% CI [-75%,-46%]) on Google and by -43% (95% CI [-57%,-24%]) on Wikipedia. Further, as we study over a hundred deplatforming events, we can analyze in which cases deplatforming is more or less impactful, revealing nuances about the intervention. Notably, we find that both permanent and temporary deplatforming reduces online attention toward influencers and that deplatforming influencers from multiple platforms further reduces the online attention they receive. Overall, this work contributes to the ongoing effort to map the effectiveness of content moderation interventions, driving platform governance away from speculation.
Article
Historically, transgender people of color (TPOC) have been silenced in white trans spaces for not fitting into transnormativity - the typical white, binary, skinny, and privileged image of trans people, and for raising concerns related to race, culture, and ethnicity. Social media and online communities serve as supportive spaces for transgender (shortened to trans) individuals; however, trans people of color require even more support combating systematic oppression, managing increased levels of discrimination, and navigating their cultural backgrounds. In order to understand how TPOC use social media, we explore the experiences of TPOC on Reddit. We used the Reddit API to obtain Reddit posts from four prominent transgender subreddits (r/ftm, r/mtf, r/trans, and r/Non-Binary) which included the phrase "people of color" or the abbreviation "POC", resulting in a total of 145 posts and 2867 comments. Thematic analysis was then used to identify three themes of discussion - alienation, support, and existing in physical spaces, which informed our design considerations. Experiences shared in the Reddit posts indicated that TPOC feel overshadowed by white trans individuals in online communities and desire to build connections with other TPOC both online and in person. We propose design recommendations for both Reddit as a platform and subreddit moderators that regulate online trans communities to encourage growing networks among TPOC, improve communication among users and moderators, and design spaces that center POC voices within subreddits, all of which provide a much more supportive online environment for TPOC.
Article
Cyberhate presents a multifaceted, context-sensitive challenge that existing detection methods often struggle to tackle effectively. Large language models (LLMs) exhibit considerable potential for improving cyberhate detection due to their advanced contextual understanding. However, detection alone is insufficient; it is crucial for software to also promote healthier user behaviors and empower individuals to actively confront the spread of cyberhate. This study investigates whether integrating large language models (LLMs) with persuasive technology (PT) can effectively detect cyberhate and encourage prosocial user behavior in digital spaces. Through an empirical study, we examine users’ perceptions of a self-monitoring persuasive strategy designed to reduce cyberhate. Specifically, the study introduces the Comment Analysis Feature to limit cyberhate spread, utilizing a prompt-based fine-tuning approach combined with LLMs. By framing users’ comments within the relevant context of cyberhate, the feature classifies input as either cyberhate or non-cyberhate and generates context-aware alternative statements when necessary to encourage more positive communication. A case study evaluated its real-world performance, examining user comments, detection accuracy, and the impact of alternative statements on user engagement and perception. The findings indicate that while most of the users (83%) found the suggestions clear and helpful, some resisted them, either because they felt the changes were irrelevant or misaligned with their intended expression (15%) or because they perceived them as a form of censorship (36%). However, a substantial number of users (40%) believed the interventions enhanced their language and overall commenting tone, with 68% suggesting they could have a positive long-term impact on reducing cyberhate. These insights highlight the potential of combining LLMs and PT to promote healthier online discourse while underscoring the need to address user concerns regarding relevance, intent, and freedom of expression.
Article
Purpose This study aims to introduce a novel methodology for visually analyzing psychological tension in social networks, particularly in the context of disturbances related to coronavirus vaccination. It also aims to enhance the interpretation of online discourse dynamics by integrating mathematical, linguistic and visual-analytical methods. Design/methodology/approach The study uses a comprehensive approach, including tweet array generation via the Vicinitas API, key term extraction, sentiment analysis and visualization tools such as Word-Cloud, VosViewer, Gephisto and Gephi. This methodology is tested on protests against coronavirus vaccination to evaluate its effectiveness in capturing the intricacies of digital discourse. Findings The study identifies key themes, including psychological stress linked to vaccination, protest movements and sentiments regarding trust and belief. Hierarchical representations and visualizations reveal the nuances within digital discourse, demonstrating the methodology’s capacity to discern patterns in social media interactions. Practical implications Practically, this methodology offers a robust tool for monitoring and interpreting online discourse, particularly in scenarios demanding immediate response, such as public health crises. Public health officials can use this method to detect early signs of misinformation or psychological distress, enabling timely interventions. Policymakers and analysts can leverage these insights to design communication strategies that build trust and mitigate public anxiety, aligning with societal needs for transparency and accountability. Originality/value This research introduces a novel integration of computational tools and expert insights, specifically designed for real-time analysis of psychological tension in online discourse. Unlike previous methods that focus on text analysis or sentiment evaluation independently, this approach uniquely combines sentiment dynamics with predictive modeling, offering a comprehensive lens for understanding digital interactions during sensitive public health crises.
Preprint
Full-text available
This paper investigates the behavior of Reddit users who relied on alternative mobile apps, such as Apollo and RiF, before and after their forced shutdown by Reddit on July 1, 2023. The announcement of the shutdown led many observers to predict significant negative consequences, such as mass migration away from the platform. Using data from January to November 2023, we analyze user engagement and migration rates for users of these alternative clients before and after the forced discontinuation of their apps. We find that 22% of alternative client users permanently left Reddit as a result, and 45% of the users who openly threatened to leave if the changes were enacted followed through with their threats. Overall, we find that the shutdown of third-party apps had no discernible impact on overall platform activity. While the preceding protests were severe, ultimately for most users the cost of switching to the official client was likely far less than the effort required to switch to an entirely different platform. Scientific attention to the contributing factors and effects of migration between online platforms has increased, but real-world examples with available data remain rare. Our study addresses this by examining a large-scale online migratory movement.
Article
This study examines how journalists are grappling with platform migration following Elon Musk’s acquisition of Twitter in October 2022. Using a mixed-method approach that combines computational analysis of the activities of 861 journalists on Twitter and Mastodon with qualitative interviews of 11 active journalists, this study aims to (1) examine the extent to which journalists have exhibited different forms of Twitter disengagement post-acquisition; (2) identify the motivating and discouraging factors influencing their move, guided by the push-pull-mooring model; and (3) explore how journalists managed their online presence across platforms. The results indicated minimal Twitter non-use following Musk’s takeover, and full migration was not observed within a 6-month post-acquisition period. Factors such as the flood of fake news and the loss of the blue-tick verification served as push factors, while the appeal of Mastodon’s enhanced user control and stronger community values acted as pull factors. However, the practical reliance on Twitter’s functionalities, audience base, and professional obligations made total abandonment challenging.
Article
Social media platforms like Facebook and Reddit host thousands of user-governed online communities. These platforms sanction communities that frequently violate platform policies; however, public perceptions of such sanctions remain unclear. In a pre-registered survey conducted in the US, I explore bystander perceptions of content moderation for communities that frequently feature hate speech, violent content, and sexually explicit content. Two community-wide moderation interventions are tested: (1) community bans, where all community posts are removed, and (2) community warning labels, where an interstitial warning label precedes access. I examine how third-person effects and support for free speech influence user approval of these interventions on any platform. My regression analyses show that presumed effects on others are a significant predictor of backing for both interventions, while free speech beliefs significantly influence participants’ inclination for using warning labels. Analyzing the open-ended responses, I find that community-wide bans are often perceived as too coarse, and users instead value sanctions in proportion to the severity and type of infractions. I report on concerns that norm-violating communities could reinforce inappropriate behaviors and show how users’ choice of sanctions is influenced by their perceived effectiveness. I discuss the implications of these results for HCI research on online harms and content moderation.
Article
Individuals rely on messaging platforms to form and maintain intimate relationships, trusting shared information will remain within intended digital confines. However, the screenshot feature allows people to capture and store pieces of private conversations as a separate file on their device, rendering them shareable with third parties. While usage of this feature can be benign, this study focuses on its ability to breach privacy expectations within messaging platforms, termed within communication privacy management theory as privacy turbulence. This study recognizes the power of both interpersonal dynamics and platform affordances in constraining existing norms around screenshot collection and sharing others’ private messages. Experimental results (n = 302) suggest obscuring received messages upon use of the screenshot feature and stating an explicit privacy rule significantly reduce screenshot collection and sharing, respectively. Implications for communication theory and the future of messaging platform design will be discussed.
Article
Online communities are important spaces for members of marginalized groups to organize and support one another. To better understand the experiences of fat people - a group whose marginalization often goes unrecognized - in online communities, we conducted 12 semi-structured interviews with fat people. Our participants leveraged online communities to engage in consciousness raising around fat identity, learning to locate "the problem of being fat" not within themselves or their own bodies but rather in the oppressive design of the society around them. Participants were then able to use these communities to mitigate everyday experiences of anti-fatness, such as navigating hostile healthcare systems. However, to access these benefits, our participants had to navigate myriad sociotechnical harms, ranging from harassment to discriminatory algorithms. In light of these findings, we suggest that researchers and designers of online communities support selective fat visibility, consider fat people in the design of content moderation systems, and investigate algorithmic discrimination toward fat people. More broadly, we call on researchers and designers to contend with the social and material realities of fat experience, as opposed to the prevailing paradigm of treating fat people as problems to be solved in-and-of-themselves. This requires recognizing fat people as a marginalized social group and actively confronting anti-fatness as it is embedded in the design of technology.
Article
AI chatbots are increasingly integrated into various sectors, including healthcare. We examine their role in responding to queries related to Alzheimer’s Disease and Related Dementias (AD/ADRD). We obtained real-world queries from AD/ADRD online communities (OC)—Reddit (r/Alzheimers) and ALZConnected. First, we conducted a small-scale qualitative examination where we prompted ChatGPT, Bard, and Llama-2 with 101 OC posts to generate responses and compared them with OC responses through inductive coding and thematic analysis. We found that although AI can provide emotional and informational support like OCs, they do not engage in deeper conversations, provide references, or share personal experiences. These insights motivated us to conduct a large-scale quantitative examination comparing AI (GPT) and OC responses (90K) to 13.5K posts, in terms of psycholinguistics, lexico-semantics, and content. AI responses tend to be more verbose, readable, and complex. AI responses exhibited greater empathy, but more formal and analytical language, lacking personal narratives and linguistic diversity. We found that various LLMs, including GPT, Llama, and Mistral, exhibit consistent patterns in responding to AD/ADRD-related queries, underscoring the robustness of our insights across LLMs. Our study sheds light on the potential of AI in digital health and underscores design considerations of AI to complement human interactions.
Article
Full-text available
The exponential growth of social media platforms has necessitated the development of robust and scalable content moderation systems to ensure safe and respectful digital environments. This paper explores the design and implementation of automated content moderation systems that leverage advanced technologies such as natural language processing (NLP), computer vision, and machine learning. These systems aim to detect, analyze, and mitigate harmful content, including hate speech, misinformation, and explicit material, in real time. Key challenges addressed include achieving accuracy in diverse linguistic and cultural contexts, minimizing false positives and negatives, and ensuring compliance with legal and ethical guidelines. A hybrid moderation framework combining automated algorithms and human oversight is proposed to balance efficiency with contextual understanding. Advanced NLP models, such as transformer-based architectures, are utilized for text analysis, while convolutional neural networks (CNNs) are employed for image and video content assessment. Real-world applications, such as flagging inappropriate posts, restricting access to harmful media, and managing user reports, are examined to highlight the system's efficacy. Additionally, the paper emphasizes the importance of transparency, user privacy, and bias mitigation in designing these systems. Case studies from leading social media platforms illustrate the impact of automated moderation on reducing harmful content and fostering healthier online communities. By advancing the capabilities of automated systems, this research underscores their pivotal role in the evolving digital landscape, offering scalable solutions to the challenges of modern content moderation.
Preprint
Full-text available
Effective content moderation in online communities is often a delicate balance between maintaining content quality and fostering user participation. In this paper, we introduce post guidance, a novel approach to community moderation that proactively guides users' contributions using rules that trigger interventions as users draft a post to be submitted. For instance, rules can surface messages to users, prevent post submissions, or flag posted content for review. This uniquely community-specific, proactive, and user-centric approach can increase adherence to rules without imposing additional burdens on moderators. We evaluate a version of Post Guidance implemented on Reddit, which enables the creation of rules based on both post content and account characteristics, via a large randomized experiment, capturing activity from 97,616 posters in 33 subreddits over 63 days. We find that Post Guidance (1) increased the number of "successful posts" (posts not removed after 72 hours), (2) decreased moderators' workload in terms of manually-reviewed reports, (3) increased contribution quality, as measured by community engagement, and (4) had no impact on posters' own subsequent activity, within communities adopting the feature. Post Guidance on Reddit was similarly effective for community veterans and newcomers, with greater benefits in communities that used the feature more extensively. Our findings indicate that post guidance represents a transformative approach to content moderation, embodying a paradigm that can be easily adapted to other platforms to improve online communities across the Web.
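As a rough illustration of what such proactive rules could look like (field names, thresholds, and interventions are invented for this sketch, not taken from Reddit's feature), each rule pairs a predicate over the draft with an intervention type:

    import re
    from dataclasses import dataclass

    @dataclass
    class Draft:
        title: str
        body: str
        account_age_days: int

    # Interventions: "message" surfaces guidance while drafting,
    # "block" prevents submission, "flag" queues the post for moderator review.
    RULES = [
        (lambda d: len(d.title) < 15,
         ("message", "Short titles are usually removed; please be more descriptive.")),
        (lambda d: d.account_age_days < 7 and re.search(r"https?://", d.body) is not None,
         ("block", "Accounts younger than a week may not post links here.")),
        (lambda d: re.search(r"\b(promo|discount|referral)\b", d.body, re.I) is not None,
         ("flag", "Possible spam: hold for review.")),
    ]

    def evaluate(draft: Draft) -> list:
        # Return every intervention whose predicate matches the draft.
        return [action for predicate, action in RULES if predicate(draft)]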
Article
The proliferation of social media platforms has afforded social scientists unprecedented access to vast troves of data on human interactions, facilitating the study of online behavior at an unparalleled scale. These platforms typically structure conversations as threads, forming tree-like structures known as "discussion trees." This paper examines the structural properties of online discussions on Reddit by analyzing both global (community-level) and local (post-level) attributes of these discussion trees. We conduct a comprehensive statistical analysis of a year's worth of Reddit data, encompassing a quarter of a million posts and several million comments. Our primary objective is to disentangle the relative impacts of global and local properties and evaluate how specific features shape discussion tree structures. The results reveal that both local and global features contribute significantly to explaining structural variation in discussion trees. However, local features, such as post content and sentiment, collectively have a greater impact, accounting for a larger proportion of variation in the width, depth, and size of discussion trees. Our analysis also uncovers considerable heterogeneity in the impact of various features on discussion structures. Notably, certain global features play crucial roles in determining specific discussion tree properties. These features include the subreddit's topic, age, popularity, and content redundancy. For instance, posts in subreddits focused on politics, sports, and current events tend to generate deeper and wider discussion trees. This research enhances our understanding of online conversation dynamics and offers valuable insights for both content creators and platform designers. By elucidating the factors that shape online discussions, our work contributes to ongoing efforts to improve the quality and effectiveness of digital discourse.
Article
Trust is crucial for the functioning of complex societies, and an important concern for CSCW. Our purpose is to use research from philosophy, social science, and CSCW to provide a novel account of trust in the 'post-truth' era. Testimony, from one speaker to another, underlies many social systems. Epistemic trust, or testimonial credibility, is the likelihood to accept a speaker's claim due to beliefs about their competence or sincerity. Epistemic trust is closely related to several 'pathological epistemic phenomena': democratic (il)legitimacy, the spread of misinformation, and echo chambers. To the best of our knowledge, this theoretical contribution is novel in the field of social computing. We further argue that epistemic trust is no philosophical novelty: it is measurable. Weakly supervised text classification approaches achieve F_1 scores of around 80 to 85 per cent on detecting epistemic distrust. This is also, to the best of our knowledge, a novel task in natural language processing. We measure expressions of epistemic distrust across 954 political communities on Reddit. We find that expressions of epistemic distrust are relatively rare, although there are substantial differences between communities. Conspiratorial communities and those focused on controversial political topics tend to express more distrust. Communities with strong epistemic norms enforced by moderation are likely to express low levels. While we find users to be an important potential source of contagion of epistemic distrust, community norms appear to dominate. It is likely that epistemic trust is more useful as an aggregated risk factor. Finally, we argue that policymakers should be aware of epistemic trust considering their reliance on legitimacy underwritten by testimony.
Article
Despite frequent efforts to combat racism, almost no research has explored how to cultivate positive experiences of thriving Black culture on Reddit. In this case study, we surveyed users of r/BlackPeopleTwitter (BPT)--a large, popular subreddit that showcases screenshots of hilarious or insightful social media posts made by Black people (mainly from Black Twitter). Our research questions seek to understand users' motivations for visiting BPT, how they experience a sense of virtual community (SOVC) and membership in BPT, and how BPT's governance influences these experiences. We find that users come to BPT primarily for excellent humor and entertainment, sociopolitical context on issues relevant to Black people, and/or partaking in the shared Black experience. Black users are more likely to report higher SOVC and to identify as members, whereas non-Black users are more likely to identify as guests or visitors to the community. To protect Black expression, the BPT moderation team implemented a governance strategy for verifying racial identity and limiting participation to only verified users in certain threads. Our data suggest that this policy is a contentious but influential aspect of SOVC that simultaneously constructs and challenges the sense of the subreddit existing as a safe space for Black people. We synthesize these results by discussing how: differing platform affordances across Twitter and Reddit combine to cultivate a thriving Black community on Reddit; the need for Black authenticity on an otherwise anonymous platform can guide future research in identity verification; and the limitations of this study motivate future work to support all marginalized communities online.
Article
The role of a moderator is often characterized as solely punitive; however, moderators have the power not only to execute reactive and punitive actions but also to create norms and support the values they want to see within their communities. One way moderators can proactively foster healthy communities is through positive reinforcement, but we do not currently know whether moderators on Reddit enforce their norms by providing positive feedback to desired contributions. To fill this gap in our knowledge, we surveyed 115 Reddit moderators to build two taxonomies: one for the content and behavior that actual moderators want to encourage and another taxonomy of actions moderators take to encourage desirable contributions. We found that prosocial behavior, engaging with other users, and staying within the topic and norms of the subreddit are the most frequent behaviors that moderators want to encourage. We also found that moderators are taking actions to encourage desirable contributions, specifically through built-in Reddit mechanisms (e.g., upvoting), replying to the contribution, and explicitly approving the contribution in the moderation queue. Furthermore, moderators reported taking these actions specifically to reinforce desirable behavior to the original poster and other community members, even though many of the actions are anonymous, so the recipients are unaware that they are receiving feedback from moderators. Importantly, some moderators who do not currently provide feedback do not object to the practice. Instead, they are discouraged by the lack of explicit tools for positive reinforcement and the fact that their fellow moderators are not currently engaging in methods for encouragement. We consider the taxonomy of actions moderators take, the reasons moderators are deterred from providing encouragement, and suggestions from the moderators themselves to discuss implications for designing tools to provide positive feedback.
Article
Full-text available
Interrupted time series (ITS) analysis is a valuable study design for evaluating the effectiveness of population-level health interventions that have been implemented at a clearly defined point in time. It is increasingly being used to evaluate the effectiveness of interventions ranging from clinical therapy to national public health legislation. Whereas the design shares many properties of regression-based approaches in other epidemiological studies, there are a range of unique features of time series data that require additional methodological considerations. In this tutorial we use a worked example to demonstrate a robust approach to ITS analysis using segmented regression. We begin by describing the design and considering when ITS is an appropriate design choice. We then discuss the essential, yet often omitted, step of proposing the impact model a priori. Subsequently, we demonstrate the approach to statistical analysis including the main segmented regression model. Finally we describe the main methodological issues associated with ITS analysis: over-dispersion of time series data, autocorrelation, adjusting for seasonal trends and controlling for time-varying confounders, and we also outline some of the more complex design adaptations that can be used to strengthen the basic ITS design.
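For orientation, the basic single-interruption segmented-regression model from this tutorial can be written (our notation) as

    Y_t = b0 + b1*t + b2*X_t + b3*(t - T0)*X_t + e_t

where X_t indicates post-intervention time points, T0 is the intervention time, b1 is the pre-existing trend, b2 the immediate level change, b3 the change in slope, and e_t an error term that may require adjustment for autocorrelation, over-dispersion, and seasonality as discussed above.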
Conference Paper
Full-text available
Social media sites like Facebook and Instagram remove content that is against community guidelines or is perceived to be deviant behavior. Users also delete their own content that they feel is not appropriate within personal or community norms. In this paper, we examine characteristics of over 30,000 pro-eating disorder (pro-ED) posts that were at one point public on Instagram but have since been removed. Our work shows that straightforward signals can be found in deleted content that distinguish them from other posts, and that the implications of such classification are immense. We build a classifier that compares public pro-ED posts with this removed content that achieves moderate accuracy of 69%. We also analyze the characteristics in content in each of these post categories and find that removed content reflects more dangerous actions, self-harm tendencies, and vulnerability than posts that remain public. Our work provides early insights into content removal in a sensitive community and addresses the future research implications of the findings.
Article
Full-text available
Social network research has begun to take advantage of fine-grained communications regarding coordination, decision-making, and knowledge sharing. These studies, however, have not generally analyzed how external events are associated with a social network's structure and communicative properties. Here, we study how external events are associated with a network's change in structure and communications. Analyzing a complete dataset of millions of instant messages among the decision-makers in a large hedge fund and their network of outside contacts, we investigate the link between price shocks, network structure, and change in the affect and cognition of decision-makers embedded in the network. When price shocks occur the communication network tends not to display structural changes associated with adaptiveness. Rather, the network "turtles up". It displays a propensity for higher clustering, strong tie interaction, and an intensification of insider vs. outsider communication. Further, we find changes in network structure predict shifts in cognitive and affective processes, execution of new transactions, and local optimality of transactions better than prices, revealing the important predictive relationship between network structure and collective behavior within a social network.
Conference Paper
Full-text available
It is generally accepted as common wisdom that receiving social feedback is helpful to (i) keep an individual engaged with a community and to (ii) facilitate an individual's positive behavior change. However, quantitative data on the effect of social feedback on continued engagement in an online health community is scarce. In this work we apply Mahalanobis Distance Matching (MDM) to demonstrate the importance of receiving feedback in the "loseit" weight loss community on Reddit. Concretely we show that (i) even when correcting for differences in word choice, users receiving more positive feedback on their initial post are more likely to return in the future, and that (ii) there are diminishing returns and social feedback on later posts is less important than for the first post. We also give a description of the type of initial posts that are more likely to attract this valuable social feedback. Though we cannot yet argue about ultimate weight loss success or failure, we believe that understanding the social dynamics underlying online health communities is an important step to devise more effective interventions.
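A minimal numpy sketch of greedy one-to-one Mahalanobis Distance Matching (illustrative only; the study's exact matching procedure and covariates are not reproduced here):

    import numpy as np

    def mahalanobis_match(treated_X, control_X):
        """Pair each treated unit with its nearest unused control by Mahalanobis distance."""
        pooled = np.vstack([treated_X, control_X])
        VI = np.linalg.pinv(np.cov(pooled.T))            # (pseudo-)inverse covariance matrix
        available = list(range(len(control_X)))
        pairs = []
        for i, x in enumerate(treated_X):
            diffs = control_X[available] - x
            d2 = np.einsum("ij,jk,ik->i", diffs, VI, diffs)  # squared distances to candidates
            pairs.append((i, available.pop(int(np.argmin(d2)))))
        return pairs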
Article
Full-text available
We seek to measure political candidates' ideological positioning from their speeches. To accomplish this, we infer ideological cues from a corpus of political writings annotated with known ideologies. We then represent the speeches of U.S. Presidential candidates as sequences of cues and lags (filler distinguished only by its length in words). We apply a domain-informed Bayesian HMM to infer the proportions of ideologies each candidate uses in each campaign. The results are validated against a set of preregistered, domain expert-authored hypotheses.
Conference Paper
Full-text available
We present an approach to detecting hate speech in online text, where hate speech is defined as abusive speech targeting specific group characteristics, such as ethnic origin, religion, gender, or sexual orientation. While hate speech against any group may exhibit some common characteristics, we have observed that hatred against each different group is typically characterized by the use of a small set of high frequency stereotypical words; however, such words may be used in either a positive or a negative sense, making our task similar to that of word sense disambiguation. In this paper we describe our definition of hate speech, the collection and annotation of our hate speech corpus, and a mechanism for detecting some commonly used methods of evading common "dirty word" filters. We describe pilot classification experiments in which we classify anti-semitic speech, reaching an accuracy of 94%, precision of 68%, and recall of 60%, for an F1 measure of 0.6375.
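For reference, the reported F1 follows directly from precision P = 0.68 and recall R = 0.60 via the harmonic mean: F1 = 2PR / (P + R) = 2(0.68)(0.60) / 1.28 = 0.6375.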
Article
Full-text available
The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates. Both large and small sample theory show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates. Applications include: (i) matched sampling on the univariate propensity score, which is a generalization of discriminant matching, (ii) multivariate adjustment by subclassification on the propensity score where the same subclasses are used to estimate treatment effects for all outcome variables and in all subpopulations, and (iii) visual representation of multivariate covariance adjustment by a two-dimensional plot.
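In standard notation (ours, not quoted from the paper), the propensity score for covariates X = x is e(x) = Pr(Z = 1 | X = x), where Z is the treatment indicator; the balancing result means that among units with the same e(x), the distribution of X is the same for treated and control units, which is why adjusting for this single scalar removes bias due to the observed covariates.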
Conference Paper
Full-text available
Text messaging through the Internet or cellular phones has become a major medium of personal and commercial communication. At the same time, flames (such as rants, taunts, and squalid phrases) are offensive/abusive phrases which might attack or offend users for a variety of reasons. An automatic discriminative tool with a sensitivity parameter for flame or abusive language detection would therefore be useful. Although a human could recognize these sorts of useless, annoying texts among the useful ones, it is not an easy task for computer programs. In this paper, we describe an automatic flame detection method which extracts features at different conceptual levels and applies multi-level classification for flame detection. While the system takes advantage of a variety of statistical models and rule-based patterns, an auxiliary weighted pattern repository improves accuracy by matching the text against its graded entries.
Conference Paper
Full-text available
We present two studies of online ephemerality and anonymity based on the popular discussion board /b/ at 4chan.org: a website with over 7 million users that plays an influential role in Internet culture. Although researchers and practitioners often assume that user identity and data permanence are central tools in the design of online communities, we explore how /b/ succeeds despite being almost entirely anonymous and extremely ephemeral. We begin by describing /b/ and performing a content analysis that suggests the community is dominated by playful exchanges of images and links. Our first study uses a large dataset of more than five million posts to quantify ephemerality in /b/. We find that most threads spend just five seconds on the first page and less than five minutes on the site before expiring. Our second study is an analysis of identity signals on 4chan, finding that over 90% of posts are made by fully anonymous users, with other identity signals adopted and discarded at will. We describe alternative mechanisms that /b/ participants use to establish status and frame their interactions.
Article
Full-text available
In observational studies designed to estimate the effects of interventions or exposures, such as cigarette smoking, it is desirable to try to control background differences between the treated group (e.g., current smokers) and the control group (e.g., never smokers) on covariates X (e.g., age, education). Matched sampling attempts to effect this control by selecting subsets of the treated and control groups with similar distributions of such covariates. This paper examines the consequences of matching using affinely invariant methods when the covariate distributions are "discriminant mixtures of proportional ellipsoidally symmetric" (DMPES) distributions, a class herein defined, which generalizes the ellipsoidal symmetry class of Rubin and Thomas [Ann. Statist. 20 (1992) 1079--1093]. The resulting generalized results help indicate why earlier results hold quite well even when the simple assumption of ellipsoidal symmetry is not met [e.g., Biometrics 52 (1996) 249--264]. Extensions to conditionally affinely invariant matching with conditionally DMPES distributions are also discussed.
Article
Platforms like Reddit have attracted large and vibrant communities, but the individuals in those communities are free to migrate to other platforms at any time. History has borne this out with the mass migration from Slashdot to Digg. The underlying motivations of individuals who migrate between platforms, and the conditions that favor migration online are not well-understood. We examine Reddit during a period of community unrest affecting millions of users in the summer of 2015, and analyze large-scale changes in user behavior and migration patterns to Reddit-like alternative platforms. Using self-reported statements from user comments, surveys, and a computational analysis of the activity of users with accounts on multiple platforms, we identify the primary motivations driving user migration. While a notable number of Reddit users left for other platforms, we found that an important pull factor that enabled Reddit to retain users was its long tail of niche content. Other platforms may reach critical mass to support popular or “mainstream” topics, but Reddit’s large userbase provides a key advantage in supporting niche topics.
Article
Recent work has demonstrated the value of social media monitoring for health surveillance (e.g., tracking influenza or depression rates). It is an open question whether such data can be used to make causal inferences (e.g., determining which activities lead to increased depression rates). Even in traditional, restricted domains, estimating causal effects from observational data is highly susceptible to confounding bias. In this work, we estimate the effect of exercise on mental health from Twitter, relying on statistical matching methods to reduce confounding bias. We train a text classifier to estimate the volume of a user's tweets expressing anxiety, depression, or anger, then compare two groups: those who exercise regularly (identified by their use of physical activity trackers like Nike+), and a matched control group. We find that those who exercise regularly have significantly fewer tweets expressing depression or anxiety; there is no significant difference in rates of tweets expressing anger. We additionally perform a sensitivity analysis to investigate how the many experimental design choices in such a study impact the final conclusions, including the quality of the classifier and the construction of the control group.
Article
Although the social medium Twitter grants users freedom of speech, its instantaneous nature and retweeting features also amplify hate speech. Because Twitter has a sizeable black constituency, racist tweets against blacks are especially detrimental in the Twitter community, though this effect may not be obvious against a backdrop of half a billion tweets a day. We apply a supervised machine learning approach, employing inexpensively acquired labeled data from diverse Twitter accounts to learn a binary classifier for the labels “racist” and “nonracist.” The classifier has a 76% average accuracy on individual tweets, suggesting that with further improvements, our work can contribute data on the sources of anti-black hate speech.
Book
How insights from the social sciences, including social psychology and economics, can improve the design of online communities. Online communities are among the most popular destinations on the Internet, but not all online communities are equally successful. For every flourishing Facebook, there is a moribund Friendster—not to mention the scores of smaller social networking sites that never attracted enough members to be viable. This book offers lessons from theory and empirical research in the social sciences that can help improve the design of online communities. The authors draw on the literature in psychology, economics, and other social sciences, as well as their own research, translating general findings into useful design claims. They explain, for example, how to encourage information contributions based on the theory of public goods, and how to build members' commitment based on theories of interpersonal bond formation. For each design claim, they offer supporting evidence from theory, experiments, or observational studies.
Conference Paper
Since its earliest days, harassment and abuse have plagued the Internet. Recent research has focused on in-domain methods to detect abusive content and faces several challenges, most notably the need to obtain large training corpora. In this paper, we introduce a novel computational approach to address this problem called Bag of Communities (BoC)---a technique that leverages large-scale, preexisting data from other Internet communities. We then apply BoC toward identifying abusive behavior within a major Internet community. Specifically, we compute a post's similarity to 9 other communities from 4chan, Reddit, Voat and MetaFilter. We show that a BoC model can be used on communities "off the shelf" with roughly 75% accuracy---no training examples are needed from the target community. A dynamic BoC model achieves 91.18% accuracy after seeing 100,000 human-moderated posts, and uniformly outperforms in-domain methods. Using this conceptual and empirical work, we argue that the BoC approach may allow communities to deal with a range of common problems, like abusive behavior, faster and with fewer engineering resources.
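One simple way to realize the idea, sketched here with an assumed unigram-language-model notion of "similarity" (the paper's actual features may differ), is to score a post under each reference community's smoothed word distribution and feed those scores to a standard classifier:

    import numpy as np
    from collections import Counter

    def unigram_logprobs(community_tokens, vocab, alpha=1.0):
        """Laplace-smoothed unigram log-probabilities for one reference community."""
        counts = Counter(community_tokens)
        total = sum(counts.values()) + alpha * len(vocab)
        return {w: np.log((counts[w] + alpha) / total) for w in vocab}

    def boc_features(post_tokens, community_models):
        """One feature per reference community: the post's mean token log-probability."""
        feats = []
        for model in community_models:
            floor = min(model.values())          # fallback for out-of-vocabulary tokens
            feats.append(np.mean([model.get(t, floor) for t in post_tokens]))
        return np.array(feats)

    # The resulting feature vectors would then train a linear classifier on
    # moderated vs. unmoderated posts from the target community.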
Conference Paper
Online communities have the potential to be supportive, cruel, or anywhere in between. The development of positive norms for interaction can help users build bonds, grow, and learn. Using millions of messages sent in Twitch chatrooms, we explore the effectiveness of methods for encouraging and discouraging specific behaviors, including taking advantage of imitation effects through setting positive examples and using moderation tools to discourage antisocial behaviors. Consistent with aspects of imitation theory and deterrence theory, users imitated examples of behavior that they saw, and more so for behaviors from high status users. Proactive moderation tools, such as chat modes which restricted the ability to post certain content, proved effective at discouraging spam behaviors, while reactive bans were able to discourage a wider variety of behaviors. This work considers the intersection of tools, authority, and types of behaviors, offering a new frame through which to consider the development of moderation strategies.
Conference Paper
Detection of abusive language in user generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions, however these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine learning based method to detect hate speech on online user comments from two domains which outperforms a state-of-the-art deep learning approach. We also develop a corpus of user comments annotated for abusive language, the first of its kind. Finally, we use our detection tool to analyze abusive language over time and in different settings to further enhance our knowledge of this behavior.
Conference Paper
Pro-eating disorder (pro-ED) communities on social media encourage the adoption and maintenance of disordered eating habits as acceptable alternative lifestyles rather than threats to health. In particular, the social networking site Instagram has reacted by banning searches on several pro-ED tags and issuing content advisories on others. We present the first large-scale quantitative study investigating pro-ED communities on Instagram in the aftermath of moderation -- our dataset contains 2.5M posts between 2011 and 2014. We find that the pro-ED community has adopted non-standard lexical variations of moderated tags to circumvent these restrictions. In fact, increasingly complex lexical variants have emerged over time. Communities that use lexical variants show increased participation and support of pro-ED (15-30%). Finally, the tags associated with content on these variants express more toxic, self-harm, and vulnerable content. Despite Instagram’s moderation strategies, pro-ED communities are active and thriving. We discuss the effectiveness of content moderation as an intervention for communities of deviant behavior.
Conference Paper
Social media has emerged as a promising source of data for public health. This paper examines how these platforms can provide empirical quantitative evidence for understanding dietary choices and nutritional challenges in “food deserts” -- Census tracts characterized by poor access to healthy and affordable food. We present a study of 3 million food related posts shared on Instagram, and observe that content from food deserts indicate consumption of food high in fat, cholesterol and sugar; a rate higher by 5-17% compared to non-food desert areas. Further, a topic model analysis reveals the ingestion language of food deserts to bear distinct attributes. Finally, we investigate to what extent Instagram ingestion language is able to infer whether a tract is a food desert. We find that a predictive model that uses ingestion topics, socio-economic and food deprivation status attributes yields high accuracy (>80%) and improves over baseline methods by 6-14%. We discuss the role of social media in helping address inequalities in food access and health.
Conference Paper
This paper explores temporary identities on social media platforms and individuals' uses of these identities with respect to their perceptions of anonymity. Given the research on multiple profile maintenance, little research has examined the role that some social media platforms play in affording users with temporary identities. Further, most of the research on anonymity stops short of the concept of varying perceptions of anonymity. This paper builds on these research areas by describing the phenomenon of temporary "throwaway accounts" and their uses on reddit.com, a popular social news site. In addition to ethnographic trace analysis to examine the contexts in which throwaway accounts are adopted, this paper presents a predictive model that suggests that perceptions of anonymity significantly shape the potential uses of throwaway accounts and that women are much more likely to adopt temporary identities than men.
Article
The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates. Previous theoretical arguments have shown that subclassification on the propensity score will balance all observed covariates. Subclassification on an estimated propensity score is illustrated, using observational data on treatments for coronary artery disease. Five subclasses defined by the estimated propensity score are constructed that balance 74 covariates, and thereby provide estimates of treatment effects using direct adjustment. These subclasses are applied within sub-populations, and model-based adjustments are then used to provide estimates of treatment effects within these sub-populations. Two appendixes address theoretical issues related to the application: the effectiveness of subclassification on the propensity score in removing bias, and balancing properties of propensity scores with incomplete data.
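A compact numpy sketch of the direct-adjustment estimate from propensity-score subclasses (five quintile strata; the propensity scores are assumed to be already estimated, and each stratum is assumed to contain both treated and control units):

    import numpy as np

    def subclassification_effect(y, z, e, n_strata=5):
        """y: outcomes, z: 0/1 treatment indicator, e: estimated propensity scores."""
        edges = np.quantile(e, np.linspace(0, 1, n_strata + 1))
        strata = np.clip(np.searchsorted(edges, e, side="right") - 1, 0, n_strata - 1)
        effect = 0.0
        for s in range(n_strata):
            in_s = strata == s
            diff = y[in_s & (z == 1)].mean() - y[in_s & (z == 0)].mean()
            effect += diff * in_s.mean()          # weight stratum by its share of units
        return effect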
Chapter
P. R. Rosenbaum received his PhD in Statistics from Harvard University in 1980. Since 1986, he has been in the Statistics Department of the Wharton School of the University of Pennsylvania. He is a fellow of the American Statistical Association and he received the George W. Snedecor Award from the Committee of Presidents of Statistical Societies.
Book
An observational study is a nonexperimental investigation of the effects caused by a treatment. Unlike an experiment, in an observational study, the investigator does not control the assignment of treatments, with the consequence that the individuals in different treatment groups may not have been comparable prior to treatment. Analytical adjustments, such as matching, are used to remove overt bias, that is, pretreatment differences that are accurately measured and recorded. There may be pretreatment differences that were not recorded, called hidden biases, and addressing these is a central concern.
Article
Twitter is often used in quantitative studies that identify geographically-preferred topics, writing styles, and entities. These studies rely on either GPS coordinates attached to individual messages, or on the user-supplied location field in each profile. In this paper, we compare these data acquisition techniques and quantify the biases that they introduce; we also measure their effects on linguistic analysis and text-based geolocation. GPS-tagging and self-reported locations yield measurably different corpora, and these linguistic differences are partially attributable to differences in dataset composition by age and gender. Using a latent variable model to induce age and gender, we show how these demographic variables interact with geography to affect language use. We also show that the accuracy of text-based geolocation varies with population demographics, giving the best results for men above the age of 40.
Article
User contributions in the form of posts, comments, and votes are essential to the success of online communities. However, allowing user participation also invites undesirable behavior such as trolling. In this paper, we characterize antisocial behavior in three large online discussion communities by analyzing users who were banned from these communities. We find that such users tend to concentrate their efforts in a small number of threads, are more likely to post irrelevantly, and are more successful at garnering responses from other users. Studying the evolution of these users from the moment they join a community up to when they get banned, we find that not only do they write worse than other users over time, but they also become increasingly less tolerated by the community. Further, we discover that antisocial behavior is exacerbated when community feedback is overly harsh. Our analysis also reveals distinct groups of users with different levels of antisocial behavior that can change over time. We use these insights to identify antisocial users early on, a task of high practical importance to community maintainers.
Article
The Something Awful Forums (SAF) is an online community comprised of a loosely connected federation of forums, united in a distinctive brand of humor with a focus on the quality of member contributions. In this case study we find that the site has sustained success while deviating from common conventions and norms of online communities. Humor and the quality of content contributed by SAF members foster practices that seem counterintuitive to the development of a stable and thriving community. In this case study we show how design decisions are contextual and inter-dependent and together these heuristics create a different kind of online third place that challenges common practices.
Article
Social media systems rely on user feedback and rating mechanisms for personalization, ranking, and content filtering. However, when users evaluate content contributed by fellow users (e.g., by liking a post or voting on a comment), these evaluations create complex social feedback effects. This paper investigates how ratings on a piece of content affect its author's future behavior. By studying four large comment-based news communities, we find that negative feedback leads to significant behavioral changes that are detrimental to the community. Not only do authors of negatively-evaluated content contribute more, but also their future posts are of lower quality, and are perceived by the community as such. Moreover, these authors are more likely to subsequently evaluate their fellow users negatively, percolating these effects through the community. In contrast, positive feedback does not carry similar effects, and neither encourages rewarded authors to write more, nor improves the quality of their posts. Interestingly, the authors that receive no feedback are most likely to leave a community. Furthermore, a structural analysis of the voter network reveals that evaluations polarize the community the most when positive and negative votes are equally split.
Article
As online communities grow and the volume of user-generated content increases, the need for community management also rises. Community management has three main purposes: to create a positive experience for existing participants, to promote appropriate, socionormative behaviors, and to encourage potential participants to make contributions. Research indicates that the quality of content a potential participant sees on a site is highly influential; off-topic, negative comments with malicious intent are a particularly strong boundary to participation or set the tone for encouraging similar contributions. A problem for community managers, therefore, is the detection and elimination of such undesirable content. As a community grows, this undertaking becomes more daunting. Can an automated system aid community managers in this task? In this paper, we address this question through a machine learning approach to automatic detection of inappropriate negative user contributions. Our training corpus is a set of comments from a news commenting site that we tasked Amazon Mechanical Turk workers with labeling. Each comment is labeled for the presence of profanity, insults, and the object of the insults. Support vector machines trained on these data are combined with relevance and valence analysis systems in a multistep approach to the detection of inappropriate negative user contributions. The system shows great potential for semiautomated community management. © 2012 Wiley Periodicals, Inc.
Chapter
When subjects are consecutively given more than one treatment in an experiment, including for crossover design, earlier treatments may affect the results observed during later treatments. To counteract this, the treatment order may be varied between different subjects, or counterbalanced. While complete counterbalancing often requires large numbers of participants, partial counterbalancing, often through the use of a Latin squares design, can provide a reasonable alternative.
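For example, with four treatments A-D, a 4 x 4 Latin square yields four groups of participants whose orders place every treatment in every serial position exactly once:

    Group 1: A B C D
    Group 2: B C D A
    Group 3: C D A B
    Group 4: D A B C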
Article
Offensive language has become a major issue for the health of both online communities and their users. To the online community, the spread of offensive language undermines its reputation, drives users away, and even directly affects its growth. To users, viewing offensive language brings negative influence to their mental health, especially for children and youth. When offensive language is detected in a user message, a problem arises about how the offensive language should be removed, i.e. the offensive language filtering problem. To solve this problem, the manual filtering approach is known to produce the best filtering result. However, manual filtering is costly in time and labor and thus cannot be widely applied. In this paper, we analyze the offensive language in text messages posted in online communities, and propose a new automatic sentence-level filtering approach that is able to semantically remove the offensive language by utilizing the grammatical relations among words. Compared with existing automatic filtering approaches, the proposed filtering approach provides filtering results much closer to manual filtering. To demonstrate our work, we created a dataset by manually filtering over 11,000 text comments from the YouTube website. Experiments on this dataset show over 90% agreement in filtered results between the proposed approach and the manual filtering approach. Moreover, we show the overhead of applying the proposed approach to user comment filtering is reasonable, making it practical to adopt in real-life applications.
Conference Paper
Can a system of distributed moderation quickly and consistently separate high and low quality comments in an online conversation? Analysis of the site Slashdot.org suggests that the answer is a qualified yes, but that important challenges remain for designers of such systems. Thousands of users act as moderators. Final scores for comments are reasonably dispersed and the community generally agrees that moderations are fair. On the other hand, much of a conversation can pass before the best and worst comments are identified. Of those moderations that were judged unfair, only about half were subsequently counterbalanced by a moderation in the other direction. And comments with low scores, not at top-level, or posted late in a conversation were more likely to be overlooked by moderators.
Conference Paper
Generative models of text typically associate a multinomial with every class label or topic. Even in simple models this requires the estimation of thousands of parameters; in multifaceted latent variable models, standard approaches require additional latent "switching" variables for every token, complicating inference. In this paper, we propose an alternative generative model for text. The central idea is that each class label or latent topic is endowed with a model of the deviation in log-frequency from a constant background distribution. This approach has two key advantages: we can enforce sparsity to prevent overfitting, and we can combine generative facets through simple addition in log space, avoiding the need for latent switching variables. We demonstrate the applicability of this idea to a range of scenarios: classification, topic modeling, and more complex multifaceted generative models.
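In symbols (our notation), each class or topic c contributes a sparse deviation vector eta_c on top of a shared background log-frequency vector m, so that

    p(w | c) = exp(m_w + eta_{c,w}) / sum over w' of exp(m_{w'} + eta_{c,w'}),

and multiple facets combine by simply summing their deviations inside the exponent before normalizing.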
Conference Paper
The scourge of cyberbullying has assumed alarming proportions with an ever-increasing number of adolescents admitting to having dealt with it either as a victim or as a bystander. Anonymity and the lack of meaningful supervision in the electronic medium are two factors that have exacerbated this social menace. Comments or posts involving sensitive topics that are personal to an individual are more likely to be internalized by a victim, often resulting in tragic outcomes. We decompose the overall detection problem into detection of sensitive topics, lending itself into text classification sub-problems. We experiment with a corpus of 4500 YouTube comments, applying a range of binary and multiclass classifiers. We find that binary classifiers for individual labels outperform multiclass classifiers. Our findings show that the detection of textual cyberbullying can be tackled by building individual topic-sensitive classifiers.
Article
Subtitled "An introduction to human ecology," this work attempts systematically to treat "least effort" (and its derivatives) as the principle underlying a multiplicity of individual and collective behaviors, variously but regularly distributed. The general orientation is quantitative, and the principle is widely interpreted and applied. After a brief elaboration of principles and a brief summary of pertinent studies (mostly in psychology), Part One (Language and the structure of the personality) develops 8 chapters on its theme, ranging from regularities within language per se to material on individual psychology. Part Two (Human relations: a case of intraspecies balance) contains chapters on "The economy of geography," "Intranational and international cooperation and conflict," "The distribution of economic power and social status," and "Prestige values and cultural vogues"—all developed in terms of the central theme. 20 pages of references with some annotation, keyed to the index. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Many scientific problems require that treatment comparisons be adjusted for posttreatment variables, but the estimands underlying standard methods are not causal effects. To address this deficiency, we propose a general framework for comparing treatments adjusting for posttreatment variables that yields principal effects based on principal stratification. Principal stratification with respect to a posttreatment variable is a cross-classification of subjects defined by the joint potential values of that posttreatment variable under each of the treatments being compared. Principal effects are causal effects within a principal stratum. The key property of principal strata is that they are not affected by treatment assignment and therefore can be used just as any pretreatment covariate, such as age category. As a result, the central property of our principal effects is that they are always causal effects and do not suffer from the complications of standard posttreatment-adjusted estimands. We discuss briefly that such principal causal effects are the link between three recent applications with adjustment for posttreatment variables: (i) treatment noncompliance, (ii) missing outcomes (dropout) following treatment noncompliance, and (iii) censoring by death. We then attack the problem of surrogate or biomarker endpoints, where we show, using principal causal effects, that all current definitions of surrogacy, even when perfectly true, do not generally have the desired interpretation as causal effects of treatment on outcome. We go on to formulate estimands based on principal stratification and principal causal effects and show their superiority.
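In potential-outcomes notation (ours), writing S(z) for the posttreatment variable and Y(z) for the outcome under treatment z in {0, 1}, a principal stratum collects the units with a given pair (S(0), S(1)), and the associated principal causal effect is

    E[ Y(1) - Y(0) | S(0) = s0, S(1) = s1 ],

which is a genuine causal effect because stratum membership, unlike the observed posttreatment value, does not depend on which treatment was assigned.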
Article
The difference-in-differences (DID) estimator is one of the most popular tools for applied research in economics to evaluate the effects of public interventions and other treatments of interest on some relevant outcome variables. However, it is well known that the DID estimator is based on strong identifying assumptions. In particular, the conventional DID estimator requires that, in the absence of the treatment, the average outcomes for the treated and control groups would have followed parallel paths over time. This assumption may be implausible if pre-treatment characteristics that are thought to be associated with the dynamics of the outcome variable are unbalanced between the treated and the untreated. That would be the case, for example, if selection for treatment is influenced by individual-transitory shocks on past outcomes (Ashenfelter's dip). This article considers the case in which differences in observed characteristics create non-parallel outcome dynamics between treated and controls. It is shown that, in such a case, a simple two-step strategy can be used to estimate the average effect of the treatment for the treated. In addition, the estimation framework proposed in this article allows the use of covariates to describe how the average effect of the treatment varies with changes in observed characteristics.
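For reference, the conventional DID estimand discussed here is (our notation)

    tau_DID = ( E[Y_post | treated] - E[Y_pre | treated] ) - ( E[Y_post | control] - E[Y_pre | control] ),

which identifies the average effect of the treatment on the treated only under the parallel-trends assumption; the two-step strategy described in the abstract addresses departures from parallel trends that are driven by observed covariates.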
These are the 5 subreddits Reddit banned under its game-changing anti-harassment policy - and why it banned them
  • Caitlin Dewey
Reddit (Finally) Bans CoonTown. http://gawker.com/reddit-finally-bans-coontown-1722332877
  • Sam Biddle
Reddit Users Revolt After Site Bans "Fat People Hate" And Other Communities. https://www.buzzfeed.com/mbvd/reddit-users-revolt-after-site-bans-fat-people-hate-and-othe
  • Michelle Broder Van Dyke
Black Hole. https://www.splcenter.org/fighting-hate/intelligence-report
  • Keegan Hankes
Reddit announces new anti-harassment rules
  • Adi Robertson
A First Amendment For Social Platforms. https://medium.com/@BuzzFeed/a-first-amendment-for-social-platforms-202c0eab7054
  • Nabiha Syed
  • Ben Smith
Hate speech, machine classification and statistical modelling of information flows on Twitter: Interpretation and communication for policy decision making
  • Peter Burnap
  • Matthew Leighton Williams
Controlling Bad Behavior in Online Communities: An Examination of Moderation Work
  • Aiden R Mcgillicuddy
  • Jean-Gregoire Bernard
  • Jocelyn Ann Cranefield
Coontown": A noxious racist corner of Reddit survives recent purge
  • Justin Wm Moyer