Conference Paper

AI-Driven Mediation Strategies for Audience Depolarisation in Online Debates

... Automated measures, such as content removal and account suspensions, are scalable and can effectively reduce online hate speech [21]. However, automated measures that are not properly calibrated may falsely remove content, which may be perceived as an infringement on individuals' freedom of speech [22,24,38] and thus even spur more hostility [26,34]. In contrast, manual moderation, such as the removal of problematic content or accounts by human moderators, can be more precise [21]. ...
... Previous research has demonstrated that LLM-generated messages are generally persuasive across various applications [35], albeit outside of counterspeech. For example, LLMs can generate messages that successfully mediate between opposing groups [59], decrease conspiracy beliefs [17], and promote civility in online conversations [6,22]. Thus, it is likely that crafting custom messages through an LLM could also encourage online offenders to reconsider their hateful posts and, therefore, potentially reduce hate speech. ...
... Generally, whether LLMs are persuasive varies across different use cases [57,63]. For example, outside of counterspeech, some works ask users to have long discussions with chatbots and then assess whether their beliefs have changed as a result [6,17,22,59]. In contrast, one-time interventions such as counterspeech are minimally invasive and may thus be ineffective. ...
Preprint
Full-text available
Online hate speech poses a serious threat to individual well-being and societal cohesion. A promising solution to curb online hate speech is counterspeech. Counterspeech is aimed at encouraging users to reconsider hateful posts through direct replies. However, current methods lack scalability due to the need for human intervention or fail to adapt to the specific context of the post. A potential remedy is the use of generative AI, specifically large language models (LLMs), to write tailored counterspeech messages. In this paper, we analyze whether contextualized counterspeech generated by state-of-the-art LLMs is effective in curbing online hate speech. To do so, we conducted a large-scale, pre-registered field experiment (N=2,664) on the social media platform Twitter/X. Our experiment followed a 2x2 between-subjects design with an additional control condition that received no counterspeech. On the one hand, users posting hateful content on Twitter/X were randomly assigned to receive either (a) contextualized counterspeech or (b) non-contextualized counterspeech. Here, the former is generated through LLMs, while the latter relies on predefined, generic messages. On the other hand, we tested two counterspeech strategies: (a) promoting empathy and (b) warning about the consequences of online misbehavior. We then measured whether users deleted their initial hateful posts and whether their behavior changed after the counterspeech intervention (e.g., whether users adopted less toxic language). We find that non-contextualized counterspeech employing a warning-of-consequence strategy significantly reduces online hate speech. However, contextualized counterspeech generated by LLMs proves ineffective and may even backfire.
... To enhance the generalizability of our findings, we adopted two measures: (1) We chose two social topics for discussion instead of one. Specifically, drawing from recent HCI research on social discussions, we chose the topics "Should self-driving cars be allowed on roads?" [32] and "Do violent video games promote violence in youths?" [91]. We selected these two topics because they are highly relevant to the everyday lives of our participants, making it easier for them to have opinions and thoughts on them. ...
... For measures that did not meet the normality assumption, we applied the Kruskal-Wallis test and performed Dunn's test as a post-hoc analysis. For the second analysis, we followed previous literature [32] to define the polarization of a stance as |stance − neutral|, where neutral represents the neutral midpoint on the rating scale. Given our 6-point scale, we assigned neutral = 3.5. ...
... • Topic 1 - Self-Driving Cars Should Be Allowed on Public Roads. (Topic: [32]) ...
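The excerpt above describes the polarization measure and the nonparametric tests only in prose. Below is a minimal sketch of how such an analysis might look in Python; the data, condition labels, and the use of the scikit-posthocs package for Dunn's test are illustrative assumptions, not the cited study's actual pipeline.

```python
# Minimal sketch (assumed data and tooling): polarization of stance ratings on a
# 6-point scale, followed by a Kruskal-Wallis test and Dunn's post-hoc comparisons.
import pandas as pd
from scipy.stats import kruskal
import scikit_posthocs as sp  # assumed tooling for Dunn's test

NEUTRAL = 3.5  # neutral midpoint of the 6-point scale, as in the excerpt above

# Hypothetical stance ratings (1-6) from three experimental conditions.
df = pd.DataFrame({
    "condition": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "stance":    [1, 2, 2, 3, 1, 4, 5, 6, 5, 4, 3, 4, 3, 4, 3],
})

# Polarization = |stance - neutral|: distance of a rating from the neutral midpoint.
df["polarization"] = (df["stance"] - NEUTRAL).abs()

# Kruskal-Wallis test across conditions (used when the normality assumption fails).
groups = [g["polarization"].to_numpy() for _, g in df.groupby("condition")]
h_stat, p_value = kruskal(*groups)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_value:.3f}")

# Dunn's test as the post-hoc pairwise comparison, with a multiple-comparison correction.
print(sp.posthoc_dunn(df, val_col="polarization", group_col="condition", p_adjust="holm"))
```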
Preprint
Multi-agent systems - systems with multiple independent AI agents working together to achieve a common goal - are becoming increasingly prevalent in daily life. Drawing inspiration from the phenomenon of human group social influence, we investigate whether a group of AI agents can create social pressure on users to agree with them, potentially changing their stance on a topic. We conducted a study in which participants discussed social issues with either a single or multiple AI agents, and where the agents either agreed or disagreed with the user's stance on the topic. We found that conversing with multiple agents (holding conversation content constant) increased the social pressure felt by participants, and caused a greater shift in opinion towards the agents' stances on each topic. Our study shows the potential advantages of multi-agent systems over single-agent platforms in causing opinion change. We discuss design implications for possible multi-agent systems that promote social good, as well as the potential for malicious actors to use these systems to manipulate public opinion.
... AI-enabled agents are increasingly deployed into online social environments such as group conversational and collaborative spaces as assistants, facilitators, or collaborators to support varied tasks such as conversation summarization [65], brainstorming [47], and conflict resolution [17,58]. Integrated as part of the group context, these agents insert automated messages into group spaces where human-human conversation and interaction naturally occur. ...
... Agents have shown promise to support group tasks such as idea generation [47], information seeking [4], and decision-making [42,49]. They can also intervene in team interpersonal dynamics to, for instance, support conflict resolution [17] or encourage balanced human participation [13]. These agents function in a diverse range of group contexts, including video conferencing [2,41,47] and collaborative canvases [21]; most commonly, agents interject into human-human conversations by posting text messages such as facilitation messages to encourage participation [13] or relevant information to help decision-making [72]. ...
Preprint
Full-text available
AI agents are increasingly tasked with making proactive suggestions in online spaces where groups collaborate, but can be unhelpful or even annoying due to not fitting the group's preferences or behaving in socially inappropriate ways. Fortunately, group spaces have a rich history of prior social interactions and affordances for social feedback to support creating agents that align with a group's interests and norms. We present Social-RAG, a workflow for grounding agents in social information about a group, which retrieves from prior group interactions, selects relevant social signals, and then feeds the context into a large language model to generate messages to the group. We implement this workflow in PaperPing, our system that posts academic paper recommendations in group chat, leveraging social signals determined from formative studies with 39 researchers. From a three-month deployment in 18 channels, we observed that PaperPing posted relevant messages in groups without disrupting their existing social practices, fostering group common ground.
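The abstract describes the Social-RAG workflow at a high level (retrieve prior group interactions, select social signals, feed them to an LLM). A rough sketch of that flow is given below; every function name, the signal schema, and the ranking-by-reactions heuristic are hypothetical illustrations rather than the system's actual implementation.

```python
# Rough sketch of a Social-RAG-style flow: retrieve prior group interactions, select
# the strongest social signals, and build LLM context from them. All names, the signal
# schema, and the ranking heuristic are hypothetical illustrations.

def retrieve_prior_interactions(group_id, query):
    """Hypothetical retrieval over a group's past messages and reactions."""
    return [
        {"text": "We mostly read papers on LLM evaluation.", "reactions": 5},
        {"text": "Please keep recommendations short.", "reactions": 3},
        {"text": "Off-topic links tend to get ignored here.", "reactions": 1},
    ]

def select_social_signals(interactions, top_k=2):
    """Keep the interactions with the strongest social feedback (here: reaction counts)."""
    ranked = sorted(interactions, key=lambda item: item["reactions"], reverse=True)
    return [item["text"] for item in ranked[:top_k]]

def build_prompt(signals, candidate_paper):
    """Fold the selected signals into the context given to the language model."""
    context = "\n".join(f"- {signal}" for signal in signals)
    return (
        "Group norms and preferences:\n"
        f"{context}\n"
        f"Write a short group-chat message recommending the paper: {candidate_paper}"
    )

signals = select_social_signals(retrieve_prior_interactions("channel-42", "paper recommendations"))
prompt = build_prompt(signals, "A Survey of LLM Evaluation Methods")
print(prompt)  # this prompt would then be passed to an LLM to draft the group message
```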
... These platforms have since been adopted by various communities and elected bodies and integrated into participatory democracy processes. Such platforms, including some social media platforms, have been observed to be advantageous in (a) supporting "constructive rather than confrontational discourse" [62], (b) encouraging large-scale engagement on policy issues through voting and sharing of perspectives, concerns, and needs [40,102,113], (c) facilitating awareness and analysis of diverse perspectives and opinions [17,40,41,62,120], and (d) promoting critical consciousness and action (especially among youth) on socio-political issues [17]. Moreover, recent research has also examined the communicative ecologies of politicians and how policymakers of different ideologies and partisan affiliations collaborate on legislative matters [117]. ...
... Yeo et al. [120] investigated how to use LLMs to generate different types of reflective textual nudges, finding that persona-oriented ones can create a more deliberative environment. Similarly, Govers et al. [41] used a bot to implement different conflict resolution strategies [107], finding that cooperative strategies work better than forceful ones. ...
Article
Full-text available
Digitally-supported participatory methods are often used in policy-making to develop inclusive policies by collecting and integrating citizens' opinions. However, these methods fail to capture the complexity and nuances in citizens' needs, i.e., citizens are generally unaware of others' needs, perspectives, and experiences. Consequently, policies developed with this underlying gap tend to overlook the alignment of multistakeholder perspectives and design policies based on the optimization of high-level demographic features. In our contribution, we propose a method to enable citizens to understand others' perspectives and calibrate their positions. First, we collected requirements and design principles to develop our approach by involving stakeholders and experts in policymaking in a series of workshops. Then, we conducted a crowdsourcing study with 420 participants to compare the effect of different texts and images on people's initial and final motivations and their willingness to change opinions. We observed that both influence participants' opinion change; however, the effect is more pronounced for the textual modality. Finally, we discuss overarching implications of designing with empathy to mediate alignment of citizens' perspectives.
... Costello et al. [17] instructed an LLM to debunk conspiracy theories, achieving notable long-term effects, with persuasion lasting up to two months post-intervention. Finally, several studies examined the use of LLMs as a moderation tool for mitigating conflicts and for reducing toxic messages on social media [10,33,40]. In spite of these results, however, the persuasiveness of AI-generated contextualized counterspeech is still unexplored. ...
Preprint
AI-generated counterspeech offers a promising and scalable strategy to curb online toxicity through direct replies that promote civil discourse. However, current counterspeech is one-size-fits-all, lacking adaptation to the moderation context and the users involved. We propose and evaluate multiple strategies for generating tailored counterspeech that is adapted to the moderation context and personalized for the moderated user. We instruct an LLaMA2-13B model to generate counterspeech, experimenting with various configurations based on different contextual information and fine-tuning strategies. We identify the configurations that generate persuasive counterspeech through a combination of quantitative indicators and human evaluations collected via a pre-registered mixed-design crowdsourcing experiment. Results show that contextualized counterspeech can significantly outperform state-of-the-art generic counterspeech in adequacy and persuasiveness, without compromising other characteristics. Our findings also reveal a poor correlation between quantitative indicators and human evaluations, suggesting that these methods assess different aspects and highlighting the need for nuanced evaluation methodologies. The effectiveness of contextualized AI-generated counterspeech and the divergence between human and algorithmic evaluations underscore the importance of increased human-AI collaboration in content moderation.
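The abstract above describes conditioning an LLM on moderation context to generate tailored counterspeech. The sketch below shows what assembling such a contextualized prompt might look like; the prompt template, the context fields, and the generation settings are assumptions for illustration, not the authors' reported configuration (LLaMA-2 checkpoints are also gated and require access approval).

```python
# Minimal sketch (assumed template, fields, and settings): assembling moderation context
# into a prompt and generating a contextualized counterspeech reply with an LLM.
from transformers import pipeline

# Hypothetical moderation context; these fields are illustrative, not the paper's schema.
context = {
    "topic": "online harassment",
    "user_history_summary": "The user has previously posted civil comments on this topic.",
    "toxic_post": "Example of a toxic post flagged for moderation.",
}

prompt = (
    "Write a brief, respectful reply (counterspeech) to the post below, "
    "adapted to the context provided.\n"
    f"Topic: {context['topic']}\n"
    f"User history: {context['user_history_summary']}\n"
    f"Post: {context['toxic_post']}\n"
    "Reply:"
)

# Any instruction-tuned causal LM could be substituted here.
generator = pipeline("text-generation", model="meta-llama/Llama-2-13b-chat-hf")
output = generator(prompt, max_new_tokens=120, do_sample=True, temperature=0.7)
print(output[0]["generated_text"])
```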
... Several researchers have used LLMs as mediators to improve online argumentation. For example, Govers et al. [33] conducted an experiment with American participants, where people reviewed polarizing online threads containing comments from both public and LLM-based mediators. They found that highly cooperative and persuasive strategies deployed by mediator-bots could successfully change readers' opinions on polarizing issues. ...
Preprint
Full-text available
This paper examines how large language models (LLMs) can help people write constructive comments in online debates on divisive social issues and whether the notions of constructiveness vary across cultures. Through controlled experiments with 600 participants from India and the US, who reviewed and wrote constructive comments on online threads on Islamophobia and homophobia, we found potential misalignment in how LLMs and humans perceive constructiveness in online comments. While the LLM was more likely to view dialectical comments as more constructive, participants favored comments that emphasized logic and facts more than the LLM did. Despite these differences, participants rated LLM-generated and human-AI co-written comments as significantly more constructive than those written independently by humans. Our analysis also revealed that LLM-generated and human-AI co-written comments exhibited more linguistic features associated with constructiveness compared to human-written comments on divisive topics. When participants used LLMs to refine their comments, the resulting comments were longer, more polite, positive, less toxic, and more readable, with added argumentative features that retained the original intent but occasionally lost nuances. Based on these findings, we discuss ethical and design considerations in using LLMs to facilitate constructive discourse online.
Preprint
Full-text available
As artificial intelligence (AI) technologies, including generative AI, continue to evolve, concerns have arisen about over-reliance on AI, which may lead to human deskilling and diminished cognitive engagement. Over-reliance on AI can also lead users to accept information given by AI without critical examination, causing negative consequences such as misleading users with hallucinated content. This paper introduces extraheric AI, a human-AI interaction conceptual framework that fosters users' higher-order thinking skills, such as creativity, critical thinking, and problem-solving, during task completion. Unlike existing human-AI interaction designs, which replace or augment human cognition, extraheric AI fosters cognitive engagement by posing questions or providing alternative perspectives to users, rather than direct answers. We discuss interaction strategies, evaluation methods aligned with cognitive load theory and Bloom's taxonomy, and future research directions to ensure that human cognitive skills remain a crucial element in AI-integrated environments, promoting a balanced partnership between humans and AI.
Article
Full-text available
Chatbots have become an increasingly popular tool in the field of health services and communications. Despite chatbots' significance amid the COVID-19 pandemic, few studies have performed a rigorous evaluation of the effectiveness of chatbots in improving vaccine confidence and acceptance. In Thailand, Hong Kong, and Singapore, from February 11th to June 30th, 2022, we conducted multisite randomised controlled trials (RCTs) on 2,045 adult guardians of children and seniors who were unvaccinated or had delayed vaccinations. After a week of using COVID-19 vaccine chatbots, the differences in vaccine confidence and acceptance were compared between the intervention and control groups. Compared to non-users, fewer chatbot users reported decreased confidence in vaccine effectiveness in the Thailand child group [Intervention: 4.3% vs. Control: 17%, P = 0.023]. However, more chatbot users reported decreased vaccine acceptance [26% vs. 12%, P = 0.028] in the Hong Kong child group and decreased vaccine confidence in safety [29% vs. 10%, P = 0.041] in the Singapore child group. There was no statistically significant change in vaccine confidence or acceptance in the Hong Kong senior group. Employing the RE-AIM framework, process evaluation indicated strong acceptance and implementation support for vaccine chatbots from stakeholders, with high levels of sustainability and scalability. This multisite, parallel RCT study on vaccine chatbots found mixed success in improving vaccine confidence and acceptance among unvaccinated Asian subpopulations. Further studies that link chatbot usage and real-world vaccine uptake are needed to augment evidence for employing vaccine chatbots to advance vaccine confidence and acceptance.
Conference Paper
Full-text available
Intelligent agents are showing increasing promise for clinical decision-making in a variety of healthcare settings. While a substantial body of work has contributed to the best strategies to convey these agents' decisions to clinicians, few have considered the impact of personalizing and customizing these communications on the clinicians' performance and receptiveness. This raises the question of how intelligent agents should adapt their tone in accordance with their target audience. We designed two approaches to communicate the decisions of an intelligent agent for breast cancer diagnosis with different tones: a suggestive (non-assertive) tone and an imposing (assertive) one. We used an intelligent agent to inform clinicians about: (1) the number of detected findings; (2) cancer severity on each breast and per medical imaging modality; (3) a visual scale representing severity estimates; (4) the sensitivity and specificity of the agent; and (5) clinical arguments of the patient, such as pathological co-variables. Our results demonstrate that assertiveness plays an important role in how this communication is perceived and in its benefits. We show that personalizing assertiveness according to the professional experience of each clinician can reduce medical errors and increase satisfaction, bringing a novel perspective to the design of adaptive communication between intelligent agents and clinicians.
Article
Full-text available
Sharing multimedia content without obtaining consent from the people involved causes multiparty privacy conflicts (MPCs). However, social-media platforms do not proactively protect users from the occurrence of MPCs. Hence, users resort to out-of-band, informal communication channels, attempting to mitigate such conflicts. So far, previous works have focused on hard interventions that do not adequately consider the contextual factors (e.g., social norms, cognitive priming) or are employed too late (i.e., the content has already been seen). In this work, we investigate the potential of conversational agents as a medium for negotiating and mitigating MPCs. We designed MediationBot, a mediator chatbot that encourages consent collection, enables users to explain their points of view, and proposes solutions for finding a middle ground. We evaluated our design using a Wizard-of-Oz experiment with N = 32 participants, where we found that MediationBot can effectively help participants to reach an agreement and to prevent MPCs. It produced a structured conversation where participants had well-clarified speaking turns. Overall, our participants found MediationBot to be supportive as it proposes useful middle-ground solutions. Our work informs the future design of mediator agents to support social-media users against MPCs.
Conference Paper
Full-text available
Chatbots are increasingly used to replace human interviewers and survey forms for soliciting information from users. This paper presents two studies that investigate how the formality of a chatbot’s conversational style can affect the likelihood of users engaging with and disclosing sensitive information to a chatbot. In our first study, we show that the domain and sensitivity of the information being requested impact users’ preferred conversational style. Specifically, when users were asked to disclose sensitive health information, they perceived a formal style as more competent and appropriate. In our second study, we investigate the health domain further by analysing the quality of user utterances as users talk to a chatbot about their dental flossing. We found that users who do not floss every day gave higher quality responses when talking to a formal chatbot. These findings can help designers choose a chatbot’s language formality for their given use case.
Article
Full-text available
According to Mercier and Sperber (2009, 2011, 2017), people have an immediate and intuitive feeling about the strength of an argument. These intuitive evaluations are not captured by current evaluation methods of argument strength, yet they could be important to predict the extent to which people accept the claim supported by the argument. In an exploratory study, therefore, a newly developed intuitive evaluation method to assess argument strength was compared to an explicit argument strength evaluation method (the PAS scale; Zhao et al., 2011), on their ability to predict claim acceptance (predictive validity) and on their sensitivity to differences in the manipulated quality of arguments (construct validity). An experimental study showed that the explicit argument strength evaluation performed well on the two validity measures. The intuitive evaluation measure, on the other hand, was not found to be valid. Suggestions for other ways of constructing and testing intuitive evaluation measures are presented.
Article
Full-text available
Social conformity is the act of individuals adjusting their personal opinions to agree with an opposing majority. Previous work has identified multiple determinants of social conformity in controlled laboratory studies, but they remain largely untested in naturalistic online environments. For this study, we developed a realistic debating website, which 48 participants used for one week. We deployed four versions of the website using a 2 (high vs. low social presence) x 2 (high vs. low emphasis on majority–minority group composition) between-subjects factorial design. We found that participants were significantly more likely to conform when the platform promotes high social presence, despite its emphasis on group composition. Our qualitative findings further reveal how different aspects of social presence embedded in platform design (i.e., user representation, interactivity, and response visibility) contribute to heightened conformity behaviour. Our results provide evidence of the organic manifestation of conformity in online groups discussing subjective content and confirm the effect of platform design on online conformity behaviour. We conclude with a discussion on the implications of our findings on how future online platforms can be designed accounting for conformity influences.
Article
Full-text available
People often seek out information as a means of coping with challenging situations. Attuning to negative information can be adaptive because it alerts people to the risks in their environment, thereby preparing them for similar threats in the future. But is this behaviour adaptive during a pandemic when bad news is ubiquitous? We examine the emotional consequences of exposure to brief snippets of COVID-related news via a Twitter feed (Study 1), or a YouTube reaction video (Study 2). Compared to a no-information exposure group, consumption of just 2–4 minutes of COVID-related news led to immediate and significant reductions in positive affect (Studies 1 and 2) and optimism (Study 2). Exposure to COVID-related kind acts did not have the same negative consequences, suggesting that not all social media exposure is detrimental for well-being. We discuss strategies to counteract the negative emotional consequences of exposure to negative news on social media.
Article
Full-text available
Agentic narcissism and vulnerable narcissism have been widely studied in relation to social media use. However, with research on communal narcissism in its early stages, the current study examines communal narcissism in relation to social media use. Specifically, the current study investigates whether communal narcissism is related to use and frequency of use of the popular social networking sites Instagram, Reddit and Twitter, and if communal narcissism relates to the importance of receiving feedback and to the quality-rating of self-presented content on those platforms. A total of 334 individuals were recruited from Amazon Mechanical Turk, with two-thirds being male (66.7%). A regression analysis showed that communal narcissism was related to increased use of Instagram and Twitter but not Reddit. Sharing content, the importance of feedback and better than average ratings had positive associations with communal narcissism. The relationship between communal narcissism and sharing on social media was fully mediated by wanting validation on social media and higher ratings of self-presented content. Communal narcissism had a notably strong relationship with wanting validation on all platforms and our results suggest that communal narcissism might be especially relevant in the context of social media use.
Chapter
Full-text available
Social conformity is the act of individuals adjusting personal judgements to conform to expectations of opposing majorities in group settings. While conformity has been studied in online groups with emphasis on its contextual determinants (e.g., group size, social presence, task objectivity), the effect of age – of both the individual and the members of the opposing majority group – is yet to be thoroughly investigated. This study investigates differences in conformity behaviour in young adults (Generation Z) and middle-aged adults (Generation X) attempting an online group quiz containing stereotypically age-biased questions, when their personal responses are challenged by older and younger peers. Our results indicate the influence of age-related stereotypes on participants’ conformity behaviour with both young and middle-aged adults stereotypically perceiving the competency of their peers based on peer age. Specifically, participants were more inclined to conform to older majorities and younger majorities in quiz questions each age group was stereotypically perceived to be more knowledgeable about (1980’s history and social media & latest technology respectively). We discuss how our findings highlight the need to re-evaluate popular online user representations, to mitigate undesirable effects of age-related stereotypical perceptions leading to conformity.
Article
Full-text available
Online chat functions as a discussion channel for diverse social issues. However, deliberative discussion and consensus-reaching can be difficult in online chats in part because of the lack of structure. To explore the feasibility of a conversational agent that enables deliberative discussion, we designed and developed DebateBot, a chatbot that structures discussion and encourages reticent participants to contribute. We conducted a 2 (discussion structure: unstructured vs. structured) × 2 (discussant facilitation: unfacilitated vs. facilitated) between-subjects experiment (N = 64, 12 groups). Our findings are as follows: (1) Structured discussion positively affects discussion quality by generating diverse opinions within a group and resulting in a high level of perceived deliberative quality. (2) Facilitation drives a high level of opinion alignment between group consensus and independent individual opinions, resulting in authentic consensus reaching. Facilitation also drives more even contribution and a higher level of task cohesion and communication fairness. Our results suggest that a chatbot agent could partially substitute for a human moderator in deliberative discussions.
Article
Full-text available
This study estimates empirically derived guidelines for effect size interpretation for research in social psychology overall and sub-disciplines within social psychology, based on analysis of the true distributions of the two types of effect size measures widely used in social psychology (correlation coefficient and standardized mean differences). Analysis of empirically derived distributions of 12,170 correlation coefficients and 6,447 Cohen's d statistics extracted from studies included in 134 published meta-analyses revealed that the 25th, 50th, and 75th percentiles corresponded to correlation coefficient values of 0.12, 0.24, and 0.41 and to Cohen's d values of 0.15, 0.36, and 0.65 respectively. The analysis suggests that the widely used Cohen's guidelines tend to overestimate medium and large effect sizes. Empirically derived effect size distributions in social psychology overall and its sub-disciplines can be used both for effect size interpretation and for sample size planning when other information about effect size is not available.
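As a small illustration of how the percentile benchmarks quoted in this abstract might be applied, the sketch below labels an observed effect size relative to the 25th, 50th, and 75th percentiles; the thresholds are the values reported above, while the labeling function itself is only a hypothetical convenience.

```python
# Illustrative only: placing an observed effect size against the empirically derived
# percentile benchmarks reported in the abstract above.

R_BENCHMARKS = (0.12, 0.24, 0.41)  # correlation coefficients at the 25th/50th/75th percentiles
D_BENCHMARKS = (0.15, 0.36, 0.65)  # Cohen's d at the 25th/50th/75th percentiles

def label_effect(value, benchmarks):
    """Label an observed effect relative to the given 25th/50th/75th percentile benchmarks."""
    q25, q50, q75 = benchmarks
    if value < q25:
        return "below the 25th percentile (relatively small)"
    if value < q50:
        return "between the 25th and 50th percentiles"
    if value < q75:
        return "between the 50th and 75th percentiles"
    return "above the 75th percentile (relatively large)"

print(label_effect(0.30, R_BENCHMARKS))  # an observed correlation of r = .30
print(label_effect(0.30, D_BENCHMARKS))  # the same value read as Cohen's d
```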
Article
Full-text available
Research in online content moderation has a long history of exploring different forms that moderation can take, including both user-driven moderation models on community-based platforms like Wikipedia, Facebook Groups, and Reddit, and centralized corporate moderation models on platforms like Twitter and Instagram. In this work I review different approaches to moderation research with the goal of providing a roadmap for researchers studying community self-moderation. I contrast community-based moderation research with platforms and policies-focused moderation research, and argue that the former has an important role to play in shaping discussions about the future of online moderation. I provide six guiding questions for future research that, if answered, can support the development of a form of user-driven moderation that is widely implementable across a variety of social spaces online, offering an alternative to the corporate moderation models that dominate public debate and discussion.
Article
Full-text available
People make decisions every day or form an opinion based on persuasion processes, whether through advertising, planning leisure activities with friends or public speeches. Most of the time, however, subliminal persuasion processes triggered by behavioral cues (rather than the content of the message) play a far more important role than most people are aware of. To raise awareness of the different aspects of persuasion (how and what), we present a multimodal dialog system consisting of two virtual agents that use synthetic speech in a discussion setting to present pros and cons to a user on a controversial topic. The agents are able to adapt their emotions based on explicit feedback of the users to increase their perceived persuasiveness during interaction using Reinforcement Learning.
Article
Full-text available
Social conformity occurs when individuals in group settings change their personal opinion to be in agreement with the majority's position. While recent literature frequently reports on conformity in online group settings, the causes for online conformity are yet to be fully understood. This study aims to understand how social presence, i.e., the sense of being connected to others via mediated communication, influences conformity among individuals placed in online groups while answering subjective and objective questions. Acknowledging its multifaceted nature, we investigate three aspects of online social presence: user representation (generic vs. user-specific avatars), interactivity (discussion vs. no discussion), and response visibility (public vs. private). Our results show an overall conformity rate of 30% and main effects from task objectivity, group size difference between the majority and the minority, and self-confidence on personal answer. Furthermore, we observe an interaction effect between interactivity and response visibility, such that conformity is highest in the presence of peer discussion and public responses, and lowest when these two elements are absent. We conclude with a discussion on the implications of our findings in designing online group settings, accounting for the effects of social presence on conformity.
Article
Full-text available
Extreme, anti-establishment actors are being characterized increasingly as ‘dangerous individuals’ by the social media platforms that once aided in making them into ‘Internet celebrities’. These individuals (and sometimes groups) are being ‘deplatformed’ by the leading social media companies such as Facebook, Instagram, Twitter and YouTube for such offences as ‘organised hate’. Deplatforming has prompted debate about ‘liberal big tech’ silencing free speech and taking on the role of editors, but also about the questions of whether it is effective and for whom. The research reported here follows certain of these Internet celebrities to Telegram as well as to a larger alternative social media ecology. It enquires empirically into some of the arguments made concerning whether deplatforming ‘works’ and how the deplatformed use Telegram. It discusses the effects of deplatforming for extreme Internet celebrities, alternative and mainstream social media platforms and the Internet at large. It also touches upon how social media companies’ deplatforming is affecting critical social media research, both into the substance of extreme speech as well as its audiences on mainstream as well as alternative platforms.
Article
Full-text available
This article examines the experiences of people with disabilities, a traditionally marginalized group in US politics, with social media platforms during the 2016 presidential election. Using focus groups with participants with a wide range of disabilities, the significance of YouTube, Twitter, and Facebook is discussed. Results highlight ambivalent experiences with these platforms, which support some elements of political inclusion (more accessible and more relevant election information) but at the same time also exacerbate aspects of marginality (stress, anxiety, isolation). Four coping strategies devised by participants to address digital stress (self-censorship, unfollowing/unfriending social media contacts, signing off, and taking medication) are illustrated. The relationship between these contrasting findings, social media design and affordances, as well as potential strategies to eliminate an emerging trade-off between discussing politics online and preserving mental health and social connectedness for people with disabilities are discussed.
Article
Full-text available
AI-mediated communication (AI-MC) represents a new paradigm where communication is augmented or generated by an intelligent system. As AI-MC becomes more prevalent, it is important to understand the effects that it has on human interactions and interpersonal relationships. Previous work tells us that in human interactions with intelligent systems, misattribution is common and trust is developed and handled differently than in interactions between humans. This study uses a 2 (successful vs. unsuccessful conversation) x 2 (standard vs. AI-mediated messaging app) between subjects design to explore whether AI mediation has any effects on attribution and trust. We show that the presence of AI-generated smart replies serves to increase perceived trust between human communicators and that, when things go awry, the AI seems to be perceived as a coercive agent, allowing it to function like a moral crumple zone and lessen the responsibility assigned to the other human communicator. These findings suggest that smart replies could be used to improve relationships and perceptions of conversational outcomes between interlocutors. Our findings also add to existing literature regarding perceived agency in smart agents by illustrating that in this type of AI-MC, the AI is considered to have agency only when communication goes awry.
Article
Full-text available
Social conformity occurs when an individual changes their behaviour in line with the majority's expectations. Although social conformity has been investigated in small group settings, the effect of gender - of both the individual and the majority/minority - is not well understood in online settings. Here we systematically investigate the impact of groups' gender composition on social conformity in online settings. We use an online quiz in which participants submit their answers and confidence scores, both prior to and following the presentation of peer answers that are dynamically fabricated. Our results show an overall conformity rate of 39%, and a significant effect of gender that manifests in a number of ways: gender composition of the majority, the perceived nature of the question, participant gender, visual cues of the system, and final answer correctness. We conclude with a discussion on the implications of our findings in designing online group settings, accounting for the effects of gender on conformity.
Article
Full-text available
Effect sizes are the currency of psychological research. They quantify the results of a study to answer the research question and are used to calculate statistical power. The interpretation of effect sizes—when is an effect small, medium, or large?—has been guided by the recommendations Jacob Cohen gave in his pioneering writings starting in 1962: Either compare an effect with the effects found in past research or use certain conventional benchmarks. The present analysis shows that neither of these recommendations is currently applicable. From past publications without pre-registration, 900 effects were randomly drawn and compared with 93 effects from publications with pre-registration, revealing a large difference: Effects from the former (median r = 0.36) were much larger than effects from the latter (median r = 0.16). That is, certain biases, such as publication bias or questionable research practices, have caused a dramatic inflation in published effects, making it difficult to compare an actual effect with the real population effects (as these are unknown). In addition, there were very large differences in the mean effects between psychological sub-disciplines and between different study designs, making it impossible to apply any global benchmarks. Many more pre-registered studies are needed in the future to derive a reliable picture of real population effects.
Article
Full-text available
We conducted preregistered replications of 28 classic and contemporary published findings, with protocols that were peer reviewed in advance, to examine variation in effect magnitudes across samples and settings. Each protocol was administered to approximately half of 125 samples that comprised 15,305 participants from 36 countries and territories. Using the conventional criterion of statistical significance (p < .05), we found that 15 (54%) of the replications provided evidence of a statistically significant effect in the same direction as the original finding. With a strict significance criterion (p < .0001), 14 (50%) of the replications still provided such evidence, a reflection of the extremely high-powered design. Seven (25%) of the replications yielded effect sizes larger than the original ones, and 21 (75%) yielded effect sizes smaller than the original ones. The median comparable Cohen's ds were 0.60 for the original findings and 0.15 for the replications. The effect sizes were small (< 0.20) in 16 of the replications (57%), and 9 effects (32%) were in the direction opposite the direction of the original effect. Across settings, the Q statistic indicated significant heterogeneity in 11 (39%) of the replication effects, and most of those were among the findings with the largest overall effect sizes; only 1 effect that was near zero in the aggregate showed significant heterogeneity according to this measure. Only 1 effect had a tau value greater than .20, an indication of moderate heterogeneity. Eight others had tau values near or slightly above .10, an indication of slight heterogeneity. Moderation tests indicated that very little heterogeneity was attributable to the order in which the tasks were performed or whether the tasks were administered in lab versus online. Exploratory comparisons revealed little heterogeneity between Western, educated, industrialized, rich, and democratic (WEIRD) cultures and less WEIRD cultures (i.e., cultures with relatively high and low WEIRDness scores, respectively). Cumulatively, variability in the observed effect sizes was attributable more to the effect being studied than to the sample or setting in which it was studied.
Chapter
Full-text available
Chatbots are a rapidly expanding application of dialogue systems with companies switching to bot services for customer support, and new applications for users interested in casual conversation. One style of casual conversation is argument; many people love nothing more than a good argument. Moreover, there are a number of existing corpora of argumentative dialogues, annotated for agreement and disagreement, stance, sarcasm and argument quality. This paper introduces Debbie, a novel arguing bot, that selects arguments from conversational corpora, and aims to use them appropriately in context. We present an initial working prototype of Debbie, with some preliminary evaluation and describe future work.
Article
Full-text available
The links between protests and state responses have taken on increased visibility in light of the Arab Spring movements. But we still have unanswered questions about the relationship between protest behaviors and responses by the state. We frame this in terms of concession and disruption costs. Costs are typically defined as government behaviors that impede dissidents’ capacity for collective action. We change this causal arrow and hypothesize how dissidents can generate costs that structure the government's response to a protest. By disaggregating costs along dimensions of concession and disruption we extend our understanding of protest behaviors and the conditions under which they are more (or less) effective. Utilizing a new cross-national protest-event data set, we test our theoretical expectations against protests from 1990 to 2014 and find that when protesters generate high concession costs, the state responds in a coercive manner. Conversely, high disruption costs encourage the state to accommodate demands. Our research provides substantial insights and inferences about the dynamics of government response to protest.
Article
Full-text available
Effective task management is essential to successful team collaboration. While the past decade has seen considerable innovation in systems that track and manage group tasks, these innovations have typically been outside of the principal communication channels: email, instant messenger, and group chat. Teams formulate, discuss, refine, assign, and track the progress of their collaborative tasks over electronic communication channels, yet they must leave these channels to update their task-tracking tools, creating a source of friction and inefficiency. To address this problem, we explore how bots might be used to mediate task management for individuals and teams. We deploy a prototype bot to eight different teams of information workers to help them create, assign, and keep track of tasks, all within their main communication channel. We derived seven insights for the design of future bots for coordinating work.
Article
Full-text available
In this article, we take issue with the claim by Sunstein and others that online discussion takes place in echo chambers, and suggest that the dynamics of online debates could be more aptly described by the logic of ‘trench warfare’, in which opinions are reinforced through contradiction as well as confirmation. We use a unique online survey and an experimental approach to investigate and test echo chamber and trench warfare dynamics in online debates. The results show that people do indeed claim to discuss with those who hold opposite views from themselves. Furthermore, our survey experiments suggest that both confirming and contradicting arguments have similar effects on attitude reinforcement. Together, this indicates that both echo chamber and trench warfare dynamics – a situation where attitudes are reinforced through both confirmation and disconfirmation biases – characterize online debates. However, we also find that two-sided neutral arguments have weaker effects on reinforcement than one-sided confirming and contradicting arguments, suggesting that online debates could contribute to collective learning and qualification of arguments.
Article
Full-text available
Test-retest reliabilities, internal consistencies, and convergent test validities were examined for four measures of interpersonal behavior in handling conflict. Subjects were 86 graduate students in management. Instruments were those developed by Blake and Mouton, Lawrence and Lorsch, Hall, and by Thomas and Kilmann. Reliabilities were in the low-to-moderate range, with more recent instruments somewhat superior. Some problems with the first two measures were observed. The two most recent instruments, by Hall and by Thomas and Kilmann, show some convergence across all five modes of handling conflict. Convergence among other instruments varies by mode of handling conflict. Inspection of items suggests some reasons for the limited convergence.
Article
A conversational agent (CA) effectively facilitates online group discussions at scale. However, users may have expectations about how well the CA would perform that do not match the actual performance, compromising technology acceptance. We built a facilitator CA that detects a member who has low contribution during a synchronous group chat discussion and asks the person to participate more. We designed three techniques to set end-user expectations about how accurately the CA identifies an under-contributing member: 1) information: explicitly communicating the accuracy of the detection algorithm, 2) explanation: providing an overview of the algorithm and the data used for the detection, and 3) adjustment: enabling users to gain a feeling of control over the algorithm. We conducted an online experiment with 163 crowdworkers in which each group completed a collaborative decision-making task and experienced one of the techniques. Through surveys and interviews, we found that the explanation technique was the most effective strategy overall as it reduced user embarrassment, increased the perceived intelligence of the CA, and helped users better understand the detection algorithm. In contrast, the information technique reduced members' contributions and the adjustment technique led to a more negative perceived discussion experience. We also discovered that the interactions with other team members diluted the effects of the techniques on users' performance expectations and acceptance of the CA. We discuss implications for better designing expectation-setting techniques for AI-team collaboration such as ways to improve collaborative decision outcomes and quality of contributions.
Article
Artificial Intelligence (AI) is a transformative force in communication and messaging strategy, with potential to disrupt traditional approaches. Large language models (LLMs), a form of AI, are capable of generating high-quality, humanlike text. We investigate the persuasive quality of AI-generated messages to understand how AI could impact public health messaging. Specifically, through a series of studies designed to characterize and evaluate generative AI in developing public health messages, we analyze COVID-19 pro-vaccination messages generated by GPT-3, a state-of-the-art instantiation of a large language model. Study 1 is a systematic evaluation of GPT-3's ability to generate pro-vaccination messages. Study 2 then observed people's perceptions of curated GPT-3-generated messages compared to human-authored messages released by the CDC (Centers for Disease Control and Prevention), finding that GPT-3 messages were perceived as more effective and stronger arguments and evoked more positive attitudes than CDC messages. Finally, Study 3 assessed the role of source labels on perceived quality, finding that while participants preferred AI-generated messages, they expressed dispreference for messages that were labeled as AI-generated. The results suggest that, with human supervision, AI can be used to create effective public health messages, but that individuals prefer their public health messages to come from human institutions rather than AI sources. We propose best practices for assessing generative outputs of large language models in future social science research and ways health professionals can use AI systems to augment public health messaging.
Article
Discussions about polarizing topics are essential to have, yet they can easily become hostile, aggressive, or distressing on current social media platforms. Content moderation interventions aim to mitigate this issue, though such approaches are reactive, removing harmful content only after it has been posted. We conducted a mixed-methods experiment with 40 participants to investigate how a design friction that manipulates the temporal flow during a contentious conversation can foster interpersonal mindfulness, a trait critical for productive communication. Dyads were randomly assigned to the Control Group, which received no intervention, and the Experiment Group, where participants were limited to sending one message per two-minute interval. Triangulating quantitative and qualitative data from conversation logs, questionnaires, interviews, and computational text analysis, our findings revealed a two-fold effect: Experiment Group participants felt simultaneously frustrated by the intervention as it disrupted the pacing of their conversation and interfered with rapport-building, and appreciative of the intervention as it nudged them towards writing thoughtful and task-focused messages. We discuss implications of these findings for future investigation into the design of temporal interventions to influence interpersonal mindfulness during polarizing online conversations.
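The design friction in the Experiment Group (one message per two-minute interval) can be stated concretely; the sketch below shows one plausible way a chat backend could enforce it. The function names and in-memory bookkeeping are hypothetical and not the study's actual implementation.

```python
# Hypothetical sketch: allow each participant at most one message per two-minute window,
# mirroring the Experiment Group's design friction described above.
import time

WINDOW_SECONDS = 120  # one message per two minutes
last_sent = {}  # participant id -> timestamp of that participant's last delivered message

def deliver(message):
    """Stand-in for the chat backend's actual delivery hook."""
    print(f"delivered: {message}")

def try_send(participant_id, message, now=None):
    """Deliver the message only if the participant's two-minute window has elapsed."""
    now = time.time() if now is None else now
    last = last_sent.get(participant_id)
    if last is not None and now - last < WINDOW_SECONDS:
        return False  # throttled: the UI would ask the sender to wait before resending
    last_sent[participant_id] = now
    deliver(message)
    return True

# Example: the second message inside the window is throttled.
print(try_send("p1", "first message", now=0))       # True
print(try_send("p1", "too soon", now=60))           # False
print(try_send("p1", "after the window", now=130))  # True
```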
Article
Social media is a modern person's digital voice to project and engage with new ideas and mobilise communities—a power shared with extremists. Given the societal risks of unvetted content-moderating algorithms for Extremism, Radicalisation, and Hate speech (ERH) detection, responsible software engineering must understand the who, what, when, where, and why such models are necessary to protect user safety and free expression. Hence, we propose and examine the unique research field of ERH context mining to unify disjoint studies. Specifically, we evaluate the start-to-finish design process from socio-technical definition-building and dataset collection strategies to technical algorithm design and performance. Our 2015-2021 51-study Systematic Literature Review (SLR) provides the first cross-examination of textual, network, and visual approaches to detecting extremist affiliation, hateful content, and radicalisation towards groups and movements. We identify consensus-driven ERH definitions and propose solutions to existing ideological and geographic biases, particularly due to the lack of research in Oceania/Australasia. Our hybridised investigation on Natural Language Processing, Community Detection, and visual-text models demonstrates the dominating performance of textual transformer-based algorithms. We conclude with vital recommendations for ERH context mining researchers and propose an uptake roadmap with guidelines for researchers, industries, and governments to enable a safer cyberspace.
Article
Major depression constitutes a serious challenge in personal and public health. Tens of millions of people each year suffer from depression and only a fraction receives adequate treatment. We explore the potential to use social media to detect and diagnose major depressive disorder in individuals. We first employ crowdsourcing to compile a set of Twitter users who report being diagnosed with clinical depression, based on a standard psychometric instrument. Through their social media postings over a year preceding the onset of depression, we measure behavioral attributes relating to social engagement, emotion, language and linguistic styles, ego network, and mentions of antidepressant medications. We leverage these behavioral cues to build a statistical classifier that provides estimates of the risk of depression before the reported onset. We find that social media contains useful signals for characterizing the onset of depression in individuals, as measured through decrease in social activity, raised negative affect, highly clustered ego networks, heightened relational and medicinal concerns, and greater expression of religious involvement. We believe our findings and methods may be useful in developing tools for identifying the onset of major depression, for use by healthcare agencies; or on behalf of individuals, enabling those suffering from depression to be more proactive about their mental health.
Article
Over the past few years, new ideological movements like the Alt-Right have captured the attention and concern of both mainstream media, policy makers, and scholars alike. Today, the methods by which right-wing extremists are radicalized are increasingly taking place within social media platforms and online communities. However, no research has yet investigated methods for proactively detecting online communities that may be displaying overall warning signs of mass ongoing ideological and political radicalization. In our work, we use a variety of text analysis methods to investigate the behavioral patterns of a radical right-wing community on Reddit (r/altright) over a 6-month period until right before it was banned for violation of Reddit terms of service. We find that this community showed aggregated behavioral patterns that aligned with past literature on warning behaviors of individual extremists in online environments, and that these behavioral patterns were not seen in a comparison group of eight other online political communities, similar in size and user engagement. Our research helps build upon the established literature on the detection of extremism in online environments, and has implications for proactive monitoring of online communities.
Article
General paranoia is the term that best describes a user's social media experience. The spaces we go to socialize online are full of suspicion, potential bad-faith actors, and advertisements that seem to know your every move. This attention-grabbing, habit-forming culture is sold on dreams of limitless love between family, friends, and community (as promised by the Facebook slogan "bring the world closer together"). Genuine connection can be found online, but the heart of this network lies outside fiber optic cables.
Article
The recent rise of political extremity and radicalization has presented unique challenges to contemporary politics, governance, and social cohesion in many societies in the world. In this study, we propose an imagined audience approach to understand how social media’s expanded expression capabilities are related to users’ political extremity and reduced network interaction. We demonstrate the usefulness of our imagined audience framework using a multi-country survey data set. Results from the United States, South Korea, and Japan reveal that expressive use of social media is associated with more extreme political attitudes and heightened intolerance, but the effect is contingent on whom the expresser has in mind as their audience. In particular, expressing one’s political self has a depolarizing effect for those expecting low audience reinforcement. We test the boundary conditions of our model in a more restricted information environment with a Chinese sample. We conclude by discussing the significance of our imagined audience approach and its relevance to today’s technology-mediated self-presentation.
Article
This study explores the effects of gender stereotypes on evaluating artificial intelligence (AI) recommendations. We predict that gender stereotypes will affect human-AI interactions, resulting in somewhat different persuasive effects of AI recommendations for utilitarian vs. hedonic products. We found that participants in the male AI agent condition gave higher competence scores than in the female AI agent condition. Contrariwise, perceived warmth was higher in the female AI agent condition than in the male condition. More importantly, a significant interaction effect between AI gender and product type was found, suggesting that participants showed more positive attitudes toward the AI recommendations when the male AI recommended a utilitarian (vs. hedonic) product. Conversely, a hedonic product was evaluated more positively when advised by the female (vs. male) AI agent.
Article
Political polarization in the digital sphere poses a real challenge to many democracies around the world. Although the issue has received some scholarly attention, there is a need to improve conceptual precision in an increasingly blurred debate. The use of computational communication science approaches allows us to track political conversations in a fine-grained manner within their natural settings – the realm of interactive social media. The present study combines different algorithmic approaches to studying social media data to capture both the interactional structure and the content of dynamic political talk online. We conducted an analysis of political polarization across social media platforms (Facebook, Twitter, and WhatsApp) over 16 months, covering close to a quarter million online contributions regarding a political controversy in Israel. Our comprehensive measurement of interactive political talk enables us to address three key aspects of political polarization: (1) interactional polarization – homophilic versus heterophilic user interactions; (2) positional polarization – the positions expressed; and (3) affective polarization – the emotions and attitudes expressed. Our findings indicate that political polarization on social media cannot be conceptualized as a unified phenomenon, as there are significant cross-platform differences. While interactions on Twitter largely conform to established expectations (homophilic interaction patterns, aggravating positional polarization, pronounced inter-group hostility), on WhatsApp, depolarization occurred over time. Surprisingly, Facebook was found to be the least homophilic platform in terms of interactions, positions, and emotions expressed. Our analysis points to key conceptual distinctions and raises important questions about the drivers and dynamics of political polarization online.
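As one minimal illustration of how the interactional dimension of polarization might be quantified, the sketch below computes a simple homophily score over user-to-user interactions with known camp assignments. The function, data layout, and example values are assumptions, not the measurement pipeline used in the cited study.

```python
# Sketch: a simple homophily score for "interactional polarization".
# Each interaction is (source_user, target_user); camp assignments are assumed
# to come from a separate positional coding step. All names are illustrative.
def homophily_score(interactions, camp_of):
    """Fraction of interactions that stay within the same camp.
    1.0 = fully homophilic (polarized), 0.0 = fully cross-cutting."""
    same = cross = 0
    for src, dst in interactions:
        if src in camp_of and dst in camp_of:
            if camp_of[src] == camp_of[dst]:
                same += 1
            else:
                cross += 1
    total = same + cross
    return same / total if total else float("nan")

# Example: three within-camp replies and one cross-camp reply -> 0.75
camps = {"a": "pro", "b": "pro", "c": "con", "d": "con"}
edges = [("a", "b"), ("b", "a"), ("c", "d"), ("a", "c")]
print(homophily_score(edges, camps))
```

Tracking this score per platform and per month would support the kind of cross-platform, over-time comparison the abstract reports.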
Article
Despite decades of research concerning social conformity and its effects on face-to-face groups, it has yet to be comprehensively investigated in online contexts. In our work, we investigate the impact of contextual determinants (such as majority group size, the number of opposing minorities and their sizes, and the nature of the task) and personal determinants (such as self-confidence, personality, and gender) on online social conformity. To achieve this, we deployed an online quiz with subjective and objective multiple-choice questions. For each question, participants provided their answer and self-reported confidence. They were then shown a fabricated bar chart presenting the distribution of group answers across the answer options and positioning the participant in either the majority or a minority. Each question tested a unique group distribution in terms of the number of minorities opposing the majority and their corresponding group sizes. Participants were subsequently given the opportunity to change their answer and their reported confidence. Upon finishing the quiz, participants completed a personality test and took part in a semi-structured interview. Our results show that 78% of the participants conformed to the majority’s answers at least once during the quiz. Further analysis reveals that the tendency to conform was significantly higher for objective questions, especially when a participant was unsure of their answer and faced an opposing majority of significant size. While we saw no significant gender differences in conformity, participants with higher conscientiousness and neuroticism tended to conform more frequently than others. We conclude that online social conformity is a function of majority size, the nature of the task, self-confidence, and certain personality traits.
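The abstract names the determinants but not the statistical model; one plausible way to relate the binary decision to conform to majority size, question type, confidence, and personality traits is a logistic regression, sketched below. The data file and column names are assumptions for illustration only.

```python
# Sketch: modelling a binary "conformed" outcome as a function of the
# determinants named in the abstract. The data file and column names are
# illustrative assumptions, not the authors' exact analysis.
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical layout: one row per question per participant
df = pd.read_csv("conformity_trials.csv")
model = smf.logit(
    "conformed ~ majority_size + C(question_type) + confidence"
    " + conscientiousness + neuroticism",
    data=df,
).fit()
print(model.summary())
```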
Conference Paper
In this report, we review our early steps in leveraging the promise of Artificial Intelligence to potentially create more inclusive co-design experiences for marginalized populations. We discuss our research agenda and the first phases, in which we used Wizard of Oz prototypes to engage participants in an online co-design environment. We found that the types of interaction between the facilitator and the participant determined the degree of engagement with the co-design process.
Conference Paper
When a non-native speaker talks with a native speaker, he/she sometimes finds it hard to take speaking turns because of limited language proficiency, and the resulting conversation between the two is not always productive. In this paper, we propose a conversational agent that supports a non-native speaker in second-language conversation. The agent joins the conversation and intervenes using a simple script based on turn-taking rules: it takes a turn of its own and then gives the next turn to the non-native speaker, prompting him/her to speak. An evaluation of the proposed agent suggested that it successfully facilitated the non-native speaker's participation in over 30% of its interventions and significantly increased the frequency of turn-taking.
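The scripted intervention described here can be approximated with a couple of turn-taking rules: intervene when the native speaker spoke last and a pause opens up, then hand the floor to the non-native speaker. The sketch below is a loose illustration of that idea; the silence threshold, speaker labels, prompt wording, and the say callback are all assumptions, not the authors' script.

```python
# Loose illustration of a scripted turn-taking intervention. The threshold,
# speaker labels, prompts, and the `say` callback are illustrative assumptions.
import time

SILENCE_THRESHOLD_S = 2.0          # assumed pause length that signals an opening
last_speaker = None
last_speech_end = time.monotonic()

def on_utterance_end(speaker):
    """Call whenever a speaker ('native', 'non-native', or 'agent') finishes."""
    global last_speaker, last_speech_end
    last_speaker = speaker
    last_speech_end = time.monotonic()

def maybe_intervene(say):
    """If the native speaker spoke last and a pause has opened up, take the
    agent's turn and then yield the next turn to the non-native speaker.
    `say(text, addressee=None)` is assumed to be supplied by the caller."""
    silence = time.monotonic() - last_speech_end
    if last_speaker == "native" and silence >= SILENCE_THRESHOLD_S:
        say("That's an interesting point.")                            # agent takes its turn
        say("What do you think about that?", addressee="non-native")   # yields the floor
        on_utterance_end("agent")
```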
Article
This paper explores the ways in which online, anonymous interaction on the website 4chan.org can complicate traditionally situated discursive theory. Through an examination of the politically incorrect board (/pol/), it begins to reanalyze scale, alignment, and double-voicing approaches in ways that necessitate novel understandings of digitally placed discourse. The site engages these categories in distinctive ways, bringing together global and personal discourses through anonymity, geographically "situated" flag markers, and green-text narrative techniques, among others. The essay presents a number of examples, gathered from /pol/ through qualitatively oriented, inscriptive techniques, of discourses concerning the continuing European and American migration debates as they are explored by a globally situated digital community. Through banal, everyday engagements with both the material at hand and the website's features themselves, users craft new realizations of identity and interaction in a space that seeks to render everyone anonymous.
Conference Paper
Since its earliest days, harassment and abuse have plagued the Internet. Recent research has focused on in-domain methods to detect abusive content and faces several challenges, most notably the need to obtain large training corpora. In this paper, we introduce a novel computational approach to address this problem called Bag of Communities (BoC)---a technique that leverages large-scale, preexisting data from other Internet communities. We then apply BoC toward identifying abusive behavior within a major Internet community. Specifically, we compute a post's similarity to 9 other communities from 4chan, Reddit, Voat and MetaFilter. We show that a BoC model can be used on communities "off the shelf" with roughly 75% accuracy---no training examples are needed from the target community. A dynamic BoC model achieves 91.18% accuracy after seeing 100,000 human-moderated posts, and uniformly outperforms in-domain methods. Using this conceptual and empirical work, we argue that the BoC approach may allow communities to deal with a range of common problems, like abusive behavior, faster and with fewer engineering resources.
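The core BoC idea, representing a post by how strongly it resembles the language of several pre-existing communities, can be sketched as follows. The community corpora, feature representation, and choice of learner here are illustrative assumptions rather than the authors' exact pipeline.

```python
# Sketch of the Bag-of-Communities idea: score a post by its similarity to
# the language of several pre-existing communities, then use those scores as
# features. Corpora, features, and models are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def fit_community_models(community_posts, background_posts):
    """For each community, fit a classifier separating its posts from a shared
    background corpus; its probability output acts as a similarity score."""
    models = {}
    for name, posts in community_posts.items():
        texts = posts + background_posts
        labels = [1] * len(posts) + [0] * len(background_posts)
        vec = TfidfVectorizer(min_df=2)
        clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(texts), labels)
        models[name] = (vec, clf)
    return models

def boc_features(posts, models):
    """Each post -> a vector of similarity scores, one per source community."""
    return np.column_stack([clf.predict_proba(vec.transform(posts))[:, 1]
                            for vec, clf in models.values()])

def dynamic_boc(train_posts, train_labels, models):
    """'Dynamic' variant: learn from moderated posts in the target community,
    using the BoC similarity scores as features."""
    return LogisticRegression(max_iter=1000).fit(
        boc_features(train_posts, models), train_labels)
```

An "off the shelf" use would combine the community scores with a fixed rule and no target-community labels; the dynamic variant above refits as moderated examples arrive, mirroring the static/dynamic distinction in the abstract.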
Article
We examined the persuasive effects of ironic and sarcastic versus no humor appeals in health messages and the potential differential effects of ironic versus sarcastic humor. Findings of a controlled experiment (N = 303) suggested that sarcastic messages, as compared to no humor messages, resulted in less negative affect, more counterarguing, and decreased perceived argument strength. Ironic messages led to more counterarguing than no humor messages. Significant differences in counterarguing, perceived argument strength, and attitudes toward the risky behavior were detected between the two humor types. Counterarguing mediated the indirect effect of message type on attitudes toward the risky behavior.
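The mediation claim corresponds to an indirect effect: message type influences counterarguing (path a), which in turn influences attitudes (path b). A minimal bootstrap sketch of estimating that indirect effect follows; the column names and the binary coding of message type are assumptions, not the study's analysis code.

```python
# Sketch of a simple bootstrap test for an indirect (mediated) effect:
# message type -> counterarguing (path a) -> attitude (path b).
# Column names, binary coding, and the data are illustrative assumptions.
import numpy as np
import statsmodels.formula.api as smf

def indirect_effect(df):
    # assumes 'sarcastic' is coded 0/1 against a no-humor control
    a = smf.ols("counterarguing ~ sarcastic", data=df).fit()
    b = smf.ols("attitude ~ counterarguing + sarcastic", data=df).fit()
    return a.params["sarcastic"] * b.params["counterarguing"]

def bootstrap_ci(df, n_boot=2000, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    effects = [
        indirect_effect(df.sample(len(df), replace=True,
                                  random_state=int(rng.integers(1_000_000_000))))
        for _ in range(n_boot)
    ]
    return np.percentile(effects, [2.5, 97.5])
```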
Article
We report results from an exploratory analysis examining "last-minute" self-censorship, or content that is filtered after being written, on Facebook. We collected data from 3.9 million users over 17 days and associated self-censorship behavior with features describing users, their social graph, and the interactions between them. Our results indicate that 71% of users exhibited some level of last-minute self-censorship during the period, and they provide specific evidence supporting the theory that a user's "perceived audience" lies at the heart of the issue: posts are censored more frequently than comments, with status updates and posts directed at groups censored most frequently of all the sharing use cases investigated. Furthermore, we find that people with more boundaries to regulate censor more; males censor more posts than females, and censor even more posts when their friends are mostly male, but censor no more comments than females; people who exercise more control over their audience censor more content; and users with more politically and age-diverse friends censor less in general.
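The headline figures are simple aggregates over a composition log: the share of users with at least one self-censored item, and censorship rates by sharing use case. A small pandas sketch of such aggregates follows; the log file and column names are illustrative assumptions.

```python
# Sketch of the aggregates behind the reported figures: share of users with at
# least one self-censored item, and censorship rates by content type.
# The data file and column names are illustrative assumptions.
import pandas as pd

# hypothetical layout: one row per started composition, with columns
# 'user_id', 'content_type' (e.g. 'post', 'comment'), and 'censored' (0/1,
# where 1 means the draft was abandoned after being written)
log = pd.read_csv("composition_log.csv")

share_censoring_users = log.groupby("user_id")["censored"].max().mean()
rate_by_type = (log.groupby("content_type")["censored"]
                   .mean()
                   .sort_values(ascending=False))

print(f"Users with >=1 self-censored item: {share_censoring_users:.0%}")
print(rate_by_type)
```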