Yi-Ling Chung’s research while affiliated with University of Trento and other places


Publications (17)


Figure 1: Counterspeech dynamics. (1) Perpetrator(s) generate Hate Speech. This may be witnessed by targets and/or bystanders. (2) Counterspeaker(s) respond with counterspeech, which may be directed at the perpetrator(s), bystanders (e.g. to provide alternative perspectives), or other targets (e.g. in support). Counterspeakers may themselves be targets or bystanders, or could be members of organised counterspeech groups. They can have in- or out-group identities with respect to either the perpetrator(s) or the target(s). Counterspeech is directed at recipients, who can be one or more of (a) the perpetrator(s), (b) the target(s), or (c) other bystanders. Both counterspeakers and targets can be individual or multiple (one-to-one, one-to-many and so on).
Figure 2: Flow diagram showing the identification, eligibility screening, and inclusion phases of the selection of items analysed in this review.
Understanding Counterspeech for Online Harm Mitigation
  • Article
  • Full-text available

September 2024 · 75 Reads · 5 Citations

Northern European Journal of Language Technology

Yi-Ling Chung · Florence Enock · [...] · Verena Rieser

Counterspeech offers direct rebuttals to hateful speech by challenging perpetrators of hate and showing support to targets of abuse. It provides a promising alternative to more contentious measures, such as content moderation and deplatforming, by contributing a greater amount of positive online speech rather than attempting to mitigate harmful content through removal. Advances in the development of large language models mean that the process of producing counterspeech could be made more efficient by automating its generation, which would enable large-scale online campaigns. However, we currently lack a systematic understanding of several important factors relating to the efficacy of counterspeech for hate mitigation, such as which types of counterspeech are most effective, what are the optimal conditions for implementation, and which specific effects of hate it can best ameliorate. This paper aims to fill this gap by systematically reviewing counterspeech research in the social sciences and comparing methodologies and findings with natural language processing (NLP) and computer science efforts in automatic counterspeech generation. By taking this multi-disciplinary view, we identify promising future directions in both fields.


Large language models can consistently generate high-quality content for election disinformation operations

August 2024 · 7 Reads

Advances in large language models have raised concerns about their potential use in generating compelling election disinformation at scale. This study presents a two-part investigation into the capabilities of LLMs to automate stages of an election disinformation operation. First, we introduce DisElect, a novel evaluation dataset designed to measure LLM compliance with instructions to generate content for an election disinformation operation in a localised UK context, containing 2,200 malicious prompts and 50 benign prompts. Using DisElect, we test 13 LLMs and find that most models broadly comply with these requests; we also find that the few models which refuse malicious prompts also refuse benign election-related prompts, and are more likely to refuse to generate content from a right-wing perspective. Second, we conduct a series of experiments (N=2,340) to assess the "humanness" of LLMs: the extent to which disinformation operation content generated by an LLM is able to pass as human-written. Our experiments suggest that almost all LLMs tested released since 2022 produce election disinformation operation content indiscernible by human evaluators over 50% of the time. Notably, we observe that multiple models achieve above-human levels of humanness. Taken together, these findings suggest that current LLMs can be used to generate high-quality content for election disinformation operations, even in hyperlocalised scenarios, at far lower costs than traditional methods, and offer researchers and policymakers an empirical benchmark for the measurement and evaluation of these capabilities in current and future models.
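The compliance measurement described above reduces to comparing refusal rates across prompt categories. The sketch below is purely illustrative (the labels, helper name, and toy data are hypothetical, not the study's actual code or numbers):

```python
# Illustrative sketch: given per-prompt refusal labels for one model,
# compare refusal rates on malicious vs. benign election prompts.

def refusal_rate(results, prompt_type):
    """Fraction of prompts of the given type that the model refused."""
    subset = [r for r in results if r["type"] == prompt_type]
    if not subset:
        return 0.0
    return sum(r["refused"] for r in subset) / len(subset)

# Toy results (the real benchmark uses 2,200 malicious and 50 benign prompts).
results = [
    {"type": "malicious", "refused": True},
    {"type": "malicious", "refused": False},
    {"type": "malicious", "refused": False},
    {"type": "benign", "refused": False},
    {"type": "benign", "refused": True},
]

print(refusal_rate(results, "malicious"))  # fraction of malicious prompts refused
print(refusal_rate(results, "benign"))    # fraction of benign prompts refused
```

A model that refuses benign election prompts at a similar rate to malicious ones would match the over-refusal pattern the study reports for some models.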




DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

July 2023 · 8 Reads

Public figures receive a disproportionate amount of abuse on social media, impacting their active participation in public life. Automated systems can identify abuse at scale, but labelling training data is expensive, complex and potentially harmful. It is therefore desirable that systems be efficient and generalisable, handling both shared and specific aspects of online abuse. We explore the dynamics of cross-group text classification in order to understand how well classifiers trained on one domain or demographic can transfer to others, with a view to building more generalisable abuse classifiers. We fine-tune language models to classify tweets targeted at public figures across DOmains (sport and politics) and DemOgraphics (women and men) using our novel DODO dataset, containing 28,000 labelled entries, split equally across four domain-demographic pairs. We find that (i) small amounts of diverse data are hugely beneficial to generalisation and model adaptation; (ii) models transfer more easily across demographics but models trained on cross-domain data are more generalisable; (iii) some groups contribute more to generalisability than others; and (iv) dataset similarity is a signal of transferability.
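The cross-group transfer setup above amounts to a train-on-one-pair, test-on-every-pair evaluation grid. A minimal sketch, with a trivial majority-label "model" standing in for a fine-tuned language model and entirely made-up toy data:

```python
# Hypothetical sketch of a cross-group evaluation grid: train on one
# domain-demographic pair, evaluate on all four, and tabulate accuracy.
from collections import Counter

def majority_label(examples):
    """Most frequent label in the training split (toy stand-in model)."""
    return Counter(label for _, label in examples).most_common(1)[0][0]

def accuracy(pred_label, examples):
    """Accuracy of always predicting pred_label on a test split."""
    return sum(label == pred_label for _, label in examples) / len(examples)

# Toy (text, label) data for four pairs; labels: 1 = abusive, 0 = not.
pairs = {
    "sport-women":    [("t1", 0), ("t2", 1), ("t3", 0)],
    "sport-men":      [("t4", 0), ("t5", 0), ("t6", 1)],
    "politics-women": [("t7", 1), ("t8", 1), ("t9", 0)],
    "politics-men":   [("t10", 0), ("t11", 1), ("t12", 0)],
}

# Full 4x4 transfer grid keyed by (train pair, test pair).
grid = {
    (train, test): accuracy(majority_label(pairs[train]), pairs[test])
    for train in pairs for test in pairs
}
print(grid[("sport-women", "politics-women")])
```

Comparing on-diagonal (in-group) against off-diagonal (cross-group) cells of such a grid is what underlies transfer findings like (ii) and (iii) above.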


Understanding Counterspeech for Online Harm Mitigation

July 2023 · 256 Reads

Counterspeech offers direct rebuttals to hateful speech by challenging perpetrators of hate and showing support to targets of abuse. It provides a promising alternative to more contentious measures, such as content moderation and deplatforming, by contributing a greater amount of positive online speech rather than attempting to mitigate harmful content through removal. Advances in the development of large language models mean that the process of producing counterspeech could be made more efficient by automating its generation, which would enable large-scale online campaigns. However, we currently lack a systematic understanding of several important factors relating to the efficacy of counterspeech for hate mitigation, such as which types of counterspeech are most effective, what are the optimal conditions for implementation, and which specific effects of hate it can best ameliorate. This paper aims to fill this gap by systematically reviewing counterspeech research in the social sciences and comparing methodologies and findings with computer science efforts in automatic counterspeech generation. By taking this multi-disciplinary view, we identify promising future directions in both fields.



Figure 1: Confusion matrices for monolingual training on EN, IT, and FR, from left to right (top row); multilingual model tested on EN, IT, and FR, from left to right (bottom row). Predictions are shown in columns and gold classes in rows.
Multilingual Counter Narrative Type Classification

September 2021 · 44 Reads

The growing interest in employing counter narratives for hatred intervention brings with it a focus on dataset creation and automation strategies. In this scenario, learning to recognize counter narrative types from natural text is expected to be useful for applications such as hate speech countering, where operators from non-governmental organizations are expected to respond to hate with diverse arguments that can be mined from online sources. This paper presents the first multilingual work on counter narrative type classification, evaluating state-of-the-art pre-trained language models in monolingual, multilingual and cross-lingual settings. When considering a fine-grained annotation of counter narrative classes, we report strong baseline classification results for the majority of the counter narrative types, especially if we translate every language to English before cross-lingual prediction. This suggests that knowledge about counter narratives can be successfully transferred across languages.
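The translate-then-classify setting reported above can be sketched as a two-stage pipeline: map non-English text to English, then run a single English-language type classifier. The lookup-table "translator" and keyword classifier below are hypothetical stand-ins for the paper's actual translation and language-model components:

```python
# Minimal sketch of a translate-then-classify cross-lingual pipeline.

TRANSLATIONS = {  # hypothetical translation memory, not a real MT system
    "les faits montrent le contraire": "the facts show otherwise",
    "questo è solo umorismo": "this is just humour",
}

def translate_to_english(text):
    # Already-English (or unknown) text passes through unchanged.
    return TRANSLATIONS.get(text, text)

def classify_type(text_en):
    """Toy counter-narrative type classifier over English text."""
    if "facts" in text_en:
        return "facts"
    if "humour" in text_en or "humor" in text_en:
        return "humour"
    return "other"

def predict(text):
    return classify_type(translate_to_english(text))

print(predict("les faits montrent le contraire"))  # classified via its English translation
```

The design point is that only one classifier needs training: cross-lingual transfer is delegated to the translation step, which matches the paper's observation that translating to English first strengthens cross-lingual prediction.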


Empowering NGOs in Countering Online Hate Messages

July 2021 · 30 Reads

Studies on online hate speech have mostly focused on the automated detection of harmful messages. Little attention has been devoted so far to the development of effective strategies to fight hate speech, in particular through the creation of counter-messages. While existing manual scrutiny and intervention strategies are time-consuming and not scalable, advances in natural language processing have the potential to provide a systematic approach to hatred management. In this paper, we introduce a novel ICT platform that NGO operators can use to monitor and analyze social media data, along with a counter-narrative suggestion tool. Our platform aims at increasing the efficiency and effectiveness of operators' activities against islamophobia. We test the platform with more than one hundred NGO operators in three countries through qualitative and quantitative evaluation. Results show that NGOs favor the platform solution with the suggestion tool, and that the time required to produce counter-narratives significantly decreases.


Empowering NGOs in countering online hate messages

July 2021 · 22 Reads · 6 Citations

Online Social Networks and Media

Studies on online hate speech have mostly focused on the automated detection of harmful messages. Little attention has been devoted so far to the development of effective strategies to fight hate speech, in particular through the creation of counter-messages. While existing manual scrutiny and intervention strategies are time-consuming and not scalable, advances in natural language processing have the potential to provide a systematic approach to hatred management. In this paper, we introduce a novel ICT platform that NGO operators can use to monitor and analyse social media data, along with a counter-narrative suggestion tool. Our platform aims at increasing the efficiency and effectiveness of operators’ activities against islamophobia. We test the platform with more than one hundred NGO operators in three countries through qualitative and quantitative evaluation. Results show that NGOs favour the platform solution with the suggestion tool, and that the time required to produce counter-narratives significantly decreases.


Citations (11)


... Several studies have identified and categorized effective CS strategies. Chung et al. (2023) conducted a systematic review, identifying eight strategies used in social sciences and real-world policy-driven campaigns. These strategies include presenting facts to counter misinformation and using humor or satire to defuse hostility. ...

Reference:

PANDA -- Paired Anti-hate Narratives Dataset from Asia: Using an LLM-as-a-Judge to Create the First Chinese Counterspeech Dataset
Understanding Counterspeech for Online Harm Mitigation

Northern European Journal of Language Technology

... Other targets such as Gender and Disability are very few thus they have less meaning when separated. Besides, the authors in [28] show various types of hate according to different aspects such as culture, economics, crimes, rapism, women oppression, history, and other/generic types. However, some hate speech appears in complex linguistic styles (e.g. ...

NLP for Counterspeech against Hate: A Survey and How-To Guide
  • Citing Conference Paper
  • January 2024

... Prior work has focused on modeling replies to hate speech, including corpora construction (Mathew et al. 2019; Chung et al. 2019), fine-grained categorization (Mathew et al. 2019; Yu et al. 2023), and generation (Zhu and Bhat 2021; Gupta et al. 2023; Chung and Bright 2024). Still, ...

On the Effectiveness of Adversarial Robustness for Abuse Mitigation with Counterspeech
  • Citing Conference Paper
  • January 2024

... While there were some perceptions that these campaigns had been partially successful, the rise of social media as a forum for the self-presentation of footballers for many brought about a kind of regression, with abuse once again rife and closely associated with the issue of racism [26]. Empirical work has consistently documented relatively high levels of abuse towards footballers [27,28]. However, the vast majority of quantitative studies have, to our knowledge, been directed towards male sports stars, despite obvious press attention to abuse towards female athletes as well [29]. ...

Tracking Abuse on Twitter Against Football Players in the 2021–22 Premier League Season
  • Citing Article
  • January 2023

SSRN Electronic Journal

... The detection and generation of counterspeech is important because it underpins the promise of AI-powered assistive tools for hate mitigation. Identifying counterspeech is vital also for analytical research in the area: for instance, to disentangle the dynamics of perpetrators, victims and bystanders (Mathew et al., 2018; Garland et al., 2020, 2022), as well as determining which responses are most effective in combating hate speech (Mathew et al., 2018, 2019; Chung et al., 2021a). ...

Multilingual Counter Narrative Type Classification

... Text generation, especially counterspeech generation, is one of the many open challenges in NLP research that has made breakthroughs in recent years [12,13,79,96]. Researchers have explored various aspects of counterspeech generation using LLMs, such as generating contextually relevant responses [34,96], knowledge-grounded counterspeech generation [12], and ensuring that the generated counterspeech adheres to ethical and societal norms [62]. ...

Italian Counter Narrative Generation to Fight Online Hate Speech
  • Citing Chapter
  • January 2020

... Counterspeech can be more effective than other moderation actions, such as content and user removals, without limiting free speech [11]. Favorable results were obtained by scholars [4,37], NGOs [12,13], and ordinary users [31,38] alike, as part of observational [26], quasi-experimental [4], and experimental [37,50] studies. These positive results also extend to AI-generated counterspeech. ...

Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech
  • Citing Conference Paper
  • January 2021

... To our knowledge, CONAN (Chung et al., 2019) has presented a massive, multilingual dataset of well-produced hate speech/counter-narrative pairs, providing counter-narratives considered among the best and most varied across counter-narrative datasets (Fanton et al., 2021). Similarly, Chung, Tekiroglu & Guerini (2021) have proposed models for generating counter-narratives that emphasize educational and multilingual answers. They have developed a knowledge-driven pipeline that can produce appropriate and informative English counter-narratives without hallucination. ...

Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech
  • Citing Preprint
  • June 2021

... Automatically producing counterspeech is a timely and important task for two reasons. First, composing counterspeech is time-consuming and requires considerable expertise to be effective (Chung et al., 2021b). Recently, large language models have been able to produce fluent and personalised arguments tailored to user expectations addressing various topics and tasks. ...

Empowering NGOs in countering online hate messages
  • Citing Article
  • July 2021

Online Social Networks and Media

... Moreover, the widespread nature of online toxicity makes manual counterspeech highly impractical. To overcome these limitations, automated counterspeech systems leveraging generative AI technologies, such as large language models (LLMs), have been developed [5,66]. Yet, evaluating their effectiveness remains challenging [28]. ...

Generating Counter Narratives against Online Hate Speech: Data and Strategies
  • Citing Conference Paper
  • January 2020