Roy Ka-Wei Lee’s research while affiliated with Singapore University of Technology and Design and other places


Publications (25)


Are Current Task-Oriented Dialogue Systems Able to Satisfy Impolite Users?
  • Article

January 2025

·

3 Citations

IEEE Transactions on Computational Social Systems

·

Nancy F. Chen

·

Roy Ka-Wei Lee

Task-oriented dialogue (TOD) systems play a critical role in assisting users with various tasks, such as ticket booking and service inquiries. While these systems have demonstrated significant potential in addressing customer needs, they typically assume that users will interact with the dialogue agent in a polite manner. This assumption, however, is often unrealistic, as users may express impatience or frustration through impolite behavior. Addressing this gap, this article investigates the impact of impolite user behavior on the performance of TOD systems. To this end, we developed a novel corpus of impolite dialogues and conducted comprehensive experiments to evaluate the performance of state-of-the-art TOD systems on this dataset. Our results reveal a notable limitation: existing TOD systems struggle to handle impolite user utterances effectively, leading to degraded performance. To mitigate this issue, we introduce a data augmentation approach designed to improve the systems’ ability to manage impolite dialogues. Although this method achieves measurable improvements, managing impolite user interactions remains a challenging research problem. By making our impolite dialogue corpus publicly accessible, we aim to encourage further research in this underexplored area. This study underscores the need for more robust TOD systems capable of handling diverse user behaviors, ultimately enhancing their applicability in real-world scenarios.



Analyzing user archetypes in Singapore's Telegram groups on COVID-19 and climate change
  • Preprint

June 2024

·

45 Reads

Social media platforms, particularly Telegram, play a pivotal role in shaping public perceptions and opinions on global and national issues. Unlike traditional news media, Telegram allows for the proliferation of user-generated content with minimal oversight, making it a significant venue for the spread of controversial and misinformative content. During the COVID-19 pandemic, Telegram's popularity surged in Singapore, a country with one of the highest rates of social media use globally. We leverage Singapore-based Telegram data to analyze information flows within groups focused on COVID-19 and climate change. Using k-means clustering, we identified distinct user archetypes, including Skeptic, Engaged Advocate, Observer, and Analyst, each contributing uniquely to the discourse. We developed a model to classify users into these clusters (Precision: Climate change: 0.99; COVID-19: 0.95). By identifying these user archetypes and examining their contributions to information dissemination, we sought to uncover patterns to inform effective strategies for combating misinformation and enhancing public discourse on pressing global issues.


TikTok’s Project Texas - Social Media Data Governance Across Geopolitical Lines

December 2023

·

37 Reads

·

1 Citation

Digital Government Research and Practice

TikTok, a video sharing social media platform, is a subsidiary of ByteDance, a Chinese-owned company, with its headquarters in Singapore. There’s an ongoing concern in the United States Congress that American users’ data may be exposed to unauthorized access or manipulated by foreign entities. In an effort to alleviate these concerns and retain user trust, TikTok launched Project Texas, aimed at safeguarding American users’ data. This commentary examines the broader implications of Project Texas on data governance, particularly in the context of geopolitical boundaries. It delves into the geopolitics of social media and the autonomy of data governance in the digital age, offering advice for all social media platforms regarding these issues.


CoAIcoder: Examining the Effectiveness of AI-assisted Human-to-Human Collaboration in Qualitative Analysis

August 2023

·

124 Reads

·

34 Citations

ACM Transactions on Computer-Human Interaction

While AI-assisted individual qualitative analysis has been substantially studied, AI-assisted collaborative qualitative analysis (CQA) – a process that involves multiple researchers working together to interpret data – remains relatively unexplored. After identifying CQA practices and design opportunities through formative interviews, we designed and implemented CoAIcoder, a tool leveraging AI to enhance human-to-human collaboration within CQA through four distinct collaboration methods. With a between-subject design, we evaluated CoAIcoder with 32 pairs of CQA-trained participants across common CQA phases under each collaboration method. Our findings suggest that while using a shared AI model as a mediator among coders could improve CQA efficiency and foster agreement more quickly in the early coding stage, it might affect the final code diversity. We also emphasize the need to consider the independence level when using AI to assist human-to-human collaboration in various CQA scenarios. Lastly, we suggest design implications for future AI-assisted CQA systems.


ChatGPT and Bard Responses to Polarizing Questions

July 2023

·

61 Reads

Recent developments in natural language processing have demonstrated the potential of large language models (LLMs) to improve a range of educational and learning outcomes. Of recent chatbots based on LLMs, ChatGPT and Bard have made it clear that artificial intelligence (AI) technology will have significant implications on the way we obtain and search for information. However, these tools sometimes produce text that is convincing but often incorrect, known as hallucinations. As such, their use can distort scientific facts and spread misinformation. To counter polarizing responses on these tools, it is critical to provide an overview of such responses so stakeholders can determine which topics tend to produce more contentious responses, which is key to developing targeted regulatory policy and interventions. In addition, there currently exists no annotated dataset of ChatGPT and Bard responses around possibly polarizing topics, central to the above aims. We address the indicated issues through the following contribution: Focusing on highly polarizing topics in the US, we created and described a dataset of ChatGPT and Bard responses. Broadly, our results indicated a left-leaning bias for both ChatGPT and Bard, with Bard more likely to provide responses around polarizing topics. Bard seemed to have fewer guardrails around controversial topics, and appeared more willing to provide comprehensive and somewhat human-like responses. Bard may thus be more likely to be abused by malicious actors. Stakeholders may utilize our findings to mitigate misinformative and/or polarizing responses from LLMs.



Misogynistic Extremism: A Scoping Review
[Figure previews: Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram (updated February 2023); Figure 2. Publications by years; Methods Used: Journal Articles.]

June 2023

·

944 Reads

·

13 Citations

Trauma Violence & Abuse

In recent years, the concept of "misogynistic extremism" has emerged as a subject of interest among scholars, governments, law enforcement personnel, and the media. Yet a consistent understanding of how misogynistic extremism is defined and conceptualized has not yet emerged. Varying epistemological orientations may contribute to the current conceptual muddle of this topic, reflecting long-standing and on-going challenges with the conceptualization of its individual components. To address the potential impact of misogynistic extremism (i.e., violent attacks), a more precise understanding of what this phenomenon entails is needed. To summarize the existing knowledge base on the nature of misogynistic extremism, this scoping review analyzed publications within English-language peer-reviewed and gray literature sources. Seven electronic databases and citation indexes were systematically searched using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews (PRISMA-ScR) checklist and charted using the 2020 PRISMA flow diagram. Inclusion criteria included English peer-reviewed articles and relevant gray literature publications, which contained the term "misogynistic extremism" and other closely related terms. No date restrictions were imposed. The search strategy initially yielded 475 publications. After exclusion of ineligible articles, 40 publications remained for synthesis. We found that misogynistic extremism is most frequently conceptualized in the context of misogynistic incels, male supremacism, far-right extremism, terrorism, and the black pill ideology. Policy recommendations include increased education among law enforcement and Countering and Preventing Violent Extremism experts on male supremacist violence and encouraging legal and educational mechanisms to bolster gender equality. Violence stemming from misogynistic worldviews must be addressed by directly acknowledging and challenging socially embedded systems of oppression such as white supremacy and cisheteropatriarchy.


Adapter-TST: A Parameter Efficient Method for Multiple-Attribute Text Style Transfer

May 2023

·

6 Reads

Adapting a large language model for multiple-attribute text style transfer via fine-tuning can be challenging due to the significant amount of computational resources and labeled data required for the specific task. In this paper, we address this challenge by introducing Adapter-TST, a framework that freezes the pre-trained model's original parameters and enables the development of a multiple-attribute text style transfer model. Using BART as the backbone model, Adapter-TST utilizes different neural adapters to capture different attribute information, like a plug-in connected to BART. Our method allows control over multiple attributes, like sentiment, tense, voice, etc., and configures the adapters' architecture to generate multiple outputs with respect to attributes or to perform compositional editing on the same sentence. We evaluate the proposed model on both traditional sentiment transfer and multiple-attribute transfer tasks. The experimental results demonstrate that Adapter-TST outperforms all the state-of-the-art baselines with significantly fewer computational resources. We have also empirically shown that each adapter is able to capture specific stylistic attributes effectively and can be configured to perform compositional editing.


Categorizing Memes About the Ukraine Conflict

February 2023

·

152 Reads

·

6 Citations

Lecture Notes in Computer Science

The Russian disinformation campaign uses pro-Russia memes to polarize Americans and increase support for the Russian invasion of Ukraine. Thus, it is critical for governments and similar stakeholders to identify pro-Russia memes, countering them with evidence-based information. Identifying broad meme themes is crucial for developing a targeted and strategic counter response. There is also a range of pro-Ukraine memes that bolster support for the Ukrainian cause. As such, we need to identify pro-Ukraine memes and aid with their dissemination to augment global support for Ukraine. We address the indicated issues through the following contributions: 1) Creation of an annotated dataset of pro-Russia (N = 70) and pro-Ukraine (N = 121) memes regarding the Ukraine conflict; 2) Identification of broad themes within the pro-Russia and pro-Ukraine meme categories. Broadly, our findings indicated that pro-Russia memes fall into thematic categories that seek to undermine specific elements of the policy and culture of the US and its allies. Pro-Ukraine memes are far more diffuse thematically, highlighting admiration for Ukraine’s people and its leadership. Stakeholders may utilize our findings to develop targeted strategies to mitigate Russian influence operations, possibly reducing effects of the conflict.


Citations (14)


... Text style transfer (TST) is a popular natural language generation task that aims to change the stylistic properties (e.g., sentiment, formality, tense, voice) of the text while preserving the style-independent content (Hu et al., 2022a). Existing studies explore performing text style transfer on attributes like age or gender (Lample et al., 2019), sentiment (Luo et al., 2019; Fu et al., 2018), formality (Rao and Tetreault, 2018), politeness (Madaan et al., 2020; Hu et al., 2022b), and author writing style (Syed et al., 2020). Nevertheless, most of the existing TST studies are confined to single-attribute TST tasks. ...

Reference:

Adapter-TST: A Parameter Efficient Method for Multiple-Attribute Text Style Transfer
Are Current Task-Oriented Dialogue Systems Able to Satisfy Impolite Users?
  • Citing Article
  • January 2025

IEEE Transactions on Computational Social Systems

... While such metrics can still capture consistency between machine and human coders, even complete consistency cannot measure the "openness" or "exhaustiveness" of the codes. Without a viable alternative, many studies adopt deductive metrics for evaluating machine-generated open codes (Gao et al., 2023; Gebreegziabher et al., 2023; Parfenova, 2024; Rietz & Maedche, 2021). Similarly, many human evaluation studies expect machine coders to match what humans identified (Deiner et al., 2024; Khan et al., 2024). ...

CoAIcoder: Examining the Effectiveness of AI-assisted Human-to-Human Collaboration in Qualitative Analysis
  • Citing Article
  • August 2023

ACM Transactions on Computer-Human Interaction

... Although the manifestations vary across individuals and groups, they share a common theme: a belief that although men are entitled to control and agency, this entitlement is being stolen or threatened by external forces. Whether seen through the lens of personal grievances, racial superiority, or gender hierarchy, the underlying sentiment is one of victimhood and anger directed toward those seen as encroaching on men's perceived ordained rights and power (O'Hanlon et al., 2024; O'Malley & Helm, 2023). Understanding this perspective, which highlights a troubling intersection of frustration and misogyny, is crucial for addressing the factors that influence the violence and extremism linked to these ideologies. ...

Misogynistic Extremism: A Scoping Review

Trauma Violence & Abuse

... These efforts enhance resistance and build a sense of unity among supporters while excluding opposing groups (Mortensen & Neumayer, 2023, p. 2368; Ross & Rivers, 2017, p. 3; Munk, 2024b, p. 76-77). Ukraine has used social media effectively, spanning from President Zelenskyy's appeals to global audiences to grassroots and institutional meme campaigns on social media and virtual spaces (Munk, 2024b, 6-7; Munk & Ahmad, 2022; Chen et al., 2022). Pro-Ukraine memes promote unity and resistance, while pro-Russia memes aim to polarise societies and promote uncertainty, like the methods used by Trump's election campaign. ...

Categorizing Memes About the Ukraine Conflict
  • Citing Chapter
  • February 2023

Lecture Notes in Computer Science

... In recent years there has been an increased interest in transliteration as a means of "bridging the script gap" between related languages for constructing multilingual LLMs in NLP (Murikinati, Anastasopoulos, and Neubig 2020;Muller et al. 2021;Dhamecha et al. 2021;Moosa, Akhter, and Habib 2023) and multilingual ASR (Datta et al. 2020;Khare et al. 2021). Such LLMs pretrained on large amounts of general multilingual text data generalize well to many specific NLP scenarios when fine-tuned using smaller amounts of task-specific data (Izacard and Grave 2021;Markewich et al. 2022;Moezzi et al. 2023). ...

DReD - A Descriptive Relation Dataset for Expanding Relation Extraction
  • Citing Article
  • January 2022

IEEE Transactions on Artificial Intelligence

... By identifying misinformation before it gains substantial traction, we can mitigate its negative effects and uphold the integrity of online discourse. The concept of early detection has been well-studied for online information disorders such as hate speech [8], rumors [9], fake news [10], etc. At the same time, developing robust methods to detect misinformation early is also essential to protect the accuracy of information and promote meaningful discussions. ...

Early Prediction of Hate Speech Propagation
  • Citing Conference Paper
  • December 2021

... TikTok, like other social media platforms, can potentially negatively impact children's social and emotional development. Platform designs that stimulate the dopaminergic reward system through interactions such as likes, comments, and followers can lead to addictive behavior, reduce direct social interactions, and hinder the development of real-world social skills (Ng et al., 2021; Olvera et al., 2021; Silvina et al., 2023). An educational approach that promotes healthy self-understanding and teaches positive social media skills is essential. ...

Will you dance to the challenge?: predicting user participation of TikTok challenges
  • Citing Conference Paper
  • November 2021

... This ongoing advancement in DLA methods underscores the importance and persistent relevance of the field in the broader context of document understanding and computer vision research. As documents continue to evolve, so will the techniques and technologies in DLA, ensuring that it remains an essential and ever-progressing study area [15]. ...

Segmentation for document layout analysis: not dead yet

International Journal on Document Analysis and Recognition (IJDAR)

... As illustrated in the left part of the figure, when the caret is after the first letter "B", the top completion results contain both the prefix "B" and the suffix "Spears". Phrase completion can improve writing efficiency and reduce the chance of typographical errors (Lee et al., 2021). For example, if a user types "Los A" and triggers phrase completion, the topmost suggestion is "Los Angeles", which can be directly selected by the user. ...

Improving Text Auto-Completion with Next Phrase Prediction
  • Citing Conference Paper
  • January 2021

... Extending on our previous conference publication [22], we present our completed dataset for document layout analysis that contains 43 annotated object classes. To train on this dataset, we also propose an aggressive weighted loss strategy, as well as a new method for bounding box regression on segmentation outputs. ...

Document Structure Extraction: An Exploratory Study
  • Citing Conference Paper
  • November 2020