
Lynnette Hui Xian Ng - Carnegie Mellon University
Publications: 99
Reads: 9,482
Citations: 591
Publications (99)
This work examines the influence of misinformation and the role of AI agents, called bots, on social network platforms. To quantify the impact of misinformation, it proposes two new metrics based on attributes of tweet engagement and user network position: Appeal, which measures the popularity of the tweet, and Scope, which measures the potential r...
Social media bots are AI agents that participate in online conversations. Most studies focus on the general bot and the malicious nature of these agents. However, bots have many different personas, each specialized towards a specific behavioral or content trait. Nor are bots singularly bad, because they are used for both good and bad informatio...
Technology evolves at a breakneck pace. As a result, legislatures are often unable to enact laws that can keep pace with technological changes. The dissonance between the state of the law and the state of technology intensifies with respect to biometric data because the purposes of biometric data use evolve, the types of biometric data expand, and...
Decentralized online social networks have evolved from experimental stages to operating at unprecedented scale, with broader adoption and more active use than ever before. Platforms like Mastodon, Bluesky, Hive, and Nostr have seen notable growth, particularly following the wave of user migration after Twitter's acquisition in October 2022. As new...
Chatter on social media about global events comes from 20% bots and 80% humans. The chatter by bots and humans is consistently different: bots tend to use linguistic cues that can be easily automated (e.g., increased hashtags and positive terms) while humans use cues that require dialogue understanding (e.g., replying to post threads). Bots use wor...
Supervised machine-learning models often underperform in predicting user behaviors from conversational text, hindered by poor crowdsourced label quality and low NLP task accuracy. We introduce the Metadata-Sensitive Weighted-Encoding Ensemble Model (MSWEEM), which integrates annotator meta-features like fatigue and speeding. First, our results show...
Stance classification, the task of predicting the viewpoint of an author on a subject of interest, has long been a focal point of research in domains ranging from social science to machine learning. Current stance detection methods rely predominantly on manual annotation of sentences, followed by training a supervised machine learning model. Howeve...
Southeast Asia (SEA) is a region of extraordinary linguistic and cultural diversity, yet it remains significantly underrepresented in vision-language (VL) research. This often results in artificial intelligence (AI) models that fail to capture SEA cultural nuances. To fill this gap, we present SEA-VL, an open-source initiative dedicated to developi...
Social media is a primary medium for information diffusion during natural disasters. The social media ecosystem has been used to identify destruction, analyze opinions and organize aid. While the overall picture and aggregate trends may be important, a crucial part of the picture is the connections on these sites. These bridges are essential to fac...
Social Cyber Geography is the space in the digital cyber realm that is produced through social relations. Communication in the social media ecosystem happens not only because of human interactions, but is also fueled by algorithmically controlled bot agents. Most studies have not looked at the social cyber geography of bots because they focus on bo...
Chatter on social media is 20% bots and 80% humans. Chatter by bots and humans is consistently different: bots tend to use linguistic cues that can be easily automated while humans use cues that require dialogue understanding. Bots use words that match the identities they choose to present, while humans may send messages that are not related to the...
In recent years, mass-broadcast messaging platforms like Telegram have gained prominence both as a harbor for private communication and as an enabler of large-scale disinformation campaigns. The encrypted and networked nature of these platforms makes it challenging to identify intervention targets since most channels that promote misleading in...
Translation of code-mixed texts to formal English allows a wider audience to understand these code-mixed languages, and facilitates downstream analysis applications such as sentiment analysis. In this work, we look at translating Singlish, which is colloquial Singaporean English, to formal standard English. Singlish is formed through the code-mixing...
Large Language Models (LLMs) offer a lucrative promise for scalable content moderation, including hate speech detection. However, they are also known to be brittle and biased against marginalised communities and dialects. This requires their applications to high-stakes tasks like hate speech detection to be critically scrutinized. In this work, we...
Singlish, or Colloquial Singapore English, is a language formed from oral and social communication within multicultural Singapore. In this work, we work on a fundamental Natural Language Processing (NLP) task: Part-of-Speech (POS) tagging of Singlish sentences. For our analysis, we build a parallel Singlish dataset containing direct English transl...
Singlish, or formally Colloquial Singapore English, is an English-based creole language originating from the Southeast Asian country of Singapore. The language contains influences from Sinitic languages such as Chinese dialects, as well as Malay, Tamil and so forth. A fundamental task in understanding Singlish is to first understand the pragmatic functions of it...
Effective public health messaging benefits from understanding antecedents to unstable attitudes that are more likely to be influenced. This work investigates the relationship between moral and emotional bases for attitudes towards COVID-19 vaccines and variance in stance. Evaluating nearly 1 million X users over a two-month period, we find that emo...
Pink slime news sites are politically polarized Web sites controlled by partisan national organizations that masquerade as local news. Instead of authentic community reporting, these sites rely on automated algorithms and APIs to fill in news articles between their politically charged messaging aimed at influencing votes. Over 1,000 of these sites...
The dissemination of disinformation has become a formidable weapon, with nation-states exploiting social media platforms to engineer narratives favorable to their geopolitical interests. This study delved into Russia’s orchestrated disinformation campaign, in three time periods of the 2022 Russia-Ukraine War: its incursion, its midpoint and the U...
Automated political campaigns in the digital space can influence electoral votes and tilt the balance of power. We developed a compact ensemble approach named Tiny BotBuster to identify automated bot users, then we applied the Combined Synchronization Index to reveal the political actors working together. We applied our techniques to the 2024 Indon...
Large Language Models (LLMs) have demonstrated remarkable capabilities in executing tasks based on natural language queries. However, these models, trained on curated datasets, inherently embody biases ranging from racial to national and gender biases. It remains uncertain whether these biases impact the performance of LLMs for certain tasks. In th...
Online multiplayer games like League of Legends, Counter Strike, and Skribbl.io create experiences through community interactions. Providing players with the ability to interact with each other through multiple modes also opens a Pandora's box. Toxic behaviour and malicious players can ruin the experience, reduce the player base and potentially harmi...
Radiance fields produce high fidelity images with high rendering speed, but are difficult to manipulate. We effectively perform avatar texture transfer across different appearances by combining benefits from radiance fields and mesh surfaces. We represent the source as a radiance field using 3D Gaussian Splatter, then project the Gaussians on the t...
During the COVID-19 pandemic, the proliferation of misinformation on social media has been rapidly increasing. Automated Bot authors are believed to be significant contributors to this surge. It is hypothesized that Bot authors deliberately craft online misinformation aimed at triggering and exploiting human cognitive biases, thereby enhancing twee...
This study analyzes two covert Chinese bot networks, employing tweet-based and account-based methods to find detection evasion tactics. We reveal the use of message artifacts that disguise spam, engagement strategies that mimic human interaction, and behavioral patterns suggesting algorithmic control. We uncover bot maintenance practices and algori...
Bots are automated social media users that can be used to amplify (mis)information and sow harmful discourse. In order to effectively influence users, bots can be generated to reproduce human user behavior. Indeed, people tend to trust information coming from users with profiles that fit roles they expect to exist, such as users with gender role st...
There is a prevailing perception that content on a social media platform generally has the same political leaning. These platforms are often viewed as ideologically congruent entities, reflecting the majority opinion of their users; a prime example of this is Truth Social. While this perception may exist, it is essential to verify the platform's c...
Online games are dynamic environments where players interact with each other, which offers a rich setting for understanding how players negotiate their way through the game to an ultimate victory. This work studies online player interactions during the turn-based strategy game, Diplomacy. We annotated a dataset of over 10,000 chat messages for diff...
News journalism has evolved from traditional print media to social media, with a large proportion of readers consuming their news via digital means. Through an analysis of over 1.3 million posts across three social media platforms (Facebook, Twitter, Reddit) pertaining to the 2022 U.S. Midterm Elections, this analysis examines the difference in sha...
The COVID-19 pandemic of 2021 led to a worldwide health crisis that was accompanied by an infodemic. A group of 12 social media personalities, dubbed the “Disinformation Dozen”, were identified as key in spreading disinformation regarding the COVID-19 virus, treatments, and vaccines. This study focuses on the spread of disinformation propagated by...
Bots have been in the spotlight for many social media studies, for they have been observed to be participating in the manipulation of information and opinions on social media. These studies analyzed the activity and influence of bots in a variety of contexts: elections, protests, health communication and so forth. Prior to these analyses is the iden...
Social media platforms are a key ground of information consumption and dissemination. Key figures like politicians, celebrities, and activists have leveraged their wide user bases for strategic communication. Strategic communications, or StratCom, is the deliberate act of information creation and distribution. Its techniques are used by key figures...
As digitalization increases, countries employ digital diplomacy, harnessing digital resources to project their desired image. Digital diplomacy also encompasses the interactivity of digital platforms, providing a trove of public opinion that diplomatic agents can collect. Social media bots actively participate in political events through influencin...
From border control using fingerprints to law enforcement with video surveillance to self-activating devices via voice identification, biometric data is used in many applications in the contemporary context of a Smart City. Biometric data consists of human characteristics that can identify one person from others. Given the advent of big data and th...
TikTok, a video sharing social media platform, is a subsidiary of ByteDance, a Chinese-owned company, with its headquarters in Singapore. There’s an ongoing concern in the United States Congress that American users’ data may be exposed to unauthorized access or manipulated by foreign entities. In an effort to alleviate these concerns and retain use...
In this work, we analyze the circumstances under which social influence operations are likely to succeed. These circumstances include the selection of Confederate agents to execute intentional perturbations and the selection of Perturbation strategies. We use Agent-Based Modelling (ABM) as a simulation technique to observe the effect of intentional...
The cross-strait relationship between China and Taiwan is marked by increasing hostility around potential reunification. We analyze an unattributed bot network and how repeater bots engaged in an influence campaign against Taiwan following US House Speaker Nancy Pelosi’s visit to Taiwan in 2022. We examine the message amplification tactics employed...
Introduction
France has seen two key protests within the term of President Emmanuel Macron: one in 2020 against Islamophobia, and another in 2023 against the pension reform. During these protests, there was much chatter on online social media platforms like Twitter.
Methods
In this study, we aim to analyze the differences between the online chatter...
With the proliferation of online technologies, social media recruitment has become an essential part of any company’s outreach campaign. A social media platform can provide marketing posts with access to a large pool of candidates and at a low cost. It also provides the opportunity to quickly customize and refine messages in response to the recepti...
In this study, we examined online conversations on Twitter about a Chinese balloon spotted over U.S. airspace in January 2023. We investigated the conversations among U.S.-based accounts, China-based accounts, and accounts from the rest of the world. We also studied the difference between bots and human accounts within these conversations. We found that U.S.-based...
Social media platforms are information battlegrounds where actors or communities compete to influence ideas and beliefs. These platforms can benefit government and health organizations by quickly disseminating pertinent information about the COVID-19 vaccine to a large population. However, at the same time, the social-cyberspace domain has made it...
Agent-based simulations have been used in modeling transportation systems for traffic management and passenger flows. In this work, we hope to shed light on the complex factors that influence transportation mode decisions within developing countries, using Colombia as a case study. We model an ecosystem of human agents that decide at each time step...
Despite rapid development, current bot detection models still face challenges in dealing with incomplete data and cross-platform applications. In this paper, we propose BotBuster, a social bot detector built with the concept of a mixture of experts approach. Each expert is trained to analyze a portion of account information, e.g. username, and are...
White supremacist extremist groups are a significant domestic terror threat in many Western nations. These groups harness the Internet to spread their ideology via online platforms: blogs, chat rooms, forums, and social media, which can inspire violence offline. In this work, we study the persistence and reach of white supremacist propaganda in bot...
Social media has provided a citizen voice, giving rise to grassroots collective action, where users deploy a concerted effort to disseminate online narratives and even carry out offline protests. Sometimes these collective actions are aided by inorganic synchronization, which arises from bot actors. It is thus important to identify the synchronicity...
This case study investigates a recent Russian disinformation narrative about U.S. biolabs and the development of biological weapons in Ukraine. This disinformation campaign was officially initiated by the Russian government, including the Russian Ministry of Defense, and was disseminated by official state-funded Russian media. In their announcement...
Coordinated disinformation campaigns are used to influence social media users, potentially leading to offline violence. In this study, we introduce a general methodology to uncover coordinated messaging through an analysis of user posts on Parler. The proposed Coordinating Narratives Framework constructs a user-to-user coordination graph, which is...
Stance detection identifies a person’s evaluation of a subject, and is a crucial component for many downstream applications. In application, stance detection requires training a machine learning model on an annotated dataset and applying the model on another to predict stances of text snippets. This cross-dataset model generalization poses three ce...
This paper investigates how hate speech varies in systematic ways according to the identities it targets. Across multiple hate speech datasets annotated for targeted identities, we find that classifiers trained on hate speech targeting specific identity groups struggle to generalize to other targeted identities. This provides empirical evidence for...
Social media has become an integral component of the modern information system. An average person typically has multiple accounts across different platforms. At the same time, the rise of social media facilitates the spread of online mis/disinformation narratives within and across these platforms. In this study, we characterize the coordinated info...
Coordinated campaigns in the digital realm have become an increasingly important area of study due to their potential to cause political polarization and threats to security through real-world protests and riots. In this paper, we introduce a methodology to profile two case studies of coordinated actions in Indonesian Twitter discourse. Combining n...
proposes an approach towards goal-oriented modeling of the detection and modeling of complex social phenomena in multiparty discourse in an online political strategy game. We developed a two-tier approach that first encodes sociolinguistic behavior as linguistic features then uses reinforcement learning to estimate the advantage afforded to any player....
Coordinated groups of user accounts working together on online social media can be used to manipulate online discourse, making them an important area of study. In this study, we work towards a general theory of coordination. There are many ways to coordinate groups online: semantic, social, referral and many more. Each represents a coordination...
State-sponsored online influence operations typically consist of coordinated accounts exploiting the online space to influence public opinion. Accounts associated with these operations use images and memes as part of their content generation and dissemination strategy to increase the effectiveness and engagement of the content. In this paper, we pr...
Social media bots have been characterized in their use in digital activism and information manipulation, due to their roles in information diffusion. The detection of bots has been a major task within the field of social media computation, and many datasets and bot detection algorithms have been developed. With these algorithms, the bot score stabi...
The study of the coordinated manipulation of conversations on social media has become more prevalent as social media’s role in amplifying misinformation, hate, and polarization has come under greater scrutiny. We discuss how successful generalized coordination detection algorithms could be used to reinforce existing power imbalances, such as those...
Social influence characterizes the change of an individual’s stances in a complex social environment towards a topic. Two factors often govern the influence of stances in an online social network: endogenous influences driven by an individual’s innate beliefs through the agent’s past stances and exogenous influences formed by social network influen...
TikTok is a popular new social media platform, where users express themselves through short video clips. A common form of interaction on the platform is participating in "challenges", which are songs and dances for users to iterate upon. Challenge contagion can be measured through replication reach, i.e., users uploading videos of their participation in the...
What are the pathways for spreading disinformation on social media platforms? This article addresses this question by collecting, categorising, and situating an extensive body of research on how application programming interfaces (APIs) provided by social media platforms facilitate the spread of disinformation. We first examine the landscape of off...
Coordinated disinformation campaigns are used to influence social media users, potentially leading to offline violence. In this study, we introduce a general methodology to uncover coordinated messaging through analysis of user parleys on Parler. The proposed method constructs a user-to-user coordination network graph induced by a user-to-text grap...
Multi-modal generation has been widely explored in recent years. Current research directions involve generating text based on an image or vice versa. In this paper, we propose a new task called CIGLI: Conditional Image Generation from Language and Image. Instead of generating an image based on text as in text-image generation, this task requires th...
A picture speaks a thousand words. Images are extremely effective at evoking emotions and present a potentially damaging force to the health of digital discourse. While text-based emotion analysis has been studied, little work has examined the emotions images evoke on social media platforms. This work analyzes bot-based emotion behavior differenc...
Social influence characterizes the change of opinions in a complex social environment, incorporating an individual's past stances and the impact of interpersonal influence through the social network influence. In this work, we observe stance changes towards the coronavirus vaccine on Twitter from April 2020 to May 2021, where 1% of the agents exhi...
The 2020 coronavirus pandemic has heightened the need to flag coronavirus-related misinformation, and fact-checking groups have taken to verifying misinformation on the Internet. We explore stories reported by fact-checking groups PolitiFact, Poynter and Snopes from January to June 2020. We characterise these stories into six clusters, then analyse...
The study of coordinated manipulation of conversations on social media has become more prevalent as social media's role in amplifying misinformation, hate, and polarization has come under scrutiny. We discuss the implications of successful coordination detection algorithms based on shifts of power, and consider how responsible coordination detectio...
Digital disinformation presents a challenging problem for democracies worldwide, especially in times of crisis like the COVID-19 pandemic. In countries like Singapore, legislative efforts to quell fake news constitute relatively new and understudied contexts for understanding local information operations. This paper presents a social cybersecurity...
We construct audio adversarial examples on automatic Speech-To-Text systems. Given any audio waveform, we produce another by overlaying an audio vocal mask generated from the original audio. We apply our audio adversarial attack to five SOTA STT systems: DeepSpeech, Julius, Kaldi, wav2letter@anywhere and CMUSphinx. In addition, we engaged human...
We introduce KOSMOS, a knowledge retrieval system based on the constructed knowledge graph of social media and mainstream media documents. The system first identifies key events from the documents at each time frame through clustering, extracting a document to represent each cluster, then describing the document in terms of 5W1H (Who, What, When, W...
We analyse a Singapore-based COVID-19 Telegram group with more than 10,000 participants. First, we study the group's opinion over time, focusing on five dimensions: participation, sentiment, negative emotions, topics, and message types. We find that participation peaked when the Ministry of Health raised the disease alert level, but this engagement...
We analyse a Singapore-based COVID-19 Telegram group with more than 10,000 participants. First, we study the group's opinion over time, focusing on four dimensions: participation, sentiment, topics, and psychological features. We find that engagement peaked when the Ministry of Health raised the disease alert level, but this engagement was not sust...
The 2020 coronavirus pandemic has heightened the need to flag coronavirus-related misinformation, and fact-checking groups have taken to verifying misinformation on the Internet. We explore stories reported by fact-checking groups PolitiFact, Poynter and Snopes from January to June 2020, characterising them into six story clusters before then analy...
Have you ever been stuck in an airport because your flight was delayed or cancelled and wondered if you could have predicted it if you'd had more data? In this project, we used 7.5 million entries of commercial flight data within the USA to answer the following question: Given a current flight's data and the past few days of flight data, can you...