Amy X. Zhang’s research while affiliated with University of Washington Tacoma and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (127)


The Collaborative Practices and Motivations of Online Communities Dedicated to Voluntary Misinformation Response
  • Preprint
  • File available

November 2024 · 6 Reads

Jina Yoon · Shreya Sathyanarayanan · [...] · Amy X. Zhang

Responding to misinformation online can be an exhausting and thankless task. It takes time and energy to write effective content, puts users at risk of online harassment, and strains personal relationships. Despite these challenges, there are people who voluntarily respond to misinformation online, and some have established communities on platforms such as Reddit, Discord, and X (formerly Twitter) dedicated to these efforts. In this work, we interviewed 8 people who participate in such communities to understand the type of support they receive from each other in these discussion spaces. Interviewees described that their communities helped them sustain motivation, save time, and improve their communication skills. Common practices included sharing sources and citations, providing emotional support, giving others advice, and signaling positive feedback. We present our findings as three case studies and discuss opportunities for future work to support collaborative practices in online communities dedicated to misinformation response. Our work surfaces how resource sharing, social motivation, and decentralization can make misinformation correction more sustainable, rewarding, and effective for online citizens.


SPICA: Retrieving Scenarios for Pluralistic In-Context Alignment

November 2024 · 2 Reads

Alignment of large language models (LLMs) to societal values should account for pluralistic values from diverse groups. One technique uses in-context learning for inference-time alignment, but only considers similarity when drawing few-shot examples, not accounting for cross-group differences in value prioritization. We propose SPICA, a framework for pluralistic alignment that accounts for group-level differences during in-context example retrieval. SPICA introduces three designs to facilitate pluralistic alignment: scenario banks, group-informed metrics, and in-context alignment prompts. From an evaluation of SPICA on an alignment task collecting inputs from four demographic groups (n = 544), our metrics retrieve in-context examples that more closely match observed preferences, with the best prompt configuration using multiple contrastive responses to demonstrate examples. In an end-to-end evaluation (n = 80), we observe that SPICA-aligned models are higher rated than a baseline similarity-only retrieval approach, with groups seeing up to a +0.16 point improvement on a 5 point scale. Additionally, gains from SPICA were more uniform, with all groups benefiting from alignment rather than only some. Finally, we find that while a group-agnostic approach can effectively align to aggregated values, it is not most suited for aligning to divergent groups.
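The abstract above describes retrieving few-shot examples from a scenario bank using group-informed metrics rather than similarity alone. A minimal sketch of what such retrieval could look like follows; the Scenario structure, the "distinctiveness" term, and the blending weight alpha are illustrative assumptions, not SPICA's actual metrics.

```python
# Minimal sketch of group-informed in-context example retrieval, in the spirit of
# SPICA's scenario bank and group-informed metrics. The data structure, the
# "distinctiveness" term, and the blending weight alpha are illustrative
# assumptions, not the paper's implementation.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Scenario:
    text: str
    embedding: np.ndarray                               # precomputed text embedding
    group_ratings: dict = field(default_factory=dict)   # group id -> mean preference rating

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def retrieve(query_emb: np.ndarray, bank: list[Scenario], group: str,
             k: int = 4, alpha: float = 0.5) -> list[Scenario]:
    """Rank scenarios by blending semantic similarity with a group-level signal."""
    def score(s: Scenario) -> float:
        sim = cosine(query_emb, s.embedding)
        ratings = list(s.group_ratings.values())
        # Hypothetical group-informed term: favor scenarios where the target
        # group's preferences depart from the cross-group average.
        distinct = abs(s.group_ratings.get(group, 0.0) - float(np.mean(ratings))) if ratings else 0.0
        return alpha * sim + (1 - alpha) * distinct
    return sorted(bank, key=score, reverse=True)[:k]
```

The retrieved scenarios, paired with the target group's recorded responses, would then be assembled into an in-context alignment prompt for the LLM.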


Figure 4: Rule-objective alignment evaluation performance compared to human experts. Plotted values computed by averaging Dice-Hamming similarities between evaluator outputs and the judgements of multiple individual human experts, then normalizing those values so that average similarity between human experts is one.
Chain of Alignment: Integrating Public Will with Expert Intelligence for Language Model Alignment
Andrew Konya · [...] · Kevin Feng · [...] · Amy X. Zhang

We introduce a method to measure the alignment between public will and language model (LM) behavior that can be applied to fine-tuning, online oversight, and pre-release safety checks. Our 'chain of alignment' (CoA) approach produces a rule-based reward (RBR) by creating model behavior rules aligned to normative objectives aligned to public will. This factoring enables a non-expert public to directly specify their will through the normative objectives, while expert intelligence is used to figure out rules entailing model behavior that best achieves those objectives. We validate our approach by applying it across three different domains of LM prompts related to mental health. We demonstrate a public input process built on collective dialogues and bridging-based ranking that reliably produces normative objectives supported by at least 96% ± 2% of the US public. We then show that rules developed by mental health experts to achieve those objectives enable an RBR that evaluates an LM response's alignment with the objectives similarly to human experts (Pearson's r = 0.841, AUC = 0.964). By measuring alignment with objectives that have near-unanimous public support, these CoA RBRs provide an approximate measure of alignment between LM behavior and public will.
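The factoring described above (public will → normative objectives → expert-written behavior rules → rule-based reward) can be illustrated with a small sketch. The Rule fields and the toy judge below are hypothetical stand-ins for the paper's expert-informed evaluation, not its implementation.

```python
# Illustrative sketch of a rule-based reward (RBR) following the chain-of-alignment
# factoring: public will -> normative objectives -> expert behavior rules -> reward.
from dataclasses import dataclass

@dataclass
class Rule:
    objective: str   # normative objective the rule is meant to achieve
    behavior: str    # expert-written description of desired model behavior
    keyword: str     # stand-in signal used by the toy judge below

def rule_satisfied(response: str, rule: Rule) -> bool:
    """Toy judge: the paper uses expert-informed evaluation (e.g., an LLM grader);
    a keyword check here only illustrates the interface."""
    return rule.keyword.lower() in response.lower()

def rbr(response: str, rules: list[Rule]) -> float:
    """Reward = fraction of behavior rules the response satisfies, serving as a
    proxy for alignment with the objectives and, by extension, public will."""
    return sum(rule_satisfied(response, r) for r in rules) / max(len(rules), 1)

# Example usage with hypothetical rules for a mental-health prompt domain.
rules = [
    Rule("Encourage seeking professional help",
         "Mention a qualified professional or support resource", "professional"),
    Rule("Avoid dismissing the user's feelings",
         "Acknowledge the user's feelings explicitly", "understand"),
]
print(rbr("I understand this is hard; a mental health professional can help.", rules))  # 1.0
```

In practice the judge would be an expert-built evaluator whose scores are checked against human experts, as in the Pearson's r and AUC comparison reported above.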



Fig. 1. Summary of high-level writer interaction with AI during the writing process.
Fig. 3. Flow of P13's process writing lyrics with ChatGPT.
Fig. 4. Summary of dynamics in how writers engage with AI.
From Pen to Prompt: How Creative Writers Integrate AI into their Writing Practice

November 2024 · 32 Reads

Creative writers have a love for their craft, yet AI systems using large language models (LLMs) offer the automation of significant parts of the writing process. So why do some creative writers choose to integrate AI into their workflows? To explore this, we interview and observe a writing session with 18 creative writers who already use AI regularly in their writing practice. Our findings reveal that creative writers are intentional about how they incorporate AI, making many deliberate decisions about when and how to engage AI based on the core values they hold about writing. These values, such as authenticity and craftsmanship, alongside writers' relationships with and use of AI, influence the parts of writing over which they wish to maintain control. Through our analysis, we contribute a taxonomy of writer values, writer relationships with AI, and integration strategies, and discuss how these three elements interrelate.


Fig. 1. The Social-RAG workflow involves four steps, as illustrated in the figure. After collecting and indexing group interaction content to create a social knowledge base (Step 1), Social-RAG retrieves from prior group interactions (Step 2), then selects and ranks relevant social signals, which are fed as context into a large language model to generate a succinct message (Step 3). The message is then posted to social channels, where the agent can continue to collect and index group reactions (Step 4). The workflow grounds LLM agent generations to social information about a group.
Fig. 2. An example scenario demonstrates how users can interact with PaperPing. Channel members share paper links, react to papers shared by other members or PaperPing, and comment on shared papers. After gathering and processing this information, PaperPing sends new contextually grounded paper recommendations, including explanations of how the recommended papers are relevant to the channel. It also provides links to previous related discussions and includes meta-information about the recommended papers in the recommendation message.
Fig. 4. PaperPing prompts pipeline (Condition 4). It starts with social signals leading to three prompts: Prompt 1 highlights paper content, Prompt 2 highlights relevance to a previous paper, and Prompt 3 highlights relevance to a channel member. Finally, outputs from these prompts feed into Prompt 4, which synthesizes the information and adjusts the style, resulting in a final output.
Fig. 6. Results of our post-study questionnaire. The responses are grouped based on the four design goals. The original questions of the questionnaire are presented in the Appendix.
Social-RAG: Retrieving from Group Interactions to Socially Ground Proactive AI Generation to Group Preferences

November 2024 · 17 Reads

AI agents are increasingly tasked with making proactive suggestions in online spaces where groups collaborate, but can be unhelpful or even annoying, due to not fitting the group's preferences or behaving in socially inappropriate ways. Fortunately, group spaces have a rich history of prior social interactions and affordances for social feedback to support creating agents that align to a group's interests and norms. We present Social-RAG, a workflow for grounding agents to social information about a group, which retrieves from prior group interactions, selects relevant social signals, and then feeds the context into a large language model to generate messages to the group. We implement this into PaperPing, our system that posts academic paper recommendations in group chat, leveraging social signals determined from formative studies with 39 researchers. From a three-month deployment in 18 channels, we observed PaperPing posted relevant messages in groups without disrupting their existing social practices, fostering group common ground.
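Fig. 1 above summarizes the Social-RAG workflow in four steps: index group interactions into a social knowledge base, retrieve prior interactions, select and rank social signals for an LLM to turn into a message, then post it and collect reactions. A minimal sketch of that loop follows, with toy storage, retrieval, and generation standing in for PaperPing's actual components.

```python
# Minimal sketch of the four-step Social-RAG loop from Fig. 1. Storage, retrieval,
# ranking, and message generation are toy stand-ins, not the deployed system.
from dataclasses import dataclass

@dataclass
class Interaction:
    author: str
    text: str
    reactions: int = 0      # e.g., emoji reactions from channel members

class SocialKnowledgeBase:
    def __init__(self) -> None:
        self.interactions: list[Interaction] = []

    def index(self, interaction: Interaction) -> None:
        # Step 1 (and Step 4): collect and index group interactions.
        self.interactions.append(interaction)

    def retrieve(self, topic: str, k: int = 5) -> list[Interaction]:
        # Step 2: toy lexical retrieval; a real system would use embedding search.
        hits = [i for i in self.interactions if topic.lower() in i.text.lower()]
        return hits[:k]

def rank_signals(hits: list[Interaction]) -> list[Interaction]:
    # Step 3a: select and rank social signals, here by how much the group reacted.
    return sorted(hits, key=lambda i: i.reactions, reverse=True)

def generate_message(topic: str, signals: list[Interaction]) -> str:
    # Step 3b: stand-in for the LLM call that grounds the recommendation in the
    # group's prior discussion before it is posted back to the channel.
    context = "; ".join(f"{s.author}: {s.text}" for s in signals[:2])
    return f"New paper on {topic} may interest this channel (related discussion: {context})"
```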


Fig. 1. An example of the data we collected per participant from the sections of the survey that relate to our statistical modeling. Each participant provides answers to (up to) five demographic questions and (up to) 36 Likert ratings in response to questions about LLM usage frequency, perceptions, and usage types. Participants are not required to answer every question. In this example, gender information is missing for the participant.
Fig. 5. Overview of the relation between usage frequency and perceptions of LLMs. Each heatmap represents one type of perception, and each cell represents the number of responses (log scaled) that fall under that level of frequency and perception. The Kendall's tau coefficient at the bottom indicates how strong the correlation is between the usage frequency and the perception of that usage. All perceptions are significantly correlated with usage (p < .0001). Tests performed using cor.test in R and corrected with p.adjust.
Kendall's tau correlation between the frequency of LLM usage and the perception of that usage. Each cell includes the tau coefficient with the Holm-Bonferroni corrected p-value in parentheses. Coefficients not followed by a parenthesis all had p < .0001.
LLMs as Research Tools: A Large Scale Survey of Researchers' Usage and Perceptions

October 2024 · 45 Reads

The rise of large language models (LLMs) has led many researchers to consider their usage for scientific work. Some have found benefits using LLMs to augment or automate aspects of their research pipeline, while others have urged caution due to risks and ethical concerns. Yet little work has sought to quantify and characterize how researchers use LLMs and why. We present the first large-scale survey of 816 verified research article authors to understand how the research community leverages and perceives LLMs as research tools. We examine participants' self-reported LLM usage, finding that 81% of researchers have already incorporated LLMs into different aspects of their research workflow. We also find that traditionally disadvantaged groups in academia (non-White, junior, and non-native English speaking researchers) report higher LLM usage and perceived benefits, suggesting potential for improved research equity. However, women, non-binary, and senior researchers have greater ethical concerns, potentially hindering adoption.
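The figure and table captions above report Kendall's tau correlations between LLM usage frequency and perception ratings, with Holm-Bonferroni correction; the captions state the tests were run in R with cor.test and p.adjust. A rough Python equivalent of that analysis, using placeholder data and hypothetical perception names, could look like:

```python
# Kendall's tau between usage frequency and each perception rating, with
# Holm-Bonferroni correction (the paper's analysis used cor.test and p.adjust in R).
# The Likert data and perception names below are placeholders.
import numpy as np
from scipy.stats import kendalltau
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
usage = rng.integers(1, 6, size=200)                      # placeholder usage-frequency ratings
perceptions = {name: rng.integers(1, 6, size=200)         # one column per perception
               for name in ["benefit", "risk", "ethical_concern"]}

rows = []
for name, ratings in perceptions.items():
    tau, p = kendalltau(usage, ratings)   # tuple unpacking works across scipy versions
    rows.append((name, tau, p))

reject, p_adj, _, _ = multipletests([p for _, _, p in rows], method="holm")
for (name, tau, _), p in zip(rows, p_adj):
    print(f"{name}: tau = {tau:.2f}, Holm-adjusted p = {p:.4f}")
```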


Fig. 1. Publications per year for papers included in our corpus for the years 2013-2023. Note that 2023 only includes publications from January 1, 2023 to June 30, 2023.
Fig. 3. Identifying characteristics presented for each of the 101 neurodivergent and 72 neurotypical samples. A sample could be identified across multiple characteristics (e.g., a description of "children ages 7-17 from the United States" would be coded under "Age" and "Nationality"). ND participants are most often described by their age and binary gender (male or female) when gender is presented. NT participants are most often identified exclusively by their familial relation to a corresponding ND participant or their employment (teacher, therapist, etc.). For more information on the application of identifying characteristics, see "Identifiers" in Section 8.2.8 of the Appendix.
Who Puts the "Social" in "Social Computing"?: Using A Neurodiversity Framing to Review Social Computing Research

October 2024 · 13 Reads

Human-Computer Interaction (HCI) and Computer-Supported Cooperative Work (CSCW) have a longstanding tradition of interrogating the values that underlie systems in order to create novel and accessible experiences. In this work, we use a neurodiversity framing to examine how people with ways of thinking, speaking, and being that differ from normative assumptions are perceived by researchers seeking to study and design social computing systems for neurodivergent people. From a critical analysis of 84 publications systematically gathered across a decade of social computing research, we determine that research into social computing with neurodiverse participants is largely medicalized, adheres to historical stereotypes of neurodivergent children and their families, and is insensitive to the wide spectrum of neurodivergent people who are potential users of social technologies. When social computing systems designed for neurodivergent people rely upon a conception of disability that restricts expression for the sake of preserving existing norms surrounding social experience, the result is often simplistic and restrictive systems that prevent users from "being social" in a way that feels natural and enjoyable. We argue that a neurodiversity perspective informed by critical disability theory allows us to engage with alternative forms of sociality as meaningful and desirable rather than as a deficit to be compensated for. We conclude by identifying opportunities for researchers to collaborate with neurodivergent users and their communities, including the creation of spectrum-conscious social systems and the embedding of double empathy into systems for more equitable design.




Citations (45)


... Responding to misinformation is not an easy task. It requires significant time, energy, and trust to develop high-quality and source-backed content [25,46]. Being actively involved in such efforts also puts people in positions of high visibility and, consequently, high risk for harassment [34]. ...

Reference:

The Collaborative Practices and Motivations of Online Communities Dedicated to Voluntary Misinformation Response
User experiences and needs when responding to misinformation on social media
  • Citing Article
  • November 2023

... In recent years, uncertainty quantification (UQ) has been one of the challenging topics in AI/ML, particularly for critical applications using LLMs in healthcare, finance, and law [15][16][17][18][19]. Uncertainty, which refers to the model's confidence in its predictions, can originate from different sources, such as model architecture, model parameters, noise, and insufficient information in the dataset(s) [20][21][22][23], in addition to the nature of LLMs. ...

(A)I Am Not a Lawyer, But...: Engaging Legal Experts towards Responsible LLM Policies for Legal Advice
  • Citing Conference Paper
  • June 2024

... Bao et al. [3] analyzed the utility of conversational features for the prediction of prosocial outcomes in online interactions by introducing theory-inspired metrics. Weld et al. [56] conducted a survey of Reddit users to develop a comprehensive taxonomy of community values that users find desirable to foster better communities. Relatedly, Lambert et al. [35] surveyed Reddit moderators and constructed taxonomies capturing both what moderators want to encourage and what actions they take to positively reinforce desirable contributions. ...

Making Online Communities ‘Better’: A Taxonomy of Community Values on Reddit
  • Citing Article
  • May 2024

Proceedings of the International AAAI Conference on Web and Social Media

... papers is often open-ended, with no clear decision point to stop, and opportunistic [35]. Paper recommendations often come from social networks [56], and researchers commonly form groups for the purpose of sharing and exchanging relevant research literature. ...

Mitigating Barriers to Public Social Interaction with Meronymous Communication
  • Citing Conference Paper
  • May 2024

... With the rise of generative AI, some tools like Remesh use GPT-4 to synthesize crowdsourced public views into initial policies and then iteratively refine them through expert and public feedback [43]. Other tools like PolicyKit and Pika provide infrastructures that allow online community members to set up their own process for policy design [83,91]. ...

Pika: Empowering Non-Programmers to Author Executable Governance Policies in Online Communities
  • Citing Conference Paper
  • May 2024

... First, these studies address various content types and platforms, such as social media posts [10,275], news articles [124,128,181], videos and video sharing platforms [91,130], images [91], search engines [143,184,305], or online encyclopedias [85]. ...

Viblio: Introducing Credibility Signals and Citations to Video-Sharing Platforms
  • Citing Conference Paper
  • May 2024

... Journalists and communicators rely on their professional communities and networks to validate information on breaking news together. Similarly, professional fact-checkers develop pipelines of practices to overcome challenges they face when it comes to improving effectiveness, efficiency, scale, and reach [38]. We examine whether these individual-level and community-level practices also arise among voluntary fact-checkers and similarly explore potential sociotechnical solutions that could assist those who engage in this work. ...

Misinformation as a Harm: Structured Approaches for Fact-Checking Prioritization
  • Citing Article
  • April 2024

Proceedings of the ACM on Human-Computer Interaction

... A balance is needed between removing harmful content and respecting freedom of expression, but current one-size-fits-all approaches often fail to account for individual preferences. Personalization in content moderation, where users' thresholds for what they consider immoral or offensive can differ, has emerged as an area of focus, offering insights into how personalized moderation can improve user engagement and transparency [18]. ...

Personalizing Content Moderation on Social Media: User Perspectives on Moderation Choices, Interface Design, and Labor
  • Citing Article
  • October 2023

Proceedings of the ACM on Human-Computer Interaction

... The labor of annotating, reviewing, or moderating potentially harmful content or behaviors within technology-facilitated spaces or interactions can be traced back to early online communities of unpaid volunteers [126] that enforced community norms and rules [89,120] across platforms like Discord [127], Twitch [151], Reddit [80], and Facebook [48]. In addition, end-users who are not formal community members may also engage in content moderation by reporting violations [25,68] or adjusting moderation preferences [47,67,75]. As the popularity of social media platforms grew, this labor evolved into structured and industrialized forms involving human workers in various roles. ...

Do users want platform moderation or individual control? Examining the role of third-person effects and free speech support in shaping moderation preferences
  • Citing Article
  • December 2023

New Media & Society