October 2023 · 14 Reads · 29 Citations
August 2023 · 14 Reads · 50 Citations
June 2023 · 93 Reads
While demands for change and accountability for harmful AI consequences mount, foreseeing the downstream effects of deploying AI systems remains a challenging task. We developed AHA! (Anticipating Harms of AI), a generative framework to assist AI practitioners and decision-makers in anticipating potential harms and unintended consequences of AI systems prior to development or deployment. Given an AI deployment scenario, AHA! generates descriptions of possible harms for different stakeholders. To do so, AHA! systematically considers the interplay between common problematic AI behaviors and their potential impacts on different stakeholders, and narrates these conditions through vignettes. These vignettes are then filled in with descriptions of possible harms by prompting crowd workers and large language models. By examining 4113 harms surfaced by AHA! for five different AI deployment scenarios, we found that AHA! generates meaningful examples of harms, with different problematic AI behaviors resulting in different types of harms. Prompting both crowds and a large language model with the vignettes resulted in more diverse examples of harms than those generated by either the crowd or the model alone. To gauge AHA!'s potential practical utility, we also conducted semi-structured interviews with responsible AI professionals (N=9). Participants found AHA!'s systematic approach to surfacing harms important for ethical reflection and discovered meaningful stakeholders and harms they believed they would not have thought of otherwise. Participants, however, differed in their opinions about whether AHA! should be used upfront or as a secondary check, and noted that AHA! may shift harm anticipation from an ideation problem to a potentially demanding review problem. Drawing on our results, we discuss design implications of building tools to help practitioners envision possible harms.
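The behavior-by-stakeholder crossing that seeds AHA!'s vignettes can be illustrated with a minimal sketch. The scenario, the behavior and stakeholder lists, and the query_llm wrapper below are placeholders for illustration only, not the paper's actual taxonomy, prompts, or crowdsourcing pipeline.

```python
from itertools import product

# Placeholder inputs; AHA!'s real taxonomy of problematic AI behaviors and
# stakeholder identification are far richer than these toy lists.
SCENARIO = "An AI system screens loan applications for a retail bank."
BEHAVIORS = ["produces inaccurate outputs", "performs worse for some groups"]
STAKEHOLDERS = ["loan applicants", "bank loan officers"]

def build_vignettes(scenario, behaviors, stakeholders):
    """Cross each problematic behavior with each stakeholder to form a vignette prompt."""
    return [
        f"Scenario: {scenario}\n"
        f"Suppose the system {behavior}. "
        f"Describe one possible harm to {stakeholder}."
        for behavior, stakeholder in product(behaviors, stakeholders)
    ]

def fill_vignettes(vignettes, query_llm):
    """Fill each vignette with a harm description; query_llm is an assumed
    wrapper around an LLM call (the paper also prompts crowd workers)."""
    return {vignette: query_llm(vignette) for vignette in vignettes}

if __name__ == "__main__":
    for vignette in build_vignettes(SCENARIO, BEHAVIORS, STAKEHOLDERS):
        print(vignette, end="\n\n")
```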
May 2023 · 17 Reads
Even when aggregate accuracy is high, state-of-the-art NLP models often fail systematically on specific subgroups of data, resulting in unfair outcomes and eroding user trust. Additional data collection may not help in addressing these weaknesses, as such challenging subgroups may be unknown to users and underrepresented in the existing and new data. We propose Targeted Data Generation (TDG), a framework that automatically identifies challenging subgroups, and generates new data for those subgroups using large language models (LLMs) with a human in the loop. TDG estimates the expected benefit and potential harm of data augmentation for each subgroup, and selects the ones most likely to improve within-group performance without hurting overall performance. In our experiments, TDG significantly improves the accuracy on challenging subgroups for state-of-the-art sentiment analysis and natural language inference models, while also improving overall test accuracy.
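A rough sketch of the selection step described above: estimate, per subgroup, the expected in-group gain and the change in overall accuracy under augmentation, then keep only subgroups expected to help in-group without hurting overall. The estimate_augmentation_effect callable and the toy thresholds are assumptions standing in for TDG's actual benefit/harm estimation, not the paper's method.

```python
def select_subgroups_for_augmentation(subgroups, estimate_augmentation_effect):
    """Keep subgroups whose augmentation is expected to raise in-group accuracy
    without lowering overall accuracy, ranked by expected in-group gain.

    estimate_augmentation_effect(subgroup) -> (in_group_gain, overall_delta),
    a stand-in for TDG's benefit/harm estimates (both in accuracy points).
    """
    kept = []
    for subgroup in subgroups:
        in_group_gain, overall_delta = estimate_augmentation_effect(subgroup)
        if in_group_gain > 0 and overall_delta >= 0:
            kept.append((in_group_gain, subgroup))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [subgroup for _, subgroup in kept]

# Toy usage with hard-coded (in-group gain, overall change) estimates:
toy_effects = {"negation": (4.0, 0.2), "sarcasm": (2.5, -0.5)}
print(select_subgroups_for_augmentation(toy_effects, toy_effects.get))  # ['negation']
```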
May 2023 · 17 Reads
Despite substantial advancements, Natural Language Processing (NLP) models often require post-training adjustments to enforce business rules, rectify undesired behavior, and align with user values. These adjustments involve operationalizing "concepts": dictating desired model responses to certain inputs. However, it is difficult for a single entity to enumerate and define all possible concepts, indicating a need for a multi-user, collaborative model alignment framework. Moreover, the exhaustive delineation of a concept is challenging, and an improper approach can create shortcuts or interfere with the original data or other concepts. To address these challenges, we introduce CoDev, a framework that enables multi-user interaction with the model, thereby mitigating individual limitations. CoDev aids users in operationalizing their concepts using large language models, relying on the principle that NLP models exhibit simpler behaviors in local regions. Our main insight is learning a local model for each concept, and a global model to integrate the original data with all concepts. We then steer a large language model to generate instances within concept boundaries where the local and global models disagree. Our experiments show CoDev is effective at helping multiple users operationalize concepts and avoid interference across a variety of scenarios, tasks, and models.
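The local/global disagreement idea lends itself to a small sketch: generate candidates near a concept's examples and keep those where a per-concept local model and the global model disagree, to be resolved by the concept's owner. The generate_variations, local_model, and global_model callables are assumed placeholders, not CoDev's actual components.

```python
def mine_disagreements(concept_seeds, local_model, global_model, generate_variations):
    """Collect inputs inside a concept's region where the local concept model and
    the global model disagree; such instances would be surfaced for user labeling."""
    disagreements = []
    for seed in concept_seeds:
        for candidate in generate_variations(seed):  # e.g., LLM-generated paraphrases
            if local_model(candidate) != global_model(candidate):
                disagreements.append(candidate)
    return disagreements
```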
April 2023 · 41 Reads · 2 Citations
Large language models are becoming increasingly pervasive in society via deployment in sociotechnical systems. Yet these language models, be it for classification or generation, have been shown to be biased and to behave irresponsibly, causing harm to people at scale. It is crucial to audit these language models rigorously. Existing auditing tools leverage either or both humans and AI to find failures. In this work, we draw upon literature in human-AI collaboration and sensemaking, and conduct interviews with research experts in safe and fair AI, to build upon the auditing tool AdaTest (Ribeiro and Lundberg, 2022), which is powered by a generative large language model (LLM). Through the design process we highlight the importance of sensemaking and human-AI communication to leverage complementary strengths of humans and generative models in collaborative auditing. To evaluate the effectiveness of the augmented tool, AdaTest++, we conduct user studies with participants auditing two commercial language models: OpenAI's GPT-3 and Azure's sentiment analysis model. Qualitative analysis shows that AdaTest++ effectively leverages human strengths such as schematization, hypothesis formation, and testing. Further, with our tool, participants identified a variety of failure modes, covering 26 different topics across 2 tasks, including failures shown before in formal audits as well as failures previously under-reported.
March 2023 · 5 Reads · 16 Citations
March 2023 · 15,393 Reads · 181 Citations
Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system. In our exploration of GPT-4, we put special emphasis on discovering its limitations, and we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction. We conclude with reflections on societal influences of the recent technological leap and future research directions.
February 2023 · 52 Reads
The in-context learning capabilities of LLMs like GPT-3 allow annotators to customize an LLM to their specific tasks with a small number of examples. However, users tend to include only the most obvious patterns when crafting examples, resulting in underspecified in-context functions that fall short on unseen cases. Further, it is hard to know when "enough" examples have been included even for known patterns. In this work, we present ScatterShot, an interactive system for building high-quality demonstration sets for in-context learning. ScatterShot iteratively slices unlabeled data into task-specific patterns, samples informative inputs from underexplored or not-yet-saturated slices in an active learning manner, and helps users label more efficiently with the help of an LLM and the current example set. In simulation studies on two text perturbation scenarios, ScatterShot sampling improves the resulting few-shot functions by 4-5 percentage points over random sampling, with less variance as more examples are added. In a user study, ScatterShot greatly helps users in covering different patterns in the input space and labeling in-context examples more efficiently, resulting in better in-context learning and less user effort.
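A minimal sketch of the sampling idea: prefer unlabeled inputs from slices with few labeled examples, breaking ties by model uncertainty. The slice_of and uncertainty_of callables are assumptions; the real system infers task-specific slices interactively and uses the LLM to propose candidate labels.

```python
from collections import Counter

def propose_next_examples(unlabeled, labeled, slice_of, uncertainty_of, k=5):
    """Pick k unlabeled inputs, favoring under-represented slices and then higher
    model uncertainty (a toy stand-in for ScatterShot's active sampling)."""
    labeled_counts = Counter(slice_of(x) for x in labeled)
    ranked = sorted(
        unlabeled,
        key=lambda x: (labeled_counts[slice_of(x)], -uncertainty_of(x)),
    )
    return ranked[:k]
```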
January 2023 · 4 Reads · 8 Citations
... On one hand, systematic biases may arise during the data collection process, leading to the underestimation or overestimation of certain groups or characteristics within the training data. Such biases may stem from socioeconomic factors, limitations in technical access, or flaws in data collection methodologies (48). For instance, if certain patients lack access to healthcare resources due to poverty or the digital divide, their representation in the training data may be inadequate, resulting in their underrepresentation in public health policy assessments. ...
February 2016
... Although in recent years, several works (Chung et al., 2019;Sagadeeva & Boehm, 2021;d'Eon et al., 2022;Eyuboglu et al., 2022;Metzen et al., 2023;Plumb et al., 2023;Jain et al., 2023;Gao et al., 2023) have proposed methods for analysing systematic weaknesses, there is a lack of focus on identifying weaknesses of models evaluated on real-world datasets where the weaknesses align with human-understandable semantic concepts defined by, e.g., safety experts in ODDs. We argue that it is more beneficial from a safety perspective if the approaches to identify systematic weaknesses are ODD compliant for two main reasons: (i) the slices are useful as the identified vulnerabilities are aligned with human-understandable safety-relevant dimensions. ...
October 2023
... • Allowing users to directly interact with AI-enabled systems to construct comprehensive mental models about the entire AI pipeline to help them understand the output and recommendations [37]. • Supporting users by calibrating their confidence in a hypothesis based on available evidence [60]. • Comparing AI output and recommendations with not only the domain experts' and users' knowledge and experience, but also other similar AI-enabled systems, and peers [39]. ...
August 2023
... Among the non-parameter-modifying methods, there are two paradigms. The first is knowledge editing based on retrieval augmentation, which treats the new knowledge as external knowledge in the retrieval-augmented model, i.e., SERAC [9], IKE [10], Wang et al. [11], Shi et al. [12], MemPrompt [13], Murty et al. [14]. The second involves adding extra trainable parameters, which are trained on the modified knowledge dataset while the original model parameters remain unchanged, i.e., T-Patcher [15], CaliNet [16], GRACE [17], MELO [18]. ...
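A heavily simplified sketch of the first (retrieval-augmentation) paradigm mentioned in the excerpt: answer from an external memory of edits when a query falls in an edit's scope, otherwise defer to the unchanged base model. The matches scope check and the memory format are placeholders; the cited methods differ substantially in their actual designs.

```python
def make_edited_model(base_model, edit_memory, matches):
    """Wrap a frozen base model with an external memory of (query, answer) edits.
    Queries covered by a stored edit are answered from memory; all others fall
    back to the original model, whose parameters are never modified."""
    def edited_model(query):
        for edit_query, edited_answer in edit_memory:
            if matches(query, edit_query):  # `matches` is an assumed scope classifier
                return edited_answer
        return base_model(query)
    return edited_model
```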
January 2022
... Recent advancements in this field concentrate on optimizing distillation objectives to improve the efficiency and effectiveness of the distillation process (Zhong et al., 2024; Ko et al., 2024; Agarwal et al., 2024). Besides, there is a growing trend towards distilling specialized capabilities from LLMs, including leveraging LLMs as annotators to generate pseudo-labeled data (Ding et al., 2023; Xu et al., 2023b; Zhou et al., 2024; He et al., 2024) and synthesizing task-specific data from scratch (Ye et al., 2022; He et al., 2023; Gao et al., ...). ...
January 2023
... In each example, denotes the core interaction design, while indicates the absence of typical paradigm instances identified in our paper, leaving for future exploration. The examples are from (a1) [43], (a3) [46], (a4) [32], (b1) [53], (b2) [72], (b3) [38], (b4) [80], (c1) [75], (d1) [70], and (d2) [4]. ...
March 2023
... The scaling laws for neural language models further elucidated the relationship between model size, dataset size, and performance, highlighting the benefits of scaling up model parameters and training data [20]. Models like GPT-3 and GPT-4 [21] have pushed the boundaries of LLM capabilities, exhibiting emergent abilities and even sparking discussions about artificial general intelligence [22]. Beyond text, LLMs are also being extended to handle multimodal inputs, as seen in models designed for tasks like image captioning [23] and benchmarks developed to evaluate emotional intelligence in multimodal contexts [5]. ...
March 2023
... To establish the design goals for our system, we adapted Pirolli and Card's sense-making framework [45] with insights gained from semi-structured interviews with three data scientists experienced in subgroup analysis. Sense-making captures how analysts move between individual observations and larger-scale, more rigorous hypotheses, similar to the process of insight discovery in EDA [8,59]. However, it is unclear how the stages of sense-making might correspond to exploratory analysis steps on subgroup data. ...
June 2022
ACM Transactions on Computer-Human Interaction
... 9. Unambiguity is estimated as the inverse of the total number of definitions in the dictionary corresponding to the term entry. 10. Nominativity (as opposed to a descriptive attribute) is calculated according to the formula Knom = 1/(1 + nconj + nend), where nconj is the number of conjunctions in the collocation and nend is the number of verb endings "ty", "tysja", "tysj". ...
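For concreteness, the quoted nominativity formula computes as follows; this is a direct transcription of the formula in the excerpt, and how conjunctions and the listed verb endings are counted in the source is outside this sketch.

```python
def nominativity(n_conj, n_end):
    """Knom = 1 / (1 + n_conj + n_end), where n_conj counts conjunctions in the
    collocation and n_end counts the verb endings "ty", "tysja", "tysj"."""
    return 1.0 / (1 + n_conj + n_end)

# A collocation with one conjunction and one listed verb ending:
print(nominativity(1, 1))  # 0.333...
```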
December 2022
... Then, we qualitatively analyze the final list of tokens in order to identify linguistic patterns which can be converted into features. This method is similar to the approach presented by Zhou et al. (2022), which consists in deriving information about the internal reasoning of a complex model from local explanations of its predictions. We restrict our analysis to the fifty tokens which receive the most attention for each class (opinion and news). ...
January 2022