February 2025 · 5 Reads · 1 Citation
December 2024 · 8 Reads
October 2024 · 6 Reads
Stereotypical bias encoded in language models (LMs) poses a threat to safe language technology, yet our understanding of how bias manifests in the parameters of LMs remains incomplete. We introduce local contrastive editing, a method that enables the localization and editing of a subset of weights in a target model relative to a reference model. We deploy this approach to identify and modify subsets of weights that are associated with gender stereotypes in LMs. Through a series of experiments, we demonstrate that local contrastive editing can precisely localize and control a small subset (< 0.5%) of weights that encode gender bias. Our work (i) advances our understanding of how stereotypical biases can manifest in the parameter space of LMs and (ii) opens up new avenues for developing parameter-efficient strategies for controlling model properties in a contrastive manner.
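The core idea of localizing a small weight subset by contrasting a target model against a reference model can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes two models with identical architectures (represented here as dicts of NumPy arrays) and uses a simple magnitude-of-difference criterion to pick the edited fraction.

```python
import numpy as np

def contrastive_edit(target_w, reference_w, fraction=0.005):
    """Localize the small weight subset that differs most between a target
    and a reference model, then overwrite it with the reference values.

    target_w, reference_w: dicts mapping layer names to weight arrays of
                           identical shapes (illustrative stand-in for two
                           checkpoints of the same architecture)
    fraction: share of weights to edit (the abstract reports < 0.5%)
    """
    # Collect absolute differences across all layers to pick one global threshold.
    diffs = np.concatenate(
        [np.abs(target_w[k] - reference_w[k]).ravel() for k in target_w]
    )
    k = max(1, int(fraction * diffs.size))
    threshold = np.partition(diffs, -k)[-k]  # k-th largest difference

    edited, masks = {}, {}
    for name in target_w:
        mask = np.abs(target_w[name] - reference_w[name]) >= threshold
        w = target_w[name].copy()
        w[mask] = reference_w[name][mask]  # edit only the localized weights
        edited[name], masks[name] = w, mask
    return edited, masks
```

In practice the localization criterion would come from the paper's unstructured-pruning procedure rather than a raw weight difference, but the contrastive structure (target vs. reference, tiny edited mask) is the same.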
September 2024 · 7 Reads
This paper proposes temporally aligned Large Language Models (LLMs) as a tool for longitudinal analysis of social media data. We fine-tune Temporal Adapters for Llama 3 8B on full timelines from a panel of British Twitter users, and extract longitudinal aggregates of emotions and attitudes with established questionnaires. We validate our estimates against representative British survey data and find strong positive, significant correlations for several collective emotions. The obtained estimates are robust across multiple training seeds and prompt formulations, and in line with collective emotions extracted using a traditional classification model trained on labeled data. To the best of our knowledge, this is the first work to extend the analysis of affect in LLMs to a longitudinal setting through Temporal Adapters. Our work enables new approaches towards the longitudinal analysis of social media data.
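The aggregation step described above can be sketched schematically. This is a hypothetical skeleton, not the paper's pipeline: `ask` stands in for prompting the adapter-tuned LLM with a questionnaire item and mapping its answer onto a numeric scale, and the adapter identifiers are invented for illustration.

```python
from statistics import mean

def longitudinal_estimates(adapters, items, ask):
    """Administer a fixed questionnaire through one Temporal Adapter per
    time slice and aggregate item scores into a longitudinal series.

    adapters: iterable of (period, adapter_id) pairs, one adapter per slice
    items: questionnaire item prompts (e.g. from an established inventory)
    ask: callable (adapter_id, item) -> numeric score; placeholder for the
         LLM call plus response-to-scale mapping
    """
    series = {}
    for period, adapter_id in adapters:
        # Average over items (and, in the paper, additionally over training
        # seeds and prompt formulations) to get one estimate per period.
        series[period] = mean(ask(adapter_id, it) for it in items)
    return series
```

The resulting per-period series is what would then be correlated against representative survey data for validation.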
August 2024 · 5 Reads
ACM Transactions on Social Computing
In recent years, several measures have been developed for evaluating group fairness of rankings. Given that these measures were developed with different application contexts and ranking algorithms in mind, it is not straightforward which measure to choose for a given scenario. Previous work has laid out some categorizations of measures and explored relationships between them; however, there has not yet been a thorough mathematical analysis of practically grounded properties of individual measures. In this paper, we therefore apply an axiomatic approach to perform a comprehensive analysis of existing group fairness measures that have been developed in the context of fair ranking. To this end, we propose a set of fourteen properties for group fairness measures that consider different ranking settings. These properties specifically provide information about how fairness scores of ranked outputs can be interpreted and contextualized. For a given use case, one can then identify which properties are of interest, and select a measure based on whether it satisfies these properties. We further apply our properties to twelve existing group fairness measures, and through both empirical and theoretical results demonstrate that most of these measures satisfy only a small subset of the proposed properties. These findings highlight limitations of existing measures, and provide insights into how to evaluate and interpret different fairness measures in practical deployment. Overall, our work can assist practitioners in selecting appropriate group fairness measures for a specific application, and also aid researchers in designing and evaluating such measures.
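To make the idea of checking a measure against a property concrete, here is a small illustrative group fairness measure together with a property check. The measure below (mean-exposure disparity with a logarithmic position discount) is an assumption of this sketch, not one of the twelve measures analyzed in the paper; the checked property (invariance under swapping two same-group items) is likewise only an example of the kind of axiom the paper formalizes.

```python
import math

def exposure_disparity(ranking, group_of):
    """Absolute difference between the mean exposure of two groups in a
    ranking; 0 means both groups receive equal average exposure.

    ranking: ordered list of item ids (best first)
    group_of: dict mapping item id -> group label (two groups assumed)
    Exposure of rank i uses the common logarithmic discount 1/log2(i + 2).
    """
    exposure = {}
    for i, item in enumerate(ranking):
        exposure.setdefault(group_of[item], []).append(1.0 / math.log2(i + 2))
    means = [sum(v) / len(v) for v in exposure.values()]
    return abs(means[0] - means[1])
```

Checking whether such a score stays unchanged when two items of the same group swap positions, or how it behaves when one group is pushed to the bottom, is exactly the style of property-based analysis the abstract describes.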
August 2024 · 20 Reads
Measuring the similarity of different representations of neural architectures is a fundamental task and an open research challenge for the machine learning community. This paper presents the first comprehensive benchmark for evaluating representational similarity measures based on well-defined groundings of similarity. The representational similarity (ReSi) benchmark consists of (i) six carefully designed tests for similarity measures, (ii) 23 similarity measures, (iii) eleven neural network architectures, and (iv) six datasets, spanning the graph, language, and vision domains. The benchmark opens up several important avenues of research on representational similarity that enable novel explorations and applications of neural architectures. We demonstrate the utility of the ReSi benchmark by conducting experiments on various neural network architectures, real-world datasets, and similarity measures. All components of the benchmark are publicly available and thereby facilitate systematic reproduction and production of research results. The benchmark is extensible, so future research can build on and further expand it. We believe that the ReSi benchmark can serve as a sound platform catalyzing future research that aims to systematically evaluate existing and explore novel ways of comparing representations of neural architectures.
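As an example of the kind of measure such a benchmark evaluates, here is linear Centered Kernel Alignment (CKA), a widely used representational similarity measure. Whether CKA is among the 23 measures in ReSi is not stated in the abstract; this sketch is only meant to show what "comparing two representations" looks like operationally.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two representation matrices
    X, Y of shape (n_inputs, n_features); returns a value in [0, 1], where
    1 indicates representations identical up to rotation and scaling."""
    X = X - X.mean(axis=0)  # center each feature dimension
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))
```

A benchmark test in the ReSi style would then check whether such a score agrees with a known grounding, e.g. that representations of models trained on the same data score higher than those of unrelated models.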
June 2024 · 6 Reads
January 2024 · 892 Reads · 62 Citations
Perspectives on Psychological Science
We illustrate how standard psychometric inventories originally designed for assessing noncognitive human traits can be repurposed as diagnostic tools to evaluate analogous traits in large language models (LLMs). We start from the assumption that LLMs, inadvertently yet inevitably, acquire psychological traits (metaphorically speaking) from the vast text corpora on which they are trained. Such corpora contain sediments of the personalities, values, beliefs, and biases of the countless human authors of these texts, which LLMs learn through a complex training process. The traits that LLMs acquire in such a way can potentially influence their behavior, that is, their outputs in downstream tasks and applications in which they are employed, which in turn may have real-world consequences for individuals and social groups. By eliciting LLMs’ responses to language-based psychometric inventories, we can bring their traits to light. Psychometric profiling enables researchers to study and compare LLMs in terms of noncognitive characteristics, thereby providing a window into the personalities, values, beliefs, and biases these models exhibit (or mimic). We discuss the history of similar ideas and outline possible psychometric approaches for LLMs. We demonstrate one promising approach, zero-shot classification, for several LLMs and psychometric inventories. We conclude by highlighting open challenges and future avenues of research for AI Psychometrics.
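The scoring side of administering a psychometric inventory to an LLM can be sketched as follows. This is a hedged illustration of the general approach, not the paper's protocol: `classify` stands in for the zero-shot classification step (mapping the model's response to an inventory item onto a Likert label), and the example items are invented.

```python
def trait_score(items, classify, scale=("disagree", "neutral", "agree")):
    """Score one psychometric trait by administering inventory items.

    items: list of (statement, reverse_keyed) pairs from an inventory
    classify: callable (statement, labels) -> chosen label; placeholder for
              zero-shot classification of the LLM's response
    Returns the mean item score on a 1..len(scale) scale.
    """
    total = 0
    for statement, reverse in items:
        value = scale.index(classify(statement, scale)) + 1
        if reverse:  # reverse-keyed items are flipped before aggregation
            value = len(scale) + 1 - value
        total += value
    return total / len(items)
```

Running this for each trait scale of an inventory yields the kind of psychometric profile the abstract describes, which can then be compared across models.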
January 2024 · 1 Read · 1 Citation
December 2023 · 98 Reads · 1 Citation
PNAS Nexus
Wikipedia is one of the most successful collaborative projects in history. It is the largest encyclopedia ever created, with millions of users worldwide relying on it as the first source of information as well as for fact-checking and in-depth research. As Wikipedia relies solely on the efforts of its volunteer editors, its success might be particularly affected by toxic speech. In this paper, we analyze all 57 million comments made on user talk pages of 8.5 million editors across the six most active language editions of Wikipedia to study the potential impact of toxicity on editors’ behavior. We find that toxic comments are consistently associated with reduced activity of editors, equivalent to 0.5–2 active days per user in the short term. This translates to multiple human-years of lost productivity, considering the number of active contributors to Wikipedia. The effects of toxic comments are potentially even greater in the long term, as they are associated with a significantly increased risk of editors leaving the project altogether. Using an agent-based model, we demonstrate that toxicity attacks on Wikipedia have the potential to impede the progress of the entire project. Our results underscore the importance of mitigating toxic speech on collaborative platforms such as Wikipedia to ensure their continued success.
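The agent-based mechanism described above can be sketched in a few lines. All parameter values below are illustrative placeholders, not the paper's calibrated estimates: each day, an active editor may receive a toxic comment, which costs an active day and multiplies the editor's probability of leaving the project for good.

```python
import random

def simulate(n_editors=1000, days=365, p_toxic=0.01,
             leave_boost=5.0, base_leave=0.0005, seed=0):
    """Minimal agent-based sketch of toxicity effects on a volunteer
    community. Returns (editors still active, total edits made)."""
    rng = random.Random(seed)
    active = n_editors
    edits = 0
    for _ in range(days):
        survivors = 0
        for _ in range(active):
            toxic = rng.random() < p_toxic
            if not toxic:
                edits += 1  # a toxic day is a lost active day (short-term effect)
            # toxicity multiplies the daily probability of quitting (long-term effect)
            p_leave = base_leave * (leave_boost if toxic else 1.0)
            if rng.random() >= p_leave:
                survivors += 1
        active = survivors
    return active, edits
```

Comparing runs with and without toxicity shows the compounding loss the abstract describes: fewer edits in the short term and a smaller surviving editor base over time.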
... Their work highlighted the ability to localize gender bias and proposed parameter-efficient fine-tuning strategies to mitigate it. Similarly, Lutz et al. (2024) introduced local contrastive editing, a technique leveraging unstructured pruning to precisely localize individual model weights responsible for encoding gender stereotypes. This method enabled them to edit these weights efficiently, mitigating bias without significant degradation of model performance. ...
January 2024
... AI systems' social abilities and how they articulate dominance can be considered in the design process [e.g., 31]. Resulting AI systems' psychometric qualities can be measured [69] and it can be evaluated whether they are perceived as teammates [e.g., 2, 50]. ...
January 2024
Perspectives on Psychological Science
... In the process of the creation of open knowledge content in Wikipedia, effective management of volunteers means not only aligning people with common goals, but also taking care of the well-being of the users and providing them with technical and substantive support. Self-governing communities of practice such as Wikipedia face challenges in meeting content creation goals by optimizing volunteer efforts and a division of tasks (Smirnov, Oprea and Strohmaier 2023). Providing users with an engaging and rewarding experience is essential to attracting and retaining user participation, which is often addressed through discussions about motivations and non-material rewards. ...
December 2023
PNAS Nexus
... Engler et al. [58] introduce SensePOLAR, a technique that enhances the interpretability of pre-trained contextual word embeddings by differentiating word senses. It extends the POLAR framework introduced by Mathew et al. [55], utilising semantic differentials (scales between two antonyms) to create a sense-aware space. ...
January 2022
... For instance, radio frequency identification (RFID), which uses radio waves to transmit data wirelessly between an RFID transponder and an RFID reader, has been used to identify face-to-face interactions in many recent studies [10,14-16]. For example, Génois et al. [17] present data collected at four science conferences for social interaction studies, where participants wore RFID cards and antennas that detect other sensors within a 1.5 m radius during face-to-face contact. A limitation of this tool is that sensors must be in designated areas with no obstacles in the way in order to be detected and yield accurate measurements. ...
June 2023
Personality Science
... An agent's network is considered homophilous if the national identities of over seventy percent of those tied to the agent are within the national identity similarity threshold, s_ij ≤ 0.5. This model relaxes the majority and minority group proportions of Neuhauser et al. 2022. ...
May 2023
... Generalizing ideas from [40], in this paper we address this gap. We design an algorithm to efficiently generate synthetic network samples that (a) resemble the mesoscale path structure of a given temporal network up to a certain depth d, and are (b) maximally random otherwise. ...
April 2023
... It is not difficult to figure out that neighborhood aggregation is the core of all GNNs. Whether for unsigned or signed graphs, GNN backbones often face fairness and bias issues due to the homogeneity caused by their aggregation mechanisms (Chen et al. 2022; Hussain et al. 2022; Zhang et al. 2023a; Chen et al. 2024; Liu, Nguyen, and Fang 2023). Even with balance theory, which sets forth two paths, namely the balanced and the unbalanced, capturing the interplay of both positive and negative relationships, aggregation is essential. ...
November 2022
... Different from other novel technologies, though, AI can be imbued with humanlike characteristics, has the potential to fulfill roles once reserved exclusively for humans, and is increasingly compared with human counterparts on psychological dimensions (De Freitas et al., 2023; Morewedge, 2022). AI language models can possess different personality profiles and make different social impressions through agreeable or neurotic verbal expressions (Pellert et al., 2022; Safdari et al., 2023). Anthropomorphizing robots can often induce more trust in service contexts through their human-like appearances (Waytz et al., 2014). ...
December 2022
... It involves offline activity in campaigns, meetings at local branches, etc. On the other hand, participation has never been an absolute value - just as in Germany, where the Pirates introduced the possibility for members to delegate their powers via liquid democracy (Kling 2015). ...
August 2021
Proceedings of the International AAAI Conference on Web and Social Media