Gavin Abercrombie

  • Doctor of Philosophy
  • Research Associate at Heriot-Watt University

About

  • Publications: 40
  • Reads: 13,433
  • Citations: 495
Current institution
Heriot-Watt University
Current position
  • Research Associate

Publications (40)
Conference Paper
Full-text available
Transcripts of UK parliamentary debates provide access to the opinions of politicians towards many important topics, but due to the large quantity of textual data and the specialised language used, they are not straightforward for human readers to process. We apply opinion mining methods to these transcripts to classify the sentiment polarity of sp...
Conference Paper
Full-text available
Analysis of the topics mentioned and opinions expressed in parliamentary debate motions (or proposals) is difficult for human readers, but necessary for understanding and automatic processing of the content of the subsequent speeches. We present a dataset of debate motions with pre-existing ‘policy’ labels, and investigate the utility of these labels...
Article
Full-text available
Parliamentary and legislative debate transcripts provide access to information concerning the opinions, positions, and policy preferences of elected politicians. They attract attention from researchers from a wide variety of backgrounds, from political and social sciences to computer science. As a result, the problem of computational sentiment and...
Preprint
Full-text available
Gender-Based Violence (GBV) is an increasing problem online, but existing datasets fail to capture the plurality of possible annotator perspectives or ensure the representation of affected groups. We revisit two important stages in the moderation pipeline for GBV: (1) manual data labelling; and (2) automated classification. For (1), we examine two...
Article
Full-text available
Counterspeech offers direct rebuttals to hateful speech by challenging perpetrators of hate and showing support to targets of abuse. It provides a promising alternative to more contentious measures, such as content moderation and deplatforming, by contributing a greater amount of positive online speech rather than attempting to mitigate harmful con...
Article
Full-text available
In Artificial Intelligence research, perspectivism is an approach to machine learning that aims at leveraging data annotated by different individuals in order to model varied perspectives that influence their opinions and world view. We present the first survey of datasets and methods relevant to perspectivism in Natural Language Processing (NLP)....
Preprint
Full-text available
This paper introduces a collaborative, human-centered taxonomy of AI, algorithmic and automation harms. We argue that existing taxonomies, while valuable, can be narrow, unclear, typically cater to practitioners and government, and often overlook the needs of the wider public. Drawing on existing taxonomies and a large repository of documented inci...
Preprint
Full-text available
Counterspeech offers direct rebuttals to hateful speech by challenging perpetrators of hate and showing support to targets of abuse. It provides a promising alternative to more contentious measures, such as content moderation and deplatforming, by contributing a greater amount of positive online speech rather than attempting to mitigate harmful con...
Preprint
Full-text available
In this paper, we trace the biases in current natural language processing (NLP) models back to their origins in racism, sexism, and homophobia over the last 500 years. We review literature from critical race theory, gender studies, data ethics, and digital humanities studies, and summarize the origins of bias in NLP models from these social science...
Preprint
Full-text available
Automated dialogue or conversational systems are anthropomorphised by developers and personified by users. While a degree of anthropomorphism is inevitable, conscious and unconscious design choices can guide users to personify them to varying degrees. Encouraging users to relate to automated systems as if they were human can lead to transparency an...
Preprint
Full-text available
There are two competing approaches for modelling annotator disagreement: distributional soft-labelling approaches (which aim to capture the level of disagreement) or modelling perspectives of individual annotators or groups thereof. We adapt a multi-task architecture, which has previously shown success in modelling perspectives, to evaluate its...
Preprint
Full-text available
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible. We present our results and findings, which include that just 13% of papers had (i) sufficiently low barriers to reproduction, and (ii) enough obtainable...
Preprint
Full-text available
NLP datasets annotated with human judgments are rife with disagreements between the judges. This is especially true for tasks depending on subjective judgments such as sentiment analysis or offensive language detection. Particularly in these latter cases, the NLP community has come to realize that the approach of 'reconciling' these different subje...
Preprint
Full-text available
We commonly use agreement measures to assess the utility of judgements made by human annotators in Natural Language Processing (NLP) tasks. While inter-annotator agreement is frequently used as an indication of label reliability by measuring consistency between annotators, we argue for the additional use of intra-annotator agreement to measure labe...
Preprint
Full-text available
Conversational AI systems can engage in unsafe behaviour when handling users' medical queries that can have severe consequences and could even lead to deaths. Systems therefore need to be capable of both recognising the seriousness of medical inputs and producing responses with appropriate levels of risk. We create a corpus of human written English...
Article
Full-text available
Legislative debate transcripts provide citizens with information about the activities of their elected representatives, but are difficult for people to process. We propose the novel task of policy-focused stance detection, in which both the policy proposals under debate and the position of the speakers towards those proposals are identified. We ada...
Preprint
We present the first English corpus study on abusive language towards three conversational AI systems gathered "in the wild": an open-domain social bot, a rule-based chatbot, and a task-based system. To account for the complexity of the task, we take a more 'nuanced' approach where our ConvAI dataset reflects fine-grained notions of abuse, as well...
Preprint
Full-text available
Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans. However, these models are often trained on large datasets from the internet, and as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. Researche...
Preprint
Full-text available
Technology companies have produced varied responses to concerns about the effects of the design of their conversational AI systems. Some have claimed that their voice assistants are in fact not gendered or human-like, despite design features suggesting the contrary. We compare these claims to user perceptions by analysing the pronouns they use wh...
Preprint
Parliamentary and legislative debate transcripts provide access to information concerning the opinions, positions and policy preferences of elected politicians. They attract attention from researchers from a wide variety of backgrounds, from political and social sciences to computer science. As a result, the problem of automatic sentiment and posit...
Conference Paper
Full-text available
Hansard transcripts provide access to Members of Parliament’s opinions on many important issues, but are difficult for people to process. Existing corpora for sentiment analysis in Hansard debates rely on speakers’ votes as sentiment labels, but these votes are known to be constrained by speakers’ party affiliations. We develop an annotation scheme...
