Aurélie Herbelot’s research while affiliated with University of Trento and other places


Publications (49)


LLMs as supersloppers
  • Preprint
  • File available

November 2024 · 47 Reads · Lucy Duggan · Aurelie Herbelot · [...] · Eva von Redecker

The development of Large Language Models (LLMs) as a component of systems such as ChatGPT foregrounds a range of issues which can only be analysed through novel interdisciplinary approaches. Our pilot project ‘Exploring novel figurative language to conceptualise Large Language Models’, funded by Cambridge Language Sciences, is aimed at helping both specialists and non-specialists gain a more precise understanding of the technology and its implications. In this poster, we use ‘slop’ as a metaphor to highlight one aspect of LLMs, but situate the issue in a broader context. We use ‘slop’ to mean text delivered to a reader which is of little or no value to them (or is even harmful), or which is so verbose or convoluted that the value is hidden. Examples of slop include over-general instructions, unnecessary terms and conditions, and spam email. The term ‘slop’ is sometimes used specifically for AI-generated content, but in our usage it predates machine-generated text. Slop arises when desiderata other than communication with readers determine text production or delivery. Systems incorporating LLMs may become ‘supersloppers’: tools for the creation and delivery of more and more pointless text. Because so much slop already exists, and because it is often repetitious, maximizing the text on which LLMs are trained results in systems which excel at the production of slop. It is useful to think of slop as a category because it draws attention to specific ways in which the arenas we are examining are far removed from the basic setting of human conversation.


Figure 1 A graphical model that shows dependencies between random variables: the word bat may have been uttered because the situation included some animal referent, or a baseball bat. Given the presence of the word fly, the animal is more likely, though the baseball bat is not entirely impossible (as in The bat flew across the pitch, presumably due to some player's frustration). Shaded nodes are random variables whose values are known (called observed).
Figure 2 Directed graphical model and (part of) DRS for the sentence The astronomer married the star. Nodes are random variables. Next to the nodes: possible values for some random variables. Dashed line: link between DRS condition and random variable.
Figure 3 Situation Description System for the sentence A bat was sleeping, with selectional constraints only. For each random variable, the list of all possible values is shown next to the node. Node numbers have been added for easier discussion in the text.
Figure 6 Conflicting constraints in the sentence The astronomer married the star: either the concept for star conflicts with the selectional constraint (left), or it conflicts with the preference for a coherent scenario (right).
How to Marry a Star: Probabilistic Constraints for Meaning in Context

January 2024 · 33 Reads · 2 Citations · Journal of Semantics

In this paper, we derive a notion of word meaning in context that characterizes meaning as both intensional and conceptual. We introduce a framework for specifying local as well as global constraints on word meaning in context, together with their interactions, thus modelling a wide range of lexical shifts and ambiguities observed in utterance interpretation. We represent sentence meaning as a situation description system, a probabilistic model which takes utterance understanding to be the mental process of describing to oneself one or more situations that would account for an observed utterance. We show how the system can be implemented in practice, and apply it to examples containing various contextualisation phenomena.
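
The conflicting constraints in Figure 6 can be given a toy numeric illustration. The sketch below is not the paper's situation description system: the scenarios, concept values and all probabilities are invented. It scores joint assignments for The astronomer married the star by multiplying a scenario prior, scenario-concept coherence, and the selectional preference of marry, then normalises:

```python
from itertools import product

# Toy illustration only: invented scenarios, concepts and probabilities.
star_concepts = ["celestial_body", "celebrity"]
scenarios = ["astronomy", "romance"]

scenario_prior = {"astronomy": 0.5, "romance": 0.5}

# P(concept for 'star' | scenario): coherent scenarios prefer matching concepts.
concept_given_scenario = {
    ("celestial_body", "astronomy"): 0.9, ("celebrity", "astronomy"): 0.1,
    ("celestial_body", "romance"): 0.2, ("celebrity", "romance"): 0.8,
}

# Selectional constraint of 'marry': its object is normally a person.
selectional_fit = {"celestial_body": 0.01, "celebrity": 0.99}

# Score every joint assignment, then normalise to get a posterior.
joint = {
    (sc, c): scenario_prior[sc] * concept_given_scenario[(c, sc)] * selectional_fit[c]
    for sc, c in product(scenarios, star_concepts)
}
total = sum(joint.values())
for (sc, c), p in sorted(joint.items(), key=lambda kv: -kv[1]):
    print(f"scenario={sc:9} star={c:14} P={p / total:.3f}")
```

With these made-up numbers the 'celebrity' reading wins, because satisfying the selectional constraint of marry pulls the scenario away from astronomy: the same trade-off Figure 6 depicts.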


How Understanding Shapes Reasoning: Experimental Argument Analysis with Methods from Psycholinguistics and Computational Linguistics

June 2023 · 22 Reads

Empirical insights into language processing have a philosophical relevance that extends well beyond philosophical questions about language. This chapter will discuss this wider relevance: We will consider how experimental philosophers can examine language processing in order to address questions in several different areas of philosophy. To do so, we will present the emerging research program of experimental argument analysis (EAA) that examines how automatic language processing shapes verbal reasoning – including philosophical arguments. The evidential strand of experimental philosophy uses mainly questionnaire-based methods to assess the evidentiary value of intuitive judgments that are adduced as evidence for philosophical theories and as premises for philosophical arguments. Extending this prominent strand of experimental philosophy, EAA underpins such assessments, extends the scope of the assessments, and expands the range of the empirical methods employed: EAA examines how automatic inferences that are continually made in language comprehension and production shape verbal reasoning, and draws on findings about comprehension biases that affect the contextualisation of such default inferences, in order to explain and expose fallacies. It deploys findings to assess premises and inferences from premises to conclusions, in philosophical arguments. To do so, it adapts methods from psycholinguistics and recruits methods from computational linguistics.


Figure 3: Distribution of average cosine similarities for the three groups of κ_j, showing low, intermediate and high average shifts respectively.
CALaMo: a Constructionist Assessment of Language Models

February 2023 · 42 Reads

This paper presents a novel framework for evaluating Neural Language Models' linguistic abilities using a constructionist approach. Not only is the usage-based model in line with the underlying stochastic philosophy of neural architectures, but it also allows the linguist to keep meaning as a determinant factor in the analysis. We outline the framework and present two possible scenarios for its application.
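
Figure 3 above reports distributions of average cosine similarities. As a rough sketch of that style of measurement (the items, vectors and grouping below are random stand-ins, not the paper's procedure), one can average the cosine similarity between the representations two models assign to the same item and compare groups with low, intermediate and high shifts:

```python
# Illustrative only: average cosine similarity between the embeddings of
# the same items under two models, for three groups of shift magnitude.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

rng = np.random.default_rng(0)
items = [f"item_{i}" for i in range(30)]
model_a = {w: rng.normal(size=100) for w in items}
# model_b perturbs model_a by a per-item noise level to simulate shifts.
noise = {w: rng.choice([0.1, 0.5, 2.0]) for w in items}
model_b = {w: model_a[w] + noise[w] * rng.normal(size=100) for w in items}

for level in [0.1, 0.5, 2.0]:
    group = [w for w in items if noise[w] == level]
    avg = np.mean([cosine(model_a[w], model_b[w]) for w in group])
    print(f"noise={level}: mean cosine similarity = {avg:.2f}")
```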


Algorithmic Diversity and Tiny Models: Comparing Binary Networks and the Fruit Fly Algorithm on Document Representation Tasks

Neural language models have seen a dramatic increase in size in recent years. While many still advocate that 'bigger is better', work in model distillation has shown that the number of parameters used by very large networks is actually more than what is required for state-of-the-art performance. This prompts an obvious question: can we build smaller models from scratch, rather than going through the inefficient process of training at scale and subsequently reducing model size? In this paper, we investigate the behaviour of a biologically inspired algorithm, based on the fruit fly's olfactory system. This algorithm has shown good performance in the past on the task of learning word embeddings. We now put it to the test on the task of semantic hashing. Specifically, we compare the fruit fly to a standard binary network on the task of generating locality-sensitive hashes for text documents, measuring both task performance and energy consumption. Our results indicate that the two algorithms have complementary strengths while showing similar electricity usage.
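
The fruit-fly algorithm mentioned here is commonly formulated (following Dasgupta et al., 2017) as a sparse random projection followed by a winner-take-all step that keeps only the top activations. The sketch below follows that generic formulation with invented parameters; it is not the paper's experimental setup:

```python
# Sketch of a fruit-fly-style locality-sensitive hash: project a document
# vector through a sparse random matrix, keep the top-k activations as 1s.
import numpy as np

def fly_hash(doc_vec, projection, k):
    """Return a binary hash with exactly k active bits."""
    activations = projection @ doc_vec           # expand to many "Kenyon cells"
    hash_bits = np.zeros(len(activations), dtype=np.uint8)
    hash_bits[np.argsort(activations)[-k:]] = 1  # winner-take-all: top-k fire
    return hash_bits

rng = np.random.default_rng(42)
d, m, k = 100, 2000, 32                          # input dim, expansion dim, active bits
# Sparse binary projection: each expansion unit samples a few input dims.
projection = (rng.random((m, d)) < 0.05).astype(float)

doc_a = rng.random(d)
doc_b = doc_a + 0.01 * rng.random(d)             # near-duplicate document
doc_c = rng.random(d)                            # unrelated document
h_a, h_b, h_c = (fly_hash(v, projection, k) for v in (doc_a, doc_b, doc_c))
print("overlap(a,b):", int((h_a & h_b).sum()), "overlap(a,c):", int((h_a & h_c).sum()))
```

Near-duplicate documents share many active bits while unrelated ones share few, which is the locality-sensitive property the paper measures against a standard binary network.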


Figure 1. Mean plausibility ratings for each of the eight conditions in the eye tracking study. Error bars show the standard error of the mean. (From Fischer and Engelhardt 2019)
Occurrence and completion frequencies for 'see' (from Fischer and Engelhardt 2020)
Example stimuli and regions of interest for eye movement analysis (from Fischer and Engelhardt 2019)
How understanding shapes reasoning: Experimental argument analysis with methods from psycholinguistics and computational linguistics

July 2022 · 52 Reads · 3 Citations

Empirical insights into language processing have a philosophical relevance that extends well beyond philosophical questions about language. This chapter will discuss this wider relevance: We will consider how experimental philosophers can examine language processing in order to address questions in several different areas of philosophy. To do so, we will present the emerging research program of experimental argument analysis (EAA) that examines how automatic language processing shapes verbal reasoning – including philosophical arguments. The evidential strand of experimental philosophy uses mainly questionnaire-based methods to assess the evidentiary value of intuitive judgments that are adduced as evidence for philosophical theories and as premises for philosophical arguments. Extending this prominent strand of experimental philosophy, EAA underpins such assessments, extends the scope of the assessments, and expands the range of the empirical methods employed: EAA examines how automatic inferences that are continually made in language comprehension and production shape verbal reasoning, and draws on findings about comprehension biases that affect the contextualisation of such default inferences, in order to explain and expose fallacies. It deploys findings to assess premises and inferences from premises to conclusions, in philosophical arguments. To do so, it adapts methods from psycholinguistics and recruits methods from computational linguistics.


LSTM networks are capable of keeping track of long-term dependencies. As recurrent neural networks (upper layer of the figure), they present a chain-like structure: at each time step t, the network's output is computed based on both the input at time t (xt) and the network's state at time t−1 (ht−1). As opposed to a simple recurrent cell, an LSTM cell (lower layer of the figure) has the ability to regulate how the two kinds of information (input and previous state) are weighted towards the computation of the output. The first gate, the forget gate, evaluates Ct−1 (a representation of the previous state different from ht−1) against xt and learns what information to keep from previous steps, encoding it in a vector ft. Next, a candidate value for the current state, Ĉt, is computed along with the input gate vector it, which weighs how much of the input will contribute to the current state. Finally, the state of the cell Ct is computed by weighting Ct−1 with the forget gate vector ft and the candidate Ĉt with the input gate vector it. ht is then computed from Ct. (A minimal code sketch of one such step follows the abstract below.) A complete and easy-to-read guide to LSTMs can be found at https://colah.github.io/posts/2015-08-Understanding-LSTMs/.
Can Recurrent Neural Networks Validate Usage-Based Theories of Grammar Acquisition?

March 2022 · 70 Reads · 6 Citations

It has been shown that Recurrent Artificial Neural Networks automatically acquire some grammatical knowledge in the course of performing linguistic prediction tasks. The extent to which such networks can actually learn grammar is still an object of investigation. However, being mostly data-driven, they provide a natural testbed for usage-based theories of language acquisition. This mini-review gives an overview of the state of the field, focusing on the influence of the theoretical framework in the interpretation of results.
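
The gating mechanics summarised in the figure caption above map directly onto a few lines of code. Below is a minimal NumPy sketch of a single LSTM step, assuming random placeholder weights; it makes explicit the output gate ot, which the caption leaves implicit in "ht is then computed from Ct":

```python
# Minimal sketch of one LSTM time step in NumPy. Weight shapes and
# initialisation are placeholders, not taken from the paper.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """Compute (h_t, C_t) from the current input and the previous state."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])    # forget gate: what to keep from C_prev
    i_t = sigmoid(W["i"] @ z + b["i"])    # input gate: how much of the candidate counts
    C_hat = np.tanh(W["c"] @ z + b["c"])  # candidate value for the current state
    C_t = f_t * C_prev + i_t * C_hat      # weight old state and candidate
    o_t = sigmoid(W["o"] @ z + b["o"])    # output gate (implicit in the caption)
    h_t = o_t * np.tanh(C_t)              # h_t is computed from C_t
    return h_t, C_t

# Tiny usage example with random weights, unrolled over five steps.
rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
W = {k: rng.normal(size=(d_hid, d_hid + d_in)) for k in "fico"}
b = {k: np.zeros(d_hid) for k in "fico"}
h, C = np.zeros(d_hid), np.zeros(d_hid)
for t in range(5):
    h, C = lstm_step(rng.normal(size=d_in), h, C, W, b)
```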


Mean plausibility ratings per condition and group: psychology undergraduates (top), Other Philosophers (middle), and Philosophers of Perception (bottom). Error bars show the standard error of the mean
Mean plausibility ratings per group in s-inconsistent conditions. Error bars show the standard error of the mean
Philosophers’ linguistic expertise: A psycholinguistic approach to the expertise objection against experimental philosophy

February 2022 · 132 Reads · 6 Citations · Synthese

Philosophers are often credited with particularly well-developed conceptual skills. The ‘expertise objection’ to experimental philosophy builds on this assumption to challenge inferences from findings about laypeople to conclusions about philosophers. We draw on psycholinguistics to develop and assess this objection. We examine whether philosophers are less or differently susceptible than laypersons to cognitive biases that affect how people understand verbal case descriptions and judge the cases described. We examine two possible sources of difference: Philosophers could be better at deploying concepts, and this could make them less susceptible to comprehension biases (‘linguistic expertise objection’). Alternatively, exposure to different patterns of linguistic usage could render philosophers vulnerable to a fundamental comprehension bias, the linguistic salience bias, at different points (‘linguistic usage objection’). Together, these objections mount a novel ‘master argument’ against experimental philosophy. To develop and empirically assess this argument, we employ corpus analysis and distributional semantic analysis and elicit plausibility ratings from academic philosophers and psychology undergraduates. Our findings suggest philosophers are better at deploying concepts than laypeople but are susceptible to the linguistic salience bias to a similar extent and at similar points. We identify methodological consequences for experimental philosophy and for philosophical thought experiments.


Ideal Words: A Vector-Based Formalisation of Semantic Competence

May 2021 · 268 Reads · 4 Citations · KI - Künstliche Intelligenz

In this theoretical paper, we consider the notion of semantic competence and its relation to general language understanding—one of the most sought-after goals of Artificial Intelligence. We come back to three main accounts of competence involving (a) lexical knowledge; (b) truth-theoretic reference; and (c) causal chains in language use. We argue that all three are needed to reach a notion of meaning in artificial agents and suggest that they can be combined in a single formalisation, where competence develops from exposure to observable performance data. We introduce a theoretical framework which translates set theory into vector-space semantics by applying distributional techniques to a corpus of utterances associated with truth values. The resulting meaning space naturally satisfies the requirements of a causal theory of competence, but it can also be regarded as some ‘ideal’ model of the world, allowing for extensions and standard lexical relations to be retrieved.
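
As a very rough intuition pump for the set-theory-to-vector-space translation (the miniature world below is invented, not the paper's corpus-of-utterances construction), one can build a vector for each predicate whose dimensions record how often other predicates co-apply within its extension:

```python
# Toy illustration: dimensions are predicates, and the value for (p, q)
# is the proportion of entities satisfying p that also satisfy q.
world = {
    "e1": {"bird", "sparrow", "flies"},
    "e2": {"bird", "sparrow", "flies"},
    "e3": {"bird", "penguin"},
    "e4": {"cat"},
}
predicates = sorted({p for props in world.values() for p in props})

def ideal_vector(p):
    extension = [e for e, props in world.items() if p in props]
    return {q: sum(q in world[e] for e in extension) / len(extension)
            for q in predicates}

# All sparrows are birds, so the 'bird' component of 'sparrow' is 1.0,
# while the 'sparrow' component of 'bird' is only 2/3.
print(f"{ideal_vector('sparrow')['bird']:.2f}", f"{ideal_vector('bird')['sparrow']:.2f}")
```

The asymmetry between the two printed values is the kind of structure that lets set-theoretic relations such as hyponymy be read off the resulting vectors.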


Figure 1: A visualization of the Doppelgänger test. Each of the 59 novels is split into two parts (Part A and Part B), and then from each one of them, for each character and for the matched common nouns, a word vector is created by using distributional semantics models. Then, by comparing the vectors for part A and part B, we check whether we can correctly match co-referring word vectors.
Novel Aficionados and Doppelgängers: a referential task for semantic representations of individual entities

April 2021 · 28 Reads

In human semantic cognition, proper names (names which refer to individual entities) are harder to learn and retrieve than common nouns. This seems to be the case for machine learning algorithms too, but the linguistic and distributional reasons for this behaviour have not been investigated in depth so far. To tackle this issue, we show that the semantic distinction between proper names and common nouns is reflected in their linguistic distributions by employing an original task for distributional semantics, the Doppelgänger test, an extensive set of models, and a new dataset, the Novel Aficionados dataset. The results indicate that the distributional representations of different individual entities are less clearly distinguishable from each other than those of common nouns, an outcome which intriguingly mirrors human cognition.
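
Figure 1's matching step can be sketched compactly. Everything below is a stand-in: the words are invented and the two sets of vectors are simulated as noisy copies of a shared vector, rather than learned from the two halves of a real novel:

```python
# Toy sketch of the Doppelgänger test's matching step: check whether each
# part-A vector is closest (by cosine) to its own part-B counterpart.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

rng = np.random.default_rng(1)
words = ["Elizabeth", "Darcy", "letter", "garden"]  # invented examples
base = {w: rng.normal(size=50) for w in words}
# Simulate part-A / part-B vectors as noisy variants of a shared vector.
part_a = {w: base[w] + 0.3 * rng.normal(size=50) for w in words}
part_b = {w: base[w] + 0.3 * rng.normal(size=50) for w in words}

correct = 0
for w in words:
    sims = {w2: cosine(part_a[w], part_b[w2]) for w2 in words}
    correct += max(sims, key=sims.get) == w
print(f"matching accuracy: {correct}/{len(words)}")
```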


Citations (35)


... Our use of scenarios takes direct inspiration from Erk and Herbelot (2024). Erk and Herbelot crucially distinguish scenario knowledge from word meaning and propose that scenarios, though not connected to word meaning directly, probabilistically influence the concepts that words convey in sentential contexts. ...

Reference:

It’s time for a complete theory of partial predictability in language
How to Marry a Star: Probabilistic Constraints for Meaning in Context

Journal of Semantics

... Frank et al., 2019). Recurrent neural networks control how and which information is relevant for syntactic categorization as gating mechanisms analyse and compare old and new inputs to decide on which information should be stored and which should be forgotten or rewritten (Pannitto and Herbelot, 2022). In the Parallel Architecture, these mechanisms are located in the linguistic structures: the phonological input is processed in the phonological structure. ...

Can Recurrent Neural Networks Validate Usage-Based Theories of Grammar Acquisition?

... Several previous studies have discussed the relationship between philosophy and language learning. First, the study by Fischer et al. (2022) examines the philosophy of language as one of the branches of philosophy. This research shows that the development of the language sciences and various aspects of linguistics have been influenced by the philosophy of language. ...

Philosophers’ linguistic expertise: A psycholinguistic approach to the expertise objection against experimental philosophy

Synthese

... animal, flies, has_fangs, is_black, is_scary. Humans can not only list typical features, they are also able to estimate their relative frequencies. Herbelot & Vecchi (2016) asked participants whether, say, all, most, or few bats-that-are-animals are black. They also converted these judgments to probabilities. ...

Many speakers, many worlds: Interannotator variations in the quantification of feature norms
  • Citing Article
  • May 2016

Linguistic Issues in Language Technology

... In the article Ideal Words: a Vector-based Formalisation of Semantic Competence [2], Aurelie Herbelot and Ann Copestake bridge theories of semantic competence with semantic performance with a formal distributional account based on corpus data, arguing that the representation of meaning is in principle learnable from performance data and can be leveraged for teaching artificial agents meaning. ...

Ideal Words: A Vector-Based Formalisation of Semantic Competence

KI - Künstliche Intelligenz

... First, the same word pairs may be rated differently in similarity and relatedness datasets (Bruni et al. 2012; Hill, Reichart, and Korhonen 2015). Second, judgments for related word classes (cat-dog) are more reliable than for unrelated words (cat-democracy) (Kabbach and Herbelot 2021). Another downside of this type of evaluation is that similarity scores are assigned to pairs of words in isolation. ...

Avoiding Conflict: When Speaker Coordination Does Not Require Conceptual Agreement

Frontiers in Artificial Intelligence

... Frequently, CDS from the CHILDES database (MacWhinney, 2000) is used to train developmentally plausible LMs (cf. Pannitto and Herbelot, 2020; Huebner et al., 2021). While CHILDES-based models have the advantage of learning from authentic data only, they have the disadvantage of not accessing the full breadth of the linguistic input children receive. ...

Recurrent babbling: evaluating the acquisition of grammar from limited input data

... Several frameworks have been proposed to describe fine-grained lexical meaning differences, using types (Asher, 2011), attribute-value matrices (Zeevat et al., 2017), qualia (Del Pinal, 2018) and, frequently, distributional models automatically computed from corpus data (Asher et al., 2016; Baroni et al., 2014; Emerson, 2020a; Erk, 2016; Grefenstette & Sadrzadeh, 2011; Herbelot, 2020; McNally & Boleda, 2017). ...

Re-solve it: simulating the acquisition of core semantic competences from small data
  • Citing Conference Paper
  • January 2020

... They do not propose a concrete algorithm, but they discuss several challenges, and suggest that grounded data might be necessary. In this vein, Kuzmenko and Herbelot (2019) use the Visual Genome dataset (Krishna et al., 2017) to learn vector representations with logically interpretable dimensions, although these vectors are not as expressive as Copestake and Herbelot's ideal distributions. ...

Distributional Semantics in the Real World: Building Word Vector Representations from a Truth-Theoretic Model
  • Citing Conference Paper
  • January 2019