About
195 Publications
31,966 Reads
5,615 Citations
Introduction
Current institution
Additional affiliations
October 2013 - present
October 2010 - September 2013
Publications (195)
Transformer-based language models have recently been at the forefront of active research in text generation. However, these models' advances come at the price of prohibitive training costs, with parameter counts in the billions and compute requirements measured in petaflop/s-decades. In this paper, we investigate transformer-based architectures for...
Linear-chain conditional random fields (CRFs) are a common model component for sequence labeling tasks when modeling the interactions between different labels is important. However, the Markov assumption limits linear-chain CRFs to only directly modeling interactions between adjacent labels. Weighted finite-state transducers (FSTs) are a related ap...
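To make the Markov limitation concrete: in a linear-chain CRF, the score of a label sequence decomposes into per-position emission scores plus transition scores over adjacent label pairs only. A minimal sketch in plain NumPy with invented score matrices (an illustration of the assumption, not the models compared in the paper):

```python
import numpy as np

def linear_chain_score(emissions, transitions, labels):
    """Score a label sequence under a linear-chain CRF.

    emissions:   (T, L) array, emissions[t, y] = score of label y at position t
    transitions: (L, L) array, transitions[y, y2] = score of y followed by y2
    labels:      length-T list of label indices

    The Markov assumption is visible here: only *adjacent* label pairs
    contribute, so longer-range label interactions cannot be scored.
    """
    score = emissions[0, labels[0]]
    for t in range(1, len(labels)):
        score += transitions[labels[t - 1], labels[t]] + emissions[t, labels[t]]
    return score

# toy example: 4 tokens, 3 possible labels (scores are random placeholders)
rng = np.random.default_rng(0)
emissions = rng.normal(size=(4, 3))
transitions = rng.normal(size=(3, 3))
print(linear_chain_score(emissions, transitions, [0, 2, 2, 1]))
```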
In theatre, playwrights use the portrayal of characters to explore culturally based gender norms. In this paper, we develop quantitative methods to study gender depiction in the non-religious works (comedias) of Pedro Calderón de la Barca, a prolific Spanish 17th century author. We gather insights from a corpus of more than 100 plays by using a g...
Due to the widespread use of large language models (LLMs), we need to understand whether they embed a specific “worldview” and what these views reflect. Recent studies report that, prompted with political questionnaires, LLMs show left-liberal leanings (Feng et al., 2023; Motoki et al., 2024). However, it is as yet unclear whether these leanings ar...
Political discourse on Twitter is a moving target: politicians continuously make statements about their positions. It is therefore crucial to track their discourse on social media to understand their ideological positions and goals. However, Twitter data is also challenging to work with since it is ambiguous and often dependent on social context, a...
Frame semantics (Fillmore 1976) is a framework for the analysis of natural language semantics in terms of conceptual structures which can provide a semantic basis to constructions. Its cross-lingual applicability is subject to debate. Our study leverages recent developments in computational linguistics to assess the cross-lingual applicability of f...
Dual encoder architectures like CLIP models map two types of inputs into a shared embedding space and learn similarities between them. However, it is not understood how such models compare two inputs. Here, we address this research gap with two contributions. First, we derive a method to attribute predictions of any differentiable dual encoder onto...
The opinions of political actors (e.g., politicians, parties, organizations) expressed through claims are the core elements of political debates and decision-making. Political actors communicate through different channels: parties publish manifestos for major elections, while individual actors make statements on a day-to-day basis as reflected in th...
An important aspect of responsible recommendation systems is the transparency of the prediction mechanisms. This is a general challenge for deep-learning-based systems such as the currently predominant neural news recommender architectures which are optimized to predict clicks by matching candidate news items against users’ reading histories. Such...
Despite the success of Siamese encoder models such as sentence transformers (ST), little is known about the aspects of inputs they pay attention to. A barrier is that their predictions cannot be attributed to individual features, as they compare two inputs rather than processing a single one. This paper derives a local attribution method for Siame...
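For intuition only, here is a generic input-times-gradient attribution on a toy linear Siamese encoder with invented weights; it is not the attribution method derived in the paper, but it shows why attributions for paired inputs necessarily depend on both inputs:

```python
import numpy as np

# Toy "Siamese" setting: both inputs are encoded by the same linear map W,
# and the model output is the dot product of the two encodings.
rng = np.random.default_rng(1)
W = rng.normal(size=(4, 6))      # shared encoder: 6-dim input -> 4-dim embedding
a = rng.normal(size=6)           # feature vector of input A
b = rng.normal(size=6)           # feature vector of input B

similarity = (W @ a) @ (W @ b)   # model prediction for the pair (A, B)

# d(similarity)/da = W^T (W b): the attribution for a feature of A depends
# on input B as well, which is what makes attribution for paired inputs
# harder than for single-input classifiers.
grad_a = W.T @ (W @ b)
attribution_a = a * grad_a       # input-times-gradient attribution for A
print(similarity, attribution_a.sum())  # for this bilinear toy model the attributions sum to the prediction
```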
Scaling analysis is a technique in computational political science that assigns a political actor (e.g. politician or party) a score on a predefined scale based on a (typically long) body of text (e.g. a parliamentary speech or an election manifesto). For example, political scientists have often used the left-right scale to systematically analyse p...
The question of what kinds of linguistic information are encoded in different layers of Transformer-based language models is of considerable interest for the NLP community. Existing work, however, has overwhelmingly focused on word-level representations and encoder-only language models with the masked-token training objective. In this paper, we pre...
Emotions, which are responses to salient events, can be realized in text implicitly, for instance with mere references to facts (e.g., “That was the beginning of a long war”). Interpreting affective meanings thus relies on the readers’ background knowledge, but that is hardly modeled in computational emotion analysis. Much work in the field is focu...
This paper begins with the premise that adverbs are neglected in computational linguistics. This view derives from two analyses: a literature review and a novel adverb dataset to probe a state-of-the-art language model, thereby uncovering systematic gaps in accounts for adverb meaning. We suggest that using Frame Semantics for characterizing word m...
Automatic extraction of party (dis)similarities from texts such as party election manifestos or parliamentary speeches plays an increasing role in computational political science. However, existing approaches are fundamentally limited to targeting only global party (dis)similarity: they condense the relationship between a pair of parties into a si...
Variants of the BERT architecture specialised for producing full-sentence representations often achieve better performance on downstream tasks than sentence embeddings extracted from vanilla BERT. However, there is still little understanding of what properties of inputs determine the properties of such representations. In this study, we construct s...
Newspaper reports provide a rich source of information on the unfolding of public debates, which can serve as basis for inquiry in political science. Such debates are often triggered by critical events, which attract public attention and incite the reactions of political actors: crisis sparks the debate. However, due to the challenges of reliable a...
In this study, we aim at distinguishing comedies and tragedies among 112 dramas written by Calderón de la Barca, using procedures established by distributional semantics. 15 each of these comedias nuevas have already been classified by qualitative researchers as tragedies or comedies, respectively; for another 82 dramas the classification wa...
Even though fine-tuned neural language models have been pivotal in enabling "deep" automatic text analysis, optimizing text representations for specific applications remains a crucial bottleneck. In this study, we look at this problem in the context of a task from computational social science, namely modeling pairwise similarities between political...
The aim of this study is to classify 112 dramas written by Calderón de la Barca, comedies and tragedies, using computational procedures based on distributional semantics. Fifteen of these comedias nuevas have already been classified qualitatively by specialist researchers as tragedies or comedies; for another 82 dramas no...
Positional analyses in political science target the identification of (dis)similarities between parties based on their stance on a given policy or political demand (claim), which are, unsurprisingly, a well-explored source of information for text-as-data approaches to this task. Political actors, however, do not only make claims regarding a given po...
In recent years, there has been an increasing awareness that many NLP systems incorporate biases of various types (e.g., regarding gender or race) which can have significant negative consequences. At the same time, the techniques used to statistically analyze such biases are still relatively simple. Typically, studies test for the presence of a sig...
A number of models for neural content-based news recommendation have been proposed. However, there is limited understanding of the relative importances of the three main components of such systems (news encoder, user encoder, and scoring function) and the trade-offs involved. In this paper, we assess the hypothesis that the most widely used means o...
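As a rough illustration of the three components, the sketch below scores invented candidate-news embeddings against a user embedding obtained by averaging reading-history embeddings with a plain dot product; real systems replace each piece with a learned neural module:

```python
import numpy as np

# Minimal content-based recommendation scoring with invented embeddings.
rng = np.random.default_rng(0)
dim = 8
history = rng.normal(size=(5, dim))      # embeddings of 5 previously read articles (news encoder output)
candidates = rng.normal(size=(3, dim))   # embeddings of 3 candidate articles

user_embedding = history.mean(axis=0)    # simplest possible user encoder: mean over history
scores = candidates @ user_embedding     # dot-product scoring function
ranking = np.argsort(-scores)            # rank candidates by predicted relevance
print(scores.round(3), ranking)
```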
A recent research direction in computational linguistics involves efforts to make the field, which used to focus primarily on English, more multilingual and inclusive. However, resource creation often remains a bottleneck for many languages, in particular at the semantic level. In this article, we consider the case of frame-semantic annotation. We...
The ‘short answer’ question format is a widely used tool in educational assessment, in which students write one to three sentences in response to an open question. The answers are subsequently rated by expert graders. The agreement between these graders is crucial for reliable analysis, both in terms of educational strategies and in terms of develo...
The capabilities and limitations of BERT and similar models are still unclear when it comes to learning syntactic abstractions, in particular across languages. In this paper, we use the task of subordinate-clause detection within and across languages to probe these properties. We show that this task is deceptively simple, with easy gains offset by...
In this paper, we are concerned with the phenomenon of function word polysemy. We adopt the framework of distributional semantics, which characterizes word meaning by observing occurrence contexts in large corpora and which is in principle well situated to model polysemy. Nevertheless, function words were traditionally considered as impossible to a...
Source code summarization is the task of generating a high-level natural language description for a segment of programming language code. Current neural models for the task differ in their architecture and the aspects of code they consider. In this paper, we show that three SOTA models for code summarization work well on largely disjoint subsets of...
Newspaper reports provide a rich source of information on the unfolding of public debate on specific policy fields that can serve as basis for inquiry in political science. Such debates are often triggered by critical events, which attract public attention and incite the reactions of political actors: crisis sparks the debate. However, due to the c...
We present the results of a large-scale corpus-based comparison of two German event nominalization patterns: deverbal nouns in -ung (e.g., die Evaluierung, ‘the evaluation’) and nominal infinitives (e.g., das Evaluieren, ‘the evaluating’). Among the many available event nominalization patterns for German, we selected these two because they are both...
The recognition of hate speech and offensive language (HOF) is commonly formulated as a classification task to decide if a text contains HOF. We investigate whether HOF detection can profit by taking into account the relationships between HOF and similar concepts: (a) HOF is related to sentiment analysis because hate speech is typically a negative...
Cognitive scientists have long used distributional semantic representations of categories. The predominant approach uses distributional representations of category-denoting nouns, such as “city” for the category city. We propose a novel scheme that represents categories as prototypes over representations of names of its members, such as “Barcelona,...
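A minimal sketch of the contrast, with invented toy vectors standing in for corpus-derived embeddings: the noun-based representation uses the vector of "city", while the prototype is the mean over member-name vectors:

```python
import numpy as np

# Toy word vectors (made up for illustration; in practice these would come
# from a distributional model such as word2vec or GloVe).
vectors = {
    "city":      np.array([0.9, 0.1, 0.0]),
    "Barcelona": np.array([0.8, 0.3, 0.1]),
    "Stuttgart": np.array([0.7, 0.2, 0.2]),
    "Kyiv":      np.array([0.8, 0.2, 0.0]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Noun-based representation: the vector of the category-denoting noun.
noun_rep = vectors["city"]

# Prototype-based representation: the average over member-name vectors.
prototype_rep = np.mean([vectors[w] for w in ("Barcelona", "Stuttgart", "Kyiv")], axis=0)

query = np.array([0.75, 0.25, 0.1])   # vector of a candidate category member
print(cosine(query, noun_rep), cosine(query, prototype_rep))
```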
In structured prediction, a major challenge for models is to represent the interdependencies within their output structures. For the common case where outputs are structured as a sequence, linear-chain conditional random fields (CRFs) are a widely used model class which can learn local dependencies in output sequences. However, the CRF's Markov ass...
When humans judge the affective content of texts, they also implicitly assess the correctness of such judgment, that is, their confidence. We hypothesize that people's (in)confidence that they performed well in an annotation task leads to (dis)agreements among each other. If this is true, confidence may serve as a diagnostic tool for systematic dif...
Span identification (in short, span ID) tasks such as chunking, NER, or code-switching detection, ask models to identify and classify relevant spans in a text. Despite being a staple of NLP, and sharing a common structure, there is little insight on how these tasks' properties influence their difficulty, and thus little guidance on what model famil...
The Center for Reflected Text Analytics (CRETA) develops interdisciplinary mixed methods for text analytics in the research fields of the digital humanities. This volume is a collection of text analyses from specialty fields including literary studies, linguistics, the social sciences, and philosophy. It thus offers an overview of the methodology o...
Most approaches to emotion analysis in fictional texts focus on detecting the emotion class expressed over the course of a text, either with machine learning-based classification or with dictionaries. These approaches do not consider who experiences the emotion and what triggers it and therefore, as a necessary simplification, aggregate across dif...
Discourse network analysis is an aspiring development in political science which analyzes political debates in terms of bipartite actor/claim networks. It aims at understanding the structure and temporal dynamics of major political debates as instances of politicized democratic decision making. We discuss how such networks can be constructed on the...
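One simple way to picture such a network is as a bipartite graph over actors and claims; the sketch below uses networkx with invented actors, claims, and stances, not data from the paper:

```python
import networkx as nx
from networkx.algorithms import bipartite

# A discourse network as a bipartite actor/claim graph.
G = nx.Graph()
actors = ["Party A", "Party B", "Minister C"]
claims = ["cap on immigration", "faster asylum procedures"]

G.add_nodes_from(actors, kind="actor")
G.add_nodes_from(claims, kind="claim")

# Edges record which actor takes which stance on which claim.
G.add_edge("Party A", "cap on immigration", stance="support")
G.add_edge("Party B", "cap on immigration", stance="oppose")
G.add_edge("Minister C", "faster asylum procedures", stance="support")

# Project onto the actor side: two actors become connected if they address
# the same claim (edge weight = number of shared claims).
actor_projection = bipartite.weighted_projected_graph(G, actors)
print(list(actor_projection.edges(data=True)))
```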
This article investigates the integration of machine learning in the political claim annotation workflow with the goal to partially automate the annotation and analysis of large text corpora. It introduces the MARDY annotation environment and presents results from an experiment in which the annotation quality of annotators with and without machine...
DEbateNet-migr15 is a manually annotated dataset for German which covers the public debate on immigration in 2015. The building block of our annotation is the political science notion of a claim, i.e., a statement made by a political actor (a politician, a party, or a group of citizens) that a specific action should be taken (e.g., vacant flats sho...
In response to the continuing research interest in computational semantic analysis, we have proposed a new task for SemEval-2010: multi-way classification of mutually exclusive semantic relations between pairs of nominals. The task is designed to compare different approaches to the problem and to provide a standard testbed for future research. In t...
This paper describes the MARDY corpus annotation environment developed for a collaboration between political science and computational linguistics. The tool realizes the complete workflow necessary for annotating a large newspaper text collection with rich information about claims (demands) raised by politicians and other actors, including claim an...
Understanding the structures of political debates (which actors make what claims) is essential for understanding democratic political decision-making. The vision of computational construction of such discourse networks from newspaper reports brings together political science and natural language processing. This paper presents three contributions t...
The empirical study of emotions in Spanish travelogues and reports requires cultural knowledge as well as the use of linguistic annotation and quantitative methods. We report on an interdisciplinary project in which we perform emotion annotation on a selection of texts spanning several centuries to analyze the differences across different time slic...
Sentiment analysis has a range of corpora available across multiple languages. For emotion analysis, the situation is more limited, which hinders potential research on cross-lingual modeling and the development of predictive models for other languages. In this paper, we fill this gap for German by constructing deISEAR, a corpus designed in analogy...
Categorization is a central capability of human cognition, and a number of theories have been developed to account for properties of categorization. Despite the fact that many semantic tasks involve categorization, theories of categorization do not play a major role in contemporary research in computational linguistics. This paper follows the idea...
This chapter gives an overview of work on the representation of semantic information in lexicon resources for computational natural language processing (NLP). It starts with a broad overview of the history and state of the art of different types of semantic lexicons in Computational Linguistics, and discusses their main use cases. Section 2 is devo...
One of the central problems in the semantics of derived words is polysemy (see, for example, the recent contributions by Lieber 2016 and Plag et al. 2018). In this paper, we tackle the problem of disambiguating newly derived words in context by applying Distributional Semantics (Firth 1957) to deverbal -ment nominalizations (e.g. bedragglement,...
One of the most basic functions of language is to refer to objects in a shared scene. Modeling reference with continuous representations is challenging because it requires individuation, i.e., tracking and distinguishing an arbitrary number of referents. We introduce a neural network model that, given a definite description and a set of objects rep...
Characterizing paraphrases formally has proven to be a challenging task. Hasegawa et al. (2011) pointed out the usefulness of FrameNet for paraphrase research, focusing on paraphrases which are backed by underlying classical linguistic relationships such as synonymy or voice alternations. This article proposes that other frame-to-frame relations, n...
In computational linguistics, a large body of work exists on distributed modeling of lexical relations, focussing largely on lexical relations such as hypernymy (scientist – person) that hold between two categories, as expressed by common nouns. In contrast, computational linguistics has paid little attention to entities denoted by proper nouns (M...
Word sense induction is the most prominent unsupervised approach to lexical disambiguation. It clusters word instances, typically represented by their bag-of-words contexts. Therefore, uninformative and ambiguous contexts present a major challenge. In this paper, we investigate the use of an alternative instance representation based on lexical subs...
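For orientation, a bare-bones word sense induction pipeline over bag-of-words contexts (the baseline the paper contrasts with substitute-based representations), using made-up example sentences:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

# Represent each instance of an ambiguous word by its bag-of-words context
# and cluster the instances; clusters are interpreted as induced senses.
instances = [
    "deposit money at the bank before noon",
    "the bank raised its interest rates",
    "we walked along the bank of the river",
    "fishing from the grassy bank of the stream",
]

X = CountVectorizer(stop_words="english").fit_transform(instances)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)   # instances with the same label are taken to share a sense
```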
Complement coercion (begin a book → reading) involves a type clash between an event-selecting verb and an entity-denoting object, triggering a covert event (reading). Two main factors involved in complement coercion have been investigated: the semantic type of the object (event vs. entity), and the typicality of the covert event (the author began a...
We report on a study applying compositional distributional semantic models (CDSMs) to a set of Ukrainian derivational patterns. Ukrainian is an interesting language as it is morphologically rich, and low-resource. Our study aims at resolving inconsistent results from previous studies which employed CDSMs for derivation; we provide evidence for a cr...
This paper analyses the development of emotions in different genres of literature. We discover different emotion patterns in adventures, mystery, science fiction, romance and humorous fiction stories. We support our findings with quantitative and qualitative analyses.
Reference is the crucial property of language that allows us to connect linguistic expressions to the world. Modeling it requires handling both continuous and discrete aspects of meaning. Data-driven models excel at the former, but struggle with the latter, and the reverse is true for symbolic models. We propose a fully data-driven, end-to-end trai...
There is a rich variety of data sets for sentiment analysis (viz., polarity and subjectivity classification). For the more challenging task of detecting discrete emotions following the definitions of Ekman and Plutchik, however, there are much fewer data sets, and notably no resources for the social media domain. This paper contributes to closing...
Compositional distributional semantic models (CDSMs) have successfully been applied to the task of predicting the meaning of a range of linguistic constructions. Their performance on semi-compositional word formation process of (morphological) derivation, however, has been extremely variable, with no large-scale empirical investigation to date. Thi...
Native Language Identification (NLI) is the task of recognizing the native language of an author from text that they wrote in another language. In this paper, we investigate the generalizability of NLI models among learner corpora, and from learner corpora to a new text type, namely scientific articles. Our main results are: (a) the science corpus...
Conversion is a word formation operation that changes the grammatical category of a word in the absence of overt morphology. Conversion is extremely productive in English (e.g., tunnel, talk). This paper investigates whether distributional information can be used to predict the diachronic direction of conversion for homophonous noun‐verb pairs. We...
Syntax-based semantic spaces are more flexible and can potentially better model semantic relatedness than bag-of-words spaces. Their application is however limited by sparsity and restricted coverage. We address these problems by smoothing syntax-based with word-based spaces and investigate when to choose which prediction. We obtain the best result...
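A generic way to combine the two kinds of spaces is to interpolate their similarity scores and to back off to the word-based space when coverage fails; the sketch below, with invented toy vectors, illustrates that scheme rather than the exact model from the paper:

```python
import numpy as np

def cosine(u, v):
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    return 0.0 if nu == 0 or nv == 0 else float(u @ v / (nu * nv))

def smoothed_similarity(w1, w2, syntax_space, word_space, alpha=0.7):
    """Interpolate a sparse, higher-precision syntax-based similarity with a
    dense, high-coverage word-based similarity; back off entirely to the
    word-based space when a word is missing from the syntax-based one."""
    word_sim = cosine(word_space[w1], word_space[w2])
    if w1 not in syntax_space or w2 not in syntax_space:
        return word_sim
    syn_sim = cosine(syntax_space[w1], syntax_space[w2])
    return alpha * syn_sim + (1 - alpha) * word_sim

# toy spaces (vectors are invented for illustration)
syntax_space = {"coffee": np.array([1.0, 0.0, 2.0]), "tea": np.array([0.8, 0.1, 1.9])}
word_space = {"coffee": np.array([0.6, 0.4]), "tea": np.array([0.5, 0.5]),
              "espresso": np.array([0.7, 0.3])}
print(smoothed_similarity("coffee", "tea", syntax_space, word_space))       # interpolated
print(smoothed_similarity("coffee", "espresso", syntax_space, word_space))  # backs off
```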
Distributional methods have proven to excel at capturing fuzzy, graded aspects of meaning (Italy is more similar to Spain than to Germany). In contrast, it is difficult to extract the values of more specific attributes of word referents from distributional representations, attributes of the kind typically found in structured knowledge bases (Italy...
The Practical Lexical Function model (PLF) is a recently proposed compositional distributional semantic model which provides an elegant account of composition, striking a balance between expressiveness and robustness and performing at the state-of-the-art. In this paper, we identify an inconsistency in PLF between the objective function at trainin...
We report on initial work on bridging the concept-to-reference gap using distributional semantics. Specifically, we aim at predicting properties of countries, using distributional vectors to infer database information. Our results are highly encouraging, since we achieve an error reduction of 30% over the baseline and are not far from the upper bou...
Predicting the (distributional) meaning of derivationally related words (read / read+er) from one another has recently been recognized as an instance of distributional compositional meaning construction. However, the properties of this task are not yet well understood. In this paper, we present an analysis of two such composition models on a set of...
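One common instantiation of such a composition model learns a linear map from base-word vectors to derived-word vectors; the sketch below fits such a map by least squares on synthetic vectors (real experiments would use corpus-derived embeddings and many more training pairs):

```python
import numpy as np

# Learn a linear map that sends the vector of a base word (read) to the
# vector of its derived form (reader), then apply it to an unseen base.
rng = np.random.default_rng(0)
dim, n_pairs = 5, 50
bases = rng.normal(size=(n_pairs, dim))                                # base-word vectors
true_map = rng.normal(size=(dim, dim))
derived = bases @ true_map + 0.05 * rng.normal(size=(n_pairs, dim))    # derived-word vectors (noisy)

# Fit the map with least squares on observed (base, derived) pairs ...
learned_map, *_ = np.linalg.lstsq(bases, derived, rcond=None)

# ... and predict the vector of an unseen derived word from its base.
new_base = rng.normal(size=dim)
predicted_derived = new_base @ learned_map
print(predicted_derived.round(2))
```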
We aim to model the results from a self-paced reading experiment, which tested the effect of semantic type clash and typicality on the processing of German complement coercion. We present two distributional semantic models to test if they can model the effect of both type and typicality in the psycholinguistic study. We show that one of the models,...
Studies across multiple languages show that overt morphological priming leads to a speed-up only for transparent derivations but not for opaque derivations. However, in a recent experiment for German, Smolka et al. (2014) show comparable speed-ups for transparent and opaque derivations, and conclude that German behaves unlike other Indo-European la...