Katrin Erk

  • Dr.-Ing., Saarland University
  • Professor (Associate) at University of Texas at Austin

About

Publications: 123
Reads: 13,429
Citations: 3,578
Current institution
University of Texas at Austin
Current position
  • Professor (Associate)
Additional affiliations
September 2012 - present
University of Texas at Austin
Position
  • Professor (Associate)

Publications

Publications (123)
Conference Paper
Full-text available
We combine logical and distributional representations of natural language meaning by transforming distributional similarity judgments into weighted inference rules using Markov Logic Networks (MLNs). We show that this framework supports both judging sentence similarity and recognizing textual entailment by appropriately adapting the MLN impleme...
Conference Paper
Full-text available
We address the task of computing vector space representations for the meaning of word occurrences, which can vary widely according to context. This task is a crucial step towards a robust, vector-based compositional account of sentence meaning. We argue that existing models for this task do not take syntactic structure sufficiently into account...
Conference Paper
Full-text available
The vast majority of work on word senses has relied on predefined sense inventories and an annotation schema where each word instance is tagged with the best fitting sense. This paper examines the case for a graded notion of word meaning in two experiments, one which uses WordNet senses in a graded fashion, contrasted with the "winner takes all...
Preprint
Full-text available
Interpreting and assessing goal driven actions is vital to understanding and reasoning over complex events. It is important to be able to acquire the knowledge needed for this understanding, though doing so is challenging. We argue that such knowledge can be elicited through a participant achievement lens. We analyze a complex event in a narrative...
Preprint
Full-text available
We study semantic construal in grammatical constructions using large language models. First, we project contextual word embeddings into three interpretable semantic spaces, each defined by a different set of psycholinguistic feature norms. We validate these interpretable spaces and then use them to automatically derive semantic characterizations of...
Preprint
Full-text available
Developing methods to adversarially challenge NLP systems is a promising avenue for improving both model performance and interpretability. Here, we describe the approach of the team "longhorns" on Task 1 of The First Workshop on Dynamic Adversarial Data Collection (DADC), which asked teams to manually fool a model on an Extractive Question Answ...
Article
Property inference involves predicting properties for a word from its distributional representation. We focus on human-generated resources that link words to their properties and on the task of predicting these properties for unseen words. We introduce the use of label propagation, a semi-supervised machine learning approach, for this task and, in...
Preprint
Full-text available
Discourse signals are often implicit, leaving it up to the interpreter to draw the required inferences. At the same time, discourse is embedded in a social context, meaning that interpreters apply their own assumptions and beliefs when resolving these inferences, leading to multiple, valid interpretations. However, current discourse data and framew...
Preprint
Full-text available
Humans use language to accomplish a wide variety of tasks - asking for and giving advice being one of them. In online advice forums, advice is mixed in with non-advice, like emotional support, and is sometimes stated explicitly, sometimes implicitly. Understanding the language of advice would equip systems with a better grasp of language pragmatics...
Preprint
We propose a method for controlled narrative/story generation where we are able to guide the model to produce coherent narratives with user-specified target endings by interpolation: for example, we are told that Jim went hiking and at the end Jim needed to be rescued, and we want the model to incrementally generate steps along the way. The core of...
Article
Recent progress in NLP witnessed the development of large-scale pre-trained language models (GPT, BERT, XLNet, etc.) based on Transformer (Vaswani et al. 2017), and in a range of end tasks, such models have achieved state-of-the-art results, approaching human performance. This clearly demonstrates the power of the stacked self-attention architectur...
Preprint
Recent progress in NLP witnessed the development of large-scale pre-trained language models (GPT, BERT, XLNet, etc.) based on Transformer (Vaswani et al. 2017), and in a range of end tasks, such models have achieved state-of-the-art results, approaching human performance. This demonstrates the power of the stacked self-attention architecture when p...
Preprint
The news coverage of events often contains not one but multiple incompatible accounts of what happened. We develop a query-based system that extracts compatible sets of events (scenarios) from such data, formulated as one-class clustering. Our system incrementally evaluates each event's compatibility with already selected events, taking order into...
Article
Implicit arguments, which cannot be detected solely through syntactic cues, make it harder to extract predicate-argument tuples. We present a new model for implicit argument prediction that draws on reading comprehension, casting the predicate-argument tuple with the missing argument as a query. We also draw on pointer networks and multi-hop comput...
Preprint
Full-text available
Discourse structure is integral to understanding a text and is helpful in many NLP tasks. Learning latent representations of discourse is an attractive alternative to acquiring expensive labeled discourse data. Liu and Lapata (2018) propose a structured attention mechanism for text classification that derives a tree over a text, akin to an RST disc...
Preprint
The first step in discourse analysis involves dividing a text into segments. We annotate the first high-quality small-scale medical corpus in English with discourse segments and analyze how well news-trained segmenters perform on this domain. While we expectedly find a drop in performance, the nature of the segmentation errors suggests some problem...
Preprint
Implicit arguments, which cannot be detected solely through syntactic cues, make it harder to extract predicate-argument tuples. We present a new model for implicit argument prediction that draws on reading comprehension, casting the predicate-argument tuple with the missing argument as a query. We also draw on pointer networks and multi-hop comput...
Preprint
During natural disasters and conflicts, information about what happened is often confusing, messy, and distributed across many sources. We would like to be able to automatically identify relevant information and assemble it into coherent narratives of what happened. To make this task accessible to neural models, we introduce Story Salads, mixtures...
Conference Paper
Full-text available
Distributional data tells us that a man can swallow candy, but not that a man can swallow a paintball, since this is never attested. However both are physically plausible events. This paper introduces the task of semantic plausibility: recognizing plausible but possibly novel events. We present a new crowdsourced dataset of semantic plausibility ju...
Conference Paper
Full-text available
We test whether distributional models can do one-shot learning of definitional properties from text only. Using Bayesian models, we find that first learning overarching structure in the known data, regularities in textual contexts and in properties, helps one-shot learning, and that individual context items can be highly informative. Our experimen...
Article
Full-text available
NLP tasks differ in the semantic information they require, and at this time no single semantic representation fulfills all requirements. Logic-based representations characterize sentence structure, but do not capture the graded aspect of meaning. Distributional models give graded similarity ratings for words and phrases, but do not capture sentence...
Article
We consider the task of predicting lexical entailment using distributional vectors. We perform a novel qualitative analysis of one existing model which was previously shown to only measure the prototypicality of word pairs. We find that the model strongly learns to identify hypernyms using Hearst patterns, which are well known to be predictive of l...
Article
Distributional models describe the meaning of a word in terms of its observed contexts. They have been very successful in computational linguistics. They have also been suggested as a model for how humans acquire (partial) knowledge about word meanings. But that raises the question of what, exactly, distributional models can learn, and the question...
Article
Full-text available
Word sense disambiguation and the related field of automated word sense induction traditionally assume that the occurrences of a lemma can be partitioned into senses. But this seems to be a much easier task for some lemmas than others. Our work builds on recent work that proposes describing word meaning in a graded fashion rather than through a str...
Article
NLP tasks differ in the semantic information they require, and at this time no single semantic representation fulfills all requirements. Logic-based representations characterize sentence structure, but do not capture the graded aspect of meaning. Distributional models give graded similarity ratings for words and phrases, but do not adequately captu...
Conference Paper
Full-text available
As a format for describing the meaning of natural language sentences, probabilistic logic combines the expressivity of first-order logic with the ability to handle graded information in a principled fashion. But practical probabilistic logic frameworks usually assume a finite domain in which each entity corresponds to a constant in the logic (domai...
Conference Paper
Full-text available
We test the Distributional Inclusion Hypothesis, which states that hypernyms tend to occur in a superset of contexts in which their hyponyms are found. We find that this hypothesis only holds when it is applied to relevant dimensions. We propose a robust supervised approach that achieves accuracies of .84 and .85 on two existing datasets and that c...
Conference Paper
Full-text available
We present the first large-scale English "all-words lexical substitution" corpus. The size of the corpus provides a rich resource for investigations into word meaning. We investigate the nature of lexical substitute sets, comparing them to WordNet synsets. We find them to be consistent with, but more fine-grained than, synsets. We also identify sig...
Conference Paper
Full-text available
Probabilistic Soft Logic (PSL) is a recently developed framework for probabilistic logic. We use PSL to combine logical and distributional representations of natural-language meaning, where distributional information is represented in the form of weighted inference rules. We apply this framework to the task of Semantic Textual Similarity (STS)...
Chapter
First-order logic provides a powerful and flexible mechanism for representing natural language semantics. However, it is an open question of how best to integrate it with uncertain, weighted knowledge, for example regarding word meaning. This paper describes a mapping between predicates of logical form and points in a vector space. This mapping is...
Conference Paper
Full-text available
We represent natural language semantics by combining logical and distributional information in probabilistic logic. We use Markov Logic Networks (MLN) for the RTE task, and Probabilistic Soft Logic (PSL) for the STS task. The system is evaluated on the SICK dataset. Our best system achieves 73% accuracy on the RTE task, and a Pearson’s correlation...
Article
Word Sense Induction (WSI) is the task of identifying the different uses (senses) of a target word in a given text in an unsupervised manner, i.e. without relying on any external resources such as dictionaries or sense-tagged data. This paper presents ...
Article
Full-text available
Word sense disambiguation (WSD) is an old and important task in computational linguistics that still remains challenging, to machines as well as to human annotators. Recently there have been several proposals for representing word meaning in context that diverge from the traditional use of a single best sense for each occurrence. They represent wor...
Article
Graded models of word meaning in context characterize the meaning of individual usages (occurrences) without reference to dictionary senses. We introduce a novel approach that frames the task of computing word meaning in context as a probabilistic inference problem. The model represents the meaning of a word as a probability distribution over poten...
Conference Paper
Full-text available
Distributional representations have recently been proposed as a general-purpose representation of natural language meaning, to replace logical form. There is, however, one important difference between logical and distributional representations: Logical languages have a clear semantics, while distributional representations do not. In this paper, we...
Article
Distributional models represent a word through the contexts in which it has been observed. They can be used to predict similarity in meaning, based on the distributional hypothesis, which states that two words that occur in similar contexts tend to have similar meanings. Distributional approaches are often implemented in vector space models. They r...
Article
Full-text available
First-order logic provides a powerful and flexible mechanism for representing natural language semantics. However, it is an open question of how best to integrate it with uncertain, probabilistic knowledge, for example regarding word meaning. This paper describes the first steps of an approach to recasting first-order semantics into the probabilist...
Conference Paper
Full-text available
We consider a new subproblem of unsupervised parsing from raw text, unsupervised partial parsing---the unsupervised version of text chunking. We show that addressing this task directly, using probabilistic finite-state methods, produces better results than relying on the local predictions of a current best unsupervised parser, Seginer's (2007) CCL...
Article
Full-text available
We present a vector space–based model for selectional preferences that predicts plausibility scores for argument headwords. It does not require any lexical resources (such as WordNet). It can be trained either on one corpus with syntactic annotation, or on a combination of a small semantically annotated primary corpus and a large, syntactically ana...
Conference Paper
Full-text available
We present an approach to unsupervised partial parsing: the identification of low-level constituents (which we dub clumps) in unannotated text. We begin by showing that CCLParser (Seginer 2007), an unsupervised parsing model, is particularly adept at identifying clumps, and that, surprisingly, building a simple right-branching structure above its c...
Conference Paper
Full-text available
In this paper, we argue in favor of reconsidering models for word meaning, using as a basis results from cognitive science on human concept representation. More specifically, we argue for a more flexible representation of word meaning than the assignment of a single best-fitting dictionary sense to each occurrence: Either use dictionary senses,...
Article
Full-text available
With the urgent need to document the world’s dying languages, it is important to explore ways to speed up language documentation efforts. One promising avenue is to use techniques from computational linguistics to automate some of the process. Here we consider unsupervised morphological segmentation and active learning for creating interlinear glos...
Conference Paper
Full-text available
This paper describes ongoing work on distributional models for word meaning in context. We abandon the usual one-vector-per-word paradigm in favor of an exemplar model that activates only relevant occurrences. On a paraphrasing task, we find that a simple exemplar model outperforms more complex state-of-the-art models.
Conference Paper
Full-text available
We define the crouching Dirichlet, hidden Markov model (CDHMM), an HMM for part-of-speech tagging which draws state prior distributions for each local document context. This simple modification of the HMM takes advantage of the dichotomy in natural language between content and function words. In contrast, a standard HMM draws all prior distr...
Conference Paper
Full-text available
We describe an approach for connecting language and geography that anchors natural language expressions to specific regions of the Earth, implemented in our TextGrounder system. The core of the system is a region-topic model, which we use to learn word distributions for each region discussed in a given corpus. This model performs toponym resoluti...
Conference Paper
Full-text available
Vector space models of word meaning typically represent the meaning of a word as a vector computed by summing over all its corpus occurrences. Words close to this point in space can be assumed to be similar to it in meaning. But how far around this point does the region of similar meaning extend? In this paper we discuss two models that represent w...
Article
Full-text available
The appropriateness of paraphrases for words depends often on context: "grab" can replace "catch" in "catch a ball", but not in "catch a cold". Structured Vector Space (SVS) (Erk and Padó, 2008) is a model that computes word meaning in context in order to assess the appropriateness of such paraphrases. This paper investigates "best-practice" parame...
Article
Full-text available
Semantic space models represent the meaning of a word as a vector in high-dimensional space. They offer a framework in which the meaning representation of a word can be computed from its context, but the question remains how they support inferences. While there has been some work on paraphrase-based inferences in semantic space, it is not clear ho...
Conference Paper
Full-text available
Many approaches to unsupervised morphology acquisition incorporate the frequency of character sequences with respect to each other to identify word stems and affixes. This typically involves heuristic search procedures and calibrating multiple arbitrary thresholds. We present a simple approach that uses no thresholds other than those invo...
Conference Paper
Full-text available
Word sense disambiguation is typically phrased as the task of labeling a word in context with the best-fitting sense from a sense inventory such as WordNet. While questions have often been raised over the choice of sense inventory, computational linguists have readily accepted the best-fitting sense methodology despite the fact that the case for di...
Conference Paper
Full-text available
Both vector space models and graph random walk models can be used to determine similarity between concepts. Noting that vectors can be regarded as local views of a graph, we directly compare vector space models and graph random walk models on standard tasks of predicting human similarity ratings, concept categorization, and semantic pr...
Conference Paper
Semantic space models represent the meaning of a word as a vector in high-dimensional space. They offer a framework in which the meaning representation of a word can be computed from its context, but the question remains how they support inferences. While there has been some work on paraphrase-based inferences in semantic space, it is not clear how...
Article
Full-text available
In this article, we address the task of comparing and combining different semantic verb classifications within one language. We present a methodology for the manual analysis of individual resources on the level of semantic features. The resulting representations can be aligned across resources, and allow a contrastive analysis of these resources. I...
Article
Full-text available
We describe course adaptation and development for teaching computational linguistics for the diverse body of undergraduate and graduate students in the Department of Linguistics at the University of Texas at Austin. We also discuss classroom tools and teaching aids we have used and created, and we mention our efforts to develop a campus-wide compu...
Conference Paper
Full-text available
We describe course adaptation and development for teaching computational linguistics for the diverse body of undergraduate and graduate students in the Department of Linguistics at the University of Texas at Austin. We also discuss classroom tools and teaching aids we have used and created, and we mention our efforts to develop a campus-wide computati...
Article
Full-text available
We propose a new XML format for representing interlinearized glossed text (IGT), particularly in the context of the documentation and description of endangered languages. The proposed representation, which we call IGT-XML, builds on previous models but provides a more loosely coupled and flexible representation of different annotation layers. Desig...
Article
We express dominance constraints in the once-only nesting fragment of stratified context unification, which therefore is NP-complete.
Conference Paper
Full-text available
We propose a new, simple model for the automatic induction of selectional preferences, using corpus-based semantic similarity metrics. Focusing on the task of semantic role labeling, we compute selectional preferences for semantic roles. In evaluations the similarity-based model shows lower error rates than both Resnik's WordNet-based model a...
Article
Full-text available
This task consists of recognizing words and phrases that evoke semantic frames as defined in the FrameNet project (http://framenet.icsi.berkeley.edu), and their semantic dependents, which are usually, but not always, their syntactic dependents (including subjects). The training data was FN annotated sentences. In testing, participants automatical...
Conference Paper
Full-text available
This task consists of recognizing words and phrases that evoke semantic frames as defined in the FrameNet project (http://framenet.icsi.berkeley.edu), and their semantic dependents, which are usually, but not always, their syntactic dependents (including subjects). The training data was FN annotated sentences. In testing, participants automatically...
Conference Paper
We propose a new XML format for representing interlinearized glossed text (IGT), particularly in the context of the documentation and description of endangered languages. The proposed representation, which we call IGT-XML, builds on previous models but provides a more loosely coupled and flexible representation of different annotation layers. Desig...
Conference Paper
Full-text available
We address the problem of unknown word sense detection: the identification of corpus occurrences that are not covered by a given sense inventory. We model this as an instance of outlier detection, using a simple nearest neighbor-based approach to measuring the resemblance of a new item to a training set. In combination with a method that alleviat...
Article
Full-text available
In this paper, we describe the SALTO tool. It was originally developed for the annotation of semantic roles in the frame semantics paradigm, but can be used for graphical annotation of treebanks with general relational information in a simple drag-and-drop fashion. The tool additionally supports corpus management and quality control.
Article
Full-text available
This paper presents SHALMANESER, a software package for shallow semantic parsing, the automatic assignment of semantic classes and roles to free text. SHALMANESER is a toolchain of independent modules communicating through a common XML format. System output can be inspected graphically. SHALMANESER can be used either as a "black box" to obtain sema...
Article
Full-text available
This paper describes the SALSA corpus, a large German corpus manually annotated with role-semantic information, based on the syntactically annotated TIGER newspaper corpus (Brants et al., 2002). The first release, comprising about 20,000 annotated predicate instances (about half the TIGER corpus), is scheduled for mid-2006. In this paper we discuss...
Conference Paper
Full-text available
We analyze models for semantic role assignment by defining a meta-model that abstracts over features and learning paradigms. This meta-model is based on the concept of role confusability, is defined in information-theoretic terms, and predicts that roles realized by less specific grammatical functions are more difficult to assign. We find that co...
Article
Full-text available
In this paper, we present a rule-based system for the assignment of FrameNet frames by way of a "detour via WordNet". The system can be used to overcome sparse-data problems of statistical systems trained on current FrameNet data. We devise a weighting scheme to select the best frame(s) out of a set of candidate frames, and present first figures of...
Article
Full-text available
This paper presents a manual pilot study in cross-linguistic analysis at the predicate-argument level. Looking at translation pairs differing in their parts of speech, we find that predicate-argument structure abstracts somewhat from morphosyntactic language idiosyncrasies, but there is still considerable variation in the distribution of semantic...
Article
Full-text available
We describe a statistical approach to semantic role labelling that employs only shallow information.
Article
Full-text available
This thesis studies the Constraint Language for Lambda Structures (CLLS), which is interpreted over lambda terms represented as tree-like structures. Our main focus is on the processing of parallelism constraints, a construct of CLLS. A parallelism constraint states that two pieces of a tree have the same structure. We present a sound and complete...
Conference Paper
Full-text available
We present an LFG syntax-semantics interface for the semi-automatic annotation of frame semantic roles for German in the SALSA project. The architecture is intended to support a bootstrapping cycle for the acquisition of stochastic models for frame semantic role assignment, starting from manual annotations on the basis of the syntactically annotate...
Article
Full-text available
We present two XML formats for the description and encoding of semantic role information in corpora. The TIGER/SALSA XML format provides a modular representation for semantic roles and syntactic structure. The Text-SALSA XML format is a lightweight version of TIGER/SALSA XML designed for manual annotation with an XML editor rather than a special to...
Article
We describe the ongoing construction of a large, semantically annotated corpus resource as reliable basis for the large-scale acquisition of word-semantic information, e.g. the construction of domain-independent lexica. The backbone of the annotation is semantic roles in the frame semantics paradigm. We report experiences and evaluate the annotated...
Article
Full-text available
The Constraint Language for Lambda Structures (CLLS) is an expressive tree description language. It provides a uniform framework for underspecified semantics, covering scope, ellipsis, and anaphora. Efficient algorithms exist for the sublanguage that models scope. But so far no terminating algorithm exists for sublanguages that model ellipsis. We i...
Conference Paper
Full-text available
We describe the ongoing construction of a large, semantically annotated corpus resource as reliable basis for the large-scale acquisition of word-semantic information, e.g. the construction of domain-independent lexica. The backbone of the annotation are semantic roles in the frame semantics paradigm. We report experiences and evaluate the annotate...
