Stefan Bott
Universität Stuttgart · Institute for Natural Language Processing

About

Publications: 36
Reads: 6,548
Citations: 577
Citations since 2016: 5 Research Items, 370 Citations
[Chart: citations per year, 2016–2022]

Publications (36)
Article
Full-text available
Particle verbs represent a type of multi-word expression composed of a base verb and a particle. The meaning of the particle verb is often, but not always, derived from the meaning of the base verb, sometimes in quite complex ways. In this work, we computationally assess the levels of German particle verb compositionality, with the use of distribut...
Conference Paper
Full-text available
German particle verbs represent a frequent type of multi-word-expression that forms a highly productive paradigm in the lexicon. Similarly to other multi-word expressions, particle verbs exhibit various levels of compositionality. One of the major obstacles for the study of compositionality is the lack of representative gold standards of human rati...
Conference Paper
Full-text available
This paper presents a novel gold standard of German noun-noun compounds, GhoSt-NN, including 868 compounds annotated with corpus frequencies of the compounds and their constituents, productivity and ambiguity of the constituents, semantic relations between the constituents, and compositionality ratings of compound-constituent pairs. Moreover, a sub...
Article
The way in which a text is written can be a barrier for many people. Automatic text simplification is a natural language processing technology that, when mature, could be used to produce texts that are adapted to the specific needs of particular users. Most research in the area of automatic text simplification has dealt with the English language. I...
Article
In this paper we study the effect of different lexical resources for selecting synonyms and strategies for word sense disambiguation in a lexical simplification system for the Spanish language. The resources used for the experiments are the Spanish EuroWordNet, the Spanish Open Thesaurus and a combination of both. As for the synonym selection strat...
Conference Paper
Full-text available
This article presents a distributional approach to predict the compositionality of German particle verbs by modelling changes in syntactic argument structure. We justify the experiments on theoretical grounds and employ GermaNet, Topic Models and Singular Value Decomposition for generalization, to compensate for data sparseness. Evaluating against...
Conference Paper
Full-text available
In this work we address the question of why different German particle verbs tend to occur with different frequency proportions in syntactically separated vs. non-separated forms. The problem has been studied from a theoretical point of view and the syntactic conditions that determine particle verb realization in separated and non-separated paradigms a...
Conference Paper
Full-text available
In the work presented here we assess the degree of compositionality of German Particle Verbs with a Distributional Semantics Model that relies only on word window information and has no access to syntactic information as such. Our method takes only the lexical distributional distance between the Particle Verb and its Base Verb as a predictor for co...
Article
In this paper we present the development of a text simplification system for Spanish. Text simplification is the adaptation of a text for the special needs of certain groups of readers, such as language learners, people with cognitive difficulties, and elderly people, among others. There is a clear need for simplified texts, but manual production a...
Conference Paper
Full-text available
German particle verbs are a type of multi-word expression which is often compositional with respect to a base verb. If they are compositional, they tend to express the same types of semantic arguments, but they do not necessarily express them in the same syntactic subcategorization frame: some arguments may be expressed by differing syntactic subcat...
Conference Paper
Full-text available
German particle verbs, like anblicken (to gaze at), combine a base verb (blicken) with a particle (an) to form a special kind of Multi Word Expression. Particle verbs may share the semantics of the base verb and the particle to a variable degree. However, while syntactic subcategorization frames tend to be good predictors for the semantics of verbs i...
Conference Paper
In this paper we study the effect of different lexical resources and strategies for selecting synonyms in a lexical simplification system for the Spanish language. The resources used for the experiments are the Spanish EuroWordNet, the Spanish Open Thesaurus and a combination of both. As for the synonym selection strategies, we have used both local...
Conference Paper
Even if dyslexia is neurological in origin, certain text modifications could make texts more accessible for people with dyslexia. We introduce DysWebxia 2.0, a model that integrates our findings from research conducted with this target group. It alters content and presentation of the text to make it more readable. We also present the current integr...
Conference Paper
We present a user study of two different automatic strategies that simplify text content for people with dyslexia. The strategies considered are the standard one (replacing a complex word with the simplest synonym) and a new one that presents several synonyms for a complex word if the user requests them. We compare texts transformed by both st...
Conference Paper
In this paper we present two components of an automatic text simplification system for Spanish, aimed at making news articles more accessible to readers with cognitive disabilities. Our system in its current state consists of a rule-based lexical transformation component and a module for syntactic simplification. We evaluate the two components sepa...
Conference Paper
Lexical simplification is the task of replacing a word in a given context by an easier-to-understand synonym. Although a number of lexical simplification approaches have been developed in recent years, most of them have been applied to English, with recent work taking advantage of parallel monolingual datasets for training. Here we present LexSiS,...
Conference Paper
In this paper we present an automatic text simplification system for Spanish which is intended to make texts more accessible for users with cognitive disabilities. This system aims to reduce the structural complexity of Spanish sentences by converting complex sentences into two or more simple sentences, thereby reducing reading difficulty.
Conference Paper
Full-text available
This paper addresses the problem of automatic text simplification. Automatic text simplification aims at reducing reading difficulty for people with cognitive disabilities, among other target groups. We describe an automatic text simplification system for Spanish which combines a rule-based core module with a statistical support module that cont...
Article
Full-text available
We present a system developed for the CoNLL-2009 Shared Task (Hajič et al., 2009). We extend the Carreras (2007) parser to jointly annotate syntactic and semantic dependencies. This state-of-the-art parser factorizes the built tree into second-order factors. We include semantic dependencies in the factors and extend their score function to com...
Article
Full-text available
In this paper we describe the development of a text simplification system for Spanish. Text simplification is the adaptation of a text to the special needs of certain groups of readers, such as language learners, people with cognitive difficulties and elderly people, among others. There is a clear need for simplified texts, but manual production an...
Conference Paper
Full-text available
We present a method for the sentence-level alignment of short simplified texts to the original texts from which they were adapted. Our goal is to align a medium-sized corpus of parallel text, consisting of short news texts in Spanish with their simplified counterparts. No training data is available for this task, so we have to rely on unsupervised lea...
Article
Full-text available
This paper presents a corpus of spoken narrative texts in Catalan, Italian, Spanish, English, and German. The aim of this corpus compilation is to create an empirical resource for a comparative study of Information Structure. A total of 68 speakers were asked to tell a story in an acoustically isolated room by looking at the pictures of three textl...
Article
Full-text available
Text simplification is the process of transforming a text into an equivalent which is more understandable for a target user. We focus on text simplification in the Spanish language and present a corpus-based study of simplification operations. The study has implications for the development of an automatic simplification system. © 2011 Sociedad Espa...
Article
Full-text available
The aim of this paper is to treat background (non-focus) elements as anaphoric, i.e. as objects which are formally subject to the same conditions which hold for other anaphora. The idea that backgrounds are anaphoric is inherent in many approaches to information structure, but they are rarely treated as such formally. I will discuss two problems wh...
Conference Paper
Full-text available
This paper presents CUCWeb, a 166 million word corpus for Catalan built by crawling the Web. The corpus has been annotated with NLP tools and made available to language users through a flexible web interface. The developed architecture is quite general, so that it can be used to create corpora for other languages.
Article
This paper presents the Corpus d'Ús del Català a la Web (CuCWeb), a 208 million word (125,000 documents) corpus automatically compiled from the Web. This corpus has been automatically processed so that additional linguistic information is available (apart from the word forms). A very flexible search interface has been implemented, which allows for...
Article
Full-text available
CATCG is a shallow morphosyntactic analysis system for Catalan, based on the Constraint Grammar formalism, which contains three basic tools: a morphological analyzer, a morphological tagger, and a shallow syntactic parser. CATCG is a shallow parser for Catalan. It uses the Constraint Grammar formalism and contains th...
Article
This paper focuses on the language processing tool being developed at our centre and briefly describes two of its applications. CATCG, our morphosyntactic analyser, is designed to deal with general written Catalan text. In CATCG the whole processing task has been divided into specific subtasks and for each one of them we try to apply the best strat...
Article
Full-text available
This dissertation investigates the interrelation between information structure and discourse structure. Information-structurally backgrounded material is here generally treated as being anaphoric in a very strict sense. It is argued that, apart from having more descriptive content, elements from the sentence background are not different from other...

Projects (4)
Archived project
Project
token- and type-based approaches to predict degrees of compositionality; figurative language and meaning shifts of multi-word expressions; focus on noun-noun compounds and particle verbs