Manfred Stede

Universität Potsdam · Department Linguistik

PhD

About

170
Publications
36,443
Reads
4,645
Citations
Additional affiliations
April 2001 - June 2020
Universität Potsdam
Position
  • Professor

Publications (170)
Article
Full-text available
In the last decade, the field of argument mining has grown notably. However, relatively few studies have investigated argumentation in social media and specifically on Twitter. Here, we provide what is, to our knowledge, the first critical in-depth survey of the state of the art in tweet-based argument mining. We discuss approaches to modelling the st...
Article
Background: Coherence is the quality that distinguishes discourse from a random collection of sentences. People with aphasia have been reported to produce less-coherent discourse than non-language-impaired speakers. It is largely unclear how coherence is established in natural language and what leads to its impairment in aphasia. Aims: This paper p...
Article
Full-text available
Reflecting in written form on one’s teaching enactments has been considered a facilitator for teachers’ professional growth in university-based preservice teacher education. Writing a structured reflection can be facilitated through external feedback. However, researchers noted that feedback in preservice teacher education often relies on holistic,...
Article
Parsing of argumentative structures has become a very active line of research in recent years. Like discourse parsing or any other natural language task that requires prediction of linguistic structures, most approaches choose to learn a local model and then perform global decoding over the local probability distributions, often imposing constraint...
Conference Paper
Full-text available
In response to (i) inconclusive results in the literature as to the properties of coreference chains in written versus spoken language, and (ii) a general lack of work on automatic coreference resolution on both spoken language and social media, we undertake a corpus study involving the various genre sections of OntoNotes, the Switchboard corpus, a...
Conference Paper
The performance of standard coreference resolution is known to drop significantly on Twitter texts. We improve the performance of the system of Lee et al. (2018), which was originally trained on OntoNotes, by retraining on manually annotated Twitter conversation data. Further experiments by combining different portions of OntoNotes with Twitter data s...
Article
Full-text available
Background Computational linguistic methodology allows quantification of speech abnormalities in non-affective psychosis. For this patient group, incoherent speech has long been described as a symptom of formal thought disorder. Our study is an interdisciplinary attempt at developing a model of incoherence in non-affective psychosis, informed by co...
Article
Full-text available
We present DiMLex-Bangla, a newly developed lexicon of discourse connectives in Bangla. The lexicon, upon completion of its first version, contains 123 Bangla connective entries, which are primarily compiled from the linguistic literature and translation of English discourse connectives. The lexicon compilation is later augmented by adding more con...
Article
Full-text available
In this paper, we present a tangible outcome of the TextLink network: a joint online database project displaying and linking existing and newly-created lexicons of discourse connectives in multiple languages. We discuss the definition and demarcation of the class of connectives that should be included in such a resource, and present the syntactic,...
Chapter
Argumentation Mining aims at finding components of arguments, as well as relations between them, in text. One of the largely unsolved problems is implicitness, where the text invites the reader to infer a missing component, such as the claim or a supporting statement. In the work of Wojatzki and Zesch (2016), an interesting implicitness problem is...
Article
Computer-assisted text coding can facilitate the analysis of large text collections. To evaluate the functionality of providing an analyst with a ranked list of suggestions for suitable text codes, we used a data set of discussion posts, which had been manually coded for reasons given for taking a stance on the topic of vaccination. We trained a lo...
Chapter
The OntoNotes corpus is widely used for training and testing coreference resolution systems, but little attention has so far been given to the differences between the genres of language that the corpus is composed of. We are primarily interested in the contrast between spoken and written language, and thus we conducted in-depth analy...
Chapter
We present an approach to the extraction of arguments for explicit discourse relations in German, as a sub-task of the larger task of shallow discourse parsing for German. Using the Potsdam Commentary Corpus, we evaluate two methods (one based on constituency trees, the other based on dependency trees) to extract both the internal and the external...
Poster
Speech deficits are common symptoms among Parkinson's Disease (PD) patients. The automatic assessment of speech signals is promising for the evaluation of the neurological state and the speech quality of the patients. Recently, progress has been made in applying machine learning and computational methods to automatically evaluate the speech of PD...
Poster
Full-text available
Speech deficits are common symptoms among Parkinson's Disease (PD) patients. The automatic assessment of speech signals is promising for the evaluation of the neurological state and the speech quality of the patients. Recently, progress has been made in applying machine learning and computational methods to automatically evaluate the speech of PD p...
Conference Paper
Full-text available
Incoherent discourse in schizophrenia has long been recognized as a dominant symptom of the mental disorder (Bleuler, 1911/1950). Recent studies have used modern sentence and word embeddings to compute coherence metrics for spontaneous speech in schizophrenia. While clinical ratings always have a subjective element, computational linguistic methodo...
Conference Paper
Full-text available
This paper presents RST-Tace, a tool for automatic comparison and evaluation of RST trees. RST-Tace serves as an implementation of Iruskieta's comparison method, which allows trees to be compared and evaluated without the influence of decisions at lower levels in a tree in terms of four factors: constituent, attachment point, nuclearity as well as...
Chapter
Strictly speaking, the generation (or synthesis) of argumentative text is outside the scope of mining, but nonetheless we consider the topic here, as it is a part of the wider field of argumentation technology and will be increasingly relevant for many applications. However, in contrast to analysis, the generation of arguments has so far received m...
Chapter
Unlike many of the standard tasks in NLP, argumentation mining is not a single unified process, but a constellation of subtasks, which are of different prominence depending on the goals of the underlying target application. For a (hypothetical) example, in order to obtain the gist of a Twitter conversation, it can be sufficient to extract claims an...
Chapter
The tasks explained in the previous two chapters were to label the components of an argument and, along the way, to identify spans of text that are not part of the argument. We illustrate this with an extended version of Example 6.1, using the subscripts C, S, A for claim, support, attack: (7.1) Last week I bought this new camera here. [You should...
Article
Argumentation mining is an application of natural language processing (NLP) that emerged a few years ago and has recently enjoyed considerable popularity, as demonstrated by a series of international workshops and by a rising number of publications at the major conferences and journals of the field. Its goals are to identify argumentation in text o...
Article
Starting from the perspective that discourse structure arises from the presence of coherence relations, we provide a map of linguistic discourse structuring devices (DRDs), and then focus on those found in written text: connectives. To subdivide this class further, we follow the recent idea of structuring the set of connectives by differentiating b...
Article
Full-text available
Parsing of argumentative structures has become a very active line of research in recent years. Like discourse parsing or any other natural language task that requires prediction of linguistic structures, most approaches choose to learn a local model and then perform global decoding over the local probability distributions, often imposing constraint...
Article
Arguments used when vaccination is debated on Internet discussion forums might give us valuable insights into reasons behind vaccine hesitancy. In this study, we applied automatic topic modelling on a collection of 943 discussion posts in which vaccine was debated, and six distinct discussion topics were detected by the algorithm. When manually cod...
Article
Full-text available
We present a lexicon of Dutch Discourse Connectives (DisCoDict). Its content was obtained using a two-step process, in which we first exploited a parallel corpus and a German seed lexicon, and then manually evaluated the candidate entries against existing connective resources for Dutch, using these resources to complete our lexicon. We compared con...
Chapter
Full-text available
The physical formats used to represent linguistic data and its annotations have evolved over the past four decades, accommodating different needs and perspectives as well as incorporating advances in data representation generally. This chapter provides an overview of representation formats with the aim of surveying the relevant issues for represent...
Article
Newspaper text can be broadly divided into the classes ‘opinion’ (editorials, commentary, letters to the editor) and ‘neutral’ (reports). We describe a classification system for performing this separation, which uses a set of linguistically motivated features. Working with various English newspaper corpora, we demonstrate that it significantly outper...
Article
Full-text available
Despite substantial progress in developing new sentiment lexicon generation (SLG) methods for English, the task of transferring these approaches to other languages and domains in a sound way still remains open. In this paper, we contribute to the solution of this problem by systematically comparing semi-automatic translations of common Engli...
Chapter
Full-text available
The recognition that certain aspects of sentences or utterances depend on previous discourse, speakers' knowledge management and packaging strategies, and are not fully describable in a narrow syntactic formalism has led to the formulation of Information Structure (IS) models. IS models differ in the number of concepts, phenomena, and theoretical a...
Chapter
For more than 10 years now, Sentiment Analysis has enjoyed enormous popularity in Computational Linguistics, one main reason being its great potential for practical applications, predominantly (but not only) for industrial purposes. We observe a tendency that early work referred to certain theoretical notions of Subjectivity, whereas a lot of the l...
Article
Full-text available
Argument mining has started to yield early results in automatic analysis of text to produce representations of reason-conclusion structures. This paper addresses for the first time the question of automatically extracting such structures from dialogical settings of argument. More specifically, we introduce theoretical foundations for dialogical arg...
Conference Paper
A simple conceptual model is employed to investigate events, and break the task of coreference resolution into two steps: semantic class detection and similarity-based matching. With this perspective an algorithm is implemented to cluster event mentions in a large-scale corpus. Results on test data from AQUAINT TimeML, which we annotated manually wi...
Conference Paper
We investigate the differences and the levels of difficulty for sentiment analysis on the two genres of newspaper text and Twitter text (tweets). Two existing systems are compared with respect to their performance on both genres: SentiStrength (Thelwall et al., 2012) and SO-CAL (Taboada et al., 2011). Both have similar architectures, using hand-built polar...
Article
In this paper, the authors consider argument mining as the task of building a formal representation for an argumentative piece of text. Their goal is to provide a critical survey of the literature on both the resulting representations (i.e., argument diagramming techniques) and on the various aspects of the automatic analysis process. For represent...
Article
Full-text available
Annotating linguistic data has become a major field of interest, both for supplying the necessary data for machine learning approaches to NLP applications, and as a research issue in its own right. This comprises issues of technical formats, tools, and methodologies of annotation. We provide a brief overview of these notions and then introduce the...
Article
The meaning of linguistic connectives has often been characterized in terms of their position in a bipartite (semantic, pragmatic) or a tripartite (content, epistemic, speech act) structure of domains, depending on what kinds of entities are being connected (largely: propositions or speech acts). This paper argues that a more fine-grained analysis...
Conference Paper
Full-text available
Speech synthesis nowadays is of acceptable quality for many purposes. Nonetheless there are applications where contextual and other pragmatic factors play an important role, which cannot be accounted for by straightforward text-to-speech (TTS) systems. This is the case for systems giving product comparisons and recommendations: For instance, an app...
Book
Full-text available
This white paper is part of a series that promotes knowledge about language technology and its potential. It addresses educators, journalists, politicians, language communities and others. The availability and use of language technology in Europe varies between languages. Consequently, the actions that are required to further support research and d...
Article
Given the contemporary trend to modular NLP architectures and multiple annotation frameworks, the existence of concurrent tokenizations of the same text represents a pervasive problem in everyday NLP practice and poses a non-trivial theoretical problem to the integration of linguistic annotations and their interpretability in general. This paper...
Book
Discourse Processing here is framed as marking up a text with structural descriptions on several levels, which can serve to support many language-processing or text-mining tasks. We first explore some ways of assigning structure on the document level: the logical document structure as determined by the layout of the text, its genre-specific content...
Conference Paper
Images and videos resulting from diagnostic imaging procedures such as echocardiography need to be analyzed and interpreted by physicians in order to diagnose diseases of the patient. This process can be split into two steps: in a first step, various morphological features depicted in the images have to be interpreted and described. Then, a diagnos...
Article
Full-text available
We present a lexicon-based approach to extracting sentiment from text. The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation. SO-CAL is applied to the polarity classification task, the process of assigning a positive or...
Chapter
As a result of the aging societies in the western world, the impact of dementia, with characteristics such as disorientation and obliviousness, is becoming a significant problem for an increasing number of people and for the health system. To enable dementia patients to regain a self-determined life, we have developed a mobile orientation system c...
Chapter
Text documents are structured on (at least) two separate levels: The “logical” structure is largely reflected in the layout (headlines, paragraphs, etc.), and the “content” structure specifies the functional zones that serve a part of the text’s overall communicative purpose. The latter is clearly genre-specific, whereas the former is independent o...
Article
Full-text available
A crucial step in the development of NLP systems is a detailed error analysis. Our system demonstration presents the infrastructure and the workflow for training classifiers for different NLP tasks and the verification of their predictions on annotated corpora. We describe an enhancement cycle of subsequent steps of classification and context-sensi...
Conference Paper
A central step in the automatic processing of court decisions is the identification of the various content zones, i.e., breaking up the document into functionally independent areas. We assembled a corpus of German court decisions and argue that this genre belongs to the class of semi-structured text documents. Currently, we are implementing zone id...
Conference Paper
Full-text available
We present a taxonomy and classification system for distinguishing between different types of paragraphs in movie reviews: formal vs. functional paragraphs and, within the latter, between description and comment. The classification is used for sentiment extraction, achieving improvement over a baseline without paragraph classification.
Conference Paper
Full-text available
Given the contemporary trend to modular NLP architectures and multiple annotation frameworks, the existence of concurrent tokenizations of the same text represents a pervasive problem in everyday NLP practice and poses a non-trivial theoretical problem to the integration of linguistic annotations and their interpretability in general. This paper...
Article
This short paper describes our ongoing work on representing the argument structure of a particular class of persuasive texts, and on reading experiments designed to investigate the effects of certain rhetorical devices, in particular the use of explicit argumentative connectives.
Article
Empirical studies of text coherence often use tree-like structures in the spirit of Rhetorical Structure Theory (RST) as representational device. This paper identifies several sources of ambiguity in RST-inspired trees and argues that such structures are therefore not as explanatory as a text representation should be. As an alternative, an approach...