Manfred Pinkal’s research while affiliated with Saarland University and other places


Publications (109)


Detecting Everyday Scenarios in Narrative Texts
  • Preprint

June 2019 · 18 Reads · Michael Roth · Manfred Pinkal

Script knowledge consists of detailed information on everyday activities. Such information is often taken for granted in text and needs to be inferred by readers. Therefore, script knowledge is a central component to language comprehension. Previous work on representing scripts is mostly based on extensive manual work or limited to scenarios that can be found with sufficient redundancy in large corpora. We introduce the task of scenario detection, in which we identify references to scripts. In this task, we address a wide range of different scripts (200 scenarios) and we attempt to identify all references to them in a collection of narrative texts. We present a first benchmark data set and a baseline model that tackles scenario detection using techniques from topic segmentation and text classification.


MCScript2.0: A Machine Comprehension Corpus Focused on Script Events and Participants

May 2019 · 22 Reads

We introduce MCScript2.0, a machine comprehension corpus for the end-to-end evaluation of script knowledge. MCScript2.0 contains approx. 20,000 questions on approx. 3,500 texts, crowdsourced based on a new collection process that results in challenging questions. Half of the questions cannot be answered from the reading texts, but require the use of commonsense and, in particular, script knowledge. We give a thorough analysis of our corpus and show that while the task is not challenging to humans, existing machine comprehension models fail to perform well on the data, even if they make use of a commonsense knowledge base. The dataset is available at http://www.sfb1102.uni-saarland.de/?page_id=2582


13. Semantic research in computational linguistics

February 2019 · 52 Reads · 1 Citation

Computational semantics is the branch of computational linguistics that is concerned with the development of methods for processing meaning information. Because a computer system that analyzes natural language must be able to deal with arbitrary real-world sentences, computational semantics faces a number of specific challenges related to the coverage of semantic construction procedures, the efficient resolution of ambiguities, and the ability to compute inferences. After initial successes with logic-based methods, the mainstream paradigm in computational semantics today is to let the computer automatically learn from corpora. In this article, we present both approaches, compare them, and discuss some recent initiatives for combining the two.




Figure 1: An example for a text snippet with two reading comprehension questions.
Figure 3: An example text with 2 questions from MCScript
Figure 4: Accuracy values of the baseline models on question types appearing > 25 times.
MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge
  • Article
  • Full-text available

March 2018 · 1,426 Reads · 49 Citations · Simon Ostermann · Michael Roth · [...] · Manfred Pinkal

We introduce a large dataset of narrative texts and questions about these texts, intended to be used in a machine comprehension task that requires reasoning using commonsense knowledge. Our dataset complements similar datasets in that we focus on stories about everyday activities, such as going to the movies or working in the garden, and that the questions require commonsense knowledge, or more specifically, script knowledge, to be answered. We show that our mode of data collection via crowdsourcing results in a substantial amount of such inference questions. The dataset forms the basis of a shared task on commonsense and script knowledge organized at SemEval 2018 and provides challenging test cases for the broader natural language understanding community.




Aligning Script Events with Narrative Texts

October 2017 · 18 Reads · 2 Citations

Script knowledge plays a central role in text understanding and is relevant for a variety of downstream tasks. In this paper, we consider two recent datasets which provide a rich and general representation of script events in terms of paraphrase sets. We introduce the task of mapping event mentions in narrative texts to such script event types, and present a model for this task that exploits rich linguistic representations as well as information on temporal ordering. The results of our experiments demonstrate that this complex task is indeed feasible.


Figure 1: A WikiHow activity example.
Table 2: Prediction examples.
Sequence to Sequence Learning for Event Prediction

September 2017 · 184 Reads · 2 Citations

This paper presents an approach to the task of predicting an event description from a preceding sentence in a text. Our approach explores sequence-to-sequence learning using a bidirectional multi-layer recurrent neural network. Our approach substantially outperforms previous work in terms of the BLEU score on two datasets derived from WikiHow and DeScript respectively. Since the BLEU score is not easy to interpret as a measure of event prediction, we complement our study with a second evaluation that exploits the rich linguistic annotation of gold paraphrase sets of events.


Citations (72)


... Feature extraction and classification techniques, including Fractal theory, Gabor filters, and SVM, are crucial for character recognition, with SVM classifier achieving 95.04% accuracy and offline k-NN classifier achieving 94.12% accuracy. The study [23] investigates the use of reading texts for automatic short answer scoring in German language learning, finding textual features improve classification accuracy, while other models suggest less instructor supervision. This paper investigates the role of text in short answer scoring in reading comprehension exercises, comparing student answers to instructor-provided answer-based models. ...

Reference:

Review on Smart Evaluation of Descriptive Answer Sheets
Using the text to evaluate short answers for reading comprehension exercises

... Open World Knowledge Incorporating external knowledge and commonsense has been a longstanding challenge in both dataset construction and model design (Zellers et al., 2019; Wanzare et al., 2019; Mikhalkova et al., 2020; Ashida and Sugawara, 2022). Efforts to address this challenge have been made over the years, with the emergence of LLMs providing a source potential of commonsense knowledge (Bosselut et al., 2019; Petroni et al., 2019). ...

Detecting Everyday Scenarios in Narrative Texts
  • Citing Conference Paper
  • January 2019

... Short Video Temporal Grounding (SVTG). SVTG methods aim to locate specific events within short videos, typically lasting from a few seconds to a few minutes [16,21,36,41]. There is extensive research in this area, which generally falls into proposal-based and proposal-free methods. ...

Grounding Action Descriptions in Videos
  • Citing Article
  • December 2013

Transactions of the Association for Computational Linguistics

... This work has tailored temporal commonsense as reading comprehension and presents a dataset as question-answering tasks. The field of question-answering has seen steady research progress in the NLP community, with a focus on general comprehension of text [38][39][40]. ...

SemEval-2018 Task 11: Machine Comprehension Using Commonsense Knowledge

... To evaluate the performance and assess the quality of the integrated cross-asset risk management system utilizing LLMs for real-time monitoring, we will utilize diverse datasets, including MCScript for common sense reasoning and narrative comprehension [26], CLIMATE-FEVER for verifying real-world climate claims [27], MURA for detecting abnormalities in musculoskeletal radiographs [28], the Norwegian Review Corpus for document-level sentiment analysis [29], and TaPaCo, which provides a corpus of sentential paraphrases across multiple languages [30]. These datasets collectively support the comprehensive evaluation of the proposed system in various domains within financial and risk management contexts. ...

MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge

... The stories were based on event chains extracted from the DeScript corpus of script knowledge (Wanzare et al. 2016). For each story, which represents a different script-based scenario, about 100 responses were collected. ...

DeScript: A crowdsourced Corpus for the Acquisition of High-Quality Script Knowledge

... Text-based Script Induction Temporal relations have always been the core of script (schema) related tasks, which can either be learned from data or human annotation. When human-written scripts are available, previous works have typically assumed that the human-provided ordering of steps is the only correct order (Jung et al., 2010; Ostermann et al., 2017; Nguyen et al., 2017; Lyu et al., 2021; Sakaguchi et al., 2021). Another line of work has attempted to learn event ordering from data alone, either by assuming that the events follow narrative order Jurafsky, 2008, 2009; Jans et al., 2012; Rudinger et al., 2015; Ahrendt and Demberg, 2016; Wang et al., 2017) or by using an event-event temporal relation classifier to predict the true ordering of events. ...

Aligning Script Events with Narrative Texts
  • Citing Article
  • October 2017

... The author would like to extend his gratitude to the NLCS organising committee and the reviewers of the present publication for their kind and helpful comments and the opportunity to return to NLCS. [Pustejovsky and Asher, 2000], [Gupta and Aha, 2005] and [Pinkal and Kohlhase, 2000]), but the sheer complexity of the concepts involved prevented really satisfactory analyses for a long while. ...

Feature Logic for Dotted Types: A Formalism for Complex Word Meanings
  • Citing Conference Paper
  • January 2000