Malte Pietsch's scientific contributions

Publications (4)

Preprint
The evaluation of question answering models compares ground-truth annotations with model predictions. However, as of today, this comparison is mostly lexically based and therefore misses answers that have no lexical overlap but are still semantically similar, thus treating correct answers as false. This underestimation of the true performance...
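
As a minimal illustrative sketch (not taken from the paper) of the SQuAD-style lexical comparison the abstract refers to, the following Python snippet computes exact match and token-level F1 and shows how a semantically correct answer with no lexical overlap scores zero; the example strings and helper names are purely illustrative.

import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, ground_truth: str) -> int:
    return int(normalize(prediction) == normalize(ground_truth))

def token_f1(prediction: str, ground_truth: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A semantically equivalent answer with no shared tokens is scored as wrong.
print(exact_match("the USA", "United States of America"))  # 0
print(token_f1("the USA", "United States of America"))     # 0.0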
Preprint
A major challenge of research on non-English machine reading for question answering (QA) is the lack of annotated datasets. In this paper, we present GermanQuAD, a dataset of 13,722 extractive question/answer pairs. To improve the reproducibility of the dataset creation approach and foster QA research on other languages, we summarize lessons learned...
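
For orientation, a short loading sketch, assuming the dataset is published on the Hugging Face Hub under the "deepset/germanquad" identifier and follows the usual SQuAD-style schema; field names may differ in the actual release.

from datasets import load_dataset

# Load the extractive QA pairs (train/test splits are assumed here).
german_quad = load_dataset("deepset/germanquad")

# Inspect one question/answer pair.
example = german_quad["train"][0]
print(example["question"])
print(example["answers"])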

Citations

... The major contribution of the authors is the formulation and analysis of four semantic answer similarity approaches that aim to resolve, to a large extent, the issues mentioned above. They also release two three-way annotated datasets: a subset of the English SQuAD dataset [9], the German GermanQuAD dataset [10], and NQ-open [3]. ...
... Owing to their lexical nature, the exact and partial matching modes, as well as the F1 measure, have the drawback that they focus on whether the extracted answer is literally the same as the one in the ground truth rather than on whether it provides equivalent information [70]. For example, consider question Q1 in Fig. 1. ...
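
For contrast with the lexical metrics above, here is a hedged sketch of one possible semantic answer similarity check using a sentence-transformers bi-encoder and cosine similarity; the cited work analyses several such approaches, and the model name below is an illustrative choice, not necessarily the one used by the authors.

from sentence_transformers import SentenceTransformer, util

# An off-the-shelf sentence embedding model (illustrative choice).
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

ground_truth = "United States of America"
prediction = "the USA"

# Embed both answers and compare them in embedding space.
embeddings = model.encode([ground_truth, prediction], convert_to_tensor=True)
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()

# A high cosine similarity flags the prediction as correct even though
# its lexical F1 against the ground truth is zero.
print(f"semantic similarity: {similarity:.2f}")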