Figure 1: Validation accuracies for the different languages, using four different input feature representations.
Source publication
We study the performance of monolingual and multilingual language models on question answering (QA) in three diverse languages: English, Finnish, and Japanese. We develop models for the tasks of (1) determining whether a question is answerable given the context and (2) identifying the answer text within the context using IOB tagging. Further...
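The abstract frames answer extraction as IOB tagging. As a minimal sketch of what that labeling looks like (the whitespace tokenization and plain B/I/O label names are illustrative assumptions, not the paper's exact scheme):

```python
# Illustrative sketch of IOB labeling for answer-span extraction.
# Tokenization and label names are assumptions, not the paper's setup.

def iob_labels(context_tokens, answer_tokens):
    """Tag tokens: B at the answer start, I inside the answer, O elsewhere."""
    labels = ["O"] * len(context_tokens)
    n = len(answer_tokens)
    for i in range(len(context_tokens) - n + 1):
        if context_tokens[i:i + n] == answer_tokens:
            labels[i] = "B"
            labels[i + 1:i + n] = ["I"] * (n - 1)
            break  # label only the first occurrence
    return labels

context = "The capital of Finland is Helsinki .".split()
answer = "Helsinki".split()
print(list(zip(context, iob_labels(context, answer))))
# [('The', 'O'), ..., ('Helsinki', 'B'), ('.', 'O')]
```

A tagger trained on such labels then recovers the answer span by collecting the B token and the I tokens that follow it.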
Contexts in source publication
Context 1
... show the validation accuracies for our different setups in Figure 1. Except for the Japanese model, the combination approach performs best or on par with the other methods. ...
Context 2
... on top of a frozen BERT model, the classifier is more capable of predicting the answerability of the question-context pair. A noticeable increase in accuracy on the validation dataset can be observed compared to the other, more primitive, models in all three languages, cf. Figure 1. Further, the BERT-based models come close to their maximum accuracy already after the first epoch; however, they fail to improve beyond 80%. ...
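A classifier head over a frozen BERT encoder might look like the following sketch, assuming the Hugging Face transformers library; the checkpoint name and the single linear head are assumptions, since the excerpt does not specify them:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Sketch of an answerability classifier over a frozen BERT encoder.
# Checkpoint name and head architecture are illustrative assumptions.
class AnswerabilityClassifier(torch.nn.Module):
    def __init__(self, encoder_name="bert-base-multilingual-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():
            p.requires_grad = False  # keep BERT frozen; train only the head
        hidden = self.encoder.config.hidden_size
        self.head = torch.nn.Linear(hidden, 2)  # answerable vs. not answerable

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():  # encoder stays fixed
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.last_hidden_state[:, 0])  # [CLS] representation

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
batch = tokenizer("Who wrote the book?", "The book was written by Ada.",
                  return_tensors="pt")
model = AnswerabilityClassifier()
logits = model(batch["input_ids"], batch["attention_mask"])
```

Because only the linear head receives gradients, such a model typically converges within an epoch or two, which is consistent with the behavior described above.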
Similar publications
Natural language tasks like Named Entity Recognition (NER) in the clinical domain on non-English texts can be very time-consuming and expensive due to the lack of annotated data. Cross-lingual transfer (CLT) is a way to circumvent this issue thanks to the ability of multilingual large language models to be fine-tuned on a specific task in one langu...
We present a multilingual Named Entity Recognition approach based on a robust and general set of features across languages and datasets. Our system combines shallow local information with clustering semi-supervised features induced on large amounts of unlabeled text. Understanding via empirical experimentation how to effectively combine various typ...
The adoption of large language models (LLMs) in education holds much promise. However, like many technological innovations before them, adoption and access can often be inequitable from the outset, creating more divides than they bridge. In this paper, we explore the magnitude of the country and language divide in the leading open-source and propri...
While several benefits have been realized for multilingual vision-language pretrained models, recent benchmarks across various tasks and languages show poor cross-lingual generalisation when multilingually pre-trained vision-language models are applied to non-English data, with a large gap between (supervised) English performance and (zero-shot) cross...