Miloslav Konopík’s research while affiliated with New Technologies for the Information Society and other places


Publications (50)


Annotation application window screens
R1 – The first round of annotation. S = summary sentence; A, B, C = report sentences. Pairs SA, SB, and SC were selected and annotated by Group 1. One annotator processed six full incidents (no overlap). The STS scores for the pairs not reused in the second round are, together with the corresponding sentences, labelled as the training dataset
R2 – S_n = summary sentence from R1; A_n, B_n, C_n = report sentences from R1. A distinct sequence was used for each validator
Statistics of the short preliminary annotation phase. Because the true time between submitted annotations is the sum of thinking time and waiting time (and only this sum was known to us), the thinking and waiting times shown here were estimated by ignoring the other component. (a) The dependence of the root mean squared error (RMSE) between the R1 and R2 scores on the estimated validator (R2) thinking time. All annotations were grouped into buckets (bucket size = 1 second), and the RMSE deviation from the original R1 score was calculated for each bucket. The initial spike corresponds to users who did not give the answer any significant thought. (b) The dependence of the Pearson correlation between the STS scores of R1 and R2 on the number of annotations already made by the individual R2 annotator (= calibration curve). Note that the value shown is not an average correlation between each R1 annotator and the corresponding R2 annotator but the correlation between all R1 annotators and all R2 annotators at once (a minimal computational sketch follows these captions)
The full round two (R2) on a temporal axis. The target sentence pairs in the context-free and context-dependent subphases of each phase were identical and presented in the same order. The 72-hour break between the context-free and context-dependent halves was enforced
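The bucketed RMSE curve in (a) and the calibration curve in (b) can be reproduced with a few lines of code. The following is a minimal sketch, assuming each annotation record is a dict with hypothetical keys r1, r2 (the two STS scores), think_time (estimated thinking time in seconds) and n_done (annotations the R2 annotator had already made); none of these field names come from the paper.

    import numpy as np
    from scipy.stats import pearsonr

    def rmse_by_thinking_time(records, bucket_seconds=1.0):
        """RMSE between R1 and R2 scores per thinking-time bucket (panel (a))."""
        buckets = {}
        for rec in records:
            key = int(rec["think_time"] // bucket_seconds)
            buckets.setdefault(key, []).append((rec["r1"], rec["r2"]))
        return {k * bucket_seconds:
                float(np.sqrt(np.mean([(r1 - r2) ** 2 for r1, r2 in pairs])))
                for k, pairs in sorted(buckets.items())}

    def calibration_curve(records):
        """Pearson correlation of all R1 vs. all R2 scores, grouped by the number
        of annotations the R2 annotator had already made (panel (b))."""
        by_count = {}
        for rec in records:
            by_count.setdefault(rec["n_done"], []).append((rec["r1"], rec["r2"]))
        return {n: pearsonr([p[0] for p in pairs], [p[1] for p in pairs])[0]
                for n, pairs in sorted(by_count.items()) if len(pairs) > 1}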


Czech news dataset for semantic textual similarity
  • Article
  • Publisher preview available

December 2024 · 16 Reads · Language Resources and Evaluation

This paper describes a novel dataset consisting of sentences with two different semantic similarity annotations: with and without surrounding context. The data originate from the journalistic domain in the Czech language. The final dataset contains 138,556 human annotations divided into train and test sets. In total, 485 journalism students participated in the creation process. To increase the reliability of the test set, we compute the final annotations as an average of 9 individual annotation scores. We evaluate the dataset quality by measuring inter- and intra-annotator agreement. Besides agreement numbers, we provide detailed statistics of the collected dataset. We conclude our paper with a baseline experiment of building a system for predicting the semantic similarity of sentences. Due to the massive number of training annotations (116,956), the model significantly outperforms an average annotator (Pearson correlation coefficient of 0.92 versus 0.86).
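As a concrete illustration of the evaluation described above, the sketch below computes the Pearson correlation of a system against gold scores obtained by averaging nine annotations per pair, and compares it with a leave-one-out "average annotator" baseline. The data are random placeholders, and the leave-one-out definition of the average-annotator score is an assumption, not a detail taken from the paper.

    import numpy as np
    from scipy.stats import pearsonr

    # Placeholder data: nine independent STS scores per test pair and system predictions.
    rng = np.random.default_rng(0)
    annotations = rng.uniform(0, 6, size=(1000, 9))
    gold = annotations.mean(axis=1)                      # final score = average of 9 annotations
    predictions = gold + rng.normal(0, 0.5, size=1000)   # a hypothetical system output

    system_r = pearsonr(predictions, gold)[0]

    # Leave-one-out "average annotator": correlate each annotator column with the
    # mean of the remaining eight, then average (an assumed definition).
    annotator_rs = [pearsonr(annotations[:, i],
                             np.delete(annotations, i, axis=1).mean(axis=1))[0]
                    for i in range(annotations.shape[1])]
    print(f"system r = {system_r:.2f}, average annotator r = {np.mean(annotator_rs):.2f}")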


Findings of the Third Shared Task on Multilingual Coreference Resolution

October 2024 · 11 Reads

The paper presents an overview of the third edition of the shared task on multilingual coreference resolution, held as part of the CRAC 2024 workshop. As in the previous two editions, the participants were challenged to develop systems capable of identifying mentions and clustering them based on identity coreference. This year's edition took another step towards real-world application by not providing participants with gold slots for zero anaphora, increasing the task's complexity and realism. In addition, the shared task was expanded to include a more diverse set of languages, with a particular focus on historical languages. The training and evaluation data were drawn from version 1.2 of the multilingual collection of harmonized coreference resources CorefUD, encompassing 21 datasets across 15 languages. Six systems competed in this shared task.


Exploring Multiple Strategies to Improve Multilingual Coreference Resolution in CorefUD

August 2024 · 31 Reads

Coreference resolution, the task of identifying expressions in text that refer to the same entity, is a critical component in various natural language processing (NLP) applications. This paper presents our end-to-end neural coreference resolution system, utilizing the CorefUD 1.1 dataset, which spans 17 datasets across 12 languages. We first establish strong baseline models, including monolingual and cross-lingual variations, and then propose several extensions to enhance performance across diverse linguistic contexts. These extensions include cross-lingual training, incorporation of syntactic information, a Span2Head model for optimized headword prediction, and advanced singleton modeling. We also experiment with headword span representation and long-document modeling through overlapping segments. The proposed extensions, particularly the heads-only approach, singleton modeling, and long-document prediction, significantly improve performance across most datasets. We also perform zero-shot cross-lingual experiments, highlighting the potential and limitations of cross-lingual transfer in coreference resolution. Our findings contribute to the development of robust and scalable coreference systems for multilingual coreference resolution. Finally, we evaluate our model on the CorefUD 1.1 test set and surpass the best comparably sized model from the CRAC 2023 shared task by a large margin. Our model is available on GitHub: https://github.com/ondfa/coref-multiling
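One of the extensions mentioned above is long-document modeling through overlapping segments. The function below is a minimal sketch of such segmentation, assuming a flat token list and made-up window parameters; it is not taken from the linked repository.

    def split_into_overlapping_segments(tokens, max_len=512, overlap=128):
        """Split a long document into overlapping token windows so that every
        mention is seen with context on both sides; windows advance by max_len - overlap."""
        assert 0 <= overlap < max_len
        segments, start = [], 0
        while start < len(tokens):
            end = min(start + max_len, len(tokens))
            segments.append((start, tokens[start:end]))
            if end == len(tokens):
                break
            start = end - overlap
        return segments

    # Example: a 1300-token document yields windows starting at 0, 384, 768 and 1152.
    doc = [f"tok{i}" for i in range(1300)]
    for offset, segment in split_into_overlapping_segments(doc):
        print(offset, len(segment))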






End-to-end Multilingual Coreference Resolution with Mention Head Prediction

September 2022 · 6 Reads

This paper describes our approach to the CRAC 2022 Shared Task on Multilingual Coreference Resolution. Our model is based on a state-of-the-art end-to-end coreference resolution system. Apart from joint multilingual training, we improved our results with mention head prediction. We also tried to integrate dependency information into our model. Our system ended up in 3rd place. Moreover, we reached the best performance on two datasets out of 13.


Evaluating Attribution Methods for Explainable NLP with Transformers

September 2022 · 13 Reads · 1 Citation · Lecture Notes in Computer Science

This paper describes the experimental evaluation of several attribution methods on two NLP tasks: sentiment analysis and multi-label document classification. Our motivation is to find the best method to use with Transformers to interpret model decisions. For this purpose, we introduce two new evaluation datasets. The first one is derived from the Stanford Sentiment Treebank, where the sentiment of individual words is annotated along with the sentiment of the whole sentence. The second dataset comes from the Czech Text Document Corpus, where we added keyword information assigned to each category. The keywords were manually assigned to each document and automatically propagated to categories via PMI. We evaluate each attribution method on several models of different sizes. The evaluation results are reasonably consistent across all models and both datasets, which indicates that both datasets with the proposed evaluation metrics are suitable for interpretability evaluation. We show how the attribution methods behave with respect to model size and task. We also consider practical applications: we show that while some methods perform well, they can be replaced with slightly worse-performing methods that require significantly less time to compute. Keywords: Explainable AI, Transformers, Document classification
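The keyword propagation step above relies on pointwise mutual information (PMI) between keywords and categories. A minimal sketch of document-level PMI estimation is shown below; the data layout (per-document keyword and category sets) and the propagation threshold are assumptions, not details from the paper.

    import math
    from collections import Counter

    def keyword_category_pmi(docs):
        """docs: iterable of (keywords, categories) pairs, one per document.
        Returns PMI(k, c) = log2(P(k, c) / (P(k) * P(c))) estimated over documents."""
        n, k_cnt, c_cnt, kc_cnt = 0, Counter(), Counter(), Counter()
        for keywords, categories in docs:
            n += 1
            k_cnt.update(set(keywords))
            c_cnt.update(set(categories))
            kc_cnt.update((k, c) for k in set(keywords) for c in set(categories))
        return {(k, c): math.log2((cnt / n) / ((k_cnt[k] / n) * (c_cnt[c] / n)))
                for (k, c), cnt in kc_cnt.items()}

    # A keyword could then be propagated to every category whose PMI with it
    # exceeds a chosen threshold (the threshold value is a free parameter here).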


System comparison. The baseline solution, if involved, was either modified internally, or only its predictions were used and modified subsequently ("files only"). "DL" stands for a deep learning solution.
Recall / Precision / F1 for individual secondary metrics. All scores are macro-averaged over all datasets. Note that the high recall and F1 MOR scores of ONDFA (relative to the STRAKA* systems) are caused by the fact that ONDFA does not use any post-processing restricting mention spans to the head.
Findings of the Shared Task on Multilingual Coreference Resolution

September 2022 · 18 Reads

This paper presents an overview of the shared task on multilingual coreference resolution associated with the CRAC 2022 workshop. Shared task participants were supposed to develop trainable systems capable of identifying mentions and clustering them according to identity coreference. The public edition of CorefUD 1.0, which contains 13 datasets for 10 languages, was used as the source of training and evaluation data. The CoNLL score used in previous coreference-oriented shared tasks was used as the main evaluation metric. There were 8 coreference prediction systems submitted by 5 participating teams; in addition, a competitive Transformer-based baseline system was provided by the organizers at the beginning of the shared task. The winning system outperformed the baseline by 12 percentage points (in terms of the CoNLL scores averaged across all datasets for individual languages).
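For reference, the primary metric mentioned above can be sketched as follows: the CoNLL score of one dataset is the unweighted mean of the MUC, B-cubed and CEAF-e F1 scores, and systems are ranked by this score averaged over datasets. The numbers below are made up purely for illustration.

    def conll_score(muc_f1, b3_f1, ceafe_f1):
        """CoNLL score for a single dataset: mean of MUC, B-cubed and CEAF-e F1."""
        return (muc_f1 + b3_f1 + ceafe_f1) / 3.0

    def primary_metric(per_dataset_f1_triples):
        """Average the per-dataset CoNLL scores, as used for the shared-task ranking."""
        scores = [conll_score(*t) for t in per_dataset_f1_triples]
        return sum(scores) / len(scores)

    # Hypothetical numbers for two datasets:
    print(primary_metric([(72.1, 65.4, 68.0), (70.3, 60.2, 63.5)]))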


Citations (25)


... The CRAC 2024 Shared Task on Multilingual Coreference Resolution (Novák et al., 2024) is a third iteration of a shared task, whose goal is to accelerate research in multilingual coreference resolution (Žabokrtský et al., 2023, 2022). This year, the shared task features 21 datasets in 15 languages from the CorefUD 1.2 collection (Popel et al., 2024). ...

Reference:

CorPipe at CRAC 2024: Predicting Zero Mentions from Raw Text
Findings of the Third Shared Task on Multilingual Coreference Resolution
  • Citing Conference Paper
  • January 2024

... We chose this dataset because our goal in this paper is to study real-life software projects and this dataset was curated using real-life OSS projects from GitHub. Furthermore, it has been used by many existing works [14,29,31,53,55,61,66,72]. CodeSearchNet is a collection of functions (both standalone functions and methods) extracted from real-life projects on GitHub along with their function signatures compiled as (comment, code) pairs. ...

MQDD: Pre-training of Multimodal Question Duplicity Detection for Software Engineering Domain
  • Citing Conference Paper
  • January 2023

... These datasets vary based on features such as domain, annotation schemes, and types of references labeled. These variations often lead to annotation inconsistencies, evaluation challenges, and domain limitations (Žabokrtský et al., 2023;Aloraini et al., 2024;Nedoluzhko et al., 2021b). This, along with the need for relatively large annotated datasets to train current state-of-the-art models, motivated us to look into automatic data annotation and harmonization. ...

Findings of the Second Shared Task on Multilingual Coreference Resolution

... Agirre et al. (2016);Cer et al. (2017)). However, a recent dataset of Sido et al. (2021) includes similarity annotations of Czech sentence pairs in document context, thus to our knowledge being the first STS dataset which could directly be applied to span detection modelling. ...

Czech News Dataset for Semantic Textual Similarity

... For our experiments, we have selected several of the systems used in Barberesi's evaluation [24], including Inscriptis, BoilerPy3, jusText and Dragnet. Several other tools were initially targeted but not included since their source code was not available online at the date the experiments were performed (e.g., Sido's forum extraction tool [25]) or due to various errors (e.g., the News-Please tool). We will continue to pursue the developers of these tools to include their systems in future versions of the created evaluation plugin. ...

English Dataset For Automatic Forum Extraction

Computación y Sistemas

... Text data is more essential nowadays than ever before; it is valuable and can easily be stored in massive amounts to be processed and mined [9]. The use of social media is expanding among the public, and customers' reviews and reactions are a necessary and powerful tool for quality services, sustainable tourism [10], transport, and other aspects such as maintenance. ...

Deep Learning for Text Data on Mobile Devices
  • Citing Conference Paper
  • September 2019

... Another use of an LSTM neural network with the self-attention mechanism (Humphreys and Sui, 2016) can be found in (Libovický et al., 2018). Similarly, Sido and Konopík (2019) tried to use curriculum learning with CNN and LSTM. Lehečka et al. (2020) pre-trained a BERT-based model for polarity detection with an improved pooling layer and a knowledge-distillation technique. ...

Curriculum Learning in Sentiment Analysis
  • Citing Chapter
  • July 2019

Lecture Notes in Computer Science

... The word vectors in both systems were trained using word2vec with the skip-gram architecture. The best performance was achieved by Konopík and Pražák (2018). They use a deep neural model with LSTM layers encoding character sequences and word sequences, together with wider context information obtained from Latent Dirichlet Allocation. ...

LDA in Character-LSTM-CRF Named Entity Recognition: 21st International Conference, TSD 2018, Brno, Czech Republic, September 11-14, 2018, Proceedings
  • Citing Chapter
  • September 2018

Lecture Notes in Computer Science

... These have demonstrated promising performance due to recent improvements in neural machine translation (NMT) [12,13,17]. Translation-based projection involves tree-to-tree mappings to build cross-lingual SRL-annotated corpora [31], based on tree/graph-based representations [33]. By contrast, our approach aims to accommodate divergences for SRL projection via word-to-word mapping without relying on additional structure (e.g., trees or graphs). ...

Cross-Lingual SRL Based upon Universal Dependencies
  • Citing Conference Paper
  • November 2017