Michael Kadantsev’s scientific contributions


Publications (2)


[Tables in this publication: Data description · Hate-related dataset characteristics · Model validation results for English Subtask A (%) · Performance of our final models for English Subtasks A & B, official results (%) · Model validation results for Marathi Subtask A (%) · +1 more]

Fine-tuning of Pre-trained Transformers for Hate, Offensive, and Profane Content Detection in English and Marathi
  • Conference Paper
  • Full-text available

December 2021 · 44 Reads · 20 Citations · Michael Kadantsev

This paper describes neural models developed for the Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages Shared Task 2021. Our team, neuro-utmn-thales, participated in two tasks on binary and fine-grained classification of English tweets containing hate, offensive, and profane content (English Subtasks A & B) and in one task on identifying problematic content in Marathi (Marathi Subtask A). For the English subtasks, we investigate the impact of additional hate speech corpora for fine-tuning transformer models. We also apply a one-vs-rest approach based on Twitter-RoBERTa to discriminate between hate, profane, and offensive posts. Our models ranked third in English Subtask A with an F1-score of 81.99% and second in English Subtask B with an F1-score of 65.77%. For the Marathi task, we propose a system based on Language-Agnostic BERT Sentence Embeddings (LaBSE). This model achieved the second-best result in Marathi Subtask A with an F1-score of 88.08%.
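A minimal sketch of the one-vs-rest fine-tuning setup mentioned in the abstract, written with the Hugging Face transformers and datasets libraries. The checkpoint name, label set, hyperparameters, and helper functions are illustrative assumptions rather than the authors' published configuration.

# Hypothetical sketch: one-vs-rest fine-tuning of a Twitter-RoBERTa checkpoint
# for fine-grained classification (English Subtask B). Checkpoint, labels, and
# hyperparameters are illustrative assumptions, not the authors' exact setup.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "cardiffnlp/twitter-roberta-base"   # assumed base checkpoint
LABELS = ["HATE", "OFFN", "PRFN"]                # fine-grained Subtask B labels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def make_binary_dataset(texts, labels, positive_label):
    """Relabel the data as positive_label vs. rest for one binary classifier."""
    ds = Dataset.from_dict({
        "text": texts,
        "label": [1 if y == positive_label else 0 for y in labels],
    })
    return ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128))

def train_one_vs_rest(texts, labels):
    """Fine-tune one binary classifier per class; at inference time, the class
    whose classifier assigns the highest positive probability is predicted."""
    classifiers = {}
    for positive_label in LABELS:
        model = AutoModelForSequenceClassification.from_pretrained(
            MODEL_NAME, num_labels=2)
        trainer = Trainer(
            model=model,
            args=TrainingArguments(
                output_dir=f"ovr-{positive_label}",   # illustrative output path
                num_train_epochs=3,
                per_device_train_batch_size=16,
                learning_rate=2e-5,
            ),
            train_dataset=make_binary_dataset(texts, labels, positive_label),
            tokenizer=tokenizer,
        )
        trainer.train()
        classifiers[positive_label] = model
    return classifiers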


[Table: Performance of our final model for Marathi Subtask A, official results (%)]
Fine-tuning of Pre-trained Transformers for Hate, Offensive, and Profane Content Detection in English and Marathi

October 2021 · 35 Reads

(Abstract identical to the December 2021 conference paper listed above.)
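For Marathi Subtask A, the abstract mentions a system based on Language-Agnostic BERT Sentence Embeddings (LaBSE). The following sketch pairs LaBSE embeddings from the sentence-transformers package with a simple logistic-regression classifier; this pipeline is an illustrative assumption and does not reproduce the authors' actual architecture or training procedure.

# Hypothetical sketch: LaBSE sentence embeddings + a simple classifier for
# Marathi Subtask A (binary detection of hate/offensive content).
# The frozen-encoder + logistic-regression pipeline is an illustrative
# assumption, not the authors' published model.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("sentence-transformers/LaBSE")  # multilingual encoder

def train_marathi_classifier(train_texts, train_labels):
    """Encode tweets with LaBSE and fit a classifier on the embeddings."""
    embeddings = encoder.encode(train_texts, batch_size=32)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(embeddings, train_labels)
    return clf

def predict(clf, texts):
    """Predict labels (e.g. HOF vs. NOT) for unseen tweets."""
    return clf.predict(encoder.encode(texts, batch_size=32))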

Citations (1)


... However, ensemble learning is a widely used method in hate speech detection especially in shared tasks and challenges, as mentioned in Section 2.1. We implement two of the state-of-the-art ensemble learning models which yielded the best results on their shared tasks (Wiedemann, Yimam, and Biemann 2020; Glazkova, Kadantsev, and Glazkov 2021). The results are reported in Table 4. Glazkova et al. (2021) proposed an ensemble model for HASOC in FIRE 2021 (Mandl et al. 2021). ...

Reference:

Constructing ensembles for hate speech detection
Fine-tuning of Pre-trained Transformers for Hate, Offensive, and Profane Content Detection in English and Marathi
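
The excerpt above refers to ensemble learning over hate speech classifiers. As a generic illustration of this family of methods (not the specific ensembles built by Wiedemann et al. 2020 or Glazkova et al. 2021), a hard-voting ensemble simply takes the most frequent label across member classifiers; the labels and example votes below are made up for illustration.

# Hypothetical sketch: hard (majority) voting over the predictions of several
# classifiers. This illustrates ensemble learning in general, not the exact
# ensembles built in the cited papers; the example labels are invented.
from collections import Counter

def majority_vote(predictions_per_model):
    """predictions_per_model: one list of labels per ensemble member, aligned
    by example. Returns the most frequent label for each example."""
    return [
        Counter(example_votes).most_common(1)[0][0]
        for example_votes in zip(*predictions_per_model)
    ]

# Three classifiers vote on four tweets (HOF = hate/offensive, NOT = neither).
model_a = ["HOF", "NOT", "HOF", "NOT"]
model_b = ["HOF", "HOF", "HOF", "NOT"]
model_c = ["NOT", "NOT", "HOF", "NOT"]
print(majority_vote([model_a, model_b, model_c]))  # ['HOF', 'NOT', 'HOF', 'NOT']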