Matúš Pikuliak
Kempelen Institute of Intelligent Technologies · Natural Language Processing

12
Publications
1,066
Reads
58
Citations

Publications (12)
Article
Full-text available
Hate speech should be tackled and prosecuted based on how it is operationalized. However, the existing theoretical definitions of hate speech are not sufficiently fleshed out or easily operable. To overcome this inadequacy, and with the help of interdisciplinary experts, we propose an empirical definition of hate speech by providing a list of 10 ha...
Preprint
Full-text available
We introduce a new Slovak masked language model called SlovakBERT in this paper. It is the first Slovak-only transformers-based model trained on a sizeable corpus. We evaluate the model on several NLP tasks and achieve state-of-the-art results. We publish the masked language model, as well as the subsequently fine-tuned models for part-of-speech ta...
Chapter
From a computer science perspective, addressing on-line hate speech is a challenging task that is attracting the attention of both industry (mainly social media platform owners) and academia. In this chapter, we provide an overview of state-of-the-art data-science approaches – how they define hate speech, which tasks they solve to mitigate the phen...
Preprint
Full-text available
From a computer science perspective, addressing on-line hate speech is a challenging task that is attracting the attention of both industry (mainly social media platform owners) and academia. In this chapter, we provide an overview of state-of-the-art data-science approaches - how they define hate speech, which tasks they solve to mitigate the phen...
Chapter
In this work we combine cross-lingual and cross-task supervision for zero-shot learning. Our main contribution is that we discovered that coupling models, i.e. models that share neither a task nor a language with the zero-shot target model, can improve the results significantly. Coupling models serve as a regularization for the other auxiliary mode...
Chapter
Many languages still lack the annotated training data needed for supervised learning. This issue is often addressed by using auxiliary supervision and so-called transfer learning. In this work we focus on the problem of combining two types of auxiliary supervision – cross-lingual and cross-task. Previous work has shown promising results for thi...
Article
Many intelligent systems in business, government or academia process natural language as an input for their inference or they might even communicate with users in natural language. The natural language processing within them is currently often done utilizing machine learning models. However, machine learning needs training data and such data are oft...
Chapter
Machine learning is an increasingly important approach to Natural Language Processing. Most languages however do not possess enough data to fully utilize it. When dealing with such languages it is important to use as much auxiliary data as possible. In this work we propose a combination of multitask and multilingual learning. When learning a new ta...
Preprint
The growing number of comments makes online discussions difficult to moderate by human moderators alone. Antisocial behavior is a common occurrence that often discourages other users from participating in discussion. We propose a neural network based method that partially automates the moderation process. It consists of two steps. First, we detect inappr...