Marcin Oleksy

Wrocław University of Science and Technology | WUT · Department of Computational Intelligence

PhD

About

32
Publications
8,546
Reads
679
Citations
Introduction
Marcin Oleksy currently works at the Department of Artificial Intelligence at Wrocław University of Science and Technology. Marcin does research in Computing in Social Science, Arts and Humanities, Data Mining and Artificial Intelligence. He is currently involved in the 'CLARIN-PL' project.


Publications (32)
Preprint
Full-text available
Advancements in AI and natural language processing have revolutionized machine-human language interactions, with question answering (QA) systems playing a pivotal role. The knowledge base question answering (KBQA) task, utilizing structured knowledge graphs (KG), allows for handling extensive knowledge-intensive questions. However, a significant ga...
Conference Paper
Full-text available
This article compiles research on the extraction of human characteristics using three different methods: questionnaires, annotations, and biases. We have performed an analysis of how personalized perception of texts is affected by individual human profile and bias. To acquire comprehensive knowledge about individual user preferences, we have gath...
Conference Paper
Full-text available
The article discusses the challenges of cross-linguistic dialogue act annotation, which involves using methods developed for a multilingual framework to annotate conversations in a specific language. The article specifically focuses on the research on dialogue act annotation in Polish based on the ISO standard. To ensure applicability across langua...
Article
Full-text available
OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and revolutionized the approach in artificial intelligence to human-model interaction. The first contact with the chatbot reveals its ability to provide detailed and precise answers in various areas. Several publications on ChatGPT evaluation test its effectiveness on well-kn...
Preprint
Full-text available
OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and revolutionized the approach in artificial intelligence to human-model interaction. The first contact with the chatbot reveals its ability to provide detailed and precise answers in various areas. Several publications on ChatGPT evaluation test its effectiveness on well-kn...
Conference Paper
Full-text available
This article presents the specification and evaluation of DiaBiz.Kom, the corpus of dialogue texts in Polish. The corpus contains transcriptions of telephone conversations conducted according to a prepared scenario. The transcripts of conversations have been manually annotated with a layer of information concerning communicative functions. DiaBiz.Ko...
Chapter
We introduce a comprehensive evaluation benchmark for the Polish Word Sense Disambiguation task. The benchmark consists of 7 distinct datasets with sense annotations based on plWordNet 4.2. As far as we know, our work is the first attempt to standardise existing sense-annotated data for Polish. We also follow the recent trends of neural WSD solutions and...
Article
The anonymization of unstructured texts has become a very popular and widely researched topic. This is due not only to the latest GDPR regulation, but also due to the development of state-of-the-art models in the field of natural language processing. The texts required for building such models have to be anonymized before and very often have to be...
Article
Full-text available
This paper discusses the problem of shallow parsing of Polish, specifically chunking. We describe the linguistic work on developing annotation guidelines, the manual corpus annotation, the preparation of neural models for chunking (the first for the Polish language), and the evaluation of these models. Finally, we describe con...
Chapter
In the paper, we deal with the problem of spatial expression recognition. The goal of this task is to recognize in text information structures that represent a relative spatial relationship between two objects (a trajector and a landmark) indicated by a preposition of location, for example, a book on the table. We used the Corpus of Polish Spatial Te...
Presentation
Full-text available
Sharing resources and managing text corpora in the CLARIN-PL infrastructure
Poster
Full-text available
In the paper we present the latest changes introduced to Inforex, a web-based system for qualitative and collaborative text corpora annotation and analysis. One of the most important changes is the release of the source code. The system is now available in a GitHub repository (https://github.com/CLARIN-PL/Inforex) as an open source project. The system c...
Conference Paper
Full-text available
In the paper we present the latest changes introduced to Inforex, a web-based system for qualitative and collaborative text corpora annotation and analysis. One of the most important changes is the release of the source code. The system is now available in a GitHub repository (https://github.com/CLARIN-PL/Inforex) as an open source project. The system c...
Conference Paper
In this paper we present a morpho-syntactic tagger dedicated to Computer-mediated Communication texts in Polish. Its construction is based on an expanded RNN-based neural network adapted to the work on noisy texts. Among several techniques, the tagger utilises fastText embedding vectors, sequential character embedding vectors, and Brown clustering...
Preprint
Full-text available
This article introduces the issue of recognition and normalisation of temporal expressions for the Polish language. We describe what temporal information is and present the TimeML specification, adapted to Polish, as a model for the description of temporal expressions. Classes of temporal expressions are presented as well as guidelines for annotation...
Conference Paper
Full-text available
This article presents the research in the recognition and normalization of Polish temporal expressions as the result of the first PolEval 2019 shared task. Temporal information extracted from the text plays a significant role in many information extraction systems, like question answering, event recognition or text summarization. A specification fo...
Article
Communicative planning has been widely criticized for having little to do with the official legal procedures and for low-quality spatial solutions. It has also been accused of being an empty concept, referring to an action that in itself has no content. This critique gives rise to the question: what is actually the role of the communicative and parti...
Conference Paper
Full-text available
In the paper we present an adaptation of the Liner2 framework to solve the BSNLP 2017 shared task on multilingual named entity recognition. The tool is tuned to recognize and lemmatize named entities for Polish.
Data
In the paper we present an adaptation of the Liner2 framework to solve the BSNLP 2017 shared task on multilingual named entity recognition. The tool is tuned to recognize and lemmatize named entities for Polish.
Presentation
Full-text available
In the paper we present an adaptation of the Liner2 framework to solve the BSNLP 2017 shared task on multilingual named entity recognition. The tool is tuned to recognize and lemmatize named entities for Polish.
Presentation
Full-text available
An important task of the CLARIN-PL Language Technology Centre is to provide tools that make corpus work convenient. During the lecture, participants will become familiar with the basic issues of processing and annotating texts in the Inforex system, using selected corpora as examples, both training corpora (KPWr, PCSN) and applied ones. ...
Article
Full-text available
In this paper, the problem of spatial relation recognition in Polish is examined. We present the different ways of distributing spatial information throughout a sentence by reviewing the lexical and grammatical signals of various relations between objects. We focus on the spatial usage of prepositions and their meaning, determined by the ‘conceptua...
Conference Paper
In the paper we cover the problem of spatial expression recognition in text for the Polish language. A spatial expression is a text fragment which describes the location of two or more physical objects relative to each other. The first part of the paper discusses a Polish corpus annotated with spatial expressions and the inter-annotator agreement. In the secon...
Article
Full-text available
Temporal Expressions in Polish Corpus KPWr. This article presents the results of recent research in the interpretation of Polish expressions that refer to time. These expressions are the source of information about when something happens, how often something occurs or how long something lasts. Temporal information, which can be extracted from te...
Article
Full-text available
Towards an event annotated corpus of Polish. The paper presents a typology of events built on the basis of the TimeML specification adapted to the Polish language. Some changes were introduced to the definition of the event categories and a motivation for event categorization was formulated. The event annotation task is presented on two levels – onto...
