Jelena Mitrović

Universität Passau · Faculty of Computer Science and Mathematics

PhD

About

56 Publications · 10,843 Reads · 319 Citations
Additional affiliations
September 2015 - October 2015
Institute for Language and Speech Processing
Position
  • Visiting Research Scientist

Publications (56)
Conference Paper
Full-text available
The paper presents a language-dependent model for classifying statements as ironic or non-ironic. The model uses various language resources: morphological dictionaries, a sentiment lexicon, a lexicon of markers, and a WordNet-based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbia...
Article
Full-text available
This paper surveys ontological modeling of rhetorical concepts, developed for use in argument mining and other applications of computational rhetoric, projecting their future directions. We include ontological models of argument schemes applying Rhetorical Structure Theory (RST); the RhetFig proposal for modeling; the related RetFig Ontology of Rhe...
Preprint
Full-text available
This paper presents our submission for the SemEval shared task 6, sub-task A on the identification of offensive language. Our proposed model, C-BiGRU, combines a Convolutional Neural Network (CNN) with a bidirectional Recurrent Neural Network (RNN). We utilize word2vec to capture the semantic similarities between words. This composition allows us...
Conference Paper
Full-text available
We discuss ontological modeling of legal terminology in SUMO (Pease, 2001) in combination with the lexico-semantic database WordNet (Fellbaum, 1998). Formal systems that allow for automated semantic interpretation of law supported by lexical resources can provide solutions to many tasks related to legal reasoning. We wish to formalize legal issues...
Conference Paper
Full-text available
Abusive language detection is an unsolved and challenging problem for the NLP community. Recent literature suggests various approaches to distinguish between different language phenomena (e.g., hate speech vs. cyberbullying vs. offensive language) and factors (degree of explicitness and target) that may help to classify different abusive language p...
Chapter
To date, the number of studies that address the generalization of argument models is still relatively small. In this study, we extend our stacking model from argument identification to an argument unit classification task. Using this model, and for each of the learned tasks, we address three real-world scenarios concerning the model robustness over...
Conference Paper
Individuals today face decision-making problems and opinion formation processes on a daily basis. Nevertheless, answering a comparative question by retrieving documents based only on traditional measures (such as TF-IDF and BM25) does not always satisfy the need. In this paper, we propose a multi-layer architecture to answ...
Article
This paper examines several widespread assumptions about artificial intelligence, particularly machine learning, that are often taken as factual premises in discussions on the future of patent law in the wake of ‘artificial ingenuity’. The objective is to draw a more realistic and nuanced picture of the human-computer interaction in solving technic...
Conference Paper
Full-text available
In this paper, we present our submission from the team CAROLL_Passau for subtask 1A of the HASOC 2021 workshop. Our presented model, C-BiGRU, is composed of a Convolutional Neural Network (CNN) together with a bidirectional Recurrent Neural Network (RNN). We utilized word embeddings to allow our model to capture the correlation between words in t...
Conference Paper
Full-text available
The main focus of the paper is the definitional revision and enrichment of offensive language typology, making reference to publicly available offensive language datasets and testing them on available pretrained lexical embedding systems. We review over 60 available corpora and compare tagging schemas applied there while making an attempt to explai...
Conference Paper
Full-text available
Argument identification is the cornerstone of a complete argument mining pipeline. Furthermore, it is the essential key for a wide spectrum of applications such as decision making, assisted writing, and legal counselling. Nevertheless, most existing argument mining approaches are limited to a single, specific domain. The problem of building a robus...
Research
Full-text available
The paper examines a set of assumptions about artificial intelligence, particularly machine learning, often taken as factual premises in discussions on the future of patent law in the wake of ‘artificial ingenuity’. The objective is to draw a more realistic and nuanced picture of the human-computer interaction in solving technical problem...
Conference Paper
Full-text available
Individuals today face decision-making problems and opinion formation processes on a daily basis, for example when debating or choosing between two similar products. However, answering a comparative question by retrieving documents based only on traditional measures (such as TF-IDF and BM25) does not always satisfy the need. T...
Article
Full-text available
Media has a substantial impact on public perception of events, and, accordingly, the way media presents events can potentially alter the beliefs and views of the public. One of the ways in which bias in news articles can be introduced is by altering word choice. Such a form of bias is very challenging to identify automatically due to the high conte...
Conference Paper
Full-text available
This paper presents our work on the refinement and improvement of the Serbian language part of Hurtlex, a multilingual lexicon of words to hurt. We pay special attention to adding multi-word expressions that can be seen as abusive, as such lexical entries are very important in obtaining good results in a plethora of abusive language detection tasks...
Preprint
Full-text available
In this paper, we introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we have collected and made available to the public. We present the results of a detailed co...
Conference Paper
The main tool of a lawyer is their language. Legal prose is bound by writing styles, especially in Germany. These styles ensure that, inter alia, judgments are written in a structured and comprehensive way. The writing style used for German judgments is called Urteilsstil and consists of several subcomponents. These subcomponents should be classifiable wi...
Conference Paper
Full-text available
This paper describes a neural network (NN) model that was used for participating in the OffensEval, Task 12 of the SemEval 2020 workshop. The aim of this task is to identify offensive speech in social media, specifically in tweets. The model we used, C-BiGRU, is composed of a Convolutional Neural Network (CNN) along with a bidirectional Recurrent N...
Conference Paper
Full-text available
The Common European Framework of Reference (CEFR) provides generic guidelines for the evaluation of language proficiency. Nevertheless, for automated proficiency classification systems, different approaches for different languages are proposed. Our paper evaluates and extends the results of an approach to Automatic Essay Scoring proposed as a part...
Article
Full-text available
This paper is about the Greek and Serbian multiword expressions (MWEs) that belong to the rhetorical figure simile. We use a corpus-driven crowdsourcing method to identify the most commonly used similes in Serbian and Greek. We attempt a first comparison of the two sets of data and discuss issues of simile encoding in lexical resources that are use...
Conference Paper
Full-text available
In this paper, we introduce the architecture used for our PAN@CLEF-2019 author profiling participation. In this task, we had to predict if the author of 100 tweets was a bot, a female human, or a male human user. This task is proposed from a multilingual perspective, for English and Spanish. We handled this task in two steps, using different featur...
Conference Paper
Full-text available
In the 2015 migration crisis thousands of refugees and migrants crossed the border to Hungary, Austria and Germany. The movements of these people are reflected in social media, especially on Twitter. In this paper we present a dataset of 3275 Tweets from the months September and October 2015. These Tweets are annotated regarding their relevance of...
Conference Paper
Full-text available
As part of the shared task of GermEval 2018 we developed a system that is able to detect offensive speech in German tweets. To increase the size of the existing training set we made an application for gathering trending tweets in Germany. This application also assists in manual annotation of those tweets. The main part of the training data consists...
Conference Paper
Full-text available
Automation in law can have far-reaching advantages in providing direct access to justice. Simple-to-use applications could help in querying legal concerns and obtaining a preliminary analysis. Legal professionals could use those applications for help with case research and for detecting edge conditions, inequality and loopholes (Ashley, 2017). The...
Poster
Full-text available
https://www.uni-hildesheim.de/~linde002/wnlex2018_poster_abstracts.pdf
Poster
Full-text available
Research related to rhetorical figures and their automatic processing for the Serbian language started with building the Ontology of Rhetorical Figures (Mladenović & Mitrović, 2013) which gives a formal description of 98 rhetorical figures and allows for their automatic processing. An overview of the way this ontology was built and evaluated will b...
Poster
Full-text available
http://typo.uni-konstanz.de/parseme/images/Meeting/2016-09-26-Dubrovnik-meeting/WG1-MITROVIC-MARKANTONATOU-MLADENOVIC-KRSTEV-POSTER.pdf
Conference Paper
Full-text available
The aim of this paper is to show a language-independent process of creating a new semantic relation between adjectives and nouns in wordnets. The existence of such a relation is expected to improve the detection of figurative language and sentiment analysis (SA). The proposed method uses an annotated corpus to explore the semantic knowledge contain...
Article
This paper presents a process of building a Sentiment Analysis Framework for Serbian (SAFOS). We created a hybrid method that uses a sentiment lexicon and Serbian WordNet (SWN) synsets assigned with sentiment polarity scores in the process of feature selection. As the use of stemming for morphologically rich languages (MRLs) may result in loss or g...
Conference Paper
Full-text available
Poster presented at the 2nd General Meeting of The IC1207 COST Action, PARSEME, an interdisciplinary scientific network devoted to the role of multi-word expressions (MWEs) in parsing.
Conference Paper
Full-text available
In this paper we present a set of tools that will help developers of wordnets not only to increase the number of synsets but also to ensure their quality, thus preventing a wordnet from becoming obsolete too soon. We discuss where the dangers lie in WordNet production and how they were faced in the case of the Serbian WordNet. Developed tools fall...
Conference Paper
Full-text available
In this paper we present a set of new additions and functionalities to recently introduced software tools and techniques that will help researchers in the area of semantics and especially developers of wordnets. The motivation lies in our wish to get an on-line, fully comprehensive, modular, multiuser and safe system for further development of th...
Conference Paper
Full-text available
The paper presents RetFig, a formal domain ontology of rhetorical figures for Serbian. This ontology is one of the necessary steps in developing tools for Natural Language Processing in the Serbian language, especially for tools pertinent to discourse analysis, sentiment analysis and opinion mining. The RetFig ontology was developed taking into acc...
Article
Full-text available
The goal of this paper is to point out the importance of crowdsourcing and to present some of the most successful projects that function on the basis of this management model, which originated in the business world but has found its way into the world of culture and science. The ways in which crowdsourcing systems function are explor...
Article
Full-text available
The paper presents details of the project “Europeana libraries: Aggregating digital content from Europe’s libraries”, with a special focus on the participation of the University Library “Svetozar Markovic”. This CIP-Best Practice Network ICT-PSP project brought together 24 institutions including some of Europe’s leading research libraries from...
Article
Full-text available
The paper presents innovative technologies that can enable significant improvements in the development and promotion of digital collections of local history material. Optical character recognition in the digitization of local history collections is a bottleneck that, because of its heavy demands on human resources, poses a significant problem for implementation, ...
Article
Full-text available
LIBER, CERL and CELN have teamed up with the Europeana Foundation in order to implement the project "Europeana libraries". This CIP ICT PSP project started in January 2011, and by 2013 a brand new library aggregator of Europeana will be operational and 5 million new items will be added to the digital portal of the European cultural heritage. To achi...

Question (1)
Question
I am interested in possible methods of evaluation of the collected data.

Projects (3)
Project
SSIX will search and index conversations taking place on social network services, such as Twitter, StockTwits and Facebook, including the most reliable and trustworthy news agencies, newspapers, blogs and industry publications. SSIX will classify and score content using a framework of qualitative and quantitative parameters called X-Scores, regardless of language, locale or data architecture.