
Sajeetha Thavareesan
Eastern University Sri Lanka | EUSL · Department of Mathematics
BSc (Hons) Special in Computer Science
About
Publications: 32
Reads: 13,052
Citations: 309
Introduction
Sajeetha currently works at the Department of Mathematics, Eastern University Sri Lanka. Sajeetha does research in NLP.
Her Google Scholar profile:
https://scholar.google.com/citations?hl=en&user=yiL0uLUAAAAJ&view_op=list_works&gmla=AJsN-F5j9btFITxsL5Sk_SRqobKnG8xrPM86IJKFJDqs7bJM9A7irYPlyF84jRL8-IdzkkK7zDxcDMwmFOOWmKZRhcediwhp0BzMJ0nvsySpryqckAAry5g
Publications
We present the results of the Dravidian-CodeMix shared task held at FIRE 2021, a track on sentiment analysis for Dravidian Languages in Code-Mixed Text. We describe the task, its organization, and the submitted systems. This shared task is the continuation of last year's Dravidian-CodeMix shared task held at FIRE 2020. This year's tasks included co...
With the fast growth of mobile computing and Web technologies, offensive language has become more prevalent on social networking platforms. Since offensive language identification in local languages is essential to moderate the social media content, in this paper we work with three Dravidian languages, namely Malayalam, Tamil, and Kannada, that are...
Social media has effectively become the prime hub of communication and digital marketing. As these platforms enable the free manifestation of thoughts and facts in text, images and video, there is an extensive need to screen them to protect individuals and groups from offensive content targeted at them. Our work intends to classify codemixed social...
To obtain extensive annotated data for under-resourced languages is challenging, so in this research, we have investigated whether it is beneficial to train models using multi-task learning. Sentiment analysis and offensive language identification share similar discourse properties. The selection of these tasks is motivated by the lack of large lab...
A meme is a piece of media created to share an opinion or emotion across the internet. Due to their popularity, memes have become a new form of communication on social media. However, due to their nature, they are increasingly being used in harmful ways such as trolling and cyberbullying. Various data modelling methods create different possibilitie...
In the last few decades, code-mixed offensive texts have become pervasive in social media posts. Social media platforms and online communities have shown much interest in offensive text identification in recent years. Consequently, the research community is also interested in identifying such content and has contributed to the development of corpora. Ma...
Tamil is a Dravidian language that is commonly used and spoken in the southern part of Asia. In the era of social media, memes have become a source of fun in people's day-to-day lives. Here, we try to analyze the true meaning of Tamil memes by categorizing them as troll and non-troll. We propose an ingenious model comprising a transformer-transfo...
In a world filled with serious challenges like climate change, religious and political conflicts, global pandemics, terrorism, and racial discrimination, an internet full of hate speech and abusive and offensive content is the last thing we desire. In this paper, we work to identify and promote positive and supportive content on these platforms. W...
This paper describes the IIITK team's submissions to the offensive language identification and troll meme classification shared tasks for Dravidian languages at the DravidianLangTech 2021 workshop at EACL 2021. Our best configuration for Tamil troll meme classification achieved a weighted average F1 score of 0.55, and for offensive language identification,...
This paper presents our work for the shared task on Offensive Language Identification in Dravidian Languages at EACL 2021. Offensive language detection on various social media platforms has been studied previously. But with the increase in diversity of users, there is a need to identify the offensive language in multilingual posts that are large...
In the last few decades, code-mixed offensive texts have become pervasive in social media posts. Social media platforms and online communities have shown much interest in offensive text identification in recent years. Consequently, the research community is also interested in identifying such content and has contributed to the development of corpora. Man...
This paper describes the IIITK team's submissions to the shared task on hope speech detection for equality, diversity and inclusion in Dravidian languages, organized by the LT-EDI 2021 workshop at EACL 2021. Our best configurations for the shared tasks achieve weighted F1 scores of 0.60 for Tamil, 0.83 for Malayalam, and 0.93 for English. We have secured ran...
This paper proposes a word embedding-based Part of Speech (POS) tagger for the Tamil language. The experiments are conducted with different word embeddings (BoW, TF-IDF, Word2vec, fastText and GloVe) that are created using the UJ_Tamil corpus. Different combinations of eight features with three classifiers linear SVM, Extreme Gradient Boosting and k-Nearest N...
Social media has penetrated multilingual societies; however, most of their members use English as the preferred language for communication. It is therefore natural for them to mix their cultural language with English during conversations, resulting in an abundance of multilingual data, called code-mixed data, available in today's world. Downstream NLP tasks...
Sentiment Analysis is the process of identifying and categorising the sentiments expressed in a text as positive or negative. The words that carry the sentiments are the keys to sentiment prediction. SentiWordNet is the sentiment lexicon used to determine the sentiment of texts. There are a huge number of sentiment terms that are not in the Se...
An Improved kNN Algorithm using K-means and fastText to Predict Sentiments Expressed in Tamil Texts
Sentiment Analysis (SA) is an application of Natural Language Processing (NLP) to extract the sentiments expressed in the text. In this paper, we experimented with five approaches to perform SA, namely, Lexicon based approach, Supervised Machine learning based approach, Hybrid approach, K-means with Bag of Words (BoW) approach and K-modes with BoW approa...
Sentiment Analysis (SA) is a hot topic in Natural Language Processing. It aims to detect the opinion expressed in the text. There is a substantial body of published research on SA in English, but little for Tamil. This paper reviews the research papers on SA of Tamil text, taking into consideration the corpus, preprocessing steps, features and clas...
Sentiment Analysis (SA) is an application of Natural Language Processing (NLP) to analyse the sentiments expressed in the text. It classifies text into categories of qualities and opinions such as good, bad, positive, negative, neutral, etc. It employs machine learning techniques and lexicons for the classification. Nowadays, people share their opinions...
The expeditious development of new technologies, and their improvement in performance and functionality, determines the new trends in strengthening teaching and learning practices in higher education settings. The development of mobile technologies enables users to be fully mobile, which leads to new ways of bridging the boundaries of time and space in educ...
An endeavor is made to advance the interaction between a general physician and a person via a smartphone application. This paper describes a tool with which we can improve the quality of treatment for patients using a mobile application. The application, MyCare, runs on several Android-based devices with Wi-Fi capability. This system allows users to...