Using Sentiment Information for Preemptive Detection of Toxic Comments in Online Conversations



Automatic detection of toxic comments online has received considerable research attention recently, but the focus has mostly been on detecting toxicity in individual messages after they have been posted. Some authors have instead tried to predict whether a conversation will derail into toxicity from the features of its first few messages. In this paper, we combine that approach with previous work on toxicity detection using sentiment information, and show how the sentiments expressed in the first messages of a conversation can help predict upcoming toxicity. Our results show that adding sentiment features does improve the accuracy of toxicity prediction, and they also allow us to make important observations on the general task of preemptive toxicity detection.
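The idea in the abstract can be sketched in a few lines of Python. This is an illustrative toy, not the authors' implementation: the hand-picked lexicon, the feature names, and the aggregation choices are all assumptions made for the sketch; the paper builds on established sentiment resources such as AFINN and SentiWordNet rather than a hand-written word list.

```python
import re
from statistics import mean

# Toy sentiment lexicon for illustration only -- the paper relies on
# established resources (e.g., AFINN, SentiWordNet), not this list.
LEXICON = {"great": 2, "thanks": 1, "wrong": -1, "stupid": -2, "hate": -3}

def sentiment_score(message):
    """Sum the lexicon scores of the words in one message."""
    words = re.findall(r"[a-z]+", message.lower())
    return sum(LEXICON.get(w, 0) for w in words)

def conversation_features(first_messages):
    """Aggregate per-message sentiment over the first messages of a
    conversation into features a classifier could use to predict
    whether the conversation will derail into toxicity."""
    scores = [sentiment_score(m) for m in first_messages]
    return {
        "mean_sentiment": mean(scores),
        "min_sentiment": min(scores),
        "negative_ratio": sum(s < 0 for s in scores) / len(scores),
    }

features = conversation_features(
    ["thanks, great point", "you are stupid and wrong"]
)
```

In a real system these sentiment features would be concatenated with the textual and conversational features used in prior derailment-prediction work before training the classifier.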

References
While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We make use of linear support vector machines exploiting a rich feature set and investigate which information sources contribute the most for the task. Experiments on a hold-out test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1 score of 64% and 61% for English and Dutch respectively, and considerably outperforms baseline systems.
Harassment by cyberbullies is a significant phenomenon on social media. Existing works for cyberbullying detection have at least one of the following three bottlenecks. First, they target only one particular social media platform (SMP). Second, they address just one topic of cyberbullying. Third, they rely on carefully handcrafted features of the data. We show that deep learning based models can overcome all three bottlenecks. Knowledge learned by these models on one dataset can be transferred to other datasets. We performed extensive experiments using three real-world datasets: Formspring (12k posts), Twitter (16k posts), and Wikipedia (100k posts). Our experiments provide several useful insights about cyberbullying detection. To the best of our knowledge, this is the first work that systematically analyzes cyberbullying detection on various topics across multiple SMPs using deep learning based models and transfer learning.
Conference Paper
In recent years, bullying and aggression against social media users have grown significantly, causing serious consequences to victims of all demographics. Nowadays, cyberbullying affects more than half of young social media users worldwide, suffering from prolonged and/or coordinated digital harassment. Also, tools and technologies geared to understand and mitigate it are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detect bullying and aggressive behavior on Twitter. We propose a robust methodology for extracting text, user, and network-based attributes, studying the properties of bullies and aggressors, and what features distinguish them from regular users. We find that bullies post less, participate in fewer online communities, and are less popular than normal users. Aggressors are relatively popular and tend to include more negativity in their posts. We evaluate our methodology using a corpus of 1.6M tweets posted over 3 months, and show that machine learning classification algorithms can accurately detect users exhibiting bullying and aggressive behavior, with over 90% AUC.
Based on data from the 2014 General Social Survey, this article examines the characteristics associated with being a victim of cyberbullying or cyberstalking within the last five years for the population aged 15 to 29. This article also examines the association between cyberbullying and cyberstalking and various indicators of trust, personal behaviour and mental health.
Conference Paper
This paper considers the task of sentiment classification of subjective text across many domains, in particular in scenarios where no in-domain data is available. Motivated by the more general applicability of such methods, we propose an extensible approach to sentiment classification that leverages sentiment lexicons and out-of-domain data to build a case-based system where solutions to past cases are reused to predict the sentiment of new documents from an unknown domain. In our approach the case representation uses a set of features based on document statistics, while the case solution stores sentiment lexicons employed on past predictions, allowing for later retrieval and reuse on similar documents. The case-based nature of our approach also allows for future improvements, since new lexicons and classification methods can be added to the case base as they become available. In a cross-domain experiment our method has shown robust results when compared to a baseline single-lexicon classifier where the lexicon has to be pre-selected for the domain in question.
Conference Paper
We present an approach to detecting hate speech in online text, where hate speech is defined as abusive speech targeting specific group characteristics, such as ethnic origin, religion, gender, or sexual orientation. While hate speech against any group may exhibit some common characteristics, we have observed that hatred against each different group is typically characterized by the use of a small set of high frequency stereotypical words; however, such words may be used in either a positive or a negative sense, making our task similar to that of word sense disambiguation. In this paper we describe our definition of hate speech, the collection and annotation of our hate speech corpus, and a mechanism for detecting some commonly used methods of evading common "dirty word" filters. We describe pilot classification experiments in which we classify anti-semitic speech, reaching an accuracy of 94%, precision of 68%, and recall of 60%, for an F1 measure of 0.6375.
Conference Paper
Detection of abusive language in user generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions, however these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine learning based method to detect hate speech on online user comments from two domains which outperforms a state-of-the-art deep learning approach. We also develop a corpus of user comments annotated for abusive language, the first of its kind. Finally, we use our detection tool to analyze abusive language over time and in different settings to further enhance our knowledge of this behavior.
Free to play? hate, harassment, and positive social experiences in online games
ADL, "Free to play? Hate, harassment, and positive social experiences in online games," 2019.
Using machine learning to detect cyberbullying
K. Reynolds, A. Kontostathis, and L. Edwards, "Using machine learning to detect cyberbullying," in Proceedings of the 10th International Conference on Machine Learning and Applications and Workshops (ICMLA), vol. 2. IEEE, 2011, pp. 241-244.
Forecasting the presence and intensity of hostility on Instagram using linguistic and social features
P. Liu, J. Guberman, L. Hemphill, and A. Culotta, "Forecasting the presence and intensity of hostility on Instagram using linguistic and social features," in Twelfth International AAAI Conference on Web and Social Media, 2018.
A new ANEW: Evaluation of a word list for sentiment analysis in microblogs
F. Å. Nielsen, "A new ANEW: Evaluation of a word list for sentiment analysis in microblogs," arXiv preprint arXiv:1103.2903, 2011.
Opinion mining in natural language processing using SentiWordNet and fuzzy
P. Tumsare, A. S. Sambare, S. R. Jain, and A. Olah, "Opinion mining in natural language processing using SentiWordNet and fuzzy," International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), vol. 3, pp. 154-158, 2014.
Opinion observer: Analyzing and comparing opinions on the web
B. Liu, M. Hu, and J. Cheng, "Opinion observer: Analyzing and comparing opinions on the web," in Proceedings of the 14th International Conference on World Wide Web, ser. WWW '05. New York, NY, USA: ACM, 2005, pp. 342-351.