Conference Paper

Abusive Comments in Online Media and How to Fight Them: State of the Domain and a Call to Action


Article
Religious hatred is a serious problem on the Arabic Twitter space and has the potential to ignite terrorism and hate crimes beyond cyberspace. To the best of our knowledge, this is the first research effort investigating the problem of recognizing Arabic tweets that use inflammatory and dehumanizing language to promote hatred and violence against people on the basis of religious beliefs. In this work, we create the first public Arabic dataset of tweets annotated for religious hate speech detection. We also create three public Arabic lexicons of terms related to religion along with hate scores. We then present a thorough analysis of the labeled dataset, reporting the most targeted religious groups and the countries of origin of hateful and non-hateful tweets. The labeled dataset is then used to train seven classification models using lexicon-based, n-gram-based, and deep-learning-based approaches. These models are evaluated on a new, unseen dataset to assess the generalization ability of the developed classifiers. While using Gated Recurrent Units with pre-trained word embeddings provides the best precision (0.76) and F1 score (0.77), training the same neural network on additional temporal, user, and content features provides state-of-the-art performance in terms of recall (0.84).
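The precision, recall, and F1 figures quoted above are tied together by the standard harmonic-mean definition of F1. A minimal pure-Python sketch (the recall value of 0.78 used below is an assumed, illustrative input, not a figure reported in the paper):

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# With the paper's reported precision of 0.76 and an assumed recall of 0.78,
# the harmonic mean lands at the reported F1 of 0.77.
print(round(f1_score(0.76, 0.78), 2))  # → 0.77
```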
Conference Paper
The automatic misogyny identification (AMI) task proposed at IberEval and EVALITA 2018 is an example of the active involvement of scientific research in confronting the online spread of hateful content against women. Considering the encouraging results obtained for Spanish and English in the previous edition of AMI, in the EVALITA framework we tested the robustness of a similar approach based on topic and stylistic information on a new collection of Italian and English tweets. Moreover, to deal with the dynamism of language on social platforms, we also propose an approach based on automatically enriched lexica. Although resources such as these lexica prove useful for a specific domain like misogyny, the analysis of the results reveals the limitations of the proposed approaches.
Article
This paper addresses the important problem of discerning hateful content in social media. We propose a detection scheme that is an ensemble of Recurrent Neural Network (RNN) classifiers and incorporates various features associated with user-related information, such as the users' tendency towards racism or sexism. These data are fed as input to the classifiers along with word frequency vectors derived from the textual content. We evaluate our approach on a publicly available corpus of 16k tweets, and the results demonstrate its effectiveness in comparison to existing state-of-the-art solutions. More specifically, our scheme can successfully distinguish racist and sexist messages from normal text and achieves higher classification quality than current state-of-the-art algorithms.
Article
In recent years, the increasing propagation of hate speech on social media and the urgent need for effective countermeasures have drawn significant investment from governments, companies, and researchers. A large number of methods have been developed for automated hate speech detection online, which aims to classify textual content as non-hate or hate speech, in which case the method may also identify the targeting characteristics (i.e., types of hate, such as race and religion) in the hate speech. However, we notice a significant difference between the performance on the two classes (i.e., non-hate vs. hate). In this work, we argue for a focus on the latter problem for practical reasons. We show that it is a much more challenging task: our analysis of the language in typical datasets shows that hate speech lacks unique, discriminative features and is therefore found in the 'long tail' of a dataset, where it is difficult to discover. We then propose Deep Neural Network structures serving as feature extractors that are particularly effective at capturing the semantics of hate speech. Our methods are evaluated on the largest collection of hate speech datasets based on Twitter and are shown to outperform the best performing method by up to 5 percentage points in macro-average F1, or 8 percentage points in the more challenging case of identifying hateful content.
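The macro-average F1 used in the comparison above is the unweighted mean of per-class F1 scores, so a rare class such as hate counts as much as the majority class. A minimal pure-Python sketch (all confusion-matrix counts are invented for illustration):

```python
def class_f1(tp: int, fp: int, fn: int) -> float:
    """Per-class F1 computed directly from confusion-matrix counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(per_class_counts) -> float:
    """Unweighted mean of per-class F1 scores."""
    scores = [class_f1(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)

# Invented counts: a frequent non-hate class with high F1 and a rare
# 'long tail' hate class with much lower F1 dragging the macro average down.
counts = [(900, 50, 30),  # non-hate
          (40, 20, 40)]   # hate
print(round(macro_f1(counts), 3))  # → 0.764
```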
Conference Paper
Although it has been a part of the dark underbelly of the Internet since its inception, recent events have brought the discussion board site 4chan to the forefront of the world's collective mind. In particular, /pol/, 4chan's "Politically Incorrect" board, has become a central figure in the outlandish 2016 Presidential election. Even though 4chan has long been viewed as the "final boss of the Internet," it remains relatively unstudied in the academic literature. In this paper we analyze /pol/ along several axes using a dataset of over 8M posts. We first perform a general characterization that reveals how active posters are, as well as how some unique features of 4chan affect the flow of discussion. We then analyze the content posted to /pol/ with a focus on determining topics of interest and types of media shared, as well as the usage of hate speech and differences in poster demographics. We additionally provide quantitative evidence of /pol/'s collective attacks on other social media platforms, including a quantitative case study of /pol/'s attempt to poison anti-trolling machine learning technology by altering the language of hate on social media. Then, via analysis of comments from the tens of thousands of YouTube videos linked on /pol/, we provide a mechanism for detecting attacks from /pol/ threads on third-party social media services.
Article
This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff's alpha as well as Scott's pi and Cohen's kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in computational linguistics, may be more appropriate for many corpus annotation tasks -- but that their use makes the interpretation of the value of the coefficient even harder.
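As a concrete illustration of the kappa-like coefficients the survey discusses, Cohen's kappa corrects the observed agreement between two annotators for the agreement expected by chance from their label distributions. A minimal pure-Python sketch (the two annotation sequences are invented for illustration):

```python
from collections import Counter

def cohens_kappa(ann1, ann2):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(ann1) == len(ann2) and ann1
    n = len(ann1)
    # Observed agreement: fraction of items labelled identically.
    p_o = sum(a == b for a, b in zip(ann1, ann2)) / n
    # Expected agreement under independence, from each annotator's marginals.
    c1, c2 = Counter(ann1), Counter(ann2)
    p_e = sum((c1[label] / n) * (c2[label] / n) for label in c1.keys() | c2.keys())
    return (p_o - p_e) / (1 - p_e)

# Two annotators labelling six comments as abusive ("abu") or not ("ok"):
a = ["abu", "ok", "abu", "ok", "ok", "abu"]
b = ["abu", "ok", "ok",  "ok", "ok", "abu"]
print(round(cohens_kappa(a, b), 2))  # → 0.67
```

With five of six items in agreement (observed agreement 0.83) against an expected chance agreement of 0.5, kappa lands at 0.67, noticeably below the raw agreement rate — exactly the correction the surveyed coefficients are designed to make.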
Conference Paper
Science is a cumulative endeavour, as new knowledge is often created in the process of interpreting and combining existing knowledge. This is why literature reviews have long played a decisive role in scholarship. The quality of a literature review is particularly determined by the literature search process. As Sir Isaac Newton famously put it: "If I have seen further, it is by standing on the shoulders of giants." Drawing on this metaphor, the goal of writing a literature review is to reconstruct the giant of accumulated knowledge in a specific domain. In doing so, the literature search represents the fundamental first step: it makes up the giant's skeleton and largely determines its reconstruction in the subsequent literature analysis. In this paper, we argue that the process of searching the literature must be comprehensibly described. Only then can readers assess the exhaustiveness of the review, and only then can other scholars in the field more confidently (re)use the results in their own research. We set out to explore the methodological rigour of literature review articles published in ten major information systems (IS) journals and show that many of these reviews do not thoroughly document the process of literature search. The results of our analysis lead us to call for more rigour in documenting the literature search process and to present guidelines for crafting a literature review and search in the IS domain.
Article
Recent research indicates a high recall of Google Scholar searches for systematic reviews. These reports raised high expectations of Google Scholar as a unified and easy-to-use search interface. However, studies on the coverage of Google Scholar rarely used the search interface in a realistic approach but instead merely checked for the existence of gold-standard references. In addition, the severe limitations of the Google Scholar search interface must be taken into consideration when comparing it with professional literature retrieval tools. The objectives of this work are to measure the relative recall and precision of searches with Google Scholar under conditions derived from the structured search procedures conventional in scientific literature retrieval, and to provide an overview of the current advantages and disadvantages of the Google Scholar search interface in scientific literature retrieval. General and MEDLINE-specific search strategies were retrieved from 14 Cochrane systematic reviews. The Cochrane systematic review search strategies were translated into Google Scholar search expressions as faithfully as possible under consideration of the original search semantics. The references of the studies included in the Cochrane reviews were checked for their inclusion in the result sets of the Google Scholar searches, and relative recall and precision were calculated. We investigated Cochrane reviews with between 11 and 70 included references, totalling 396 references. The Google Scholar searches produced result sets of between 4,320 and 67,800 hits, totalling 291,190 hits. The relative recall of the Google Scholar searches had a minimum of 76.2% and a maximum of 100% (7 searches). The precision of the Google Scholar searches had a minimum of 0.05% and a maximum of 0.92%. The overall relative recall across all searches was 92.9%; the overall precision was 0.13%. The reported relative recall must be interpreted with care. It is a quality indicator of Google Scholar confined to an experimental setting that is unattainable in systematic retrieval due to the severe limitations of the Google Scholar search interface. Currently, Google Scholar does not provide elements necessary for systematic scientific literature retrieval, such as tools for incremental query optimization, export of a large number of references, a visual search builder, or a history function. Google Scholar is not ready to serve as a professional search tool for tasks where a structured retrieval methodology is necessary.
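The relative recall and precision figures above follow from the set overlap between a search's result set and a review's gold-standard references. A minimal pure-Python sketch (the reference identifiers and set sizes are invented for illustration):

```python
def relative_recall(gold: set, results: set) -> float:
    """Fraction of gold-standard references retrieved by the search."""
    return len(gold & results) / len(gold)

def precision(gold: set, results: set) -> float:
    """Fraction of retrieved hits that are gold-standard references."""
    return len(gold & results) / len(results)

# Toy review with 10 gold references; the search returns 4,000 hits,
# 9 of which are gold references.
gold = {f"ref{i}" for i in range(10)}
results = {f"hit{i}" for i in range(3991)} | {f"ref{i}" for i in range(9)}
print(relative_recall(gold, results))  # → 0.9
print(f"{100 * precision(gold, results):.3f} %")  # → 0.225 %
```

The toy numbers echo the pattern reported above: recall can look excellent while precision stays far below 1%, because the result sets are orders of magnitude larger than the gold-standard reference lists.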
Chapter
Abusive language has been corrupting online conversations since the inception of the internet. Substantial research effort has been put into the investigation and algorithmic resolution of the problem. Different aspects such as "cyberbullying", "hate speech", or "profanity" have undergone ample investigation; however, the vocabulary used is often inconsistent, with overlapping terms such as "offensive language" or "harassment". This has led to a state of confusion within the research community. The inconsistency can be considered an inhibitor for the domain: it increases the risk of unintentional redundant work and leads to undifferentiated machine learning classifiers that are hard to use and to justify. To remedy this effect, this paper introduces a novel configurable, multi-view approach to defining abusive language concepts.
Conference Paper
In recent years, online public discussions have faced a proliferation of racist as well as politically and religiously motivated hate comments, threats, and insults. With the failure of purely manual moderation, platform operators have started searching for semi-automated or even completely automated approaches to comment moderation. One promising option to (semi-)automate the moderation process is the application of Natural Language Processing and Machine Learning (ML) techniques. In this paper we describe the challenges that currently prevent the application of these techniques and therefore the development of (semi-)automated solutions. As most of the challenges (e.g., the curation of big datasets) require huge financial investments, only big players, such as Google or Facebook, will be able to address them; many smaller and medium-sized internet companies will fall behind. To allow these (media) companies to stay competitive, we design a novel Analytics as a Service (AaaS) offering that will also allow small and medium-sized enterprises to profit from ML decision support. We then use the identified challenges to evaluate the conceptual design of the business model and highlight areas of future research to enable the instantiation of the AaaS platform.
Conference Paper
User-generated online comments and posts increasingly contain abusive content that needs moderation from an ethical as well as a legislative perspective. The amount of comments and the need for moderation in our digital world often overpower the capacity of manual moderation. To remedy this, platforms often adopt semi-automated moderation systems. However, because such systems are typically black boxes, user trust in and acceptance of the system are not easily achieved, as black-box systems can be perceived as non-transparent, and moderating user comments is easily associated with censorship. Therefore, we investigate the relationship between system transparency through explanations, user trust, and system acceptance with an online experiment. Our results show that the transparency of an automatic online comment moderation system is a prerequisite for user trust in the system. However, the objective transparency of the moderation system does not influence the user's acceptance.
Article
The scientific study of hate speech, from a computer science point of view, is recent. This survey organizes and describes the current state of the field, providing a structured overview of previous approaches, including core algorithms, methods, and main features used. This work also discusses the complexity of the concept of hate speech, defined in many platforms and contexts, and provides a unifying definition. This area has an unquestionable potential for societal impact, particularly in online communities and digital media platforms. The development and systematization of shared resources, such as guidelines, annotated datasets in multiple languages, and algorithms, is a crucial step in advancing the automatic detection of hate speech.
Article
A taxonomy of literature reviews in education and psychology is presented. The taxonomy categorizes reviews according to: (a) focus; (b) goal; (c) perspective; (d) coverage; (e) organization; and (f) audience. The seven winners of the American Educational Research Association's Research Review Award are used to illustrate the taxonomy's categories. Data on the reliability of the taxonomy codings when applied by readers are presented. The results of a survey of review authors provide baseline data on how frequently different types of reviews appear in the education and psychology literature. How the taxonomy might help in judging the quality of literature reviews is discussed, along with more general standards for evaluating reviews.
Chapter
In this chapter we begin our discussion of some specific methods for supervised learning. These techniques each assume a (different) structured form for the unknown regression function, and by doing so they finesse the curse of dimensionality. Of course, they pay the possible price of misspecifying the model, and so in each case there is a tradeoff that has to be made. They take off where Chapters 3–6 left off. We describe five related techniques: generalized additive models, trees, multivariate adaptive regression splines, the patient rule induction method, and hierarchical mixtures of experts.
Article
A review of prior, relevant literature is an essential feature of any academic project. An effective review creates a firm foundation for advancing knowledge. It facilitates theory development, closes areas where a plethora of research exists, and uncovers areas where research is needed.
Article
We present VOSviewer, a freely available computer program that we have developed for constructing and viewing bibliometric maps. Unlike most computer programs that are used for bibliometric mapping, VOSviewer pays special attention to the graphical representation of bibliometric maps. The functionality of VOSviewer is especially useful for displaying large bibliometric maps in an easy-to-interpret way. The paper consists of three parts. In the first part, an overview of VOSviewer's functionality for displaying bibliometric maps is provided. In the second part, the technical implementation of specific parts of the program is discussed. Finally, in the third part, VOSviewer's ability to handle large maps is demonstrated by using the program to construct and display a co-citation map of 5,000 major scientific journals.
Countering dangerous speech to prevent mass violence during Kenya's 2013 elections
  • S Benesch
Benesch, S.: Countering dangerous speech to prevent mass violence during Kenya's 2013 elections. Tech. rep., Dangerous Speech Project (2013)
Automated hate speech detection and the problem of offensive language
  • T Davidson
  • D Warmsley
  • M Macy
  • I Weber
Davidson, T., Warmsley, D., Macy, M., Weber, I.: Automated hate speech detection and the problem of offensive language. In: Proceedings of Eleventh International Conference on Web and Social Media, ICWSM-2017, Montreal, Canada, pp. 512-515 (2017)
What happened after 7 news sites got rid of reader comments
  • J Ellis
Ellis, J.: What happened after 7 news sites got rid of reader comments (2015). https://www.niemanlab.org/2015/09/what-happened-after-7-news-sites-got-rid-of-reader-comments/
Merging datasets for hate speech classification in Italian
  • P Fortuna
  • I Bonavita
  • S Nunes
Fortuna, P., Bonavita, I., Nunes, S.: Merging datasets for hate speech classification in Italian. In: Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, EVALITA 2018, Turin, Italy, pp. 1-6 (2018)
The dark side of guardian comments
  • B Gardiner
  • M Mansfield
  • I Anderson
  • J Holder
  • D Louter
  • M Ulmanu
Gardiner, B., Mansfield, M., Anderson, I., Holder, J., Louter, D., Ulmanu, M.: The dark side of Guardian comments (2016). https://www.theguardian.com/technology/2016/apr/12/the-dark-side-of-guardian-comments
No comment! Why more news sites are dumping their comment sections
  • M Green
Green, M.: No comment! Why more news sites are dumping their comment sections (2018). https://www.kqed.org/lowdown/29720/no-comment-why-a-growing-number-of-news-sites-are-dumping-their-comment-sections
Methodological challenges for detecting interethnic hostility on social media
  • O Koltsova
Koltsova, O.: Methodological challenges for detecting interethnic hostility on social media. In: Bodrunova, S.S., et al. (eds.) INSCI 2018. LNCS, vol. 11551, pp. 7-18.
The classification of aggressive dialogue in social media platforms
  • J Langham
  • K Gosha
Langham, J., Gosha, K.: The classification of aggressive dialogue in social media platforms. In: Proceedings of 2018 ACM SIGMIS Conference on Computers and People Research, SIGMIS-CPR 2018, Buffalo-Niagara Falls, NY, USA, pp. 60-63 (2018)
How we analysed 70m comments on the guardian website
  • M Mansfield
Mansfield, M.: How we analysed 70m comments on the guardian website (2016). https://www.theguardian.com/technology/2016/apr/12/how-we-analysed-70m-comments-guardian-website
'It's able to create knowledge itself': Google unveils AI that learns on its own
  • I Sample
Sample, I.: 'It's able to create knowledge itself': Google unveils AI that learns on its own (2017). https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own
Nahezu jede zweite Zeitungsredaktion schränkt Online-Kommentare ein
  • S Siegert
Siegert, S.: Nahezu jede zweite Zeitungsredaktion schränkt Online-Kommentare ein [Nearly every second newspaper editorial office restricts online comments] (2016). http://www.journalist.de/aktuelles/meldungen/journalist-umfrage-nahezu-jede-2-zeitungsredaktion-schraenkt-onlinekommentare-ein.html
Historical trends in the usage statistics of content languages for websites
  • W3Techs
W3Techs: Historical trends in the usage statistics of content languages for websites, February 2020. https://w3techs.com/technologies/history_overview/content_language