Viviana Patti
  • PhD in Computer Science
  • Professor (Associate) at University of Turin

About

  • Publications: 228
  • Reads: 28,722 (a read is counted each time someone views a publication summary, such as the title, abstract, and list of authors, clicks on a figure, or views or downloads the full text)
  • Citations: 5,404

Publications (228)
Preprint
We describe Evalita-LLM, a new benchmark designed to evaluate Large Language Models (LLMs) on Italian tasks. The distinguishing and innovative features of Evalita-LLM are the following: (i) all tasks are native Italian, avoiding issues of translating from Italian and potential cultural biases; (ii) in addition to well established multiple-choice ta...
Preprint
Full-text available
Gender-fair language aims at promoting gender equality by using terms and expressions that include all identities and avoid reinforcing gender stereotypes. Implementing gender-fair strategies is particularly challenging in heavily gender-marked languages, such as Italian. To address this, the Gender-Fair Generation challenge intends to help shift t...
Article
Full-text available
Stereotypes have been studied extensively in the fields of social psychology and, especially with the recent advances in technology, in computational linguistics. Stereotypes have also gained even more attention nowadays because of a notable rise in their dissemination due to demographic changes and world events. This paper focuses on ethnic stereo...
Preprint
Full-text available
There is a mismatch between psychological and computational studies on emotions. Psychological research aims at explaining and documenting internal mechanisms of these phenomena, while computational work often simplifies them into labels. Many emotion fundamentals remain under-explored in natural language processing, particularly how emotions devel...
Article
Full-text available
The possibility of raising awareness –especially in young generations– about misbehavior online such as hate speech, could help society to reduce its impact, and thus, its negative consequences. In the last years, the Computer Science Department of the University of Turin has designed various technologies that support educational projects and activ...
Chapter
Digital media have enabled the access to unprecedented literary knowledge. Authors, readers, and scholars are now able to discover and share an increasing amount of information about books and their authors. However, these sources of knowledge are fragmented and do not adequately represent non-Western writers and their works. In this paper we prese...
Chapter
Linguistic literature on irony discusses sarcasm as a form of irony characterized by its biting nature and the intention to mock a victim. This particular trait makes sarcasm apt to convey hate speech and not only humour. Previous works on abusive language stressed the need to address ironic language to lead the system to recognize correctly hate s...
Conference Paper
Full-text available
The Hate Speech Detection (HaSpeeDe3) task is the third edition of a shared task on the detection of hateful content in Italian tweets. It differs from the previous editions while maintaining continuity in analysing and contrasting hate speech (HS) on social media. While HaSpeeDe and HaSpeeDe2 were focused on HS against immigrants, Muslims and Roms...
Preprint
Full-text available
Digital media have enabled the access to unprecedented literary knowledge. Authors, readers, and scholars are now able to discover and share an increasing amount of information about books and their authors. However, these sources of knowledge are fragmented and do not adequately represent non-Western writers and their works. In this paper we prese...
Preprint
Full-text available
Biographical event detection is a relevant task for the exploration and comparison of the ways in which people's lives are told and represented. In this sense, it may support several applications in digital humanities and in works aimed at exploring bias about minoritized groups. Despite that, there are no corpora and models specifically designed f...
Preprint
Full-text available
Digital media have enabled the access to unprecedented literary knowledge. Authors, readers, and scholars are now able to discover and share an increasing amount of information about books and their authors. Notwithstanding, digital archives are still unbalanced: writers from non-Western countries are less represented, and such a condition leads to...
Chapter
Full-text available
Starting from the first edition held in 2007, EVALITA is the initiative for the evaluation of Natural Language Processing tools for Italian. We describe the EVALITA4ELG project, whose main aim is to systematically collect the resources released as benchmarks for this evaluation campaign, and make them easily accessible through the European Language...
Article
The generation of stereotypes allows us to simplify the cognitive complexity we have to deal with in everyday life. Stereotypes are extensively used to describe people who belong to a different ethnic group, particularly in racial hoaxes and hateful content against immigrants. This paper addresses the study of stereotypes from a novel perspective t...
Article
In this article, we investigate the case of human-machine dialogues in the specific domain of commercial customer care. We built a corpus of conversations between users and a customer-care chatbot of an Italian Telecom Company, focusing on a sample of conversations where users contact the service asking for explanations about billing issues or over...
Article
Full-text available
Abusive language is becoming a problematic issue for our society. The spread of messages that reinforce social and cultural intolerance could have dangerous effects on victims’ lives. State-of-the-art technologies are often effective at detecting explicit forms of abuse, leaving unidentified the utterances with very weak offensive language but a str...
Chapter
Full-text available
The Hate and Morality (HaMor) submission for the Profiling Hate Speech Spreaders on Twitter task at PAN 2021 ranked 19th out of 67 participating teams, according to the averaged accuracy value of 73% over the two languages: English (62%) and Spanish (84%). The method proposed four types of features for inferring use...
Preprint
Full-text available
Within the NLP community, a considerable number of language resources are created, annotated and released every day with the aim of studying specific linguistic phenomena. Although a variety of attempts to organize such resources have been carried out, a lack of systematic methods and of possible interoperability between resources are sti...
Chapter
Connotation is a dimension of lexical meaning at the semantic-pragmatic interface. Connotations can be used to express points of view, perspectives, and implied emotional associations. Variations in connotations of the same lexical item can occur at different levels of analysis: from individuals, to communities of speech, specific domains, and even ti...
Preprint
Full-text available
Despite the large number of computational resources for emotion recognition, there is a lack of data sets relying on appraisal models. According to Appraisal theories, emotions are the outcome of a multi-dimensional evaluation of events. In this paper, we present APPReddit, the first corpus of non-experimental data annotated according to this theor...
Article
Full-text available
Swearing plays a ubiquitous role in everyday conversations among humans, both in oral and textual communication, and occurs frequently in social media texts, typically featured by informal language and spontaneous writing. Such occurrences can be linked to an abusive context, when they contribute to the expression of hatred and to the abusive effe...
Article
In the last decade, the need to detect automatically irony to correctly recognize the sentiment and hate speech involved in online texts increased the investigation on humorous figures of speech in NLP. The slight boundaries among various types of irony lead to think of irony as a linguistic phenomenon that covers sarcasm, satire, humor and parody...
Article
Full-text available
Hate Speech and harassment are widespread in online communication, due to users' freedom and anonymity and the lack of regulation provided by social media platforms. Hate speech is topically focused (misogyny, sexism, racism, xenophobia, homophobia, etc.), and each specific manifestation of hate speech targets different vulnerable groups based on c...
Chapter
The possibility of raising awareness about misbehaviour online, such as hate speech, especially in young generations could help society to reduce their impact, and thus, their consequences. The Computer Science Department of the University of Turin has designed various technologies that support educational projects and activities in this perspectiv...
Chapter
The availability of large annotated corpora from social media and the development of powerful classification approaches have contributed in an unprecedented way to tackling the challenge of monitoring users’ opinions and sentiments in online social platforms across time, but have also raised the challenge of temporal robustness of such detection and monitor...
Article
Full-text available
The present work introduces MoralConvITA, the first Italian corpus of conversations on Twitter about immigration whose annotation is focused on how moral beliefs shape users' interactions. The corpus currently consists of a set of 1,724 tweets organized in adjacency pairs and annotated by referring to a pluralistic social psychology theory about mor...
Poster
Full-text available
In this paper we describe the largest corpus annotated with hate speech in the political domain in Italian. Policycorpus XL has 7000 tweets, manually annotated, and a presence of hate labels above 40%, while in other corpora of the same type it is usually below 30%. Here we describe the collection of data and test some baselines with simple classificat...
Conference Paper
Full-text available
In this paper we describe the Hate and Morality (HaMor) submission for the Profiling Hate Speech Spreaders on Twitter task at PAN 2021. We ranked 19th out of 66 participating teams, according to the averaged accuracy value of 73% reached by our proposed models over the two languages. We obtained the 43rd highest accuracy for English (62...
Conference Paper
Full-text available
This paper proposes a methodology for investigating populism by analyzing proto-slogans, nominal utterances (NUs) typical of a political community on social media. We extracted more than 700...
Article
Full-text available
Abusive language is an important issue in online communication across different platforms and languages. Having a robust model to detect abusive instances automatically is a prominent challenge. Several studies have been proposed to deal with this vital issue by modeling this task in the cross-domain and cross-lingual setting. This paper outlines a...
Article
Hate speech is an increasingly important societal issue in the era of digital communication. Hateful expressions often make use of figurative language and, although they represent, in some sense, the dark side of language, they are also often prime examples of creative use of language. While hate speech is a global phenomenon, current studies on au...
Preprint
Full-text available
Social media platforms provide users the freedom of expression and a medium to exchange information and express diverse opinions. Unfortunately, this has also resulted in the growth of abusive content with the purpose of discriminating people and targeting the most vulnerable communities such as immigrants, LGBT, Muslims, Jews and women. Because ab...
Article
Full-text available
Hate Speech in social media is a complex phenomenon, whose detection has recently gained significant traction in the Natural Language Processing community, as attested by several recent review works. Annotated corpora and benchmarks are key resources, considering the vast number of supervised approaches that have been proposed. Lexica play an impor...
Article
Full-text available
We present DEGARI (Dynamic Emotion Generator And ReclassIfier), an explainable system for emotion attribution and recommendation. This system relies on a recently introduced commonsense reasoning framework, the TCL logic, which is based on a human-like procedure for the automatic generation of novel concepts in a Description Logics knowledge base....
Article
Misogyny is a multifaceted phenomenon and can be linguistically manifested in numerous ways. The evaluation campaigns of EVALITA and IberEval in 2018 proposed a shared task of Automatic Misogyny Identification (AMI) based on Italian, English and Spanish tweets. Since the participating teams’ results were pretty low in the misogynistic behaviour cat...
Preprint
Full-text available
We present DEGARI (Dynamic Emotion Generator And ReclassIfier), an explainable system for emotion attribution and recommendation. This system relies on a recently introduced commonsense reasoning framework, the TCL logic, which is based on a human-like procedure for the automatic generation of novel concepts in a Description Logics knowledge base....
Conference Paper
Full-text available
SardiStance is the first shared task for Italian on the automatic classification of stance in tweets. It is articulated in two different settings: A) Textual Stance Detection, exploiting only the information provided by the tweet, and B) Contextual Stance Detection, with the addition of information on the tweet itself such as the number of retweets...
Conference Paper
Full-text available
The Hate Speech Detection (HaSpeeDe 2) task is the second edition of a shared task on the detection of hateful content in Italian Twitter messages. HaSpeeDe 2 is composed of a Main task (hate speech detection) and two Pilot tasks, (stereotype and nominal utterance detection). Systems were challenged along two dimensions: (i) time, with test data co...
Article
Full-text available
Starting from the first edition held in 2007, EVALITA is the initiative for the evaluation of Natural Language Processing tools for Italian. This paper describes the EVALITA4ELG project, whose main aim is at systematically collecting the resources released as benchmarks for this evaluation campaign, and making them easily accessible through the Eur...
Conference Paper
Full-text available
The detection of abusive or offensive remarks in social texts has received significant attention in research. In several related shared tasks, BERT has been shown to be the state-of-the-art. In this paper, we propose to utilize lexical features derived from a hate lexicon towards improving the performance of BERT in such tasks. We explore different...
Preprint
We present a novel corpus for personality prediction in Italian, containing a larger number of authors and a different genre compared to previously available resources. The corpus is built exploiting Distant Supervision, assigning Myers-Briggs Type Indicator (MBTI) labels to YouTube comments, and can lend itself to a variety of experiments. We repo...
Preprint
As a contribution to personality detection in languages other than English, we rely on distant supervision to create Personal-ITY, a novel corpus of YouTube comments in Italian, where authors are labelled with personality traits. The traits are derived from one of the mainstream personality theories in psychology research, named MBTI. Using persona...
Article
The freedom of expression given by social media has a dark side: the growing proliferation of abusive contents on these platforms. Misogynistic speech is a kind of abusive language, which can be simplified as hate speech targeting women, and it is becoming a more and more relevant issue in recent years. AMI IberEval 2018 and AMI EVALITA 2018 were t...
Article
Full-text available
In this paper we propose an approach to exploit the fine-grained knowledge expressed by individual human annotators during a hate speech (HS) detection task, before the aggregation of single judgments in a gold standard dataset eliminates non-majority perspectives. We automatically divide the annotators into groups, aiming at grouping them by simil...
Article
Full-text available
Interest has grown around the classification of stance that users assume within online debates in recent years. Stance has usually been addressed by considering users' posts in isolation, while social studies highlight that social communities may contribute to influence users' opinion. Furthermore, stance should be studied in a diachronic perspectiv...
Preprint
Interest has grown around the classification of stance that users assume within online debates in recent years. Stance has usually been addressed by considering users' posts in isolation, while social studies highlight that social communities may contribute to influence users' opinion. Furthermore, stance should be studied in a diachronic perspectiv...
Article
Full-text available
In the last years, the control of online user generated content is becoming a priority, because of the increase of online aggressiveness and hate speech legal cases. Considering the complexity and the importance of this issue, this paper presents an approach that combines the deep learning framework with linguistic features for the recognition of a...
Article
Full-text available
The availability of large annotated corpora from social media and the development of powerful classification approaches have contributed in an unprecedented way to tackle the challenge of monitoring users’ opinions and sentiments in online social platforms across time. Such linguistic data are strongly affected by events and topic discourse, and th...
Article
Full-text available
The paper describes the Web platform built within the project “Contro l’Odio”, for monitoring and contrasting discrimination and hate speech against immigrants in Italy. It applies a combination of computational linguistics techniques for hate speech detection and data visualization tools on data drawn from Twitter. It allows users to access a huge...
Conference Paper
Full-text available
This paper describes our participation in the TRAC-2 Shared Tasks on Aggression Identification. Our team, FlorUniTo, investigated the applicability of using an abusive lexicon to enhance word embeddings towards improving detection of aggressive language. The embeddings used in our paper are word-aligned pre-trained vectors for English, Hindi, and B...
Article
Full-text available
Stance Detection is the task of automatically determining whether the author of a text is in favor, against, or neutral towards a given target. In this paper we investigate the portability of tools performing this task across different languages, by analyzing the results achieved by a Stance Detection system (i.e. MultiTACOS) trained and tested in...
Chapter
Full-text available
Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and it is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian...
Conference Paper
Full-text available
The paper describes the Web platform built within the project "Contro l'odio", for monitoring and contrasting discrimination and hate speech against immigrants in Italy. It applies a combination of computational linguistics techniques for hate speech detection and data visualization tools on data drawn from Twitter. It allows users to access a huge...
Chapter
The number of social media users is ever-increasing. Unfortunately, this has also resulted in the massive rise of uncensored online hate against vulnerable communities such as immigrants, LGBT and women. Current work on the automatic detection of various forms of hate speech (HS) typically employs supervised learning, requiring manually annotated d...
Article
Full-text available
In the last decade, social media gained a very significant role in public debates, and despite the many intrinsic difficulties of analyzing data streaming from on-line platforms that are poisoned by bots, trolls, and low-quality information, it is undeniable that such data can still be used to test the public opinion and overall mood and to investi...
Conference Paper
Full-text available
The paper proposes an investigation on the role of populist themes and rhetoric in an Italian Twitter corpus of hate speech against immigrants. The corpus has been annotated with four new layers of analysis: Nominal Utterances, that can be seen as consistent with populist rhetoric; In-out-group rhetoric, a very common populist strategy to polariz...
Conference Paper
Full-text available
Important issues, such as abortion governmental laws, are discussed everyday online involving different opinions that could be favorable or not. Often the debates change tone and become more aggressive undermining the discussion. In this paper, we analyze the relation between abusive language and the stances of disapproval toward some controversial...
Article
Full-text available
Given the difficulties that still affect a correct identification of irony within the context of Sentiment Analysis tasks, in this paper we describe the main issues that emerged during the development of a novel resource for Italian annotated for irony. The project mainly consists in the application to the Twitter corpus TWITTIRÒ of a multi-layered s...
Article
Full-text available
Background: Demographers are increasingly interested in connecting demographic behaviour and trends with 'soft' measures, i.e., complementary information on attitudes, values, feelings, and intentions. Objective: The aim of this paper is to demonstrate how computational linguistic techniques can be used to explore opinions and semantic orientations r...
Preprint
Full-text available
Analysing how people react to rumours associated with news in social media is an important task to prevent the spreading of misinformation, which is nowadays widely recognized as a dangerous tendency. In social media conversations, users show different stances and attitudes towards rumourous stories. Some users take a definite stance, supporting or...
Conference Paper
Full-text available
The Italian Emoji Prediction task (ITAmoji) is proposed at EVALITA 2018 evaluation campaign for the first time, after the success of the twin Multilingual Emoji Prediction Task, organized in the context of SemEval-2018 in order to challenge the research community to automatically model the semantics of emojis in Twitter. Participants were...
Article
Sentiment analysis in social media is a popular task attracting the interest of the research community, also in recent evaluation campaigns of natural language processing tasks in several languages. We report on our experience in the organization of SENTIPOLC (SENTIment POLarity Classification Task), a shared task on sentiment classification of Ita...
Poster
In the last years, the necessity of controlling user-generated content online is becoming a priority, because of the increase of cases of online aggressiveness and hate speech. Considering the complexity and the importance of this issue, in this paper we present an approach that combines deep learning framework with linguistic features for...
Chapter
In this work, we propose a variant of a well-known instance-based algorithm: WKNN. Our idea is to exploit task-dependent features in order to calculate the weight of the instances according to a novel paradigm: the Textual Attraction Force, that serves to quantify the degree of relatedness between documents. The proposed method was applied to a cha...
Chapter
The presence of figurative language represents a big challenge for sentiment analysis. In this work, we address the task of assigning sentiment polarity to Twitter texts when figurative language is employed, with a special focus on the presence of ironic devices. We introduce a pipeline model which aims to assign a polarity value exploiting, on the...
Conference Paper
In this paper we present a data visualization platform designed to support the Natural Language Processing (NLP) scholar to study and analyze different corpora collected with the purpose to understand the hate speech phenomenon in social media. The project started with the creation of a corpus which collects tweets addressed to specific groups of e...
Conference Paper
Full-text available
This document contains the Guidelines for Participants to the task IronITA (Irony Detection in Italian Tweets) @ EVALITA 2018. The task consists in automatically annotating messages from Twitter for irony and sarcasm and it is organized in a main task (Task A) centered on irony, and a subtask (Subtask B) centered on sarcasm, whose results will be s...
Conference Paper
Full-text available
The number of communications and messages generated by users on social media platforms has progressively increased in the last years. Therefore, the issue of developing automated systems for a deep analysis of users' generated contents and interactions is becoming increasingly relevant. In particular, when we focus on the domain of online political...
Conference Paper
Full-text available
In this paper we describe the main issues that emerged within the application of a multi-layered scheme for the fine-grained annotation of irony (Karoui et al., 2017) on an Italian Twitter corpus, i.e. TWITTIRÒ, which is composed of about 1,500 tweets with various provenance. A discussion is proposed about the limits and advantages of the application of...
Conference Paper
Full-text available
EVALITA is a periodic evaluation campaign of Natural Language Processing (NLP) and speech tools for the Italian language. The general objective of EVALITA is to promote the development of language and speech technologies for the Italian language, providing a shared framework where different systems and approaches can be evaluated in a consistent ma...
Chapter
Full-text available
EVALITA is a periodic evaluation campaign of Natural Language Processing (NLP) and speech tools for the Italian language. The general objective of EVALITA is to promote the development of language and speech technologies for the Italian language, providing a shared framework where different systems and approaches can be evaluated in a consistent ma...
Chapter
EVALITA is a periodic evaluation campaign of Natural Language Processing (NLP) and speech tools for the Italian language. The general objective of EVALITA is to promote the development of language and speech technologies for the Italian language, providing a shared framework where different systems and approaches can be evaluated in a consistent ma...
Chapter
In this paper we describe our submission to the shared task of Automatic Misogyny Identification in English and Italian Tweets (AMI) organized at EVALITA 2018. Our approach is based on SVM classifiers and enhanced by stylistic and lexical features. Additionally, we analyze the use of the novel HurtLex multilingual linguistic resource, developed by...
Chapter
Full-text available
EVALITA is a periodic evaluation campaign of Natural Language Processing (NLP) and speech tools for the Italian language. The general objective of EVALITA is to promote the development of language and speech technologies for the Italian language, providing a shared framework where different systems and approaches can be evaluated in a consistent ma...