Chapter

Overview of PAN 2020: Authorship Verification, Celebrity Profiling, Profiling Fake News Spreaders on Twitter, and Style Change Detection


Abstract

We briefly report on the four shared tasks organized as part of the PAN 2020 evaluation lab on digital text forensics and authorship analysis. Each task is introduced and motivated, and the results obtained are presented. Altogether, the four tasks attracted 230 registrations, yielding 83 successful submissions. This, and the fact that we continue to invite the submission of software rather than its run output via the TIRA experimentation platform, marks a good start into the second decade of PAN evaluation labs.


... It must also take the style of a text into account, evaluating sentence structure, the frequency of parts of speech, and the vocabulary and its variation. Within the PAN initiative for plagiarism detection, evaluation corpora covering various forms of plagiarism are created (Bevendorff et al. 2020). ...
Chapter
Full-text available
... Author verification has also been the focus of several iterations of the PAN shared tasks, e.g. (Stamatatos et al., 2014; Stamatatos et al., 2015; Bevendorff et al., 2020; Kestemont et al., 2021). These events contribute datasets and evaluation measures, thus allowing the community to compare different methods on the same basis and in turn boosting the development of author verification methods. ...
Article
Full-text available
The task of authorship verification consists of detecting whether two texts have been written by the same person. This paper describes the CLG Authorship Analytics software, which implements several individual methods as well as a stacked generalization system for authorship verification. The approach relies primarily on ensemble learning methods, i.e. repeatedly sampling the data in order to capture the invariant stylistic patterns. The approach is tested through a series of experiments designed to test the ability of the system to generalize, depending on various parameters. The code and results of the experiments are publicly available at https://github.com/erwanm/clg-authorship-experiments.
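The repeated-sampling idea behind such an ensemble can be sketched as follows; the bootstrap scheme, model count, and the abstract `train_fn` learner factory are illustrative assumptions, not the CLG implementation:

```python
# Illustrative sketch of an ensemble built by repeatedly sampling the training
# pairs (bagging-style), then aggregating the individual verdicts by vote.
# train_fn is a placeholder for any same-author/different-author learner.
import random

def train_bagged_classifiers(pairs, labels, train_fn, n_models=11):
    """Train n_models classifiers, each on a bootstrap sample of the pairs."""
    models = []
    for _ in range(n_models):
        idx = [random.randrange(len(pairs)) for _ in range(len(pairs))]
        models.append(train_fn([pairs[i] for i in idx],
                               [labels[i] for i in idx]))
    return models

def majority_vote(models, pair):
    """Return 1 (same author) if most ensemble members say so, else 0."""
    votes = sum(model(pair) for model in models)
    return 1 if votes > len(models) / 2 else 0
```

Resampling gives each member a slightly different view of the data, so the vote tends to retain only the stylistic signals that are stable across samples.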
... As a result, the need for automated fact verification has been recognised by several researchers. Workshops and shared tasks like FEVER [1], Fakeddit [7], Constraint2021 [16], PAN 2020 [17], the DeepFake challenge [18], etc. have drawn attention to this task. Along with FEVER, other datasets such as LIAR [3], CREDBANK [4], and Constraint2021 [19] have focused on fact verification of the textual modality. ...
Preprint
Full-text available
Fake news can spread quickly on social media, and it is important to detect it before it causes a lot of damage. Automatic fact/claim verification has recently become a topic of interest among diverse research communities. We present the findings of the Factify shared task, which aims to undertake multi-modal fact verification, organized as part of the De-Factify workshop at AAAI'22. The task is modeled as a multi-modal entailment task, where each input needs to be classified into one of five classes based on entailment and modality. A total of 64 teams participated in the Factify shared task, of which 9 submitted their predictions on the test set. The most successful models were BigBird or other variants of BERT. The highest F1 score averaged across all classes was 76.82%.
... Human aspects related to cognitive psychology affect the susceptibility and vulnerability of social network users to misinformation [26]. With regard to the psychological aspect of fake news adoption and diffusion, only recently has attention in the literature shifted towards utilizing the profiles and psychological patterns of social media users in order to classify them as fake or real news spreaders [6]. Recently, Giachanou et al. [17] showed that personality combined with contextual information has higher predictive power in classifying fake news spreaders. ...
Article
Full-text available
Fake news spreading is strongly connected with human involvement, as individuals tend to fall for, adopt, and circulate misinformation stories. Until recently, the role of human characteristics in fake news diffusion had not been explored to the full extent, even though it is key to deeply understanding and fighting misinformation patterns. This paper suggests a human-centric approach to detecting fake news spreading behavior by building an explainable fake-news-spreader classifier based on psychological and behavioral cues of individuals. Our model achieves promising classification results while offering explanations of the human motives and features behind fake news spreading behavior. Moreover, to the best of our knowledge, this is the first study that aims at providing a fully explainable setup that evaluates fake news spreading based on users' credibility, applied to public discussions, aiming at a comprehensive way to combat fake news through human involvement.
... Our fanfiction dataset is derived from the training set released with Bevendorff et al. (2020), which was collected by crawling fanfiction.net. The dataset consists of 278,169 stories by 41,000 distinct authors. ...
... Our method was relatively simple, but robust and proven as a good standard with great results. For example, TF-IDF-based methods are often winning solutions in PAN competitions (a series of scientific events and shared tasks on digital text forensics and stylometry) [4][17]. The relevance of our method was also demonstrated by inter-year experiments, where it achieved high accuracy across all tested programming languages (average accuracy above 0.96). ...
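As a rough illustration of why such baselines remain competitive, the following sketch builds TF-IDF vectors over character n-grams and compares documents by cosine similarity; the idf smoothing and n-gram order are illustrative choices, not a reconstruction of any particular PAN submission:

```python
# Illustrative TF-IDF baseline over character n-grams, of the kind that often
# performs strongly in stylometry tasks. The smoothed idf avoids zeroing out
# n-grams that occur in every document of a small corpus.
import math
from collections import Counter

def char_ngrams(text, n=3):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def tfidf_vectors(docs, n=3):
    counts = [Counter(char_ngrams(d, n)) for d in docs]
    df = Counter(g for c in counts for g in c)  # document frequency per n-gram
    N = len(docs)
    return [{g: tf * (1.0 + math.log(N / df[g])) for g, tf in c.items()}
            for c in counts]

def cosine(a, b):
    dot = sum(w * b[g] for g, w in a.items() if g in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Character n-grams capture sub-word habits (affixes, punctuation, spacing), which is one reason they travel well across topics and even programming languages.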
Conference Paper
Full-text available
Stylochronometry deals with the influence of time on an author's style, specifically how it changes stylometric features. Analysis of time drift is especially important for the dataset creation process of other works in this area. In this paper, we performed experiments using the Google Code Jam dataset to show the influence of time drift in the area of source code authorship attribution. Our experiments revealed that there is significant time drift in stylometric features at a one-year difference, and that it grows larger as the time difference increases. Another interesting result is that when training our authorship attribution method on data from the future and testing on data from the past, the time drift is lower than in the opposite direction. We also found a relation between the length of source code and the accuracy of our authorship attribution method.
... An interesting aspect of authorship problems is that technology used elsewhere in NLP has not yet penetrated them. Up until the very recent PAN 2018 and PAN 2020 authorship events [3,4], the most popular and effective approaches still largely relied on n-gram features and traditional machine learning classifiers, such as support vector machines (SVMs) [5] and trees [6]. Elsewhere, these methods have recently had to give up much of their spotlight to deep neural networks. ...
Preprint
Full-text available
We propose an unsupervised solution to the authorship verification task that utilizes pre-trained deep language models to compute a new metric called DV-Distance. The proposed metric measures the difference between the two authors against pre-trained language models. Our design addresses the problem of non-comparability in authorship verification, frequently encountered in small or cross-domain corpora. To the best of our knowledge, this paper is the first to introduce a method designed with non-comparability in mind from the ground up, rather than indirectly. It is also one of the first to use deep language models in this setting. The approach is intuitive, and it is easy to understand and interpret through visualization. Experiments on four datasets show our method matching or surpassing the current state of the art and strong baselines in most tasks.
... PAN provides evaluation resources consisting of large-scale corpora, performance measures, and web services that allow for meaningful evaluations. The main goal is to provide sustainable and reproducible evaluations and to get a clear view of the capabilities of state-of-the-art algorithms [19]. ...
Article
Full-text available
This is a report on the tenth edition of the Conference and Labs of the Evaluation Forum (CLEF 2020), (virtually) held from September 22–25, 2020, in Thessaloniki, Greece. CLEF was a four-day event combining a Conference and an Evaluation Forum. The Conference featured keynotes by Ellen Voorhees and Yiannis Kompatsiaris, and presentations of peer-reviewed research papers covering a wide range of topics, in addition to many posters. The Evaluation Forum consisted of twelve Labs: ARQMath, BioASQ, CheckThat!, ChEMU, CLEF eHealth, eRisk, HIPE, ImageCLEF, LifeCLEF, LiLAS, PAN, and Touché, addressing a wide range of tasks, media, languages, and ways to go beyond standard test collections.
Chapter
The paper gives a brief overview of three shared tasks which have been organized at the PAN 2022 lab on digital text forensics and stylometry hosted at the CLEF 2022 conference. The tasks include authorship verification across discourse types, multi-author writing style analysis and author profiling. Some of the tasks continue and advance past editions (authorship verification and multi-author analysis) and some are new (profiling irony and stereotypes spreaders). The general goal of the PAN shared tasks is to advance the state of the art in text forensics and stylometry while ensuring objective evaluation on newly developed benchmark datasets.
Chapter
The paper gives a brief overview of the four shared tasks to be organized at the PAN 2022 lab on digital text forensics and stylometry hosted at the CLEF 2022 conference. The tasks include authorship verification across discourse types, multi-author writing style analysis, author profiling, and content profiling. Some of the tasks continue and advance past editions (authorship verification and multi-author analysis) and some are new (profiling irony and stereotypes spreaders and trigger detection). The general goal of the PAN shared tasks is to advance the state of the art in text forensics and stylometry while ensuring objective evaluation on newly developed benchmark datasets.
Article
In general, people are usually more reluctant to follow advice and directions from politicians who do not share their ideology. In extreme cases, people can be heavily biased in favour of a political party while being in sharp disagreement with others, which may lead to irrational decision making and can put people's lives at risk by causing them to ignore certain recommendations from the authorities. Therefore, considering political ideology as a psychographic trait can improve political micro-targeting by helping public authorities and local governments to adopt better communication policies during crises. In this work, we explore the reliability of determining psychographic traits concerning political ideology. Our contribution is twofold. On the one hand, we release PoliCorpus-2020, a dataset composed of tweets posted by Spanish politicians in 2020. On the other hand, we conduct two authorship analysis tasks with this dataset: an author profiling task to extract demographic and psychographic traits, and an authorship attribution task to determine the author of an anonymous text in the political domain. Both experiments are evaluated with several neural network architectures grounded on explainable linguistic features, statistical features, and state-of-the-art transformers. In addition, we test whether the neural network models can be transferred to detect the political ideology of citizens. Our results indicate that the linguistic features are good indicators for identifying fine-grained political affiliation, that they boost the performance of neural network models when combined with embedding-based features, and that they preserve relevant information when the models are tested on ordinary citizens. Besides, we found that lexical and morphosyntactic features are more effective in author profiling, whereas stylometric features are more effective in authorship attribution.
Article
Full-text available
Authorship verification (AV) is one of the main problems of authorship analysis and digital text forensics. The classical AV problem is to decide whether or not a particular author wrote the document in question. However, if the author's known document is a single, relatively short text, the verification problem becomes more difficult than classical AV and needs a generalised solution. To decide the AV of two given unlabeled documents (2D-AV), we proposed a system that provides an author-independent solution with the help of a Binary Background Model (BBM). The BBM is a supervised model that provides an informative background for distinguishing document pairs written by the same or different authors. To evaluate a document pair in a single representation, we also proposed a new, simple, and efficient document combination method based on the geometric mean of the stylometric features. We tested the performance of the proposed system for both author-dependent and author-independent AV cases. In addition, we introduced a new, well-defined, manually labelled Turkish blog corpus to be used in subsequent studies of authorship analysis. Using a publicly available English blog corpus for generating the BBM, the proposed system demonstrated an accuracy of over 90% on test sets of both trained and unseen authors. Furthermore, the proposed combination method and the system using the BBM with the English blog corpus were also evaluated on other genres, which were used in the international PAN AV competitions, and achieved promising results.
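The pair-combination idea, representing two documents by the element-wise geometric mean of their stylometric feature vectors, can be sketched as follows (the feature names and values are hypothetical; the method assumes non-negative features):

```python
# Combine a document pair into one representation via the element-wise
# geometric mean of the two (non-negative) stylometric feature vectors.
import math

def combine_pair(features_a, features_b):
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have equal length")
    return [math.sqrt(x * y) for x, y in zip(features_a, features_b)]

# Hypothetical stylometric features (e.g. avg. word length, type-token ratio):
doc_a = [4.1, 0.52]
doc_b = [4.9, 0.48]
pair_vector = combine_pair(doc_a, doc_b)
```

The result is a single vector per pair, so a standard binary classifier can be trained directly on same-author vs. different-author pairs.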
Conference Paper
Full-text available
Celebrities are among the most prolific users of social media, promoting their personas and rallying followers. This activity is closely tied to genuine writing samples, which makes them worthy research subjects in many respects, not least profiling. With this paper we introduce the Webis Celebrity Corpus 2019. For its construction the Twitter feeds of 71,706 verified accounts have been carefully linked with their respective Wikidata items, crawling both. After cleansing, the resulting profiles contain an average of 29,968 words per profile and up to 239 pieces of personal information. A cross-evaluation that checked the correct association of Twitter account and Wikidata item revealed an error rate of only 0.6%, rendering the profiles highly reliable. Our corpus comprises a wide cross-section of local and global celebrities, forming a unique combination of scale, profile comprehensiveness, and label reliability. We further establish the state of the art's profiling performance by evaluating the winning approaches submitted to the PAN gender prediction tasks in a transfer learning experiment. They are only outperformed by our own deep learning approach, which we also use to exemplify celebrity occupation prediction for the first time.
Conference Paper
Full-text available
The spread of false information on the Web is one of the main problems of our society. Automatic detection of fake news posts is a hard task, since they are intentionally written to mislead readers and to trigger intense emotions in them in an attempt to be disseminated in social networks. Even though recent studies have explored different linguistic patterns of false claims, the role of emotional signals has not yet been explored. In this paper, we study the role of emotional signals in fake news detection. In particular, we propose an LSTM model that incorporates emotional signals extracted from the text of claims to differentiate between credible and non-credible ones. Experiments on real-world datasets show the importance of emotional signals for credibility assessment.
Conference Paper
Full-text available
In the fight against fake news, many fact-checking systems, comprising human-based fact-checking sites (e.g., snopes.com and politifact.com) and automatic detection systems, have been developed in recent years. However, online users still keep sharing fake news even after it has been debunked. This means that early fake news detection may be insufficient and we need a complementary approach to mitigate the spread of misinformation. In this paper, we introduce a novel application of text generation for combating fake news. In particular, we (1) leverage online users called fact-checkers, who cite fact-checking sites as credible evidence to fact-check information in public discourse; (2) analyze the linguistic characteristics of fact-checking tweets; and (3) propose and build a deep learning framework to generate responses with fact-checking intention to increase the fact-checkers' engagement in fact-checking activities. Our analysis reveals that fact-checkers tend to refute misinformation and use formal language (e.g., few swear words and little Internet slang). Our framework successfully generates relevant responses and outperforms competing models by achieving up to 30% improvement. Our qualitative study also confirms the superiority of our generated responses compared with those generated by existing models.
Conference Paper
Full-text available
Consuming news from social media is becoming increasingly popular nowadays. Social media brings benefits to users due to its inherent nature of fast dissemination, low cost, and easy access. However, the quality of news is considered lower than that of traditional news outlets, resulting in large amounts of fake news. Detecting fake news is therefore very important and is attracting increasing attention due to its detrimental effects on individuals and society. The performance of detecting fake news from content alone is generally not satisfactory, and it has been suggested to incorporate user social engagements as auxiliary information to improve fake news detection. This necessitates an in-depth understanding of the correlation between user profiles on social media and fake news. In this paper, we construct real-world datasets measuring users' trust levels in fake news and select representative groups of both "experienced" users, who are able to recognize fake news items as false, and "naïve" users, who are more likely to believe fake news. We perform a comparative analysis of explicit and implicit profile features between these user groups, which reveals their potential to differentiate fake news. The findings of this paper lay the foundation for future automatic fake news detection research.
Conference Paper
Full-text available
There are several tasks where it is preferable not to respond than to respond incorrectly. This idea is not new, but despite several previous attempts there is no commonly accepted measure to assess non-response. We study an extension of the accuracy measure with this feature and a very easy-to-understand interpretation. The proposed measure (c@1) has a good balance of discrimination power, stability, and sensitivity. We also show how this measure rewards systems that maintain the same number of correct answers while decreasing the number of incorrect ones by leaving some questions unanswered. The measure is well suited for tasks such as Reading Comprehension tests, where multiple choices per question are given but only one is correct.
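The measure can be stated compactly: an unanswered question earns partial credit equal to the system's observed accuracy over the whole set, i.e. c@1 = (n_C + n_U · n_C / n) / n, where n_C is the number of correct answers, n_U the number of unanswered questions, and n the total:

```python
# c@1: accuracy extended with non-response. An unanswered question is credited
# at the rate n_C / n instead of being counted as wrong, so abstaining on a
# likely-wrong answer can only help, never hurt, the score.
def c_at_1(n_correct, n_unanswered, n_total):
    if n_total == 0:
        return 0.0
    return (n_correct + n_unanswered * (n_correct / n_total)) / n_total
```

For example, keeping 50 of 100 answers correct while leaving 20 unanswered raises the score from plain accuracy 0.50 to 0.60, because no credit is lost on the withdrawn incorrect answers.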
Article
Full-text available
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.
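A minimal example of the module's uniform estimator API (the `fit`/`predict` pattern); the toy data and the choice of `LogisticRegression` are ours, purely for illustration:

```python
# Minimal scikit-learn usage: every estimator exposes fit(X, y) for training
# and predict(X) for inference, so algorithms are largely interchangeable.
from sklearn.linear_model import LogisticRegression

X = [[0.0], [1.0], [2.0], [3.0]]   # one numeric feature per sample
y = [0, 0, 1, 1]                   # binary class labels

clf = LogisticRegression()
clf.fit(X, y)
pred = clf.predict([[0.1], [2.9]])
```

Because the interface is shared, swapping in, say, a support vector machine or a decision tree only changes the constructor line, which is part of what the abstract means by API consistency.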
Chapter
Users play a critical role in the creation and propagation of fake news online by consuming and sharing articles with inaccurate information, either intentionally or unintentionally. Fake news is written in a way that confuses readers, and understanding which articles contain fabricated information is therefore very challenging for non-experts. Given the difficulty of the task, several fact-checking websites have been developed to raise awareness about which articles contain fabricated information. As a result of those platforms, several users are interested in sharing posts that cite evidence with the aim of refuting fake news and warning other users. These users are known as fact checkers. However, there are also users who tend to share false information, who can be characterised as potential fake news spreaders. In this paper, we propose the CheckerOrSpreader model, which can classify a user as a potential fact checker or a potential fake news spreader. Our model is based on a Convolutional Neural Network (CNN) and combines word embeddings with features that represent users' personality traits and the linguistic patterns used in their tweets. Experimental results show that leveraging linguistic patterns and personality traits can improve the performance in differentiating between checkers and spreaders.
Article
Fake news is risky, since it is created to manipulate readers' opinions and beliefs. In this work, we compared the language of false news to that of real news from an emotional perspective, considering a set of false information types (propaganda, hoax, clickbait, and satire) from social media and online news article sources. Our experiments showed that each type of false information has its own emotional pattern, and that emotions play a key role in deceiving the reader. Based on this, we proposed an LSTM neural network model that is emotionally infused to detect false news.
Chapter
The paper gives a brief overview of the four shared tasks that are to be organized at the PAN 2020 lab on digital text forensics and stylometry, hosted at the CLEF conference. The tasks include author profiling, celebrity profiling, cross-domain author verification, and style change detection, seeking to advance the state of the art and to evaluate it on new benchmark datasets.
Kestemont, M., Stamatatos, E., Manjavacas, E., Daelemans, W., Potthast, M., Stein, B.: Overview of the Cross-domain Authorship Attribution Task at PAN 2019. Working Notes Papers of the CLEF 2019 Evaluation Labs. CEUR Workshop Proceedings. 2019.
Kestemont, M., Tschuggnall, M., Stamatatos, E., Daelemans, W., Specht, G., Stein, B., Potthast, M.: Overview of the Author Identification Task at PAN-2018: Cross-domain Authorship Attribution and Style Change Detection. Working Notes Papers of the CLEF 2018 Evaluation Labs. CEUR Workshop Proceedings. 2018.
Wiegmann, M., Stein, B., Potthast, M.: Overview of the Celebrity Profiling Task at PAN 2019. CLEF 2019 Labs and Workshops, Notebook Papers. 2019.
Zangerle, E., Mayerl, M., Specht, G., Potthast, M., Stein, B.: Overview of the Style Change Detection Task at PAN 2020. CLEF 2020 Labs and Workshops, Notebook Papers. 2020.
Zangerle, E., Tschuggnall, M., Specht, G., Potthast, M., Stein, B.: Overview of the Style Change Detection Task at PAN 2019. CLEF 2019 Labs and Workshops, Notebook Papers. 2019.
Wiegmann, M., Potthast, M., Stein, B.: Overview of the Celebrity Profiling Task at PAN 2020. CLEF 2020 Labs and Workshops, Notebook Papers. 2020.
Rangel, F., Giachanou, A., Ghanem, B., Rosso, P.: Overview of the 8th Author Profiling Task at PAN 2020: Profiling Fake News Spreaders on Twitter. CLEF 2020 Labs and Workshops, Notebook Papers. 2020.