
Savvas Zannettou, Assistant Professor at Delft University of Technology
115 Publications
144,627 Reads
4,599 Citations
Publications (115)
To facilitate accountability and transparency, the Digital Services Act (DSA) sets up a process through which Very Large Online Platforms (VLOPs) need to grant vetted researchers access to their internal data (Article 40(4)). Operationalising such access is challenging for at least two reasons. First, data access is only available for research on s...
Online platforms have enacted various policies to maintain a safe and trustworthy advertising environment. However, the extent to which these policies are adhered to and enforced remains a subject of interest and concern. In this work, we present a large-scale audit of adult advertising on Twitter (now X), specifically focusing on compliance with i...
The comprehensibility and reliability of data download packages (DDPs) provided under the General Data Protection Regulation's (GDPR) right of access are vital for both individuals and researchers. These DDPs enable users to understand and control their personal data, yet issues like complexity and incomplete information often limit their utility....
Large Language Models (LLMs) have raised increasing concerns about their misuse in generating hate speech. Among all the efforts to address this issue, hate speech detectors play a crucial role. However, the effectiveness of different detectors against LLM-generated hate speech remains largely unknown. In this paper, we propose HateBench, a framewo...
Language is a dynamic aspect of our culture that changes when expressed in different technologies and/or communities. On the Internet, social networks have enabled the diffusion and evolution of different dialects, including African American English (AAE). However, this increased usage of different dialects is not without barriers. One particular b...
Recently, autonomous agents built on large language models (LLMs) have experienced significant development and are being deployed in real-world applications. These agents can extend the base LLM's capabilities in multiple ways. For example, a well-built agent using GPT-3.5-Turbo as its core can outperform the more advanced GPT-4 model by leveraging...
The Covid-19 pandemic brought about an unprecedented cycle of digitally spread humor. This article analyzes a corpus of 12,337 humor items from 80+ countries, mainly in visual format, and mostly memes, collected during the first half of 2020, to understand the features and intended audiences of this “pandemic humor”. Employing visual machine-learni...
WhatsApp provides a fertile ground for the large-scale dissemination of information, particularly in countries like Brazil and India. Given its increasing popularity and use for political discussions, it is paramount to ensure that WhatsApp groups are adequately protected from attackers who aim to disrupt the activity of WhatsApp groups. Motivated...
Online messaging platforms are key communication tools but are vulnerable to fake news and conspiracy theories. Mainstream platforms such as Facebook are increasing content moderation of harmful and conspiratorial content. In response, users from fringe communities are migrating to alternative platforms like Telegram. These platforms offer more fre...
WhatsApp has evolved into a popular communication tool, facilitating the exchange of billions of multimedia messages globally. With its large public groups and forwarding features, the platform has enabled messages to go viral, rapidly disseminating across the WhatsApp network. This has also brought WhatsApp to a central position in spreading misin...
During the first days of the 2022 Russian invasion of Ukraine, Russia's media regulator blocked access to many global social media platforms and news sites, including Twitter, Facebook, and the BBC. To bypass the information controls set by Russian authorities, pro-Ukrainian groups explored unconventional ways to reach out to the Russian population...
Incel communities have recently attracted the public's interest mainly due to their high degree of extreme views and involvement in real-world violence. A common theme in Incel communities is self-harm discussions. Despite this, beyond small-scale qualitative analyses of self-harm discussions in Incel communities, we lack a large-scale quantitative...
In recent years, user feeds on social media platforms have shifted from simple, chronologically ordered content posted by their network connections (i.e., friends) to opaque, algorithmically ordered and curated content. This shift has led to regulations that require platforms to offer end users greater transparency and control over their algorithmi...
Online platforms have enacted various policies to maintain a safe and trustworthy advertising environment. However, the extent to which these policies are adhered to and enforced remains a subject of interest and concern. In this work, we present a large-scale audit of adult advertising on Twitter, specifically focusing on compliance with its adult...
The spread of toxic content online is an important problem that has adverse effects on user experience online and in our society at large. Motivated by the importance and impact of the problem, research focuses on developing solutions to detect toxic content, usually leveraging machine learning (ML) models trained on human-annotated datasets. While...
With large social media platforms coming under increasing pressure to deplatform far-right users, the Alternative Technology movement (Alt-Tech) emerged as a new digital support infrastructure for the far right. We conduct a qualitative analysis of the prominent Alt-Tech platform Gab, a social networking service primarily modelled on Twitter, to as...
During the COVID-19 pandemic, health-related misinformation and harmful content shared online had a significant adverse effect on society. In an attempt to mitigate this adverse effect, mainstream social media platforms like Facebook, Twitter, and TikTok employed soft moderation interventions (i.e., warning labels) on potentially harmful posts. Suc...
The spread of hate speech and hateful imagery on the Web is a significant problem that needs to be mitigated to improve our Web experience. This work contributes to research efforts to detect and understand hateful content on the Web by undertaking a multimodal analysis of Antisemitism and Islamophobia on 4chan’s /pol/ using OpenAI’s CLIP. This lar...
Previous research has documented the existence of both online echo chambers and hostile intergroup interactions. In this paper, we explore the relationship between these two phenomena by studying the activity of 5.97M Reddit users and 421M comments posted over 13 years. We examine whether users who are more engaged in echo chambers are more hostile...
State-of-the-art Text-to-Image models like Stable Diffusion and DALL·E 2 are revolutionizing how people generate visual content. At the same time, society has serious concerns about how adversaries can exploit such models to generate unsafe images. In this work, we focus on demystifying the generation of unsafe images and hateful memes from Te...
Researchers use information about the amount of time people spend on digital media for a variety of purposes including to understand impacts on physical and mental health as well as attention and learning. To measure time spent on digital media, participants' self-estimation is a common alternative method if the platform does not allow external acc...
TikTok is a relatively novel and widely popular media platform. In response to its expanding user base and cultural impact, researchers are turning to study the platform; however, TikTok, like many social media platforms, restricts external access to data. Prior works have acquired data from scraping the platform, user self-reports, and from accoun...
The dissemination of hateful memes online has adverse effects on social media platforms and the real world. Detecting hateful memes is challenging, one of the reasons being the evolutionary nature of memes; new hateful memes can emerge by fusing hateful connotations with other cultural ideas or symbols. In this paper, we propose a framework that le...
To curb the problem of false information, social media platforms like Twitter started adding warning labels to content discussing debunked narratives, with the goal of providing more context to their audiences. Unfortunately, these labels are not applied uniformly and leave large amounts of false content unmoderated. This paper presents LAMBRETTA,...
Chatbots are used in many applications, e.g., automated agents, smart home assistants, interactive characters in online games, etc. Therefore, it is crucial to ensure they do not behave in undesired manners, providing offensive or toxic responses to users. This is not a trivial task as state-of-the-art chatbot models are trained on large, public da...
Sinophobia, anti-Chinese sentiment, has existed on the Web for a long time. The outbreak of COVID-19 and the extended quarantine has further amplified it. However, we lack a quantitative understanding of the cause of Sinophobia as well as how it evolves over time. In this paper, we conduct a large-scale longitudinal measurement of Sinophobia, betwee...
The QAnon conspiracy theory claims that a cabal of (literally) blood-thirsty politicians and media personalities are engaged in a war to destroy society. By interpreting cryptic “drops” of information from an anonymous insider calling themself Q, adherents of the conspiracy theory believe that Donald Trump is leading them in an active fight against...
The role played by YouTube's recommendation algorithm in unwittingly promoting misinformation and conspiracy theories is not entirely understood. Yet, this can have dire real-world consequences, especially when pseudoscientific content is promoted to users at critical times, such as the COVID-19 pandemic. In this paper, we set out to characterize a...
Growing evidence points to recurring influence campaigns on social media, often sponsored by state actors aiming to manipulate public opinion on sensitive political topics. Typically, campaigns are performed through instrumented accounts, known as troll accounts; despite their prominence, however, little work has been done to detect these accounts...
The articles in this special issue focus on the emerging effects that social media can have on the real world. Social media has quickly become not just ubiquitous, but also integral to society. A large portion of social media's quick ascent was due to its modeling of real-world relationships, meaning the offline world informed the development and a...
Internet memes have become a dominant method of communication; at the same time, however, they are also increasingly being used to advocate extremism and foster derogatory beliefs. Nonetheless, we do not have a firm understanding as to which perceptual aspects of memes cause this phenomenon. In this work, we assess the efficacy of current state-of-...
During the COVID-19 pandemic, health-related misinformation and harmful content shared online had a significant adverse effect on society. To mitigate this adverse effect, mainstream social media platforms employed soft moderation interventions (i.e., warning labels) on potentially harmful posts. Despite the recent popularity of these moderation in...
The spread of hate speech and hateful imagery on the Web is a significant problem that needs to be mitigated to improve our Web experience. This work contributes to research efforts to detect and understand hateful content on the Web by undertaking a multimodal analysis of Antisemitism and Islamophobia on 4chan's /pol/ using OpenAI's CLIP. This lar...
This paper presents a multi-platform computational pipeline geared to identify social media posts discussing (known) conspiracy theories. We use 189 conspiracy claims collected by Snopes, and find 66k posts and 277k comments on Reddit, and 379k tweets discussing them. Then, we study how conspiracies are discussed on different Web communities and wh...
YouTube is by far the largest host of user-generated video content worldwide. Alas, the platform has also come under fire for hosting inappropriate, toxic, and hateful content. One community that has often been linked to sharing and publishing hateful and misogynistic content is the Involuntary Celibates (Incels), a loosely defined movement ostens...
When toxic online communities on mainstream platforms face moderation measures, such as bans, they may migrate to other platforms with laxer policies or set up their own dedicated websites. Previous work suggests that within mainstream platforms, community-level moderation is effective in mitigating the harm caused by the moderated communities. It...
Recent research suggests that not all fact checking efforts are equal: when and what is fact checked plays a pivotal role in effectively correcting misconceptions. In this paper, we propose a framework to study fact checking efforts using Google Trends, a signal that captures search interest over topics on the world's largest search engine. Our fra...
This paper presents the first data-driven analysis of Gettr, a new social network platform launched by former US President Donald Trump's team. Among other things, we find that users on the platform heavily discuss politics, with a focus on the Trump campaign in the US and Bolsonaro's in Brazil. Activity on the platform has steadily been decreasing...
Aiming to enhance the safety of their users, social media platforms enforce terms of service by performing active moderation, including removing content or suspending users. Nevertheless, we do not have a clear understanding of how effective it is, ultimately, to suspend users who engage in toxic behavior, as that might actually draw users to alter...
QAnon is a far-right conspiracy theory that became popular and mainstream over the past few years. Worryingly, the QAnon conspiracy theory has implications in the real world, with supporters of the theory participating in real-world violent acts like the US Capitol attack in 2021. At the same time, the QAnon theory started evolving into a global ph...
We present a large-scale characterization of the Manosphere, a conglomerate of Web-based misogynist movements focused on men's issues, which has prospered online. Analyzing 28.8M posts from 6 forums and 51 subreddits, we paint a comprehensive picture of its evolution across the Web, showing the links between its different communities over the years...
Parler is an "alternative" social network promoting itself as a service that allows users to "speak freely and express yourself openly, without fear of being deplatformed for your views." Because of this promise, the platform became popular among users who were suspended on mainstream social networks for violating their terms of service, as well a...
YouTube is by far the largest host of user-generated video content worldwide. Alas, the platform also hosts inappropriate, toxic, and hateful content. One community that has often been linked to sharing and publishing hateful and misogynistic content is the so-called Involuntary Celibates (Incels), a loosely defined movement ostensibly focusing on...
Online fringe communities offer fertile grounds to users seeking and sharing ideas fueling suspicion of mainstream news and conspiracy theories. Among these, the QAnon conspiracy theory emerged in 2017 on 4chan, broadly supporting the idea that powerful politicians, aristocrats, and celebrities are closely engaged in a global pedophile ring. Simult...
Despite the increasingly important role played by image memes, we do not yet have a solid understanding of the elements that might make a meme go viral on social media. In this paper, we investigate what visual elements distinguish image memes that are highly viral on social media from those that do not get re-shared, across three dimensions: compo...
The news ecosystem has become increasingly complex, encompassing a wide range of sources with varying levels of trustworthiness, and with public commentary giving different spins to the same stories. In this paper, we present a multi-platform measurement of this ecosystem. We compile a list of 1,073 news websites and extract posts from four Web com...
The QAnon conspiracy theory claims that a cabal of (literally) bloodthirsty politicians and media personalities are engaged in a war to destroy society. By interpreting cryptic "drops" of information from an anonymous insider calling themselves Q, adherents of the conspiracy theory believe that they are being led by Donald Trump in an active fight...
Over the past few years, there has been heated debate and serious public concern regarding online content moderation, censorship, and the basic principle of free speech on the Web. To ease some of these concerns, mainstream social media platforms like Twitter and Facebook refined their content moderation systems to support soft moderation intervention...
Parler is an alternative social network promoting itself as a service that allows its users to "Speak freely and express yourself openly, without fear of being deplatformed for your views." Because of this promise, the platform became popular among users who were suspended on mainstream social networks for violating their terms of service, as we...
Despite the influence that image-based communication has on online discourse, the role played by images in disinformation is still not well understood. In this paper, we present the first large-scale study of fauxtography, analyzing the use of manipulated or misleading images in news discussion on online communities. First, we develop a computation...
Online messaging platforms such as WhatsApp, Telegram, and Discord, each with hundreds of millions of users, are one of the dominant modes of communicating or interacting with one another. Despite the widespread use of public group chats, there exists no systematic or detailed characterization of these group chats. There is, more importantly, lack...
When toxic online communities on mainstream platforms face moderation measures, such as bans, they may migrate to other platforms with laxer policies or set up their own dedicated website. Previous work suggests that, within mainstream platforms, community-level moderation is effective in mitigating the harm caused by the moderated communities. It...
Online fringe communities offer fertile grounds for users to seek and share paranoid ideas fueling suspicion of mainstream news, and outright conspiracy theories. Among these, the QAnon conspiracy theory emerged in 2017 on 4chan, broadly supporting the idea that powerful politicians, aristocrats, and celebrities are closely engaged in a global...
Recent progress in genomics has enabled the emergence of a flourishing market for direct-to-consumer (DTC) genetic testing. Companies like 23andMe and AncestryDNA provide affordable health, genealogy, and ancestry reports, and have already tested tens of millions of customers. Consequently, news, experiences, and views on genetic testing are increa...
This paper presents a dataset with over 3.3M threads and 134.5M posts from the Politically Incorrect board (/pol/) of the imageboard forum 4chan, posted over a period of almost 3.5 years (June 2016-November 2019). To the best of our knowledge, this represents the largest publicly available 4chan dataset, providing the community with an archive of p...
Progress in genomics has enabled the emergence of a booming market for “direct-to-consumer” genetic testing. Nowadays, companies like 23andMe and AncestryDNA provide affordable health, genealogy, and ancestry reports, and have already tested tens of millions of customers. At the same time, alt- and far-right groups have also taken an interest in ge...
State-sponsored organizations are increasingly linked to efforts aimed to exploit social media for information warfare and manipulating public opinion. Typically, their activities rely on a number of social network accounts they control, aka trolls, that post and interact with other users disguised as “regular” users. These accounts often use image...
Messaging platforms, especially those with a mobile focus, have become increasingly ubiquitous in society. These mobile messaging platforms can have deceivingly large user bases, and in addition to being a way for people to stay in touch, are often used to organize social movements, as well as a place for extremists to congregate. In this paper, we...
A large number of the most-subscribed YouTube channels target children of very young age. Hundreds of toddler-oriented channels on YouTube feature inoffensive, well produced, and educational videos. Unfortunately, inappropriate content that targets this demographic is also common. YouTube's algorithmic recommendation system regrettably suggests ina...
A new wave of growing antisemitism, driven by fringe Web communities, is an increasingly worrying presence in the socio-political realm. The ubiquitous and global nature of the Web has provided tools used by these groups to spread their ideology to the rest of the Internet. Although the study of antisemitism and hate is not new, the scale and rate...
Social media data has become crucial to the advancement of scientific understanding. However, even though it has become ubiquitous, just collecting large-scale social media data requires a high degree of engineering skill and computational resources. In fact, research is oftentimes gated by data engineering problems that must be overcome befor...
The Web has become the main source for news acquisition. At the same time, news discussion has become more social: users can post comments on news articles or discuss news articles on other platforms like Reddit. These features empower and enable discussions among the users; however, they also act as the medium for the dissemination of toxic discou...
The outbreak of the COVID-19 pandemic has changed our lives in unprecedented ways. In the face of the projected catastrophic consequences, many countries have enacted social distancing measures in an attempt to limit the spread of the virus. Under these conditions, the Web has become an indispensable medium for information acquisition, communicatio...
Messaging platforms, especially those with a mobile focus, have become increasingly ubiquitous in society. These mobile messaging platforms can have deceivingly large user bases, and in addition to being a way for people to stay in touch, are often used to organize social movements, as well as a place for extremists and other ne'er-do-wells to congr...
In this paper, we present a large-scale characterization of the Manosphere, a conglomerate of Web-based misogynist movements roughly focused on "men's issues," which has seen significant growth over the past years. We do so by gathering and analyzing 28.8M posts from 6 forums and 51 subreddits. Overall, we paint a comprehensive picture of the evolu...
Current authentication methods on the Web have serious weaknesses. First, services heavily rely on the traditional password paradigm, which diminishes the end-users' security and usability. Second, the lack of attribute-based authentication does not allow anonymity-preserving access to services. Third, users have multiple online accounts that often...
The Web consists of numerous Web communities, news sources, and services, which are often exploited by various entities for the dissemination of false information. Yet, we lack tools and techniques to effectively track the propagation of information across the multiple diverse communities, and to model the interplay and influence between them. Also...
Recent evidence has emerged linking coordinated campaigns by state-sponsored actors to manipulate public opinion on the Web. Campaigns revolving around major political events are enacted via mission-focused "trolls." While trolls are involved in spreading disinformation on social media, there is little understanding of how they operate, what type o...
Over the past couple of years, anecdotal evidence has emerged linking coordinated campaigns by state-sponsored actors with efforts to manipulate public opinion on the Web, often around major political events, through dedicated accounts, or “trolls.” Although they are often involved in spreading disinformation on social media, there is little unders...
Rapid progress in genomics has enabled a thriving market for "direct-to-consumer" genetic testing, whereby people have access to their genetic information without the involvement of a healthcare provider. Companies like 23andMe and AncestryDNA, which provide affordable health, genealogy, and ancestry reports, have already tested tens of millions...
A large number of the most-subscribed YouTube channels target children of very young age. Hundreds of toddler-oriented channels on YouTube feature inoffensive, well produced, and educational videos. Unfortunately, inappropriate content that targets this demographic is also common. YouTube’s algorithmic recommendation system regrettably suggests ina...
Anecdotal evidence has emerged suggesting that state-sponsored organizations, like the Russian Internet Research Agency, have exploited mainstream social media. Their primary goal is apparently to conduct information warfare operations to manipulate public opinion using accounts disguised as "normal" people. To increase engagement and credibility of thei...
Current authentication methods on the Web have serious weaknesses. First, services heavily rely on the traditional password paradigm, which diminishes the end-users' security and usability. Second, the lack of attribute-based authentication does not allow anonymity-preserving access to services. Third, users have multiple online accounts that ofte...
Over the past few years, extensive anecdotal evidence emerged that suggests the involvement of state-sponsored actors (or "trolls") in online political campaigns with the goal to manipulate public opinion and sow discord. Recently, Twitter and Reddit released ground truth data about Russian and Iranian state-sponsored actors that were active on the...
Internet memes are increasingly used to sway and manipulate public opinion. This prompts the need to study their propagation, evolution, and influence across the Web. In this paper, we detect and measure the propagation of memes across multiple Web communities, using a processing pipeline based on perceptual hashing and clustering techniques, and a...