arXiv:1801.06863v2 [cs.CY] 28 Jul 2018
Unpacking the Social Media Bot: A Typology
to Guide Research and Policy
Robert Gorwa and Douglas Guilbeault
Amidst widespread reports of digital influence operations during
major elections, policymakers, scholars, and journalists have become
increasingly interested in the political impact of social media ‘bots.’
Most recently, platform companies like Facebook and Twitter have
been summoned to testify about bots as part of investigations into
digitally-enabled foreign manipulation during the 2016 US Presiden-
tial election. Facing mounting pressure from both the public and from
legislators, these companies have been instructed to crack down on
apparently malicious bot accounts. But as this article demonstrates,
since the earliest writings on bots in the 1990s, there has been sub-
stantial confusion as to exactly what a ‘bot’ is and what exactly a bot
does. We argue that multiple forms of ambiguity are responsible for
much of the complexity underlying contemporary bot-related policy,
and that before successful policy interventions can be formulated, a
more comprehensive understanding of bots — especially how they are
defined and measured — will be needed. In this article, we provide
a history and typology of different types of bots, provide clear guide-
lines to better categorize political automation and unpack the impact
that it can have on contemporary technology policy, and outline the
main challenges and ambiguities that will face both researchers and
legislators concerned with bots in the future.
Department of Politics and International Relations, University of Oxford. @rgorwa
Annenberg School for Communication, University of Pennsylvania. @dzguilbeault
Policy & Internet, Fall 2018. This is a pre-publication version: please refer to final for
page numbers/references. A draft of this paper was presented at ICA 2018, Prague (CZ).
1 Introduction
The same technologies that once promised to enhance democracy are now
increasingly accused of undermining it. Social media services like Facebook
and Twitter—once presented as liberation technologies predicated on global
community and the open exchange of ideas—have recently proven themselves
especially susceptible to various forms of political manipulation (Tucker et
al. 2017). One of the leading mechanisms of this manipulation is the social
media “bot,” which has become a nexus for some of the most pressing issues
around algorithms, automation, and Internet policy (Woolley and Howard
2016). In 2016 alone, researchers documented how social media bots were
used in the French elections to spread misinformation through the concerted
MacronLeaks campaign (Ferrara 2017), to push hyper-partisan news dur-
ing the Brexit referendum (Bastos and Mercea 2017), and to affect political
conversation in the lead up to the 2016 US Presidential election (Bessi and
Ferrara 2016). Recently, representatives from Facebook and Twitter were
summoned to testify before Congress as part of investigations into digitally
enabled foreign manipulation during the 2016 US Presidential election, and
leading international newspapers have extensively covered the now-widely
accepted threat posed by malicious bot accounts trying to covertly influence
political processes around the world. Since then, a number of speculative
solutions have been proposed for the so-called bot problem, many of which
rely on tenuous technical capacities at best, while others threaten to
significantly alter the rules governing online speech and, at worst, to
embolden censorship on behalf of authoritarian and hybrid regimes. While
the issues that we discuss in this article are complex, it
has become clear that the technology policy decisions made by social media
platforms as they pertain to automation, as in other areas (Gillespie 2015),
can have a resounding impact on elections and politics at both the domestic
and international level.
It is therefore no surprise that various actors, including governments,
corporations, and citizens, are increasingly interested in influencing bot policy.
However, it appears that these stakeholders often continue to talk past each
other, largely due to a lack of basic conceptual clarity. What exactly are bots?
What do they do? Why do different academic communities understand bots
quite differently? The goal of this article is to unpack some of these questions,
and to discuss the key challenges faced by researchers and legislators when
it comes to bot detection, research, and eventually, policy.
1.1 An Overview of Ambiguities
Reading about bots requires one to familiarize oneself with an incredible
breadth of terminology, often used seemingly interchangeably by academics,
journalists, and policymakers. These different terms include: robots, bots,
chatbots, spam bots, social bots, political bots, botnets, sybils, and cyborgs,
which are often used without precision to refer to everything from auto-
mated social media accounts, to recommender systems and web scrapers.
Equally important to these discussions are terms like trolling, sockpuppets,
troll farms, and astroturfing (Woolley 2016). According to some scholars,
bots are responsible for significant proportions of online activity, are used
to game algorithms and recommender systems (Yao et al. 2017), can stifle
(Ferrara et al. 2016) or encourage (Savage, Monroy-Hernandez, and Hollerer
2015) political speech, and can play an important role in the circulation of
hyperpartisan “fake news” (Shao et al. 2017). Bots have become a fact
of life, and to state that bots manipulate voters online is now accepted as
uncontroversial. But what exactly are bots?
Although it is now a commonly used term, the etymology of “bot” is com-
plicated and ambiguous. During the early days of personal computing, the
term was employed to refer to a variety of different software systems, such as
daemons and scripts that communicated warning messages to human users
(Leonard 1997). Other types of software, such as the early programs that
deployed procedural writing to converse with a human user, were eventu-
ally referred to as “bots” or “chatbots.” In the 2000s, “bot” developed an
entirely new series of associations in the network and information security
literatures, where it was used to refer to computers compromised, co-opted,
and remotely controlled by malware (Yang et al. 2014). These devices can
be linked in a network (a “botnet”) and used to carry out distributed de-
nial of service (DDoS) attacks (Moore and Anderson 2012). Once Twitter
emerged as a major social network (and major home for automated accounts),
some researchers began calling these automated accounts “bots,” while oth-
ers, particularly computer scientists associated with the information security
community, preferred the term “sybil”—a computer security term that refers
to compromised actors or nodes within a network (Alvisi et al. 2013; Ferrara
et al. 2016).
This cross-talk would not present such a pressing problem were it not for
the policymakers and pundits currently calling for platform companies to
prevent foreign manipulation of social networks and to enact more stringent
bot policy (Glaser 2017). Researchers hoping to contribute to these policy
discussions have been hindered by a clear lack of conceptual clarity, akin
to the phenomenon known by social scientists as concept misformation or
category ambiguity (Sartori 1970). As Lazarsfeld and Barton (1957) once
argued, before we can investigate the presence or absence of some concept,
we need to know precisely what that concept is. In other words, we need to
better understand bots before we can really research and write about them.
In this article, we begin by outlining a typology of bots, covering early
uses of the term in the pre-World Wide Web era up to the recent increase
in bot-related scholarship. Through this typology, we then go on to demon-
strate three major sources of ambiguity in defining bots: (1) structure, which
concerns the substance, design, and operation of the “bot” system, as well
as whether these systems are algorithmically or human-based; (2) function,
which concerns how the “bot” system operates over social media, for example,
as a data scraper or an account emulating a human user and communicating
with other users; and (3) uses, which concerns the various ways that people
can use bots for personal, corporate, and political ends, where questions of
social impact are front and center. We conclude with a discussion of the
major challenges in advancing a general understanding of political bots going
forward. These challenges include access to data, bot detection methods,
and the general lack of conceptual clarity that scholars, journalists, and the
public have had to grapple with.
2 A Typology of Bots
In its simplest form, the word “bot” is derived from “robot.” Bots have
generally been defined as automated agents that function on an online
platform (Franklin and Graesser 1996). As some put it, these are programs
that run continuously, formulate decisions, act upon those decisions without
human intervention, and are able to adapt to the context they operate
in (Tsvetkova et al. 2017). However, since the rise of computing and the
eventual creation of the World Wide Web, there have been many different
programs that have all been called bots, including some that fulfill signifi-
cantly different functions and have different effects than those that we would
normally associate with bots today. One of the pioneering early works on
bots, Leonard’s Origin of New Species (1997), provides an excellent example
of the lack of clarity that the term had even as it first became widely used
in the 1990s. Various programs and scripts serving many different functions
are all lumped into Leonard’s “bot kingdom”: web scrapers, crawlers,
indexers, chatbots that interact with users via a simple text interface,
and the simple autonomous agents that played a role in early online
“multi-user dungeon” (MUD) games. Each one of these types functions in
different ways, and in recent years, has become associated with a different
scholarly community. While a complete typology would be worthy of its own
article, we provide here a brief overview of the major different processes and
programs often referred to as “bots,” paying particular attention to those
that are most relevant to current policy concerns.
2.1 ‘Web Robots’: Crawlers and Scrapers
As the Web grew rapidly following its inception in the 1990s, it became clear
that both accessing and archiving the incredible number of webpages that
were being added every day would be an extremely difficult task. Given the
unfeasibility of using manual archiving tools in the long term, automated
scripts—commonly referred to as robots or spiders—were deployed to down-
load and index websites in bulk, and eventually became a key component
of what are now known as search engines (Olston and Najork 2010; Pant,
Srinivasan, and Menczer 2004).
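The crawl-and-index loop these systems perform can be sketched with the
standard library alone. The fragment below is illustrative rather than any
production crawler: the `fetch` callable is a stand-in for real HTTP retrieval,
and the breadth-first traversal is one of several common strategies.

```python
from html.parser import HTMLParser
from collections import deque

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag encountered in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed, fetch, limit=100):
    """Breadth-first crawl. `fetch(url)` returns an HTML string or None.
    Returns an index mapping each visited URL to its outgoing links."""
    index, frontier, seen = {}, deque([seed]), {seed}
    while frontier and len(index) < limit:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        index[url] = parser.links
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return index
```

A real crawler would add politeness delays, URL normalization, and the
robots.txt checks discussed below; the sketch only captures the core
fetch-parse-enqueue cycle.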
While these crawlers did not interact directly with humans, and operated
behind the scenes, they could still have a very real impact on end-users: it
quickly became apparent that these scripts posed a technology policy issue,
given that poorly executed crawlers could inadvertently overwhelm servers
by querying too many pages at once, and because users and system admin-
istrators would not necessarily want all of their content indexed by search
engines. To remedy these issues, the “Robot Exclusion Protocol” was
developed, and later proposed as an Internet Engineering Task Force (IETF)
draft, to govern these “Web Robots” via a robots.txt file placed at the root
of a website, which provided rules for crawlers as to what should be
considered off limits (Koster 1996). From
their early days, these crawlers were often referred to as bots: for example,
Polybot and IRLBot were two popular early examples (Olston and Najork
2010). Other terminology used occasionally for these web crawlers included
“wanderers,” “worms,” “fish,” “walkers,” or “knowbots” (Gudivada et al. 1997).
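The protocol itself is easy to demonstrate: a plain-text robots.txt file at
the site root lists which paths are off limits to which user agents, and
well-behaved crawlers consult it before fetching. Python’s standard library
ships a parser for the format; the policy below is a hypothetical example.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt policy: bar all crawlers from /private/,
# and leave everything else fetchable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("*", "https://example.org/index.html"))  # True
print(parser.can_fetch("*", "https://example.org/private/a"))   # False
```

Note that compliance is voluntary: nothing technically prevents a crawler
from ignoring the file, which is partly why it remained a policy question.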
Today, it has become common for writing on social media bots to open with
striking statistics that demonstrate their apparent global impact. For example,
reports from private security and hosting companies have estimated that more
than half of all web traffic is created by “bots,” and these numbers are oc-
casionally cited by scholars in the field (Gilani, Farahbakhsh, and Crowcroft
2017). But a closer look indicates that the “bots” in question are in fact these
kinds of web crawlers and other programs that perform crawling, indexing,
and scraping functions. These are an infrastructural element of search
engines and other features of the modern World Wide Web; they do not directly
interact with users on a social platform, and are therefore considerably
different from automated social media accounts.
2.2 Chatbots
Chatbots are a form of human–computer dialog system that operate through
natural language via text or speech (Deryugina 2010; Sansonnet, Leray, and
Martin 2006). In other words, they are programs that approximate human
speech and interact with humans directly through some sort of interface.
Chatbots are almost as old as computers themselves: Joseph Weizenbaum’s
program, ELIZA, which operated on an early time-shared computing system
at MIT in the 1960s, impersonated a psychoanalyst by responding to simple
text-based input from a list of pre-programmed phrases (Weizenbaum 1966).
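Weizenbaum’s basic technique, matching input against a list of patterns and
filling a canned response template with the user’s own words, can be sketched
in a few lines of Python. The two rules below are invented for illustration
and are not from the actual ELIZA script.

```python
import re

# Illustrative (pattern, response-template) rules; "{0}" is filled with
# the text captured by the pattern, echoing the user's words back.
RULES = [
    (re.compile(r"i am (.*)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"i feel (.*)", re.I), "Tell me more about feeling {0}."),
]
DEFAULT = "Please go on."

def respond(utterance):
    """Return the first matching canned response, ELIZA-style."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1))
    return DEFAULT
```

The strictly procedural character of this design, a fixed rule list with a
fallback, is exactly what distinguishes early chatbots from the adaptive,
learning-based systems discussed later in this article.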
Developers of functional chatbots seek to design programs that can sus-
tain at least basic dialogue with a human user. This entails processing inputs
(through natural language processing, for example), and making use of a
corpus of data to formulate a response to this input (Deryugina 2010). Mod-
ern chatbots are substantially more sophisticated than their predecessors:
today, chatbot programs have many commercial implementations, and are
often known as virtual assistants or assisting conversational agents (Sanson-
net, Leray, and Martin 2006), with current voice-based examples including
Apple’s Siri and Amazon’s Alexa. Another implementation for chatbots is
within messaging applications, and as instant messaging platforms have be-
come extremely popular, text-based chatbots have been developed for mul-
tiple messaging apps, including Facebook Messenger, Skype, Slack, WeChat,
and Telegram (Folstad and Brandtzaeg 2017). Bots have been built by de-
velopers to perform a range of practical functions on these apps, including
answering frequently asked questions and performing organizational tasks.
While some social media bots, like those on Twitter, can occasionally fea-
ture chatbot functionality that allows them to interact directly with human
users (see, for instance, the infamous case of Microsoft’s “Tay” in Neff and
Nagy 2016), most chatbots remain functionally separate from typical social
media bots.
2.3 Spambots
Spam has been a long-standing frustration for users of networked services,
pre-dating the Internet on bulletin boards like USENET (Brunton 2013). As
the early academic ARPANET opened up to the general public, commercial
interests began to take advantage of the reach provided by the new medium
to send out advertisements. Spamming activity escalated rapidly as the Web
grew, to the point that spam was said to “threaten the Internet’s stability
and reliability” (Weinstein 2003). To distribute their messages at scale,
spammers wrote automated scripts: enter the first “spambots.”
Spambots, as traditionally understood, are not simple scripts but rather
computers or other networked devices compromised by malware and con-
trolled by a third party (Brunton 2012). These have been traditionally
termed “bots” in the information security literature (Moore and Anderson
2012). Machines can be harnessed into large networks (botnets), which can
be used to send spam en masse or perform Distributed Denial of Service
(DDoS) attacks. Major spam botnets, like Storm, Grum, or Rustock, can
send billions of emails a day and are composed of hundreds of thousands of
compromised computers (Rodríguez-Gómez et al. 2013). These are machines
commandeered for a specific purpose, and not automated agents in the sense
of a chatbot or social bot (see below).
Two other forms of spam that users often encounter on the web and on
social networks are the “spambots” that post on online comment sections,
and those that spread advertisements or malware on social media platforms.
Hayati et al. (2009) study what they call “web spambots,” programs that are
often application specific and designed to attack certain types of comment
infrastructures, like the WordPress blogging tools that provide the back-end
for many sites, or comment services like Disqus. These scripts function like
a crawler, searching for sites that accept comments and then mass posting
messages. Similar spam crawlers search the web to harvest email addresses for
use in later spam campaigns (Hayati et al. 2009). These spambots are effectively crawlers
and are distinct functionally from social bots. However, in a prime example
of the ambiguity that these terms can have, once social networking services
rose to prominence, spammers began to impersonate users with manually
controlled or automated accounts, creating profiles on social networks and
trying to spread commercial or malicious content onto sites like MySpace
(Lee, Eoff, and Caverlee 2011). These spambots are in fact distinct from the
commonly discussed spambots (networks of compromised computers or web
crawlers) and in some cases may only differ from contemporary social media
bots in terms of their use.
2.4 Social Bots
As the new generation of “Web 2.0” social networks was established in the
mid-2000s, bots became increasingly deployed on a host of new platforms. On
Wikipedia, editing bots were deployed to help with the automated adminis-
tration and editing of the rapidly growing crowdsourced encyclopedia (Geiger
2014, 342). The emergence of the microblogging service Twitter, founded in
2006, would lead to the large-scale proliferation of automated accounts, due
to its open application programming interface (API) and policies that en-
couraged developers to creatively deploy automation through third party
applications and tools. In the early 2010s, computer scientists began to note
that these policies enabled a large population of automated accounts that
could be used for malicious purposes, including spreading spam and malware
links (Chu et al. 2010).
Since then, various forms of automation operating on social media plat-
forms have been referred to as social bots. Two subtly different, yet impor-
tant distinctions have emerged in the relevant social and computer science
literatures, linked to two slightly different spellings: “socialbot” (one word)
and “social bot” (two words). The first conference paper on “socialbots,”
published in 2011, describes how automated accounts, assuming a fabricated
identity, can infiltrate real networks of users and spread malicious links or
advertisements (Boshmaf et al. 2011). These socialbots are defined in in-
formation security terms as an adversary, and often called “sybils,” a term
derived from the network security literature for an actor that controls mul-
tiple false nodes within a network (Cao et al. 2012; Boshmaf et al. 2013;
Mitter, Wagner, and Strohmaier 2014).
Social bots (two words) are a broader and more flexible concept, generally
employed by social scientists who have recently developed an interest in
various forms of automation on social media. A social bot is generally
understood as a program “that automatically produces content and interacts with
humans on social media” (Ferrara et al. 2016). As Stieglitz et al. (2017)
note in a comprehensive literature review of social bots, this definition of-
ten includes a stipulation that social bots mimic human users. For example,
Abokhodair et al. (2015, 840) define social bots as “automated social agents”
that are public facing and that seem to act in ways that are not dissimilar
to how a real human may act in an online space.
The bot of greatest recent interest is a subcategory of the social bot: social
bots deployed for political purposes, also known as political bots
(Woolley and Howard 2016). One of the first political uses of social bots was
during the 2010 Massachusetts Special Election in the United States, where
a small network of automated accounts was used to launch a Twitter smear
campaign against one of the candidates (Metaxas and Mustafaraj 2012). A
more sophisticated effort was observed a year later in Russia, where activists
took to Twitter to mobilize and discuss the Presidential election, only to be
met with a concerted bot campaign designed to clog up hashtags and drown
out political discussion (Thomas, Grier, and Paxson 2012). Since 2012, re-
searchers have suggested that social bots have been used on Twitter to in-
terfere with political mobilization in Syria (Abokhodair, Yoo, and McDonald
2015; Verkamp and Gupta 2013) and Mexico (Suárez-Serrato et al. 2016),
with journalistic evidence of their use in multiple other countries (Woolley
2016). Most recently, scholars have been concerned about the application
of political bots to important political events like referenda (Woolley and
Howard 2016), with studies suggesting that there may have been substan-
tial Twitter bot activity in the lead up to the UK’s 2016 Brexit referendum
(Bastos and Mercea 2017), the 2017 French presidential election (Ferrara 2017),
and the 2016 US Presidential Election (Bessi and Ferrara 2016). While social
bots are now often associated with state-run disinformation campaigns, there
are other automated accounts used to fulfill creative and accountability func-
tions, including via activism (Savage, Monroy-Hernandez, and Hollerer 2015;
Ford, Dubois, and Puschmann 2016) and journalism (Lokot and Diakopoulos
2015). Social bots can be used for benign commercial purposes as well
as more fraught activities such as search engine optimization, spamming, and
influencer marketing (Ratkiewicz et al. 2011).
2.5 Sockpuppets and ‘Trolls’
“Sockpuppet” is another term often used to describe fake identities that
interact with ordinary users on social networks (Bu, Xia, and Wang 2013).
The term generally implies manual control over accounts,
but it is often used to include automated bot accounts as well (Bastos and
Mercea 2017). Sockpuppets can be deployed by government employees, reg-
ular users trying to influence discussions, or by “crowdturfers,” workers on
gig-economy platforms like Fiverr hired to fabricate reviews and post fake
comments about products (Lee, Webb, and Ge 2014).
Politically motivated sockpuppets, especially when coordinated by government
proxies or interrelated actors, are often called “trolls.” Multiple reports
have emerged detailing the activities of a notorious “troll factory” linked
to the Russian government and located outside of St Petersburg, allegedly
housing hundreds of paid bloggers who inundate social networks with pro-
Russia content published under fabricated profiles (Chen 2015). This company,
the so-called “Internet Research Agency,” gained further infamy when Facebook
and Twitter testified before Congress that it had purchased advertising
targeted at American voters during the 2016 Presidential election (Stretch
2017). There are varying degrees of evidence for similar activity, confined
mostly to the domestic context and carried
out by government employees or proxies, with examples including countries
like China, Turkey, Syria, and Ecuador (King et al. 2017; Cardullo 2015;
Al-Rawi 2014; Freedom House 2016).
The concept of the “troll farm” is imprecise due to its differences from the
practice of “trolling” as outlined by Internet scholars like Phillips (2015) and
Coleman (2012). Also challenging are the differing cultural contexts and un-
derstandings of some of these terms. Country-specific work on digital politics
has suggested that the lexicon for these terms can vary in different countries:
for instance, in Polish, the terms “troll” and “bot” are generally seen by some
as interchangeable, and used to indicate manipulation without regard to au-
tomation (Gorwa 2017). In the public discourse in the United States and
United Kingdom around the 2016 US Election and about the Internet Re-
search Agency, journalists and commentators tend to refer to Russian trolls
and Russian bots interchangeably. Some have tried to get around these am-
biguous terms: Bastos and Mercea (2017) use the term sockpuppet instead,
noting that most automated accounts are in a sense sockpuppets, as they
often impersonate users. But given that the notion of simulating the general
behavior of a human user is inherent in the common definition of social bots
(Maus 2017), we suggest that automated social media accounts be called so-
cial bots, and that the term sockpuppet be used (instead of the term troll)
for accounts with manual curation and control.
2.6 Cyborgs and Hybrid Accounts
Amongst the most pressing challenges for researchers today are accounts
that exhibit a combination of automation and human curation, often called
“cyborgs.” Chu et al. (2010, 21) provided one of the first and most commonly
cited definitions of the social media cyborg: a “bot-assisted human or
human-assisted bot.” However, it has never been clear
exactly how much automation makes a human user a cyborg, or how much
human intervention is needed to make a bot a cyborg, and indeed, cyborgs
are very poorly understood in general. Is a user that makes use of the service
Tweetdeck (which was acquired by Twitter in 2011, and is widely used) to
schedule tweets or to tweet from multiple accounts simultaneously considered
a cyborg? Should organizational accounts (from media organizations like
the BBC, for example) which tweet automatically with occasional human
oversight be considered bots or cyborgs?
Another ambiguity regarding hybrids is apparent in the emerging trend of
users volunteering their real profiles to be automated for political purposes, as
seen in the 2017 UK general election (Gorwa and Guilbeault 2017). Similarly,
research has documented the prevalence of underpaid human “clickworkers”
hired to spread political messages and to like, upvote, and share content in
order to game algorithms (Lee et al. 2011, 2014). Clickworkers offer a serviceable alternative
to automated processes, while also exhibiting enough human-like behavior to
avoid anti-spam filters and bot detection algorithms (Golumbia 2013). The
conceptual distinction between social bots, cyborgs, and sockpuppets is
unclear, as it depends on a theoretical and hitherto undetermined threshold of
automation. This lack of clarity has a real effect: problematically, the best
current academic methods for Twitter bot detection are not able to accurately
detect cyborg accounts, as any level of human engagement is enough to throw
off machine-learning models trained on account features (Ferrara et al. 2016).
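To make the notion of “account features” concrete, here is a deliberately toy
Python scorer. The features, thresholds, and equal weights are illustrative
assumptions on our part; real detection systems learn such weights from
labeled data. A cyborg account with a human curator would trip few of these
signals, which is precisely the evasion problem described above.

```python
def bot_score(account):
    """Toy feature-based scorer returning a value in [0, 1].
    `account` is a dict of features; signals and weights are illustrative."""
    signals = [
        account["tweets_per_day"] > 50,                   # unusually high volume
        account["followers"] < account["friends"] / 10,   # follows far more than followed
        account["account_age_days"] < 30,                 # very new account
        account["default_profile_image"],                 # never personalized
    ]
    return sum(signals) / len(signals)

def label(account, threshold=0.5):
    """Binary decision on top of the score."""
    return "likely bot" if bot_score(account) > threshold else "likely human"
```

For example, a month-old account posting hundreds of times a day with a
default avatar scores near 1.0, while a long-lived account with balanced
follower counts scores near 0; hybrid accounts fall ambiguously in between.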
3 A Framework for Understanding Bots: Three Core Considerations
The preceding sections have outlined the multitude of different bot types and
the challenges of formulating static definitions. When creating a conceptual
map or typology, should we lump together types of automation by their
use, or by how they work? Rather than attempting to create a definitive, pre-
scriptive framework for the countless different types of bots, we recommend
three core considerations that are useful when thinking about them, inspired
by past work on developer–platform relations and APIs (Bogost & Mont-
fort 2008). Importantly, these considerations are not framed as a rejection
of pre-existing categorizations, and they account for the fact that bots are
constantly changing and increasing in their sophistication. The framework
has three parts, which can be framed as simple questions. The idea is that
focusing on each consideration when assessing a type of bot will provide a
more comprehensive sense of how to categorize the account, relative to one’s
goals and purposes. The first question is structural: How does the technology
actually work? The second is functional: What kind of operational capacities
does the technology afford? The third is ethical: How are these technologies
actually deployed, and what social impact do they have? We discuss these
three considerations, and their implications for policy and research, below.
3.1 The Structure of the System
The first category concerns the substance, design, and operation of the sys-
tem. There are many questions that need to be considered. What envi-
ronment does it operate in? Does it operate on a social media platform?
Which platform or platforms? How does the bot work? What type of code
does it use? Is it a distinct script written by a programmer, or a publicly
available tool for automation like If This Then That (IFTTT), or perhaps
a type of content management software like SocialFlow or Buffer? Does it
use the API, or does it use software designed to automate web browsing by
interacting with website HTML and simulating clicks (headless browsing)? Is it
fully automated, or is it a hybrid account that keeps a “human in the loop”?
What type of algorithm does it use? Is it strictly procedural (e.g. has a set
number of responses, like ELIZA) or does it use machine learning to adapt to
conversations and exhibit context sensitivity (Adams 2017)? Policy at both
the industry and public level will need to be designed differently to target
“bots” with different structural characteristics.
Perhaps the simplest and most important question about structure for
bot regulation is whether the “bot” is made of software at all, or if it is
a human exhibiting bot-like behavior. A surprising number of journalists
and researchers describe human-controlled accounts as bots: for example,
Munger’s (2017) online experiment where the so-called bot accounts were
manually controlled by the experimenter. Similarly, the recent media cover-
age of “Russian bots” often lumps together automated accounts and manu-
ally controlled ones under a single umbrella (Shane 2017). Even more am-
biguous are hybrid accounts, where users can easily automate their activity
using various types of publicly available software. At the structural level,
technology policy will have to determine how this type of automation will
be managed, and how these types of content management systems should
be designed. The structure of the bot is also essential for targeting techni-
cal interventions, either in terms of automated detection and removal, or in
terms of prevention via API policies. If policy makers are particularly con-
cerned with bots that rely on API access to control and operate accounts,
then lobbying social media companies to impose tighter constraints on their
API could be an effective redress. Indeed, it appears as if most of the Twit-
ter bots that can be purchased online or through digital marketing agencies
are built to rely on the public API, so policy interventions at this level are
likely to lead to a significant reduction in bot activity. Similarly, structural
interventions would include a reshaping of how content management allows
the use of multiple accounts to send duplicate messages and schedule groups
of posts ahead of time.
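As an illustration of what such a structural intervention might look for, the following minimal Python sketch flags messages posted verbatim by several distinct accounts, a crude signal of coordinated automation. The feed, function name, and threshold are hypothetical; production systems at the platforms are far more sophisticated.

```python
from collections import defaultdict

def find_coordinated_duplicates(posts, min_accounts=3):
    """Flag messages posted verbatim by at least `min_accounts` distinct
    accounts -- a crude signal of coordinated automation (illustrative only)."""
    accounts_by_message = defaultdict(set)
    for account, message in posts:
        # Normalize lightly so trivial case/whitespace changes still match.
        accounts_by_message[message.strip().lower()].add(account)
    return {msg: accts for msg, accts in accounts_by_message.items()
            if len(accts) >= min_accounts}

# Hypothetical feed: three accounts pushing one slogan, plus an organic post.
feed = [
    ("@a1", "Vote YES on Sunday!"), ("@a2", "Vote YES on Sunday!"),
    ("@a3", "vote yes on sunday!"), ("@u1", "Lovely weather today"),
]
flagged = find_coordinated_duplicates(feed)
print(flagged)  # flags 'vote yes on sunday!' as posted by three accounts
```

Even this toy example shows the dual-use tension discussed below: the same check would flag a grassroots campaign asking supporters to share a slogan.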
3.2 The Bot’s Function
The second category pertains more specifically to what the bot does. Is the
role of the bot to operate a social media account? Does it identify itself as
a bot, or does it impersonate a human user, and if so, does it do so convinc-
ingly? Does it engage with users in conversation? Does it communicate with
individual users, or does it engage in unidirectional mass-messaging?
Questions concerning function are essential for targeting policy to spe-
cific kinds of bots. They are also vital for avoiding much of the cross-talk
that occurs in bot-related discourse. For instance, chatbots are occasionally
confused with other types of social bots, even though the two exhibit distinct
functionalities and different structural underpinnings. In their narrow, con-
trolled environment, chatbots are often clearly identified as bots, and they
can perform a range of commercial services such as making restaurant reser-
vations or booking flights. Some chatbots have even been designed to build
personal relationships with users—such as artificial companions and therapist
bots (Floridi 2014; Folstad and Brandtzaeg 2017).
These new self-proclaimed bots pose their own issues and policy concerns,
such as the collection and marketing of sensitive personal data to advertisers
(Neff and Nafus 2016). Importantly, chatbots differ substantially in both
structure and function from most social bots, which communicate primarily
over public posts that appear on social media pages. These latter bots are
typically built to rely on hard-coded scripts that post predetermined mes-
sages, or that copy the messages of users in a predictable manner, such that
they are incapable of participating in conversations. Questions about func-
tionality allow us to distinguish social bots, generally construed, from other
algorithms that may not fall under prospective bot-related policy interven-
tions aimed at curbing political disinformation. If the capacity to commu-
nicate with users is definitive of the type of bot in question, where issues of
deception and manipulation are key, then algorithms that do not interact
directly and publicly with users, such as web-scrapers, should not be consid-
ered conceptually similar.
3.3 The Bot’s Use
This third category specifically refers to how the bot is used, and what the
end goal of the bot is. This is arguably the most important category from a policy
standpoint, as it contains ethical and normative judgements as to what pos-
itive, acceptable online behavior is—not just for bots, but also for users in
general. Is the bot being used to fulfil a political or ideological purpose? Is
it spreading a certain message or belief? If so, is it designed to empower
certain communities or to promote accountability and transparency? Or does
the bot instead appear to have a commercial agenda?
Because of the diversity of accounts that qualify as bots, automation
policies cannot operate without normative assumptions about what kinds
of bots should be allowed to operate over social media. The problem for
the policymakers currently trying to make bots illegal (see, for example, the
proposed “Bot Disclosure and Accountability Act, 2018,” also known as the
Feinstein Bill) is that, structurally, the same social bots can simultaneously
enable a host of positive and negative actors. The affordances that make
social bots a potentially powerful political organizing tool are the same ones
that allow them to be deployed by foreign governments, much like social
networks themselves and other recent digital technologies with similar “dual-
use” implications (Pearce 2015). It is therefore difficult to constrain negative
uses without also curbing positive uses at the structural level.
For instance, if social media platforms were to ban bots of all kinds as
a way of intervening on political social bots, this could prevent the use of
various chat bot applications that users appreciate, such as automated per-
sonal assistants and customer service bots. Any regulation of bots, whether
from within or outside of social media companies, would therefore need to
distinguish types of bots based on their function, addressing the types that
have a negative impact while preserving those recognized as having a more
positive impact. As
specified by the typology above, it may be most useful to develop regulations
to address social bots particularly, given that web-scrapers are not designed
to influence users through direct communicative activities, and chatbots are
often provided by software companies to perform useful social functions.
The issue of distinguishing positive from negative uses of bots is espe-
cially complex when considering that social media companies often market
themselves as platforms that foster free speech and political conversation.
If organizations and celebrities are permitted certain types of automation—
including those who use it to spread political content—then it seems fair that
users should also be allowed to deploy bots that spread their own political
beliefs. Savage et al. (2015), for instance, have designed a system of bots
to help activists in Latin America mobilize against corruption. As politi-
cal activity is a core part of social media, and some accounts are permitted
automation, the creators of technology policy (most critically, the employ-
ees of social media platforms who work on policy matters) will be placed
in the difficult position of outlining guidelines that do not arbitrarily dis-
rupt legitimate cases, such as citizen-built bot systems, in their attempt to
block illegitimate political bot activity, such as manipulative foreign influ-
ence operations. But it is clear that automation policies—like other content
policies—should be made more transparent, or they will appear wholly ar-
bitrary or even purposefully negligent. A recent example is provided by the
widely covered account of ImpostorBuster, a Twitter bot built to combat
antisemitism and hate speech, which was removed by Twitter, rather than
the hate-speech bots and accounts it was trying to combat (Rosenberg 2017).
While Twitter is not transparent as to why it removes certain accounts, the
bot appears to have been pulled down automatically for structural reasons
(such as violating Twitter’s rate limits after being flagged by users trying to
take it down), without consideration of its normative use and possible social
benefit.
Overall, it is increasingly evident that the communities empowered by
tools such as automation are not always the ones that the social media plat-
forms initially envisioned when they built these tools; the sophisticated use
of bots, sock-puppets, and other mechanisms for social media manipulation
by the US “alt-right” in the past two years provides an excellent example
(Marwick and Lewis 2017). Should
social media companies crack down on automated accounts? As platforms
currently moderate what they consider to be acceptable bots, a range of
possible abuses of power becomes apparent as soon as debates around disin-
formation and “fake news” become politicized. Now that government inter-
ests have entered the picture, the situation has become even more complex.
Regimes around the world have already begun to label dissidents as “bots”
or “trolls,” and dissenting speech as “fake news”—consider the recent efforts
by the government of Vietnam to pressure Facebook to remove “false ac-
counts” that have espoused anti-government views (Global Voices 2017). It
is essential that social media companies become more transparent about how
they define and enforce their content policies—and that they avoid defining
bots in such a vague way that they can essentially remove any user account
suspected of demonstrating politically undesirable behavior.
4 Current Challenges for Bot-Related Policy
Despite mounting concern about digital influence operations over social me-
dia, especially from foreign sources, there have yet to be any governmental
policy interventions developed to more closely manage the political uses of
social media bots. Facebook and Twitter have been called to testify to Con-
gressional Intelligence Committees about bots and foreign influence during
the 2016 US presidential election, and have been pressed to discuss proposed
solutions for addressing the issue. Most recently, measures proposed by state
legislators in California in April 2018, and at the federal level by Senator
Dianne Feinstein in June 2018, would require all bot accounts to be labeled
as such by social media companies (Wang 2018). However, any initiatives
suggested by policymakers and informed by research will have to deal with
several pressing challenges: the conceptual ambiguity outlined in the preced-
ing sections, as well as poor measurement and data access, lack of clarity
about who exactly is responsible, and the overarching challenge of business
incentives that are not predisposed towards resolving the aforementioned issues.
4.1 Measurement and Data Access
Bot detection is very difficult. It is not widely appreciated that researchers
relying solely on data provided through public APIs are unable to fully rep-
resent the scale of the current issue. Even the social media companies
themselves find bot detection a challenge, partially because of the massive
scale on which they (and the bot operators) function. In a policy statement
following its testimony to the Senate Intelligence Committee in November
2017, Twitter said it had suspended over 117,000 “malicious applications”
in the previous four months alone, and was catching more than 450,000
suspicious logins per day (Twitter Policy 2017). Tracking the thousands of
bot accounts created every day, while maintaining a totally open API, is
virtually impossible. Similarly, Facebook has admitted that their platform
is so large (with more than two billion users) that accurately classifying and
measuring “inauthentic” accounts is a major challenge (Weedon, Nuland,
and Stamos 2017). Taking this a step further by trying to link malicious
activity to a specific actor (e.g. groups linked to a foreign government) is
even more difficult, as IP addresses and other indicators can be easily spoofed
by determined, careful operators.
For academics, who do not have access to more sensitive account in-
formation (such as IP addresses, sign-in emails, browser fingerprints), bot
detection is even more difficult. Researchers cannot study bots on Facebook,
due to the limitations of the publicly available API, and as a result, virtu-
ally all studies of bot activity have taken place on Twitter (with the notable
exception of studies where researchers have themselves deployed bots that
invade Facebook, posing a further set of ethical dilemmas, see Boshmaf et
al. 2011). Many of the core ambiguities in bot detection stem from what
can be termed the “ground truth” problem: even the most advanced current
bot detection methods hinge on the successful identification of bot accounts
by human coders (Subrahmanian et al. 2016), a problem given that humans
are not particularly good at identifying bot accounts (Edwards et al. 2014).
Researchers can never be 100 percent certain that an account is truly a
bot, posing a challenge for machine learning models that use human-labeled
training data (Davis et al. 2016). The precision and recall of academic
bot detection methods, while constantly improving, is still seriously limited.
Less is known about the detection methods deployed by the private sector
and contracted by government agencies, but one can assume that they suffer
from the same issues.
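The ground-truth problem can be made concrete with a toy example: a detector’s precision and recall are computed against human labels, so any coder error propagates directly into the reported metrics. Everything below (the features, weights, threshold, and data) is invented purely for illustration and bears no relation to any real detection system.

```python
def predict_bot(tweets_per_day, duplicate_ratio, threshold=0.5):
    """Toy rule: score an account by activity volume and repetitiveness.
    (Illustrative only -- real detectors use hundreds of features.)"""
    score = 0.6 * min(tweets_per_day / 200, 1.0) + 0.4 * duplicate_ratio
    return score >= threshold

# Hypothetical accounts: (tweets/day, share of duplicate posts, human label).
labeled = [
    (300, 0.9, True), (250, 0.8, True), (180, 0.7, True),
    (10, 0.1, False), (5, 0.0, False),
    (150, 0.9, False),  # coder labeled this account human; detector disagrees
]

tp = sum(1 for t, d, y in labeled if predict_bot(t, d) and y)
fp = sum(1 for t, d, y in labeled if predict_bot(t, d) and not y)
fn = sum(1 for t, d, y in labeled if not predict_bot(t, d) and y)
precision = tp / (tp + fp)  # 0.75 against these labels
recall = tp / (tp + fn)     # 1.0 against these labels
print(precision, recall)
```

Note that the last account counts as a "false positive" only because the human coder said so; if the coder was wrong, the reported precision understates the detector's true performance. This is exactly the circularity that limits human-labeled training data.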
Just like researchers, governments have data access challenges. For exam-
ple, what really was the scale of bot activity during the most recent elections
in the United States, France, and Germany? The key information about me-
dia manipulation and possible challenges to electoral integrity is now squarely
in the private domain, presenting difficulties for a public trying to understand
the scope of a problem while being provided with only the most cursory in-
formation. The policy implications of these measurement challenges become
very apparent in the context of the recent debate over a host of apparently
Russian-linked pages spreading inflammatory political content during the
2016 US presidential election. While Facebook initially claimed that only
a few million people saw advertisements that had been generated by these
pages, researchers used Facebook’s own advertising tools to track the reach
that these posts had generated, concluding that they had been seen more
than a hundred million times (Albright 2017). However, Karpf (2017) and
others suggested that these views could have been created by illegitimate
automated accounts, and that there was no way of telling how many of the
“impressions” were from actual Americans. It is currently impossible for
researchers to either discount or confirm the extent to which indicators such as
likes and shares are being artificially inflated by false accounts, especially
on a closed platform like Facebook. The existing research that has been
conducted by academics into Twitter, while imperfect, has at least sought
to understand what is becoming increasingly perceived as a serious public
interest issue. However, Twitter has dismissed this work by stating that
their API does not actually reflect what users see on the platform (in effect,
playing the black box card). This argument takes the current problem of
measurement a step further: detection methods which are already imper-
fect operate on the assumption that the Twitter Streaming APIs provide a
fair account of content on the platform. To understand the scope and scale
of the problem, policymakers will need more reliable indicators and better
measurements than are currently available.
4.2 Responsibility
Most bot policy to date has in effect been entirely the purview of social
media companies, who understandably are the primary actors in dealing
with content on their platforms and manage automation based on their own
policies. However, the events of the past year have demonstrated that these
private (often rather opaque) policies can have serious political ramifications,
potentially placing them more squarely within the remit of regulatory and
legal authorities. A key and unresolved challenge for policy is the question
of responsibility, and the inter-related questions of jurisdiction and authority.
To what extent should social media companies be held responsible for the
dealings of social bots? And who will hold these companies to account?
While the public debate around automated accounts is only nascent at
best, it is clearly related to the current debates around the governance of
political content and hyper-partisan “fake news.” In Germany, for instance,
there has been substantial discussion around newly enacted hate-speech laws
which impose significant fines against social media companies if they do not
respond quickly enough to illegal content, terrorist material, or harassment
(Tworek 2017). Through such measures, certain governments are keen to
assert that they do have jurisdictional authority over the content to which
their citizens are exposed. A whole spectrum of regulatory options under this
umbrella exists, with some being particularly troubling. For example, some
have argued that the answer to the “bot problem” is as simple as implement-
ing and enforcing strict “real-name” policies on Twitter—and making these
policies stricter for Facebook (Manjoo and Roose 2017). The recent emer-
gence of bots into the public discourse has re-opened age-old debates about
anonymity and privacy online (boyd 2012; Hogan 2012), now with the added
challenge of balancing the anonymity that can be abused by sock-puppets
and automated fake accounts, and the anonymity that empowers activists
and promotes free speech around the world.
In a sense, technology companies have already admitted at least some de-
gree of responsibility for the current political impact of the misinformation
ecosystem, within which bots play an important role (Shao et al. 2017). In a
statement issued after Facebook published evidence of Russian-linked groups
that had purchased political advertising through Facebook’s marketing tools,
CEO Mark Zuckerberg mentioned that Facebook takes political activity se-
riously and was “working to ensure the integrity of the [then upcoming]
German elections” (Read 2017). This kind of statement represents a signifi-
cant acknowledgement of the political importance of social media platforms,
despite their past insistence that they are neutral conduits of information
rather than media companies or publishers (Napoli and Caplan 2017). It
is entirely possible that Twitter’s policies on automation have an effect, no
matter how minute, on elections around the world. Could they be held liable
for these effects? At the time of writing, the case has been litigated in the
court of public opinion, rather than through explicit policy interventions or
regulation, but policymakers (especially in Europe) have continued to put
Twitter under serious pressure to provide an honest account of the extent
to which various elections and referenda (e.g. Brexit) have been influenced by
“bots.” The matter is by no means settled, and will play an important part
in the deeper public and scholarly conversation around key issues of platform
responsibility, governance, and accountability (Gillespie 2018).
4.3 Contrasting Incentives
Underlying these challenges is a more fundamental question about the busi-
ness models and incentives of social media companies. As Twitter has long
encouraged automation by providing an open API with very permissive third-
party application policies, automation drives a significant amount of traffic
on their platform (Chu et al. 2010). Twitter allows accounts to easily deploy
their own applications or use tools that automate their activity, which can be
useful: accounts run by media organizations, for example, can automatically
tweet every time a new article is published. Automated accounts appear to
drive a significant portion of Twitter traffic (Gilani et al. 2017; Wojcik et al.
2018), and indeed, fulfill many creative, productive functions alongside their
malicious ones. Unsurprisingly, Twitter wishes to maintain the largest pos-
sible user base and reports “monthly active users” to its shareholders; as
such, it is loath to change its automation policies or to require meaningful
review for applications. It has taken immense public pressure for
Twitter to finally start managing the developers who are allowed to build
on the Twitter API, announcing a new “developer onboarding process” in
January 2018 (Twitter Policy 2018).
As business incentives are critical in shaping content policy—and there-
fore policies concerning automation—for social media companies, slightly dif-
ferent incentives have yielded differing policies on automation and content.
For example, while Twitter’s core concern has been to increase their traffic
and to maintain as open a platform as possible (famously once claiming to be
the “free speech wing of the free speech party”), Facebook has been battling
invasive spam for years and has much tighter controls over its API. As such,
it appears that Facebook has comparatively much lower numbers of auto-
mated users (both proportionally and absolutely), and is instead concerned
primarily with manually controlled sock-puppet accounts, which can be set
up by anyone and are difficult or impossible to detect if they do not coor-
dinate at scale or draw too much attention (Weedon, Nuland, and Stamos
2017). For both companies, delineating between legitimate and illegitimate
activity is a key challenge. Twitter would certainly prefer to be able to keep
their legitimate and benign forms of automation (bots which regularly tweet
the weather, for example) and only clamp down on malicious automation,
but doing so is difficult, as the same structural features enable both types
of activity. These incentives seem to inform the platforms’ unwillingness to
share data with the public or with researchers, as well as their past lack of
transparency. Evidence that demonstrated unequivocally the true number
of automated accounts on Twitter, for example, could have major adverse
effects on its bottom line. Similarly, Facebook faced public backlash after
a series of partnerships with academics that yielded unethical experiments
(Grimmelmann 2015). Why face another public relations crisis if they can
avoid it?
This illustrates the challenge that lies behind all the other issues we have
mentioned here: platform interests often clash with the preferences of the
academic research community and of the public. Academics strive to open the
black box and better understand the role that bots play in public debate and
information diffusion, while pushing for greater transparency and more access
to the relevant data, with little concern for the business dealings of a social
networking platform. Public commentators may wish for platforms to take
a more active stance against automated or manually orchestrated campaigns
of hate speech and harassment, and may be concerned by the democratic
implications of certain malicious actors invisibly using social media, without
necessarily worrying about how exactly platforms could prevent such activity,
or the implications of major interventions (e.g. invasive identity-verification
measures). There are no easy solutions to these challenges, given the complex
trade-offs and differing stakeholder incentives at play.
While scholars strive to unpack the architectures of contemporary media
manipulation, and legislators seek to understand the impact of social media
on elections and political processes, the corporate actors involved will nat-
urally weigh disclosures against their bottom line and reputations. For this
reason, the contemporary debates about information quality, disinformation,
and “fake news”—within which lie the questions of automation and content
policy discussed in this article—cannot exist separately from the broader de-
bates about technology policy and governance. Of the policy and research
challenges discussed in this last section, this is the most difficult issue moving
forward: conceptual ambiguity can be reduced by diligent scholarship, and
researchers can work to improve detection models, but business incentives
will not shift on their own. As a highly political, topical, and important
technology policy issue, the question of political automation raises a number of
fundamental questions about platform responsibility and governance that
have yet to be fully explored by scholars.
5 Conclusion
Amidst immense public pressure, policymakers are trying to understand how
to respond to the apparent manipulation of the emerging architectures of
digitally enabled political influence. Admittedly, the debate around bots
and other forms of political automation is only in its embryonic stages; how-
ever, we predict that it will be a far more central component of future de-
bates around the political implications of social media, political polarization,
and the effects of “fake news,” hoaxes, and misinformation. For this to
happen, however, far more work will be needed to unpack the conceptual
mishmash of the current bot landscape. A brief review of the relevant schol-
arship shows that the notion of what exactly a “bot” is remains vague and
ill-defined. Given the obvious technology policy challenges that these am-
biguities present, we hope that others will expand on the basic framework
presented here and continue the work through definitions, typologies, and
conceptual mapping exercises.
Quantitative studies have recently made notable progress in the ability
to identify and measure bot influence on the diffusion of political messages,
providing promising directions for future work (Vosoughi et al. 2018). How-
ever, we expect that to maximize the benefits of these studies for developing
policy, their methods and results need to be coupled with a clearer theo-
retical foundation and understanding of the types of bots being measured
and analyzed. Although the relevant literature has expanded significantly
in the past two years, there has been little of the definitional debate and
the theoretical work one would expect: much of the recent theoretical and
ethnographic work on bots is not in conversation with current quantitative
efforts to measure bots and their impact. As a result, qualitative and quan-
titative approaches to bot research have yet to establish a common typology
for interpreting the outputs of these research communities, thereby requiring
policymakers to undertake unwieldy synthetic work in defining bots and their
impact in their effort to pursue evidence-based policy. As a translational
effort between quantitative and qualitative research, the typology developed
in this article aims to provide a framework for facilitating the cumulative
development of shared concepts and measurements regarding bots, media
manipulation, and political automation more generally, with the ultimate
goal of providing clearer guidance in the development of bot policy.
Beyond the conceptual ambiguities discussed in this article, there are sev-
eral other challenges that face the researchers, policymakers, and journalists
trying to understand and accurately engage with politically relevant forms of
online automation moving forward. These, most pressingly, include imper-
fect bot detection methods and an overall lack of reliable data. Future work
will be required to engage deeply with the question of what can be done to
overcome these challenges of poor measurement, data access, and—perhaps
most importantly—the intricate layers of overlapping public, corporate, and
government interests that define this issue area.
6 References
Abokhodair, Norah, Daisy Yoo, and David W. McDonald. 2015. “Dissecting
a Social Botnet: Growth, Content and Influence in Twitter.” In, 839–51.
Adams, Terrence. 2017. “AI-Powered Social Bots.” arXiv:1706.05143
[Cs], June.
Al-Rawi, Ahmed K. 2014. “Cyber Warriors in the Middle East: The Case
of the Syrian Electronic Army.” Public Relations Review 40 (3): 420–28.
Albright, Jonathan. 2017. “Itemized Posts and Historical Engagement -
6 Now-Closed FB Pages.”
Alvisi, Lorenzo, Allen Clement, Alessandro Epasto, Silvio Lattanzi, and
Alessandro Panconesi. 2013. “Sok: The Evolution of Sybil Defense via
Social Networks.” In Security and Privacy (SP), 2013 IEEE Symposium on,
382–96. IEEE.
Bastos, M. T., and D. Mercea. 2017. “The Brexit Botnet and User-
Generated Hyperpartisan News.” Social Science Computer Review, September.
Bessi, Alessandro, and Emilio Ferrara. 2016. “Social Bots Distort the
2016 U.S. Presidential Election Online Discussion.” First Monday 21 (11).
Bogost, Ian, and Nick Montfort. 2009. “Platform Studies: Frequently Ques-
tioned Answers.” In Proceedings of the Digital Arts and Culture Conference,
Irvine, CA, December 12–15.
Boshmaf, Yazan, Ildar Muslukhov, Konstantin Beznosov, and Matei Ri-
peanu. 2011. “The Socialbot Network: When Bots Socialize for Fame and
Money.” In Proceedings of the 27th Annual Computer Security Applications
Conference, 93–102. ACSAC ’11. New York, NY, USA: ACM.
———. 2013. “Design and Analysis of a Social Botnet.” Computer
Networks, Botnet Activity: Analysis, Detection and Shutdown, 57 (2): 556–
boyd, danah. 2012. “The Politics of Real Names.” Communications of
the ACM 55 (8): 29–31.
Brunton, Finn. 2012. “Constitutive Interference: Spam and Online Com-
munities.” Representations 117 (1): 30–58.
———. 2013. Spam: A Shadow History of the Internet. MIT Press.
Bu, Zhan, Zhengyou Xia, and Jiandong Wang. 2013. “A Sock Pup-
pet Detection Algorithm on Virtual Spaces.” Knowledge-Based Systems 37
(January): 366–77.
Cao, Qiang, Michael Sirivianos, Xiaowei Yang, and Tiago Pregueiro.
2012. “Aiding the Detection of Fake Accounts in Large Scale Social Online
Services.” USENIX Association.
Cardullo, Paolo. 2015. “‘Hacking Multitude’ and Big Data: Some In-
sights from the Turkish ‘Digital Coup’.” Big Data & Society 2 (1): 2053951715580599.
Chen, Adrian. 2015. “The Agency.” The New York Times, June.
Chu, Zi, Steven Gianvecchio, Haining Wang, and Sushil Jajodia. 2010.
“Who Is Tweeting on Twitter: Human, Bot, or Cyborg?” In Proceedings of
the 26th Annual Computer Security Applications Conference, 21–30. ACM.
Coleman, E. Gabriella. 2012. “Phreaks, Hackers, and Trolls: The Politics
of Transgression and Spectacle.” In The Social Media Reader, edited by
Mandiberg, Michael. New York: New York University Press.
Davis, Clayton Allen, Onur Varol, Emilio Ferrara, Alessandro Flammini,
and Filippo Menczer. 2016. “BotOrNot: A System to Evaluate Social Bots.”
In Proceedings of the 25th International Conference Companion on World
Wide Web, 273–74. 2889302: International World Wide Web Conferences
Steering Committee.
Deryugina, OV. 2010. “Chatterbots.” Scientific and Technical Informa-
tion Processing 37 (2): 143–47.
Edwards, Chad, Autumn Edwards, Patric R. Spence, and Ashleigh K.
Shelton. 2014. “Is That a Bot Running the Social Media Feed? Testing the
Differences in Perceptions of Communication Quality for a Human Agent
and a Bot Agent on Twitter.” Computers in Human Behavior 33: 372–76.
Ferrara, Emilio. 2017. “Disinformation and Social Bot Operations in
the Run up to the 2017 French Presidential Election.” arXiv:1707.00086
[Physics], June.
Ferrara, Emilio, Onur Varol, C. Davis, F. Menczer, and A. Flammini.
2016. “The Rise of Social Bots.” Communications of the ACM 59 (7):
Floridi, Luciano. 2014. The Fourth Revolution: How the Infosphere Is
Reshaping Human Reality. Oxford University Press.
Folstad, Asbjørn, and Petter Bae Brandtzaeg. 2017. “Chatbots and the
New World of HCI.” Interactions 24 (4): 38–42.
Ford, Heather, Elizabeth Dubois, and Cornelius Puschmann. 2016. “Keep-
ing Ottawa Honest - One Tweet at a Time? Politicians, Journalists, Wikipedi-
ans and Their Twitter Bots.” International Journal of Communication 10:
Franklin, Stan, and Art Graesser. 1996. “Is It an Agent, or Just a
Program?: A Taxonomy for Autonomous Agents.” In Intelligent Agents
III Agent Theories, Architectures, and Languages, 21–35. Lecture Notes in
Computer Science. Springer, Berlin, Heidelberg.
Freedom House. 2016. “Freedom on the Net Report: Ecuador.”
Geiger, Stuart. 2014. Bots, bespoke, code and the materiality of software
platforms. Information, Communication & Society 17 (3): 342–356.
Gilani, Zafar, Jon Crowcroft, Reza Farahbakhsh, and Gareth Tyson.
2017. “The Implications of Twitterbot Generated Data Traffic on Networked
Systems.” In Proceedings of the SIGCOMM Posters and Demos, 51–53. SIG-
COMM Posters and Demos ’17. New York.
Gilani, Zafar, Reza Farahbakhsh, and Jon Crowcroft. 2017. “Do Bots
Impact Twitter Activity?” In Proceedings of the 26th International Confer-
ence on World Wide Web Companion, 781–82. International World Wide
Web Conferences Steering Committee.
Gillespie, Tarleton. 2015. “Platforms Intervene.” Social Media + Society
1 (1): 2056305115580479.
Gillespie, Tarleton. 2018. Custodians of the Internet: Platforms, Content
Moderation, and the Hidden Decisions that Shape Social Media. New Haven:
Yale University Press.
Glaser, April. 2017. “Twitter Could Do a Lot More to Curb the Spread
of Russian Misinformation.” Slate, October.
Global Voices. 2017. “Netizen Report: Vietnam Says Facebook Will Cooperate
with Censorship Requests on Offensive and ‘Fake’ Content.”
Golumbia, David. 2013. “Commercial Trolling: Social Media and the Corporate
Deformation of Democracy.” SSRN Scholarly Paper, July 31.
Gorwa, Robert. 2017. “Computational Propaganda in Poland: False
Amplifiers and the Digital Public Sphere.” Project on Computational Pro-
paganda Working Paper Series: Oxford, UK.
Gorwa, Robert, and Douglas Guilbeault. 2017. “Tinder Nightmares: The
Promise and Peril of Political Bots.” WIRED UK, July.
Grimmelmann, James. 2015. “The Law and Ethics of Experiments on
Social Media Users.” SSRN Scholarly Paper ID 2604168. Rochester, NY:
Social Science Research Network.
Gudivada, Venkat N, Vijay V Raghavan, William I Grosky, and Rajesh
Kasanagottu. 1997. “Information Retrieval on the World Wide Web.” IEEE
Internet Computing 1 (5): 58–68.
Hayati, Pedram, Kevin Chai, Vidyasagar Potdar, and Alex Talevski.
2009. “HoneySpam 2.0: Profiling Web Spambot Behaviour.” In Principles
of Practice in Multi-Agent Systems, 335–44.
Hogan, Bernie. 2012. “Pseudonyms and the Rise of the Real-Name Web.”
SSRN Scholarly Paper ID 2229365. Rochester, NY: Social Science Research
Network.
Karpf, David. 2017. “People Are Hyperventilating over a Study of Rus-
sian Propaganda on Facebook. Just Breathe Deeply.” Washington Post.
King, Gary, Jennifer Pan, and Margaret E. Roberts. 2017. “How the Chinese
Government Fabricates Social Media Posts for Strategic Distraction, Not
Engaged Argument.” American Political Science Review 111 (3): 484–501.
Koster, Martijn. 1996. “A Method for Web Robots Control.” IETF
Network Working Group, Internet Draft.
Lazarsfeld, Paul Felix, and Allen H. Barton. 1957. Qualitative Measure-
ment in the Social Sciences: Classification, Typologies, and Indices. Stanford
University Press.
Lee, Kyumin, Brian David Eoff, and James Caverlee. 2011. “Seven
Months with the Devils: A Long-Term Study of Content Polluters on Twit-
ter.” In AAAI International Conference on Weblogs and Social Media (ICWSM).
Lee, Kyumin, Steve Webb, and Hancheng Ge. 2014. “The Dark Side
of Micro-Task Marketplaces: Characterizing Fiverr and Automatically De-
tecting Crowdturfing.” In International Conference on Weblogs and Social
Media (ICWSM).
Leonard, Andrew. 1997. Bots: The Origin of New Species. San Francisco:
HardWired.
Lokot, Tetyana, and Nicholas Diakopoulos. 2016. “News Bots: Automat-
ing News and Information Dissemination on Twitter.” Digital Journalism 4
(6): 682–699.
Manjoo, Farhad, and Kevin Roose. 2017. “How to Fix Facebook? We
Asked 9 Experts.” The New York Times, October.
Marwick, Alice, and Rebecca Lewis. 2017. “Media Manipulation and
Disinformation Online.” Data and Society Research Institute Report.
Maus, Gregory. 2017. “A Typology of Socialbots (Abbrev.).” In Pro-
ceedings of the 2017 ACM on Web Science Conference, 399–400. WebSci ’17.
New York, NY, USA: ACM.
Metaxas, Panagiotis T., and Eni Mustafaraj. 2012. “Social Media and the
Elections.” Science 338 (6106): 472–73.
Mitter, Silvia, Claudia Wagner, and Markus Strohmaier. 2014. “Under-
standing the Impact of Socialbot Attacks in Online Social Networks.” arXiv
Preprint arXiv:1402.6289.
Moore, Tyler, and Ross Anderson. 2012. “Internet Security.” In The
Oxford Handbook of the Digital Economy. Oxford University Press.
Munger, Kevin. 2017. “Tweetment Effects on the Tweeted: Experimen-
tally Reducing Racist Harassment.” Political Behavior 39 (3): 629–49.
Napoli, Philip, and Robyn Caplan. 2017. “Why Media Companies Insist
They’re Not Media Companies, Why They’re Wrong, and Why It Matters.”
First Monday 22 (5).
Neff, Gina, and Dawn Nafus. 2016. Self-Tracking. MIT Press.
Neff, Gina, and Peter Nagy. 2016. “Talking to Bots: Symbiotic Agency and
the Case of Tay.” International Journal of Communication 10: 4915–4931.
Olston, Christopher, and Marc Najork. 2010. “Web Crawling.” Founda-
tions and Trends in Information Retrieval 4 (3): 175–246.
Pant, Gautam, Padmini Srinivasan, and Filippo Menczer. 2004. “Crawl-
ing the Web.” In Web Dynamics: Adapting to Change in Content, Size,
Topology and Use, edited by Mark Levene and Alexandra Poulovassilis. Springer
Science & Business Media.
Pearce, Katy E. 2015. “Democratizing Kompromat: The Affordances of
Social Media for State-Sponsored Harassment.” Information, Communica-
tion & Society 18 (10): 1158–74.
Phillips, Whitney. 2015. This Is Why We Can’t Have Nice Things:
Mapping the Relationship Between Online Trolling and Mainstream Culture.
Cambridge, Massachusetts: MIT Press.
Ratkiewicz, Jacob, Michael Conover, Mark Meiss, Bruno Gonçalves, Snehal
Patil, Alessandro Flammini, and Filippo Menczer. 2011. “Truthy: Mapping
the Spread of Astroturf in Microblog Streams.” In Proceedings of the
20th International Conference Companion on World Wide Web, 249–52.
Read, Max. 2017. “Does Even Mark Zuckerberg Know What Facebook
Is?” New York Magazine.
Rodríguez-Gómez, Rafael A., Gabriel Maciá-Fernández, and Pedro García-
Teodoro. 2013. “Survey and Taxonomy of Botnet Research Through Life-
Cycle.” ACM Computing Surveys (CSUR) 45 (4): 45.
Rosenberg, Yair. 2017. “Confessions of a Digital Nazi Hunter.” The New
York Times, December.
Sansonnet, Jean-Paul, David Leray, and Jean-Claude Martin. 2006. “Ar-
chitecture of a Framework for Generic Assisting Conversational Agents.”
In Intelligent Virtual Agents, 145–56. Lecture Notes in Computer Science.
Springer, Berlin, Heidelberg.
Sartori, Giovanni. 1970. “Concept Misformation in Comparative Poli-
tics.” American Political Science Review 64 (4): 1033–53.
Savage, Saiph, Andres Monroy-Hernandez, and Tobias Hollerer. 2015.
“Botivist: Calling Volunteers to Action Using Online Bots.” arXiv Preprint.
Shane, Scott. 2017. “The Fake Americans Russia Created to Influence
the Election.” The New York Times.
Shao, Chengcheng, Giovanni Luca Ciampaglia, Onur Varol, Alessandro
Flammini, and Filippo Menczer. 2017. “The Spread of Fake News by Social
Bots.” arXiv:1707.07592 [Physics], July.
Stieglitz, Stefan, Florian Brachten, Björn Ross, and Anna-Katharina
Jung. 2017. “Do Social Bots Dream of Electric Sheep? A Categorisation of
Social Media Bot Accounts.” arXiv:1710.04044 [Cs], October.
Stretch, Colin. 2017. “Facebook to Provide Congress with Ads Linked
to Internet Research Agency.” FB Newsroom.
Suárez-Serrato, Pablo, Margaret E. Roberts, Clayton Davis, and Filippo
Menczer. 2016. “On the Influence of Social Bots in Online Protests.” In
Social Informatics, 269–78. Lecture Notes in Computer Science. Springer.
Subrahmanian, V. S., Amos Azaria, Skylar Durst, Vadim Kagan, Aram
Galstyan, Kristina Lerman, Linhong Zhu, et al. 2016. “The DARPA Twitter
Bot Challenge.” Computer 49 (6): 38–46.
Thomas, Kurt, Chris Grier, and Vern Paxson. 2012. “Adapting Social
Spam Infrastructure for Political Censorship.” In LEET.
Tsvetkova, Milena, Ruth García-Gavilanes, Luciano Floridi, and Taha
Yasseri. 2017. “Even Good Bots Fight: The Case of Wikipedia.” PLOS
ONE 12 (2): e0171774.
Tucker, Joshua A., Yannis Theocharis, Margaret E. Roberts, and Pablo
Barberá. 2017. “From Liberation to Turmoil: Social Media and Democracy.”
Journal of Democracy 28 (4): 46–59.
Twitter Policy. 2017. “Update: Russian Interference in 2016 US Election,
Bots, & Misinformation.”
———. 2018. “Update on Twitter’s Review of the 2016 U.S. Election.”
Tworek, Heidi. 2017. “How Germany Is Tackling Hate Speech.” Foreign
Affairs.
Verkamp, John-Paul, and Minaxi Gupta. 2013. “Five Incidents, One
Theme: Twitter Spam as a Weapon to Drown Voices of Protest.” In FOCI.
Vosoughi, Soroush, Deb Roy, and Sinan Aral. 2018. “The Spread of True
and False News Online.” Science 359 (6380): 1146–1151.
Wang, Selina. 2018. “California Would Require Twitter, Facebook to
Disclose Bots.” Bloomberg, April.
Weedon, Jen, William Nuland, and Alex Stamos. 2017. “Information
Operations and Facebook.” Facebook Security White Paper.
Weinstein, Lauren. 2003. “Spam Wars.” Communications of the ACM
46 (8): 136.
Weizenbaum, Joseph. 1966. “ELIZA—a Computer Program for the
Study of Natural Language Communication Between Man and Machine.”
Communications of the ACM 9 (1): 36–45.
Woolley, Samuel C. 2016. “Automating Power: Social Bot Interference
in Global Politics.” First Monday 21 (4).
Woolley, Samuel C., and Philip N. Howard. 2016. “Political Communica-
tion, Computational Propaganda, and Autonomous Agents — Introduction.”
International Journal of Communication 10 (October): 4882–90.
Wojcik, Stefan, Solomon Messing, Aaron Smith, Lee Rainie, and Paul
Hitlin. 2018. “Bots in the Twittersphere.” Pew Research Center.
Yang, Zhi, Christo Wilson, Xiao Wang, Tingting Gao, Ben Y Zhao, and
Yafei Dai. 2014. “Uncovering Social Network Sybils in the Wild.” ACM
Transactions on Knowledge Discovery from Data (TKDD) 8 (1): 1–29.
Yao, Yuanshun, Bimal Viswanath, Jenna Cryan, Haitao Zheng, and Ben
Y. Zhao. 2017. “Automated Crowdturfing Attacks and Defenses in Online
Review Systems.” arXiv:1708.08151 [Cs].