Journal of Information Warfare
Volume 15, Issue 4
Fall 2016

Contents

From the Editor (L Armistead) i
Authors iii
Rhizomatic Target Audiences of the Cyber Domain (M Sartonen, A-M Huhtinen and M Lehto) 1
Exploring the Complexity of Cyberspace Governance: State Sovereignty, Multi-stakeholderism, and Power Politics (A Liaropoulos) 14
Applying Principles of Reflexive Control in Information and Cyber Operations (ML Jaitner and MAJ H Kantola) 27
Utilising Journey Mapping and Crime Scripting to Combat Cybercrime and Cyber Warfare Attacks (T Somer, B Hallaq and T Watson) 39
Disinformation in Hybrid Warfare: The Rhizomatic Speed of Social Media in the Spamosphere (A-M Huhtinen and J Rantapelkonen) 50
Security-Information Flow in the South African Public Sector (H Patrick, B van Niekerk and Z Fields) 68
South Korea’s Options in Responding to North Korean Cyberattacks (J Park, N Rowe and M Cisneros) 86
Understanding the Trolling Phenomenon: The Automated Detection of Bots and Cyborgs in the Social Media (J Paavola, T Helo, H Jalonen, M Sartonen and A-M Huhtinen) 100
Authors

Captain Maribel Cisneros, United States Army, is a Military Intelligence Officer assigned to the U.S. Army Cyber Command. She was commissioned as a Second Lieutenant in the Military Intelligence Branch in 2007, and has served as Assistant S1, Platoon Leader, Company Executive Officer, Battalion S2, and Company Commander. She deployed multiple times to USSOUTHCOM as a Mission Manager and Battle Captain, and to USCENTCOM as Task Force OIC. She earned a master’s degree in computer systems and operations from the Naval Postgraduate School and a master’s degree in management and leadership from Webster University.

Dr. Ziska Fields is an associate professor and academic leader at the University of KwaZulu-Natal, South Africa. Her research interests focus on creativity, entrepreneurship, human resources, and higher education. She developed two theoretical models to measure creativity in South Africa, focusing on youth and tertiary education. She has published in internationally recognized journals and edited books. She is the editor of the book Incorporating business models and strategies into social entrepreneurship and has completed another book titled Collective creativity for responsible and sustainable business practice. She is a member of the South African Institute of Management, the Ethics Institute of South Africa, and the Institute of People Management.

Bil Hallaq is a cyber security researcher with more than 15 years of academic, commercial, and industrial experience. He previously spent several years handling and mitigating various security threats and vulnerabilities within commercial environments. He is delivering on various projects, including the identification and application of novel techniques for OSINT and the EU E-CRIME Project, which comprises several European partners including Interpol, where he is working with partners on understanding criminal structures and mapping cybercriminal activities to produce and recommend effective countermeasures. His other applied research areas include identifying methods and techniques for cross-border cyber attack attribution, mitigation at scale of complex multi-jurisdictional cyber events, and maritime and rail cyber security. He holds several professional qualifications, including penetration testing, incident response, malware investigation, and digital forensics investigation, amongst others.

Tuomo Helo is a senior lecturer at Turku University of Applied Sciences, Turku, Finland. He earned a master’s degree in information systems and a master’s degree in economics from the University of Turku, Finland. His current research interests include text analytics and data mining in general. He has also completed research in the fields of health economics and the economics of education.

Dr. Aki-Mauri Huhtinen (LTC [GS]) is a military professor in the Department of Leadership and Military Pedagogy at the Finnish National Defence University, Helsinki, Finland. His areas of expertise are military leadership, command and control, the philosophy of science in military organisational research, and the philosophy of war. He has published peer-reviewed journal articles, a book chapter, and books on information warfare and non-kinetic influence in the battle space. He has also organised and led several research and development projects in the Finnish Defence Forces from 2005 to 2015.

Margarita Levin Jaitner is a researcher in the area of information warfare and cyberspace, with a particular focus on Russian operations, at the Swedish Defence University, Stockholm, Sweden. She is also a Fellow at the Blavatnik Interdisciplinary Cyber Research Center. She has previously conducted research at the Finnish National Defence University as well as at the Yuval Ne’eman Workshop for Security, Science and Technology in Tel Aviv. She earned a master’s degree in societal risk management and a bachelor’s degree in political science.

Dr. Harri Jalonen is a principal lecturer and research group leader (AADI) at the Turku University of Applied Sciences, Turku, Finland. He also holds a position as an adjunct professor at the University of Vaasa. He has research experience dealing with knowledge and innovation management and digitalisation issues in different organisational contexts. He has published more than 100 articles in these fields. He is one of the most cited researchers in the field of complexity thinking in Finland. He has managed or been involved in many international and national research projects. In addition, he has supervised several thesis projects, including doctoral theses. He is a reviewer for many academic journals and a committee member for international conferences.

Major Harry Kantola teaches and conducts research at the Finnish National Defence University, Helsinki, Finland. He is also currently appointed to the Finnish Defence Command as a Cyber Defence planner in the C5 (J6) branch. He joined the Finnish Defence Forces in 1991. He has served in various capacities (CSO, CIO) in the Finnish Navy, Armoured Signal Company, and Armoured Brigade. From 2014 to 2016, he served an appointment as a researcher at the NATO Cooperative Cyber Defence Centre of Excellence (NATO CCD COE), Tallinn, Estonia.

Dr. Martti Lehto (Col., retired) works as a cyber security and cyber defence professor of practice in the Department of Mathematical Information Technology at the University of Jyväskylä, Jyväskylä, Finland. He has more than 30 years of experience as a developer and leader of C4ISR systems in the Finnish Defence Forces. He has more than 75 publications, research reports, and articles in the areas of C4ISR systems, cyber security and defence, information warfare, air power, and defence policy.

Dr. Andrew Liaropoulos is an assistant professor in the Department of International and European Studies at the University of Piraeus, Greece. He also teaches at the Joint Staff War College, the Joint Military Intelligence College, the National Security College, the Air War College, and the Naval Staff Command College. His research interests include international security, intelligence reform, strategy, military transformation, foreign policy analysis, cyber security, and Greek security policy. He also serves as a senior analyst at the Research Institute for European and American Studies (RIEAS) and as the assistant editor of the Journal of Mediterranean and Balkan Intelligence.

Dr. Jarkko Paavola is a research team leader and a principal lecturer at Turku University of Applied Sciences, Turku, Finland. He earned his doctoral degree in technology in the field of wireless communications from the University of Turku, Finland. His current research interests include information security and privacy, dynamic spectrum sharing, and information security architectures for systems utilising spectrum sharing.

Major Jimin Park, Republic of Korea Air Force, is a Cyber Intel-Ops Officer assigned to ROK Cyber Command. His previous assignments have included the 37th Air Intelligence Group, and serving as an Intel-Ops Officer and an Intel-Watch Officer at Osan AFB with the U.S. 7th Air Force. In 2007, he deployed to Ali Al Salem AFB in Kuwait with the U.S. Central Command and the 386th Expeditionary Wing as part of Operation Iraqi Freedom. He earned a master’s degree in computer science from the U.S. Naval Postgraduate School.

Dr. Harold Patrick is a forensic investigation specialist at the University of KwaZulu-Natal, South Africa. He completed his doctorate at the University of KwaZulu-Natal in 2016. His dissertation focused on information security, collaboration, and the flow of security information. He earned a master’s degree in information systems and technology and is a Certified Fraud Examiner.

Dr. Jari Rantapelkonen (LTC, retired) is a professor emeritus at the Finnish National Defence University, Helsinki, Finland. His areas of expertise include operational art and tactics, military leadership, information warfare, and the philosophy of war. He has served in Afghanistan, the Balkans, and the Middle East. He is the mayor of the municipality of Enontekiö in Arctic Finland.

Dr. Neil C. Rowe is a professor of computer science at the U.S. Naval Postgraduate School (Monterey, CA, USA), where he has been since 1983. He earned a doctorate in computer science from Stanford University (1983). His main research interests are data mining, digital forensics, modelling of deception, and cyber warfare.

Miika Sartonen is a researcher at the Finnish Defence Research Agency and a doctoral student at the National Defence University.

Tiia Sõmer is an early-stage researcher at Tallinn University of Technology, Tallinn, Estonia. Her research focuses on cyber crime and cyber forensics; she leads TUT’s work on the EU E-CRIME project, a three-year European Union project researching the economic aspects of cyber crime. In addition, she has taught cyber security at the strategic level and prepared students for international cyber-defence policy-level competitions at TUT. Before starting an academic career, she served for more than 20 years in the Estonian defence forces, including teaching at the staff college; working in diplomatic positions at national, NATO, and EU levels; and, most recently, working at the EDF HQ cyber security branch. Her master’s thesis, titled “Educational Computer Game for Cyber Security: A Game Concept”, focused on using games in the teaching of cyber security. She is currently completing Ph.D.-level studies, focusing on journey mapping and its application in understanding and solving cyber incidents.

Dr. Brett van Niekerk is a senior security analyst at Transnet and an Honorary Research Fellow at the University of KwaZulu-Natal, South Africa. He graduated from the University of KwaZulu-Natal with his doctorate in 2012 and has completed two years of postdoctoral research into information operations, information warfare, and critical infrastructure protection. He serves on the board of ISACA South Africa and as secretary of the International Federation for Information Processing’s Working Group 9.10 on ICT in Peace and War. He has contributed to the ISO/IEC information security standards and has multiple presentations, papers, and book chapters in information security and information warfare to his name. He earned bachelor’s and master’s degrees in electronic engineering.

Professor Tim Watson is the Director of the Cyber Security Centre at WMG within the University of Warwick, Coventry, UK. He has more than 25 years’ experience in the computing industry and in academia and has been involved with a wide range of computer systems on several high-profile projects. In addition, he has served as a consultant for some of the largest telecoms, power, and oil companies. He is an adviser to various parts of the UK government and to several professional and standards bodies. His current research includes EU-funded projects on combating cyber-crime; UK MoD research into automated defence, insider threat, and secure remote working; and EPSRC-funded research focusing on the protection of critical national infrastructure against cyber-attack. He is a regular media commentator on digital forensics and cyber security.
Journal of Information Warfare (2016) 15.4: 100-111
ISSN 1445-3312 print/ISSN 1445-3347 online
Understanding the Trolling Phenomenon: The Automated Detection of Bots
and Cyborgs in the Social Media
J Paavola1, T Helo1, H Jalonen1, M Sartonen2, A-M Huhtinen2
1Turku University of Applied Sciences
Turku, Finland
E-mail: jarkko.paavola@turkuamk.fi; tuomo.helo@turkuamk.fi; harri.jalonen@turkuamk.fi
2Finnish National Defence University
Helsinki, Finland
E-mail: miika.sartonen@mil.fi; aki.huhtinen@mil.fi
Abstract: Social media has become a place for discussion and debate on controversial topics
and, thus, provides an opportunity to influence public opinion. This possibility has given rise to a
specific behaviour known as trolling, which can be found in almost every discussion that
includes emotionally appealing topics. Trolling is a useful tool for any organisation willing to
force a discussion off-track when one has no proper facts to back one’s arguments. Previous
research has indicated that social media analytics tools can be utilised for automated detection
of trolling. This paper provides tools for detecting message automation utilised in trolling.
Keywords: Social Media, Stakeholder, Trolling, Sentiment Analysis, Bot, Cyborg
Introduction
The current stage in the evolution of information is one in which the unpredictability of its
effects is accelerating. The volume of information is growing, and its structure is becoming
increasingly opaque. Information can no longer be seen as a system or as the extent of one’s
knowledge, but must rather be seen as an entity that has started to live a life of its own. Thus,
information provides its own energy and is its own enemy. In most cases, information is also a
source of beneficial development and can improve people’s quality of life. It is essential,
however, to understand that it can also unleash danger and adversity.
Due to the plethora of information available, people are not always able to determine whether information is valid, and they consequently tend to make hasty assumptions from the data they have. This tendency is exploited by ‘trolling’, which the media has in recent years come to equate
with online harassment. Because of trolling, it is becoming increasingly difficult to pinpoint
where information originates and where it leads (Malgin 2015).
In today’s era of information overload, individuals and groups try to get their messages across by
using forceful language, by engaging in dramatic (even violent) actions, or by posting video clips
or pictures on social media (Nacos, Bloch-Elkon & Shapiro 2011, p. 48). The politically-driven
mass media is most probably behind this information overload on individuals. Aggressive
behaviour is increasing in social media because of the technical ease with which trolling can be
carried out. In social media, all kinds of values become interwoven with each other.
Information is essentially a product of engineering science. In order to expand the sphere of
understanding to information as a part of human social life, one has to step outside of the ‘hard
sciences’ realm. The social sciences’ viewpoint is especially called for when discussing possible
threats and human fears connected with information. The rise in diverse Internet threats has
opened up the discussion of the possibility of nation states extending their capacity to control
information networks, including citizens’ private communications.
Computer culture theorists have identified the richly interconnected, heterogeneous, and
somewhat anarchic aspect of the Internet as a rhizomic social condition (Coyne 2014). During
the past quarter of a century, the usefulness of the Internet has permeated all domains
(individual, social, political, military, and business). Everyone worldwide can use the Internet
without any specific education, as the skills needed for communicating in the social media are
easy to acquire. At the same time, work-related and official messages run parallel with private
communications. Similarly, emotions and rational thinking may easily become intertwined due
to the ease and immediacy of our communications. Deleuze and Guattari (1983) use the terms
“rhizome” and “rhizomatic” to describe this non-hierarchical, nomadic, and easy environment,
particularly in relation to how individuals behave in this kind of environment. The current status
of the technological evolution of the Internet can also be said to be based on the rhizome
concept. The rhizome resists the organizational structure of the root-tree system, which charts
causality along chronological lines, and looks for the original source of ‘things’, as well as
toward the pinnacle or conclusion of those ‘things’. Any point on a rhizome can be connected
with any other. A rhizome can be cracked and broken at any point; it starts off again following
one or another of its lines, or even other lines (Deleuze & Guattari 1983, p. 15).
Why is trolling so easy to implement in rhizomatic information networks? The intercontinental
network of communication is not an organized structure: it has no central head or decision-
maker; it has no central command or hierarchies to quell undesired behaviour. The rhizomatic
network is simply too big and diffuse to be managed by a central command. By the same token,
rhizomatic organizations are often highly creative and innovative. The rhizome presents history
and culture as a map, or a wide array of attractions and influences with no specific origin or
genesis, for a “rhizome has no beginning or end; it is always ‘becoming’ in the middle and
between things” (Deleuze & Guattari 1983). One example of the diversity of rhizome networks
is fakeholder behaviour.
This paper continues the work done by Paavola and Jalonen (2015), who examined whether
sentiment analysis could be utilised in detecting trolling behaviour. Sentiment analysis refers to
the automatic classification of messages into positive, negative, or neutral within a discussion
topic. In that work, Paavola and Jalonen concluded that sentiment analysis as such cannot detect
trolls, but results indicated that social media analytics tools can generally be utilised for this task.
Here, the authors’ goal is to investigate the trolling phenomenon further.
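Sentiment analysis of this kind is typically implemented as supervised text classification. The snippet below is a minimal illustrative sketch of a three-class classifier, assuming scikit-learn; it is not the authors’ tool (whose implementation is not given in the paper), and the training examples are hypothetical.

```python
# Minimal three-class sentiment classifier sketch (illustrative only,
# not the authors' tool). Training examples here are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Great initiative, well done!",   # hypothetical labelled examples
    "This policy is a disgrace.",
    "The meeting starts at 10.",
]
train_labels = ["positive", "negative", "neutral"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)
print(clf.predict(["What a wonderful idea!"]))  # e.g. ['positive']
```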
To facilitate analysis, a sentiment analysis tool (Paavola & Jalonen 2015) was further developed
to detect message automation, which creates ‘noise’ in social media and makes it difficult to
observe behavioural changes among human users of the social media. Paavola and Jalonen’s
work followed studies performed by Chu et al. (2012), Dickerson, Kagan, and Subrahmanian
(2014), and Clark et al. (2015) in which bot detection systems were devised. Components such
as message timing behaviour, spam detection, account properties, and linguistic attributes were
investigated. Variables were designed based on those components, and they were utilised in
order to categorize message senders as humans, bots, or cyborgs. A bot refers to computer software that generates messages automatically, whereas a cyborg in this context refers either to
a bot-assisted human or to a human-assisted bot.
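As a hedged illustration of how such components might be combined into a per-account feature vector for the human/bot/cyborg classification (the field names are ours, not the exact variables of the cited studies):

```python
# Illustrative per-account feature vector covering the component types
# named above: timing, spam, account properties, and linguistic attributes.
# Field names are hypothetical, not the cited studies' exact variables.
from dataclasses import dataclass

@dataclass
class AccountFeatures:
    tweet_interval_entropy: float     # timing behaviour (very regular gaps suggest automation)
    spam_word_ratio: float            # spam-detection component
    followers_to_friends: float       # account property
    avg_urls_per_tweet: float         # linguistic attribute
    avg_lexical_dissimilarity: float  # linguistic attribute

def as_vector(f: AccountFeatures) -> list[float]:
    """Flatten the features for a classifier outputting human/bot/cyborg."""
    return [f.tweet_interval_entropy, f.spam_word_ratio,
            f.followers_to_friends, f.avg_urls_per_tweet,
            f.avg_lexical_dissimilarity]
```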
This paper is organized as follows: First, the social media phenomenon is described, which gives
context to the trolling behaviour. Then, trolling behaviour in the social media is analysed and the
mechanisms characteristic of trolling are discussed. Before algorithms can be trained to detect trolls, the definition of a troll has to be precise, which is not the case in the current literature.
Therefore, the trolling phenomenon is discussed before proceeding to the automated detection. In
the experimental part of the paper, automated bot and cyborg detection is applied to Twitter
messages. Finally, the discussion section provides future directions for troll detection and how to
defend against the trolling phenomenon.
Social Media as a Public Sphere
The promise of social media is not confined to technology, but involves cultural, societal, and
economic consequences. Social media refers herein to a constellation of Internet-based
applications that derive their value from the participation of users through directly creating
original content, modifying existing material, contributing to a community dialogue, and
integrating various media together to create something unique (Kaplan & Haenlein 2010). Social
media has engendered three changes: 1) the locus of activity shifts from the desktop to the web;
2) the locus of power shifts from the organization to the collective; and 3) the locus of value
creation shifts from the organization to the consumer (Berthon et al. 2012).
Social media has become integrated into the lives of postmodern people. Globally, more than
two billion people use social media on a daily basis. Whether it’s a question of the comments of
statesmen, opposition leaders’ criticism, or celebrities’ publicity tricks, social media offers an
authentic information source and effective communication channel. Social media enables
interaction between friends and strangers at the same time that it lowers the threshold of contact
and personalizes communication. In a way, social media has made the world smaller.
Social media has brought with it ‘media life’, which Deuze (2011) calls “the state where media has become so inseparable from us that we do not live with media, but in it” (Karppi 2014, p.
22). In a hyper-connected network society, posts on Twitter cause stock market crashes and
overthrow governments (Pentland 2014). Unsurprisingly, life in social media is as messy as it is
in the real world. Social media exposes people to new information and ideas and reflects their everyday highs and lows; it allows users to form new friendships and break off old ones, to delight or provoke envy in others by posting holiday and party photos, to praise and complain about brands, and to idolise the achievements of their children and pets. Stated a bit
simply, users’ behaviour in social media can be categorised into two types: rational/information-
seeking and emotional/affective-seeking behaviours (Jansen et al. 2009). A desire to address a
gap in information concerning events, organisations, or issues is an example of information-
seeking behaviour in social media, whereas affective-seeking behaviour stands for the expression
of opinion about events, organisations, or issues.
The penetration of the Internet and the growing number of social media platforms that allow
user-generated content have not come without consequences: on the one hand, Internet-wide
freedom of speech has created bloggers and others who magnetise substantial audiences; and on
the other hand, the Internet has democratised communication by enabling anyone (in theory) to
say anything. Obviously, the consequences can be good or bad. A positive interpretation of
freedom of speech, in turn, is that it enables the emergence of ‘public spheres’ envisioned by
Habermas (1989). Internet-based public spheres enable civic activities and political participation;
that is, “citizens can gather together virtually, irrespective of geographic location, and engage in
information exchange and rational discussion” (Robertson et al. 2013). On the other hand, many
studies have pointed out that social media has become a place for venting negative experiences
and expressing dissatisfaction (Lee & Cude 2012; Bae & Lee 2012). Due to the lack of
gatekeepers (Lewin 1943), social media provokes not only sharing rumours and expressing
conflicting views, but also bullying, harassment, and hate speech. In addition to providing a
forum for sharing information, social media is also a channel for propagating misinformation.
Terrorist organizations, such as ISIS, have quite effectively deployed social media in recruiting
members, disseminating propaganda, and inciting fear. It has also been asked whether governments use social media to paint black as white. The question is justified, as the
significance of the change entailed with the emergence of public spheres becomes concrete in
countries which have prohibited or hampered the use of social media.
To connect public discussion theory and trolling, this work is based on a public discussion
stakeholder classification made by Luoma-Aho (2015). The classification includes positively
engaged faith-holders, negatively engaged hateholders, and fakeholders. Trolls can be considered
as either hateholders (humans) or fakeholders (bots or cyborgs). Luoma-Aho states that the
influence of a fakeholder appears larger than it really is in practice, but tools for analysing the
impact are not provided. In order to have a more thorough view of the discussion, it would be
important to know the sources behind the fakeholders’ arguments; but like the artists of black
propaganda, they attempt to hide themselves. It can be hypothesized that the role of fakeholders
increases with subjects whose legitimacy is questioned or challenged, and when the public is
confused about the relevance and significance of the arguments presented in various social media
platforms.
Studies have confirmed what every social media user already knows: virtual public spheres
attract the users whom Luoma-Aho (2015) has named hateholders and fakeholders. Social media provides hateholders with continuously changing targets and stimuli. Hateholders’ behaviour can be harsh, hurtful, and offensive, and it should therefore be condemned. Although fighting against hateholders is not an easy task, it is possible because hateholders’ behaviour is visible. Hateholders do not typically try to hide; on the contrary, they pursue publicity. Fakeholders, in turn, act in the shadows. Although their behaviour can also be harsh, hurtful, and offensive, it is difficult to pin them down. Acting through fake identities and using sophisticated persona-management software, fakeholders aim to harm their targets.
Trolling as a Phenomenon
During the experiment phase of this research, the authors found that the definitions used for
trolling were not specific enough. Human communication, including that which is malevolent, is
so diverse and contextual that in order to create automatic classification systems for trolling, a
clear definition is needed. Trolling, as a means of either distracting a conversation or simply
provoking an emotional answer, can utilise multiple context-based means of influence. A positive
word or sentence can, in the right context, actually be an insult. To have any success in creating
trolling identification systems, the authors first had to create a specific description of what they
were looking for.
The Online Cambridge dictionary (2016) defines a troll as “someone who leaves an intentionally
annoying message on the Internet, in order to get attention or cause trouble” or as “a message
that someone leaves on the Internet that is intended to annoy people”. Oxford English Dictionary
Online (2016) defines a troll as “a person who makes a deliberately offensive or provocative
online post” or “a deliberately offensive or provocative online post”. These two definitions refer
to two specific characteristics of trolling: that trolling is something that happens online and that
the intention of trolling is to offend someone.
Buckels, Trapnell and Paulhus (2014) studied the motivation behind trolling behaviour from a
psychological viewpoint. They found out that self-reported enjoyment of trolling was positively
correlated with three components of the Dark Tetrad, specifically sadism, psychopathy, and
Machiavellianism. The fourth component, narcissism, had a negative correlation with trolling
enjoyment. To include the different aspects of trolling more comprehensively, a new scale, the
Global Assessment of Internet Trolling (GAIT), was introduced. Using GAIT scores, these
researchers found sadism to have the most robust association with trolling behaviour, to the extent that sadists tend to troll because they enjoy it (Buckels, Trapnell & Paulhus 2014, p. 101). This study points to psychological factors behind trolling behaviour.
Hardaker (2010) defines a troller as
a [computer-mediated communication] user who constructs the identity of sincerely
wishing to be part of the group in question, including professing, or conveying pseudo-
sincere intentions, but whose real intention(s) is/are to cause disruption and/or to trigger
or exacerbate conflict for the purposes of their own amusement. (Hardaker 2010, p. 237)
From a dataset of 186,470 social network posts, she identified four interrelated conditions related
to trolling behaviour: aggression, deception, disruption, and success (Hardaker 2010, pp. 225-
36). This definition maintains the view of trolling as offensive and conducted for achievement of
personal goals, but adds deception to it. There may be many reasons for hiding the true intentions
behind one’s messages, but in trolling there are two main reasons. First, it prevents the
targets from reasoning against the trolling influence. A straightforward offensive message can be
dismissed more easily than a subtle suggestion framed as a constructive argument. Secondly,
most discussion forums are moderated, and, typically, openly offensive posts are quickly
removed (although the rules and practices may vary), thus reducing their effectiveness.
The previously mentioned articles define trolling as having no apparent instrumental purpose.
One effect of successful trolling, the replacement of a factual (or at least civilized) online discussion with a heated debate driven by strongly emotional arguments, can, however, be used as a tool.
The motivation behind this type of trolling behaviour is different from that of trolls who have an emotional need for trolling. Spruds et al. (2015) identified two major types of trolls in their study of Latvia’s three major online news portals: classic and hybrid. The definition of classic trolls is very close to those offered by Hardaker (2010) and Buckels, Trapnell and Paulhus (2014), whereas the hybrid troll is seen as a tool of information warfare. The hybrid troll is distinguished from the classic troll by behavioural factors: intensively reposted messages, repeated messages posted from different IP addresses and/or nicknames, and republished information and links (Spruds et al. 2015). The motivation behind this type of trolling is not the satisfaction of one’s psychological needs but the propagation of a (typically political) agenda.
The various definitions of trolls do not fit together very well as patterns of behaviour to look for
with automated classification systems. Trolls seem to have some of the characteristics of both the
hateholders and fakeholders introduced by Luoma-Aho (2015). In addition, there is some overlap
with other behavioural patterns such as cyberbullying, or various means of psychological
influence. Trolling, nevertheless, is a recognized phenomenon that needs to be given a clear
definition in order to have a common ground for discussing the subject and building reliable
tools for automatic identification.
What, then, is the essence of trolling? What are its main elements? First, for the purposes of the
current discussion, trolling does not exist without interaction. An offensive post in somebody’s
personal diary or Internet site, not intended to be read by large audiences, is not trolling. Thus,
trolling exists in the interactive communications of Internet users. Secondly, there are many
ways of influencing other people’s views, varying from the objective presentation of facts to
emotional appeals. Emotional appeals, in turn, vary by the feelings they try to arouse. Trolling
uses offensively charged emotional appeals in order to arouse an aggressive response from the
audience. Third, trolling does not target a single individual, but has the intention of appealing to
as many members of the discussion forum as possible. Mihaylov and Nakov (2016) categorize
two types of opinion manipulation trolls: “paid trolls”, which have been revealed in leaked “reputation management contracts”, and “mentioned trolls”, which have been labelled as such by several different people. This dichotomy indicates that the reactions and interactions of discussion participants provide evidence of trolling. Other options for recognising trolls are to utilise
intelligence information, or leaked information, which is rarely available for the general public,
or to systematically compare information provided by suspected trolls to the information
provided by reliable sources.
The authors of the current paper suggest the following definition for trolling: trolling is a
phenomenon that is experienced in the interactions between Internet users, with the aim of
gaining a strong response from as many users as possible by using offensive, emotionally-
charged content. Therefore, identification of trolling requires the interaction to include 1) the
Internet as a platform, 2) offensive and emotional content, and 3) an intended strong response from
the audience.
Deceptive practices are excluded from this definition on purpose. A troll may hide or fake his/her
identity or present untruthful information, but the authors suggest trolling to be a pattern of
interaction, regardless of the motivation. This definition thus allows for different types of trolls
with variable motivations and reliability. In other words, trolling is a recognizable technique of interaction rather than being defined by an offensive intention.
Tools for the Detection of Message Automation
The goals of the experimental case study were 1) to analyse how to detect fakeholders and 2) to
develop the sentiment analysis tool (Paavola & Jalonen 2015) to detect message automation. The
essential issue of this study is to determine the properties indicating that a message is sent by a
bot or by a cyborg. The first step is to tag these messages manually in order to use classification
models on them. This procedure provides ‘ground truth’, the set of user accounts reliably
classified by a human as a bot or a cyborg. As the number of analysed user accounts has to be on the scale of several thousand, this part is time consuming.
Here, case study data consists of Finnish language Twitter messages discussing the Syrian
refugee crisis. Twitter is a microblog service, where, on average, 500 million messages called
tweets (140 characters maximum) are posted on a daily basis. The openness of the Twitter
platform allows, and actually promotes, the automatic sending of messages. It is increasingly
common to send automated messages to human users in an attempt to influence them and to
manipulate sentiment analyses (Clark et al. 2015). Social media analyses can be skewed by bots
that try to dilute legitimate public opinion.
Simple bot detection mechanisms analyse account activity and the related user network
properties. Chu et al. (2012) utilised tweeting timing behaviour, account properties, and spam
detection. An example of a more advanced study is provided by Dickerson, Kagan, and Subrahmanian (2014). Their aim was to identify the most influential Twitter users in a discussion about an election in India. Making this kind of assessment requires the exclusion of bots from the analysis. The authors created a very complex model with dozens of variables in order to decide whether any given user is a human or a bot. Nineteen of those variables were sentiment based.
The authors’ main findings were that bots flip-flop their sentiment less frequently than humans,
that humans express stronger sentiments, and that humans tend to disagree more with the general
sentiment of the discussion.
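The ‘flip-flop’ notion can be approximated simply: count sign changes in a user’s chronological sequence of tweet sentiments. The sketch below is our illustration, not Dickerson et al.’s exact formula.

```python
# Rate of sentiment "flip-flops": sign changes across a user's tweets in
# chronological order. An illustration, not the cited paper's formula.
def flip_flop_rate(sentiments: list[int]) -> float:
    """sentiments: per-tweet polarity (-1, 0, +1), oldest first."""
    signed = [s for s in sentiments if s != 0]  # ignore neutral tweets
    if len(signed) < 2:
        return 0.0
    flips = sum(1 for a, b in zip(signed, signed[1:]) if a != b)
    return flips / (len(signed) - 1)

print(flip_flop_rate([+1, +1, -1, +1, 0, -1]))  # 3 flips in 4 transitions -> 0.75
```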
Sophisticated bot algorithms emulate human behaviour, and bot detection must then be performed on linguistic attributes (Clark et al. 2015). Clark et al. used three linguistic variables to determine whether a user is a bot: the average URL count per tweet, the average pairwise lexical dissimilarity between a user’s tweets, and the word introduction rate decay parameter of the user. With these parameters, they were able to classify users as humans, bots, cyborgs, or spammers. They concluded that, for human users, these three attributes are densely clustered, but that they can vary greatly for automated user accounts.
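Two of the three attributes are straightforward to approximate. The sketch below computes the average URL count per tweet and a mean pairwise lexical dissimilarity using Jaccard distance over token sets; this is a simplification, as Clark et al. define their own dissimilarity measure, and the word-introduction-decay parameter is not shown here.

```python
# Approximations of two of the linguistic attributes described above.
# Jaccard distance stands in for Clark et al.'s dissimilarity measure.
import itertools
import re

def avg_url_count(tweets: list[str]) -> float:
    """Average number of URLs per tweet (assumes a non-empty list)."""
    return sum(len(re.findall(r"https?://\S+", t)) for t in tweets) / len(tweets)

def avg_pairwise_dissimilarity(tweets: list[str]) -> float:
    """Mean Jaccard distance between the token sets of all tweet pairs."""
    token_sets = [set(t.lower().split()) for t in tweets]
    distances = [1 - len(a & b) / len(a | b)
                 for a, b in itertools.combinations(token_sets, 2) if a | b]
    return sum(distances) / len(distances) if distances else 0.0
```

Intuitively, accounts that template their messages tend to score very low on dissimilarity (near-duplicate tweets) or unusually high (randomised spam), whereas human users cluster in between.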
Case Study
The development of our automatic bot detection system started with a cross-topic Twitter dataset, collected from 17 September 2015 to 24 September 2015. It covered more
than 977,000 tweets in Finnish, which were sent by more than 343,000 users.
To develop an automatic classification system, a ground truth dataset needed to be created.
Among the collected data, 2,000 users, each of whom had sent at least 10 tweets during that one-week period, were randomly chosen. The sample set contained 83,937 tweets in total. The profile data of those 2,000 Twitter users was also extracted. Tweets in languages other than Finnish were excluded from the dataset.
For each sampled user, the following procedure was used. The text content of each tweet was
carefully checked. Other properties such as the tweeting application used, the number of friends
and followers, and in some cases, the user’s homepage were also checked and recorded. In short,
the user was labelled as a human or a bot based on the text of tweets, other information carried
by tweets, the information contained in the user profile, and in some cases, external data. It took
more than a minute on average to classify a user.
The automatic bot detection system was further developed and applied to refugee-related Twitter
messages. The difference between the bot and cyborg classifications was the level of automation. If all messages sent by a user were interpreted as automated, the user was classified as a bot. If only part of the messages seemed to originate from automation, the user was classified as a cyborg; if, however, less than one-quarter of the messages were automated, the user was classified as a human.
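Read this way, the rule maps the automated fraction of a user’s messages onto three labels. A minimal sketch of that reading follows; the only thresholds stated in the text are ‘all’ and ‘one-quarter’.

```python
# Labelling rule as described above: the fraction of a user's messages
# judged automated determines the class.
def label_account(automated: int, total: int) -> str:
    frac = automated / total
    if frac == 1.0:
        return "bot"     # every message automated
    if frac >= 0.25:
        return "cyborg"  # partly automated (at least a quarter)
    return "human"       # under a quarter automated

print(label_account(10, 10), label_account(5, 10), label_account(1, 10))
# -> bot cyborg human
```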
The collection of the refugee crisis tweets was based on Finnish keywords and abbreviations.
The free Twitter search API was used. The refugee dataset was collected from 6 December 2015
to 3 February 2016. The complete dataset contained 59,491 tweets from 15,504 users. The
dataset also contained tweets in other languages, but those were excluded from the dataset based
on the results of the Language Detection Library (Shuyo 2016). After that, Twitter users who had
fewer than 10 tweets in the dataset were also excluded.
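The filtering steps can be sketched as follows. Shuyo’s Language Detection Library is written in Java; the Python port langdetect behaves similarly and stands in for it here as an assumption, together with the minimum-activity threshold of ten tweets.

```python
# Keep Finnish-language tweets, then drop users with fewer than 10 tweets.
# langdetect is a Python port of Shuyo's Java library, used here as a stand-in.
from collections import defaultdict
from langdetect import detect, LangDetectException

def filter_dataset(tweets):
    """tweets: iterable of (user_id, text) pairs."""
    by_user = defaultdict(list)
    for user, text in tweets:
        try:
            if detect(text) == "fi":     # keep Finnish only
                by_user[user].append(text)
        except LangDetectException:      # e.g. text with no alphabetic content
            continue
    return {u: ts for u, ts in by_user.items() if len(ts) >= 10}
```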
The final refugee dataset contained 31,092 tweets from 855 Twitter users. Visualisations were generated to support qualitative analysis; they included the most commonly appearing hashtags as a word cloud and the locations from which tweets had been posted. The latter data was available only if users had allowed geolocation data to be included.
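A minimal sketch of the hashtag word-cloud step, assuming the Python wordcloud package (the paper does not name its visualisation tooling):

```python
# Count hashtags and render them as a word cloud. The wordcloud package
# is an assumption; the paper does not say which tooling was used.
import re
from collections import Counter
from wordcloud import WordCloud

def hashtag_cloud(tweets: list[str], out_path: str = "hashtags.png") -> None:
    counts = Counter(tag.lower()
                     for t in tweets
                     for tag in re.findall(r"#(\w+)", t))
    cloud = WordCloud(width=800, height=400).generate_from_frequencies(counts)
    cloud.to_file(out_path)
```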
The system was used to classify each user as either an automated user or a human user. The
automated user can be either a cyborg or a bot. A weighted Random Forest algorithm was used. The refugee dataset was divided into a training set (684 users) and a test set (171 users) to allow
researchers to evaluate the performance of the automatic classifier.
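The paper does not name its Random Forest implementation; the sketch below is one plausible rendering in scikit-learn, with class weighting to compensate for automated accounts being the minority class, and with placeholder data in place of the real feature matrix.

```python
# Weighted Random Forest sketch for the automated-vs-human classification.
# Placeholder data stands in for the real ~30-feature matrix.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.random((684, 30))       # placeholder features
y_train = rng.integers(0, 2, 684)     # 1 = automated, 0 = human
X_test = rng.random((171, 30))

clf = RandomForestClassifier(n_estimators=500,
                             class_weight="balanced",  # upweight the rare class
                             random_state=0)
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)

# Rank features by importance, as discussed after Table 1.
ranking = np.argsort(clf.feature_importances_)[::-1]
```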
The test set confusion matrix of 171 Twitter users is presented in Table 1, below. The recall is
80 percent: 24 out of 30 Twitter users that were manually classified by humans as bots or cyborgs were identified as automated users in the test set by the pilot system. On the other hand, the
precision is 86 percent, which indicates that most of the users the system classified as automated
were manually classified as bots or cyborgs.
                                      Classified by humans
                                      Bot or Cyborg
Classified by the    Human                  6
pilot system         Bot or Cyborg         24

Table 1: Confusion matrix showing very good accuracy and a low number of false-positive automatic classifications
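As a consistency check on these figures: the recall follows directly from the table’s 24 true positives and 6 false negatives, while the reported 86% precision implies roughly four human accounts misclassified as automated (the false-positive cell is not printed above, so that count is our inference from the rounded percentage):

\[
\text{recall} = \frac{TP}{TP + FN} = \frac{24}{24 + 6} = 0.80,
\qquad
\text{precision} = \frac{TP}{TP + FP} = \frac{24}{24 + 4} \approx 0.86
\]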
Some of the features the Random Forest algorithm found most important were the average number of other users mentioned in a user’s tweets, the average number of links in a user’s messages, and the type category of the sending application (social media application, mobile application, or automation application). Some sentiment-related features were also near the top of the list of about 30 features tested.
Figure 1, below, visually depicts changes in the dataset after automated messages were
excluded. A notable difference can be seen in the most active Twitter accounts. After the
exclusion of automated messages, active human participants in the discussion can be identified.
Thus, it can be concluded that the algorithm was able to remove bot and cyborg accounts. No change was observed in tweet locations, which suggests that automated accounts do not reveal this information. Only minor changes were seen in the hashtag list and in the sentiment results, and thus those figures are not presented here.
Figure 1: Word cloud visualisations of the most active user accounts of the collected dataset and reduced data after
excluding bots and cyborgs. (Translations: pakolaiset = refugees; turvapaikka = asylum; maahanmuutto =
immigration)
Discussion
This case study offers a solid base for further development. The main weakness of the pilot system is that it depends quite heavily on many Twitter-specific features. It may also be the case that some of the features used are effective only when the bots and cyborgs are not trying to conceal themselves. In future work, it is important to extract and create new and more complex features, especially from the messages’ text content, such as how much the text content changes from one message to the next among messages from the same user. This process is guided by published research (Chu et al. 2012; Clark et al. 2015; Dickerson, Kagan, & Subrahmanian 2014). Features based on the text content might have the advantage of being more easily applicable to other social media forums beyond Twitter and of being more likely to identify concealed bots and cyborgs.
Future work will investigate how to defend against the trolling phenomenon once trolls can be automatically identified. One general psycho-sociological way to deal with trolls who systematically spam information is to limit reactions to them to reminding others not to respond. In a rhizome meshwork, the only course of action is not to feed the trolls. A
troll can disrupt the discussion in a newsgroup, disseminate bad advice, and damage the feeling
of trust in the community.
Focusing on the findings of actors who try to expose trolling entities may be one way to detect trolling behaviour. According to Weiss (2016), these actors may be called “elves”. One aim of a trolling information operation is to break people’s will to defend their knowledge and
beliefs. The lack of trust in one’s information and knowledge creates a favourable environment
for possible hostile information intervention. To keep one’s own information environment
coherent, organisations responsible for ‘ground truth’ must participate in online groups and share
experiences about how to fight against the trolls on social media. Civic activists (elves) must
commit to knocking down disinformation and sharing their experiences in order to grow the
number of new elf-participants. They must not try to be propagandists in reverse, but rather
expose the disinformation by using humour against the trolls. They may post a link that all the
members can go to and leave their comments and reactions (for example, liking something,
disliking it). When trying to figure out the identities of the trolls, elves may begin by at least
locating the country or town from which they come (Weiss 2016). Automated troll detection is
feasible. However, more research work is required to understand trolling mechanisms in order to
train algorithms accordingly.
References
Bae, Y & Lee, H 2012, ‘Sentiment analysis of Twitter audiences: measuring the positive or
negative influence of popular Twitterers’, Journal of the American Society for Information
Science & Technology, vol. 63, no. 12, pp. 2521-35.
Berthon, PR, Pitt, LF, Plangger, K & Shapiro, D 2012, ‘Marketing meets Web 2.0, social media,
and creative consumers: implications for international marketing strategy’, Business Horizons,
vol. 55, no. 3, pp. 261-71.
Buckels, EE, Trapnell, PD & Paulhus, DL 2014, ‘Trolls just want to have fun’, Personality and
Individual Differences, vol. 67, pp. 97-102.
Chu, Z, Gianvecchio, S, Wang, H & Jajodia, S 2012, ‘Detecting automation of Twitter accounts: are you a human, bot, or cyborg?’, IEEE Transactions on Dependable and Secure Computing, vol. 9, no. 6, pp. 811-24.
Clark, EM, Williams, JR, Jones, CA, Galbraith, RA, Danforth, CM & Dodds, PS 2015, ‘Sifting robotic from organic text: a natural language approach for detecting automation on Twitter’, Journal of Computational Science, vol. 16, pp. 1-7.
Coyne, R 2014, The net effect: design, the rhizome, and complex philosophy, viewed 20 August
2016, <http://www.casa.ucl.ac.uk/cupumecid_site/download/Coyne.pdf>.
Deleuze, G & Guattari, F 1983, On the line, MIT Press, New York, NY, U.S.A.
Deuze, M 2011, ‘Media life’, Media, Culture and Society, vol. 33, no. 1, pp. 137-48.
Dickerson, JP, Kagan, V & Subrahmanian, VS 2014, ‘Using sentiment to detect bots in Twitter: are humans more opinionated than bots?’, Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), Beijing, China, August.
Habermas, J 1989, The structural transformation of the public sphere: an inquiry into a category
of bourgeois society, MIT Press, Cambridge, MA, U.S.A.
Hardaker, C 2010, ‘Trolling in asynchronous computer-mediated communication: from user
discussions to academic definitions’, Journal of Politeness Research, vol. 6, pp. 215-42.
Jansen, BJ, Zhang, M, Sobel, K & Chowdury, A 2009, ‘Twitter power: tweets as electronic word
of mouth’, Journal of the American Society for Information Science and Technology, vol. 60, no.
11, pp. 2169-88.
Kaplan, AM & Haenlein, M 2010, ‘Users of the world, unite! The challenges and opportunities
of social media’, Business Horizons, vol. 53, pp. 59-68.
Karppi, T 2014, Disconnect me: user engagement and Facebook, Doctoral Dissertation, Annales
Universitatis Turkuensis, Ser. B Tom. 376, Humaniora, University of Turku, Finland.
Lee, S & Cude, BJ 2012, ‘Consumer complaint channel choice in online and off-line purchases’,
International Journal of Consumer Studies, vol. 36, pp. 90-6.
Lewin, K 1943, ‘Forces behind food habits and methods of change’, Bulletin of the National
Research Council, vol. 108, pp. 35-65.
Luoma-Aho, V 2015, ‘Understanding stakeholder engagement: faith-holders, hateholders &
fakeholders’, Research Journal of the Institute for Public Relations, vol. 2, no. 1.
Malgin, A 2015, Kremlin troll army shows Russia isn't Charlie Hebdo, The Moscow Times,
viewed 20 August 2016, <http://www.themoscowtimes.com/opinion/article/russia-is-not-
charlie/514369.html>.
Mihaylov, T & Nakov, P 2016, ‘Hunting for troll comments in news community forums’, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 399-405.
Nacos, BL, Bloch-Elkon, Y & Shapiro RY 2011, Selling fear: counterterrorism, the media, and
public opinion, The University of Chicago Press, Chicago, IL, U.S.A.
Paavola, J & Jalonen, H 2015, ‘An approach to detect and analyze the impact of biased
information sources in the social media’, Proceedings of the 14th European Conference on
Cyber Warfare and Security ECCWS-2015, Academic Conferences and Publishing International
Limited, London, UK, July.
Pentland, A 2014, Social physics: how good ideas spread--the lessons from a new science, The
Penguin Press, New York, NY, U.S.A.
Robertson, SP, Douglas, S, Maruyama, M & Semaan, B 2013, ‘Political discourse on social networking sites: sentiment, in-group/out-group orientation and rationality’, Information Polity: The International Journal of Government & Democracy in the Information Age, vol. 18, no. 2, pp. 107-26.
Shuyo, N 2016, language-detection, viewed 20 August 2016, <https://github.com/shuyo>.
Spruds, A, Rožukalne, A, Sedlenieks, K, Daugulis, M, Potjomkina, D, Tölgyesi, B & Bruge, I 2015, ‘Internet trolling as a hybrid warfare tool: the case of Latvia’, NATO STRATCOM Centre of Excellence Publication.
‘Troll’ 2016, Online Cambridge dictionary, viewed 20 August 2016,
<http://dictionary.cambridge.org/dictionary/english/troll>.
‘Troll’ 2016, Oxford English dictionary online, viewed 20 August 2016,
<http://www.oxforddictionaries.com/definition/english/troll>.
Weiss, M 2016, The Baltic elves taking on pro-Russian trolls, The Daily Beast, viewed 21
March 2016, <http://www.thedailybeast.com/articles/2016/03/20/the-baltic-elves-taking-on-pro-
russian-trolls.html>.
... Seemingly parallel scholarship, meanwhile, operationalizes trolling in relation to the state-sponsored influence campaigns (Im et al., 2020;Zannettou et al., 2019a), or their orchestration by professional marketing firms (Ong, 2020;Ong & Cabañes, 2019). Still others have proposed understandings of trolling that apply the label distinctly to human actors instead of bots (Bastos & Mercea, 2018;Broniatowski et al., 2018), while conversely some studies suggest trolling may also be automated by bots (Paavola et al., 2016). ...
... Some have jointly examined the activity of bots and trolls, typically by defining trolls as human IRA actors who are assisted by automated bots to amplify their messages (Alsmadi & O'Brien, 2020;Badawy et al., 2018;Bastos & Mercea, 2018;Broniatowski et al., 2018), even when such state-sponsored agents are not ''trolling'' in the behavioral sense. Others, meanwhile, have suggested that trolls may be automated, which blurs this distinction (Paavola et al., 2016). Mixed empirical evidence has likewise indicated that while bots may engage in abusive language in certain cases (Stella et al., 2018;Uyheng & Carley, 2021a), they may also be uncorrelated with abuse in others . ...
... 4. As a strategic behavior, trolling may be a common tactic employed in state-sponsored information operations, but not all state-sponsored accounts may necessarily engage in trolling per se (Zannettou et al., 2019a(Zannettou et al., , 2019b. 5. As a language-based behavior, trolling is conceptually distinct from automation and may therefore be uncorrelated from the likelihood that a trolling account is automated; in other words, trolling may potentially be observed among both bots and humans (Paavola et al., 2016;. ...
Article
This paper posits and tests a social cybersecurity framework to detect and characterize online trolling. Using a dataset of online trolling obtained through active learning, we empirically find that troll messages are significantly associated with more abusive language (p<.001), lower cognitive complexity (p<.01), and greater targeting of named entities (p<.05) and identities (p<.05). These effects are robust to the likelihood that these messages come from bots. We then train and evaluate TrollHunter, a theory-driven and interpretable machine learning model using the derived psycholinguistic features. TrollHunter achieves 89% accuracy and F1 score in detecting trolling messages, with an average 12.25% improvement in performance when relationally modeling conversational context. Explorations of convergent and discriminant validity reveal that our measure of trolling is more closely related to non-hateful offensive speech over hate speech, aggressive over non-aggressive speech, and that Chinese state-sponsored accounts engage in higher levels of trolling than Russian state-sponsored accounts (p<.001). Finally, we apply TrollHunter in a field study to compare the media targets of trolling activity compared to bots as a reference group. Bots dominate replies to exclusive right-leaning media outlets like Breitbart and Newsmax, while trolls disproportionately target outlets with mixed partisan trust like BBC and ABC. This bifurcation suggests that not only are trolls and bots different entities, but they also have different impacts in relation to driving polarization and disinformation in society. Echoing recent calls for interdisciplinary approaches that link computational models with social theory, we conclude with implications for platform regulation and policy-making to curtail the actions of diverse agents of disinformation.
... I. Alieva et al. a bot aided by a human. A "troll" uses social media to intentionally provoke an emotional reaction from as many users as possible by posting offensive and emotionally charged content (Paavola, Helo, Jalonen, Sartonen, & Huhtinen, 2016). The diverse nature of the accounts highlights why relying solely on computational methods is not always possible. ...
Article
Full-text available
The recent COVID-19 outbreak has highlighted the importance of effective communication strategies to control the spread of the virus and debunk misinformation. By using accurate narratives, both online and offline, we can motivate communities to follow preventive measures and shape attitudes toward them. However, the abundance of misinformation stories can lead to vaccine hesitancy, obstructing the timely implementation of preventive measures, such as vaccination. Therefore, it is crucial to create appropriate and community-centered solutions based on regional data analysis to address mis/disinformation narratives and implement effective countermeasures specific to the particular geographic area.In this case study, we have attempted to create a research pipeline to analyze local narratives on social media, particularly Twitter, to identify misinformation spread locally, using the state of Pennsylvania as an example. Our proposed methodology pipeline identifies main communication trends and misinformation stories for the major cities and counties in southwestern PA, aiming to assist local health officials and public health specialists in instantly addressing pandemic communication issues, including misinformation narratives. Additionally, we investigated anti-vax actors' strategies in promoting harmful narratives. Our pipeline includes data collection, Twitter influencer analysis, Louvain clustering, BEND maneuver analysis, bot identification, and vaccine stance detection. Public health organizations and community-centered entities can implement this data-driven approach to health communication to inform their pandemic strategies.
... Different aspects of online social media disinformation have been actively researched in recent years through qualitative and quantitative methods with the purpose of detecting malevolent accounts and discerning them from one another. The most studied troll characteristics include linguistic profiles (e.g., linguistic complexity and the emotional charge of the language; Addawood, Badawy, Lerman, & Ferrara, 2019; Monakhov, 2020; Lundberg & Laitinen, 2020; Uyheng & Carley, 2021b; Golino, Christensen, Moulder, Kim, & Boker, 2022; Uyheng, Moffitt, & Carley, 2022; Zannettou, Caulfield, Setzer, et al., 2019), automation levels (human trolls, bot trolls, or human-bot trolls; Achimescu & Sultanescu, 2020; Alsmadi & O'Brien, 2020; Badawy, Ferrara, & Lerman, 2018; Bastos & Mercea, 2018; Broniatowski et al., 2018; Chu, Gianvecchio, Wang, & Jajodia, 2010; Chu, Gianvecchio, & Wang, 2018; Paavola, Helo, Jalonen, Sartonen, & Huhtinen, 2016; Uyheng et al., 2022), sponsorship origins (e.g., Russian, Chinese, or Iranian state-sponsored, or private-entity sponsored; Badawy et al., 2018; Im et al., 2020; Ong & Cabañes, 2019; Tan, 2020; Tapsell, 2021; Uyheng et al., 2022; Zannettou, Caulfield, De Cristofaro, et al., 2019; Zannettou, Caulfield, Setzer, et al., 2019), and potential taxonomies (e.g., provocateurs, fearmongers, and hashtag gamers; Berghel & Berleant, 2018; Gorwa & Guilbeault, 2020; Linvill & Warren, 2020b; Luceri, Deb, Orabi, Mouheb, Al Aghbari, & Kamel, 2020; Shao et al., 2018; Stella, Ferrara, & De Domenico, 2018). Studies of these characteristics provide valuable insight into the general functioning of social media trolls. ...
Preprint
Full-text available
Trump-supporting Twitter posting activity from right-wing Russian trolls active during the 2016 United States presidential election was analyzed at multiple timescales using a recently developed procedure for separating the linear and nonlinear components of a time series. Trump-supporting topics were extracted with Dynamic Exploratory Graph Analysis (DynEGA) and analyzed with the Hankel Alternative View of Koopman (HAVOK) procedure. HAVOK is an exploratory and predictive technique that extracts a linear model for the time series and a corresponding nonlinear time series that is used as a forcing term for the linear model. Together, this forced linear model can produce surprisingly accurate reconstructions of nonlinear and chaotic dynamics. Using the R package havok, the Russian troll data yielded well-fitting models at several timescales but not at others, suggesting that only a few timescales were important for representing the dynamics of the troll factory. We identified system features that were timescale-universal versus timescale-specific. Timescale-universal features included cycles inherent to troll factory governance, which identified its work-day and work-week organization, later confirmed by published insider interviews. Cycles were captured by eigenvector basis components resembling Fourier modes, rather than the Legendre polynomials typical for HAVOK. This may be interpreted as the troll factory having intrinsic dynamics that are highly coupled to nearly stationary cycles. Forcing terms were timescale-specific: they represented external events that precipitated major changes in the time series and aligned with major events during the political campaign. HAVOK models specified interactions between the discovered components, allowing the operation of the Russian troll factory to be reverse-engineered. Steps and decision points in the HAVOK analysis are presented and the results are described in detail.
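The core HAVOK recipe the abstract describes (Hankel embedding, SVD, and a linear model forced by the last retained delay coordinate) can be outlined in numpy. This is a simplified illustration of the idea, not the R package havok itself; the synthetic series and the chosen rank are assumptions for the sketch.

```python
# Simplified HAVOK-style decomposition: embed a scalar time series in a
# Hankel matrix, take an SVD, and treat the last retained delay
# coordinate as a forcing term for a linear model on the others.
import numpy as np

def hankel(x, rows):
    cols = len(x) - rows + 1
    return np.array([x[i:i + cols] for i in range(rows)])

t = np.linspace(0, 40, 2000)
x = np.sin(t) + 0.3 * np.sin(2.2 * t)          # stand-in "activity" series

H = hankel(x, rows=20)
U, S, Vt = np.linalg.svd(H, full_matrices=False)
r = 5                                           # retained rank (assumed)
V = Vt[:r].T                                    # delay coordinates v_1..v_r

# Regress dv/dt of the first r-1 coordinates on all r coordinates;
# the r-th coordinate plays the role of the nonlinear forcing term.
dV = np.gradient(V[:, :r - 1], axis=0)
A, *_ = np.linalg.lstsq(V, dV, rcond=None)
print("linear model shape:", A.shape)           # (r, r-1); last row = forcing weights
```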
... TABLE VI: Towards Automation: a selection of framework components for which technical approaches with automation potential are actively researched and developed. Determination of other components requires active human-in-the-loop involvement or manual off-platform investigations. The table groups components as follows: Actors (agents [1], [34], [93], [116]; affiliation [92], [94]); Offensive patterns (bots [18], [58], [67], [72]; cyborgs [73], [78], [88]; copypasta [100]; trolls [25], [60], [83], [104]; hijacking [49], [69], [106]); Deceptive patterns (pseudoentities [63], [107], [120]; astroturfing [42], [76]; pseudocontent [24], [46], [111], [112]; seed-invite-amplify [2], [116]; mainstream [34], [39], [92]); Evasive patterns (gaming heuristics [45]; ML poisoning attacks [48], [75]); Channels (social media [93], [98], [119]; web [15], [45]; news [7], [129]; messaging [28]); and Target (demographic [13], [32]). ...
Article
Full-text available
This research explores how to identify extreme messages during a hybrid media event happening in a small language area by utilizing natural language processing (NLP), a type of artificial intelligence (AI). A hybrid media event gathers attention on all sides of the media environment: mainstream media, social media, instant messaging apps, and fringe communities. Hybrid media events call for participation and activity both in the physical world and online. On the darker side of media events, the media landscape can act as a channel for all kinds of disinformation, hate speech, and conspiracy theories. In addition, fringe communities such as 4chan also spread hate speech and duplicated content during hybrid media events. From a theoretical point of view, this connection between the physical world and information networks can be seen as rhizomatic in nature, because information spreads without regard to a traditional hierarchy. The result is that when individuals participate in a big media event, there is a viral awareness of different viewpoints, and all kinds of topics may be posted online for discussion. Moreover, in a rhizomatic context, different kinds of arguments can intertwine, "copy and paste" one another, and create widely divergent meanings in new comments. The role of extremist speech in online spaces can have effects in the physical world. The focus of this paper is to present the findings of a case study on messages posted online by three different actor groups who participated in demonstrations organized on Finnish Independence Day. Two data sets were collected from Twitter and Telegram, and natural language processing (NLP) was used to classify messages using extremist media index labels. Three actor groups were identified as participating in the demonstrations and were labelled far-right, antifascist, and conspiracist. Computational analysis used NLP to categorize the messages based upon the definitions provided by the extremist media index. The analysis shows how AI technology can help identify messages that include extremist content and approve of the use of violence in a small language area. The model of the rhizome was valid in making the connections between fringe, extremist content and moderate discussion visible. This article is part of a larger project related to extremist networks and criminality in online darknet environments.
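The classification step described above can be approximated with a generic text-classification sketch: TF-IDF features and a linear model mapping messages to extremist-media-index-style labels. The labels and example texts below are invented placeholders; the study's actual categories, data, and model choice may differ.

```python
# Generic sketch of label classification for short messages.
# Texts and labels are fabricated stand-ins for the study's data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["peaceful march downtown today",
         "they must be driven out by force",
         "flags and songs at the square",
         "violence is the only answer left"]
labels = ["moderate", "extremist", "moderate", "extremist"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["force is the answer"]))
```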
Article
Full-text available
Sentiment analysis and opinion mining are essential tasks with many prominent application areas, e.g., when researching popular opinions on products or brands. Sentiments expressed in social media can be used in brand name monitoring and in indicating fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a 27,000-sentence data set annotated independently with sentiment polarity by three native annotators. Because all three annotators labelled the whole data set, it provides a unique opportunity for further studies of annotator behavior over the sample annotation order. We analyze their inter-annotator agreement and provide two baselines to validate the usefulness of the data set.
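A common way to report the inter-annotator agreement mentioned above is Fleiss' kappa for the three annotators. The tiny label matrix below is invented for illustration (the real data set has 27,000 sentences), and the sketch assumes statsmodels for the kappa computation.

```python
# Fleiss' kappa for three annotators over sentiment polarity labels.
# The ratings matrix is a toy example, not the study's data.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = sentences, columns = the three annotators' labels
# (0 = negative, 1 = neutral, 2 = positive)
ratings = np.array([[2, 2, 2],
                    [0, 0, 1],
                    [1, 1, 1],
                    [0, 2, 2],
                    [1, 1, 2]])

table, _ = aggregate_raters(ratings)   # items x categories count table
print(f"Fleiss' kappa: {fleiss_kappa(table):.3f}")
```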
Article
Full-text available
A prominent recurring theme in social comparison is the concept that individuals are not indifferent to the results that others achieve and typically seek pleasure while avoiding pain. However, in some cases they behave atypically, counter to this principle. The purpose of this research is to investigate one atypical response, namely gluckschmerz, a negative response to information about others' success (feeling bad about others' fortunes). To advance these objectives, a mixed-mode design of two studies was conducted using a combination of primary and secondary analyses, and qualitative and quantitative methods. Findings reveal that this aversive feeling encourages consumers to share "positive" information online with others, but through negative, malicious word-of-mouth narratives. They provide compelling evidence supporting the theory that some of the positive commercial information conveyed through electronic media triggers negative word-of-mouth in the form of online firestorms driven by the discordant atypical sentiment of gluckschmerz.
Article
During the Tokyo 2020 Olympics, Team USA athlete Simone Biles withdrew from several gymnastics events mid-competition, citing mental health issues. Biles, one of the most recognizable stars of the Games, faced intense scrutiny from both the world's media and the general public in the immediate aftermath. The purpose of this study was to analyze the Facebook narrative surrounding Biles's withdrawal within the theoretical context of framing, as crafted through user comments on various high-profile public Facebook pages. A total of 87,714 user comments were collected and analyzed using the qualitative software Leximancer. The themes emerging from the data suggested a polarizing narrative: many users supported Biles and engaged in the wider discussion surrounding athlete mental health, while others condemned her actions, suggesting she had quit on the biggest sporting stage.
Article
Full-text available
Research on modern virtual communication focuses on human communicative behaviour in social networks. For the purposes of successful biometric expertise, linguistic personology is chosen as the most reliable approach. As a complex, autonomous, interdisciplinary, synergistic paradigm, and as a research compendium that uses a wide variety of approaches and methods of modern linguistics, linguistic personology is designed to study the variety of human speech behaviour manifested in an individual's speech. It is also used to identify the "bright diagnostic spots" that characterize the verbal messages of any individual linguistic personality in social networks, with the purpose of precise identification. Personality identification, enhanced by the methods of linguistic personology, is presented as an efficient methodology, since it creates conditions for reliability and trust in the process of information exchange and is based on truthfulness and legitimacy. The compendium of sciences for analyzing speech behaviour and for describing the speech repertoire of the virtual personality is based, on the one hand, on the study of formal discursive characteristics (style-based features) and, on the other, on the study of informal content characteristics (content-based features).
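The style-based features the paper contrasts with content-based ones can be illustrated with a few shallow stylometric measurements per message. The particular features below are common stylometry examples chosen for the sketch, not the paper's own inventory.

```python
# A few shallow stylometric (style-based) measurements per message,
# of the kind an authorship-profiling pipeline might compute.
import string

def style_profile(text: str) -> dict[str, float]:
    tokens = text.split()
    n = max(len(tokens), 1)
    return {
        "avg_word_len": sum(len(t.strip(string.punctuation)) for t in tokens) / n,
        "punct_rate": sum(c in string.punctuation for c in text) / max(len(text), 1),
        "upper_rate": sum(c.isupper() for c in text) / max(len(text), 1),
        "type_token_ratio": len({t.lower() for t in tokens}) / n,
    }

print(style_profile("Well... I NEVER said that, did I?!"))
```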
Conference Paper
Full-text available
Abstract: The paper presumes that social media is an environment where local and small events may escalate into bigger and even global ones in a very short period of time. This is because social media offers opportunities for discussion of shared interests in a way that cannot be controlled: everything that can be exposed will be exposed, for all intents and purposes. This possibility has also changed the landscape of discussions of controversial issues, such as foreign and security policy. Compared to traditional mass media, social media enables the disclosure of opinions without censorship. Nowadays people have access to online discussions, blogs, and even websites entirely devoted to sharing negative information. During crisis situations, social media has become a major way of affecting people's opinions. Consequently, we are witnessing the rise of trolls: individuals who share inflammatory, extraneous, or off-topic messages in social media with the primary intent of provoking readers into an emotional response or of otherwise disrupting normal on-topic discussion. Based on the lack of censorship, on the one hand, and trolling behaviour, on the other, the paper aims to understand the rise and diffusion of extreme opinions on Twitter. This is a case study paper; the analysed case is Twitter messages on the Ukrainian crisis during 2014 written in Finnish. The aim is to utilize sentiment analysis for the automatic detection of trolling behaviour. Sentiment analysis provides strategic communications with tools for the automatic analysis of social media discussions and for recognizing opportunities to participate in the discussion at its most effective stage.
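The proposed use of sentiment analysis for automatic troll detection can be sketched as scoring each account's messages and flagging persistently negative ones for human review. The English lexicon and the threshold below are toy assumptions; the case study worked with Finnish-language tweets and richer sentiment tooling.

```python
# Lexicon-based sentiment scoring with a per-account troll flag.
# Lexicons, accounts, and the threshold are toy assumptions.
NEG = {"disgrace", "traitor", "liar", "shameful"}   # toy negative lexicon
POS = {"hope", "peace", "support", "agree"}          # toy positive lexicon

def score(msg: str) -> int:
    words = {w.strip(".,!?") for w in msg.lower().split()}
    return len(words & POS) - len(words & NEG)

accounts = {
    "user1": ["Traitor government, shameful!", "Liar media, disgrace again"],
    "user2": ["I support the peace talks", "Some hope at last"],
}
for user, msgs in accounts.items():
    mean = sum(score(m) for m in msgs) / len(msgs)
    status = "candidate troll" if mean <= -1 else "ok"   # toy threshold
    print(user, round(mean, 2), status)
```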
Article
Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion. Due to the increasing popularity of Twitter, its perceived potential for exerting social influence has led to the rise of a diverse community of automatons, commonly referred to as bots. These inorganic and semi-organic Twitter entities can range from the benevolent (e.g., weather-update bots, help-wanted-alert bots) to the malevolent (e.g., spamming messages, advertisements, or radical opinions). Existing detection algorithms typically leverage meta-data (time between tweets, number of followers, etc.) to identify robotic accounts. Here, we present a powerful classification scheme that exclusively uses the natural language text from organic users to provide a criterion for identifying accounts posting automated messages. Since the classifier operates on text alone, it is flexible and may be applied to any textual data beyond the Twitter-sphere.
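A text-only detector in the spirit of this abstract can be sketched with character n-gram features feeding a linear classifier, using no account metadata at all. The training examples below are fabricated stand-ins for organic versus templated, bot-like messages; the paper's actual classifier and features differ in scale and detail.

```python
# Text-only automation detection: character n-grams + linear classifier,
# with no reliance on account metadata. Training data is fabricated.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

organic = ["lol that game was wild, can't believe the ending",
           "anyone else stuck in traffic on 5th? ugh"]
automated = ["WIN BIG NOW!!! click here -> bit.ly/xxxx",
             "WIN BIG NOW!!! click here -> bit.ly/yyyy"]

X = organic + automated
y = [0, 0, 1, 1]  # 0 = organic, 1 = automated

clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                    LinearSVC())
clf.fit(X, y)
print(clf.predict(["WIN BIG NOW!!! click here -> bit.ly/zzzz"]))
```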
Article
Twitter is a popular microblogging service that is used to read and write millions of short messages on any topic within a 140-character limit. Popular or influential users tweet their status and are retweeted, mentioned, or replied to by their audience. Sentiment analysis of the tweets by popular users and their audience reveals whether the audience is favorable to popular users. We analyzed over 3,000,000 tweets mentioning or replying to the 13 most influential users to determine audience sentiment. Twitter messages reflect the landscape of sentiment toward its most popular users. We used the sentiment analysis technique as a valid popularity indicator or measure. First, we distinguished between the positive and negative audiences of popular users. Second, we found that the sentiments expressed in the tweets by popular users influenced the sentiment of their audience. Third, from the above two findings we developed a positive-negative measure for this influence. Finally, using a Granger causality analysis, we found that the time-series-based positive-negative sentiment change of the audience was related to the real-world sentiment landscape of popular users. We believe that the positive-negative influence measure between popular users and their audience provides new insights into the influence of a user and is related to the real world.
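The Granger causality step mentioned in the abstract can be sketched with statsmodels: testing whether one sentiment series helps predict another. The series below are synthetic, with the "audience" series built as a lagged, noisy copy of the "user" series; the study derived its series from millions of real tweets.

```python
# Granger causality between two sentiment time series (synthetic data).
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
user = rng.normal(size=200)                                # popular user's sentiment
audience = np.roll(user, 2) + 0.5 * rng.normal(size=200)   # lagged, noisy follower

# Column order: test whether the 2nd column Granger-causes the 1st.
data = np.column_stack([audience, user])
grangercausalitytests(data, maxlag=3)
```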
Article
Whilst computer-mediated communication (CMC) can benefit users by providing quick and easy communication between those separated by time and space, it can also provide varying degrees of anonymity that may encourage a sense of impunity and freedom from being held accountable for inappropriate online behaviour. As such, CMC is a fertile ground for studying impoliteness, whether it occurs in response to perceived threat (flaming), or as an end in its own right (trolling). Currently, first- and second-order definitions of terms such as im/politeness (Brown and Levinson 1987; Bousfield 2008; Culpeper 2008; Terkourafi 2008), incivility (Lakoff 2005), rudeness (Beebe 1995; Kienpointner 1997, 2008), and etiquette (Coulmas 1992) are subject to much discussion and debate, yet the CMC phenomenon of trolling is not adequately captured by any of these terms. Following Bousfield (in press), Culpeper (2010), and others, this paper suggests that a definition of trolling should be informed first and foremost by user discussions. Taking examples from a 172-million-word, asynchronous CMC corpus, four interrelated conditions of aggression, deception, disruption, and success are discussed. Finally, a working definition of trolling is presented.
Article
This research explored the influence of the purchase environment on the choice of complaint channel. The study was based on responses from 480 undergraduate students who participated in a 2 (purchase environment: offline vs. online) × 2 (the degree of dissatisfaction: weak vs. strong) online experiment. Consumers who purchased online were more likely to complain online than those who made their purchase offline. Online complaining among online purchasers increased with the degree of dissatisfaction. The research suggests that future researchers should include consumer complaint channel choices when examining consumer complaining behaviour.
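The 2 (purchase environment) × 2 (degree of dissatisfaction) design lends itself to a contingency-table check of whether purchase environment relates to complaint channel choice. The counts below are invented for illustration and do not reproduce the study's data.

```python
# Chi-square test on a 2x2 table of purchase environment vs. complaint
# channel. Counts are fabricated for the sketch.
import numpy as np
from scipy.stats import chi2_contingency

#                 complain offline, complain online
table = np.array([[90, 30],    # bought offline
                  [40, 80]])   # bought online

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.4f}, dof={dof}")
```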