Journal of Information Warfare
Volume 15, Issue 4
Fall 2016
Contents

From the Editor (L Armistead), p. i
Authors, p. iii
Rhizomatic Target Audiences of the Cyber Domain (M Sartonen, A-M Huhtinen and M Lehto), p. 1
Exploring the Complexity of Cyberspace Governance: State Sovereignty, Multi-stakeholderism, and Power Politics (A Liaropoulos), p. 14
Applying Principles of Reflexive Control in Information and Cyber Operations (ML Jaitner and MAJ H Kantola), p. 27
Utilising Journey Mapping and Crime Scripting to Combat Cybercrime and Cyber Warfare Attacks (T Somer, B Hallaq and T Watson), p. 39
Disinformation in Hybrid Warfare: The Rhizomatic Speed of Social Media in the Spamosphere (A-M Huhtinen and J Rantapelkonen), p. 50
Security-Information Flow in the South African Public Sector (H Patrick, B van Niekerk and Z Fields), p. 68
South Korea's Options in Responding to North Korean Cyberattacks (J Park, N Rowe and M Cisneros), p. 86
Understanding the Trolling Phenomenon: The Automated Detection of Bots and Cyborgs in the Social Media (J Paavola, T Helo, H Jalonen, M Sartonen and A-M Huhtinen), p. 100
Authors
Captain Maribel Cisneros,
United States Army, is a
Military Intelligence Officer
assigned to the U.S. Army
Cyber Command. She was
commissioned as a Second
Lieutenant in the Military
Intelligence Branch in 2007,
and has served as Assistant
S1, Platoon Leader, Company
Executive Officer, Battalion S2, and Company
Commander. She deployed multiple times to
USSOUTHCOM as a Mission Manager and Battle
Captain, and to USCENTCOM as Task Force OIC.
She earned a master’s degree in computer systems
and operations from the Naval Postgraduate School
and a master’s degree in management and leadership
from Webster University.
Dr. Ziska Fields is an
associate professor and
academic leader at the
University of KwaZulu-Natal,
South Africa. Her research
interests focus on creativity,
entrepreneurship, human
resources, and higher
education. She developed two
theoretical models to measure
creativity in South Africa, focusing on youth and
tertiary education. She has published in
internationally recognized journals and edited books.
She is the editor of the book Incorporating business
models and strategies into social entrepreneurship
and has completed another book titled Collective
creativity for responsible and sustainable business
practice. She is a member of the South African
Institute of Management, the Ethics Institute of South
Africa, and the Institute of People Management.
Bil Hallaq is a cyber security researcher with more than 15 years of academic, commercial, and industrial experience. He previously spent several years handling and mitigating various security threats and vulnerabilities within commercial environments. He is delivering on various projects, including the identification and application of novel techniques for OSINT and the EU E-CRIME Project, which comprises several European partners, including Interpol, and in which he works with partners on understanding criminal structures and mapping cybercriminal activities in order to produce and recommend effective countermeasures. His other applied research areas include methods and techniques for cross-border cyber attack attribution, mitigation at scale of complex multi-jurisdictional cyber events, and maritime and rail cyber security. He holds several professional qualifications, including penetration testing, incident response, malware investigation, and digital forensics investigation.
Tuomo Helo is a senior
lecturer with Turku University
of Applied Sciences, Turku,
Finland. He earned a master’s
degree in information systems
and a master’s degree in
economics from the
University of Turku, Finland.
His current research interests
include text analytics and data
mining in general. He has also completed research in
the fields of health economics and the economics of
education.
Dr. Aki-Mauri Huhtinen,
(LTC [GS]) is a military
professor in the Department of
Leadership and Military
Pedagogy at the Finnish
National Defence University,
Helsinki, Finland. His areas of
expertise are military
leadership, command and
control, the philosophy of
science in military organisational research, and the
philosophy of war. He has published peer-reviewed
journal articles, a book chapter, and books on
information warfare and non-kinetic influence in the
battle space. He has also organised and led several
research and development projects in the Finnish
Defence Forces from 2005 to 2015.
Margarita Levin Jaitner is a
researcher in the area of
information warfare and
cyberspace—with a particular
focus on Russian operations—
at the Swedish Defence
University, Stockholm,
Sweden. She is also a Fellow
at the Blavatnik
Interdisciplinary Cyber
Research Center. She has previously conducted
research at the Finnish National Defence University
as well as at the Yuval Ne’eman Workshop for
Security, Science and Technology in Tel Aviv. She
earned a master’s degree in Societal Risk
Management, and a bachelor’s degree in political
science.
Dr. Harri Jalonen is a
principal lecturer and research
group leader (AADI) at the
Turku University of Applied
Sciences, Turku, Finland. He
also holds a position as an
adjunct professor at the
University of Vaasa. He has
research experience dealing
with knowledge and
innovation management and digitalisation issues in
different organisational contexts. He has published
more than 100 articles in these fields. He is one of the
most frequently cited researchers in the field of complexity
thinking in Finland. He has managed or been
involved in many international and national research
projects. In addition, he has guided several thesis
projects, including doctoral theses. He is a reviewer
for many academic journals and a committee member for international conferences.
Major Harry Kantola
teaches and conducts research
at the Finnish National
Defence University, Helsinki,
Finland. He is also currently appointed to the Finnish Defence Command as a Cyber Defence planner in the C5 (J6) branch. He joined the Finnish
Defence Forces in 1991. He
served in various capacities (CSO, CIO) in the
Finnish Navy, Armoured Signal Coy, and Armoured
Brigade. From 2014 to 2016, he served an
appointment as a researcher at the NATO
Cooperative Cyber Defence Centre of Excellence
(NATO CCD COE), Tallinn, Estonia.
Dr. Martti Lehto (Col.,
retired) works as a cyber
security and cyber defence
professor of practice in the
Department of Mathematical
Information Technology at the
University of Jyväskylä,
Jyväskylä, Finland. He has
more than 30 years of
experience as a developer and
leader of C4ISR Systems in the Finnish Defence
Forces. He has more than 75 publications, research
reports, and articles in the areas of C4ISR systems,
cyber security and defence, information warfare, air
power, and defence policy.
Dr. Andrew Liaropoulos is
an assistant professor in the
Department of International
and European Studies at the
University of Piraeus, Greece.
He also teaches in the Joint
Staff War College, the Joint
Military Intelligence College,
the National Security College,
the Air War College, and the
Naval Staff Command College. His research interests
include international security, intelligence reform,
strategy, military transformation, foreign policy
analysis, cyber security, and Greek security policy.
He also serves as a senior analyst in the Research
Institute for European and American Studies
(RIEAS) and as the assistant editor of the Journal of
Mediterranean and Balkan Intelligence.
Dr. Jarkko Paavola is a
research team leader and a
principal lecturer with Turku
University of Applied
Sciences, Turku, Finland. He
earned his doctoral degree in
technology in the field of
wireless communications from
the University of Turku,
Finland. His current research
interests include information security and privacy,
dynamic spectrum sharing, and information security
architectures for systems utilising spectrum sharing.
Major Jimin Park, Republic
of Korea Air Force, is a Cyber
Intel-Ops Officer assigned to
ROK Cyber Command. His
previous assignments have
included the 37th Air
Intelligence Group, and
serving as an Intel-Ops Officer
and an Intel-Watch Officer at
Osan AFB with the U.S. 7th
Air Force. In 2007, he went to Ali Al Salem AFB in
Kuwait with the U.S. Central Command and the
386th Expeditionary Wing as part of Operation Iraqi
Freedom. He earned a master’s degree in computer
science from the U.S. Naval Postgraduate School.
Dr. Harold Patrick is a
forensic investigation
specialist at the University of
KwaZulu-Natal, South Africa.
He completed his doctorate at
the University of KwaZulu-Natal
in 2016. His dissertation
focused on information
security, collaboration, and
the flow of security
information. He earned a master’s degree in
information systems and technology and is a
Certified Fraud Examiner.
Dr. Jari Rantapelkonen,
(LTC, retired) is a professor
emeritus at the Finnish
National Defence University,
Helsinki, Finland. His
expertise areas include
operational art and tactics,
military leadership,
information warfare, and the
philosophy of war. He has
served in Afghanistan, the Balkans, and the Middle
East. He is the mayor of the municipality of Enontekiö, Finland, in the Arctic area.
Dr. Neil C. Rowe is a
professor of computer science
at the U.S. Naval Postgraduate
School (Monterey, CA, USA)
where he has been since
1983. He earned a doctorate in
computer science from
Stanford University
(1983). His main research
interests are data mining,
digital forensics, modelling of deception, and cyber
warfare.
Miika Sartonen is a
researcher at the Finnish
Defence Research Agency and
a doctoral student at the
National Defence University.
Tiia Sõmer is an early stage
researcher at Tallinn
University of Technology,
Tallinn, Estonia. Her research focuses on cybercrime and cyber forensics; she leads TUT's work on the EU E-CRIME project, a three-year European Union project researching the economic aspects of cybercrime. In addition, she has taught cyber security at the strategic level and prepared students for international policy-level cyber-defence competitions at TUT. Before starting an academic career, she served
for more than 20 years in the Estonian defence
forces—including teaching at the staff college;
working in diplomatic positions at national, NATO
and EU levels; and, most recently, working at EDF
HQ cyber security branch. Her master’s thesis, titled
“Educational Computer Game for Cyber Security: A
Game Concept”, focused on using games in the
teaching of cyber security. She is currently
completing Ph.D.-level studies, focusing on journey
mapping and its application in understanding and
solving cyber incidents.
Dr. Brett van Niekerk is a
senior security analyst at
Transnet and an Honorary
Research Fellow at the
University of KwaZulu-Natal,
South Africa. He graduated
from the University of
KwaZulu-Natal with his
doctorate in 2012 and has
completed two years of
postdoctoral research into information operations,
information warfare, and critical infrastructure
protection. He serves on the board for ISACA South
Africa and as secretary for the International
Federation of Information Processing’s Working
Group 9.10 on ICT in Peace and War. He has contributed to the ISO/IEC information security standards and has multiple presentations, papers, and book chapters in information security and information warfare to his name. He earned
bachelor’s and master’s degrees in electronic
engineering.
Professor Tim Watson is the
Director of the Cyber
Security Centre at WMG
within the University of
Warwick, Coventry, UK. He
has more than 25 years’
experience in the computing
industry and in academia and
has been involved with a wide
range of computer systems on
several high-profile projects. In addition, he has
served as a consultant for some of the largest
telecoms, power, and oil companies. He is an adviser
to various parts of the UK government and to several
professional and standards bodies. His current
research includes EU-funded projects on combating
cyber-crime; UK MoD research into automated
defence, insider threat, and secure remote working;
and, EPSRC-funded research, focusing on the
protection of critical national infrastructure against
cyber-attack. He is a regular media commentator on
digital forensics and cyber security.
Journal of Information Warfare (2016) 15.4: 100-111
ISSN 1445-3312 print/ISSN 1445-3347 online
Understanding the Trolling Phenomenon: The Automated Detection of Bots
and Cyborgs in the Social Media
J Paavola1, T Helo1, H Jalonen1, M Sartonen2, A-M Huhtinen2
1Turku University of Applied Sciences
Turku, Finland
E-mail: jarkko.paavola@turkuamk.fi; tuomo.helo@turkuamk.fi; harri.jalonen@turkuamk.fi
2Finnish National Defence University
Helsinki, Finland
E-mail: miika.sartonen@mil.fi; aki.huhtinen@mil.fi
Abstract: Social media has become a place for discussion and debate on controversial topics
and, thus, provides an opportunity to influence public opinion. This possibility has given rise to a
specific behaviour known as trolling, which can be found in almost every discussion that
includes emotionally appealing topics. Trolling is a useful tool for any organisation willing to
force a discussion off-track when one has no proper facts to back one’s arguments. Previous
research has indicated that social media analytics tools can be utilised for automated detection
of trolling. This paper provides tools for detecting message automation utilized in trolling.
Keywords: Social Media, Stakeholder, Trolling, Sentiment Analysis, Bot, Cyborg
Introduction
The current stage in the evolution of information is one in which the unpredictability of its
effects is accelerating. The volume of information is growing, and its structure is becoming
increasingly opaque. Information can no longer be seen as a system or as the extent of one’s
knowledge, but must rather be seen as an entity that has started to live a life of its own. Thus,
information provides its own energy and is its own enemy. In most cases, information is also a
source of beneficial development and can improve people’s quality of life. It is essential,
however, to understand that it can also unleash danger and adversity.
Due to the plethora of information available, people are not always able to determine whether
information is valid, and consequently tend to make hasty presumptions with the data they have.
This tendency is exploited by 'trolling', which the media has in recent years come to equate with online harassment. Because of trolling, it is becoming increasingly difficult to pinpoint
where information originates and where it leads (Malgin 2015).
In today’s era of information overload, individuals and groups try to get their messages across by
using forceful language, by engaging in dramatic (even violent) actions, or by posting video clips
or pictures on social media (Nacos, Bloch-Elkon & Shapiro 2011, p. 48). The politically-driven
mass media is most probably behind this information overload on individuals. Aggressive
behaviour is increasing in social media because of the technical ease with which trolling can be
carried out. In social media, all kinds of values become interwoven with each other.
Information is essentially a product of engineering science. In order to expand the sphere of
understanding to information as a part of human social life, one has to step outside of the ‘hard
sciences’ realm. Social sciences’ viewpoint is especially called for when discussing possible
threats and human fears connected with information. The rise in diverse Internet threats has
opened up the discussion of the possibility of nation states’ extending their capacity to control
information networks, including citizens’ private communications.
Computer culture theorists have identified the richly interconnected, heterogeneous, and
somewhat anarchic aspect of the Internet as a rhizomic social condition (Coyne 2014). During
the past quarter of a century, the usefulness of the Internet has permeated all domains
(individual, social, political, military, and business). Everyone worldwide can use the Internet
without any specific education, as the skills needed for communicating in the social media are
easy to acquire. At the same time, work-related and official messages run parallel with private
communications. Similarly, emotions and rational thinking may easily become intertwined due
to the ease and immediacy of our communications. Deleuze and Guattari (1983) use the terms
“rhizome” and “rhizomatic” to describe this non-hierarchical, nomadic and easy environment,
particularly in relation to how individuals behave in this kind of environment. The current status
of the technological evolution of the Internet can also be said to be based on the rhizome
concept. The rhizome resists the organizational structure of the root-tree system, which charts
causality along chronological lines, and looks for the original source of ‘things’, as well as
toward the pinnacle or conclusion of those ‘things’. Any point on a rhizome can be connected
with any other. A rhizome can be cracked and broken at any point; it starts off again following
one or another of its lines, or even other lines (Deleuze & Guattari 1983, p. 15).
Why is trolling so easy to implement in rhizomatic information networks? The intercontinental
network of communication is not an organized structure: it has no central head or decision-
maker; it has no central command or hierarchies to quell undesired behaviour. The rhizomatic
network is simply too big and diffuse to be managed by a central command. By the same token,
rhizomatic organizations are often highly creative and innovative. The rhizome presents history
and culture as a map, or a wide array of attractions and influences with no specific origin or
genesis, for a “rhizome has no beginning or end; it is always ‘becoming’ in the middle and
between things" (Deleuze & Guattari 1983). One example of the diversity of rhizome networks is fakeholder behaviour, which is discussed below.
This paper continues the work done by Paavola and Jalonen (2015), who examined whether
sentiment analysis could be utilised in detecting trolling behaviour. Sentiment analysis refers to
the automatic classification of messages into positive, negative, or neutral within a discussion
topic. In that work, Paavola and Jalonen concluded that sentiment analysis as such cannot detect
trolls, but results indicated that social media analytics tools can generally be utilised for this task.
Here, the authors' goal is to investigate the trolling phenomenon further.
To facilitate analysis, a sentiment analysis tool (Paavola & Jalonen 2015) was further developed
to detect message automation, which creates ‘noise’ in social media and makes it difficult to
observe behavioural changes among human users of the social media. Paavola and Jalonen’s
work followed studies performed by Chu et al. (2012), Dickerson, Kagan, and Subrahmanian (2014), and Clark et al. (2015), in which bot detection systems were devised. Components such
as message timing behaviour, spam detection, account properties, and linguistic attributes were
investigated. Variables were designed based on those components, and they were utilised in
order to categorize message senders as humans, bots, or cyborgs. Here, a bot refers to computer software that generates messages automatically, whereas a cyborg refers either to a bot-assisted human or to a human-assisted bot.
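As an illustration of how such components translate into classifier inputs, the following minimal sketch builds a small per-user feature vector from timing behaviour, spam-like attributes, and account properties. The data layout, field names, and particular features are assumptions of this sketch, not the implementation used in the studies cited above.

```python
import statistics

def user_features(tweets, profile):
    """Build an illustrative feature vector for one account.

    `tweets` is a list of dicts with 'text' and 'created_at' (datetime) keys;
    `profile` is a dict with 'followers' and 'friends' counts. Both structures
    are assumptions of this sketch.
    """
    times = sorted(t['created_at'] for t in tweets)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    texts = [t['text'] for t in tweets]
    return {
        # timing behaviour: very regular posting intervals hint at automation
        'gap_stddev': statistics.pstdev(gaps) if len(gaps) > 1 else 0.0,
        # simple spam and linguistic attributes
        'url_ratio': sum('http' in s for s in texts) / len(texts),
        'duplicate_ratio': 1 - len(set(texts)) / len(texts),
        # account properties
        'followers_per_friend': profile['followers'] / max(profile['friends'], 1),
    }
```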
This paper is organized as follows: First, the social media phenomenon is described, which gives
context to the trolling behaviour. Then, trolling behaviour in the social media is analysed and the
mechanisms characteristic to trolling are discussed. Before algorithms can be trained to detect
trolls, the definition of troll has to be accurate, which is not the case in the current literature.
Therefore, the trolling phenomenon is discussed before proceeding to the automated detection. In
the experimental part of the paper, automated bot and cyborg detection is applied to Twitter
messages. Finally, the discussion section provides future directions for troll detection and how to
defend against the trolling phenomenon.
Social Media as a Public Sphere
The promise of social media is not confined to technology, but involves cultural, societal, and
economic consequences. Social media refers herein to a constellation of Internet-based
applications that derive their value from the participation of users through directly creating
original content, modifying existing material, contributing to a community dialogue, and
integrating various media together to create something unique (Kaplan & Haenlein 2010). Social
media has engendered three changes: 1) the locus of activity shifts from the desktop to the web;
2) the locus of power shifts from the organization to the collective; and 3) the locus of value
creation shifts from the organization to the consumer (Berthon et al. 2012).
Social media has become integrated into the lives of postmodern people. Globally, more than
two billion people use social media on a daily basis. Whether it’s a question of the comments of
statesmen, opposition leaders’ criticism, or celebrities’ publicity tricks, social media offers an
authentic information source and effective communication channel. Social media enables
interaction between friends and strangers at the same time that it lowers the threshold of contact
and personalizes communication. In a way, social media has made the world smaller.
Social media has brought with it 'media life', which Deuze (2011) calls 'the state where media has become so inseparable from us that we do not live with media, but in it' (Karppi 2014, p.
22). In a hyper-connected network society, posts on Twitter cause stock market crashes and
overthrow governments (Pentland 2014). Unsurprisingly, life in social media is as messy as it is
in the real world. Social media exposes people to new information and ideas, reflects their everyday highs and lows, and allows new friendships to form and old ones to break up; on it, users delight or make others jealous with holiday and party photos, praise and complain about brands, and idolise the achievements of their offspring and pets. Stated a bit
simply, users’ behaviour in social media can be categorised into two types: rational/information-
seeking and emotional/affective-seeking behaviours (Jansen et al. 2009). A desire to address a
gap in information concerning events, organisations, or issues is an example of information-
seeking behaviour in social media, whereas affective-seeking behaviour stands for the expression
of opinion about events, organisations, or issues.
The penetration of the Internet and the growing number of social media platforms that allow
user-generated content have not been without consequences: on the one hand, Internet-wide
freedom of speech has created bloggers and others who magnetise substantial audiences; and on
the other hand, the Internet has democratised communication by enabling anyone (in theory) to
say anything. Obviously, the consequences can be good or bad. A positive interpretation of
freedom of speech, in turn, is that it enables the emergence of ‘public spheres’ envisioned by
Habermas (1989). Internet-based public spheres enable civic activities and political participation;
that is, “citizens can gather together virtually, irrespective of geographic location, and engage in
information exchange and rational discussion” (Robertson et al. 2013). On the other hand, many
studies have pointed out that social media has become a place for venting negative experiences
and expressing dissatisfaction (Lee & Cude 2012; Bae & Lee 2012). Due to the lack of
gatekeepers (Lewin 1943), social media provokes not only the sharing of rumours and the expression of
conflicting views, but also bullying, harassment, and hate speech. In addition to providing a
forum for sharing information, social media is also a channel for propagating misinformation.
Terrorist organizations, such as ISIS, have quite effectively deployed social media in recruiting
members, disseminating propaganda, and inciting fear. It has also been asked whether
governments use social media to paint black as white. The question is justified, as the significance of the change brought by the emergence of public spheres becomes concrete in countries that have prohibited or hampered the use of social media.
To connect public discussion theory and trolling, this work is based on a public discussion
stakeholder classification made by Luoma-Aho (2015). The classification includes positively
engaged faith-holders, negatively engaged hateholders, and fakeholders. Trolls can be considered
as either hateholders (humans) or fakeholders (bots or cyborgs). Luoma-Aho states that the
influence of a fakeholder appears larger than it really is in practice, but tools for analysing the
impact are not provided. In order to have a more thorough view of the discussion, it would be
important to know the sources behind the fakeholders’ arguments; but like the artists of black
propaganda, they attempt to hide themselves. It can be hypothesized that the role of fakeholders
increases with subjects whose legitimacy is questioned or challenged, and when the public is
confused about the relevance and significance of the arguments presented in various social media
platforms.
Studies have confirmed what every social media user already knows: virtual public spheres attract users whom Luoma-Aho (2015) has named hateholders and fakeholders. Social media provides hateholders with continuously changing targets and stimuli. Hateholders' behaviour can be harsh, hurtful, and offensive, and it should therefore be condemned. Although fighting against hateholders is not an easy task, it is possible because hateholders' behaviour is visible. Hateholders do not typically try to hide; on the contrary, they pursue publicity. Fakeholders, in turn, act in the shadows. Although their behaviour can also be harsh, hurtful, and offensive, it is difficult to lay hands on them. Acting through fake identities and using sophisticated persona-management software, fakeholders aim to harm their targets.
Trolling as a Phenomenon
During the experiment phase of this research, the authors found that the definitions used for
trolling were not specific enough. Human communication, including that which is malevolent, is
so diverse and contextual that in order to create automatic classification systems for trolling, a
clear definition is needed. Trolling, as a means of either derailing a conversation or simply provoking an emotional response, can utilise multiple context-dependent means of influence. A positive
word or sentence can, in the right context, actually be an insult. To have any success in creating
trolling identification systems, the authors first had to create a specific description of what they
were looking for.
The Online Cambridge dictionary (2016) defines a troll as “someone who leaves an intentionally
annoying message on the Internet, in order to get attention or cause trouble” or as “a message
that someone leaves on the Internet that is intended to annoy people”. Oxford English Dictionary
Online (2016) defines a troll as “a person who makes a deliberately offensive or provocative
online post” or “a deliberately offensive or provocative online post”. These two definitions refer
to two specific characteristics of trolling: that trolling is something that happens online and that
the intention of trolling is to offend someone.
Buckels, Trapnell and Paulhus (2014) studied the motivation behind trolling behaviour from a
psychological viewpoint. They found that self-reported enjoyment of trolling was positively correlated with three components of the Dark Tetrad, specifically sadism, psychopathy, and
Machiavellianism. The fourth component, narcissism, had a negative correlation with trolling
enjoyment. To include the different aspects of trolling more comprehensively, a new scale, the
Global Assessment of Internet Trolling (GAIT), was introduced. Using GAIT scores, these
researchers found sadism to have the most robust association with trolling behaviour, to the extent that "sadists tend to troll because they enjoy it" (Buckels, Trapnell & Paulhus 2014, p. 101). This study points to psychological factors as drivers of trolling behaviour.
Hardaker (2010) defines a troller as
a [computer-mediated communication] user who constructs the identity of sincerely
wishing to be part of the group in question, including professing, or conveying pseudo-
sincere intentions, but whose real intention(s) is/are to cause disruption and/or to trigger
or exacerbate conflict for the purposes of their own amusement. (Hardaker 2010, p. 237)
From a dataset of 186,470 social network posts, she identified four interrelated conditions related
to trolling behaviour: aggression, deception, disruption, and success (Hardaker 2010, pp. 225-
36). This definition maintains the view of trolling as offensive and conducted for achievement of
personal goals, but adds deception to it. There may be many reasons for hiding the true intentions
behind one’s messages; but in trolling, there are two main reasons for it. First, it prevents the
targets from reasoning against the trolling influence. A straightforward offensive message can be
dismissed more easily than a subtle suggestion framed as a constructive argument. Secondly,
most discussion forums are moderated, and, typically, openly offensive posts are quickly
removed (although the rules and practices may vary), thus reducing their effectiveness.
The previously mentioned articles define trolling as having no apparent instrumental purpose.
One effect of successful trolling, the replacement of a factual (or at least civilized) online discussion with a heated debate driven by strongly emotional arguments, can, however, be used as a tool.
The motivation behind this type of trolling behaviour differs from that of trolls who have an emotional need for trolling. Spruds et al. (2015) identified two major types of trolls in their
study of Latvia’s three major online news portals: classic and hybrid. The definition of classic
trolls is very close to those offered by Hardaker (2010) and Buckels, Trapnell and Paulhus
(2014), whereas the hybrid troll is seen as a tool of information warfare. The hybrid troll is
distinguished from the classic troll by behavioural factors: intensively reposted messages; repeated messages posted from different IP addresses and/or under different nicknames; and republished information and links (Spruds et al. 2015). The motivation behind this type of trolling is not the
satisfaction of one’s psychological needs but the propagation of a (typically political) agenda.
The various definitions of trolls do not fit together very well as patterns of behaviour to look for
with automated classification systems. Trolls seem to have some of the characteristics of both the
hateholders and fakeholders introduced by Luoma-Aho (2015). In addition, there is some overlap
with other behavioural patterns such as cyberbullying, or various means of psychological
influence. Trolling, nevertheless, is a recognized phenomenon that needs to be given clear
definition in order to have a common ground for discussing the subject and building reliable
tools for automatic identification.
What, then, is the essence of trolling? What are its main elements? First, for the purposes of the
current discussion, trolling does not exist without interaction. An offensive post in somebody's
personal diary or Internet site, not intended to be read by large audiences, is not trolling. Thus,
trolling exists in the interactive communications of Internet users. Secondly, there are many
ways of influencing other people’s views, varying from the objective presentation of facts to
emotional appeals. Emotional appeals, in turn, vary by the feelings they try to arouse. Trolling
uses offensively charged emotional appeals in order to arouse an aggressive response from the
audience. Third, trolling does not target a single individual, but has the intention of appealing to
as many members of the discussion forum as possible. Mihaylov and Nakov (2016) categorize
two types of opinion manipulation trolls: “paid trolls” which have been revealed from leaked
“reputation management contracts” and “mentioned trolls” which have been called such by
several different people. This dichotomy indicates that reaction and interaction of discussion
participants provides evidence for trolling. Other options to recognize trolls are to utilise
intelligence information, or leaked information, which is rarely available to the general public,
or to systematically compare information provided by suspected trolls to the information
provided by reliable sources.
The authors of the current paper suggest the following definition for trolling: trolling is a
phenomenon that is experienced in the interactions between Internet users, with the aim of
gaining a strong response from as many users as possible by using offensive, emotionally-
charged content. Therefore, identification of trolling requires the interaction to include 1) the
Internet as platform, 2) offensive and emotional content, and 3) an intended strong response from
the audience.
Deceptive practices are excluded from this definition on purpose. A troll may hide or fake his/her
identity or present untruthful information, but the authors suggest trolling to be a pattern of
interaction, regardless of the motivation. This definition thus allows for different types of trolls
with variable motivations and reliability. In other words, trolling is a recognizable technique
rather than an interaction with an offensive intention.
Tools for the Detection of Message Automation
The goals of the experimental case study were 1) to analyse how to detect fakeholders and 2) to
develop the sentiment analysis tool (Paavola & Jalonen 2015) to detect message automation. The
essential issue of this study is to determine the properties indicating that a message is sent by a
bot or by a cyborg. The first step is to tag these messages manually in order to use classification
models on them. This procedure provides ‘ground truth’, the set of user accounts reliably
classified by a human as a bot or a cyborg. As the number of analysed user accounts has to be on the scale of several thousand, this part is time-consuming.
Here, case study data consists of Finnish language Twitter messages discussing the Syrian
refugee crisis. Twitter is a microblog service, where, on average, 500 million messages called
tweets (140 characters maximum) are posted on a daily basis. The openness of the Twitter
platform allows, and actually promotes, the automatic sending of messages. It is increasingly
common to send automated messages to human users in an attempt to influence them and to
manipulate sentiment analyses (Clark et al. 2015). Social media analyses can be skewed by bots
that try to dilute legitimate public opinion.
Simple bot detection mechanisms analyse account activity and the related user network
properties. Chu et al. (2012) utilised tweeting timing behaviour, account properties, and spam
detection. An example of a more advanced study is provided by Dickerson, Kagan, and Subrahmanian (2014), whose aim was to find the most influential Twitter users in a discussion about an election in India. Making this kind of assessment requires the exclusion of bots from the analysis. They created a very complex model with tens of variables in order to decide whether any given user is a human or a bot. Nineteen of those variables were sentiment based. Their main findings were that bots flip-flop their sentiment less frequently than humans,
that humans express stronger sentiments, and that humans tend to disagree more with the general
sentiment of the discussion.
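A minimal sketch of one such sentiment-based variable, the rate at which a user's sentiment flips sign between consecutive tweets, is given below. The sentiment scores are assumed to come from an upstream classifier, and the exact formulation used by Dickerson, Kagan, and Subrahmanian may differ.

```python
def flip_flop_rate(sentiments):
    """Fraction of consecutive tweet pairs whose sentiment sign flips.

    `sentiments` is a chronologically ordered list of scores in [-1, 1],
    assumed to come from an upstream sentiment classifier.
    """
    pairs = list(zip(sentiments, sentiments[1:]))
    if not pairs:
        return 0.0
    flips = sum(1 for a, b in pairs if a * b < 0)
    return flips / len(pairs)

# Per the findings above, a low flip-flop rate combined with weak sentiment
# strength would point towards a bot rather than a human account.
```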
Sophisticated bot algorithms emulate human behaviour, so bot detection must then be performed from linguistic attributes (Clark et al. 2015). Clark et al. used three linguistic variables to determine whether a user is a bot: the average URL count per tweet, the average pairwise lexical dissimilarity between a user's tweets, and the user's word introduction rate decay parameter. With these parameters they were able to classify users as humans, bots, cyborgs, or spammers. They concluded that for human users these three attributes are densely clustered, but that they can vary greatly for automated user accounts.
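Two of these variables can be approximated with a few lines of code, as in the hedged sketch below; Jaccard distance over word sets stands in for Clark et al.'s dissimilarity measure, and the word introduction rate decay parameter is omitted because it requires fitting a decay model.

```python
from itertools import combinations

def avg_url_count(tweets):
    """Average number of URLs per tweet, counted crudely as 'http' tokens."""
    return sum(t.count('http') for t in tweets) / len(tweets)

def avg_pairwise_dissimilarity(tweets):
    """Mean pairwise lexical dissimilarity between a user's tweets.

    Jaccard distance over word sets is used here as a stand-in for the measure
    defined by Clark et al. (2015); it is an approximation for illustration only.
    """
    sets = [set(t.lower().split()) for t in tweets]
    dists = [1 - len(a & b) / len(a | b)
             for a, b in combinations(sets, 2) if a | b]
    return sum(dists) / len(dists) if dists else 0.0
```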
Case Study
The development of the authors' automatic bot detection system started with a cross-topic Twitter dataset, which was collected from 17 September 2015 to 24 September 2015. It covered more
than 977,000 tweets in Finnish, which were sent by more than 343,000 users.
To develop an automatic classification system, a ground truth dataset needed to be created.
Among the collected data, 2,000 users, each of whom had sent at least 10 tweets during that one-week period, were randomly chosen. The sample set contained 83,937 tweets in total. The profile data of those 2,000 Twitter users was also extracted. Tweets in languages other than Finnish
were excluded from the dataset.
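A sampling step of this kind could look roughly as follows in pandas; the file name and column names are assumptions of this sketch.

```python
import pandas as pd

# One row per tweet; the 'user_id' column and the file name are assumed for this sketch.
tweets = pd.read_csv('cross_topic_tweets_fi.csv')

counts = tweets.groupby('user_id').size()
eligible = counts[counts >= 10].index                  # users with at least 10 tweets
sampled_users = pd.Series(eligible).sample(n=2000, random_state=42)

ground_truth_tweets = tweets[tweets['user_id'].isin(sampled_users)]
print(len(ground_truth_tweets))                        # about 84,000 tweets in the paper's case
```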
For each sampled user, the following procedure was used. The text content of each tweet was
carefully checked. Other properties such as the tweeting application used, the number of friends
and followers, and in some cases, the user’s homepage were also checked and recorded. In short,
the user was labelled as a human or a bot based on the text of tweets, other information carried
by tweets, the information contained in the user profile, and in some cases, external data. It took
more than a minute on average to classify a user.
The automatic bot detection system was further developed and applied to refugee-related Twitter
messages. The difference between the bot and cyborg classifications was the level of automation. If all messages sent by a user were interpreted as automated, the user was classified as a bot. If only part of the messages seemed to originate from automation, the user was classified as a cyborg; however, if less than one-quarter of the messages were automated, the user was classified as a human.
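The decision rule described above can be expressed as a small helper function, as in the sketch below; the thresholds follow the prose, and the exact handling of the boundary cases is an interpretation.

```python
def label_account(n_automated, n_total):
    """Label an account from the share of its messages judged to be automated.

    Follows the rule described above: all messages automated -> bot, fewer than
    one quarter automated -> human, anything in between -> cyborg. The exact
    boundary handling is an interpretation of the prose.
    """
    share = n_automated / n_total
    if share == 1.0:
        return 'bot'
    if share < 0.25:
        return 'human'
    return 'cyborg'
```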
The collection of the refugee crisis tweets was based on Finnish keywords and abbreviations.
The free Twitter search API was used. The refugee dataset was collected from 6 December 2015
to 3 February 2016. The complete dataset contained 59,491 tweets from 15,504 users. The
dataset also contained tweets in other languages, but those were excluded from the dataset based
on the results of the Language Detection Library (Shuyo 2016). After that, Twitter users who had
fewer than 10 tweets in the dataset were also excluded.
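These filtering steps could be reproduced roughly as below; the langdetect package is a Python port of Shuyo's Java Language Detection Library, and the DataFrame column names are assumptions of this sketch.

```python
from langdetect import detect

def keep_finnish_active_users(df, min_tweets=10):
    """Drop non-Finnish tweets, then drop users with fewer than `min_tweets` left.

    `df` is assumed to be a pandas DataFrame with 'text' and 'user_id' columns.
    """
    def is_finnish(text):
        try:
            return detect(text) == 'fi'
        except Exception:              # empty or undetectable text
            return False

    df = df[df['text'].apply(is_finnish)]
    counts = df.groupby('user_id')['text'].transform('size')
    return df[counts >= min_tweets]
```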
The final refugee dataset contained 31,092 tweets from 855 Twitter users. Visualisations were
generated to perform qualitative analysis. Visualisations included the most commonly appearing
hashtags as a word cloud, and locations where tweets had been posted. The latter data was
available only if users had allowed geolocation data to be included.
The system was used to classify each user as either an automated user or a human user; an automated user can be a cyborg or a bot. A weighted Random Forest algorithm was used. The refugee dataset was divided into a training set (684 users) and a test set (171 users) to allow the researchers to evaluate the performance of the automatic classifier.
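A minimal scikit-learn sketch of such a class-weighted Random Forest is shown below; the feature matrix X and labels y (1 = bot or cyborg, 0 = human) are assumed to come from feature extraction such as that sketched earlier, and the hyperparameters and weighting scheme are assumptions rather than the paper's reported settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# X: one row of features per user; y: 1 = bot or cyborg, 0 = human (assumed encoding).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=171, stratify=y, random_state=0)

clf = RandomForestClassifier(
    n_estimators=500,            # assumption; the paper does not report this value
    class_weight='balanced',     # stands in for whatever weighting the authors used
    random_state=0)
clf.fit(X_train, y_train)

pred = clf.predict(X_test)
print(confusion_matrix(y_test, pred))
print('recall:', recall_score(y_test, pred), 'precision:', precision_score(y_test, pred))
```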
The test set confusion matrix of 171 Twitter users is presented in Table 1, below. The recall is 80 percent: 24 out of the 30 Twitter users that were manually classified as bots or cyborgs were identified as automated users in the test set by the pilot system. On the other hand, the
precision is 86 percent, which indicates that most of the users the system classified as automated
were manually classified as bots or cyborgs.
                                        Classified by humans
                                        Human          Bot or Cyborg
Classified by the     Human              137                 6
pilot system          Bot or Cyborg        4                24

Table 1: Confusion matrix showing very good accuracy and a low number of false positive automatic classifications
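The reported figures follow directly from the table; a quick check of the arithmetic, in the same Python setting as the earlier sketches:

```python
tp, fn = 24, 6      # bots/cyborgs found vs. missed by the pilot system
fp = 4              # humans wrongly flagged as automated

recall = tp / (tp + fn)       # 24 / 30 = 0.80
precision = tp / (tp + fp)    # 24 / 28 = 0.857..., reported as 86 percent
```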
Some of the features the Random Forest algorithm found most important were the average number of other users mentioned in a user's tweets, the average number of links in the user's messages, and the type category of the sending application (social media application, mobile application, or automation application). Some sentiment-related features were also near the top of the list of the roughly 30 features tested.
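With scikit-learn, a ranking of this kind can be read directly from a fitted model, as sketched below; `clf` and `feature_names` refer to the hypothetical objects from the earlier Random Forest sketch.

```python
import pandas as pd

importances = pd.Series(clf.feature_importances_, index=feature_names)
print(importances.sort_values(ascending=False).head(10))
# Per the text, features such as the mean number of mentioned users per tweet,
# the mean number of links per message, and the sending-application category
# would be expected near the top of this list.
```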
Figure 1, below, visually depicts changes in the dataset after automated messages were
excluded. A notable difference can be seen in the most active Twitter accounts. After the
exclusion of automated messages, active human participants in the discussion can be identified.
Thus, it can be concluded that the algorithm was able to remove bot and cyborg accounts. No change was observed in tweet locations, which suggests that automated accounts do not reveal this information. Only minor changes were seen in the hashtag list and the sentiment results, and thus those figures are not presented here.
Figure 1: Word cloud visualisations of the most active user accounts of the collected dataset and reduced data after
excluding bots and cyborgs. (Translations: pakolaiset = refugees; turvapaikka = asylum; maahanmuutto =
immigration)
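A word cloud of the most active accounts before and after removing the flagged accounts could be produced roughly as follows with the third-party wordcloud package; the DataFrame columns and the automation flag are assumptions of this sketch.

```python
from collections import Counter

import matplotlib.pyplot as plt
from wordcloud import WordCloud

def activity_cloud(df, outfile):
    """Word cloud in which each screen name is sized by its tweet count."""
    counts = Counter(df['screen_name'])
    cloud = WordCloud(width=800, height=400, background_color='white')
    cloud.generate_from_frequencies(counts)
    plt.figure()
    plt.imshow(cloud, interpolation='bilinear')
    plt.axis('off')
    plt.savefig(outfile)

activity_cloud(tweets_all, 'all_accounts.png')                               # full dataset
activity_cloud(tweets_all[~tweets_all['is_automated']], 'humans_only.png')   # bots and cyborgs removed
```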
Discussion
This case study offers a solid base for further development. The main weakness of the pilot
system is that it is quite heavily dependent on many Twitter-specific features. It may also be the
case that some of the used features are effective only when the bots and cyborgs are not trying to
conceal themselves. In future work, it is important to extract and create new and more complex features, especially from the messages' text content, such as how much the text content changes from one message to another for the same user. This process is guided by published research (Chu et al. 2012; Clark et al. 2015; Dickerson, Kagan, & Subrahmanian 2014). The abovementioned features based on text content might have the advantage of being more easily applicable to other social media forums outside Twitter, and being more likely to
identify concealed bots and cyborgs.
Future work will investigate how to defend against trolls once they can be automatically identified. One general psycho-sociological way to deal with trolls who systematically spam information is to limit reactions to them to reminding others not to respond. In a rhizome meshwork, the only course of action is not to feed the trolls.
troll can disrupt the discussion in a newsgroup, disseminate bad advice, and damage the feeling
of trust in the community.
Focusing on the findings of actors who try to expose these trolling entities may be one way to
detect trolling behaviour. According to Weiss (2016), these actors may be called “elves”. One
aim of a trolling information operation is to break peoples’ will to defend their knowledge and
beliefs. The lack of trust in one’s information and knowledge creates a favourable environment
for possible hostile information intervention. To keep one’s own information environment
coherent, organisations responsible for ‘ground truth’ must participate in online groups and share
experiences about how to fight against the trolls on social media. Civic activists (elves) must
commit to knocking down disinformation and sharing their experiences in order to grow the
number of new elf-participants. They must not try to be propagandists in reverse, but rather
expose the disinformation by using humour against the trolls. They may post a link that all the
members can go to and leave their comments and reactions (for example, liking something,
disliking it). When trying to figure out the identities of the trolls, elves may begin by at least
locating the country or town from which they come (Weiss 2016). Automated troll detection is
feasible. However, more research work is required to understand trolling mechanisms in order to
train algorithms accordingly.
References
Bae, Y & Lee, H 2012, ‘Sentiment analysis of Twitter audiences: measuring the positive or
negative influence of popular Twitterers’, Journal of the American Society for Information
Science & Technology, vol. 63, no. 12, pp. 2521-35.
Berthon, PR, Pitt, LF, Plangger, K & Shapiro, D 2012, ‘Marketing meets Web 2.0, social media,
and creative consumers: implications for international marketing strategy’, Business Horizons,
vol. 55, no. 3, pp. 261-71.
Buckels, EE, Trapnell, PD & Paulhus, DL 2014, ‘Trolls just want to have fun’, Personality and
Individual Differences, vol. 67, pp. 97-102.
Chu, Z, Gianvecchio, S, Wang, H & Jajodia, S 2012, 'Detecting automation of Twitter accounts: are you a human, bot, or cyborg?', IEEE Transactions on Dependable and Secure Computing, vol. 9, no. 6, pp. 811-24.
Clark, EM, Williams, JR, Jones, CA, Galbraith, RA, Danforth, CM & Dodds, PS 2015, 'Sifting robotic from organic text: a natural language approach for detecting automation on Twitter', Journal of Computational Science, vol. 16, pp. 1-7.
Coyne, R 2014, The net effect: design, the rhizome, and complex philosophy, viewed 20 August
2016, <http://www.casa.ucl.ac.uk/cupumecid_site/download/Coyne.pdf>.
Deleuze, G & Guattari, F 1983, On the line, MIT Press, New York, NY, U.S.A.
Deuze, M 2011, 'Media life', Media, Culture and Society, vol. 33, no. 1, pp. 137-48.
Dickerson, JP, Kagan, V & Subrahmanian, VS 2014, 'Using sentiment to detect bots in Twitter:
are humans more opinionated than bots?’, Proceedings of 2014 IEEE/ACM International
Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), Beijing,
China, August.
Habermas, J 1989, The structural transformation of the public sphere: an inquiry into a category
of bourgeois society, MIT Press, Cambridge, MA, U.S.A.
Hardaker, C 2010, ‘Trolling in asynchronous computer-mediated communication: from user
discussions to academic definitions’, Journal of Politeness Research, vol. 6, pp. 215-42.
Jansen, BJ, Zhang, M, Sobel, K & Chowdury, A 2009, ‘Twitter power: tweets as electronic word
of mouth’, Journal of the American Society for Information Science and Technology, vol. 60, no.
11, pp. 2169-88.
Kaplan, AM & Haenlein, M 2010, ‘Users of the world, unite! The challenges and opportunities
of social media’, Business Horizons, vol. 53, pp. 59-68.
Karppi, T 2014, Disconnect me: user engagement and Facebook, Doctoral Dissertation, Annales
Universitatis Turkuensis, Ser. B Tom. 376, Humaniora, University of Turku, Finland.
Lee, S & Cude, BJ 2012, ‘Consumer complaint channel choice in online and off-line purchases’,
International Journal of Consumer Studies, vol. 36, pp. 90-6.
Lewin, K 1943, ‘Forces behind food habits and methods of change’, Bulletin of the National
Research Council, vol. 108, pp. 35-65.
Luoma-Aho, V 2015, ‘Understanding stakeholder engagement: faith-holders, hateholders &
fakeholders’, Research Journal of the Institute for Public Relations, vol. 2, no. 1.
Malgin, A 2015, ‘Kremlin troll army shows Russia isn't Charlie Hebdo’, The Moscow Times,
viewed 20 August 2016, <http://www.themoscowtimes.com/opinion/article/russia-is-not-
charlie/514369.html>.
Mihaylov, T & Nakov, P 2016, 'Hunting for troll comments in news community forums', Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol. 2 (Short Papers), pp. 399-405.
Nacos, BL, Bloch-Elkon, Y & Shapiro RY 2011, Selling fear: counterterrorism, the media, and
public opinion, The University of Chicago Press, Chicago, IL, U.S.A.
Paavola, J & Jalonen, H 2015, ‘An approach to detect and analyze the impact of biased
information sources in the social media’, Proceedings of the 14th European Conference on
Cyber Warfare and Security ECCWS-2015, Academic Conferences and Publishing International
Limited, London, UK, July.
Pentland, A 2014, Social physics: how good ideas spread--the lessons from a new science, The
Penguin Press, New York, NY, U.S.A.
Robertson, SP, Douglas, S, Maruyama, M & Semaan, B 2013, 'Political discourse on social
networking sites: sentiment, in-group/out-group orientation and rationality’, Information Polity:
The International Journal of Government & Democracy in the Information Age, vol. 18, no. 2,
pp. 107-26.
Shuyo, N 2016, language-detection, viewed 20 August 2016, <https://github.com/shuyo>.
Spruds, A, Rožukalne, A, Sedlenieks, K, Daugulis, M, Potjomkina, D, Tölgyesi, B & Bruge, I
2015, ‘Internet trolling as a hybrid warfare tool: the case of Latvia’, NATO STRATCOM Centre
of Excellence Publication.
‘Troll’ 2016, Online Cambridge dictionary, viewed 20 August 2016,
<http://dictionary.cambridge.org/dictionary/english/troll>.
‘Troll’ 2016, Oxford English dictionary online, viewed 20 August 2016,
<http://www.oxforddictionaries.com/definition/english/troll>.
Weiss, M 2016, 'The Baltic elves taking on pro-Russian trolls', The Daily Beast, viewed 21
March 2016, <http://www.thedailybeast.com/articles/2016/03/20/the-baltic-elves-taking-on-pro-
russian-trolls.html>.