Detection of Terrorism-related Twitter Communities using
Centrality Scores
Ilias Gialampoukidis, George Kalpakis, Theodora Tsikrika, Symeon Papadopoulos, Stefanos
Vrochidis and Ioannis Kompatsiaris
Information Technologies Institute
Centre for Research and Technology Hellas
Thessaloniki, Greece 57001
{heliasgj,kalpakis,theodora.tsikrika,papadop,stefanos,ikom}@iti.gr
ABSTRACT
Social media are widely used among terrorists to communicate and
disseminate their activities. User-to-user interaction (e.g. mentions,
follows) leads to the formation of complex networks, with topology
that reveals key-players and key-communities in the terrorism
domain. Both the administrators of social media platforms and Law
Enforcement Agencies seek to identify not only single users but
groups of terrorism-related users so that they can reduce the impact
of their information exchange efforts. To this end, we propose a
novel framework that combines community detection with key-player
identification to retrieve communities of terrorism-related
social media users. Experiments show that most of the members of
each retrieved key-community are already suspended by Twitter,
violating its terms, and are hence associated with terrorism-oriented
content with high probability.
CCS CONCEPTS
• Information systems → Information retrieval; Test collections;
Web searching and information discovery; Multimedia and
multimodal retrieval; • Human-centered computing → Social
networking sites; • Networks → Social media networks;
KEYWORDS
Social Network Analysis, key-player identification, community
detection, terrorism-oriented social media mining, Twitter
ACM Reference format:
Ilias Gialampoukidis, George Kalpakis, Theodora Tsikrika, Symeon
Papadopoulos, Stefanos Vrochidis and Ioannis Kompatsiaris. 2017.
Detection of Terrorism-related Twitter Communities using Centrality Scores. In
Proceedings of MFSec’17, Bucharest, Romania, June 06, 2017, 5 pages.
DOI: http://dx.doi.org/10.1145/3078897.3080534
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than the
author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior specific permission
and/or a fee. Request permissions from permissions@acm.org.
MFSec’17, Bucharest, Romania
© 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM.
978-1-4503-5034-1/17/06...$15.00
DOI: http://dx.doi.org/10.1145/3078897.3080534
1 INTRODUCTION
The rapid growth of the Internet has resulted in modern forms
of communication and exchange of information, realized mainly
through the use of social media networking platforms (e.g. Twitter,
Facebook, etc.), which have dominated the online world during
the past few years. Social media networks have made possible
the communication among people across nationalities, religions,
cultures or residences; however, their great power and reach have
become an attractive feature for their use by terrorist and extremist
organizations for disseminating their propaganda, recruiting and
radicalizing new members, raising funds, organizing operations,
and publishing information and instructions exploited by lone-wolf
terrorists when preparing and committing acts of terror [27–29].
Because it permits the inexpensive communication of
multimedia messages (i.e. tweets) to users worldwide, Twitter has
been used by such organizations primarily for promoting and spreading
their propaganda, typically using a top-down approach, with a core
group of members spreading the group’s messages, which are then
re-shared by other affiliated accounts. Both the administrators of the
social media networking platform itself (Twitter), on the one hand, and the
Law Enforcement Agencies (LEAs), on the other, are interested in
monitoring terrorism-related activities taking place through the
platform. In the former case, the goal is to detect material that
violates the platform’s terms and conditions regarding extremist
content, while in the latter case such information may be very useful
in investigations for prosecuting the perpetrators of terrorist attacks.
In both cases, it is of vital significance to detect the communities
in the social networks and their most prominent users (i.e. key
players) who disseminate terrorism-related information, so as to
prevent terrorist groups from spreading their propaganda (to the
extent possible), by shutting down accounts that are found to play
a central role in this information exchange.
Over the past two decades, several research efforts have
discussed the network structure of terrorist organizations. One of the
early efforts examined the network structure of the 9/11 hijackers
along with their accomplices and detected the ring leaders of the
terrorist attacks based on their social associations [15]. Later work
focused on using social network analysis for examining the basic
characteristics of terrorist groups or organizations [26]. More
recent research has examined the survival mechanisms of the Global
Salafi Jihad (GSJ) terrorist network, even after being severely
damaged by the authorities, by analyzing its network structure and
topology [30]. In addition, several works have been conducted for
studying the use of social media, and especially Twitter, by terrorist
organizations. Specifically, a work has examined the significant role
of Twitter in facilitating terrorists to execute their attack in Mumbai
(November 2008), by monitoring and exploiting situational
information which was broadcast through Twitter [19]. More recent
research has studied the Islamic State’s (IS) strategy for
communicating their propaganda for radicalizing and recruiting Twitter users
[6]. Furthermore, the significant role played by feeder accounts of
terrorist organizations for exchanging information from the Syria
insurgency zone is pointed out in [14]. Key player identification in
complex networks, on the other hand, has been mainly addressed
through the use of different centrality measures; e.g. recent work
[10] has used several centrality measures to rank terrorism-related
Twitter accounts based on their location in the network and the
topology of the network of user-to-user mentions.
This work aims at identifying groups of terrorism-related users
exchanging information through social media platforms by
detecting the key players of a social media network and the interrelated
communities of users interacting with them. To this end, we extend
the approach of [10] and propose a hybrid framework which first
retrieves the key network players and then enriches the retrieved
results by adding the members of a user’s detected community
based on the combination of centrality scores with community
detection algorithms. These centrality measures, which aim to
identify key-players in the terrorism domain, are estimated on social
media networks based on user mentions and are compared with
other popularity measures (i.e. number of followers, number of
friends) used for identifying very important users within the
structure of these networks. This work also presents a case study on a
social media network formed by Twitter accounts based on a set of
terrorism-related Arabic keywords provided by LEAs and domain
experts, for demonstrating the performance of our proposed
framework based on evidence related to the suspension of the majority
of the retrieved Twitter accounts.
2 KEY TERRORISM COMMUNITY
DETECTION FRAMEWORK
In this work, entropy-based centrality measures are exploited to
first retrieve a list of key-players, and a community detection
algorithm is then used to enrich the initial set of results. Our framework is
presented in Figure 1, where keyword-based search provides a set of
social media posts. Based on this, a network of mentions is created,
using the user-to-user interactions contained in the corresponding
posts. In the resulting network of users, each user is represented by
a node and a link between two users $(i,k)$ exists if user $n_i$ mentions
or is mentioned by user $n_k$.
On the network of mentions, we use entropy-based centrality
measures to, first, identify key-players [10] and we then extend the
method by associating key-players with their community.
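The construction of the network of mentions can be sketched as follows. This is only an illustration under stated assumptions: the `(author, text)` post format, the `@name` regular expression, the account names and the decision to keep authors without mentions as isolated nodes are all hypothetical choices, not details given in this work.

```python
import re

MENTION_RE = re.compile(r"@(\w+)")


def build_mention_network(posts):
    """Build an undirected, unweighted adjacency map {user: set_of_neighbours}.

    `posts` is an iterable of (author, text) pairs. This input format is a
    hypothetical stand-in for the output of the keyword-based search.
    """
    adjacency = {}
    for author, text in posts:
        author = author.lower()
        # Assumption: authors who mention nobody are kept as isolated nodes.
        adjacency.setdefault(author, set())
        for mentioned in MENTION_RE.findall(text):
            mentioned = mentioned.lower()
            if mentioned == author:
                continue  # skip self-mentions, which would create self-loops
            # One link per pair, regardless of direction or repetition.
            adjacency[author].add(mentioned)
            adjacency.setdefault(mentioned, set()).add(author)
    return adjacency


# Toy posts with made-up account names.
posts = [
    ("alice", "news from @bob and @carol"),
    ("bob", "replying to @alice"),
    ("dave", "no mentions here"),
]
network = build_mention_network(posts)
```

Storing neighbours in sets keeps the network unweighted: repeated mentions between the same two accounts collapse into a single link.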
2.1 Centrality-based key player identification
We denote by $G(N,L)$ the network of mentions with $N$ nodes (user
accounts) and $L$ links. The network is unweighted and undirected,
capturing only the user-to-user interactions in Twitter or any other
social media domain. The degree of a node $n_k$ is denoted by $\deg(n_k)$,
and is equal to the number of its adjacent links. The degree is
normalized to define the degree centrality as follows [9]:
$$DC_k = \frac{\deg(n_k)}{N-1} \tag{1}$$
The degree simply counts the neighbors of a node and is not affected
by the position of a hub in the network. In contrast, the betweenness
centrality [9] of a node $n_k$ is based on the number of paths $g_{ij}(n_k)$
from node $n_i$ to node $n_j$ that pass through node $n_k$, divided by the
number of all paths $g_{ij}$ from node $n_i$ to node $n_j$, summed over all
pairs of nodes $(n_i,n_j)$ and normalized by its maximum value:
$$BC_k = \frac{2 \sum_{i<j}^{N} \frac{g_{ij}(n_k)}{g_{ij}}}{N^2 - 3N + 2} \tag{2}$$
Nodes with high betweenness centrality are very important for the
communication in a network [1], due to the fact that their removal
strongly affects the network connectivity and robustness. Other
centrality measures have also been proposed, based on the mutual
distances of all nodes (closeness centrality) [9], on the influence of
a node (eigenvector centrality) [4], or motivated by the importance
of a Web page (PageRank) [5].
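As an illustration of Equation (2), the sketch below computes the normalized betweenness centrality on a small adjacency map. It assumes that "paths" means shortest paths, as in Freeman's definition [9], and it uses Brandes' accumulation scheme, which is a choice made for this example and is not stated in this work. The function name and the toy graphs are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness of every node of an undirected, unweighted graph.

    `adjacency` maps each node to the set of its neighbours; every
    neighbour must itself be a key of the map.
    """
    nodes = list(adjacency)
    n = len(nodes)
    raw = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes to s.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                raw[w] += delta[w]
    if n < 3:
        return {v: 0.0 for v in nodes}
    # raw counts every unordered pair twice, i.e. raw = 2 * sum over i < j,
    # so dividing by (N - 1)(N - 2) = N^2 - 3N + 2 reproduces Eq. (2).
    scale = 1.0 / ((n - 1) * (n - 2))
    return {v: raw[v] * scale for v in nodes}


# Toy graphs: a three-node path and a four-node path.
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
path4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
bc3 = betweenness_centrality(path3)
bc4 = betweenness_centrality(path4)
```

In the three-node path the middle node lies on the only path between the two end nodes, so it reaches the maximum value of 1 while the end nodes score 0.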
In the context of this work, we propose the use of entropy-based
centrality measures, such as the Mapping Entropy (ME) and the
Mapping Entropy Betweenness (MEB), taking also into account the
neighborhood $\mathcal{N}(n_k)$ of a node $n_k$. Mapping Entropy centrality [18]
is defined as a function of the degree centrality:
$$ME_k = -DC_k \sum_{n_i \in \mathcal{N}(n_k)} \log DC_i \tag{3}$$
whereas Mapping Entropy Betweenness centrality [10] is defined as
a function of betweenness centrality:
$$MEB_k = -BC_k \sum_{n_i \in \mathcal{N}(n_k)} \log BC_i \tag{4}$$
Intuitively, to interpret Equations (3) and (4), one may think of
a random walker on the network, standing at node $n_k$, who picks
his/her next step with probability $DC_i$ ($BC_i$). Then, the weight
$-\log DC_i$ ($-\log BC_i$) is interpreted as the Shannon information of
the event that the random walker picked node $n_i$, and is summed
over all neighbors of node $n_k$. These two measures consider the
information that is communicated through nodes that act as a hub
(bridge), i.e. those with high values of degree (betweenness)
centrality between any two members. In particular, the MEB centrality
considers the betweenness centrality of a node and exploits local
information from its neighborhood; hence, high MEB values
indicate that a particular node can act as a bridge for disseminating
information, even if its degree centrality is low [22].
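The two entropy-based measures share one functional form, which the following sketch makes explicit. It assumes the sign convention of Shannon information (a leading minus sign) and, as a further assumption not discussed in this work, skips neighbours whose centrality is zero because their logarithm is undefined. The function names and the toy path graph are hypothetical, and the betweenness values are supplied by hand for the four-node path.

```python
import math


def degree_centrality(adjacency):
    """Eq. (1): degree divided by N - 1."""
    n = len(adjacency)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adjacency.items()}


def mapping_entropy(adjacency, centrality):
    """Eqs. (3)-(4): -C_k times the sum of log C_i over the neighbours of n_k."""
    scores = {}
    for v, nbrs in adjacency.items():
        # Assumption: neighbours with zero centrality are skipped (log 0 undefined).
        log_sum = sum(math.log(centrality[u]) for u in nbrs if centrality[u] > 0)
        scores[v] = -centrality[v] * log_sum
    return scores


# Toy path graph a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

dc = degree_centrality(path)
me = mapping_entropy(path, dc)          # Mapping Entropy, Eq. (3)

# Normalized betweenness of the same path, entered by hand for this example.
bc = {"a": 0.0, "b": 2 / 3, "c": 2 / 3, "d": 0.0}
meb = mapping_entropy(path, bc)         # Mapping Entropy Betweenness, Eq. (4)
```

Passing the centrality as an argument means the same function yields ME when given degree centralities and MEB when given betweenness centralities.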
In the following, we combine the key-player identification
methods with community detection approaches that are able to cluster
the network into communities of densely connected user accounts.
2.2 Community detection around key players
In parallel to the key-player identification, a community detection
algorithm is used to divide the network into groups of users
(communities). The top-ranked key-player is used to enrich the retrieved
results, which is achieved by searching for the community to which
the key-player belongs.
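The enrichment step can be illustrated with a short lookup. The sketch assumes that centrality scores and a list of detected communities are already available; the function name, account identifiers and score values are made up for the example.

```python
def key_community(scores, communities):
    """Return the top-ranked key-player and the community containing it.

    `scores` maps each account to a centrality score and `communities`
    is a list of sets of accounts. Both inputs are hypothetical here.
    """
    key_player = max(scores, key=scores.get)
    for members in communities:
        if key_player in members:
            return key_player, set(members)
    # Density-based methods may leave the key-player unassigned (noise).
    return key_player, set()


# Made-up scores and a made-up partition into two communities.
meb_scores = {"u1": 0.02, "u2": 0.31, "u3": 0.05, "u4": 0.00}
communities = [{"u1", "u4"}, {"u2", "u3"}]
player, members = key_community(meb_scores, communities)
```

Returning an empty set when no community contains the key-player covers methods that do not assign every node to a community.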
Community detection in complex networks aims to identify
groups of nodes that are more densely connected to each other
within a group than to the rest of the network outside of the group
[20]. The groups are communities of users in the social media
domain, sharing a common property or playing similar roles within
the network [8]. Community structure is very popular in many
fields, including sociology and biology [12], as well as computer
science [17], and in any domain where systems or items admit a
network representation. Detecting communities in complex
networks is often viewed as a graph partitioning problem, where all
nodes are assigned to a community, but density-based approaches
leave out noise, i.e. do not assign all nodes to communities. In our
experiments, we shall present and compare both approaches.
Figure 1: Key terrorism-related community detection on the network of Twitter mentions. (Stages shown: Search by Keyword, Network of Mentions, Mapping Entropy Betweenness, Community Detection.)
Several community detection algorithms have been proposed
(e.g. [2, 8, 12, 13, 16, 21, 23, 25]). The network is partitioned into
communities using either the maximization of modularity [2, 17],
the minimization of codelength [24] or density-based approaches
[11]. We present in the experiments the key-community, defined as
the community that the key-player belongs to, as provided by the
algorithms FastGreedy [7], Walktrap [21], Infomap [3, 24, 25],
Louvain [2] and DBSCAN*-Martingale [11]. The most popular methods
are those aiming at the maximization of modularity, defined as [7]:
$$Q = \frac{1}{2m} \sum_{i=1}^{c} \left(e_{ii} - \alpha_i^2\right) \tag{5}$$
where $e_{ij}$ is the fraction of links between a node in community $i$
and a node in community $j$, $\alpha_i$ is the fraction of links between
two members of the community $i$, $m = \sum_k \deg(n_k)$, and $c$ is the
number of communities. We adopt the modularity maximization
community detection approach as a fast and scalable approach
that admits hierarchical and iterative methods [2, 20] to maximize
the objective function of Equation 5. Assuming the key-player is
a member of the $k$-th community, our framework returns all its
members $n_{k1}, n_{k2}, \ldots, n_{kl}$, all of which are marked as the final list
of accounts with suspicious activity.
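To make the modularity objective concrete, the sketch below evaluates a given partition. Note the assumption: it follows the normalized form used by Clauset et al. [7], in which each community contributes its fraction of internal link ends minus the squared fraction of link ends attached to it, so the normalization may differ from Equation 5 as typeset. The function name and the toy graph (two triangles joined by a single link) are hypothetical.

```python
def modularity(adjacency, communities):
    """Modularity of a partition of an undirected, unweighted graph.

    Assumed form: Q = sum over communities of (e_ii - a_i ** 2), where
    e_ii is the fraction of link ends that stay inside community i and
    a_i is the fraction of link ends attached to community i.
    """
    total_degree = sum(len(nbrs) for nbrs in adjacency.values())  # 2 * links
    q = 0.0
    for members in communities:
        members = set(members)
        # Each internal link is seen from both of its end nodes.
        internal_ends = sum(
            1 for v in members for u in adjacency[v] if u in members
        )
        degree_sum = sum(len(adjacency[v]) for v in members)
        e_ii = internal_ends / total_degree
        a_i = degree_sum / total_degree
        q += e_ii - a_i ** 2
    return q


# Two triangles (a, b, c) and (d, e, f) joined by the link c - d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
q_split = modularity(graph, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(graph, [set(graph)])
```

Splitting the graph into its two triangles gives a positive score, whereas placing all nodes in one community gives zero, which is why maximizing this quantity favours densely connected groups.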
3 EXPERIMENTS
We evaluate our framework on a network consisting of
terrorism-related Twitter accounts formed based on user mentions.
As ground-truth we make use of information from Twitter, which
marks user accounts as suspended, given that the suspension
process is applied when an account violates Twitter rules by
exhibiting abusive behavior, including posting content related to violent
threats and hate speech (Twitter has suspended 360,000
terrorism-related accounts from mid-2015 until August 2016; see footnote 1).
Our data were collected by executing queries on the Twitter API
(footnote 2) based on a set of five Arabic keywords related to terrorist
propaganda. These keywords were provided by LEAs and domain experts
and are related to the Caliphate State, its news, publications, and photos from
the Caliphate area. The collected dataset consists of 9,528 Twitter
posts by 4,400 users. The top-100 user accounts are retrieved in the
key-player identification step using the ranking methods of Table 1
and are then combined with the community detection approaches
of Table 2. The evaluation is performed by assessing whether these
accounts are suspended, active or no longer exist (i.e. accounts
which have been temporarily or permanently deactivated).
e rst part of our framework evaluates several centrality mea-
sures, including the proposed Mapping Entropy and Mapping En-
tropy Betweenness, as well as popularity measures, such as the
number of friends and followers, in terms of their ability to re-
trieve suspended users. e results in Table 1 indicate that the
entropy-based centralities ME and MEB are able to retrieve the rst
suspended user at position 16, while PageRank follows at position
19. Other centrality and popularity measures, such as closeness,
eigenvector and number of followers do not nd any suspended
user at the top-100 positions of their retrieved users. We observe
that the network is very spread with many bridges and a diam-
eter equal to 27, so key-players are expected to be positioned in
1
hps://blog.twier.com/2016/an-update-on-our-eorts-to-combat-violent-
extremism
2hps://dev.twier.com/
MFSec’17, June 06, 2017, Bucharest, Romania I. Gialampoukidis et al.
Table 1: Comparison among several ranking methods
Ranking Method Position Reciprocal Rank
Degree centrality 20 5.00%
Betweenness centrality 35 2.86%
Closeness centrality >100 <1.00%
Eigenvector centrality >100 <1.00%
Num of followers >100 <1.00%
Num of friends 31 3.23%
PageRank 19 5.26%
Mapping Entropy 16 6.25%
Mapping Entropy Betweenness 16 6.25%
K=1 K=2 K=3
K=5 K=10 Largest connected component
Figure 2: First, second, third, h and tenth order neighbor-
hoods of the suspended user and the largest component.
between many pairs of nodes in the network, exploiting also their
neighborhood’s high betweenness centrality.
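The Reciprocal Rank column of Table 1 is consistent with one over the position of the first suspended account (e.g. 1/16 = 6.25%). A minimal sketch of that computation follows; the function names, the cutoff parameter and the toy account identifiers are hypothetical, and returning `None` when no suspended account appears within the cutoff is a choice made for this example.

```python
def first_hit_position(ranked_accounts, suspended, cutoff=100):
    """1-based position of the first suspended account within the cutoff."""
    for position, account in enumerate(ranked_accounts[:cutoff], start=1):
        if account in suspended:
            return position
    return None  # no suspended account among the top `cutoff` results


def reciprocal_rank(position):
    """Reciprocal of the position, or None when there was no hit."""
    return None if position is None else 1.0 / position


# Made-up ranking of 20 accounts in which the 16th one is suspended.
ranking = ["acct%02d" % i for i in range(1, 21)]
suspended = {"acct16", "acct19"}
position = first_hit_position(ranking, suspended)
rr = reciprocal_rank(position)
```

With these toy inputs the first suspended account sits at position 16, which corresponds to the 6.25% reported for ME and MEB.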
The $K$-th order neighborhood $\mathcal{N}_K$ of node $n_j$ is the set of all
nodes that are reachable from $n_j$ within $K-1$ intermediate nodes:
$\mathcal{N}_K(n_j) = \{n \in N : d(n,n_j) \le K\}$, where $d(n,n_j)$ is the network
distance of any two nodes. In Figure 2 we show the first ($K=1$),
second ($K=2$), third ($K=3$), fifth ($K=5$) and tenth ($K=10$)
order neighborhoods of the first suspended user and the largest
connected component. Although the ME and MEB centralities both
retrieve a suspended user at rank 16, the user does not correspond
to the same Twitter account. In fact, the Twitter user at the 16th
position of ME centrality leads to a disconnected component of
two users, where one of them is suspended and the other is not.
However, the neighborhood of the suspended user (Figure 2) from
the MEB centrality is part of the largest connected component of
the network with 1,334 accounts. Therefore, we proceed to the next
step by considering the MEB centrality measure and not ME.
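The K-th order neighborhood defined above can be obtained with a breadth-first search that stops expanding at distance K. The sketch below is an illustration only; the function name and the toy graph, a five-node path plus a separate two-node component, are hypothetical.

```python
from collections import deque


def neighborhood(adjacency, source, k):
    """All nodes n with network distance d(n, source) <= k, source included."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # nodes beyond distance k are not explored
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)


# Path a - b - c - d - e and a disconnected pair x - y.
graph = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"},
    "x": {"y"}, "y": {"x"},
}
first_order = neighborhood(graph, "a", 1)
second_order = neighborhood(graph, "a", 2)
component = neighborhood(graph, "a", len(graph))
```

Choosing K at least as large as the number of nodes returns the whole connected component of the source node, which is how a two-user component and the largest connected component can be told apart.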
Given the rst identied suspended user in the MEB ranking,
we explore the community where the user belongs to. e re-
sults are reported in Table 2, along with the community size per
community detection method. We observe that in all cases exam-
ined, the majority of accounts are already suspended and some of
Table 2: Comparison of community detection methods
Method
Community
size
Active
users
Do not
exist
Suspended
accounts
FastGreedy 58 4 (6.9%)
6 (10.3%)
48 (82.8%)
Walktrap 33 3 (9.1%)
4 (12.1%)
26 (78.8%)
Louvain 58 4 (6.9%)
6 (10.3%)
48 (82.8%)
Infomap 20
4 (20.0%)
3 (15.0%)
13 (65.0%)
DBSCAN*-
Martingale
23 2 (8.7%)
3 (13.0%)
18 (78.3%)
Figure 3: A sample set of images uploaded by key-players
with militaristic or nationalistic content. Faces are redacted
so as to avoid the inclusion of sensitive information.
them no longer exist. In particular, the modularity maximization
methods (FastGreedy, Louvain) are able to retrieve the largest com-
munities and thus more accounts with potentially illegal activity.
e percentage of suspended users is 82.76% for the modularity
maximization approaches and 78% for the Walktrap and DBSCAN*-
Martingale, indicating a marginal advantage for the former. e
community provided by Infomap is very small, compared to the
other community sizes, but still the number of active accounts (not
yet suspended) is only 20%. Figure 3 depicts sample content from
such active accounts that have not been marked as suspended by
Twier. One may note that their content is military-themed or
extremist, indicating potentially suspicious user activity even in
non-suspended accounts.
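The percentages of Table 2 follow directly from the reported account counts, as the short check below shows. The counts are copied from Table 2; the dictionary layout and the function name are hypothetical.

```python
# (active, no longer existing, suspended) account counts from Table 2.
table2 = {
    "FastGreedy": (4, 6, 48),
    "Walktrap": (3, 4, 26),
    "Louvain": (4, 6, 48),
    "Infomap": (4, 3, 13),
    "DBSCAN*-Martingale": (2, 3, 18),
}


def shares(active, missing, suspended):
    """Community size and the percentage of each account status."""
    size = active + missing + suspended
    percentages = tuple(
        round(100.0 * count / size, 1) for count in (active, missing, suspended)
    )
    return size, percentages


summary = {method: shares(*counts) for method, counts in table2.items()}
```

For example, 48 suspended accounts out of 58 give 82.76%, which rounds to the 82.8% listed for FastGreedy and Louvain.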
4 CONCLUSIONS
We proposed a hybrid model that combines MEB centrality and
community detection to retrieve groups of social media user accounts
that are key-players in the terrorism domain. We found that
centrality measures on the network of mentions perform better than
popularity measures (number of followers or friends) in finding
key-players in the terrorism domain. Given a terrorism-related user,
his/her network community reveals a group of additional
terrorism-related users, exploiting the outcome of a community detection
method, with modularity maximization methods outperforming
density-based and other methods.
ACKNOWLEDGMENTS
This work was supported by the project TENSOR (H2020-700024),
funded by the European Commission.
REFERENCES
[1] Ala Berzinji, Lisa Kaati, and Ahmed Rezine. 2012. Detecting key players in
terrorist networks. In Intelligence and Security Informatics Conference (EISIC),
2012 European. IEEE, 297–302.
[2] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne
Lefebvre. 2008. Fast unfolding of communities in large networks. Journal of statistical
mechanics: theory and experiment 2008, 10 (2008), P10008.
[3] Ludvig Bohlin, Daniel Edler, Andrea Lancichinetti, and Martin Rosvall. 2014.
Community detection and visualization of networks with the map equation
framework. In Measuring Scholarly Impact. Springer, 3–34.
[4] Phillip Bonacich and Paulette Lloyd. 2001. Eigenvector-like measures of centrality
for asymmetric relations. Social networks 23, 3 (2001), 191.
[5] Sergey Brin and Lawrence Page. 2012. Reprint of: The anatomy of a large-scale
hypertextual web search engine. Computer networks 56, 18 (2012), 3825–3833.
[6] Akemi Takeoka Chatfield, Christopher G Reddick, and Uuf Brajawidagda. 2015.
Tweeting propaganda, radicalization and recruitment: Islamic state supporters
multi-sided twitter networks. In Proceedings of the 16th Annual International
Conference on Digital Government Research. ACM, 239–249.
[7] Aaron Clauset, Mark EJ Newman, and Cristopher Moore. 2004. Finding
community structure in very large networks. Physical review E 70, 6 (2004), 066111.
[8] Santo Fortunato. 2010. Community detection in graphs. Physics reports 486, 3
(2010), 75–174.
[9] Linton C Freeman. 1978. Centrality in social networks conceptual clarification.
Social networks 1, 3 (1978), 215–239.
[10] Ilias Gialampoukidis, George Kalpakis, Theodora Tsikrika, Stefanos Vrochidis,
and Ioannis Kompatsiaris. 2016. Key player identification in terrorism-related
social media networks using centrality measures. In European Intelligence and
Security Informatics Conference (EISIC 2016), August. 17–19.
[11] Ilias Gialampoukidis, Theodora Tsikrika, Stefanos Vrochidis, and Ioannis
Kompatsiaris. 2016. Community detection in complex networks based on DBSCAN*
and a Martingale process. In Semantic and Social Media Adaptation and
Personalization (SMAP), 2016 11th International Workshop on. IEEE, 1–6.
[12] Michelle Girvan and Mark EJ Newman. 2002. Community structure in social
and biological networks. Proceedings of the national academy of sciences 99, 12
(2002), 7821–7826.
[13] Steve Harenberg, Gonzalo Bello, L Gjeltema, Stephen Ranshous, Jitendra Harlalka,
Ramona Seay, Kanchana Padmanabhan, and Nagiza Samatova. 2014. Community
detection in large-scale networks: a survey and empirical evaluation. Wiley
Interdisciplinary Reviews: Computational Statistics 6, 6 (2014), 426–439.
[14] Jytte Klausen. 2015. Tweeting the Jihad: Social media networks of Western
foreign fighters in Syria and Iraq. Studies in Conflict & Terrorism 38, 1 (2015),
1–22.
[15] Valdis Krebs. 2002. Uncloaking terrorist networks. First Monday 7, 4 (2002).
[16] Fragkiskos D Malliaros and Michalis Vazirgiannis. 2013. Clustering and
community detection in directed networks: A survey. Physics Reports 533, 4 (2013),
95–142.
[17] ME Newman and M Girvan. 2004. Finding and evaluating community structure
in networks. Physical Review E 69, 2 (2004), 026113.
[18] Tingyuan Nie, Zheng Guo, Kun Zhao, and Zhe-Ming Lu. 2016. Using mapping
entropy to identify node centrality in complex networks. Physica A: Statistical
Mechanics and its Applications 453 (2016), 290–297.
[19] Onook Oh, Manish Agrawal, and H Raghav Rao. 2011. Information control and
terrorism: Tracking the Mumbai terrorist attack through twitter. Information
Systems Frontiers 13, 1 (2011), 33–43.
[20] Symeon Papadopoulos, Yiannis Kompatsiaris, Athena Vakali, and Ploutarchos
Spyridonos. 2012. Community detection in social media. Data Mining and
Knowledge Discovery 24, 3 (2012), 515–554.
[21] Pascal Pons and Matthieu Latapy. 2006. Computing communities in large
networks using random walks. J. Graph Algorithms Appl. 10, 2 (2006), 191–218.
[22] Jialun Qin, Jennifer J Xu, Daning Hu, Marc Sageman, and Hsinchun Chen. 2005.
Analyzing terrorist networks: A case study of the global salafi jihad network. In
Intelligence and security informatics. Springer, 287–304.
[23] Usha Nandini Raghavan, Réka Albert, and Soundar Kumara. 2007. Near linear
time algorithm to detect community structures in large-scale networks. Physical
Review E 76, 3 (2007), 036106.
[24] Martin Rosvall, Daniel Axelsson, and Carl T Bergstrom. 2009. The map equation.
The European Physical Journal Special Topics 178, 1 (2009), 13–23.
[25] Martin Rosvall and Carl T Bergstrom. 2008. Maps of random walks on complex
networks reveal community structure. Proceedings of the National Academy of
Sciences 105, 4 (2008), 1118–1123.
[26] Sudhir Saxena, K Santhanam, and Aparna Basu. 2004. Application of social
network analysis (SNA) to terrorist networks in Jammu & Kashmir. Strategic
Analysis 28, 1 (2004), 84–101.
[27] Robin L Thompson. 2011. Radicalization and the use of social media. Journal of
strategic security 4, 4 (2011), 167.
[28] Robyn Torok. 2010. “Make A Bomb In Your Mums Kitchen”: Cyber Recruiting
And Socialisation of ‘White Moors’ and Home Grown Jihadists. (2010).
[29] Ines Von Behr. 2013. Radicalisation in the digital era: The use of the Internet in
15 cases of terrorism and extremism. (2013).
[30] Jie Xu, Daning Hu, and Hsinchun Chen. 2009. The dynamics of terrorist
networks: Understanding the survival mechanisms of Global Salafi Jihad. Journal
of Homeland Security and Emergency Management 6, 1 (2009), 1–15.
... Graph algorithm shows the interconnection between different entities. Researchers leverage this property of graph techniques to identify the extremists and their interconnections on the social network [50]. Fig. 5 provides the details of the Network or Graph-based approach in extremism detection research. ...
... The similarities obtained by comparing subgraphs are used to find extremists and their communities [50]. Some researchers use graph techniques to extract semantic and structural similarities from text datasets for extremism. ...
... These behavioural profiles are collected using simple graphs. The author uses Multinomial Naïve Bayes to classify collected extremist data into these behaviour profiles Gialampoukidis et al. [50] use five Arabic keywords to collect the data from Twitter, which gives a total of 4400 user accounts and 9,528 Twitter posts. The authors use a network / graph-based approach to automatically identify extremist communities. ...
Article
Full-text available
Social media platforms are popular for expressing personal views, emotions and beliefs. Social media platforms are influential for propagating extremist ideologies for group-building, fund-raising, and recruitment. To monitor and control the outreach of extremists on social media, detection of extremism in social media is necessary. The existing extremism detection literature on social media is limited by specific ideology, subjective validation methods, and binary or tertiary classification. A comprehensive and comparative survey of datasets, classification techniques, validation methods with online extremism detection tool is essential. The systematic literature review methodology (PRISMA) was used. Sixty-four studies on extremism research were collected, including 31 from SCOPUS, Web of Science (WoS), ACM, IEEE, and 33 thesis, technical and analytical reports using Snowballing technique. The survey highlights the role of social media in propagating online radicalization and the need for extremism detection on social media platforms. The review concludes lack of publicly available, class-balanced, and unbiased datasets for better detection and classification of social-media extremism. Lack of validation techniques to evaluate correctness and quality of custom data sets without human interventions, was found. The information retrieval unveiled that contemporary research work is prejudiced towards ISIS ideology. We investigated that deep learning based automated extremism detection techniques outperform other techniques. The review opens the research opportunities for developing an online, publicly available automated tool for extremism data collection and detection. The survey results in conceptualization of architecture for construction of multi-ideology extremism text dataset with robust data validation techniques for multiclass classification of extremism text.
... The methodological breakthroughs in network science, in fact, benefited the study of criminal and terrorist networks by providing empirical insights that have supported old theories or contributed to the generation of new ones [29][30][31][32]. In terrorism research, the network paradigm has been applied among other things to study Islamist or jihadism organizations [33][34][35][36], alliances between actors in the global scenario [37] and support and radicalization through social media platform [38][39][40]. Yet, relational perspectives to the study of such social phenomena fail to go beyond the tangible connections between individuals (or groups of individuals). With very few exceptions, however, the literature on networks and crime and networks and terrorism have not considered other types of relationships, e.g., those between events or characteristics of events, which may reveal underlying knowledge structures that escape the traditional methodologies embedded in traditional Euclidean spaces generally employed to study actors or their behaviours. ...
Article
Behaviours across terrorist groups differ based on a variety of factors, such as groups' resources or objectives. We here show that organizations can also be distinguished by network representations of their operations. We provide evidence in this direction in the frame of a computational methodology organized in two steps, exploiting data on attacks plotted by Al Shabaab, Boko Haram, the Islamic State and the Taliban in the 2013-2018 period. First, we present LabeledSparseStruct, a graph embedding approach, to predict the group associated with each operational meta-graph. Second, we introduce SparseStruct-Explanation, an algorithmic explainer based on LabeledSparseStruct, that disentangles characterizing features for each organization, enhancing interpretability at the dyadic level. We demonstrate that groups can be discriminated according to the structure and topology of their operational meta-graphs, and that each organization is characterized by the recurrence of specific dyadic interactions among event features.
... (2) Network-based analysis: Exploit metadata and online interactions (e.g. likes, re-tweets, comments, mentions, re-blogging and hyperlinks to other pages) to detect communities, social leaders or controllers [16,17]. ...
Article
Purpose: Social networks (SNs) have recently evolved from a means of connecting people to becoming a tool for social engineering, radicalization, dissemination of propaganda and recruitment of terrorists. It is no secret that the majority of the Islamic State in Iraq and Syria (ISIS) members are Arabic speakers, and even the non-Arabs adopt Arabic nicknames. However, the majority of the literature researching the subject deals with non-Arabic languages. Moreover, the features involved in identifying radical Islamic content are shallow and the search or classification terms are common in daily chatter among people of the region. The authors aim at distinguishing normal conversation, influenced by the role religion plays in daily life, from terror-related content. Design/methodology/approach: This article presents the authors' experience and the results of collecting, analyzing and classifying Twitter data from affiliated members of ISIS, as well as sympathizers. The authors used artificial intelligence (AI) and machine learning classification algorithms to categorize the tweets as terror-related, generic religious, and unrelated. Findings: The authors report the classification accuracy of the K-nearest neighbor (KNN), Bernoulli Naive Bayes (BNN) and support vector machine (SVM) [one-against-all (OAA) and all-against-all (AAA)] algorithms. The authors achieved a high classification F1 score of 83%. The work in this paper will hopefully aid more accurate classification of radical content. Originality/value: In this paper, the authors have collected and analyzed thousands of tweets advocating and promoting ISIS. The authors have identified many common markers and keywords characteristic of ISIS rhetoric. Moreover, the authors have applied text processing and AI machine learning techniques to classify the tweets into one of three categories: terror-related, non-terror political chatter and news, and unrelated data-polluting tweets.
... Studies have investigated the detection of different kinds of hate speech, such as cyberbullying [11,12,13] and offensive language [14,15], or targeted hate speech in general by distinguishing between types of hate speech and neutral expressions [16,17,18]. Others have dealt with the problem by detecting specific types of hate speech, such as anti-religion [19,20], jihadist [21,22,23,24], sexist, and racist [25,26,27]. However, less attention has been given to detecting white supremacist content in particular, with only one study that uses white supremacist data [28]. ...
Article
White supremacist hate speech is among the most recently observed forms of harmful content on social media. The critical influence of these radical groups is no longer limited to social media and can negatively affect society by promoting racial hatred and violence. Traditional channels of reporting hate speech have proved inadequate due to the tremendous explosion of information and the implicit nature of hate speech. Therefore, it is necessary to detect such speech automatically and in a timely manner. This research investigates the feasibility of automatically detecting white supremacist hate speech on Twitter using deep learning and natural language processing techniques. Two deep learning models are investigated in this research. The first approach utilizes a bidirectional Long Short-Term Memory (BiLSTM) model along with domain-specific word embeddings extracted from a white supremacist corpus to capture the semantics of white supremacist slang and coded words. The second approach utilizes one of the most recent language models, Bidirectional Encoder Representations from Transformers (BERT). The BiLSTM model achieved a 0.75 F1-score and BERT reached a 0.80 F1-score. Both models are tested on a balanced dataset that combines a Twitter dataset with a Stormfront dataset compiled from a white supremacist forum.
... The methodology [60] starts by representing the social network as a graph in which the nodes are the Twitter user accounts and the links are the connections between them; a link between two users means that one user mentions or is mentioned by the other. The next step is to apply a community detection algorithm in order to divide the network into groups of users who are more densely connected to each other within the group than to the rest of the network. ...
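The two steps in this excerpt (build a mention network, then split it into densely connected groups) can be sketched as follows. This is a minimal illustration only: the excerpt does not say which community detection algorithm the cited methodology uses, so a naive greedy modularity merge is swapped in here, and the function names and the toy mention list are hypothetical.

```python
from itertools import combinations


def build_mention_graph(mentions):
    """Undirected graph: a link joins two users if either mentions the other."""
    adj = {}
    for u, v in mentions:
        if u == v:  # ignore self-mentions
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def modularity(adj, communities):
    """Newman modularity: intra-community link share minus its expected value."""
    m = sum(len(nb) for nb in adj.values()) / 2  # number of links
    q = 0.0
    for com in communities:
        intra = sum(1 for u in com for v in adj[u] if v in com) / 2
        degree = sum(len(adj[u]) for u in com)
        q += intra / m - (degree / (2 * m)) ** 2
    return q


def greedy_communities(adj):
    """Assumed stand-in algorithm: repeatedly merge the pair of communities
    that raises modularity the most, and stop when no merge helps."""
    communities = [frozenset([u]) for u in sorted(adj)]
    best_q = modularity(adj, communities)
    while len(communities) > 1:
        best = None
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            q = modularity(adj, merged)
            if q > best_q + 1e-12:
                best_q, best = q, merged
        if best is None:
            break
        communities = best
    return communities


# Toy data (hypothetical): two tightly knit triads joined by one mention.
mentions = [("a", "b"), ("b", "c"), ("a", "c"),
            ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
graph = build_mention_graph(mentions)
print(sorted(sorted(c) for c in greedy_communities(graph)))
# prints [['a', 'b', 'c'], ['d', 'e', 'f']]
```

The pairwise search is quadratic in the number of communities at every step, so this sketch only suits small graphs; a real mention network would call for an optimized implementation.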
Article
Social media play an important role in the daily life of people around the globe and users have emerged as an active part of news distribution as well as production. The threatening pandemic of COVID-19 has been the lead subject in online discussions and posts, resulting in large amounts of related social media data, which can be utilised to reinforce crisis management in several ways. Towards this direction, we propose a novel framework to collect, analyse, and visualise Twitter posts, which has been tailored to specifically monitor the virus spread in severely affected Italy. We present and evaluate a deep learning localisation technique that geotags posts based on the locations mentioned in their text, a face detection algorithm to estimate the number of people appearing in posted images, and a community detection approach to identify communities of Twitter users. Moreover, we propose further analysis of the collected posts to predict their reliability and to detect trending topics and events. Finally, we demonstrate an online platform that comprises an interactive map to display and filter analysed posts, utilising the outcome of the localisation technique, and a visual analytics dashboard that visualises the results of the topic, community, and event detection methodologies.
... Ovelgonne, Kang, Sawant, and Subrahmanian (2012) proposed the use of a covertness centrality measure that maps centrality scores and communication frequency of a new user with the known covert nodes. In the same manner, some researchers have proposed adding scores of the extremist neighborhood within the centrality measures (Gialampoukidis et al., 2017; Wadhwa & Bhatia, 2016; Wei & Singh, 2017). Similarly, Wei, Singh, and Martin (2016) used user ties with the extremist group as a feature in the machine learning algorithms. ...
Article
During the last two decades, the number of incidents involving extremists has increased, as has the use of social media. Research suggests that extremists use social media for purposes such as recruitment, fundraising, and propaganda. Limited research is available on identifying rebel users on social media platforms. Therefore, we propose a Supervised Rebel Identification (SRI) framework to identify rebels on Twitter. The framework consists of a novel mechanism to structure the users’ tweets into a directed user graph. This user graph links predicates (verbs) with the subject and object words to understand the semantics of the underlying data. We convert the user graph into a graph embedding to use these semantics within the machine learning algorithms. Apart from the user graph and its embedding, we propose fourteen other features belonging to tweets’ contents and users’ profiles. For evaluation, we present the first multicultural and multiregional dataset of rebels affiliated with nine rebel movements belonging to five countries. We evaluate the proposed SRI framework against two state-of-the-art baselines. The results show that the SRI framework outperforms the baselines with high accuracy.
... That study found that the actor Ryan Restu is influential, having the highest degree centrality value, and that the actor Tirto ID is the most popular, having the highest follower rank value. In addition, several previous studies [4] [5] [6] [7] have shown that the closeness centrality algorithm yields centrality values that indicate the popularity and influence of a node (actor) in a network. ...
Article
The rapid development of web technology makes it ever easier for people to access information. The third generation of web-based internet services (Web 3.0) introduced the Semantic Web, which aims to make web content understandable by computers. The Semantic Web can be used to retrieve a dataset from DBpedia Indonesia, namely a list of Indonesian films, for subsequent data analysis. The aim of the analysis is to identify the most popular actors and directors in the Indonesian film industry. This study uses the closeness centrality algorithm and Node2vec to determine the popularity of actors. It also uses graph density to identify influential directors in the Indonesian film industry. The results of these computations are visualized as graphs using Neo4j, Networkx and tSNE. The study finds that Rima Melati is the most popular actor, with the highest closeness centrality value. This can also be interpreted to mean that Rima Melati has starred in the largest number of films. In the graph density computation, Sophan Sophiaan is the most influential director, having directed the largest number of films. Keywords: closeness centrality, dbpedia, density graph, networkx, node2vec
... Recent works have examined the use of social media platforms by terrorist groups and organizations (Chatfield et al. 2015; Klausen 2015). Moreover, key player and key community identification in terrorism-related Twitter networks has been addressed through the use of different centrality measures and community detection algorithms (Gialampoukidis et al. 2016, 2017). Complementary to the aforementioned research efforts, our paper analyzes several textual, spatial, temporal and social network features which, when combined, are capable of characterizing the terrorism-related nature of Twitter accounts. ...
Book
This book is open access under a CC BY 4.0 license.
Preprint
White supremacists embrace a radical ideology that considers white people superior to people of other races. The critical influence of these groups is no longer limited to social media; they also have a significant effect on society in many ways by promoting racial hatred and violence. White supremacist hate speech is among the most recently observed forms of harmful content on social media. Traditional channels of reporting hate speech have proved inadequate due to the tremendous explosion of information, and therefore it is necessary to find an automatic way to detect such speech in a timely manner. This research investigates the viability of automatically detecting white supremacist hate speech on Twitter by using deep learning and natural language processing techniques. We experimented with two approaches. The first uses domain-specific embeddings, extracted from a white supremacist corpus in order to capture the meaning of white supremacist slang, together with a bidirectional Long Short-Term Memory (LSTM) deep learning model; this approach reached a 0.74890 F1-score. The second uses one of the most recent language models, BERT, which provides state-of-the-art results on most NLP tasks; it reached a 0.79605 F1-score. Both approaches are tested on a balanced dataset, and our experiments were based on textual data only. The dataset combines a dataset created from Twitter with a Stormfront dataset compiled from that white supremacist forum.
Conference Paper
Community detection is a valuable tool for analyzing complex networks. This work investigates the community detection problem based on the density-based algorithm DBSCAN*. This algorithm, however, requires a lower bound for the community size to be determined a priori, which is a challenging task. To this end, this work proposes the application of a Martingale process to DBSCAN* that progressively detects communities at various levels of granularity. The proposed DBSCAN*-Martingale community detection algorithm corresponds to an iterative process that progressively lowers the threshold of the size of the acceptable communities, while maintaining the communities detected for higher thresholds. Evaluation experiments are performed on four realistic benchmark networks, and the results indicate improvements in the effectiveness of the proposed DBSCAN*-Martingale community detection algorithm in terms of the Normalized Mutual Information and the RAND metrics against several state-of-the-art community detection approaches.
Chapter
Large networks contain plentiful information about the organization of a system. The challenge is to extract useful information buried in the structure of myriad nodes and links. Therefore, powerful tools for simplifying and highlighting important structures in networks are essential for comprehending their organization. Such tools are called community-detection methods and they are designed to identify strongly intraconnected modules that often correspond to important functional units. Here we describe one such method, known as the map equation, and its accompanying algorithms for finding, evaluating, and visualizing the modular organization of networks. The map equation framework is very flexible and can identify two-level, multi-level, and overlapping organization in weighted, directed, and multiplex networks with its search algorithm Infomap. Because the map equation framework operates on the flow induced by the links of a network, it naturally captures flow of ideas and citation flow, and is therefore well-suited for analysis of bibliometric networks.
Conference Paper
Monitoring terrorist groups and their suspicious activities in social media is a challenging task, given the large amounts of data involved and the need to identify the most influential users in a smart way. To this end, many efforts have focused on using centrality measures for the identification of the key players in terrorism-related social media networks, so that their suspension/removal leads to severe disruption in the connectivity of the network. This work proposes a novel centrality measure, Mapping Entropy Betweenness (MEB), and assesses its effectiveness for key player identification on a dataset of terrorism-related Twitter user accounts by simulating targeted attacks that remove the most central nodes of the network. The results indicate that the MEB affects the robustness of this terrorist network more than well-established centrality measures, in the largest part of the attack process.
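The targeted-attack evaluation described in this abstract can be illustrated with a short simulation: repeatedly remove the currently most central node and record how the largest connected component shrinks. This is a sketch under stated assumptions, not the paper's procedure: plain degree is used as the ranking score because the MEB formula is not given here, and the function names and the star-shaped toy network are hypothetical.

```python
def largest_component_size(adj):
    """Size of the largest connected component, found by depth-first search."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def targeted_attack(adj, steps):
    """Remove the highest-degree node `steps` times, recomputing degrees after
    each removal, and return the largest-component size after every step.
    Degree is an assumed stand-in for the centrality measure under test."""
    adj = {u: set(nb) for u, nb in adj.items()}  # work on a copy
    history = [largest_component_size(adj)]
    for _ in range(min(steps, len(adj))):
        # Ties are broken by label order so the run is reproducible.
        target = max(sorted(adj), key=lambda u: len(adj[u]))
        for v in adj.pop(target):
            if v in adj:  # guards against a self-loop on the removed node
                adj[v].discard(target)
        history.append(largest_component_size(adj))
    return history


# Toy network (hypothetical): one hub account linked to four others.
star = {"h": {"a", "b", "c", "d"},
        "a": {"h"}, "b": {"h"}, "c": {"h"}, "d": {"h"}}
print(targeted_attack(star, 1))  # prints [5, 1]
```

Comparing centrality measures then amounts to swapping the `key` function and checking which ranking makes the recorded component sizes drop fastest.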
Article
The intuitive background for measures of structural centrality in social networks is reviewed and existing measures are evaluated in terms of their consistency with intuitions and their interpretability.
Conference Paper
Islamic State (IS) terrorist networks in Syria and Iraq pose threats to national security. IS' exploitation of social media and digital strategy plays a key role in its global dissemination of propaganda, radicalization, and recruitment. However, systematic research on Islamic terrorist communication via social media is limited. Our research investigates the question: How do IS members/supporters use Twitter for terrorism communication: propaganda, radicalization, and recruitment? Theoretically, we drew on microeconomic network theories to develop a theoretical framework for multi-sided Twitter networks in the global Islamic terrorist communication environment. Empirically, we collected 3,039 tweets posted by @shamiwitness who was identified in prior research as "an information disseminator" for the IS cause. Methodologically, we performed social network analysis, trend and content analyses of the tweet data. We find strong evidence for Shamiwitness-intermediated multi-sided Twitter networks of international mass media, regional Arabic mass media, IS fighters, and IS sympathizers, supporting the framework's utility.
Article
The problem of finding the best strategy to attack a network or immunize a population with a minimal number of nodes attracts current research interest. The assessment of node importance has been a fundamental issue in such research on complex networks. In this paper, we propose a new concept called mapping entropy (ME) to identify the importance of a node in a complex network. The concept is established according to local information that considers the correlation among all neighbours of a node. We evaluate the efficiency of the centrality by static attacks and dynamic attacks on standard network models and real-world networks. The simulation results show that the new centrality is more efficient than traditional attack strategies, in both the static and the dynamic setting.