Detection of Terrorism-related Twitter Communities using Centrality Scores

Ilias Gialampoukidis, George Kalpakis, Theodora Tsikrika, Symeon Papadopoulos, Stefanos Vrochidis and Ioannis Kompatsiaris
Information Technologies Institute
Centre for Research and Technology Hellas
Thessaloniki, Greece 57001

ABSTRACT
Social media are widely used among terrorists to communicate and disseminate their activities. User-to-user interaction (e.g. mentions, follows) leads to the formation of complex networks, with topology that reveals key-players and key-communities in the terrorism domain. Both the administrators of social media platforms and Law Enforcement Agencies seek to identify not only single users but groups of terrorism-related users so that they can reduce the impact of their information exchange efforts. To this end, we propose a novel framework that combines community detection with key-player identification to retrieve communities of terrorism-related social media users. Experiments show that most of the members of each retrieved key-community are already suspended by Twitter, violating its terms, and are hence associated with terrorism-oriented content with high probability.

CCS CONCEPTS
• Information systems → Information retrieval; Test collections; Web searching and information discovery; Multimedia and multimodal retrieval; • Human-centered computing → Social networking sites; • Networks → Social media networks;

KEYWORDS
Social Network Analysis, key-player identification, community detection, terrorism-oriented social media mining, Twitter

ACM Reference format:
Ilias Gialampoukidis, George Kalpakis, Theodora Tsikrika, Symeon Papadopoulos, Stefanos Vrochidis and Ioannis Kompatsiaris. 2017. Detection of Terrorism-related Twitter Communities using Centrality Scores. In Proceedings of MFSec’17, Bucharest, Romania, June 06, 2017, 5 pages.
DOI: http://

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from
MFSec’17, Bucharest, Romania
2017 Copyright held by the owner/author(s). Publication rights licensed to ACM.
978-1-4503-5034-1/17/06. . . $15.00
DOI: http://

The rapid growth of the Internet has resulted in modern forms of communication and exchange of information, realized mainly
through the use of social media networking platforms (e.g. Twitter, Facebook, etc.), which have dominated the online world during the past few years. Social media networks have made possible the communication among people across nationalities, religions, cultures or residences; however, their great power and reach have become an attractive feature for their use by terrorist and extremist organizations for disseminating their propaganda, recruiting and radicalizing new members, raising funds, organizing operations, and publishing information and instructions exploited by lone-wolf terrorists when preparing and committing acts of terror [27–29]. Due to its nature that permits the inexpensive communication of multimedia messages (i.e. tweets) to users worldwide, Twitter has been used by such organizations primarily for promoting and spreading their propaganda, typically using a top-down approach, with a core group of members spreading the group’s messages, which are then re-shared by other affiliated accounts. Both the administrators of the social media networking platform itself (Twitter), on the one hand, and the Law Enforcement Agencies (LEAs), on the other, are interested in monitoring terrorism-related activities taking place through the platform. In the former case, the goal is to detect material that violates the platform’s terms and conditions regarding extremist content, while in the latter case such information may be very useful in investigations for prosecuting the perpetrators of terrorist attacks. In both cases, it is of vital significance to detect the communities in the social networks and their most prominent users (i.e. key players) who disseminate terrorism-related information, so as to prevent terrorist groups from spreading their propaganda (to the extent possible), by shutting down accounts that are found to play a central role in this information exchange.
Over the past two decades, several research efforts have discussed the network structure of terrorist organizations. One of the early efforts examined the network structure of the 9/11 hijackers along with their accomplices and detected the ring leaders of the terrorist attacks based on their social associations [ ]. Later work focused on using social network analysis for examining the basic characteristics of terrorist groups or organizations [ ]. More recent research has examined the survival mechanisms of the Global Salafi Jihad (GSJ) terrorist network, even after being severely damaged by the authorities, by analyzing its network structure and topology [ ]. In addition, several works have been conducted for studying the use of social media, and especially Twitter, by terrorist organizations. Specifically, a work has examined the significant role of Twitter in facilitating terrorists to execute their attack in Mumbai (November 2008), by monitoring and exploiting situational information which was broadcast through Twitter [ ]. More recent research has studied the Islamic State’s (IS) strategy for communicating their propaganda for radicalizing and recruiting Twitter users [ ]. Furthermore, the significant role played by feeder accounts of terrorist organizations for exchanging information from the Syria insurgency zone is pointed out in [ ]. Key player identification in complex networks, on the other hand, has been mainly addressed through the use of different centrality measures; e.g. recent work [ ] has used several centrality measures to rank terrorism-related Twitter accounts based on their location in the network and the topology of the network of user-to-user mentions.
This work aims at identifying groups of terrorism-related users exchanging information through social media platforms by detecting the key players of a social media network and the interrelated communities of users interacting with them. To this end, we extend the approach of [ ] and propose a hybrid framework which first retrieves the key network players and then enriches the retrieved results by adding the members of a user’s detected community based on the combination of centrality scores with community detection algorithms. These centrality measures, which aim to identify key-players in the terrorism domain, are estimated on social media networks based on user mentions and are compared with other popularity measures (i.e. number of followers, number of friends) used for identifying very important users within the structure of these networks. This work also presents a case study on a social media network formed by Twitter accounts based on a set of terrorism-related Arabic keywords provided by LEAs and domain experts, for demonstrating the performance of our proposed framework based on evidence related to the suspension of the majority of the retrieved Twitter accounts.
In this work, entropy-based centrality measures are exploited to first retrieve a list of key-players, and a community detection algorithm is then used to enrich the initial set of results. Our framework is presented in Figure 1, where keyword-based search provides a set of social media posts. Based on this, a network of mentions is created, using the user-to-user interactions contained in the corresponding posts. In the resulting network of users, each user is represented by a node and a link between two users $n_i$ and $n_k$ exists if user $n_i$ mentions or is mentioned by user $n_k$.
On the network of mentions, we use entropy-based centrality measures to first identify key-players [ ], and we then extend the method by associating key-players with their community.
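The construction of the network of mentions described above can be illustrated with a minimal Python sketch (standard library only). The post format, a dict with the hypothetical keys `user` and `mentions`, and the function name `build_mention_network` are assumptions made for this example and are not taken from the original framework.

```python
from collections import defaultdict

def build_mention_network(posts):
    """Build an undirected, unweighted network of mentions.

    `posts` is an iterable of dicts with the hypothetical keys
    'user' (author handle) and 'mentions' (list of mentioned handles).
    Returns a dict mapping each user to the set of adjacent users.
    """
    adjacency = defaultdict(set)
    for post in posts:
        author = post["user"]
        for mentioned in post["mentions"]:
            if mentioned == author:
                continue  # self-mentions are ignored (an assumption of this sketch)
            # A link exists if one user mentions, or is mentioned by, the other.
            adjacency[author].add(mentioned)
            adjacency[mentioned].add(author)
    return dict(adjacency)

# Toy input: three posts, one of them without any mention.
posts = [
    {"user": "a", "mentions": ["b", "c"]},
    {"user": "b", "mentions": ["a"]},
    {"user": "d", "mentions": []},
]
network = build_mention_network(posts)
```

In this sketch, users who neither mention nor are mentioned by anyone (user `d` above) do not appear in the network, and repeated mentions between the same pair collapse into a single link, in line with the unweighted setting.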
2.1 Centrality-based key player identification
We denote by $G(N, L)$ the network of mentions with $|N|$ nodes (user accounts) and $|L|$ links. The network is unweighted and undirected, capturing only the user-to-user interactions in Twitter or any other social media domain. The degree of a node $n_k$ is denoted by $\deg(n_k)$ and is equal to the number of its adjacent links. The degree is normalized to define the degree centrality as follows [9]:
$$DC_k = \frac{\deg(n_k)}{|N| - 1} \tag{1}$$
The degree simply counts the number of neighboring nodes and is not affected by the position of a hub in the network. However, the betweenness centrality [ ] of a node $n_k$ is based on the number of shortest paths $g_{ij}(n_k)$ from node $n_i$ to node $n_j$ that pass through node $n_k$, divided by the number of all shortest paths $g_{ij}$ from node $n_i$ to node $n_j$, summed over all pairs of nodes $(n_i, n_j)$ and normalized by its maximum value:
$$BC_k = \frac{2}{(|N| - 1)(|N| - 2)} \sum_{i < j} \frac{g_{ij}(n_k)}{g_{ij}} \tag{2}$$
Nodes with high betweenness centrality are very important for the communication in a network [ ], due to the fact that their removal strongly affects the network connectivity and robustness. Other centrality measures have also been proposed, based on the mutual distances of all nodes (closeness centrality) [ ], on the influence of a node (eigenvector centrality) [ ], or motivated by the importance of a Web page (PageRank) [5].
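To make the two baseline centralities concrete, the following Python sketch computes a normalized degree centrality and a normalized shortest-path betweenness centrality on an adjacency dictionary. Betweenness is computed with Brandes' algorithm, which is one common way to evaluate it and is not necessarily what the authors used; the function names and the adjacency format (node mapped to a set of neighbors, every node present as a key, at least three nodes) are assumptions of this example.

```python
from collections import deque

def degree_centrality(adj):
    """Degree of each node divided by the number of other nodes."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def betweenness_centrality(adj):
    """Normalized shortest-path betweenness (Brandes' algorithm, unweighted)."""
    n = len(adj)
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited twice; the maximum is (n-1)(n-2)/2.
    scale = 1.0 / ((n - 1) * (n - 2))
    return {v: b * scale for v, b in bc.items()}

# Toy path network a - b - c: node b bridges the only pair (a, c).
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
dc = degree_centrality(path)
bc = betweenness_centrality(path)
```

On the toy path, node `b` obtains the maximum value of both measures, while the end nodes have zero betweenness because no shortest path passes through them.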
In the context of this work, we propose the use of entropy-based centrality measures, such as the Mapping Entropy (ME) and the Mapping Entropy Betweenness (MEB), taking also into account the neighborhood $\mathcal{N}(n_k)$ of a node $n_k$. Mapping Entropy centrality [ ] is defined as a function of the degree centrality:
$$ME_k = -DC_k \sum_{n_i \in \mathcal{N}(n_k)} \log DC_i \tag{3}$$
whereas Mapping Entropy Betweenness centrality [ ] is defined as a function of betweenness centrality:
$$MEB_k = -BC_k \sum_{n_i \in \mathcal{N}(n_k)} \log BC_i \tag{4}$$
Intuitively, to interpret Equations (3) and (4), one may think of a random walker on the network, standing at node $n_k$, who picks his/her next step with probability $DC_i$ ($BC_i$). Then, the weight $-\log DC_i$ ($-\log BC_i$) is interpreted as the Shannon information of the event that the random walker picked node $n_i$, and is summed over all neighbors of node $n_k$. These two measures consider the information that is communicated through nodes that act as hubs (bridges), i.e. those with high values of degree (betweenness) centrality between any two members. In particular, the MEB centrality considers the betweenness centrality of a node and exploits local information from its neighborhood; hence, high MEB values indicate that a particular node can act as a bridge for disseminating information, even if its degree centrality is low [22].
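Since Equations (3) and (4) share the same form, a single helper can illustrate both. The Python sketch below takes any precomputed centrality (degree for ME, betweenness for MEB) together with the adjacency dictionary; the function name `mapping_entropy` is hypothetical, and the handling of neighbors with zero centrality (they are skipped, because the logarithm of zero is undefined) is an assumption of this example that the text does not address.

```python
import math

def mapping_entropy(centrality, adj):
    """Score of node k: -c_k * sum(log c_i) over the neighbors n_i of n_k.

    With degree centrality this corresponds to ME, with betweenness
    centrality to MEB. Neighbors whose centrality is zero are skipped,
    since log(0) is undefined (an assumption of this sketch).
    """
    scores = {}
    for k, nbrs in adj.items():
        log_sum = sum(math.log(centrality[i]) for i in nbrs if centrality[i] > 0)
        scores[k] = -centrality[k] * log_sum
    return scores

# Toy path network a - b - c with its degree centralities.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
dc = {"a": 0.5, "b": 1.0, "c": 0.5}
me = mapping_entropy(dc, path)
```

Because centralities lie in (0, 1], every logarithm is non-positive and the resulting score is non-negative; in the toy example the middle node receives $2 \log 2$, while the end nodes receive zero since their only neighbor has centrality one.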
In the following, we combine the key-player identification methods with community detection approaches that are able to cluster the network into communities of densely connected user accounts.
2.2 Community detection around key players
In parallel to the key-player identification, a community detection algorithm is used to divide the network into groups of users (communities). The top-ranked key-player is used to enrich the retrieved results, which is achieved by searching for the community to which the key-player belongs.
Community detection in complex networks aims to identify groups of nodes that are more densely connected to each other within a group than to the rest of the network outside of the group [ ]. The groups are communities of users in the social media domain, sharing a common property or playing similar roles within the network [ ]. Community structure is very popular in many fields, including sociology and biology [ ], as well as computer science [ ], and in any domain where systems or items admit a network representation. Detecting communities in complex networks is often viewed as a graph partitioning problem, where all nodes are assigned to a community, but density-based approaches leave out noise, i.e. do not assign all nodes to communities. In our experiments, we shall present and compare both approaches.

Figure 1: Key terrorism-related community detection on the network of Twitter mentions. [Stages shown: Search by Keyword; Network of Mentions; Mapping Entropy]

Several community detection algorithms have been proposed (e.g. [ ]). The network is partitioned into communities using either the maximization of modularity [ ], the minimization of codelength [ ] or density-based approaches [ ]. We present in the experiments the key-community, defined as the community that the key-player belongs to, as provided by the algorithms FastGreedy [ ], Walktrap [ ], Infomap [ ], Louvain [ ] and DBSCAN*-Martingale [ ]. The most popular methods are those aiming at the maximization of modularity, defined as [ ]:
$$Q = \sum_{i} \left( e_{ii} - \alpha_i^2 \right), \qquad \alpha_i = \sum_{j} e_{ij} \tag{5}$$
where $e_{ij}$ is the fraction of links between a node in community $i$ and a node in community $j$, $e_{ii}$ is the fraction of links between two members of the community $i$, and the sum runs over all communities. We adopt the modularity maximization community detection approach as a fast and scalable approach that admits hierarchical and iterative methods [ ] to maximize the objective function of Equation 5. Assuming the key-player is a member of a given community, our framework returns all its members $n_{k_1}, n_{k_2}, \ldots, n_{k_l}$, all of which are marked as the final list of accounts with suspicious activity.
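As a rough illustration of the two ingredients of this step, the Python sketch below evaluates the modularity of a given partition and looks up the key-community of a key player. It does not implement any of the community detection algorithms named above; the partition is assumed to come from such an algorithm, and the function names and data layout (adjacency dictionary, list of disjoint node sets covering the network) are hypothetical.

```python
def modularity(adj, communities):
    """Modularity Q of a partition of an undirected, unweighted network.

    `adj` maps node -> set of neighbors; `communities` is a list of
    disjoint node sets that together cover all nodes.
    """
    label = {v: c for c, members in enumerate(communities) for v in members}
    two_m = sum(len(nbrs) for nbrs in adj.values())  # twice the number of links
    q = 0.0
    for c, members in enumerate(communities):
        # Link ends that stay inside the community (internal links count twice).
        internal = sum(1 for v in members for w in adj[v] if label[w] == c)
        # All link ends attached to members of the community.
        total = sum(len(adj[v]) for v in members)
        q += internal / two_m - (total / two_m) ** 2
    return q

def key_community(communities, key_player):
    """Return the community (set of accounts) that contains the key player."""
    for members in communities:
        if key_player in members:
            return members
    return set()  # key player left unassigned, e.g. noise in density-based methods

# Toy network: two triangles joined by the single link 3 - 4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
partition = [{1, 2, 3}, {4, 5, 6}]
q = modularity(adj, partition)
suspicious = key_community(partition, 4)
```

For the toy network the two-triangle partition gives $Q = 5/14$, whereas placing all nodes in one community gives $Q = 0$, which is why a modularity-maximizing method would prefer the former.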
We evaluate our framework in a network consisting of terrorism-related Twitter accounts formed based on user mentions.
As ground-truth we make use of information from Twitter, which marks user accounts as suspended, given that the suspension process is applied when an account violates Twitter rules by exhibiting abusive behavior, including posting content related to violent threats and hate speech (Twitter has suspended 360,000 terrorism-related accounts from mid-2015 until August 2016). Our data were collected by executing queries on the Twitter API based on a set of five Arabic keywords related to terrorist propaganda. These keywords were provided by LEAs and domain experts and are related to the Caliphate State, its news, publications, and photos from the Caliphate area. The collected dataset consists of 9,528 Twitter posts by 4,400 users. The top-100 user accounts are retrieved in the key-player identification step using the ranking methods of Table 1 and are then combined with the community detection approaches of Table 2. The evaluation is performed by assessing whether these accounts are suspended, active or no longer exist (i.e. accounts which have been temporarily or permanently deactivated).
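The ranking part of this evaluation protocol can be sketched in a few lines of Python. The example assumes that account statuses have already been collected into a dictionary; the status labels (`"suspended"`, `"active"`, `"not_found"`), the function names and the toy accounts are hypothetical and serve only to illustrate how a position and a reciprocal rank, as reported in Table 1, could be derived.

```python
def first_suspended_position(ranked_users, status, top_n=100):
    """1-based position of the first suspended account in the top-N, else None."""
    for position, user in enumerate(ranked_users[:top_n], start=1):
        if status.get(user) == "suspended":
            return position
    return None

def reciprocal_rank(position):
    """Reciprocal rank as a percentage; 0.0 when no suspended user was found."""
    return 100.0 / position if position else 0.0

# Toy ranking produced by some centrality measure (synthetic account names).
ranked = ["u1", "u2", "u3", "u4"]
status = {"u1": "active", "u2": "not_found", "u3": "suspended", "u4": "suspended"}
position = first_suspended_position(ranked, status)
rr = reciprocal_rank(position)
```

With this definition, a first suspended user at position 16 corresponds to a reciprocal rank of 6.25% and one at position 20 to 5.00%, matching the way the two columns of Table 1 relate to each other.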
e rst part of our framework evaluates several centrality mea-
sures, including the proposed Mapping Entropy and Mapping En-
tropy Betweenness, as well as popularity measures, such as the
number of friends and followers, in terms of their ability to re-
trieve suspended users. e results in Table 1 indicate that the
entropy-based centralities ME and MEB are able to retrieve the rst
suspended user at position 16, while PageRank follows at position
19. Other centrality and popularity measures, such as closeness,
eigenvector and number of followers do not nd any suspended
user at the top-100 positions of their retrieved users. We observe
that the network is very spread with many bridges and a diam-
eter equal to 27, so key-players are expected to be positioned in
MFSec’17, June 06, 2017, Bucharest, Romania I. Gialampoukidis et al.
Table 1: Comparison among several ranking methods
Ranking Method Position Reciprocal Rank
Degree centrality 20 5.00%
Betweenness centrality 35 2.86%
Closeness centrality >100 <1.00%
Eigenvector centrality >100 <1.00%
Num of followers >100 <1.00%
Num of friends 31 3.23%
PageRank 19 5.26%
Mapping Entropy 16 6.25%
Mapping Entropy Betweenness 16 6.25%
K=1 K=2 K=3
K=5 K=10 Largest connected component
Figure 2: First, second, third, h and tenth order neighbor-
hoods of the suspended user and the largest component.
between many pairs of nodes in the network, exploiting also their
neighborhood’s high betweenness centrality.
The $K$-th order neighborhood of node $n$ is the set of all nodes that are reachable from $n$ via at most $K - 1$ intermediate nodes: $\{n_j : d(n, n_j) \le K\}$, where $d(n, n_j)$ is the network distance of any two nodes. In Figure 2 we show the first ($K=1$), second ($K=2$), third ($K=3$), fifth ($K=5$) and tenth ($K=10$) order neighborhoods of the first suspended user and the largest connected component. Although the ME and MEB centralities both retrieve a suspended user at rank 16, the user does not correspond to the same Twitter account. In fact, the Twitter user at the 16th position of ME centrality leads to a disconnected component of two users, where one of them is suspended and the other is not. However, the neighborhood of the suspended user (Figure 2) from the MEB centrality is part of the largest connected component of the network with 1,334 accounts. Therefore, we proceed to the next step by considering the MEB centrality measure and not ME.
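The $K$-th order neighborhood used for Figure 2 can be obtained with a breadth-first search truncated at distance $K$. The Python sketch below is one straightforward way to do this on an adjacency dictionary; the function name is hypothetical, and the returned set includes the starting node itself (distance zero), which follows from the distance-based definition but is a choice of this example.

```python
from collections import deque

def k_order_neighborhood(adj, source, k):
    """Set of nodes whose network distance from `source` is at most k."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # nodes at distance k are kept but not expanded further
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)

# Toy path network a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
first_order = k_order_neighborhood(path, "a", 1)
second_order = k_order_neighborhood(path, "a", 2)
```

For a sufficiently large $K$ the result is the connected component of the starting node, which is how the neighborhood of a user can be compared against the largest connected component.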
Given the rst identied suspended user in the MEB ranking,
we explore the community where the user belongs to. e re-
sults are reported in Table 2, along with the community size per
community detection method. We observe that in all cases exam-
ined, the majority of accounts are already suspended and some of
Table 2: Comparison of community detection methods
Do not
FastGreedy 58 4 (6.9%)
6 (10.3%)
48 (82.8%)
Walktrap 33 3 (9.1%)
4 (12.1%)
26 (78.8%)
Louvain 58 4 (6.9%)
6 (10.3%)
48 (82.8%)
Infomap 20
4 (20.0%)
3 (15.0%)
13 (65.0%)
23 2 (8.7%)
3 (13.0%)
18 (78.3%)
Figure 3: A sample set of images uploaded by key-players
with militaristic or nationalistic content. Faces are redacted
so as to avoid the inclusion of sensitive information.
them no longer exist. In particular, the modularity maximization
methods (FastGreedy, Louvain) are able to retrieve the largest com-
munities and thus more accounts with potentially illegal activity.
e percentage of suspended users is 82.76% for the modularity
maximization approaches and 78% for the Walktrap and DBSCAN*-
Martingale, indicating a marginal advantage for the former. e
community provided by Infomap is very small, compared to the
other community sizes, but still the number of active accounts (not
yet suspended) is only 20%. Figure 3 depicts sample content from
such active accounts that have not been marked as suspended by
Twier. One may note that their content is military-themed or
extremist, indicating potentially suspicious user activity even in
non-suspended accounts.
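The per-status counts and percentages of the kind reported in Table 2 can be derived from a community and a status lookup with a short helper. The Python sketch below uses synthetic account identifiers and hypothetical status labels; the toy community is constructed only to mirror the counts of one table row and is not real data. A non-empty community is assumed.

```python
from collections import Counter

def status_breakdown(community, status):
    """Map each status label to (count, percentage) within a community."""
    counts = Counter(status.get(user, "unknown") for user in community)
    size = len(community)
    return {label: (n, round(100.0 * n / size, 1)) for label, n in counts.items()}

# Synthetic community of 20 accounts: 4 active, 3 no longer existing, 13 suspended.
community = [f"u{i}" for i in range(20)]
status = {}
for i, user in enumerate(community):
    if i < 4:
        status[user] = "active"
    elif i < 7:
        status[user] = "not_found"
    else:
        status[user] = "suspended"
breakdown = status_breakdown(community, status)
```

The resulting shares for this synthetic community are 20.0% active, 15.0% no longer existing and 65.0% suspended.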
We proposed a hybrid model that combines MEB centrality with community detection to retrieve groups of social media user accounts that are key-players in the terrorism domain. We found that centrality measures on the network of mentions perform better than popularity measures (number of followers or friends) in finding key-players in the terrorism domain. Given a terrorism-related user, his/her network community reveals a group of additional terrorism-related users, exploiting the outcome of a community detection method, with modularity maximization methods outperforming density-based and other methods.
This work was supported by the project TENSOR (H2020-700024), funded by the European Commission.
[1] Ala Berzinji, Lisa Kaati, and Ahmed Rezine. 2012. Detecting key players in terrorist networks. In Intelligence and Security Informatics Conference (EISIC), 2012 European. IEEE, 297–302.
[2] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. 2008. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment 2008, 10 (2008), P10008.
[3] Ludvig Bohlin, Daniel Edler, Andrea Lancichinetti, and Martin Rosvall. 2014. Community detection and visualization of networks with the map equation framework. In Measuring Scholarly Impact. Springer, 3–34.
[4] Phillip Bonacich and Paulette Lloyd. 2001. Eigenvector-like measures of centrality for asymmetric relations. Social networks 23, 3 (2001), 191.
[5] Sergey Brin and Lawrence Page. 2012. Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer networks 56, 18 (2012), 3825–3833.
[6] Akemi Takeoka Chatfield, Christopher G Reddick, and Uuf Brajawidagda. 2015. Tweeting propaganda, radicalization and recruitment: Islamic state supporters multi-sided twitter networks. In Proceedings of the 16th Annual International Conference on Digital Government Research. ACM, 239–249.
[7] Aaron Clauset, Mark EJ Newman, and Cristopher Moore. 2004. Finding community structure in very large networks. Physical review E 70, 6 (2004), 066111.
[8] Santo Fortunato. 2010. Community detection in graphs. Physics reports 486, 3 (2010), 75–174.
[9] Linton C Freeman. 1978. Centrality in social networks conceptual clarification. Social networks 1, 3 (1978), 215–239.
[10] Ilias Gialampoukidis, George Kalpakis, Theodora Tsikrika, Stefanos Vrochidis, and Ioannis Kompatsiaris. 2016. Key player identification in terrorism-related social media networks using centrality measures. In European Intelligence and Security Informatics Conference (EISIC 2016), August. 17–19.
[11] Ilias Gialampoukidis, Theodora Tsikrika, Stefanos Vrochidis, and Ioannis Kompatsiaris. 2016. Community detection in complex networks based on DBSCAN* and a Martingale process. In Semantic and Social Media Adaptation and Personalization (SMAP), 2016 11th International Workshop on. IEEE, 1–6.
[12] Michelle Girvan and Mark EJ Newman. 2002. Community structure in social and biological networks. Proceedings of the national academy of sciences 99, 12 (2002), 7821–7826.
[13] Steve Harenberg, Gonzalo Bello, L Gjeltema, Stephen Ranshous, Jitendra Harlalka, Ramona Seay, Kanchana Padmanabhan, and Nagiza Samatova. 2014. Community detection in large-scale networks: a survey and empirical evaluation. Wiley Interdisciplinary Reviews: Computational Statistics 6, 6 (2014), 426–439.
[14] Jytte Klausen. 2015. Tweeting the Jihad: Social media networks of Western foreign fighters in Syria and Iraq. Studies in Conflict & Terrorism 38, 1 (2015),
[15] Valdis Krebs. 2002. Uncloaking terrorist networks. First Monday 7, 4 (2002).
[16] Fragkiskos D Malliaros and Michalis Vazirgiannis. 2013. Clustering and community detection in directed networks: A survey. Physics Reports 533, 4 (2013),
[17] ME Newman and M Girvan. 2004. Finding and evaluating community structure in networks. Physical Review E 69, 2 (2004), 026113.
[18] Tingyuan Nie, Zheng Guo, Kun Zhao, and Zhe-Ming Lu. 2016. Using mapping entropy to identify node centrality in complex networks. Physica A: Statistical Mechanics and its Applications 453 (2016), 290–297.
[19] Onook Oh, Manish Agrawal, and H Raghav Rao. 2011. Information control and terrorism: Tracking the Mumbai terrorist attack through twitter. Information Systems Frontiers 13, 1 (2011), 33–43.
[20] Symeon Papadopoulos, Yiannis Kompatsiaris, Athena Vakali, and Ploutarchos Spyridonos. 2012. Community detection in social media. Data Mining and Knowledge Discovery 24, 3 (2012), 515–554.
[21] Pascal Pons and Matthieu Latapy. 2006. Computing communities in large networks using random walks. J. Graph Algorithms Appl. 10, 2 (2006), 191–218.
[22] Jialun Qin, Jennifer J Xu, Daning Hu, Marc Sageman, and Hsinchun Chen. 2005. Analyzing terrorist networks: A case study of the global salafi jihad network. In Intelligence and security informatics. Springer, 287–304.
[23] Usha Nandini Raghavan, Réka Albert, and Soundar Kumara. 2007. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E 76, 3 (2007), 036106.
[24] Martin Rosvall, Daniel Axelsson, and Carl T Bergstrom. 2009. The map equation. The European Physical Journal Special Topics 178, 1 (2009), 13–23.
[25] Martin Rosvall and Carl T Bergstrom. 2008. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences 105, 4 (2008), 1118–1123.
[26] Sudhir Saxena, K Santhanam, and Aparna Basu. 2004. Application of social network analysis (SNA) to terrorist networks in Jammu & Kashmir. Strategic Analysis 28, 1 (2004), 84–101.
[27] Robin L Thompson. 2011. Radicalization and the use of social media. Journal of strategic security 4, 4 (2011), 167.
[28] Robyn Torok. 2010. “Make A Bomb In Your Mums Kitchen”: Cyber Recruiting And Socialisation of ‘White Moors’ and Home Grown Jihadists. (2010).
[29] Ines Von Behr. 2013. Radicalisation in the digital era: The use of the Internet in 15 cases of terrorism and extremism. (2013).
[30] Jie Xu, Daning Hu, and Hsinchun Chen. 2009. The dynamics of terrorist networks: Understanding the survival mechanisms of Global Salafi Jihad. Journal of Homeland Security and Emergency Management 6, 1 (2009), 1–15.
... Regarding the last examples, some works in the literature proposed SNA approaches tailored for the detection of propaganda activities about terrorism in social networks. In [27], the authors describe SNA as a tool to fight this problem, and highlight the main tasks investigated in the counter-terrorism field, such as key-player identification [28,29], community discovery [30,31], link analysis [32,33] and dynamic network analysis [34,35]. ...
Full-text available
The massive adoption of social networks increased the need to analyze users’ data and interactions to detect and block the spread of propaganda and harassment behaviors, as well as to prevent actions influencing people towards illegal or immoral activities. In this paper, we propose HURI, a method for social network analysis that accurately classifies users as safe or risky, according to their behavior in the social network. Specifically, the proposed hybrid approach leverages both the topology of the network of interactions and the semantics of the content shared by users, leading to an accurate classification also in the presence of noisy data, such as users who may appear to be risky due to the topic of their posts, but are actually safe according to their relationships. The strength of the proposed approach relies on the full and simultaneous exploitation of both aspects, giving each of them equal consideration during the combination phase. This characteristic makes HURI different from other approaches that fully consider only a single aspect and graft partial or superficial elements of the other into the first. The achieved performance in the analysis of a real-world Twitter dataset shows that the proposed method offers competitive performance with respect to eight state-of-the-art approaches.
... In this case, terrorism is observed as an international threat negatively impacting foreign security and causing the destruction of life (Dwiwarno, 2018;Subagyo, 2021). Social media platforms such as Facebook and Twitter are also frequently implemented to quickly spread terrorism information (Tundis et al., 2019), leading to public awareness (Gialampoukidis et al., 2017a). ...
Full-text available
Social Media and Terrorism are often studied together and have become the focus of many authors in recent years. Therefore, this study aims to evaluate international publication trends on social media and terrorism, using the Scopus database through bibliometric analysis from 2009 to 2022. Data visualization and analysis were conducted using Microsoft Excel and VOSviewer. The results showed that the international publications trend reached a peak in 2018, with 103 publications emphasizing various topics, such as social media, terrorism, Twitter, terrorist attacks, and several issues related to terrorist activities and digital platforms. The United States was also the most common country of publication with the highest number of affiliated authors. In addition, the authors with the most published documents were Tsikrika T. and Vrochidis S., with the majority of reports prioritizing social sciences. These results are expected to contribute to the novelty of previous studies on social media and terrorism.
... Gialampoukidis et al. [47] collected ISIS-related data by searching fve keywords provided by law enforcement agencies and domain experts. So, this resulted in 9,528 tweets from 4,400 suspected ISIS-supporting users. ...
Full-text available
Social media platforms play a key role in fostering the outreach of extremism by influencing the views, opinions, and perceptions of people. These platforms are increasingly exploited by extremist elements for spreading propaganda, radicalizing, and recruiting youth. Hence, research on extremism detection on social media platforms is essential to curb its influence and ill effects. A study of existing literature on extremism detection reveals that it is restricted to a specific ideology, binary classification with limited insights on extremism text, and manual data validation methods to check data quality. In existing research studies, researchers have used datasets limited to a single ideology. As a result, they face serious issues such as class imbalance, limited insights with class labels, and a lack of automated data validation methods. A major contribution of this work is a balanced extremism text dataset, versatile with multiple ideologies verified by robust data validation methods for classifying extremism text into popular extremism types such as propaganda, radicalization, and recruitment. The presented extremism text dataset is a generalization of multiple ideologies such as the standard ISIS dataset, GAB White Supremacist dataset, and recent Twitter tweets on ISIS and white supremacist ideology. The dataset is analyzed to extract features for the three focused classes in extremism with TF-IDF unigram, bigrams, and trigrams features. Additionally, pretrained word2vec features are used for semantic analysis. The extracted features in the proposed dataset are evaluated using machine learning classification algorithms such as multinomial Naïve Bayes, support vector machine, random forest, and XGBoost algorithms. The best results were achieved by support vector machine using the TF-IDF unigram model confirming 0.67 F1 score. 
The proposed multi-ideology and multiclass dataset shows comparable performance to the existing datasets limited to single ideology and binary labels.
... Several research studies have attempted to solve the problem of detecting hate speech in general by differentiating hate and non-hate speech (Djuric et al. 2015;Ribeiro et al. 2017). Others have tackled the issue of recognizing certain types of hate speech, such as anti-religious hate speech (Albadi, Kurdi, and Mishra 2018;Zhang, Robinson, and Tepper 2018), jihadist (Ferrara et al. 2016;Gialampoukidis et al. 2017;Smedt, Tom, and Van Ostaeyen 2018;Wei, Singh, and Martin 2016), sexist, and racist (Badjatiya et al. 2017;Gambäck and Kumar Sikdar 2017;Pitsilis, Ramampiaro, and Langseth 2018). The problem has been addressed from different points of view seeking to achieve a state-ofthe-art result which is not yet been achieved. ...
Full-text available
There is an increased demand for detecting online hate speech, especially with the recent changing policies of hate content and free-of-speech right of online social media platforms. Detecting hate speech will reduce its negative impact on social media users. A lot of effort in the Natural Language Processing (NLP) field aimed to detect hate speech in general or detect specific hate speech such as religion, race, gender, or sexual orientation. Hate communities tend to use abbreviations, intentional spelling mistakes, and coded words in their communication to evade detection, which adds more challenges to hate speech detection tasks. Word representation from its domain will play an increasingly pivotal role in detecting hate speech. This paper investigates the feasibility of leveraging domain-specific word embedding as features and a bidirectional LSTM-based deep model as a classifier to automatically detect hate speech. This approach guarantees that the word is assigned its negative meaning, which is a very helpful technique to detect coded words. Furthermore, we investigate the use of the transfer learning language model (BERT) on the hate speech problem as a binary classification task as it provides high-performance results for many NLP tasks. The experiments showed that domain-specific word embedding with the bidirectional LSTM-based deep model achieved a 93% f1-score, while BERT achieved 96% f1-score on a combined balanced dataset from available hate speech datasets. The results proved that the performance of pre-trained models is influenced by the size of the trained data. Although there is a huge variation in the corpus size, the first approach achieved a very close result compared to BERT, which is trained on a huge data corpus, this is because it is trained on data related to the same domain. The first approach was very helpful to detect coded words while the second approach achieved better performance because it is trained on much larger data. 
To conclude, it is very helpful to build large pre-trained models from rich domains specific content in current social media platforms.
... In Gialampoukidis et al. (2017), researchers suggested a hybrid model that combines Mapping Entropy Betweenness with community detection to recover communities of social media accounts that act as key players in the terrorism domain. They show that centrality measures computed on the network of mentions perform better than follower-based measures (number of friends or followers) at finding key players in the terrorism domain. ...
In recent decades, social media have become hugely popular and have transformed the world. The rapid and wide adoption of these technologies is changing how we find communities and how we get news. Alongside this growth, cyberterrorism has become an international issue that threatens world peace, and it is increasingly visible on social media. As the Internet grows more pervasive in every area, interested users or organizations can use the anonymity provided by cyberspace to terrorize citizens, communities, specific groups, and entire countries, without the inherent threat of capture, injury, or death that a physical presence would bring. Social network analysis is a key research field for detecting the different groups in a cyber-terrorist network. Many researchers are interested in finding these communities, their managers, and their influencers, which offers a predictive way to protect users of social media networks. However, the enormous evolution of terrorist communities over time makes them hard to analyze and detect. In this article, we introduce a new method for community detection based on the contact network, the publications, and their evolution, using Twitter as the social network. We also find the managers and the influencers in terrorist communities using swarm techniques. Our method optimizes a proposed objective function to obtain a coherent partitioning, is inspired by the behaviour of artificial bees, and uses a data warehouse to save the data at each step of the evolution over time. Finally, we illustrate the performance of our method with an experimental study on real and artificial networks and a comparative study against recent related works. We test the performance of our approach by applying different quality functions to the detected terrorist communities.
... The methodological breakthroughs in network science, in fact, benefited the study of criminal and terrorist networks by providing empirical insights that have supported old theories or contributed to the generation of new ones [29][30][31][32]. In terrorism research, the network paradigm has been applied among other things to study Islamist or jihadism organizations [33][34][35][36], alliances between actors in the global scenario [37] and support and radicalization through social media platform [38][39][40]. Yet, relational perspectives to the study of such social phenomena fail to go beyond the tangible connections between individuals (or groups of individuals). With very few exceptions, however, the literature on networks and crime and networks and terrorism have not considered other types of relationships, e.g., those between events or characteristics of events, which may reveal underlying knowledge structures that escape the traditional methodologies embedded in traditional Euclidean spaces generally employed to study actors or their behaviours. ...
Behaviours across terrorist groups differ based on a variety of factors, such as groups' resources or objectives. We here show that organizations can also be distinguished by network representations of their operations. We provide evidence in this direction in the frame of a computational methodology organized in two steps, exploiting data on attacks plotted by Al Shabaab, Boko Haram, the Islamic State and the Taliban in the 2013-2018 period. First, we present LabeledSparseStruct, a graph embedding approach, to predict the group associated with each operational meta-graph. Second, we introduce SparseStruct-Explanation, an algorithmic explainer based on LabeledSparseStruct, that disentangles characterizing features for each organization, enhancing interpretability at the dyadic level. We demonstrate that groups can be discriminated according to the structure and topology of their operational meta-graphs, and that each organization is characterized by the recurrence of specific dyadic interactions among event features.
... (2) Network-based analysis: Exploit metadata and online interactions (e.g. likes, re-tweets, comments, mentions, re-blogging and hyperlinks to other pages) to detect communities, social leaders or controllers [16,17]. ...
Purpose: Social networks (SNs) have recently evolved from a means of connecting people to becoming a tool for social engineering, radicalization, dissemination of propaganda and recruitment of terrorists. It is no secret that the majority of the Islamic State in Iraq and Syria (ISIS) members are Arabic speakers, and even the non-Arabs adopt Arabic nicknames. However, the majority of the literature researching the subject deals with non-Arabic languages. Moreover, the features involved in identifying radical Islamic content are shallow and the search or classification terms are common in daily chatter among people of the region. The authors aim at distinguishing normal conversation, influenced by the role religion plays in daily life, from terror-related content. Design/methodology/approach: This article presents the authors' experience and the results of collecting, analyzing and classifying Twitter data from affiliated members of ISIS, as well as sympathizers. The authors used artificial intelligence (AI) and machine learning classification algorithms to categorize the tweets, as terror-related, generic religious, and unrelated. Findings: The authors report the classification accuracy of the K-nearest neighbor (KNN), Bernoulli Naive Bayes (BNN) and support vector machine (SVM) [one-against-all (OAA) and all-against-all (AAA)] algorithms. The authors achieved a high classification F1 score of 83%. The work in this paper will hopefully aid more accurate classification of radical content. Originality/value: In this paper, the authors have collected and analyzed thousands of tweets advocating and promoting ISIS. The authors have identified many common markers and keywords characteristic of ISIS rhetoric. Moreover, the authors have applied text processing and AI machine learning techniques to classify the tweets into one of three categories: terror-related, non-terror political chatter and news and unrelated data-polluting tweets.
Access to and interaction with natural blue or green spaces is a critical factor in quality of life and overall well-being. Studies have shown that exposure to natural areas has health benefits for individuals and society. Incorporating interconnected natural ecosystems into the urban fabric is recognized as a means of building urban resilience and mitigating climate change. It is therefore essential to strengthen and expand existing networks. Mathematical measures of centrality provide a valuable approach to analyzing networks, based on the assumption that certain nodes are more central due to better connectivity. However, due to their complexity, centrality measures are not widely used in urban planning studies, and no research has been conducted in specific Polish conditions. This study aims to fill this gap by testing the usefulness of centrality measures in Krakow’s system of green spaces. The results show that there are few well-connected green areas and that the centrality measures vary. The information provided by this study can contribute to a better understanding of the spatial distribution of green spaces in Krakow and in future to better management and decision-making processes aimed at improving the accessibility of green spaces and the quality of life of residents.
Past work on key actor detection in the area of terrorism typically focuses on the identification of Threats i.e. users with overt radical signals, or Influencers i.e. users who communicate a high volume of tweets and receive a high volume of replies from their followers. In this work, we expanded the detection of key actors to include Vulnerables, Threats and Influencers who displayed consistent behaviours and incorporated the social-level metrics between these actors to assess potential for radicalisation. Through a Twitter Analytic Pipeline which comprised a bot detector, psychological language analysis, stance detector, and social network analysis modules, we detected 14,246 Vulnerables, 25,704 Threats, and 430 Influencers out of a total of 166,653 users in Twitter conversations on the 2020 France terror attacks. A network of 570 users was identified as accounts of interest, demonstrating the utility of this approach in streamlining the detection of persistent, high-ranking Influencers and Threats while contributing to the identification of early risk signals exhibited by Vulnerables.
This study proposes a novel method for identifying the primary conspirators involved in terrorist activities. To map the information related to terrorist activities, we gathered information from different sources of real cases involving terrorist attacks. We extracted useful information from available sources and then mapped them in the form of terrorist networks, and this mapping provided us with insights in these networks. Furthermore, we came up with a novel centrality measure for identifying the primary conspirators of a terrorist attack. Because the leaders of terrorist attacks usually direct conspirators to conduct terrorist activities, we designed a novel algorithm that can identify such leaders. This algorithm can identify terrorist attack leaders even if they have less connectivity in networks. We tested the effectiveness of the proposed algorithms on four real-world datasets and conducted an experimental evaluation, and the proposed algorithms could correctly identify the primary conspirators and leaders of the attacks in the four cases. To summarize, this work may provide information support for security agencies and can be helpful during the trials of the cases related to terrorist attacks.
Conference Paper
Community detection is a valuable tool for analyzing complex networks. This work investigates the community detection problem based on the density-based algorithm DBSCAN*. This algorithm requires, though, a lower bound for the community size to be determined a priori, a challenging task. To this end, this work proposes the application of a Martingale process to DBSCAN* that progressively detects communities at various levels of granularity. The proposed DBSCAN*-Martingale community detection algorithm corresponds to an iterative process that progressively lowers the threshold of the size of the acceptable communities, while maintaining the communities detected for higher thresholds. Evaluation experiments are performed based on four realistic benchmark networks and the results indicate improvements in the effectiveness of the proposed DBSCAN*-Martingale community detection algorithm in terms of the Normalized Mutual Information and the RAND metrics against several state-of-the-art community detection approaches.
Large networks contain plentiful information about the organization of a system. The challenge is to extract useful information buried in the structure of myriad nodes and links. Therefore, powerful tools for simplifying and highlighting important structures in networks are essential for comprehending their organization. Such tools are called community-detection methods and they are designed to identify strongly intraconnected modules that often correspond to important functional units. Here we describe one such method, known as the map equation, and its accompanying algorithms for finding, evaluating, and visualizing the modular organization of networks. The map equation framework is very flexible and can identify two-level, multi-level, and overlapping organization in weighted, directed, and multiplex networks with its search algorithm Infomap. Because the map equation framework operates on the flow induced by the links of a network, it naturally captures flow of ideas and citation flow, and is therefore well-suited for analysis of bibliometric networks.
Conference Paper
Monitoring terrorist groups and their suspicious activities in social media is a challenging task, given the large amounts of data involved and the need to identify the most influential users in a smart way. To this end, many efforts have focused on using centrality measures for the identification of the key players in terrorism-related social media networks, so that their suspension/removal leads to severe disruption in the connectivity of the network. This work proposes a novel centrality measure, Mapping Entropy Betweenness (MEB), and assesses its effectiveness for key player identification on a dataset of terrorism-related Twitter user accounts by simulating targeted attacks that remove the most central nodes of the network. The results indicate that the MEB affects the robustness of this terrorist network more than well-established centrality measures, in the largest part of the attack process.
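The targeted-attack evaluation described in the abstract above can be illustrated with a minimal sketch. It assumes a static attack (scores computed once, before any removal) and uses the size of the largest connected component as the robustness measure; both choices, the toy graph, and the helper names `largest_component_size` and `targeted_attack` are illustrative assumptions, and plain node degree stands in for MEB.

```python
from collections import deque


def largest_component_size(adj, removed):
    """Size of the largest connected component, ignoring removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search over the component containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def targeted_attack(adj, scores, steps):
    """Remove the `steps` highest-scoring nodes one by one.

    Returns the largest-component size before the attack and after
    each removal. Ties are broken by node name for determinism.
    """
    order = sorted(adj, key=lambda n: (-scores[n], n))
    removed = set()
    sizes = [largest_component_size(adj, removed)]
    for node in order[:steps]:
        removed.add(node)
        sizes.append(largest_component_size(adj, removed))
    return sizes


# Hypothetical undirected toy network (not data from the paper).
toy = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
# Degree is used here only as a placeholder centrality score.
degree = {node: len(neighbours) for node, neighbours in toy.items()}
attack_curve = targeted_attack(toy, degree, steps=2)
```

Comparing two centrality measures then amounts to comparing their attack curves: the measure whose curve drops faster disrupts the network's connectivity more quickly.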
The intuitive background for measures of structural centrality in social networks is reviewed and existing measures are evaluated in terms of their consistency with intuitions and their interpretability.
Conference Paper
Islamic State (IS) terrorist networks in Syria and Iraq pose threats to national security. IS' exploitation of social media and digital strategy plays a key role in its global dissemination of propaganda, radicalization, and recruitment. However, systematic research on Islamic terrorist communication via social media is limited. Our research investigates the question: How do IS members/supporters use Twitter for terrorism communication: propaganda, radicalization, and recruitment? Theoretically, we drew on microeconomic network theories to develop a theoretical framework for multi-sided Twitter networks in the global Islamic terrorist communication environment. Empirically, we collected 3,039 tweets posted by @shamiwitness who was identified in prior research as "an information disseminator" for the IS cause. Methodologically, we performed social network analysis, trend and content analyses of the tweet data. We find strong evidence for Shamiwitness-intermediated multi-sided Twitter networks of international mass media, regional Arabic mass media, IS fighters, and IS sympathizers, supporting the framework's utility.
The problem of finding the best strategy to attack a network or immunize a population with a minimal number of nodes attracts current research interest. The assessment of node importance has been a fundamental issue in such research on complex networks. In this paper, we propose a new concept called mapping entropy (ME) to identify the importance of a node in a complex network. The concept is established according to local information that considers the correlation among all neighbours of a node. We evaluate the efficiency of the centrality by static attacks and dynamic attacks on standard network models and real-world networks. The simulation results show that the new centrality is more efficient than traditional attack strategies, in both the static and the dynamic setting.
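As a rough illustration of a neighbour-aware local centrality of this kind, the sketch below assumes the commonly cited form ME_i = -DC_i * sum over neighbours j of log(DC_j), where DC is degree centrality k/(N-1). This formula is not stated in the abstract above and is an assumption here, as are the function name `mapping_entropy` and the toy graph.

```python
import math


def mapping_entropy(adj):
    """Mapping entropy per node, under the assumed definition
    ME_i = -DC_i * sum_{j in N(i)} log(DC_j), with DC_i = k_i / (N - 1).

    `adj` maps each node to the set of its neighbours (undirected graph).
    """
    n = len(adj)
    if n < 2:
        raise ValueError("need at least two nodes to normalise degree")
    dc = {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}
    # Every neighbour has degree >= 1, so log(dc[j]) is always defined.
    return {
        node: -dc[node] * sum(math.log(dc[j]) for j in neighbours)
        for node, neighbours in adj.items()
    }


# Hypothetical undirected toy network (not data from the paper).
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
me = mapping_entropy(toy_graph)
ranking = sorted(me, key=lambda node: (-me[node], node))
```

Under this assumed definition a node scores highly when it has many neighbours and those neighbours are themselves weakly connected, which is why the hub "a" ranks first in the toy graph.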