Communities in criminal networks: A case study

Abstract

Criminal organizations tend to be clustered to reduce risks of detection and information leaks. Yet, the literature exploring the relevance of subgroups for their internal structure is so far very limited. The paper applies methods of community analysis to explore the structure of a criminal network representing the individuals’ co-participation in meetings. It draws from a case study on a large law enforcement operation (“Operazione Infinito”) tackling the ‘Ndrangheta, a mafia organization from Calabria, a southern Italian region. The results show that the network is indeed clustered and that communities are associated, in a non-trivial way, with the internal organization of the ‘Ndrangheta into different “locali” (similar to mafia families). Furthermore, the results of community analysis can improve the prediction of the “locale” membership of the criminals (up to two thirds of any random sample of nodes) and the leadership roles (above 90% precision in classifying nodes as either bosses or non-bosses). The implications of these findings on the interpretation of the structure and functioning of the criminal network are discussed.
Keywords: Criminal networks; Dark networks; Community analysis;
Centrality measures; Leadership identification; Membership identification
1. Introduction
Academics and law enforcement agencies are increasingly applying network analysis to organized crime networks. While current applications mainly focus on identifying the key criminals through centrality measures (Varese, 2006b; Morselli, 2009a; Calderoni, 2014) and other individual attributes (Carley et al., 2002; Morselli and Roy, 2008; Malm and Bichler, 2011; Bright et al., 2015; Agreste et al., 2016), the analysis of subgroups and their influence on criminal activities has received very limited attention so far.
Subgroups are a natural occurrence in criminal networks. Criminal organizations may structure themselves in functional, ethnic, or hierarchical units. Furthermore, the constraints of illegality limit information sharing to prevent leaks and detection, as criminal groups face a specific efficiency vs. security trade-off (Morselli et al., 2007). This tends to make criminal organizations globally sparse but locally clustered networks, often showing both scale-free and small-world properties (Malm and Bichler, 2011). Also, the larger the criminal organization, the more likely and relevant is the presence of subgroups. These considerations suggest that the analysis of subgroups in criminal networks may provide insight into both the internal structure of large organized crime groups and the best preventive and repressive strategies against them.
The mafias are a clear example of large organized crime groups, often comprising several families or clans with a specific hierarchy and strong cohesion. These units may interact in different ways, ranging from open conflict to peaceful cooperation. Each mafia family is a subgroup within a larger criminal network, and inter-family dynamics are decisive for the activities of the mafias. Nevertheless, possibly due to the difficulties in gathering reliable data, the literature has so far neglected the role of the family in the structure and the activities of the mafias.
In the literature of network analysis (e.g., Boccaletti et al., 2006; Barrat et al., 2008; Newman, 2010), one of the most challenging areas of investigation in recent years is community analysis, which aims to reveal possible subnetworks (i.e., groups of nodes called communities, clusters, or modules) characterized by comparatively large internal connectivity, namely whose nodes tend to connect much more with the other nodes of the group than with the rest of the network. A large number of contributions have explored the theoretical aspects of community analysis and proposed a broad set of algorithms for community detection (Fortunato, 2010). Most notably, community analysis has proved to be a powerful tool for understanding the properties of a number of real-world complex systems in virtually any field of science, including biology (Jonsson et al., 2006), ecology (Krause et al., 2003), economics (Piccardi et al., 2010), information (Flake et al., 2002; Fortuna et al., 2011) and social sciences (Girvan and Newman, 2002; Arenas et al., 2004).
This paper applies the methods of community analysis to criminal networks by analyzing the co-participation in the meetings of a large mafia organization. The exercise aims to explore the relevance of subgroups in criminal networks, with a specific focus on the characterization of mafia clans and families and the identification of bosses. The case study draws data from a large law enforcement operation in Italy (“Operazione Infinito”), which arrested more than 150 people and concerned the establishment of several ’Ndrangheta (a mafia from Calabria, a southern Italian region) groups in the area around Milan, the capital city of the Lombardy region and Italy’s “economic capital” and second largest city. The exploration has a double relevance. First, it improves the understanding of the internal functioning of criminal organizations, demonstrating that the Infinito network is clustered in subgroups, and showing that the subgroups identified by community analysis are related in a non-trivial way to the internal organization of the ’Ndrangheta. Second, it may contribute to the development of law enforcement intelligence capacities, providing tools for early identification of the internal structure of a criminal group.
The internal organization of the ’Ndrangheta provides an interesting opportunity to explore the relevance of subgroups in criminal networks. Indeed, this mafia revolves around the blood family (Paoli, 2003; Varese, 2006a). One or several ’Ndrangheta families, frequently connected by marriages, godfathering and similar social ties, form a “’ndrina”. The “’ndrine” from the same area may form a “locale”, which controls a specific territory (Paoli, 2007). The “locale” is the main structural unit of the ’Ndrangheta. Each “locale” has a number of formal offices, tasked with specific functions: the boss of the “locale” is the “capobastone” or “capolocale”, the “contabile” (accountant) is responsible for the common fund of the “locale”, the “crimine” (crime) oversees violent actions, and the “mastro di giornata” (literally “master of the day”) takes care of the communication flows within the “locale”.
Since the organization in “locali” plays such an important role in the structure of the ’Ndrangheta, our investigation is specifically oriented to assess their significance in the sense of community analysis. Therefore, after illustrating some details on the network data (Sec. 2), we first quantify the cohesiveness of each “locale” in the Infinito network, discovering a quite diversified picture where very cohesive “locali” coexist with others apparently not so significant. The results of community analysis (Sec. 3) show that the Infinito network is significantly clustered, suggesting that subgroups play an important role in its internal organization. When we match the clusters obtained by community analysis with the composition of the “locali”, we find that in most cases clusters correspond either to “locali” or to unions of them. Then (Secs. 4 and 5) we use the results from community analysis to identify the “locale” membership of each network participant, and to spot the bosses of the organization. The latter, in particular, is a problem known to be critical since the early contributions in the field (e.g., Sparrow, 1991; Klerks, 2001; Krebs, 2002; Roberts and Everton, 2011), given the difficulty of collecting accurate data on criminal networks. The results are finally discussed (Sec. 6) for their implications on the interpretation of the structure and functioning of the criminal network.
2. The Infinito network
“Operazione Infinito” was aimed at disentangling the organizational structure of the ’Ndrangheta in Lombardy, with special care in charting the hierarchical structure and the different “locali” existing in the region. The documentation³ provides information on a large number of meetings among members. Indeed, most of the investigation focused on meetings occurring in private (e.g., houses, cars) or public places (e.g., bars, restaurants or parks). The two sets, namely meetings and participants, define a standard bipartite (two-mode) network with 574 meetings and 256 participants. The projection of the bipartite network onto the set of participants leads to a (one-mode) weighted, undirected network, whose largest connected component – which we will denote hereafter as the Infinito network – has $N = 254$ nodes and $L = 2132$ links (the density is $\rho = 2L/(N(N-1)) = 0.066$). The weight $w_{ij}$ is the number of meetings co-attended by nodes $i$ and $j$, and it ranges from 1 to 115. However, the mean value of the (nonzero) weights is $\langle w_{ij} \rangle = 1.88$ and about 70% of them are equal to 1, denoting that only very few pairs of individuals co-attended a large number of meetings. Similarly, the distributions of the node degree $k_i$ and strength $s_i = \sum_j w_{ij}$ display quite strong heterogeneity: indeed, their average values are, respectively, $\langle k_i \rangle = 16.8$ and $\langle s_i \rangle = 31.5$, but the most frequent case in the sample is an individual with both degree and strength equal to 1.
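
The projection described above can be illustrated with a minimal sketch. It assumes the meeting records are available as a mapping from a meeting identifier to its list of participants; the function names and the toy data are hypothetical and do not come from the judicial documentation.

```python
from collections import defaultdict
from itertools import combinations


def project_meetings(meetings):
    """Project a meeting -> participants mapping onto a weighted one-mode network.

    Returns a dict mapping each unordered pair (i, j), with i < j, to the
    number of meetings the two participants co-attended (the weight w_ij).
    """
    weights = defaultdict(int)
    for participants in meetings.values():
        for i, j in combinations(sorted(set(participants)), 2):
            weights[(i, j)] += 1
    return dict(weights)


def density(n_nodes, n_links):
    """Density of an undirected network: rho = 2L / (N (N - 1))."""
    return 2 * n_links / (n_nodes * (n_nodes - 1))


def degree_and_strength(weights):
    """Degree k_i (number of links) and strength s_i (sum of weights) per node."""
    degree, strength = defaultdict(int), defaultdict(int)
    for (i, j), w in weights.items():
        for node in (i, j):
            degree[node] += 1
            strength[node] += w
    return dict(degree), dict(strength)


# Hypothetical toy data (three meetings, four participants).
toy_meetings = {"m1": ["a", "b", "c"], "m2": ["a", "b"], "m3": ["c", "d"]}
toy_weights = project_meetings(toy_meetings)
toy_degree, toy_strength = degree_and_strength(toy_weights)
```

With the values reported in the text, `density(254, 2132)` evaluates to about 0.066. The sketch does not extract the largest connected component, a step that the Infinito network additionally requires.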
The affiliation of an individual to the “locale”, namely the group controlling the criminal activities in a specific territory, is formal and follows strict traditional rules. Each “locale” has a boss who answers for all its activities to the higher hierarchical levels (see Calderoni (2014) for further details). The investigation activity of “Operazione Infinito” was able to associate 177 individuals (out of 254) to one of the 17 “locali” identified in the Milan area, the region under investigation. Of the remaining ones, 35 were known to belong to “locali” based in Calabria (the region of Southern Italy where the ’Ndrangheta originated and still has its headquarters), 3 came from a Lombardy “locale” not in the area of investigation (Brescia), and 8 were known to be non-affiliated to the ’Ndrangheta, whereas the correct classification of the remaining 31 individuals remained undefined. The Infinito network is displayed in Fig. 1⁴. In the figure, node color reflects the 17 “locali” discussed above.
³ Pretrial detention order issued by the preliminary investigation judge upon request by the prosecution (Tribunale di Milano, 2011).
Figure 1: The Infinito network: nodes are grouped and colored according to the “locali” partition (Table 1).
As a first analysis, we assess whether the partition defined by the “locale” membership is significant in the sense of community analysis, namely whether the intensity of intra-“locale” meetings is significantly larger than that of the contacts among members of different “locali”. If so, this would confirm, on the one hand, the actual modular structure of the crime organization; on the other hand, it would provide a tool for investigations, as the composition of the “locali” could be derived endogenously by mining meeting data.
⁴ All network figures in the paper were produced with Gephi (Bastian et al., 2009).
We denote by $C_k$ the subgraph induced by the nodes belonging to “locale” $k$. We quantify the cohesiveness of $C_k$ by the persistence probability $\alpha_k$, namely the probability that a random walker, which is in one of the nodes of $C_k$, remains in $C_k$ at the next step. This quantity, which proved to be an effective tool for mesoscale network analysis (Piccardi, 2011; Della Rossa et al., 2013), reduces in an undirected network to:
$$\alpha_k = \frac{\sum_{i \in C_k} \sum_{j \in C_k} w_{ij}}{\sum_{i \in C_k} \sum_{j \in \{1, 2, \ldots, N\}} w_{ij}}, \tag{1}$$
namely to the fraction of the strength of the nodes of $C_k$ that remains within $C_k$ (the same quantity is referred to as embeddedness by some authors (e.g., Hric et al., 2014)). Radicchi et al. (2004) defined a community as a subnetwork with $\alpha_k > 0.5$. Obviously, the larger $\alpha_k$, the larger the internal cohesiveness of $C_k$. Notice that, since $\alpha_k$ tends to grow with the size $N_k$ of $C_k$ (trivially, $\alpha_k = 1$ for the entire network), large $\alpha_k$ values must be checked for their statistical significance. We derive the empirical distribution of the persistence probabilities $\bar{\alpha}_k$ of the connected subgraphs of size $N_k$ (we do that by randomly extracting 1000 samples), and we quantify the significance of $\alpha_k$ by the z-score:
$$z_k = \frac{\alpha_k - \mu(\bar{\alpha}_k)}{\sigma(\bar{\alpha}_k)}. \tag{2}$$
A large value of $\alpha_k$ (i.e., $\alpha_k > 0.5$) reveals the strong cohesiveness of the subgraph $C_k$, while a large value of $z_k$ (i.e., $z_k > 3$) denotes that such cohesiveness is not trivially due to the size of the subgraph, but is anomalously large with respect to subgraphs of the same size.
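
Equations (1) and (2) can be sketched as follows. The text does not specify how the random connected subgraphs are extracted, so the growth scheme below (adding a random frontier node at each step) is an assumption, as are the function names, the use of the population standard deviation, and the toy network.

```python
import random
import statistics


def persistence_probability(adj, nodes):
    """Eq. (1): fraction of the strength of `nodes` that stays inside `nodes`.

    `adj` maps each node to a dict {neighbor: weight} (symmetric).
    """
    nodes = set(nodes)
    internal = sum(w for i in nodes for j, w in adj[i].items() if j in nodes)
    total = sum(w for i in nodes for w in adj[i].values())
    return internal / total if total else 0.0


def random_connected_subgraph(adj, size, rng):
    """Grow a connected node set by adding random frontier nodes.

    This sampling scheme is an assumption, not the procedure of the paper.
    """
    current = {rng.choice(sorted(adj))}
    while len(current) < size:
        frontier = sorted({j for i in current for j in adj[i]} - current)
        if not frontier:  # component exhausted before reaching `size`
            break
        current.add(rng.choice(frontier))
    return current


def z_score(adj, nodes, n_samples=1000, seed=0):
    """Eq. (2): z-score of alpha_k against random connected subgraphs of equal size."""
    rng = random.Random(seed)
    size = len(set(nodes))
    alpha = persistence_probability(adj, nodes)
    samples = [
        persistence_probability(adj, random_connected_subgraph(adj, size, rng))
        for _ in range(n_samples)
    ]
    sigma = statistics.pstdev(samples)
    return (alpha - statistics.mean(samples)) / sigma if sigma else float("nan")


# Hypothetical toy network: two triangles joined by the link c-d.
toy_adj = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1, "c": 1},
    "c": {"a": 1, "b": 1, "d": 1},
    "d": {"c": 1, "e": 1, "f": 1},
    "e": {"d": 1, "f": 1},
    "f": {"d": 1, "e": 1},
}
toy_alpha = persistence_probability(toy_adj, {"a", "b", "c"})  # 6/7
toy_z = z_score(toy_adj, {"a", "b", "c"})
```

In the toy network each triangle keeps six of its seven units of strength inside, which is the largest value attainable by a connected three-node subgraph, so its z-score is positive.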
Table 1 summarizes the values of $\alpha_k$ and $z_k$ computed on the subgraphs corresponding to the “locali” (see Fig. 1). Notice that L2 to L18 refer to the 17 “locali” under investigation, all based in the Milan area (Milan itself plus 16 small-to-medium towns); L19 collects the individuals who participated in some of the meetings and belong to any of the Calabria “locali”, and L20 contains those affiliated to Brescia, which was not subject to investigation and whose members participated in the meetings only occasionally; L0 are the individuals with unspecified affiliation, L1 those who are not affiliated. Overall, only 4 “locali” out of 17 reveal strong – and statistically significant – cohesiveness, proving to actually behave as communities in the sense of network analysis. Most of the other ones, however, display very mild cohesiveness. It cannot be claimed, therefore, that the “locali” partition as a whole is significant in functional terms. In the next section, we analyze whether the network is actually organized around a different clusterization.
“locale”                  $N_k$   $\alpha_k$   $z_k$
L0   not specified         31      0.08       -3.15
L1   not affiliated         8      0.03       -0.84
L2   Bollate               13      0.25        1.31
L3   Bresso                15      0.39        2.72
L4   Canzo                  2      0.10        0.47
L5   Cormano               22      0.41        3.96
L6   Corsico                4      0.12        0.21
L7   Desio *               19      0.63        6.40
L8   Erba                   9      0.37        2.44
L9   Giussano *            10      0.63        5.26
L10  Legnano               10      0.20        0.77
L11  Limbiate               1      0           -
L12  Mariano Comense        9      0.27        1.40
L13  Milano *              16      0.62        5.78
L14  Pavia                  5      0.13        0.25
L15  Pioltello             20      0.43        3.83
L16  Rho                    5      0.18        0.78
L17  Seregno *             12      0.93        8.73
L18  Solaro                 5      0.06       -0.42
L19  Calabria “locali”     35      0.19       -0.97
L20  Brescia                3      0.17        0.98
Table 1: Testing the “locali” partition. The four “locali” with significant cohesiveness ($\alpha_k > 0.5$) are marked with an asterisk.
3. Community analysis
      $N_k$   $\alpha_k$   $z_k$
C1     12      0.93       9.07
C2     18      0.72       7.79
C3     25      0.66       9.85
C4     25      0.63       9.11
C5     45      0.68       8.20
C6     62      0.78       8.30
C7     67      0.67       5.72
Table 2: Results of max-modularity community analysis.
Given a partition $C_1, C_2, \ldots, C_K$ of the nodes of a weighted, undirected network into $K$ subgraphs, the modularity $Q$ (Newman, 2006; Arenas et al., 2007) is given by
$$Q = \frac{1}{2s} \sum_{k=1}^{K} \sum_{i,j \in C_k} \left[ w_{ij} - \frac{s_i s_j}{2s} \right], \tag{3}$$
where $s = \sum_i s_i / 2$ is the total link weight of the network. Modularity $Q$ is the (normalized) difference between the total weight of links internal to the subgraphs $C_k$ and the expected value of such a total weight in a suitably defined randomized “null network model” (Newman, 2006). Community analysis seeks the partition with the largest $Q$: large values ($Q \to 1$) typically reveal a high network clusterization. Although the exact max-$Q$ solution cannot be obtained because it is computationally infeasible even for small-size networks (Fortunato, 2010), many reliable sub-optimal algorithms are available: here we use the so-called “Louvain method” (Blondel et al., 2008).
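
As an illustration of Eq. (3), the following sketch evaluates the modularity of a given partition. It is not an implementation of the Louvain method; the adjacency-dict representation, the function name, and the toy network are assumptions made for the example.

```python
def modularity(adj, partition):
    """Eq. (3) for a weighted, undirected network.

    `adj` maps node -> {neighbor: weight} (symmetric); `partition` is an
    iterable of node sets. The double sum runs over ordered pairs (i, j),
    so each internal link contributes its weight twice.
    """
    strength = {i: sum(nbrs.values()) for i, nbrs in adj.items()}
    two_s = sum(strength.values())  # 2s: twice the total link weight
    q = 0.0
    for community in partition:
        community = set(community)
        internal = sum(
            w for i in community for j, w in adj[i].items() if j in community
        )
        s_comm = sum(strength[i] for i in community)
        # sum_{i,j in C_k} s_i s_j / 2s factorizes as (sum_i s_i)^2 / 2s
        q += internal - s_comm ** 2 / two_s
    return q / two_s


# Hypothetical toy network: two triangles joined by the link c-d.
toy_adj = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1, "c": 1},
    "c": {"a": 1, "b": 1, "d": 1},
    "d": {"c": 1, "e": 1, "f": 1},
    "e": {"d": 1, "f": 1},
    "f": {"d": 1, "e": 1},
}
toy_q = modularity(toy_adj, [{"a", "b", "c"}, {"d", "e", "f"}])  # 5/14
```

For the toy network, splitting the two triangles gives $Q = 5/14 \approx 0.357$, whereas the trivial partition with a single community gives $Q = 0$.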
The result is a partition with 7 clusters ($Q = 0.48$)⁵, whose data are reported in Table 2. All clusters, which range from small (12) to medium-large (67, about 26% of the network size), are strongly cohesive ($\alpha_k$ much larger than 0.5, with large $z_k$). Overall, the Infinito network is therefore strongly clusterized, a result not surprising given that Infinito is a one-mode network derived from a two-mode affiliation network. Nonetheless, the relationship between the communities $C_k$ and the “locali” $L_h$ is non-trivial, as we discuss below.
⁵ The maximum modularity is upper bounded by $Q \le Q_0 = 1 - 1/K$, with $K$ the number of clusters (e.g., Van Mieghem, 2010). The normalized modularity (e.g., Borgatti et al., 2002) is in our case $Q/Q_0 = 0.56$.
Figure 2: The Infinito network: nodes are grouped according to the max-modularity partition (Table 2) and colored according to the “locali” partition (Table 1).
The max-modularity partition of the Infinito network is displayed in Fig. 2. The patterns of node colors – which refer to the “locali”, see Fig. 1 – denote a non-trivial relationship between the “locali” partition and the max-modularity partition. To disentangle this aspect, we pairwise compare the “locali” L0, L1, ..., L20 (Table 1) and the communities C1, C2, ..., C7 obtained by max-modularity (Table 2), quantifying similarities by precision and recall (e.g., Baeza-Yates and Ribeiro-Neto, 1999). Let $m_{hk}$ be the number of nodes classified both in $L_h$ and in $C_k$. Then the precision $p_{hk} = m_{hk}/|C_k|$ is the fraction of the nodes of $C_k$ that belong to $L_h$ whereas, dually, the recall $r_{hk} = m_{hk}/|L_h|$ is the fraction of the nodes of $L_h$ that belong to $C_k$. If we interpret $L_h$ as the “true” set and $C_k$ as its “prediction”, then the precision quantifies how many of the predicted nodes are true, and the recall how many of the true nodes are predicted. Then $p_{hk} = r_{hk} = 1$ if and only if the sets $L_h$ and $C_k$ coincide, while $p_{hk} \approx 1$ if most of the nodes of $C_k$ belong to $L_h$, and $r_{hk} \approx 1$ if most of the nodes of $L_h$ are included in $C_k$.
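
The precision and recall matrices defined above can be computed as in the following sketch, which assumes that the two partitions are given as node-to-label mappings. The function name and the toy assignment are hypothetical; pairs $(h, k)$ with $m_{hk} = 0$ are simply omitted from the returned dictionaries.

```python
from collections import Counter


def precision_recall(true_label, community):
    """Return (precision, recall) dicts keyed by (h, k).

    `true_label` maps node -> "locale" label h; `community` maps node ->
    community label k. Only nodes present in both mappings are counted.
    """
    nodes = true_label.keys() & community.keys()
    m = Counter((true_label[n], community[n]) for n in nodes)  # m_hk
    size_l = Counter(true_label[n] for n in nodes)             # |L_h|
    size_c = Counter(community[n] for n in nodes)              # |C_k|
    precision = {(h, k): cnt / size_c[k] for (h, k), cnt in m.items()}
    recall = {(h, k): cnt / size_l[h] for (h, k), cnt in m.items()}
    return precision, recall


# Hypothetical toy assignment: one community coincides with a "locale",
# the other is the union of two "locali".
toy_true = {1: "L17", 2: "L17", 3: "L3", 4: "L3", 5: "L20"}
toy_comm = {1: "C1", 2: "C1", 3: "C2", 4: "C2", 5: "C2"}
toy_precision, toy_recall = precision_recall(toy_true, toy_comm)
```

In the toy case the pair (L17, C1) has precision and recall equal to 1, while (L3, C2) and (L20, C2) have recall 1 but precision 2/3 and 1/3, which is the signature of a community formed by a union of “locali”.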
Figure 3 (upper panels) summarizes the results of this analysis by a graphical representation of the precision and recall matrices. We first note that “locale” L17 perfectly matches community C1 (it is the community in the upper-left corner of Fig. 2). Moreover, “locale” L13 can be approximately identified with C3, whereas C2 corresponds to a large extent to the union of L3 and L20, and C4 to the union of L9 and L12. The last three columns of the recall matrix also make evident that C5, C6 and C7 behave, to a large extent, as unions of “locali”. This clearly emerges from the lower panels of Fig. 3, where the precision/recall analysis is performed again after the “locali” have been partially aggregated in 7 supersets: the diagonal dominance of the matrices $p_{hk}$, $r_{hk}$ highlights that, overall, the Infinito network is quite strongly compartmentalized (see again Table 2), and the compartments coincide to a large extent with single “locali” or unions of them.
[Figure 3: four matrix panels (precision $p_{hk}$ and recall $r_{hk}$, color scale from 0 to 1) with columns C1 to C7. Upper-panel rows: L0 to L20. Lower-panel rows (aggregated “locali”): L17; L3, L20; L13; L9, L12; L1, L5, L6, L14; L10, L11, L15, L16, L18; L2, L4, L7, L8.]
Figure 3: Precision/recall matrices of the comparison between the “locali” and the max-modularity communities. Above: the “locali” L0, L1, ..., L20 are compared with the communities C1, C2, ..., C7. Below: after “locali” have been partially aggregated, the diagonal dominance of the precision/recall matrices shows that communities coincide to a large extent with unions of “locali”.
These findings support the intuition that subgroups are important elements in the internal organization of the mafias. The clusterization of unions of “locali” may suggest that clans or families have closer connections with a few others. Several investigations showed that “locali” may rise and decline, compete or collaborate, merge or separate. Based on meeting co-participation patterns, community analysis methods can effectively reveal a clusterization closely connected with the formal structure of the mafia. The next two sections will explore whether community analysis techniques can further contribute to identifying the “locale” membership and the bosses of the organization.
4. Identifying the “locale” membership
In this section we consider the problem of identifying the “locale” membership of those individuals for whom this information is unknown. In the Infinito network (254 nodes), this problem arises for 31 nodes (see Table 1, row L0).
The problem can be set in the general framework of label prediction (Zhang et al., 2010): we are given a set of network nodes $X = \{x_1, x_2, \ldots, x_{254}\}$ and a set of labels $L = \{L_1, L_2, \ldots, L_{20}\}$ which, in our case, code the “locali” of the criminal organization (Table 1). The majority of the nodes have a label: $L_h$ is assigned to node $x_i$ (and we write $L(x_i) = L_h$) if $x_i$ is affiliated to “locale” $L_h$. The node/label correspondence is, however, partially unknown, since there are 31 nodes of $X$ whose labeling is unknown and must be predicted based on the network structure and on the known labels.
A very general approach to the above problem relies on the notion of node similarity, based on the assumption that the more similar two nodes are (in a sense to be defined – see below), the more likely they are to share the same label. Therefore, once a similarity score $s_{ij}$ between nodes $(x_i, x_j)$ has been defined, the probability that the unlabeled node $x_i$ has label $L_h$ is assumed equal to
$$p(L(x_i) = L_h) = \frac{\sum_{\{x_j \mid j \neq i,\ L(x_j) = L_h\}} s_{ij}}{\sum_{\{x_j \mid j \neq i,\ L(x_j) \in L\}} s_{ij}}, \qquad h = 1, 2, \ldots, 20. \tag{4}$$
In words, $p(L(x_i) = L_h)$ counts the relative abundance of nodes labeled $L_h$ in the network, and weights each of these nodes by its similarity to $x_i$. The label predicted for node $x_i$ is the one attaining the largest $p(L(x_i) = L_h)$.
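
The prediction rule of Eq. (4) can be sketched as follows for a generic similarity function. The function names, the handling of the degenerate case in which all similarities vanish, and the toy similarity values are assumptions made for illustration.

```python
def label_probabilities(i, labels, similarity):
    """Eq. (4): similarity-weighted share of each label among labeled nodes j != i.

    `labels` maps labeled nodes to their label; `similarity(i, j)` returns s_ij.
    """
    scores = {}
    for j, label in labels.items():
        if j == i:
            continue
        scores[label] = scores.get(label, 0.0) + similarity(i, j)
    total = sum(scores.values())
    if total == 0:  # no similar labeled node: Eq. (4) is undefined
        return {}
    return {label: score / total for label, score in scores.items()}


def predict_labels(i, labels, similarity):
    """Labels attaining the largest probability (more than one in case of ties)."""
    probs = label_probabilities(i, labels, similarity)
    if not probs:
        return []
    best = max(probs.values())
    return sorted(label for label, p in probs.items() if p == best)


# Hypothetical toy case: node "x" is unlabeled, three neighbors are labeled.
toy_labels = {"a": "L2", "b": "L2", "c": "L3"}
toy_s = {("x", "a"): 1.0, ("x", "b"): 2.0, ("x", "c"): 1.0}


def toy_similarity(i, j):
    return toy_s[(i, j)]


toy_probs = label_probabilities("x", toy_labels, toy_similarity)
toy_prediction = predict_labels("x", toy_labels, toy_similarity)
```

In the toy case the label L2 collects a similarity of 3 out of a total of 4, so it is predicted with probability 0.75. Returning all maximizing labels, rather than a single one, keeps ties explicit.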
4.1. Node similarities
We consider and test four definitions of the similarity score $s_{ij}$: (i) and (ii) are very popular and find many applications in social network analysis (e.g., Lü and Zhou, 2011), (iii) and (iv) exploit the partition found by max-modularity community analysis (Sec. 3).
(i) Common Neighbors (CN): denoting by $\Gamma(x_i)$ the set of neighbors of $x_i$, we let
$$s_{ij} = |\Gamma(x_i) \cap \Gamma(x_j)|, \tag{5}$$
where $|Q|$ denotes the number of elements of the set $Q$.
(ii) Weighted Common Neighbors (wCN): it generalizes the above definition by exploiting the information on link weights (Lü and Zhou, 2010):
$$s_{ij} = \sum_{k \in \Gamma(x_i) \cap \Gamma(x_j)} \frac{w_{ik} + w_{kj}}{2}. \tag{6}$$
(iii) Common Community (CC): a binary indicator, stating that similarity is equivalent to membership of the same community:
$$s_{ij} = \begin{cases} 1, & \text{if } c(i) = c(j), \\ 0, & \text{otherwise,} \end{cases} \tag{7}$$
where $c(i)$ denotes the community node $i$ belongs to.
(iv) Weighted Common Neighbors - Common Community (wCN-CC): it combines (ii) and (iii). It is equal to the Weighted Common Neighbors similarity, but it is nonzero only when $(x_i, x_j)$ are in the same community:
$$s_{ij} = \begin{cases} \sum_{k \in \Gamma(x_i) \cap \Gamma(x_j)} \dfrac{w_{ik} + w_{kj}}{2}, & \text{if } c(i) = c(j), \\ 0, & \text{otherwise.} \end{cases} \tag{8}$$
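
The four similarity scores of Eqs. (5)-(8) can be written compactly as in the sketch below. It assumes a symmetric adjacency dict for the weighted network and a node-to-community mapping; the function names and the toy data are hypothetical.

```python
def common_neighbors(adj, i, j):
    """Eq. (5): number of neighbors shared by i and j."""
    return len(adj[i].keys() & adj[j].keys())


def weighted_common_neighbors(adj, i, j):
    """Eq. (6): sum over shared neighbors k of (w_ik + w_kj) / 2."""
    shared = adj[i].keys() & adj[j].keys()
    return sum((adj[i][k] + adj[k][j]) / 2 for k in shared)


def common_community(comm, i, j):
    """Eq. (7): 1 if i and j are in the same community, 0 otherwise."""
    return 1 if comm[i] == comm[j] else 0


def wcn_cc(adj, comm, i, j):
    """Eq. (8): weighted common neighbors, restricted to same-community pairs."""
    return weighted_common_neighbors(adj, i, j) if comm[i] == comm[j] else 0


# Hypothetical toy network: a and b share the neighbors c and d.
toy_adj = {
    "a": {"c": 2, "d": 1},
    "b": {"c": 4, "d": 1},
    "c": {"a": 2, "b": 4},
    "d": {"a": 1, "b": 1},
}
toy_comm = {"a": 1, "b": 1, "c": 2, "d": 1}
```

For the pair (a, b) of the toy network the Common Neighbors score is 2 and the weighted score is $(2+4)/2 + (1+1)/2 = 4$; since the two nodes share a community, the wCN-CC score is also 4.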
4.2. Results
The label identification procedure, with the different node similarities defined above, has been tested on the Infinito network. Unfortunately, the specificity of the case does not allow one to validate the method on the 31 nodes which are actually unlabeled – their “locale” is unknown by definition. Thus the procedure has been applied to the 177 nodes with known label L2, L3, ..., L18 (the “locali” in the Milan area, the region under investigation – see Table 1), assuming their label is unknown and trying to recover it.
In order to mimic the real situation, in which an entire pool of labels has to be simultaneously identified, in our experiments we assume that the labels of $m$ nodes have to be reconstructed at the same time, and we test the effectiveness of the procedure by letting $m$ increase from 1 to 30. For each $m$, we randomly extract $5 \times 10^3$ samples of $m$ nodes in “locali” L2, L3, ..., L18, and predict simultaneously their labels via equation (4). For each sample, we compute the precision as the fraction of correct guesses. In more detail, for each node under test we increment a success counter $s$ by 1 if the label which maximizes the probability (4) is the correct node label, while if $r > 1$ labels attain the same maximal probability in (4) we increment the counter by $1/r$ if the correct node label is one of them. For the $m$-node sample, the precision of the reconstruction is finally given by $s/m$.
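
The evaluation protocol can be sketched as follows. The predictor is passed in as a function returning all labels with maximal probability, so the sketch is independent of the similarity measure; the function names, the default seed, and the toy predictor are assumptions.

```python
import random


def sample_precision(sample, true_label, predict):
    """Precision s/m for one sample of m nodes whose labels are hidden.

    `predict(node, known)` returns the list of labels with maximal
    probability, where `known` is the label mapping without the sample.
    """
    known = {n: lab for n, lab in true_label.items() if n not in sample}
    successes = 0.0
    for node in sample:
        best = predict(node, known)
        if true_label[node] in best:
            successes += 1 / len(best)  # 1 for a unique maximum, 1/r for an r-way tie
    return successes / len(sample)


def mean_precision(true_label, predict, m, n_samples=5000, seed=0):
    """Average precision over random samples of m nodes."""
    rng = random.Random(seed)
    nodes = sorted(true_label)
    values = [
        sample_precision(set(rng.sample(nodes, m)), true_label, predict)
        for _ in range(n_samples)
    ]
    return sum(values) / len(values)


# Hypothetical toy predictor that always returns a two-way tie.
toy_true = {"a": "L2", "b": "L3", "c": "L4"}


def toy_predict(node, known):
    return ["L2", "L3"]


toy_value = sample_precision({"a", "c"}, toy_true, toy_predict)  # 0.25
```

In the toy case node a scores 1/2 (its label is one of two tied candidates) and node c scores 0, so the sample precision is 0.25.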
Figure 4 summarizes the results, in terms of mean and standard deviation of the precision over the samples, for all $m = 1, 2, \ldots, 30$ and for the four similarity measures defined above. In principle, we expect that the larger $m$, the more difficult the prediction task, since the latter is based on a smaller set of known labels. In this respect, the results are rather counterintuitive. Firstly, the average precision is largely insensitive to $m$, and ranges from about 45% to 65% according to the similarity measure adopted. Notably, the best performing method (wCN-CC) exploits the analysis of the community structure of the network. Secondly, the variability of the precision rate displays a clear decreasing trend as $m$ increases. This behaviour is due to a sort of “large numbers” effect: when very few labels are to be guessed, the success depends very much on the specific nodes under scrutiny. When a large pool of nodes is instead investigated, successes and failures tend to balance in a proportion which depends only mildly on the specific set of nodes. Overall, this analysis confirms that, on the Infinito network, the precision of the label reconstruction procedure can reach a proportion of about two thirds, even for sets of the same order of magnitude as the real unlabeled set L0.
[Figure 4: four panels (CN, wCN, CC, wCN-CC); horizontal axis: number of unlabeled nodes $m$ (0 to 30); vertical axis: precision (0 to 1).]
Figure 4: Precision of the label identification methods with respect to the number $m$ of unlabeled nodes. The curves represent the average precision (circles) plus/minus standard deviation (crosses) over $5 \times 10^3$ random samples of $m$ nodes (CN: Common Neighbors; wCN: Weighted Common Neighbors; CC: Common Community; wCN-CC: Weighted Common Neighbors - Common Community).
5. Identifying bosses
In this section we focus on the relation between the hierarchical role of individuals within the ’Ndrangheta organization and the pattern of their meeting attendance, as modeled by the Infinito network. The aim is to explore whether the results from community analysis can provide tools to identify individuals with leading roles, who will be referred to as bosses from now on. As already pointed out in Sec. 1, the ’Ndrangheta relies on a formal hierarchy with multiple ranks and offices. In particular, each “locale” normally appoints a few major officers: the capobastone or capolocale is the head of the “locale”; the contabile is the accountant who manages the common fund of the group; the crimine (crime) oversees violent actions; the mastro di giornata (master of the day) ensures the flow of information within the “locale” (Calderoni, 2014). Information on the actual number and roles of the offices in the ’Ndrangheta is incomplete. Yet, in some investigations the suspects discuss the different offices: these conversations are sometimes tapped by the police, as in the Infinito case.
The judicial documentation classifies 34 of the 254 nodes of the Infinito network as bosses. Calderoni (2014), working on the unweighted network, investigated the correlation between a set of node centrality measures (including degree, strength, betweenness, closeness, and eigenvector centrality) and the boss role of the node, finding that betweenness is by far the most effective predictor. Indeed, the average betweenness of bosses turns out to be about 15 times larger than that of non-bosses, testifying to a brokering role of bosses within the criminal network.
Here we want to further improve the predictive performance by exploiting the information provided by community analysis. As a matter of fact, the partition induced by max-modularity has the effect of placing each node in a specific position in terms of intra-/inter-community connectivity, information that can potentially be useful in assessing its functional role.
5.1. z-P analysis
We follow the z-P analysis approach proposed by Guimera and Amaral (2005) (see also Guimera et al. (2005)) where, after community analysis has identified a partition into $K$ modules, the intra- vs inter-community role of each node $i$ is quantified by a pair of indexes $(z_i, P_i)$. We denote by $c(i) \in \{1, 2, \ldots, K\}$ the community node $i$ belongs to, and by $s_i^{c(i)} = \sum_{j \in c(i)} w_{ij}$ the internal strength of $i$, i.e., the strength directed towards nodes of $c(i)$. By straightforwardly extending the definitions of Guimera and Amaral (2005) to the case of weighted networks, we define the within-community strength as
$$z_i = \frac{s_i^{c(i)} - \mu(s_i^{c(i)})}{\sigma(s_i^{c(i)})}, \tag{9}$$
where $\mu(s_i^{c(i)})$ and $\sigma(s_i^{c(i)})$ are the mean and standard deviation of $s_i^{c(i)}$ over all nodes $i \in c(i)$, and the participation coefficient as
$$P_i = 1 - \sum_{c=1}^{K} \left( \frac{s_i^c}{s_i} \right)^2, \tag{10}$$
where $s_i^c = \sum_{j \in c} w_{ij}$ is the strength of node $i$ directed towards nodes of community $c$. The normalized internal strength $z_i$ measures how strongly a node is connected within its own community. On the other hand, $P_i$ quantifies to what extent a node tends to be uniformly connected to all communities ($P_i \to 1$) rather than only to its own community ($P_i \to 0$).
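
Equations (9) and (10) can be sketched as follows. The use of the population standard deviation, the convention $z_i = 0$ when all nodes of a community have the same internal strength, and the toy network are assumptions of this example rather than choices stated in the text.

```python
import statistics


def zp_indices(adj, comm):
    """Eqs. (9)-(10): within-community strength z_i and participation P_i.

    `adj` maps node -> {neighbor: weight} (symmetric); `comm` maps node ->
    community label. Assumes every node has at least one link (s_i > 0).
    """
    toward = {}  # toward[i][c]: strength of i directed towards community c
    for i, nbrs in adj.items():
        toward[i] = {}
        for j, w in nbrs.items():
            toward[i][comm[j]] = toward[i].get(comm[j], 0) + w

    internal = {i: toward[i].get(comm[i], 0) for i in adj}
    members = {}  # internal strengths grouped by community
    for i in adj:
        members.setdefault(comm[i], []).append(internal[i])

    z, p = {}, {}
    for i in adj:
        values = members[comm[i]]
        mu, sigma = statistics.mean(values), statistics.pstdev(values)
        z[i] = (internal[i] - mu) / sigma if sigma else 0.0
        s_i = sum(adj[i].values())
        p[i] = 1 - sum((s_c / s_i) ** 2 for s_c in toward[i].values())
    return z, p


# Hypothetical toy network: node c is strong inside community A and also
# strongly tied to community B through node d.
toy_adj = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2},
    "c": {"a": 3, "b": 2, "d": 5},
    "d": {"c": 5, "e": 1},
    "e": {"d": 1},
}
toy_comm = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B"}
toy_z, toy_p = zp_indices(toy_adj, toy_comm)
```

In the toy network node c has the largest internal strength of its community and splits its strength evenly between the two communities, so it has both the largest $z_i$ in community A and $P_i = 0.5$, whereas node a, connected only inside A, has $P_i = 0$.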
Figure 5 shows the results of the z-P analysis of the Infinito network (notice
that we normalize $z_i$ to take values in the $[0, 1]$ interval, i.e.,
$z_i \leftarrow (z_i - \min z_i)/(\max z_i - \min z_i)$). The figure highlights that bosses tend to
concentrate in the upper-right part of the plot, namely they have both within-module
strength $z_i$ and participation coefficient $P_i$ larger than average. As a matter of
fact, the ratio between the values of the two indicators for bosses and non-bosses
is 2.51 for $z_i$, and 2.30 for $P_i$. It seems, therefore, that leading individuals have a
twofold characterization, namely a connectivity larger than average within their
own community, and at the same time the capability of connecting to a large
number of the other communities. In order to get the most effective prediction,
we can combine the role of $z_i$ and $P_i$ in a single indicator defined as the product
$W_i = z_i P_i$. The ratio between the $W_i$ value for bosses and non-bosses is 5.46:
as evidenced in Fig. 5, only 2 bosses out of 34 have $W_i$ lower than average.
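
The min-max normalization of $z_i$ and the product $W_i = z_i P_i$ can be sketched in a few lines of Python. The function name `zp_score`, the input format (a dictionary of $(z_i, P_i)$ pairs) and the toy values are hypothetical; setting the normalized $z_i$ to 0 when all $z_i$ coincide is an assumption of this sketch.

```python
def zp_score(roles):
    """Return {node: W} with W = z_normalized * P, given {node: (z, P)}."""
    z_values = [z for z, _ in roles.values()]
    z_min, z_max = min(z_values), max(z_values)
    span = z_max - z_min
    scores = {}
    for node, (z, p) in roles.items():
        # Assumption: if all z are equal, the normalized z is set to 0
        z_norm = (z - z_min) / span if span > 0 else 0.0
        scores[node] = z_norm * p
    return scores

# Hypothetical (z, P) pairs for three nodes
example_roles = {"a": (1.5, 0.5), "b": (-0.5, 0.0), "c": (0.5, 0.25)}
W = zp_score(example_roles)
print(W)
```

Because the score is a product, a node ranks high only when both factors are large: a node with strong internal connectivity but no links to other communities gets $W_i = 0$.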
We now want to explicitly quantify the predictive ability of the z-P analysis
in identifying the leading roles within the criminal network, and compare it
with a non-community-based indicator such as the betweenness $b_i$. To this end,
first notice that each of the indicators $b_i$, $z_i$, $P_i$, and $W_i$ induces a ranking of the set of
254 nodes. Table 3 summarizes the performance of the above indicators in terms
Figure 5: z-P analysis of the Infinito network (horizontal axis: participation coefficient $P_i$;
vertical axis: normalized within-community strength $z_i$). Each node is identified by a cross
corresponding to the $(z_i, P_i)$ coordinates: bosses are highlighted by red circles. The magenta
lines correspond to the average values of $z_i$, $P_i$, and $W_i$ over all nodes.
method         indicator   precision
zP-score       $W_i$       0.735
z-score        $z_i$       0.677
betweenness    $b_i$       0.677
P-score        $P_i$       0.294
Table 3: Identifying bosses: for each method, the precision is computed as the fraction of true
bosses among the top 34 nodes ranked by the related indicator.
of their predictive precision, assuming that the exact number of bosses to be
guessed (i.e., 34) is known. In other words, we count how many of the top-34 nodes in the
relevant indicator’s ranking are actually bosses. While the P-score alone seems
unable to effectively capture the leading nodes, the z-score and the betweenness
both identify 23 bosses (although the two sets are slightly different), and the
zP-score outperforms all other methods, identifying 25 bosses out of 34.
One may wonder to what extent the above performances are influenced by
the assumption of knowing exactly the number of bosses, a piece of information not
available in reality. For this reason, we refine our analysis and compute the
precision $p$ for all methods as a function of the number $m = 1, 2, \ldots, 34$ of
guessed bosses, i.e., we take the top-$m$ nodes for each index and we compute
Figure 6: Identifying bosses: for a given number of predicted bosses $m$ (horizontal axis), the
precision $p$ (vertical axis) is computed as the fraction of true bosses among the top $m$ nodes
ranked by the related indicator. Curves are shown for betweenness, P-score, z-score, and zP-score.
how many of them actually correspond to bosses:
$$p = \frac{\#\ \text{of nodes correctly guessed among } m \text{ nodes}}{m}. \tag{11}$$
The precision $p$ as a function of $m$ is depicted, for all methods, in Fig. 6. Overall,
the zP-score has the best performance, with 100% precision up to $m = 12$
and a good performance even for the largest $m$ values. Betweenness is a valid
alternative, displaying comparable performance except for large $m$.
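
The precision of Eq. (11) is straightforward to compute once an indicator has ranked the nodes. The following Python sketch shows one possible way; the function names, the node labels and the scores are hypothetical, and ties are broken by insertion order, which is an assumption of this example.

```python
def precision_at_m(scores, bosses, m):
    """Fraction of true bosses among the m nodes with the highest score."""
    # Rank nodes by decreasing score (stable sort: ties keep insertion order)
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits = sum(1 for node in ranked[:m] if node in bosses)
    return hits / m

def precision_curve(scores, bosses, m_max):
    """Precision p as a function of m = 1, ..., m_max."""
    return [precision_at_m(scores, bosses, m) for m in range(1, m_max + 1)]

# Hypothetical indicator values and ground truth
scores = {"n1": 0.9, "n2": 0.7, "n3": 0.4, "n4": 0.2, "n5": 0.1}
bosses = {"n1", "n3"}
curve = precision_curve(scores, bosses, 3)
print(curve)
```

With $m = 34$ and 25 true bosses among the top-ranked nodes, the same ratio gives $25/34 \approx 0.735$, the value reported for the zP-score in Table 3.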
5.2. Integrating network-based measures
We complement the previous analysis through a set of multiple logistic
regressions estimating the influence of different factors on the probability of being
a boss. This integrates and expands the analyses of Calderoni (2014, 2015),
which were restricted to the individual centrality measures on the subset of
meetings with more than 3 participants (215 nodes).
The dependent dichotomous variable is derived from the judicial documents
(1 for bosses, and 0 for non-bosses). Independent variables include two of the
network centrality measures retained in Calderoni (2014), namely the betweenness
and the strength, and the z-score and zP-score from the previous subsection.
The models also include two control variables: the first is the number of
   variable         min   max   mean    st.dev.
1  boss             0     1     0.134   0.341
2  betweenness      0     100   4.23    11.7
3  strength         1     361   31.6    48.2
4  z-score          0     1     0.228   0.158
5  zP-score         0     100   12.5    17.2
6  n. of meetings   1     179   7.29    16.7
7  mafia charge     0     1     0.5     0.5

      2     3     4     5     6
2     -     .77   .73   .76   .83
3           -     .81   .84   .88
4                 -     .82   .67
5                       -     .70
6                             -
Table 4: Descriptive statistics (top) and Pearson’s correlation coefficients among variables 2 to 6
(bottom) of the variables used in the regression (all correlations are statistically significant at
the p < 0.001 level). To improve the readability of the results, betweenness and zP-score have
been normalized to the [0,100] range.
meetings attended by each individual, the second (mafia charge) is a dummy
variable describing whether an individual was charged with the offence of mafia-type
association in the court order, a possible source of bias in the network (Table 4).
Given the low number of bosses in the sample (34 out of 254), in the logistic
regressions we adopt the penalized maximum likelihood estimation proposed by
Firth (1993). This method compensates for low numbers in one of the categories
of the dependent variable, making it a good approach for the Infinito
network. As in standard logistic regression, it models the log-odds of a dichotomous
dependent variable $y$ (in this case, the boss attribute) as a linear combination of
independent variables $x_i$ ($\mathrm{logit}\,\Pr(y = 1) = a + b_1 x_1 + b_2 x_2 + \ldots$). The outcomes can be
expressed as odds ratios (OR), where $\mathrm{OR} = \exp(b_i)$. In the present application,
OR expresses the multiplicative change in the odds that a node is a boss per unit
increase in an independent variable, all other variables equal. For OR = 1 the
odds are unchanged, for OR > 1 they increase, and for OR < 1 they decrease.
For example, OR = 1.1 means that a unit increase in the independent variable
implies a +10% increase in the odds of a node being a boss. Since
the logistic regression predicts the value of the dependent variable based on the
values of the independent variables, comparison between predicted and observed
values makes it possible to assess its predictive power (percentage of correct predictions)
(Hosmer et al., 2013).
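
To make the reading of an odds ratio concrete, the Python sketch below converts a baseline probability into odds, applies an odds ratio, and converts back. Strictly speaking, an odds ratio multiplies the odds $p/(1-p)$ rather than the probability itself; the two changes are close only when $p$ is small. The helper names and the baseline of 0.134 (the share of bosses in the sample, 34 out of 254) are choices made for this illustration and are not part of the regression analysis.

```python
import math

def odds_ratio_from_coefficient(b):
    """Odds ratio associated with a logistic regression coefficient b."""
    return math.exp(b)

def apply_odds_ratio(p, odds_ratio, delta=1.0):
    """Probability obtained from p after increasing a predictor by delta units."""
    odds = p / (1.0 - p)                      # probability -> odds
    new_odds = odds * odds_ratio ** delta     # the OR acts multiplicatively on odds
    return new_odds / (1.0 + new_odds)        # odds -> probability

baseline = 0.134                              # illustrative baseline, 34/254
p_after = apply_odds_ratio(baseline, 1.1)     # effect of OR = 1.1, one unit
print(f"{baseline:.3f} -> {p_after:.4f}")
```

Starting from this baseline, an odds ratio of 1.1 raises the odds by exactly 10%, while the corresponding probability rises by somewhat less than 10% in relative terms.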
The results are summarized in Table 5. Model I replicates the best model
from Calderoni (2014) on a wider sample, yielding very similar results. A unit
increase in betweenness centrality provides a +11% increase in the odds of
being a boss, all other variables equal. The strength contributes a +3.5%
increase in the odds. The model correctly classifies 94.1% of the population
and 61.8% of bosses (compare with a random probability of 13.4%, i.e., 34 bosses
out of 254 nodes). Model
II relies only on the control variables, mafia charge and number of meetings.
Both are significant and positive. Yet the overall predictive capacity of the model is lower
than that of the first one (90.9%), with a remarkable decrease in the identification of
bosses (41.2%). Model III includes both individual centrality measures and the
controls. Both strength and betweenness maintain their significant and positive
effect, whereas the controls are non-significant. The prediction success rates are
similar to model I, especially for bosses. Models IV to VI test the community
measures identified in the previous section. Model IV shows that the z-score has
no significant impact on the probability of being a boss, once tested along with
the control variables. Conversely, in Model V the zP-score has a statistically
significant and positive influence (+9.8% per unit increase of zP-score)
despite the presence of the controls. The last model (VI) includes the controls
and both betweenness and zP-score. The latter emerges as the only significant
variable, with an impact of +8.6% on the odds of being a boss, all other
variables equal. Overall, the share of correct predictions is slightly lower than
in models I and III, with the best results in model VI (92.9% and 58.8% for total
correct predictions and correct boss predictions, respectively).
The regressions corroborate the results of the previous section. Network
analysis measures can effectively predict the leadership roles of individuals in a
criminal network. All network measures perform better than naturally observable
variables such as the two controls. Centrality measures are effective and
yield the highest share of correct predictions. Among community measures,
the zP-score has a significant capacity to predict bosses. In the model with betweenness
and controls (Model VI), the zP-score is the only statistically significant variable,
indicating a strong capacity to capture the behavior of leaders in criminal networks.
                         I          II         III        IV         V          VI
strength                 1.035***   -          1.032**    -          -          -
betweenness              1.111**    -          1.108*     -          -          1.035
mafia charge             -          12.87*     4.501      8.400*     5.444      5.172
n. of meetings           -          1.167***   0.982      1.116**    1.057      1.046
z-score                  -          -          -          33.34      -          -
zP-score                 -          -          -          -          1.098***   1.086**
true non-bosses          218        217        217        217        216        216
false non-bosses         13         20         13         16         15         14
true bosses              21         14         21         18         19         20
false bosses             2          3          3          3          4          4
precision (total) [%]    94.1       90.9       93.7       92.5       92.5       92.9
precision (bosses) [%]   61.8       41.2       61.8       52.9       55.9       58.8
Table 5: Results of Firth’s logistic regressions on bosses. The upper part of the table reports
the odds ratios with their statistical significance (*p < 0.05, **p < 0.01, ***p < 0.001), the
bottom part summarizes the predictive capabilities (percentage of correct predictions) of Models
I-VI described in the text.
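
The two percentages in the bottom part of Table 5 follow directly from the four classification counts. The Python sketch below recomputes them from the counts reported in the table; the variable and function names are hypothetical. Note that the row labelled "precision (bosses)" is computed here as true bosses over all actual bosses (true bosses plus false non-bosses), a quantity usually called recall or sensitivity; this reading is inferred from the numbers in the table and is stated as an assumption.

```python
# (true non-bosses, false non-bosses, true bosses, false bosses), from Table 5
table5 = {
    "I":   (218, 13, 21, 2),
    "II":  (217, 20, 14, 3),
    "III": (217, 13, 21, 3),
    "IV":  (217, 16, 18, 3),
    "V":   (216, 15, 19, 4),
    "VI":  (216, 14, 20, 4),
}

def summarize(tn, fn, tp, fp):
    """Return (% of nodes correctly classified, % of bosses correctly identified)."""
    total = tn + fn + tp + fp
    share_correct = 100.0 * (tn + tp) / total
    bosses_found = 100.0 * tp / (tp + fn)   # actual bosses = tp + fn
    return round(share_correct, 1), round(bosses_found, 1)

summary = {model: summarize(*row) for model, row in table5.items()}
for model, (total_pct, boss_pct) in summary.items():
    print(f"Model {model}: total {total_pct}%, bosses {boss_pct}%")
```

For every model the four counts add up to the 254 nodes of the network, and the true and false non-boss counts for bosses add up to 34.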
These findings expand the literature on leadership in criminal networks, as
previous studies mainly relied on centrality measures only, often finding that
betweenness centrality identified leadership roles within crime groups (Morselli,
2009a; Calderoni, 2014). Whereas the previous studies pointed out the role of
brokering positions, they neglected the analysis of subgroups and its implications
for leadership. The application of community analysis measures shows that
criminal leaders not only have a notable brokering capacity, but also manage to
balance the connections within and outside their group. These results advocate
expanding the concept of brokerage beyond individual measures. In fact,
bosses not only meet unconnected individuals, but also have a crucial function
in bridging their group with other groups.
6. Discussion and Conclusions
This paper applied community analysis methods to investigate the structure
of a mafia organization. Focusing on meeting participation as a proxy for
the relationships among criminals, community analysis assessed the clustered
structure of the mafia and showed that it often mirrors the internal subdivision
of the mafia among several clans or “locali”, or unions of them. This supports
the intuition that subgroups matter in this type of organization.
Given the type of data, it is unsurprising that the Infinito network shows
significant clustering. Yet, the clusters only partially overlap with the
’Ndrangheta “locali” and most of the “locali” lack statistically significant
cohesiveness. “Locali” are open criminal groups, with active interactions among
them; members of different “locali” frequently met to discuss criminal activities
and internal problems or to participate in social events (weddings, dinners,
celebrations). Operation Infinito provides examples of complex group dynamics
among the different “locali”, ranging from alliances to conflicts, from convergence
to internal divisions. Overall, these findings corroborate the cautions
against overemphasizing the importance of formal organizational charts in criminal
networks (Paoli, 2002; Kleemans, 2014). The internal organization matters,
but other factors also determine the internal relations.
Subgroups are important in Infinito, notwithstanding the partial relevance
of formal “locale” affiliations. The max-modularity partition identifies seven
distinct communities: two of them match specific “locali”, whereas five correspond
to unions of “locali”. Different factors may explain these associations.
For example, C4 comprises L9 (Giussano) and L12 (Mariano Comense), two
neighboring municipalities, while C1 includes the “locale” of Seregno (L17),
just a few kilometers south. In fact, affiliates of Giussano and Seregno were
originally members of the same “locale”. During the investigation two distinct
groups emerged, and the tensions may explain the different communities. Also,
Giussano’s leaders asked for the mediation of the boss of Mariano Comense to
arrange a meeting with the regional coordinator, a cooperation which may explain
the inclusion of both in C4. Conversely, C7 includes both L4 (Canzo) and
L7 (Erba), the two northernmost “locali” in Infinito, which were in conflict during
the investigation. The former union and subsequent contrasts may explain the
inclusion of both “locali” in C7. Similar examples abound in the court order and
their full examination goes beyond the scope of this paper. Clearly, the union of
several “locali” under a single community may reflect different relations among
criminal groups and across space. Perhaps the max-modularity
method imposes an excessively rigid partition on a more dynamic and complex
reality (Ferrara et al., 2014). Nevertheless, the findings show that, notwithstanding
the scarcity of resources, the analysis may provide useful information
on the internal functioning of dark networks.
In the light of these findings, we tested the effectiveness of community analysis
in illuminating the internal organization of the mafia. We focused on two
operational applications, namely the identification of “locale” membership and
of criminal leaders.
For the first application, a weighted combination of community and common
neighbors (wCN-CC) correctly identifies the “locale” membership of up to 65% of
any random sample of 1 to 30 individuals. These findings are expected, as our
original bipartite network had
many small meetings at the “locale” level and a few large meetings among
“locali”. However, they further demonstrate the potential of the analysis of
subgroups within criminal networks. On the one hand, “locali” do not behave
as communities in a network perspective; on the other hand, the wCN-CC shows
that communities remarkably improve the probability of correctly identifying
the “locale” membership.
The second application integrates the abundant literature on the identification
of criminal leaders, both from criminology and computer science (Bright
et al., 2012; Catanese et al., 2013; Calderoni, 2014, 2015; Taha and Yoo, 2016).
Our results show that the zP-score, which captures the interplay between a node’s
connectivity within its community and to the other communities, can effectively
single out the bosses of the mafia. This has interesting implications for the
understanding of criminal leadership, for the improvement of criminal network
methods, and for the support of law enforcement and intelligence activities.
Our results point out that criminal leaders are strategically positioned not
only at the individual level, but also among subgroups. ’Ndrangheta bosses
achieve strategic positions both to broker information and resources, and to
maintain a more secure indirect control over criminal activities. Some studies
show that leaders may opt for indirect control, with higher betweenness centrality
and lower degree centrality than other criminals (Morselli, 2009a,b, 2010).
In other cases, especially when degree and betweenness are highly correlated,
middle-level criminals may take the most central positions in the network, with
leaders resorting to other forms of control (Calderoni, 2012; Bright et al., 2012;
Agreste et al., 2016). These works, however, focused on strategic positioning at
the individual level. Conversely, our exploration analyzes for the first time the
relation between criminal leadership and network subgroups. In Infinito, betweenness and strength are
positively correlated (Pearson’s coefficient 0.77), with no significant differences
between leaders and other members (0.64 and 0.66, respectively). We demonstrate
that leaders often balance a strong direct connectivity towards their own
community (which partially overlaps with the ’Ndrangheta “locali”) with uniform
connections with other communities. Leaders control their community
and are central nodes within their “locale” (high z-score); at the same time,
leaders broker between communities, thus managing the flow of information
and other resources among different clusters of the criminal network.
Our study demonstrates the potential of meeting data for analyzing criminal
networks. Most previous studies focused on wiretapped telephone communications,
which may entail several biases (Campana and Varese, 2009; Agreste et al.,
2016). Criminals face a number of constraints due to the illegal nature of their
activities (Reuter, 1983; Paoli, 2003). Criminal networks experience a trade-off
between efficiency and security in terms of density and connectivity both at the
group and individual level (Morselli et al., 2007). Dark networks often prioritize
security, for example by renouncing the efficiency of telephones. Whereas leaders
may avoid telephones as a security measure, they may be unable to avoid meetings
(Calderoni, 2014). Meeting participation is inherently related to the nature of
criminal leadership. Refraining from meetings inevitably affects the status of
a boss. In the ’Ndrangheta, participation in celebrations, dinners and social
events is a sign of a leader’s prerogatives and prestige. For example, leaders
from the “locali” in Lombardy were invited to the wedding between the sons
of two powerful ’Ndrangheta dynasties, the Pelle and the Barbaro. This was a
major event for the ’Ndrangheta. Invitations reflected the status and power of
a “locale”, whereas non-invitation pointed out its weakness. In Infinito, leaders
discussed participation and presents at length. Clearly, missing such an
occasion is not an option for a ’Ndrangheta boss, not only because of the opportunity
to discuss important matters with other invited leaders, but also because of the social
and cultural relevance of being present at such an event. The Pelle-Barbaro
wedding is just one of the many important events in Infinito. The close link
between leadership and meeting participation suggests that meeting data may
overcome the limitations of wiretaps and enable effective identification of leaders
in dark networks.
Lastly, the identification of leaders may have important implications for law
enforcement and intelligence activities. While leaders may favor meetings over
telephone calls as a security strategy, this may turn into a weakness
they can hardly avoid. Law enforcement and intelligence agencies may monitor
meeting participation patterns to identify leaders to target with further
investigative efforts. In our study, measures derived from community analysis
(zP-score) equaled betweenness in predicting leadership roles (Calderoni, 2014).
Another study showed that predictions are reliable and accurate even at the first
stages of the investigation (Calderoni, 2015). These applications may further
develop into intelligence techniques and integrate into the growing industry of
intelligence software (for further information and discussion, see Ferrara et al.
(2014); Taha and Yoo (2016)).
Overall, these findings reinforce the idea that the tools of network analysis
can be fruitfully adopted to enhance the understanding of the structure and
function of organized crime. This study, however, has limitations which may
be addressed by future research. First, our results rely on a single case study,
which implies limited external validity. Future research should perform a deeper
structural analysis on a pool of criminal networks, assessing whether peculiar
structural attributes turn out to be recurrent in such networks. Also, a further
extension should test whether meeting data are superior to wiretap
data for the identification of criminal leaders. Second, the analysis focused on
the ’Ndrangheta, a traditional, hierarchical mafia with a very specific internal
structure (see Introduction). Its peculiarities may determine the importance
of meeting attendance, hindering the generalizability of the results to other
forms of organized crime. Further studies should test whether other criminal
groups, such as drug-trafficking organizations, street gangs, and terrorist cells,
show similar or different patterns. Last, in this study we applied traditional and
relatively simple community analysis techniques. Given the growth of this field
of network studies, other methods might prove to be more effective, including
those specifically devoted to bipartite networks (e.g., Barber, 2007; Larremore
et al., 2014), as is our data structure before projection.
Acknowledgements
The authors would like to thank Giulia Berlusconi, Vera Ferluga, Nicola
Parolini, Samuele Poy, and Marco Verani for many useful discussions.
References
Agreste, S., Catanese, S., Meo, P.D., Ferrara, E., Fiumara, G., 2016. Network
structure and resilience of mafia syndicates. Information Sciences 351, 30–47.
doi:10.1016/j.ins.2016.02.027.
Arenas, A., Danon, L., Diaz-Guilera, A., Gleiser, P., Guimera, R., 2004. Com-
munity analysis in social networks. European Physical Journal B 38, 373–380.
doi:10.1140/epjb/e2004-00130-1.
Arenas, A., Duch, J., Fernandez, A., Gomez, S., 2007. Size reduction of complex
networks preserving modularity. New Journal of Physics 9, 176.
doi:10.1088/1367-2630/9/6/176.
Baeza-Yates, R., Ribeiro-Neto, B., 1999. Modern Information Retrieval. Addi-
son Wesley.
Barber, M.J., 2007. Modularity and community detection in bipartite networks.
Physical Review E 76, 066102.
Barrat, A., Barthélemy, M., Vespignani, A., 2008. Dynamical Processes on
Complex Networks. Cambridge University Press.
Bastian, M., Heymann, S., Jacomy, M., 2009. Gephi: An open source software
for exploring and manipulating networks, in: Third International AAAI
Conference on Weblogs and Social Media, San Jose, CA, USA. URL:
http://gephi.org.
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E., 2008. Fast unfolding
of communities in large networks. Journal of Statistical Mechanics -
Theory and Experiment, P10008. doi:10.1088/1742-5468/2008/10/P10008.
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.H., 2006. Com-
plex networks: Structure and dynamics. Physics Reports 424, 175–308.
doi:10.1016/j.physrep.2005.10.009.
Borgatti, S., Everett, M., Freeman, L., 2002. Ucinet for Windows: Software
for Social Network Analysis. URL:
https://sites.google.com/site/ucinetsoftware/.
Bright, D.A., Greenhill, C., Reynolds, M., Ritter, A., Morselli, C., 2015. The
use of actor-level attributes and centrality measures to identify key actors: A
case study of an Australian drug trafficking network. Journal of Contemporary
Criminal Justice 31, 262–278. doi:10.1177/1043986214553378.
Bright, D.A., Hughes, C.E., Chalmers, J., 2012. Illuminating dark networks:
a social network analysis of an Australian drug trafficking syndicate. Crime
Law and Social Change 57, 151–176. doi:10.1007/s10611-011-9336-z.
Calderoni, F., 2012. The structure of drug trafficking mafias: the ‘Ndrangheta
and cocaine. Crime Law and Social Change 58, 321–349.
doi:10.1007/s10611-012-9387-9.
Calderoni, F., 2014. Identifying mafia bosses from meeting attendance, in:
Masys, A (Ed.), Networks and Network Analysis for Defence and Security.
Springer. Lecture Notes in Social Networks, pp. 27–48.
Calderoni, F., 2015. Predicting organized crime leaders, in: Bichler, G.,
Malm, A.E. (Eds.), Disrupting Criminal Networks: Network Analysis in
Crime Prevention. Lynne Rienner Publishers, Boulder, CO. volume 28 of
Crime Prevention Studies, pp. 89–110.
Campana, P., Varese, F., 2009. Listening to the wire: criteria and techniques
for the quantitative analysis of phone intercepts. Trends in Organized Crime
15, 13–30. doi:10.1007/s12117-011-9131-3.
Carley, K.M., Krackhardt, D., Lee, J.S., 2002. Destabilizing networks. Connec-
tions 24, 79–92.
Catanese, S., Ferrara, E., Fiumara, G., 2013. Forensic analysis of phone call
networks. Social Network Analysis and Mining 3, 15–33.
doi:10.1007/s13278-012-0060-1.
Della Rossa, F., Dercole, F., Piccardi, C., 2013. Profiling core-periphery net-
work structure by random walkers. Scientific Reports 3, 1467. doi:10.1038/
srep01467.
Ferrara, E., De Meo, P., Catanese, S., Fiumara, G., 2014. Detecting criminal
organizations in mobile phone networks. Expert Systems with Applications
41, 5733–5750. doi:10.1016/j.eswa.2014.03.024.
Firth, D., 1993. Bias reduction of maximum-likelihood-estimates. Biometrika
80, 27–38. doi:10.1093/biomet/80.1.27.
Flake, G., Lawrence, S., Giles, C., Coetzee, F., 2002. Self-organization and
identification of web communities. Computer 35, 66–71.
Fortuna, M.A., Bonachela, J.A., Levin, S.A., 2011. Evolution of a modular
software network. Proceedings of the National Academy of Sciences of the
United States of America 108, 19985–19989. doi:10.1073/pnas.1115960108.
Fortunato, S., 2010. Community detection in graphs. Physics Reports 486,
75–174. doi:10.1016/j.physrep.2009.11.002.
Girvan, M., Newman, M., 2002. Community structure in social and biological
networks. Proceedings of the National Academy of Sciences of the United
States of America 99, 7821–7826. doi:10.1073/pnas.122653799.
Guimera, R., Amaral, L., 2005. Cartography of complex networks: modules and
universal roles. Journal of Statistical Mechanics - Theory and Experiment,
P02001. doi:10.1088/1742-5468/2005/02/P02001.
Guimera, R., Mossa, S., Turtschi, A., Amaral, L., 2005. The worldwide air
transportation network: Anomalous centrality, community structure, and cities’
global roles. Proceedings of the National Academy of Sciences of the United
States of America 102, 7794–7799. doi:10.1073/pnas.0407994102.
Hosmer, D.W., Lemeshow, S., Sturdivant, R.X., 2013. Applied Logistic Regres-
sion, 3rd ed. John Wiley & Sons, Hoboken, NJ.
Hric, D., Darst, R.K., Fortunato, S., 2014. Community detection in networks:
Structural communities versus ground truth. Physical Review E 90, 062805.
doi:10.1103/PhysRevE.90.062805.
Jonsson, P., Cavanna, T., Zicha, D., Bates, P., 2006. Cluster analysis of networks
generated through homology: automatic identification of important protein
communities involved in cancer metastasis. BMC Bioinformatics 7.
doi:10.1186/1471-2105-7-2.
Kleemans, E.R., 2014. Theoretical perspectives on organized crime, in: Paoli, L.
(Ed.), The Oxford Handbook of Organized Crime. Oxford University Press,
pp. 32–52. doi:10.1093/oxfordhb/9780199730445.013.005.
Klerks, P., 2001. The network paradigm applied to criminal organisations:
Theoretical nitpicking or a relevant doctrine for investigators? Recent
developments in the Netherlands. Connections 24, 53–65.
Krause, A.E., Frank, K.A., Mason, D.M., Ulanowicz, R.E., Taylor, W.W., 2003.
Compartments revealed in food-web structure. Nature 426, 282–285. doi:10.
1038/nature02115.
Krebs, V., 2002. Mapping networks of terrorist cells. Connections 24, 43–52.
Larremore, D.B., Clauset, A., Jacobs, A.Z., 2014. Efficiently inferring commu-
nity structure in bipartite networks. Physical Review E 90. doi:10.1103/
PhysRevE.90.012805.
Lü, L., Zhou, T., 2010. Link prediction in weighted networks: The role of weak
ties. EPL 89. doi:10.1209/0295-5075/89/18001.
Lü, L., Zhou, T., 2011. Link prediction in complex networks: A survey. Physica
A - Statistical Mechanics and its Applications 390, 1150–1170. doi:10.1016/
j.physa.2010.11.027.
Malm, A., Bichler, G., 2011. Networks of collaborating criminals: Assessing the
structural vulnerability of drug markets. Journal of Research in Crime and
Delinquency 48, 271–297. doi:10.1177/0022427810391535.
Van Mieghem, P., 2010. Graph Spectra for Complex Networks. Cambridge
University Press, Cambridge, UK.
Morselli, C., 2009a. Hells angels in springtime. Trends in Organized Crime 12,
145–158. doi:10.1007/s12117-009-9065-1.
Morselli, C., 2009b. Inside Criminal Networks. Springer.
Morselli, C., 2010. Assessing vulnerable and strategic positions in a criminal
network. Journal of Contemporary Criminal Justice 26, 382–392. doi:10.
1177/1043986210377105.
Morselli, C., Giguere, C., Petit, K., 2007. The efficiency/security trade-off in
criminal networks. Social Networks 29, 143–153.
doi:10.1016/j.socnet.2006.05.001.
Morselli, C., Roy, J., 2008. Brokerage qualifications in ringing operations. Crim-
inology 46, 71–98. doi:10.1111/j.1745-9125.2008.00103.x.
Newman, M.E.J., 2006. Modularity and community structure in networks.
Proceedings of the National Academy of Sciences of the United States of America
103, 8577–8582. doi:10.1073/pnas.0601602103.
Newman, M.E.J., 2010. Networks: An Introduction. Oxford University Press.
Paoli, L., 2002. The paradoxes of organized crime. Crime Law and Social
Change 37, 51–97. doi:10.1023/A:1013355122531.
Paoli, L., 2003. Mafia brotherhoods: organized crime, Italian style. Oxford
University Press, New York, NY.
Paoli, L., 2007. Mafia and organised crime in Italy: The unacknowledged
successes of law enforcement. West European Politics 30, 854–880.
doi:10.1080/01402380701500330.
Piccardi, C., 2011. Finding and testing network communities by lumped Markov
chains. PLoS One 6, e27028. doi:10.1371/journal.pone.0027028.
Piccardi, C., Calatroni, L., Bertoni, F., 2010. Communities in Italian corporate
networks. Physica A - Statistical Mechanics and its Applications 389,
5247–5258. doi:10.1016/j.physa.2010.06.038.
Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D., 2004. Defin-
ing and identifying communities in networks. Proceedings of the National
Academy of Sciences of the United States of America 101, 2658–2663.
doi:10.1073/pnas.0400054101.
Reuter, P., 1983. Disorganized Crime: The Economics of the Visible Hand.
MIT Press, Cambridge, MA, USA.
Roberts, N., Everton, S.F., 2011. Strategies for combating dark networks. Jour-
nal of Social Structure 12, 1–32.
Sparrow, M., 1991. The application of network analysis to criminal intelligence
- An assessment of the prospects. Social Networks 13, 251–274.
doi:10.1016/0378-8733(91)90008-H.
Taha, K., Yoo, P.D., 2016. SIIMCO: A forensic investigation tool for identifying
the influential members of a criminal organization. IEEE Transactions on
Information Forensics and Security 11, 811–822.
doi:10.1109/TIFS.2015.2510826.
Tribunale di Milano, 2011. Ordinanza di applicazione di misura coercitiva con
mandato di cattura - art. 292 c.p.p. (Operazione Infinito). Ufficio del giudice
per le indagini preliminari (in Italian).
Varese, F., 2006a. How mafias migrate: The case of the Ndrangheta in northern
Italy. Law & Society Review 40, 411–444.
doi:10.1111/j.1540-5893.2006.00260.x.
Varese, F., 2006b. The structure of a criminal network examined: The Russian
mafia in Rome. Oxford Legal Studies Research Paper No. 21/2006 .
Zhang, Q.M., Shang, M.S., Lue, L., 2010. Similarity-based classification in
partially labeled networks. International Journal of Modern Physics C 21,
813–824. doi:10.1142/S012918311001549X.
... As noted by Wachs and Kertész [63], a group's exclusivity, with respect to their network structure, is directly proportional to the likelihood that they exhibit markers of cartel behavior, or crew behavior in this case. Inspired by the z-P analysis used in Guimera and Amaral [64] and Calderoni et al [48], we use two separate variables in order to discount any bias towards larger communities. ...
Article
Full-text available
Explanations for police misconduct often center on a narrow notion of “problem officers,” the proverbial “bad apples.” Such an individualistic approach not only ignores the larger systemic problems of policing but also takes for granted the group-based nature of police work. Nearly all of police work is group-based and officers’ formal and informal networks can impact behavior, including misconduct. In extreme cases, groups of officers (what we refer to as, “crews”) have even been observed to coordinate their abusive and even criminal behaviors. This study adopts a social network and machine learning approach to empirically investigate the presence and impact of officer crews engaging in alleged misconduct in a major U.S. city: Chicago, IL. Using data on Chicago police officers between 1971 and 2018, we identify potential crews and analyze their impact on alleged misconduct and violence. Results detected approximately 160 possible crews, comprised of less than 4% of all Chicago police officers. Officers in these crews were involved in an outsized amount of alleged and actual misconduct, accounting for approximately 25% of all use of force complaints, city payouts for civil and criminal litigations, and police-involved shootings. The detected crews also contributed to racial disparities in arrests and civilian complaints, generating nearly 18% of all complaints filed by Black Chicagoans and 14% of complaints filed by Hispanic Chicagoans.
... In line with other research on social systems 7-11 , complex networks can suitably describe the intricate relations among criminals and reveal the patterns based on which criminal organizations operate. Beyond theoretical explorations, recent articles have empirically demonstrated that these methods can be useful in investigations involving drug trafficking 12 , political networks 13,14 , police intelligence networks 15 , cartel detection 16 , money laundering 17 , pedophile rings 18 , and a range of other examples [19][20][21][22][23][24] . ...
Article
Full-text available
Recent research has shown that criminal networks have complex organizational structures, but whether this can be used to predict static and dynamic properties of criminal networks remains little explored. Here, by combining graph representation learning and machine learning methods, we show that structural properties of political corruption, police intelligence, and money laundering networks can be used to recover missing criminal partnerships, distinguish among different types of criminal and legal associations, as well as predict the total amount of money exchanged among criminal agents, all with outstanding accuracy. We also show that our approach can anticipate future criminal associations during the dynamic growth of corruption networks with significant accuracy. Thus, similar to evidence found at crime scenes, we conclude that structural patterns of criminal networks carry crucial information about illegal activities, which allows machine learning methods to predict missing information and even anticipate future criminal behavior.
... In most cases, we hope to ensure network connectivity, which has promoted research on network robustness in recent decades [3]- [6]. However, if a network is harmful, such as terrorist networks [7], criminal networks [8], epidemic spreading networks [9], financial contagion networks [10] and cancer networks [11], efficiently disrupting the structure and function of the network becomes a meaningful and challenging task. This so-called network disintegration problem has attracted increasing attention among researchers [12]- [16]. ...
Preprint
Full-text available
We live in a hyperconnected world---connectivity that can sometimes be detrimental. Finding an optimal subset of nodes or links to disintegrate harmful networks is a fundamental problem in network science, with potential applications to anti-terrorism, epidemic control, and many other fields of study. The challenge of the network disintegration problem is to balance the effectiveness and efficiency of strategies. In this paper, we propose a cost-effective targeted enumeration method for network disintegration. The proposed approach includes two stages: searching candidate objects and identifying an optimal solution. In the first stage, we use rank aggregation to generate a comprehensive node importance ranking, upon which we identify a small-scale candidate set of nodes to remove. In the second stage, we use an enumeration method to find an optimal combination among the candidate nodes. Extensive experimental results on synthetic and real-world networks demonstrate that the proposed method achieves a satisfying trade-off between effectiveness and efficiency. The introduced two-stage targeted enumeration framework can also be applied to other computationally intractable combinational optimization problems, from team assembly, via portfolio investment, to drug design.
... A complex system possesses constituents that are interrelated between themselves which is able represented as nodes (the composing elements of the system) and links (the known interconnection between nodes [5] [6]. The structure of these systems holds topological information that contains viable information which are processed into solutions to solve the inherent problems of the complex system being represented as a graphical network. ...
Article
Full-text available
Community detection using graph theory allows us to detect a community within an organized criminal syndicate that has network orientated structures. Classical community detection methods will have problems to detect communities with different network orientated structures even though they have similar nodes. Studying the inter-connections between the nodes by employing isomorphic subgraph analytics allows the researchers and law enforcement agencies to understand and to determine the key participants and the criminals' modus operandi of illicit operations. One of the domains which we have selected to work on is criminal network analysis as there is a lack of new perspective in the Criminal Network Analysis (CNA), which is urgently required as the modus operandi behind crimes are considerably complex now. We studied community detection in criminal networks using graph theory and formally introduced an algorithm that opened a new perspective of community detection compared to the traditional methods used to model the relations between objects using the isomorphic graph-based analytics. Community structure is an important property of complex networks, which is generally described as densely connected nodes and similar patterns of links. Our method differed from the traditional methods because our method allowed the law enforcement agencies to compare the detected communities, and this would allow a different point of view of the criminal network. This research allowed and assisted enforcement agencies and researchers to detect the same community from different patterns and structures by employing isomorphism. This would allow the detection of the communities that may not have been found using the traditional methods.
... The measure aroused some interest among scholars: for example, it has been used in Della Rossa et al. (2013) to detect the core-periphery structures in many real networks such as the Karate Club, the co-authorship, the proteins and the World Trade networks. Further analyses of the World Trade through persistence are available in Piccardi and Tajoli (2012) and they have been used to identifies the locali (the local mobs) of the n'drangheta criminal networks in Calderoni et al. (2017). ...
Preprint
The persistence probability is a statistical index that has been proposed to detect one or more communities embedded in a network. Even though its definition is straightforward, e.g, the probability that a random walker remains in a group of nodes, it has been seldom applied possibly for the difficulty of developing an efficient algorithm to calculate it. Here, we propose a new mathematical programming model to find the community with the largest persistence probability. The model is integer fractional programming, but it can be reduced to mixed-integer linear programming with an appropriate variable substitution. Nevertheless, the problem can be solved in a reasonable time for networks of small size only, therefore we developed some heuristic procedures to approximate the optimal solution. First, we elaborated a randomized greedy-ascent method, taking advantage of a peculiar data structure to generate feasible solutions fast. After analyzing the greedy output and determining where the optimal solution is eventually located, we implemented improving procedures based on a local exchange, but applying different long term diversification principles, that are based on variable neighborhood search and random restart. Next, we applied the algorithms on simulated graphs that reproduce accurately the clustering characteristics found in real networks to determine the reliability and the effectiveness of our methodology. Finally, we applied our method to two real networks, comparing our findings to what found by two well-known alternative community detection procedures.
... Community detection can be used in a variety of applications such as identifying the role of an undiscovered protein in biology, 2 clustering similar users for recommender systems, 3 optimization of routing policy in mobile networks, 4 and recently, identifying criminal and terrorist groups on social networks in security and criminology. 5 There are many different definitions of the term "community," usually depending on the field of application. Typically, a community is a group of nodes densely connected to each other and weakly connected to other communities in the graph. ...
Article
Full-text available
The community structure is proving to have a very important role in the understanding of complex networks, but discovering them remains a very difficult problem despite the existence of several methods. In this article, we propose a novel algorithm for discovering communities in complex networks based on a modified random walk (RW) and label propagation algorithm (LPA). First, we calculate the similarity between nodes based on the new formula of RW. Then, the labels are propagated by the obtained similarity of the first step using LPA. Finally, the third step will be a new measure to find the optimal partitioning of communities. Experimental results obtained on several real and synthetic networks reveal that our algorithm outperforms existing methods in finding communities.
Chapter
Right from the Internet, which has been publicly available, users have been able to engage with one another through virtual networks and in the last decade, due to the emergence of online social networks, community identification in complex networks has received a lot of attention. As community identification task involves identifying important people and their linkages, social network analysis is one such technique to analyse complex networks such as criminal networks. Keeping in view the diversity of actors and gangs involved in crime activities, the goal is to investigate and assess their characteristics so that the essential information characterising their behaviour is extracted. The current work will employ a social network analysis-based novel approach called sentiment analysis on influential nodes (SAOIN) to attain this important goal. Our approach claims to be computationally efficient as only the influential nodes (aka leaders) of the established subnetworks (communities) are taken into consideration for further investigation rather than inquiring each and every actor of a network. This discerns our model from other already existing community identifying techniques. The proposed model generates small subnetworks that can be used to discover the list of actors and their relationships that need to be inquired further, As opposed to other already existing community detecting methods that generates larger and much complex networks. This study inquires actors of the social network like Twitter whose activities promotes criminal propaganda across diverse stages. The information dissemination among these actors directs sole insight towards their behaviours.KeywordsNon-negative matrix factorization (NMF)Degree centrality
Article
The increase in executive pay has been attracting attention to the practice of peer benchmarking, which is commonly used to determine CEO compensation. Using a network approach, we construct and analyze the compensation peer network and examine how the structure of the network influences CEO pay. Using data on public firms in the period between 2006 and 2020, we find that the peer network exhibits strong community structure and that bridging across communities influences CEO pay. Specifically, CEO pay increases by approximately $10 million as the number of bridging ties increases from 30 to 100, which indicates that obfuscation can result in inflated CEO pay and supports the managerial power perspective. We empirically distinguish the predicted effect of peer benchmarking on CEO pay as outlined in the market for executive talent and the managerial power perspectives. We show that firms may avoid scrutiny and offer high CEO compensation when they either have a very small number of targeted bridging ties or a very large number of diffused, nontargeted bridging ties in the peer network. The intent of peer benchmarking was to make CEO compensation practices more transparent, legitimate, and functional; however, our findings indicate that these intentions have not been fully realized and instead benchmarking can be used to inflate CEO pay while avoiding stakeholder scrutiny through obfuscation. These insights provide an opportunity to policy makers to be more effective in encouraging additional transparency and stronger justification for boards' choice of peers.
Article
Full-text available
Corruption crimes demand highly coordinated actions among criminal agents to succeed. But research dedicated to corruption networks is still in its infancy and indeed little is known about the properties of these networks. Here we present a comprehensive investigation of corruption networks related to political scandals in Spain and Brazil over nearly three decades. We show that corruption networks of both countries share universal structural and dynamical properties, including similar degree distributions, clustering and assortativity coefficients, modular structure, and a growth process that is marked by the coalescence of network components due to a few recidivist criminals. We propose a simple model that not only reproduces these empirical properties but reveals also that corruption networks operate near a critical recidivism rate below which the network is entirely fragmented and above which it is overly connected. Our research thus indicates that actions focused on decreasing corruption recidivism may substantially mitigate this type of organized crime.
Article
Members of a criminal organization, who hold central positions in the organization, are usually targeted by criminal investigators for removal or surveillance. This is because they play key and influential roles by acting as commanders, who issue instructions or serve as gatekeepers. Removing these central members (i.e., influential members) is most likely to disrupt the organization and put it out of business. Most often, criminal investigators are even more interested in knowing the portion of these influential members, who are the immediate leaders of lower level criminals. These lower level criminals are the ones who usually carry out the criminal works; therefore, they are easier to identify. The ultimate goal of investigators is to identify the immediate leaders of these lower level criminals in order to disrupt future crimes. We propose, in this paper, a forensic analysis system called SIIMCO that can identify the influential members of a criminal organization. Given a list of lower level criminals in a criminal organization, SIIMCO can also identify the immediate leaders of these criminals. SIIMCO first constructs a network representing a criminal organization from either mobile communication data that belongs to the organization or crime incident reports. It adopts the concept space approach to automatically construct a network from crime incident reports. In such a network, a vertex represents an individual criminal, and a link represents the relationship between two criminals. SIIMCO employs formulas that quantify the degree of influence/importance of each vertex in the network relative to all other vertices. We present these formulas through a series of refinements. All the formulas incorporate novel-weighting schemes for the edges of networks. We evaluated the quality of SIIMCO by comparing it experimentally with two other systems. Results showed marked improvement.
Article
In this paper we present the results of the study of Sicilian Mafia organization by using Social Network Analysis. The study investigates the network structure of a Mafia organization, describing its evolution and highlighting its plasticity to interventions targeting membership and its resilience to disruption caused by police operations. We analyze two different datasets about Mafia gangs built by examining different digital trails and judicial documents spanning a period of ten years: the former dataset includes the phone contacts among suspected individuals, the latter is constituted by the relationships among individuals actively involved in various criminal offenses. Our report illustrates the limits of traditional investigation methods like tapping: criminals high up in the organization hierarchy do not occupy the most central positions in the criminal network, and oftentimes do not appear in the reconstructed criminal network at all. However, we also suggest possible strategies of intervention, as we show that although criminal networks (i.e., the network encoding mobsters and crime relationships) are extremely resilient to different kind of attacks, contact networks (i.e., the network reporting suspects and reciprocated phone calls) are much more vulnerable and their analysis can yield extremely valuable insights.