Generalized Rich-Club Ordering in Networks
Matteo Cinelli 1
Department of Enterprise Engineering, Tor Vergata, Via del Politecnico, 1 - 00133 Rome, Italy
Abstract Rich-club ordering refers to the tendency of nodes with a high degree to be more interconnected than
expected. In this paper we consider the concept of rich-club ordering when generalized to structural measures that
differ from the node degree and to non-structural measures (i.e. to node metadata). The differences in considering
rich-club ordering (RCO) with respect to both structural and non-structural measures are then discussed in terms
of employed coefficients and of appropriate null models (link rewiring vs metadata reshuffling). Once a framework
for the evaluation of generalized rich-club ordering (GRCO) is defined, we investigate such a phenomenon in real
networks provided with node metadata. By considering different notions of node richness, we compare structural and
non-structural rich-club ordering, observing how external information about the network nodes is able to validate the
presence of rich-clubs in networked systems.
Keywords rich-club, node metadata, generalized, null model
I. INTRODUCTION
Networks are characterized by a number of topological properties that are able to provide important insights into
their functional aspects. Well-known examples are represented by the presence of communities [1], i.e. subgraphs
whose nodes have a higher probability to be linked to every node of the subgraph than to any other node of the
graph [2], or of core-periphery structures [3], i.e. structures that allow for the partitioning of the network into a
set of central and densely connected nodes (the core) and a set of noncentral and sparsely connected nodes (the
periphery) [4]. When the hubs of a certain network are densely interconnected (i.e. they form a tight subgraph often
referred to as core) such a network is said to display a rich-club [5]. The presence of a rich-club is quantitatively
recognized through the rich-club coefficient φ(k), which measures the ratio between the number of links among
the nodes having degree higher than a given value k and the maximum possible number of links among such nodes.
The rich-club coefficient, when normalized by its expectation over a set of rewired networks with the same degree
sequence as the original one, is called φ(k)norm, and a network is said to display rich-club ordering when φ(k)norm > 1.
This phenomenon has been extensively investigated [6–11] as well as recognized in several real networks [12–15], with
special focus on neuroscience [16–19]. Such developments have fostered further research related to the study of the
network core such as its size [6, 20] and its contribution to network resilience [21], as well as its functional role [22].
Here we consider the concept of rich-club ordering by investigating the interconnections between nodes that are
considered as important from a number of different perspectives. Building on this, the generalized rich-club ordering
(GRCO) refers to the tendency of important nodes (under a certain declared point of view) to form a core denser than
expected. The importance of nodes can be evaluated from a structural point of view, e.g. the node degree or other
nodal centrality measures, and from a non-structural point of view, e.g. the node metadata. Node metadata refer
to non-structural information, such as social or technical attributes, related to network nodes that possibly display
a certain correlation with the observed network structure, and their importance is increasingly being recognized in
terms of understanding networked systems [23–28]. Additionally, node metadata represent exogenous information
about the network nodes (also in weighted networks) that may be impossible to split over the network links. It follows
that the study of GRCO becomes particularly interesting when dealing with networks with various node metadata;
as such, we aim to investigate the interrelation between such node metadata and the network structure.
For instance, if we consider a social network with known individuals’ incomes, we may find that the nodes with the
highest incomes, which are not necessarily hubs, are more interconnected than expected, while those with the highest
degree are not. Moreover, it is important to recall that, despite rich-club ordering and assortativity being two related
concepts, positive assortativity doesn't necessarily imply rich-club ordering, and vice versa [7].
In the network from Figure 1 we have a slightly wealth-disassortative network (rwealth = −0.052) in which, conversely,
the wealthiest nodes (five nodes, if we set the wealth threshold to w > 93) are tightly connected (they share 7 links
out of a possible 10) despite the fact that they are not the hubs of the considered network. Moreover, this network, which displays
rdegree = 0.282, doesn't show rich-club ordering with respect to degree for any value of k (i.e. φ(k)norm < 1 ∀k). This means
that rich-club ordering with respect to node metadata does not imply rich-club ordering with respect to node degree, and vice versa. In a more
general sense, we note that in the case of rich-club ordering in terms of node degree it is easy to compute φ(k)norm,
because we know which null model to use (degree-preserving rewiring [29]), while in terms of node wealth the situation
becomes trickier, since wealth can't be directly considered a structural property. For this reason, in the following
sections we provide a framework for the evaluation of GRCO, together with specific null models
for evaluating the significance of rich-club ordering in the case of node metadata.
¹ matteo.cinelli@uniroma2.it
FIG. 1. Two toy networks with the same topology for which rich-club ordering can be evaluated with respect to structural and
non-structural measures. On the left, the node labels correspond to their wealth, while on the right, the node labels correspond
to their degree.
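As a quick check of the toy example, the density of links among the five wealthiest nodes can be written out with the same ratio that defines the rich-club coefficient. The symbols E and N with the subscript w > 93 are introduced here only for illustration; the counts (5 nodes, 7 links) are those quoted in the text.

```latex
\phi(w > 93) \;=\; \frac{2\,E_{w>93}}{N_{w>93}\,(N_{w>93}-1)}
           \;=\; \frac{2 \cdot 7}{5 \cdot 4} \;=\; 0.7
```

A value of 0.7 means that the wealthiest nodes realize 70% of the links they could possibly share, which is what "7 links out of 10" expresses.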
II. RELATED WORKS
The concept of rich-club ordering was initially introduced by Zhou and Mondragón [5] in order to analyze the
Internet topology at the Autonomous Systems level and to provide a reasonable explanation as to why this kind of
network includes tightly interconnected hubs. In order to investigate the presence of a rich-club, the authors of [5]
introduced the rich-club coefficient φ(r) in terms of the rank r of the node (the sequence of ranks reflects the sequence of
node degrees arranged in non-increasing order).
After the contribution of [5], Colizza et al. [30] considered the rich-club coefficient φ(k) in terms of the degree k of
the node (here the ordering is opposite to that of the ranks, i.e. the node degrees are arranged in
non-decreasing order). Since the work of [30], the rich-club coefficient has mostly been used in its φ(k) version;
however, the two coefficients yield identical results.
Despite the different formulations of the rich-club coefficient, the fundamental contribution of [30] lies in
the use of a null model to detect the presence of the rich-club. This null model exploits the procedure of
degree-preserving rewiring introduced in [29] in order to compute the normalized rich-club coefficient φ(k)norm. The reason
behind the necessity of a null model for studying rich-club ordering is the observation of the monotonically increasing
behavior of the rich-club coefficient φ(k). This claim was subsequently disputed by the authors of [7], who also
discussed the interrelation of the assortativity coefficient [23] with the rich-club structure and introduced a null model
able to preserve the density of the rich-club.
Other works concerning the evaluation of rich-club ordering are related to its statistical significance [10, 31] in terms
of p-value under different null models; to its effect on other structural measures, such as the clustering coefficient
and degree assortativity [8] that are strongly influenced by the rich-club density; to the improvement of the rich-club
coefficient itself [6], thus allowing one to consider the constraints introduced by the degree sequence of the network.
Together with its implementation on unweighted networks, the concept of rich-club ordering has also been extended
to weighted networks by using various null models which are able to preserve different aspects of the network, including
degree and strength distribution [32, 33]. Other extensions of the rich-club involve dense [34], hierarchical [35]
and interdependent networks [36]. Moreover, other contributions discuss the importance of a rich-club in network
robustness [21], spreading processes [37], as well as the generative processes and dynamics that lead to networks
displaying a rich-club [9, 38].
Finally, another important aspect related to the presence of a rich-club is to measure its size in terms of number
of nodes. This has been achieved through the persistence probability of a random walker in the network [20] and
the number of nodes that are necessary to realize a complete subgraph from the degree sequence of the considered
network [6].
III. EVALUATING RICH-CLUB ORDERING FOR THE NODE DEGREE
Rich-club ordering can be quantified using the coefficient φ(k):
$$\phi(k) = \frac{2E_{>k}}{N_{>k}(N_{>k}-1)} \qquad (1)$$
where E>k is the number of links among the N>k nodes having degree higher than a given value k, and N>k(N>k − 1)/2 is
the maximum possible number of links among the N>k nodes. Therefore, φ(k) measures the fraction of links connecting
the N>k nodes out of the maximum number of links they might possibly share. This implies that φ(k) = 1 when the
N>k nodes are arranged into a clique. When rich-club ordering is investigated, the rich-club coefficient needs to be
compared against a null model in order to evaluate its significance (i.e. to test that the presence of rich-club ordering is
not a natural consequence of the considered degree sequence). The use of null models and of the normalization process
of structural measures in complex networks represents a practice widely used to comprehend whether an observed
pattern could have arisen by chance. For this reason, the normalization of the rich-club coefficient, suggested in [30]
and adopted in many further studies [10, 17, 34, 35, 39–41], is a necessary procedure that has to be adopted in order to
take into account the significance of this index. The normalization procedure of φ(k) involves an ensemble of rewired
networks which have the same degree sequence of the one under investigation and that, if generated in a sufficiently
large number, provide a null distribution of the rich-club coefficient. The rewiring procedure itself is simple: at each
step it chooses two arbitrary edges, e.g. (a,b) and (c,d), and swaps their endpoints so as to
obtain (a,d) and (c,b) [29]; if one or both of these new links already exist in the network, the step is
aborted and a new pair of links is selected. This procedure has been widely adopted since it preserves an
important network property, namely the node degrees; however, other procedures that aim at preserving other
properties may be adopted [7, 9, 31, 42, 43].
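The rewiring step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the paper: the function names, the edge-list representation, the `max_tries` guard, the random choice of orientation for the second edge and the rejection of self-loops are all assumptions added here; only the swap (a,b),(c,d) to (a,d),(c,b) and the abort rule come from the text.

```python
import random
from collections import Counter


def degree_preserving_rewire(edges, n_swaps, seed=0, max_tries=100_000):
    """Rewire an undirected simple graph, given as (u, v) pairs, by double edge swaps."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # assumption: pick the orientation of the second edge at random
        if a == d or c == b:
            continue  # assumption: also reject swaps that would create a self-loop
        new_ad, new_cb = frozenset((a, d)), frozenset((c, b))
        if new_ad in present or new_cb in present:
            continue  # a new link already exists: abort and draw another pair
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new_ad, new_cb}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degree_sequence(edges):
    """Degree of every node that appears in the edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


# Hypothetical toy graph: a 6-cycle with two chords.
toy = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
rewired = degree_preserving_rewire(toy, n_swaps=20, seed=1)
```

Whatever sequence of swaps is accepted, every node keeps its degree, which is the property the null model relies on.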
In general, the normalized rich-club coefficient [30] is defined as:
$$\phi(k)_{norm} = \frac{\phi(k)}{\phi(k)_{rand}} \qquad (2)$$
where φ(k)rand is the average rich-club coefficient across the set of rewired networks (typically 1000 networks [15, 18,
19]), and we observe rich-club ordering when φ(k)norm > 1.
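The normalization can be illustrated with a small, deterministic sketch. The function names and the toy graphs are hypothetical; in practice the ensemble would contain many rewired networks (the text mentions 1000 as typical), whereas here two hand-written graphs with the same degree sequence as the observed one stand in for it so that the arithmetic can be followed by eye.

```python
from collections import Counter


def rich_club_coefficient(edges, k):
    """phi(k): density of links among nodes with degree > k (None if fewer than 2 such nodes)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    rich = {node for node, d in deg.items() if d > k}
    n_rich = len(rich)
    if n_rich < 2:
        return None
    e_rich = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * e_rich / (n_rich * (n_rich - 1))


def normalized_rich_club(edges, ensemble, k):
    """phi(k)_norm: observed phi(k) divided by its average over the randomized ensemble."""
    phi = rich_club_coefficient(edges, k)
    rand = [rich_club_coefficient(g, k) for g in ensemble]
    if phi is None or any(r is None for r in rand):
        raise ValueError("fewer than two nodes with degree > k")
    return phi / (sum(rand) / len(rand))


# Observed toy graph: a 4-clique of degree-3 nodes plus a separate 4-cycle of degree-2 nodes.
observed = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
            (4, 5), (5, 6), (6, 7), (7, 4)]
# Two hand-made randomizations with the same degree sequence (one and two edge swaps).
rewired_1 = [(0, 5), (1, 4), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (5, 6), (6, 7), (7, 4)]
rewired_2 = [(0, 5), (1, 4), (0, 2), (0, 3), (1, 2), (1, 3), (2, 7),
             (3, 6), (5, 6), (7, 4)]

phi_obs = rich_club_coefficient(observed, k=2)                      # 6 links out of 6
phi_norm = normalized_rich_club(observed, [rewired_1, rewired_2], k=2)
```

In this toy case the observed coefficient is 1, the two randomized graphs give 5/6 and 4/6, and the ratio is 1/0.75, i.e. above one.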
Additionally, in [7] it is argued that when the maximum degree kmax of the considered network
is larger than the cut-off degree ks [44] (i.e. the degree value above which it is impossible to obtain networks with
no degree-degree correlation), the degree-preserving rewiring could produce randomized networks with a rich-club
coefficient that is too close to the initial one. This is because the rewiring procedure, in which pairs of links are
uniformly sampled, could cause the disruption of several high-degree to low-degree connections with the consequent
creation of high-degree to high-degree connections. Indeed, since a high proportion of links is attached to hubs, the
probability of picking a pair of links whose endpoints are hubs is relatively high, particularly when there are nodes
whose degree exceeds ks (i.e. kmax > ks).
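The text does not state how ks is computed. A commonly used expression for the structural cut-off of uncorrelated networks is ks ≈ sqrt(⟨k⟩N), which for an undirected network equals sqrt(2m); the sketch below assumes this form, and the function name is hypothetical. Under that assumption, the network sizes reported in Section VIII give values compatible with the relations quoted there (ks ≃ kmax = 25 and ks < kmax = 166).

```python
import math


def structural_cutoff(n_nodes, n_links):
    """Assumed form of the cut-off degree: k_s = sqrt(<k> * N), with <k> = 2m / N."""
    mean_degree = 2 * n_links / n_nodes
    return math.sqrt(mean_degree * n_nodes)


ks_gang = structural_cutoff(54, 315)       # criminal network: n = 54, m = 315
ks_airports = structural_cutoff(745, 4618)  # US airports: n = 745, m = 4618
```

With these inputs the cut-off is about 25.1 for the criminal network and about 96.1 for the airports network.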
IV. NULL MODELS FOR THE EVALUATION OF RICH-CLUB ORDERING
The evaluation of rich-club ordering in the case of degree exploits a null model that rewires the network while
keeping its degree sequence. As the degree can be considered a structural attribute of the node, a null model that
evaluates different network topologies (i.e. alters the original network structure while keeping certain fundamental
properties) constitutes a reasonable choice. The same choice seems to be reasonable also in the case of other structural
properties of the node (such as centrality measures) even if the degree-preserving rewiring doesn’t keep the same value
of centrality over the nodes due to the topology being subject to change.
In the case of non-structural attributes (i.e. node metadata), the structural rewiring doesn’t seem to be the unique
option. Indeed, we may be interested in knowing if different arrangements of the node metadata over the same network
structure are able to unveil rich-club ordering as well. In other words, we may also be interested in using a null model
that keeps the original network structure while reshuffling the node metadata.
More intuitively, when we evaluate rich-club ordering with link rewiring we are basically asking the question: does
the considered network possess a topology so unusual that it allows room for rich-club ordering? Alternatively, if
we evaluate rich-club ordering with metadata reshuffling we are basically asking the question: does the considered
network possess an arrangement of node metadata so unusual that it allows room for rich-club ordering?
As an example, let us suppose that a relatively large clique of wealthy nodes is present in a certain social network,
like that in the example of Section I. The two questions from above then become:
1. Is it so peculiar, given the degree of the wealthy nodes, to observe a realization that contains a clique made up
of such nodes?
2. Is it so peculiar, given the network structure, to observe a distribution of the node metadata such that the
wealthy nodes are arranged into a clique?
Consequently, we may be interested in understanding if the metadata distribution is related to the presence of
structural rich-club ordering, and if rich-club ordering, evaluated with respect to node metadata, can be interpreted
as a reinforcement (or a weakening) of the evidence of structural rich-club ordering. Indeed, the exploitation of node
metadata takes into account an additional layer of information that derives from the coupling between the network
structure and the node metadata.
Thus, in order to evaluate rich-club ordering in the case of node metadata we suggest a comparison between the
number of links observed among the rich nodes and the average number of links observed among such rich nodes over
two different ensembles: the former made up of networks and obtained via the rewiring of links; the latter made up
of vectors of metadata and obtained via the reshuffling of node attributes.
These two methods generate different ensembles in which different aspects of the original network are kept. In
the first case (link rewiring) we lose the original network topology while we keep its degree sequence and its
degree-attribute correlation. In the second case (metadata reshuffling) we lose the degree-attribute correlation but we keep the original network
structure. Both methods, despite their clear differences, seem to provide a valid basis of comparison in the case of
node metadata. Moreover, when we evaluate GRCO considering node metadata, if the node metadata and the node
degrees don’t display a significant correlation (either positive or negative), we could observe a set of rich nodes with
very heterogeneous degrees. This doesn’t imply, however, the absence of a dense subgraph made up of rich nodes.
As an example, if we suppose that a subgraph of 15 nodes has the highest metadata values, then the degree of such
nodes has to be at least 14 in order to realize a clique that is connected to the rest of the graph; thus, considering
a sufficiently large network, such nodes don’t have to be necessarily hubs in order to establish a rich-club from the
metadata point of view. It follows that a positive correlation between node metadata and node degree has an effect
on the rich-club evaluation and that, in certain cases, the presence of rich-club ordering with respect to node degree
could also indicate rich-club ordering with respect to node metadata.
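The arithmetic behind the 15-node example can be spelled out as follows; the symbols E_clique and k_i are introduced here only for illustration.

```latex
E_{\text{clique}} = \binom{15}{2} = \frac{15 \cdot 14}{2} = 105,
\qquad
k_i \ge 15 - 1 = 14 \quad \text{for every node } i \text{ in the clique}
```

In a network with many more than 15 nodes, a degree of 14 can be far below the maximum degree, which is why the rich nodes need not be hubs.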
As an extreme case, if we have a perfect positive correlation between the considered attribute and the node degree
(ρx,k = 1), the sorting of the node metadata will correspond to the sorting of the node degrees (i.e. to the degree
sequence). Therefore, we will evaluate rich-club ordering with the same sorting of nodes but with implications
that differ depending on the null model that we choose to adopt. Additionally, we should consider that the
degree-attribute correlation, when significant, represents an important feature of the considered network which, if not
completely dropped, can be used to further stress the presence of rich-club ordering. Therefore,
rich-club ordering in the case of node metadata could be evaluated with respect to random reshuffling, which
provides an ensemble of reshuffled labels that are, in general, uncorrelated with the network structure in
terms of node degrees; it could also be stressed with respect to a reshuffling procedure that, similarly to [45], keeps the
distribution of the degree-attribute correlation somewhat closer to that of the original network.
In order to address this point, in Section IX we introduce a procedure which, by keeping a certain node-attribute
correlation, aims at further stressing the presence of rich-club ordering by comparing it with a somewhat unfavorable
set of metadata shuffles.
V. EVALUATING RICH-CLUB ORDERING FOR STRUCTURAL MEASURES DIFFERENT FROM
DEGREE
When evaluating rich-club ordering (i.e. computing φnorm) with respect to structural measures different from node
degree, we should take into account that the degree-preserving rewiring entails the following two aspects:
1. The structural measure of node i may change its value due to rewiring.
2. The number of nodes that retain a value of the considered structural measure above the threshold for which we
evaluate rich-club ordering may change.
In order to address this problem we evaluate rich-club ordering by creating a ranking of nodes; in other words, we
consider the rich-club coefficient as a measure of position. We thus rank, for the original network and for each network in the random ensemble, the
nodes in non-decreasing order of the considered measure and we assign each of them a position p with p ∈ [1, N].
Then we consider the number of links among the nodes that have a rank greater than a given value p. In other
words, while the degree sequence is fixed across rewired networks, the rewiring procedure may alter the values of
the structural measure associated with each node and, consequently, the number of nodes with a certain value of such a
measure. Ranking the nodes lets us consider the same number of nodes at each iteration (which corresponds to
keeping the denominator of the coefficient constant for a given position) both in the original network and in the randomized ensemble.
In this way, for each network, the node with the lowest value of the considered measure will be
in position 1 while that with the highest value will be in position N, despite the possible differences in the highest/lowest
values among different networks.
Therefore, in order to compute φ(p) we compute the density of connections among nodes whose index of position
is greater than p:
$$\phi(p) = \frac{2E_{>p}}{N_{>p}(N_{>p}-1)} \qquad (3)$$
where E>p is the number of edges among the N>p nodes with centrality value greater than the value in position p,
and N>p(N>p − 1)/2 is the maximum possible number of edges among the N>p nodes.
By using this procedure we obtain φ(p)norm = φ(p)/φ(p)rand, where φ(p)rand is the average of φ(p) over the random
ensemble. It is worth mentioning that this way of computing the coefficient φ(p) (as a measure of position) is similar
to that proposed in the paper that originally discussed rich-club ordering [5]. Additionally, this measure is also
related to the rich-club coefficient for weighted networks (i.e. networks with non-binary links) proposed in [33]. The
authors of [33] consider structural measures such as the node strength (i.e. the sum of the weights attached to
the links of a certain node) and the average weight (i.e. the ratio between the node strength and the node degree),
and they normalize the rich-club coefficient with a method that reshuffles the weights over the links and then the links
themselves.
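The position-based coefficient φ(p) can be sketched as below. The function names, the toy graph and the centrality values are hypothetical, and breaking ties by node identifier is an assumption made here so that every node gets a distinct position; the paper does not specify a tie rule.

```python
def rank_positions(scores):
    """Map each node to a position p in [1, N], in non-decreasing order of its score."""
    # Assumption: ties are broken by node identifier so that positions are unique.
    ordered = sorted(scores, key=lambda node: (scores[node], node))
    return {node: pos for pos, node in enumerate(ordered, start=1)}


def phi_position(edges, scores, p):
    """phi(p): density of links among the nodes whose position is greater than p."""
    positions = rank_positions(scores)
    rich = {node for node, q in positions.items() if q > p}
    n_rich = len(rich)
    if n_rich < 2:
        return None
    e_rich = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * e_rich / (n_rich * (n_rich - 1))


# Hypothetical toy graph: a triangle (0, 1, 2) with a tail 0-3-4.
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (3, 4)]
# Hypothetical centrality values (any structural measure could be used).
centrality = {0: 0.9, 1: 0.7, 2: 0.7, 3: 0.4, 4: 0.1}

phi_p2 = phi_position(edges, centrality, p=2)  # nodes in positions 3..5: the triangle
phi_p1 = phi_position(edges, centrality, p=1)  # nodes in positions 2..5: 4 links out of 6
```

Because positions always run from 1 to N, the number of nodes above position p is N − p in the original network and in every randomized instance, so the denominator of the coefficient is the same across the ensemble.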
VI. EVALUATING RICH-CLUB ORDERING FOR NON-STRUCTURAL MEASURES
The rich-club coefficient in the case of node metadata can be computed only for scalar metadata as we need a
quantity which, like the node degree or other structural measures, can be sorted in a certain order. The coefficient can
be easily derived from the case of node degree by considering, instead of the degree k, a certain value m corresponding
to the value of the node metadata. Therefore, rich-club ordering can be discovered via the coefficient φ(m):
$$\phi(m) = \frac{2E_{>m}}{N_{>m}(N_{>m}-1)} \qquad (4)$$
where E>m is the number of edges among the N>m nodes having metadata value higher than a given value m, and
N>m(N>m − 1)/2 is the maximum possible number of edges among the N>m nodes.
The normalized rich-club coefficient, φ(m)norm, can be derived by considering m as a
certain value of the node metadata, whilst considering φ(m)rand from two different perspectives. In other words, in
the case of node metadata, we obtain two values of φ(m)rand that depend on the null model that we use. In the case
of link rewiring, φ(m)rand is called $\phi(m)^{rew}_{rand}$ and we use, as for example in [46], the coefficient:
$$\phi(m)^{rew}_{norm} = \frac{\phi(m)}{\phi(m)^{rew}_{rand}} \qquad (5)$$
while in the case of metadata reshuffling, φ(m)rand is called $\phi(m)^{resh}_{rand}$ and we use the coefficient:
$$\phi(m)^{resh}_{norm} = \frac{\phi(m)}{\phi(m)^{resh}_{rand}} \qquad (6)$$
Finally, it is worth adding that, in the case of non-structural measures, neither the rewiring nor the reshuffling procedure
affects the values of the metadata vector, whose entries, in the latter case, are only modified in terms
of position.
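The metadata-reshuffling variant of the normalized coefficient can be sketched as follows. The function names, the toy graph, the metadata values and the number of shuffles are hypothetical choices made for illustration. The rewiring variant would follow the same pattern, with the shuffle replaced by a degree-preserving rewiring of the edge list while the metadata stay attached to their nodes.

```python
import math
import random


def phi_metadata(edges, meta, m):
    """phi(m): density of links among nodes whose metadata value is greater than m."""
    rich = {node for node, value in meta.items() if value > m}
    n_rich = len(rich)
    if n_rich < 2:
        return None
    e_rich = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * e_rich / (n_rich * (n_rich - 1))


def phi_resh_norm(edges, meta, m, n_shuffles=500, seed=0):
    """Observed phi(m) divided by its average over random reassignments of the metadata."""
    observed = phi_metadata(edges, meta, m)
    if observed is None:
        raise ValueError("fewer than two nodes with metadata value > m")
    rng = random.Random(seed)
    nodes = list(meta)
    values = [meta[node] for node in nodes]
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(values)  # same values, new positions: the network itself is untouched
        total += phi_metadata(edges, dict(zip(nodes, values)), m)
    mean_rand = total / n_shuffles
    return observed / mean_rand if mean_rand > 0 else math.inf


# Hypothetical toy network: a triangle (0, 1, 2) attached to a path 2-3-4-5-6-7.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
# Hypothetical metadata: the three highest values sit on the triangle.
wealth = {0: 10, 1: 9, 2: 8, 3: 1, 4: 2, 5: 3, 6: 4, 7: 5}

phi_obs = phi_metadata(edges, wealth, m=7)
phi_norm_resh = phi_resh_norm(edges, wealth, m=7)
```

Since reshuffling only permutes the metadata vector, the number of nodes above the threshold m is the same in every shuffle, so only the numerator of φ(m) varies across the ensemble.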
VII. A FRAMEWORK FOR THE EVALUATION OF GENERALIZED RICH-CLUB ORDERING
(GRCO)
In Table I we propose a framework for the evaluation of generalized rich-club ordering (GRCO).²
Consider a certain node feature. It can be structural or non-structural:
Node feature   | Case     | Coefficient              | Null model                 | Definition
Structural     | degree   | φ(k)norm                 | degree-preserving rewiring | Equation 2
Structural     | ≠ degree | φ(p)norm                 | degree-preserving rewiring | Equation 3
Non-structural | metadata | $\phi(m)^{rew}_{norm}$   | degree-preserving rewiring | Equation 5
Non-structural | metadata | $\phi(m)^{resh}_{norm}$  | metadata reshuffling       | Equation 6
TABLE I. Generalized framework for the evaluation of rich-club ordering
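The decision logic of Table I can be written as a small lookup. The function name, argument names and string labels are hypothetical; the mapping itself (which coefficient, which null model, which equation) is the one given in the table.

```python
def grco_plan(feature_kind, is_degree=False):
    """Return the (coefficient, null model, equation) triples that Table I prescribes."""
    if feature_kind == "structural":
        if is_degree:
            return [("phi(k)_norm", "degree-preserving rewiring", "Equation 2")]
        return [("phi(p)_norm", "degree-preserving rewiring", "Equation 3")]
    if feature_kind == "non-structural":
        return [
            ("phi(m)^rew_norm", "degree-preserving rewiring", "Equation 5"),
            ("phi(m)^resh_norm", "metadata reshuffling", "Equation 6"),
        ]
    raise ValueError("feature_kind must be 'structural' or 'non-structural'")


plan_degree = grco_plan("structural", is_degree=True)
plan_centrality = grco_plan("structural")
plan_metadata = grco_plan("non-structural")
```

Note that only the non-structural case yields two coefficients, one per null model.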
VIII. APPLICATION
A. Social Network
We test the introduced framework in the case of a criminal social network [49]. The network is made up of the
relationships (m = 315, which we consider unweighted) among confirmed members (n = 54) of a London street gang
between 2005 and 2009. We choose this network as it comes with various node metadata (which are an important piece
of information in criminal networks [50]) such as age, number of arrests and number of convictions. We compute the rich-club
coefficient for two structural characteristics of nodes, degree and eigenvector centrality, and for three non-structural
characteristics, corresponding to the node metadata, using degree-preserving rewiring (ks ≃ kmax = 25) and node
metadata reshuffling. In Figure 2, we observe that the considered network displays rich-club ordering with respect to degree (φ(k)norm > 1).
Thus, in such a network, hubs happen to be more connected than what we observe, on average, across
the rewired network ensemble. The network also displays what we can call power-club ordering, as the nodes with the
highest eigenvector centrality are also tightly connected. The latter result is, however, expected since the degree and
the eigenvector centrality are, in general, positively correlated [51].
When we consider the node age and the number of arrests we also observe rich-club ordering. In particular, we observe
that, in the two cases, the metadata reshuffling entails a stronger rich-club ordering than
the link rewiring ($\phi(m)^{resh}_{norm} > \phi(m)^{rew}_{norm} > 1$). This means that the metadata are arranged in a way that elicits rich-club ordering and that this
arrangement is hard to replicate via random label reshuffling. The fact that the two measures are both in favour of
rich-club ordering indicates that the presence of this phenomenon is far from being random from different perspectives,
thus underlining the importance of the interplay between the node metadata and the network topology. Conversely,
we observe slightly discordant results when the number of convictions is taken into account. For a certain value of
k (k = 9), rich-club ordering appears to be absent from the structural point of view and present from the metadata
point of view. Such a discrepancy is due to the fact that nodes with the highest number of convictions (> 9 in
this case) also have heterogeneous degrees. Indeed, there are five nodes with more than nine convictions, and they
have degree d = [2, 2, 2, 14, 16]. The maximum number of links that those five nodes could share is ten, while in
the actual network they only share two links. Such a small number of links is also due to the presence of nodes
with very low degree. Moreover, since about 90% of the nodes in the actual network have degree higher than 2 and
the network, being relatively dense, displays several complete subgraphs of size 5, we should expect the reshuffled
instances to yield a normalized rich-club coefficient below one. In other words, we should expect a higher number of links
among highly convicted nodes in randomized networks. In this case, the value of the rich-club coefficient $\phi(k)^{resh}_{norm}$
indicates that highly convicted elements tend to avoid each other.
² R code for the evaluation of GRCO exploits the libraries igraph [47] and brainGraph [48] and is available at https://github.com/cinHELLi
FIG. 2. Curves of the coefficient φnorm for the criminal social network. The dashed line corresponds to
φnorm = 1, the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector centrality,
age (rewiring), age (reshuffling), arrests (rewiring), arrests (reshuffling), convictions (rewiring), convictions (reshuffling).
The horizontal axes report k (degree), p (eigenvector) and m (age, number of arrests, number of convictions).
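The figures quoted for the five most convicted nodes translate into the following unnormalized coefficient; the subscript notation is introduced here only for illustration.

```latex
\phi(m > 9) \;=\; \frac{2\,E_{>9}}{N_{>9}\,(N_{>9}-1)}
          \;=\; \frac{2 \cdot 2}{5 \cdot 4} \;=\; 0.2
```

Only 20% of the possible links among these nodes are present, and three of the five nodes have degree 2, so each of those three can contribute at most two links to the club.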
More technically, this analysis entails that the degree heterogeneity of the nodes that we take into account, when
related to other elements such as the network density, is able to explain the discrepancies observed between the
coefficients $\phi(k)^{rew}_{norm}$ and $\phi(k)^{resh}_{norm}$. A discordant result between the two coefficients is also partially explained by the
low value of the correlation coefficient between the degree and the number of convictions, which is ρdeg,conv = 0.058.
B. Linguistic Network
We consider the global language network, in which each node represents a language and links connect languages
that are likely to be co-spoken [52]. In more detail, languages are connected according to the frequency of book
translations, i.e. two languages are connected if at least one book has been translated from one language to the
other. The data are pre-processed in order to consider only the largest connected component of the language network
compatibly with the availability of node metadata. The resulting network has n = 54 nodes and m = 104 links, and
the node metadata are represented by two elements: the GDP (gross domestic product) per capita for a language and
the number of speakers of a language. As described in [52], the GDP per capita for a language is measured
as the average contribution of a single speaker of language l to the world GDP, and is calculated by adding the
contributions of the speakers of l to the GDP of every country and dividing the sum by the number of speakers of l. The
number of speakers of a language is computed using the speaker estimates from the June 14, 2012 version
of the Wikipedia Statistics page, as explained in [52]. In Figure 3 we observe how the global language network tends
to display rich-club ordering from both a structural and a non-structural point of view. Interestingly, languages
with a relatively high number of speakers are, in some instances, less interconnected than expected, meaning that
certain books originally written in a widely spoken language were not directly translated into the other most spoken
languages. It is likely that such books were first translated into other, less spoken languages, which acted as intermediaries.
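The GDP per capita for a language described above can be sketched as a speaker-weighted average. This is one possible reading, not the exact procedure of [52]: it assumes that the contribution of the speakers of l to the GDP of a country equals that country's GDP per capita times the number of speakers of l living there. The function name and the numbers are hypothetical.

```python
def gdp_per_capita_for_language(speakers_by_country, gdp_per_capita_by_country):
    """Sum the speakers' contributions over all countries, then divide by the total speakers."""
    total_speakers = sum(speakers_by_country.values())
    # Assumption: each speaker contributes the GDP per capita of the country they live in.
    contribution = sum(
        count * gdp_per_capita_by_country[country]
        for country, count in speakers_by_country.items()
    )
    return contribution / total_speakers


# Hypothetical language spoken in two countries (speakers in millions, GDP per capita in $).
speakers = {"A": 10, "B": 30}
gdp_pc = {"A": 40000.0, "B": 20000.0}
value = gdp_per_capita_for_language(speakers, gdp_pc)  # (10*40000 + 30*20000) / 40
```

Being a scalar attached to each node, this quantity can be thresholded and sorted like any other metadata value in φ(m).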
FIG. 3. Curves of the coefficient φnorm for the global language network. The dashed line corresponds to
φnorm = 1, the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector centrality,
GDP per capita (rewiring), GDP per capita (reshuffling), number of speakers (rewiring), number of speakers (reshuffling).
The horizontal axes report k (degree), p (eigenvector) and m (GDP per capita in $, speakers in millions).
C. Transportation Network
We also consider the US airports network of domestic flights in December 2010 (taken in its undirected and unweighted
version, with $n = 745$, $m = 4618$ and $k_s < k_{\max} = 166$)³, in which the number of flights departing from each
airport and the number of passengers leaving each airport are used as node metadata. These two quantities are, as one
would reasonably expect, very highly correlated with the node degree: $\rho(\mathrm{pass}, \mathrm{deg}) = 0.906$ and
$\rho(\mathrm{dep}, \mathrm{deg}) = 0.928$. For this reason, when we evaluate rich-club ordering with respect to node
metadata, we observe values of $\phi^{\mathrm{rew}}_{\mathrm{norm}}(m)$ and $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$ that are both positive
but very different. In more detail, in Figure 4 we observe very high values of $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$, which
arise because a random reshuffling of the node metadata causes a complete loss of the observed correlation between
the metadata and the degree. The US airports network displays rich-club ordering with respect to degree, to
eigenvector centrality (from a certain point onwards) and to the metadata values. In the latter case we observe that,
because of the random reshuffling, the obtained values are clearly on a different scale from those obtained in the
case of link rewiring. In other words, the number of links among nodes with the highest metadata values is far
greater than that expected by chance. This implies that the arrangement of node metadata in the original network is
significant and, because of the degree-attribute correlation, difficult to replicate using the current null model
(i.e. a model that randomly redistributes the node metadata). Moreover, this result confirms the presence and the
significance of interconnections among important airports from a wide array of perspectives, related both to the
traffic generated by the airports and to the airports themselves.
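The effect of the reshuffling null model can be illustrated with a minimal sketch. It assumes the density-based form of the rich-club coefficient (the fraction of possible links that exist among nodes whose metadata exceed a threshold) and normalizes it by the mean over randomly reshuffled metadata vectors. The function names and the toy network are hypothetical and introduced only for illustration.

```python
import math
import random


def rich_club_coefficient(edges, richness, threshold):
    """Density of links among the nodes whose richness exceeds `threshold`.

    Assumes a simple undirected graph: every edge is listed once as (u, v),
    with no self-loops and no duplicates.
    """
    club = {node for node, value in richness.items() if value > threshold}
    size = len(club)
    if size < 2:
        return math.nan  # the density is undefined for fewer than two nodes
    links = sum(1 for u, v in edges if u in club and v in club)
    return 2.0 * links / (size * (size - 1))


def normalized_by_reshuffling(edges, richness, threshold, n_samples=1000, seed=None):
    """Observed coefficient divided by its mean over reshuffled metadata."""
    rng = random.Random(seed)
    observed = rich_club_coefficient(edges, richness, threshold)
    if math.isnan(observed):
        return observed
    nodes = list(richness)
    values = [richness[node] for node in nodes]
    total = 0.0
    for _ in range(n_samples):
        rng.shuffle(values)  # the links stay fixed, only the metadata move
        total += rich_club_coefficient(edges, dict(zip(nodes, values)), threshold)
    mean_null = total / n_samples
    return observed / mean_null if mean_null > 0 else math.inf


# Hypothetical toy network: a 4-clique of rich hubs with a few peripheral nodes.
clique = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
spokes = [(0, 4), (0, 5), (1, 6), (1, 7), (2, 8), (3, 9)]
toy_edges = clique + spokes
toy_richness = {node: (10.0 if node < 4 else 1.0) for node in range(10)}

phi = rich_club_coefficient(toy_edges, toy_richness, threshold=5.0)
phi_norm = normalized_by_reshuffling(toy_edges, toy_richness, 5.0, n_samples=200, seed=0)
```

In the toy network the metadata are perfectly aligned with the degree, so reshuffling scatters the rich values over sparsely connected nodes and the normalized coefficient lands well above 1, which mirrors the scale difference discussed for Figure 4.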
D. Technological Network
We consider a network obtained from the data of the Seventh Framework Programme for Research and Technological
Development (FP7), provided by the European Commission (EC). FP7 ran from 2007 until 2013 with a total
budget of over 50 billion euros. Most of the budget was spent on grants to both European and global research institutions
to co-finance research, technological development and demonstration projects. Among the different lines of funding
of FP7, we consider the data of projects related to the call for environmental issues.
Using such data we first build a bipartite network in which one partition is made up of projects while the other
is made up of the participants in the projects. A link between the partitions exists if an institution participated in a project.
³ The network is available in the igraphdata package for R [53].
FIG. 4. Curves of the coefficient $\phi_{\mathrm{norm}}$ for the US airports network. The dashed line occurs in correspondence with $\phi_{\mathrm{norm}} = 1$,
the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, number
of departures (rewiring), number of departures (reshuffling), number of passengers leaving (rewiring), number of passengers
leaving (reshuffling).
Then we perform a one-mode projection of the bipartite network such that two institutions are connected if
they participated in the same project. The resulting network has $n = 2739$ nodes and $m = 45667$ links, and we consider
as node metadata the contribution of the EC to each institution, measured in euros.
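The projection step described above can be sketched as follows. This is a minimal illustration on made-up data, not the processing pipeline used for the FP7 dataset; the function name `one_mode_projection` and the project and institution labels are hypothetical.

```python
from itertools import combinations


def one_mode_projection(memberships):
    """Project a bipartite project-institution network onto institutions.

    `memberships` maps each project to the institutions that took part in it.
    Two institutions are linked if they share at least one project; the
    result is an unweighted set of undirected edges stored as sorted pairs.
    """
    edges = set()
    for participants in memberships.values():
        # The participants of one project form a complete subgraph (clique).
        for u, v in combinations(sorted(set(participants)), 2):
            edges.add((u, v))
    return edges


# Hypothetical example: institution "C" takes part in both projects.
toy_memberships = {"P1": ["A", "B", "C"], "P2": ["C", "D"]}
toy_projection = one_mode_projection(toy_memberships)
```

Each project contributes one clique, and institutions that take part in several projects, such as "C" here, are the nodes that tie the cliques together.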
The network of institutions has a very peculiar structure: the participants in each project are connected in a
complete subgraph, while institutions participating in multiple projects connect such dense substructures. This
network, made up of several interconnected cliques, is therefore particularly well suited to the study of rich-club
ordering with respect to node metadata. Indeed, we can foresee that the rewiring procedure would break up the
multiple cliques, which represent the financed projects, and thus provide evidence for structural rich-club ordering.
When we analyze rich-club ordering in terms of the contribution of the EC, i.e. when we ask whether the richest
nodes (in terms of received funds) are arranged into a rich-club, the reshuffling procedure represents a more suitable
null model, since it preserves the network structure while changing the degree-metadata correlation. In Figure 5 we
observe that the considered network displays rich-club ordering from both a structural and a non-structural point of
view, confirming, as also observed in [13] for projects funded by the Engineering and Physical Sciences Research
Council of the United Kingdom, that research funds are allocated to rich-clubs. The fact that an elite circle of
academic institutions tends to over-attract funding [54] represents a major problem in research that needs to be
investigated in other datasets and addressed with proper measures and interventions aimed at reducing
evident inequalities.
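To make the contrast between the two null models concrete, the following sketch rewires links with a generic double-edge swap, one common way of randomizing a network while preserving every node degree. Whether it matches the exact rewiring procedure used for the results above is an assumption, and all names and the toy network are hypothetical.

```python
import random
from collections import Counter


def rewire_preserving_degrees(edges, n_swaps, seed=None):
    """Attempt `n_swaps` double-edge swaps: (a,b),(c,d) -> (a,d),(c,b).

    Swaps that would create a self-loop or a duplicate link are rejected,
    so the output is still a simple graph with the same degree sequence.
    """
    rng = random.Random(seed)
    edge_list = [tuple(edge) for edge in edges]
    present = {frozenset(edge) for edge in edge_list}
    done, tries, max_tries = 0, 0, 100 * max(n_swaps, 1)
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if len({a, b, c, d}) < 4:
            continue  # a shared endpoint would produce a self-loop or a no-op
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # the swap would duplicate an existing link
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degree_sequence(edges):
    """Map each node to its degree."""
    return Counter(node for edge in edges for node in edge)


# Hypothetical toy network: two disjoint 4-cliques, standing in for two projects.
cliques = [(u, v) for block in (range(0, 4), range(4, 8))
           for u in block for v in block if u < v]
rewired = rewire_preserving_degrees(cliques, n_swaps=len(cliques), seed=1)
```

Every accepted swap in this toy network replaces two within-clique links with two links across the cliques, which is the clique-breaking effect discussed above. Metadata reshuffling, by contrast, leaves the list of links untouched.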
IX. AN ALTERNATIVE TO RANDOM RESHUFFLING
Considering the reasoning outlined above, we suggest a procedure that, based on a certain parameter, is able to
reshuffle the node metadata while keeping a degree-metadata correlation profile closer to that of the original network.
Indeed, we aim at investigating GRCO by discerning between two cases: one where rich-club ordering is discovered
because the distribution of the node metadata is significant with respect to a null model appropriate for the
considered case, and one where rich-club ordering is discovered because of the comparison against networks whose
attribute distribution is too far from the original one.
FIG. 5. Curves of the coefficient $\phi_{\mathrm{norm}}$ for the FP7 projects network. The dashed line occurs in correspondence with $\phi_{\mathrm{norm}} = 1$,
the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, European
Commission contribution in euros (rewiring), European Commission contribution in euros (reshuffling).
The procedure is based on the idea of swapping a pair of metadata values whose corresponding entries in the
metadata vector are at a certain distance $s$ from one another, and it is made up of the following steps:
1. Consider the vector of metadata of length $N$ and choose randomly an entry in position $i \in [1, N]$
2. Select the parameter $s \in [1, N]$, which determines the range of the metadata swap. In other words, $s$ determines
the distance of the randomly chosen entry, in position $i$, from the candidate entry, in position $i' = i \pm s$, which
will be selected for the swap
3. Select the direction, $\delta \in \{0, 1\}$, of the swap with a Bernoulli trial with probability $p = 0.5$
If $\delta = 0$ set $i' = i - s$
If $\delta = 1$ set $i' = i + s$
4. If $i - s < 1$ and $\delta = 0$ there is no available entry, in position $i'$, for the swap. Thus, pick uniformly at random
one entry in position $[1, i - 1]$ if $i \neq 1$, or in position $i' = 1$ if $i = 1$. Swap the entries in positions $i$ and $i'$
5. If $i + s > N$ and $\delta = 1$ there is no available entry, in position $i'$, for the swap. Thus, pick uniformly at random
one entry in position $[i + 1, N]$ if $i \neq N$, or in position $i' = N$ if $i = N$. Swap the entries in positions $i$ and $i'$
6. Else swap the entry in position $i$ with the entry in position $i'$, where $i' = i \pm s$ depending on the value of $\delta$
7. Repeat the steps from 3 to 6 $O(M)$ times
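The steps above can be sketched in code as follows. Two choices here are assumptions rather than part of the listed procedure: a new position $i$ is drawn at every iteration (the list says to repeat from step 3), and $s$ is a fixed integer (a node-dependent $s$ would need a small extension). The function names are hypothetical, and positions are 1-based as in the text.

```python
import random


def swap_target(i, s, delta, n, rng):
    """Return the 1-based position i' that is swapped with position i."""
    target = i - s if delta == 0 else i + s
    if delta == 0 and target < 1:
        # Step 4: no entry at distance s on the left, pick one in [1, i - 1].
        return rng.randint(1, i - 1) if i != 1 else 1
    if delta == 1 and target > n:
        # Step 5: no entry at distance s on the right, pick one in [i + 1, N].
        return rng.randint(i + 1, n) if i != n else n
    return target  # Step 6: the entry at distance s exists


def local_swap_reshuffle(metadata, s, n_iter, seed=None):
    """Apply `n_iter` range-limited swaps to a copy of `metadata`."""
    rng = random.Random(seed)
    values = list(metadata)
    n = len(values)
    if n == 0:
        return values
    for _ in range(n_iter):
        i = rng.randint(1, n)      # step 1 (assumption: redrawn each iteration)
        delta = rng.randint(0, 1)  # step 3: Bernoulli trial with p = 0.5
        j = swap_target(i, s, delta, n, rng)
        values[i - 1], values[j - 1] = values[j - 1], values[i - 1]
    return values


# Worked example with the values shown in Figure 6: i = 4, s = 2, delta = 1.
vector = [10, 7, 1, 9, 6, 14, 1, 2, 27, 1]
i, s, delta = 4, 2, 1
j = swap_target(i, s, delta, len(vector), random.Random(0))
after = list(vector)
after[i - 1], after[j - 1] = after[j - 1], after[i - 1]

shuffled = local_swap_reshuffle(vector, s=2, n_iter=50, seed=0)
```

Because every swap only exchanges two entries, the reshuffled vector always contains the same multiset of metadata values as the original one; only their assignment to positions changes.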
In Figure 6 we illustrate an iteration of the proposed procedure, while in Figure 7 we show three distributions
of the degree-attribute correlation for the US airports network, considering as node metadata the passengers
leaving each airport (node). The three represented cases are: random reshuffling; reshuffling with the described
procedure using as a parameter the mean degree, $s = \langle k \rangle$; reshuffling with the described procedure using as a
parameter the square root of the degree of the selected node, $s = \sqrt{k_i}$.
It is worth noting that, asymptotically, the proposed procedure and the random reshuffling should end up with
roughly equivalent distributions of the node metadata. Indeed, by iterating the procedure a number of times
that tends to infinity, we should observe reshuffled vectors displaying a degree-attribute correlation close to
that of a randomized vector, regardless of the value of $s$. Nonetheless, since we are comparing the two cases of link
rewiring and metadata reshuffling, we should also consider that in the former case the number of performed rewirings
is, in general, $O(M)$, where $M$ is the number of links. This implies that, in practical contexts, by performing $O(M)$
iterations the proposed procedure produces a correlation profile that differs from the random one. Indeed,
in Figure 7 we observe that the proposed procedure, regardless of the chosen parameter, keeps a higher correlation
profile that overlaps with the random one only in its left tail. By using this procedure we can further test the presence
of rich-club ordering on different normalized ensembles, thus obtaining the results of Figure 8. The obtained results
still confirm the presence of rich-club ordering in the case of node metadata, but they are clearly on a different scale
with respect to the case of random reshuffling displayed in Figure 4. The results of this stress test provide further
evidence for the presence of a tight core in the considered airport network.
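A correlation profile of the kind discussed above can be computed with a short sketch: the Pearson correlation between degree and metadata is evaluated on many reshuffled metadata vectors. Only random reshuffling is shown; any other reshuffling function, such as an implementation of the range-limited procedure, could be passed in its place. The function names and the synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_profile(degrees, metadata, reshuffle, n_samples=1000):
    """Degree-metadata correlations over `n_samples` reshuffled vectors."""
    return [pearson(degrees, reshuffle(metadata)) for _ in range(n_samples)]


rng = random.Random(0)


def random_reshuffle(values):
    """Return a uniformly random permutation of `values`."""
    return rng.sample(values, len(values))


# Synthetic data (hypothetical): metadata almost proportional to the degree.
toy_degrees = list(range(1, 51))
toy_metadata = [3 * k + rng.random() for k in toy_degrees]

original = pearson(toy_degrees, toy_metadata)
profile = correlation_profile(toy_degrees, toy_metadata, random_reshuffle, n_samples=200)
mean_random = sum(profile) / len(profile)
```

With random reshuffling the profile is centred near zero, far from the original correlation. A range-limited reshuffle run for a limited number of iterations is expected to yield a profile shifted towards higher values, as reported for Figure 7.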
FIG. 6. Example iteration of the procedure of metadata reshuffling on a random metadata vector, (10, 7, 1, 9, 6, 14, 1, 2, 27, 1).
An entry in position $i = 4$ is randomly selected and the parameter $s$ is set to $s = 2$. We suppose that the result of the
Bernoulli trial is $\delta = 1$ and thus $i' = i + s = 6$. Finally the entries in positions $i$ and $i'$ are switched, giving
(10, 7, 1, 14, 6, 9, 1, 2, 27, 1), and another iteration is repeated using the new metadata vector.
FIG. 7. Histograms displaying the frequencies of correlation values (x-axis: correlation; y-axis: frequency) computed using
1000 shuffled vectors of node metadata. We choose 1000 shuffled vectors as we also consider 1000 rewired networks when
computing the normalized rich-club coefficient in the case of structural measures. In the legend, $r$ refers to the random
mixing of the node metadata while $\langle k \rangle$ and $\sqrt{k}$ refer to the mixing parameters of the proposed procedure of
metadata shuffling.
FIG. 8. Results for the US airports network. We compute GRCO in the case of node metadata (number of passengers leaving)
by using the presented procedure of node metadata reshuffling. We use as parameters of the procedure the average degree
$\langle k \rangle$ (left) and the square root of the node degree $\sqrt{k}$ (right).
X. DISCUSSION
In this paper we discussed the generalization of the concept of rich-club ordering, considering both structural node
attributes and node metadata. This allows the phenomenon to be evaluated from a number of different
perspectives that embed external information about nodes and that can be useful in the study of real networks. For
instance, when studying economic networks, such as trade networks or interbank networks, one may be interested
in whether the richest agents (in an economic sense) actually form a rich-club whilst not being hubs; in
other words, whether they tend to saturate their degree by connecting only to other rich members, thus minimizing
their feeder (i.e. rich-club to non-rich-club) connections. The study of such feeder connections, whose endpoints are
nodes outside the rich-club, i.e. nodes which can be considered, in a certain proportion, eligible to join the rich-club,
has proved to be important in confirming the presence of rich-club ordering [6], and it could provide insights for the
understanding of the dynamical properties of the rich-club which, like its growth, are still largely unexplored [20].
Moreover, GRCO can be easily extended to the case of weighted networks (i.e. networks with edge metadata) and directed
networks by using the right null models for these specific cases [32–34].
This generalization also aims at shedding more light on the relationship between topological and non-topological
patterns in real networks, as well as at emphasizing the importance of node metadata. Given the current
possibility to collect and store increasingly rich datasets and networks, metadata are indeed gaining attention
in Network Science, and many topological phenomena, such as the Friendship Paradox [55] (which states that your
friends have, on average, more friends than you have), are now being generalized to account for node
characteristics [56]. The use of such metadata has also been extended to other topological network properties, such
as motifs [57], which are now enriched by considering their functional aspects when examined in real networks [18].
Additionally, we discussed the importance of testing rich-club ordering with the appropriate null models, which
can provide us with a deeper understanding of the numerous facets of this problem. However, such an approach
always implies a trade-off between what can be kept and what can be dropped regarding the network structure and
its relation to the node metadata.
ACKNOWLEDGMENTS
The author thanks Leto Peel, Raúl J. Mondragón and Antonio Iovanella for their insightful suggestions and comments.
[1] Mark EJ Newman and Michelle Girvan, “Finding and evaluating community structure in networks,” Phys. Rev. E 69,
026113 (2004).
[2] Santo Fortunato and Darko Hric, “Community detection in networks: A user guide,” Physics Reports 659, 1–44 (2016).
[3] Robert M May, “Will a large complex system be stable?” Nature 238, 413 (1972).
[4] Peter Csermely, András London, Ling-Yun Wu, and Brian Uzzi, “Structure and dynamics of core/periphery networks,”
Journal of Complex Networks 1, 93–123 (2013).
[5] Shi Zhou and Raúl J. Mondragón, “The rich-club phenomenon in the internet topology,” IEEE Communications Letters
8, 180–182 (2004).
[6] Matteo Cinelli, Giovanna Ferraro, and Antonio Iovanella, “Rich-club ordering and the dyadic effect: Two interrelated
phenomena,” Physica A: Statistical Mechanics and its Applications 490, 808 – 818 (2018).
[7] Shi Zhou and Raúl J. Mondragón, “Structural constraints in complex networks,” New Journal of Physics 9, 173 (2007).
[8] Xiao-Ke Xu, Jie Zhang, and Michael Small, “Rich-club connectivity dominates assortativity and transitivity of complex
networks,” Phys. Rev. E 82, 046117 (2010).
[9] Raúl J Mondragón and Shi Zhou, “Random networks with given rich-club coefficient,” The European Physical Journal B
85, 328 (2012).
[10] Zhi-Qiang Jiang and Wei-Xing Zhou, “Statistical significance of the rich-club phenomenon in complex networks,” New
Journal of Physics 10, 043002 (2008).
[11] Christopher Ansell, Renata Bichir, and Shi Zhou, “Who says networks, says oligarchy? Oligarchies as ‘rich club’ networks,”
Connections (02261766) 35 (2016).
[12] Shi Zhou and Raúl J. Mondragón, “Accurately modeling the internet topology,” Phys. Rev. E 70, 066108 (2004).
[13] Athen Ma, Raúl J. Mondragón, and Vito Latora, “Anatomy of funded research in science,” Proceedings of the National
Academy of Sciences 112, 14760–14765 (2015), http://www.pnas.org/content/112/48/14760.full.pdf.
[14] Giovanna Ferraro and Antonio Iovanella, “Revealing correlations between structure and innovation attitude in inter-
organisational innovation networks,” International Journal of Computational Economics and Econometrics 6, 93–113
(2016).
[15] Manlio De Domenico and Alex Arenas, “Modeling structure and resilience of the dark network,” Phys. Rev. E 95, 022313
(2017).
[16] Martijn P Van Den Heuvel and Olaf Sporns, “Rich-club organization of the human connectome,” Journal of Neuroscience
31, 15775–15786 (2011).
[17] Logan Harriger, Martijn P. van den Heuvel, and Olaf Sporns, “Rich club organization of macaque cerebral cortex and its
role in network communication,” PLOS ONE 7, 1–13 (2012).
[18] Martijn P van den Heuvel, René S Kahn, Joaquín Goñi, and Olaf Sporns, “High-cost, high-capacity backbone for global
brain communication,” Proceedings of the National Academy of Sciences 109, 11372–11377 (2012).
[19] Guusje Collin, Olaf Sporns, René CW Mandl, and Martijn P van den Heuvel, “Structural and functional aspects relating
to cost and benefit of rich club organization in the human cerebral cortex,” Cerebral cortex 24, 2258–2267 (2014).
[20] Athen Ma and Raúl J. Mondragón, “Rich-cores in networks,” PLOS ONE 10, 1–13 (2015).
[21] Matteo Cinelli, Giovanna Ferraro, and Antonio Iovanella, “Resilience of core-periphery networks in the case of rich-club,”
Complexity 2017 (2017).
[22] David S. Grayson, Siddharth Ray, Samuel Carpenter, Swathi Iyer, Taciana G. Costa Dias, Corinne Stevens, Joel T. Nigg,
and Damien A. Fair, “Structural and functional rich club organization of the brain in children and adults,” PLOS ONE
9, 1–13 (2014).
[23] Mark EJ Newman, “Mixing patterns in networks,” Phys. Rev. E 67, 026126 (2003).
[24] Juyong Park and Albert-László Barabási, “Distribution of node characteristics in complex networks,” Proceedings of the
National Academy of Sciences 104, 17916–17920 (2007).
[25] Ginestra Bianconi, Paolo Pin, and Matteo Marsili, “Assessing the relevance of node features for network structure,”
Proceedings of the National Academy of Sciences 106, 11433–11438 (2009).
[26] Leto Peel, Daniel B Larremore, and Aaron Clauset, “The ground truth about metadata and community detection in
networks,” Science advances 3, e1602548 (2017).
[27] Darko Hric, Tiago P Peixoto, and Santo Fortunato, “Network structure, metadata, and the prediction of missing nodes
and annotations,” Physical Review X 6, 031038 (2016).
[28] Mark EJ Newman and Aaron Clauset, “Structure and inference in annotated networks,” Nature Communications 7, 11863
(2016).
[29] Sergei Maslov and Kim Sneppen, “Specificity and stability in topology of protein networks,” Science 296, 910–913 (2002).
[30] Vittoria Colizza, Alessandro Flammini, M Angeles Serrano, and Alessandro Vespignani, “Detecting rich-club ordering in
complex networks,” Nature physics 2, 110–115 (2006).
[31] Alessandro Muscoloni and Carlo Vittorio Cannistraci, “Rich-clubness test: how to determine whether a complex network
has or doesn’t have a rich-club?” arXiv preprint arXiv:1704.03526 (2017).
[32] M. Ángeles Serrano, “Rich-club vs rich-multipolarization phenomena in weighted networks,” Phys. Rev. E 78, 026101
(2008).
[33] Tore Opsahl, Vittoria Colizza, Pietro Panzarasa, and Jose J Ramasco, “Prominence and control: the weighted rich-club
effect,” Physical review letters 101, 168702 (2008).
[34] Vinko Zlatic, Ginestra Bianconi, Albert Díaz-Guilera, Diego Garlaschelli, Francesco Rao, and Guido Caldarelli, “On the
rich-club effect in dense and weighted networks,” The European Physical Journal B 67, 271–275 (2009).
[35] Julian J McAuley, Luciano da Fontoura Costa, and Tibério S Caetano, “Rich-club phenomenon across complex network
hierarchies,” Applied Physics Letters 91, 084103 (2007).
[36] Lucas Daniel Valdez, Pablo Alejandro Macri, H Eugene Stanley, and Lidia Adriana Braunstein, “Triple point in correlated
interdependent networks,” Physical Review E 88, 050803 (2013).
[37] Kamal Berahmand, Negin Samadi, and Seyed Mahmood Sheikholeslami, “Effect of rich-club on diffusion in complex
networks,” International Journal of Modern Physics B 32, 1850142 (2018).
[38] Máté Csigi, Attila Kőrösi, József Bíró, Zalán Heszberger, Yury Malkov, and András Gulyás, “Geometric explanation of
the rich-club phenomenon in complex networks,” Scientific Reports 7 (2017).
[39] Ed Bullmore and Olaf Sporns, “Complex brain networks: graph theoretical analysis of structural and functional systems,”
Nature Reviews Neuroscience 10, 186–198 (2009).
[40] Olaf Sporns, Dante R Chialvo, Marcus Kaiser, and Claus C Hilgetag, “Organization, development and function of complex
brain networks,” Trends in cognitive sciences 8, 418–425 (2004).
[41] Gorka Zamora-López, Changsong Zhou, and Jürgen Kurths, “Exploring brain function from anatomical connectivity,”
Frontiers in neuroscience 5, 83 (2011).
[42] Shweta Bansal, Shashank Khandelwal, and Lauren Ancel Meyers, “Exploring biological network structure with clustered
random networks,” BMC bioinformatics 10, 1 (2009).
[43] R. J. Mondragón, “Network null-model based on maximal entropy and the rich-club,” Journal of Complex Networks 2,
288–298 (2014).
[44] Marián Boguñá, Romualdo Pastor-Satorras, and Alessandro Vespignani, “Cut-offs and finite size effects in scale-free
networks,” The European Physical Journal B 38, 205–209 (2004).
[45] David Laniado, Yana Volkovich, Karolin Kappler, and Andreas Kaltenbrunner, “Gender homophily in online dyadic and
triadic relationships,” EPJ Data Science 5, 19 (2016).
[46] Nahuel Almeira, Ana L Schaigorodsky, Juan I Perotti, and Orlando V Billoni, “Structure constrained by metadata in
networks of chess players,” Scientific reports 7, 15186 (2017).
[47] Gabor Csardi and Tamas Nepusz, “The igraph software package for complex network research,” InterJournal Complex
Systems, 1695 (2006).
[48] Christopher G. Watson, brainGraph: Graph Theory Analysis of Brain MRI Data (2018), r package version 2.6.0.
[49] Thomas U. Grund and James A. Densley, “Ethnic homophily and triad closure: Mapping internal gang struc-
ture using exponential random graph models,” Journal of Contemporary Criminal Justice 31, 354–370 (2015),
https://doi.org/10.1177/1043986214553377.
[50] Salvatore Villani, Michele Mosca, and Mauro Castiello, “A virtuous combination of structural and skill analysis to defeat
organized crime,” Socio-Economic Planning Sciences (2018), https://doi.org/10.1016/j.seps.2018.01.002.
[51] Thomas W Valente, Kathryn Coronges, Cynthia Lakon, and Elizabeth Costenbader, “How correlated are network centrality
measures?” Connections (Toronto, Ont.) 28, 16 (2008).
[52] Shahar Ronen, Bruno Gonçalves, Kevin Z Hu, Alessandro Vespignani, Steven Pinker, and César A Hidalgo, “Links that
speak: The global language network and its association with global fame,” Proceedings of the National Academy of Sciences
111, E5616–E5622 (2014).
[53] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical
Computing, Vienna, Austria (2008), ISBN 3-900051-07-0.
[54] Michael Szell and Roberta Sinatra, “Research funding goes to rich clubs,” Proceedings of the National Academy of Sciences
112, 14749–14750 (2015).
[55] Ezra W. Zuckerman and John T. Jost, “What makes you think you’re so popular? self-evaluation maintenance and the
subjective side of the ‘friendship paradox’,” Social Psychology Quarterly 64, 207–223 (2001).
[56] Young-Ho Eom and Hang-Hyun Jo, “Generalized friendship paradox in complex networks: The case of scientific collabo-
ration,” Scientific reports 4, srep04603 (2014).
[57] Ron Milo, Shai Shen-Orr, Shalev Itzkovitz, Nadav Kashtan, Dmitri Chklovskii, and Uri Alon, “Network motifs: simple
building blocks of complex networks,” Science 298, 824–827 (2002).
... However, such a trend is actually misleading since it is induced by a strong localization of connections among nodes displaying a high prominence in terms of degree. Once it is discovered that mostly highdegree nodes are very well interconnected (i.e., the network displays a rich-club (Zhou & Mondragón, 2004;Cinelli, 2019)), one could leverage such knowledge to optimize the diffusion of information about a specific product or service. Using the insights obtained resolving the ambiguity of the assortativity coefficient, we could exploit the density of connection among hubs being almost sure that targeting even just one hub would cause the complete diffusion of information across the whole network. ...
... Metadata can be either uncorrelated to the network structure (providing information unrelated to the system) or correlated to the network structure (well representing the observed patterns of connections). The analysis of metadata is of extreme importance (Peel, Larremore, & Clauset, 2017;Cinelli, 2019) since, when combined with structural outcomes, it is able to ease the interpretation of the observed structural patterns thus allowing room for a consistent use of information deriving from the network structure. In order to present the problem of interrelation between the network structure and the node metadata we will take into account the aforementioned example of a service provider willing to profile users for commercial purposes using a social network (represented here by the Harvard friendship network introduced in Section 4). ...
Article
The extent to which available data is continuously growing in terms of volume is forcing organizations to contend with and seek to resolve the so-called Big Data Challenge. Big data comes or can be structured in the form of networks from which information can be extracted via statistical and computational tools. The results of such investigations can be generally referred to as network outcomes. Such outcomes, despite being often characterized by a inner ambiguity, need to be well understood and interpreted in order to exploit the potentialities of network data, especially in practical situations. For this reason, addressing the ambiguity of network outcomes becomes a key issue in business-related environments, where the possibility of rapidly interpreting and properly exploiting network data can positively affect performances. In this paper, we propose a framework to face ambiguity of network outcomes that, by means of specific solutions, allows practitioners to successfully interpret and exploit the obtained outcomes.
... They constructed a weighted unipartite network in which the weight of each edge between two authors is equal to the number of coauthored papers, which corresponds to the one-mode projection of the bipartite network to a unipartite network, and then applied a method to detect weighted rich clubs for dyadic networks. The same method was applied to detect a rich club in a bipartite brain network (Crossley et al., 2013), a bipartite transportation network (Feng et al., 2016), and a bipartite technological network (Cinelli, 2019). In the present work, we investigate rich clubs in higher-order networks of collaborative grants among institutions, which one-mode projection does not characterize. ...
Article
Full-text available
Modern scientific work, including writing papers and submitting research grant proposals, increasingly involves researchers from different institutions. In grant collaborations, it is known that institutions involved in many collaborations tend to densely collaborate with each other, forming rich clubs. Here we investigate higher-order rich-club phenomena in networks of collaborative research grants among institutions and their associations with research impact. Using publicly available data from the National Science Foundation in the US, we construct a bipartite network of institutions and collaborative grants, which distinguishes among the collaboration with different numbers of institutions. By extending the concept and algorithms of the rich club for dyadic networks to the case of bipartite networks, we find rich clubs both in the entire bipartite network and the bipartite subnetwork induced by the collaborative grants involving a given number of institutions up to five. We also find that the collaborative grants within rich clubs tend to be more impactful in a per-dollar sense than the control. Our results highlight advantages of collaborative grants among the institutions in the rich clubs.
... There are other possible formulations of the normalized structural rich club [42] and of its generalization to weighted [34,[43][44][45], hierarchical [46], and temporal [38] networks, but the key point pooling all these descriptors together is the need to distinguish the case where nodes with a lot of (or strong) connections have more links between them just by chance from the case in which hubs have, indeed, an intense connectivity giving them, e.g., control over resources flowing in the system or facilitating the rapid exchange of information among them. ...
Article
Full-text available
Real systems are characterized by complex patterns of interactions between their units, by dynamical processes on them, and by the interplay of the two. It is well known that particular structures affect dynamical processes at different scales. Sometimes richly connected units are connected by costly, long-range links. In the brain, hubs form rich clubs for integrating information between different brain regions, and many biological and social networks show this same structural organization. It remains, however, unclear whether this structural organization alone enables a rapid communication between highly connected nodes or whether a functional rich club may emerge as a combination of direct links and longer paths between rich nodes. Here, we identify functional rich clubs through the diffusion geometry, providing a perspective on rich-club phenomena in complex networks. We show that weak structural rich clubs may be functionally stronger, thanks to bridge nodes, while diffusion inside strong structural rich clubs may be damped in modular networks.
... Complex network approaches and the science of networks put well in evidence the role of the links and of the underlying network topology in the propagation of contagions and cascades (Elliott et al. 2014;Markose et al. 2012;Varela and Rotundo 2016;White 2014). Besides the crossshareholdings, literature has examined other channels for detecting the connection among companies and their managers, posing in evidence the interlock of directorates and rich-club relationships (Cinelli 2019;Croci and Grassi 2014;D'Errico et al. 2009;Drago et al. 2015). ...
Article
Full-text available
In this work, we focus on the cross-shareholding structure in financial markets. Specifically, we build ad hoc indices of concentration and control by employing a complex network approach with a weighted adjacency matrix. To describe their left and right tail dependence properties, we explore the theoretical dependence structure between such indices through copula functions. The theoretical framework has been tested over a high-quality dataset based on the Italian Stock Market. In doing so, we clearly illustrate how the methodological setting works and derive financial insights. In particular, we advance calibration exercises on parametric copulas under the minimization of both Euclidean distance and entropy measure.
... There are two ways to determine the value of ξ, namely by degree or weighted degree. Cinelli [56] proposed a generalized rich-club framework for computing the normalized rich-club connectivity in terms of other structural measures distinct to degree, displayed as Formula (3): ...
Article
Full-text available
In this paper, we present a study on keyword selection behavior in social media analysis that is focused on particular topics, and propose a new effective strategy that considers the co-occurrence relationships between keywords and uses graph-based techniques. In particular, we used the normalized rich-club connectivity considering the weighted degree, closeness centrality, betweenness centrality and PageRank values to measure a subgroup of highly connected “rich keywords” in a keyword co-occurrence network. Community detection is subsequently applied to identify several keyword combinations that are able to accurately and comprehensively represent the researched topic. The empirical results based on four topics and comparing four existing models confirm the performance of our proposed strategy in promoting the quantity and ensuing the quality of data related to particular topics collected from social media. Overall, our findings are expected to offer useful guidelines on how to select keywords for social media-based studies and thus further increase the reliability and validity of their respective conclusions.
... A unifying framework for the weighted network has been proposed by Alstott et al. [35]. In addition to using a network structural attribute (e.g., network degree) to compute the rich-club phenomenon, Cinelli [36] recently suggested a generalized rich-club framework using non-structural information (e.g., social or technical attributes related to network nodes). In his work, instead of using only the network degree, any structural measure distinct from degree (e.g., node centrality measures) could also be used to evaluate rich-club ordering. ...
Article
Full-text available
The brain is a complex network. Growing evidence supports the critical roles of a set of brain regions within the brain network, known as the brain’s cores or hubs. These regions require high energy cost but possess highly efficient neural information transfer in the brain’s network and are termed the rich-club. The rich-club of the brain network is essential as it directly regulates functional integration across multiple segregated regions and helps to optimize cognitive processes. Here, we review the recent advances in rich-club organization to address the fundamental roles of the rich-club in the brain and discuss how these core brain regions affect brain development and disorders. We describe the concepts of the rich-club behind network construction in the brain using graph theoretical analysis. We also highlight novel insights based on animal studies related to the rich-club and illustrate how human studies using neuroimaging techniques for brain development and psychiatric/neurological disorders may be relevant to the rich-club phenomenon in the brain network.
... The rich-club coefficient and its generalisations have been proved to be a useful measure for studying complex networks [49][50][51][52][53][54][55][56]. In recent years it has been used to describe the connectivity of the brain, the connectome [2, 57-62]. ...
Article
Full-text available
Many of the structural characteristics of a network depend on the connectivity with and within the hubs. These dependencies can be related to the degree of a node and the number of links that a node shares with nodes of higher degree. Here we revise and present new results showing how to construct network ensembles which give a good approximation to the degree–degree correlations, and hence to the projections of this correlation like the assortativity coefficient or the average neighbours degree. We present a new bound for the structural cut-off degree based on the connectivity within the hubs. We also show that the connections with and within the hubs can be used to define different network cores. Two of these cores are related to the spectral properties and to walks of length one and two which contain at least one hub node, and they are related to the eigenvector centrality. We introduce a new centrality measure based on the connectivity with the hubs. In addition, as the ensembles and cores are related by the connectivity of the hubs, we show several examples of how changes in the hubs' linkage affect the degree–degree correlations and core properties.
... The concept of rich-club can be easily extended to measures beyond degree and to weighted networks using appropriate null models for each of the cases, e.g. Opsahl et al. (2008); Cinelli et al. (2018); Cinelli (2019). In this one, we assess rich-club ordering in the case of node strength and we measure the density of connections among nodes with the highest strength φ(s). ...
Preprint
The speeches given by influential politicians can have a decisive impact on the future of a country. In particular, the economic content of such speeches affects the economy of countries and their financial markets. For this reason, we examine a novel dataset containing the economic content of 951 speeches given by 45 US Presidents from George Washington (April 1789) to Donald Trump (February 2017). In doing so, we use an economic glossary built by means of text mining techniques. The goal of our study is to examine the structure of significant interconnections within a network obtained from the economic content of presidential speeches. In such a network, nodes are represented by talks and links by values of cosine similarity, the latter computed using the occurrences of the economic terms in the speeches. The resulting network displays a peculiar structure made up of a core (i.e. a set of highly central and densely connected nodes) and a periphery (i.e. a set of non-central and sparsely connected nodes). The different economic dictionaries employed by the Presidents characterize the core-periphery structure. The Presidents' talks belonging to the network's core share the usage of generic (non-technical) economic locutions like "interest" or "trade", while the use of more technical and less frequent terms (e.g. "yield") characterizes the periphery. Furthermore, speeches close in time share a common economic dictionary. These results, together with the usage of the economic glossary during US periods of boom and crisis, provide unique insights into the relationships among the economic content of Presidents' speeches.
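The link construction described in this abstract (nodes as speeches, link weights as cosine similarity between occurrence counts of glossary terms) can be illustrated with a minimal Python sketch. The function names, the whitespace tokenizer, and the toy glossary are assumptions made for illustration; the abstract does not specify the authors' preprocessing or how "significant" links are selected, so no filtering step is shown.

```python
import math
from collections import Counter


def term_counts(text, glossary):
    """Count occurrences of glossary terms in a text (naive whitespace tokenizer)."""
    return Counter(word for word in text.lower().split() if word in glossary)


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two term-count mappings; 0.0 if either is empty."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_links(speeches, glossary):
    """Return {(name_a, name_b): weight} for every pair with positive similarity."""
    counts = {name: term_counts(text, glossary) for name, text in speeches.items()}
    names = sorted(counts)
    links = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = cosine_similarity(counts[a], counts[b])
            if weight > 0:
                links[(a, b)] = weight
    return links


# Hypothetical toy data, not taken from the dataset described above.
toy_glossary = {"trade", "interest", "yield"}
toy_speeches = {
    "s1": "trade interest trade",
    "s2": "trade yield",
    "s3": "weather",
}
toy_links = similarity_links(toy_speeches, toy_glossary)
```

In this toy case only the pair ("s1", "s2") is linked, because "s3" contains no glossary term.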
Chapter
Full-text available
This paper argues that stratified structures in university systems should be addressed more explicitly in debates on research funding. The paper connects findings from several streams of literature on US-American research universities: (a) the relationship of organizational status and scientific quality, (b) positional competitions among elite universities, (c) concentration of research funding, and (d) faculty exchange networks as measures of university prestige. Taken together, these literatures reveal a crystalline hierarchy with intense competition for scientific talent at the top but little opportunity for upward institutional and personal mobility. While elite universities provide advantages in terms of research output and prestige, the findings point to social closure as a potentially problematic outcome for a democratic knowledge society. Therefore, the comparison highlights two policy challenges by means of two scenarios: closing the gap in organizational resources, while at the same time ensuring continuing expansion of the research university system in Europe.
Article
In this article, we present two new concepts related to subgraph counting where the focus is not on the number of subgraphs that are isomorphic to some fixed graph $H$, but on the frequency with which a vertex or an edge belongs to such subgraphs. In particular, we are interested in the case where $H$ is a complete graph. These new concepts are termed vertex participation and edge participation, respectively. We combine these concepts with that of the rich-club to identify what we call a Super rich-club and rich edge-club. We show that the concept of vertex participation is a generalization of the rich-club. We present experimental results on randomized Erdős–Rényi and Watts–Strogatz small-world networks. We further demonstrate both concepts on a complex brain network and compare our results to the rich-club of the brain.
Article
Full-text available
One of the main issues in complex networks is the phenomenon of diffusion, in which the goal is to find the nodes with the highest diffusing power. In diffusion, there is always a conflict between accuracy and efficiency (time complexity); therefore, most recent studies have focused on proposing new centralities to solve this problem, but our approach is different. Using one of the features of complex networks, namely the “rich-club”, we analyze its effect on diffusion and demonstrate that in datasets which have a high rich-club it is better to use the degree centrality for finding influential nodes, because it has a linear time complexity and uses only local information; however, this rule does not apply to datasets which have a low rich-club. Next, real and artificial datasets with a high rich-club are used, in which degree centrality is compared to well-known centralities using the standard SIR model.
Article
Full-text available
Core-periphery networks are structures that present a set of central and densely connected nodes, namely the core, and a set of non-central and sparsely connected nodes, namely the periphery. The rich-club refers to a set in which the highest degree nodes show a high density of connections. Thus, a network that displays a rich-club can be interpreted as a core-periphery network in which the core is made up of a number of hubs. In this paper, we test the resilience of networks showing a progressively denser rich-club and we observe how this structure is able to affect the network measures in terms of both cohesion and efficiency in information flow. Additionally, we consider the case in which, instead of making the core denser, we add links to the periphery. These two procedures of core and periphery thickening delineate a decision process in the placement of new links and allow us to conduct a scenario analysis that can be helpful in the comprehension and supervision of complex networks under the resilience perspective. The advantages of the two procedures, as well as their implications, are discussed in relation to both network efficiency and node heterogeneity.
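The notion of "density of connections among the highest degree nodes" used in this abstract is commonly quantified by the degree-based rich-club coefficient φ(k) = 2E_{>k} / (N_{>k}(N_{>k} − 1)), where N_{>k} is the number of nodes with degree greater than k and E_{>k} the number of links among them. The sketch below is a minimal pure-Python illustration of that standard definition; the function name and the toy graph are assumptions, and the abstract itself does not state which formula the authors use.

```python
def rich_club_coefficient(edges, k):
    """Density of links among nodes of degree > k in an undirected simple graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored and
    repeated pairs are collapsed. Returns None when fewer than two nodes
    have degree > k, since the density is undefined in that case.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    rich = {node for node, nbrs in neighbors.items() if len(nbrs) > k}
    n_rich = len(rich)
    if n_rich < 2:
        return None

    # Each link inside the club is seen once from each endpoint, hence // 2.
    e_rich = sum(1 for u in rich for v in neighbors[u] if v in rich) // 2
    return 2 * e_rich / (n_rich * (n_rich - 1))


# Hypothetical toy graph: a triangle of hubs (a, b, c), each with one leaf.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("a", "x"), ("b", "y"), ("c", "z")]
phi_1 = rich_club_coefficient(toy_edges, 1)  # club = {a, b, c}
phi_0 = rich_club_coefficient(toy_edges, 0)  # club = all six nodes
```

Here the three hubs form a complete subgraph, so the coefficient at k = 1 equals 1.0, while at k = 0 it reduces to the overall graph density of 0.4.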
Article
Full-text available
Chess is an emblematic sport that stands out because of its age, popularity and complexity. It has served to study human behavior from the perspective of a wide number of disciplines, from cognitive skills such as memory and learning, to aspects like innovation and decision making. Given that extensive documentation of chess games played throughout history is available, it is possible to perform detailed and statistically significant studies about this sport. Here we use one of the most extensive chess databases in the world to construct two networks of chess players. One of the networks includes games that were played over-the-board and the other is related to games played on the Internet. We studied the main topological characteristics of the networks, such as degree distribution and correlation, transitivity and community structure. We complemented the structural analysis by incorporating players' level of play as node metadata. While the two networks are topologically different, we found that in both cases players gather in communities according to their expertise and that an emergent rich-club structure, composed of the top-rated players, is also present.
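When richness is a node attribute such as a player's rating rather than the degree, one way to check for a rich-club is to compare the link density among nodes above a richness threshold with the density obtained after reshuffling the attribute values across nodes, which keeps the topology fixed. The sketch below follows that metadata-reshuffling idea in a minimal form; the function names, the threshold, the number of shuffles, and the toy ratings are all assumptions, and it is not presented as the procedure used in the chess study.

```python
import random


def club_density(edges, richness, threshold):
    """Link density among nodes whose richness exceeds `threshold`.

    `edges` must contain each undirected link once. Returns None when
    fewer than two nodes exceed the threshold.
    """
    rich = {node for node, value in richness.items() if value > threshold}
    n_rich = len(rich)
    if n_rich < 2:
        return None
    e_rich = sum(1 for u, v in edges if u != v and u in rich and v in rich)
    return 2 * e_rich / (n_rich * (n_rich - 1))


def normalized_club_density(edges, richness, threshold, n_shuffles=200, seed=0):
    """Observed club density divided by its mean under metadata reshuffling."""
    observed = club_density(edges, richness, threshold)
    if observed is None:
        return None
    rng = random.Random(seed)
    nodes = list(richness)
    values = [richness[node] for node in nodes]
    total = 0.0
    for _ in range(n_shuffles):
        # Permute attribute values over nodes; the links stay untouched,
        # and the club size is the same in every shuffle.
        rng.shuffle(values)
        total += club_density(edges, dict(zip(nodes, values)), threshold)
    baseline = total / n_shuffles
    return observed / baseline if baseline > 0 else None


# Hypothetical toy data: three highly rated players who all played each other.
toy_games = [("a", "b"), ("b", "c"), ("a", "c"),
             ("a", "x"), ("b", "y"), ("c", "z")]
toy_rating = {"a": 2700, "b": 2650, "c": 2600, "x": 1500, "y": 1400, "z": 1300}
observed_density = club_density(toy_games, toy_rating, 2000)
ratio = normalized_club_density(toy_games, toy_rating, 2000)
```

A ratio above 1 suggests that top-rated nodes are more densely interconnected than expected when ratings are assigned at random; a proper assessment would also look at the spread of the shuffled densities, not only their mean.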
Article
Full-text available
The rich club organization (the presence of highly connected hub core in a network) influences many structural and functional characteristics of networks including topology, the efficiency of paths and distribution of load. Despite its major role, the literature contains only a very limited set of models capable of generating networks with realistic rich club structure. One possible reason is that the rich club organization is a divisive property among complex networks which exhibit great diversity, in contrast to other metrics (e.g. diameter, clustering or degree distribution) which seem to behave very similarly across many networks. Here we propose a simple yet powerful geometry-based growing model which can generate realistic complex networks with high rich club diversity by controlling a single geometric parameter. The growing model is validated against the Internet, protein-protein interaction, airport and power grid networks.
Article
Full-text available
While the statistical and resilience properties of the Internet are no longer changing significantly across time, the Darknet, a network devoted to keeping its traffic anonymous, still experiences rapid changes to improve the security of its users. Here, we study the structure of the Darknet and we find that its topology is rather peculiar, being characterized by a non-homogeneous distribution of connections (typical of scale-free networks), very short path lengths and high clustering (typical of small-world networks), and the lack of a core of highly connected nodes. We propose a model to reproduce such features, demonstrating that the mechanisms used to improve cyber-security are responsible for the observed topology. Unexpectedly, we reveal that its peculiar structure makes the Darknet much more resilient than the Internet (used as a benchmark for comparison at a descriptive level) to random failures, targeted attacks and cascade failures, as a result of adaptive changes in response to the attempts at dismantling the network across time.
Article
Full-text available
The empirical validation of community detection methods is often based on available annotations on the nodes that serve as putative indicators of the large-scale network structure. Most often, the suitability of the annotations as topological descriptors itself is not assessed, and without this it is not possible to ultimately distinguish between actual shortcomings of the community detection algorithms, on one hand, and the incompleteness, inaccuracy, or structured nature of the data annotations themselves, on the other. In this work, we present a principled method to assess both aspects simultaneously. We construct a joint generative model for the data and metadata, and a nonparametric Bayesian framework to infer its parameters from annotated data sets. We assess the quality of the metadata not according to their direct alignment with the network communities, but rather in their capacity to predict the placement of edges in the network. We also show how this feature can be used to predict the connections to missing nodes when only the metadata are available, as well as to predict missing metadata. By investigating a wide range of data sets, we show that while there are seldom exact agreements between metadata tokens and the inferred data groups, the metadata are often informative of the network structure nevertheless, and can improve the prediction of missing nodes. This shows that the method uncovers meaningful patterns in both the data and metadata, without requiring or expecting a perfect agreement between the two.
Article
Full-text available
Across many scientific domains, there is a common need to automatically extract a simplified view or a coarse-graining of how a complex system's components interact. This general task is called community detection in networks and is analogous to searching for clusters in independent vector data. It is common to evaluate the performance of community detection algorithms by their ability to find so-called "ground truth" communities. This works well in synthetic networks with planted communities because such networks' links are formed explicitly based on the planted communities. However, there are no planted communities in real world networks. Instead, it is standard practice to treat some observed discrete-valued node attributes, or metadata, as ground truth. Here, we show that metadata are not the same as ground truth, and that treating them as such induces severe theoretical and practical problems. We prove that no algorithm can uniquely solve community detection, and we prove a general No Free Lunch theorem for community detection, which implies that no algorithm can perform better than any other across all inputs. However, node metadata still have value and a careful exploration of their relationship with network structure can yield insights of genuine worth. We illustrate this point by introducing two statistical techniques that can quantify the relationship between metadata and community structure for a broad class of models. We demonstrate these techniques using both synthetic and real-world networks, and for multiple types of metadata and community structure.
Article
Rich-club ordering and the dyadic effect are two phenomena observed in complex networks that are based on the presence of certain substructures composed of specific nodes. Rich-club ordering represents the tendency of highly connected and important elements to form tight communities with other central elements. The dyadic effect denotes the tendency of nodes that share a common property to be much more interconnected than expected. In this study, we consider the interrelation between these two phenomena, which until now have always been studied separately. We contribute a new formulation of the rich-club measures in terms of the dyadic effect. Moreover, we introduce certain measures related to the analysis of the dyadic effect, which are useful in that they confirm the presence and relevance of rich-clubs in complex networks and provide certain insights and a baseline for the evaluation of the rich-club size. In addition, computational experiments show the usefulness of the introduced quantities with regard to different classes of real networks.
Article
The rich-club concept has been introduced in order to characterize the presence of a cohort of nodes with a large number of links (rich nodes) that tend to be well connected to each other, creating a tight group (club). Rich-clubness defines the extent to which a network displays a topological organization characterized by the presence of a node rich-club. It is crucial for the investigation of the internal organization and function of networks arising in systems of disparate fields such as transportation, social, communication and neuroscience. Different methods have been proposed for assessing rich-clubness, and various null-models have been adopted for performing statistical tests. However, a procedure that assigns a unique value of rich-clubness significance to a given network is still missing. Our solution to this problem rests on three new pillars. We introduce: i) a null-model characterized by a lower rich-club coefficient; ii) a fair strategy to normalize the level of rich-clubness of a network with respect to the null-model; iii) a statistical test that, exploiting the maximum deviation of the normalized rich-club coefficient, attributes a unique p-value of rich-clubness to a given network. In conclusion, this study proposes the first attempt to quantify, using a unique measure, whether a network presents a significant rich-club topological organization. The general impact of our study on engineering and science is that simulations investigating how the functional performance of a network changes in relation to rich-clubness might be more easily tuned by controlling one unique value: the proposed rich-clubness measure.