All content in this area was uploaded by Matteo Cinelli on Feb 20, 2019
Generalized Rich-Club Ordering in Networks
Matteo Cinelli¹
Department of Enterprise Engineering, Tor Vergata, Via del Politecnico, 1 - 00133 Rome, Italy
Abstract Rich-club ordering refers to the tendency of nodes with a high degree to be more interconnected than
expected. In this paper we consider the concept of rich-club ordering when generalized to structural measures that
differ from the node degree and to non-structural measures (i.e. to node metadata). The differences in considering
rich-club ordering (RCO) with respect to both structural and non-structural measures are then discussed in terms
of the employed coefficients and of the appropriate null models (link rewiring vs metadata reshuffling). Once a framework
for the evaluation of generalized rich-club ordering (GRCO) is defined, we investigate such a phenomenon in real
networks provided with node metadata. By considering different notions of node richness, we compare structural and
non-structural rich-club ordering, observing how external information about the network nodes is able to validate the
presence of rich-clubs in networked systems.
Keywords rich-club, node metadata, generalized, null model
I. INTRODUCTION
Networks are characterized by a number of topological properties that are able to provide important insights into
their functional aspects. Well-known examples are represented by the presence of communities [1], i.e. subgraphs
whose nodes have a higher probability to be linked to every node of the subgraph than to any other node of the
graph [2], or of core-periphery structures [3], i.e. structures that allow for the partitioning of the network into a
set of central and densely connected nodes (the core) and a set of noncentral and sparsely connected nodes (the
periphery) [4]. When the hubs of a certain network are densely interconnected (i.e. they form a tight subgraph often
referred to as core) such a network is said to display a rich-club [5]. The presence of a rich-club is quantitatively
recognized through the rich-club coefficient φ(k), which measures the ratio between the number of links among
the nodes having degree higher than a given value k and the maximum possible number of links among such nodes.
The rich-club coefficient, when divided by its expectation over a set of rewired networks with the same degree
sequence as the original one, is called φ(k)norm, and a network is said to display rich-club ordering when φ(k)norm > 1.
This phenomenon has been extensively investigated [6–11] as well as recognized in several real networks [12–15], with
special focus on neuroscience [16–19]. Such developments have fostered further research related to the study of the
network core such as its size [6, 20] and its contribution to network resilience [21], as well as its functional role [22].
Here we consider the concept of rich-club ordering by investigating the interconnections between nodes that are
considered as important from a number of different perspectives. Building on this, the generalized rich-club ordering
(GRCO) refers to the tendency of important nodes (under a certain declared point of view) to form a core denser than
expected. The importance of nodes can be evaluated from a structural point of view, e.g. the node degree or other
nodal centrality measures, and from a non-structural point of view, e.g. the node metadata. Node metadata refer
to non-structural information, such as social or technical attributes, related to network nodes that possibly display
a certain correlation with the observed network structure, and their importance is increasingly being recognized in
terms of understanding networked systems [23–28]. Additionally, node metadata represent exogenous information
about the network nodes (also in weighted networks) that may be impossible to split over the network links. It follows
that the study of GRCO becomes particularly interesting when dealing with networks with various node metadata;
as such, we aim to investigate the interrelation between such node metadata and the network structure.
For instance, if we consider a social network with known individuals’ incomes, we may find that the nodes with the
highest incomes, which are not necessarily hubs, are more interconnected than expected, while those with the highest
degree are not. Moreover, it is important to recall that, although rich-club ordering and assortativity are two related
concepts, positive assortativity does not necessarily imply rich-club ordering, and vice versa [7].
¹ matteo.cinelli@uniroma2.it
FIG. 1. Two toy networks with the same topology for which rich-club ordering can be evaluated with respect to structural and
non-structural measures. On the left, the node labels correspond to their wealth, while on the right, the node labels correspond
to their degree.
In the network from Figure 1 we have a slightly wealth-disassortative network (r_wealth = −0.052) in which, nevertheless,
the wealthiest nodes (five of them if we set the wealth threshold to w > 93) are tightly connected (they share 7 links
out of a possible 10) despite the fact that they are not the hubs of the considered network. Moreover, this network, which
displays r_degree = −0.282, does not show rich-club ordering with respect to degree for any value of k (i.e. φ(k)norm < 1 ∀k).
This means that rich-club ordering with respect to node metadata does not imply rich-club ordering with respect to node
degree, and vice versa. In a more general sense, we note that, in terms of node degree, it is easy to compute φ(k)norm
because we know which null model to use (degree-preserving rewiring [29]), while in terms of node wealth the situation
becomes trickier since wealth cannot be directly considered a structural property. For this reason, in the following
sections, we provide a framework for the evaluation of GRCO, together with specific null models for assessing the
significance of rich-club ordering in the case of node metadata.
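As a quick illustration of the toy example of Figure 1, the link density among the wealthiest nodes follows directly from the numbers quoted above (5 nodes with w > 93 sharing 7 links). The notation φ(w > 93) is introduced here only for this example:

```latex
\[
\phi(w > 93) = \frac{2 \cdot 7}{5\,(5 - 1)} = \frac{14}{20} = 0.7
\]
```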
II. RELATED WORKS
The concept of rich-club ordering was initially introduced by Zhou and Mondragón [5] in order to analyze the
Internet topology at the Autonomous Systems level and to provide a reasonable explanation as to why this kind of
network includes tightly interconnected hubs. In order to investigate the presence of a rich-club, the authors of [5]
introduced the rich-club coefficient φ(r) in terms of the rank r of the node (the sequence of ranks reflects the sequence of
node degrees arranged in non-increasing order).
After the contribution of [5], Colizza et al. [30] considered the rich-club coefficient φ(k) in terms of the degree k of
the node (here the ordering is the opposite of that of the degree sequence, i.e. the node degrees are arranged in
non-decreasing order). Since the work of [30], the rich-club coefficient has mostly been used in its φ(k) version;
however, the two coefficients yield identical results.
Despite the different formulations of the rich-club coefficient, the fundamental contribution of [30] derives from
the use of a null model to detect the presence of the rich-club. This null model exploits the procedure of
degree-preserving rewiring introduced in [29] in order to compute the expected rich-club coefficient and, from it, the
normalized coefficient φ(k)norm. The reason behind the necessity of a null model for studying rich-club ordering is the
observation of the monotonically increasing behavior of the rich-club coefficient φ(k). Such a statement was subsequently
disputed by the authors of [7], who also discussed the interrelation of the assortativity coefficient [23] with the
rich-club structure and introduced a null model able to preserve the density of the rich-club.
Other works concerning the evaluation of rich-club ordering are related to its statistical significance [10, 31] in terms
of p-value under different null models; to its effect on other structural measures, such as the clustering coefficient
and degree assortativity [8] that are strongly influenced by the rich-club density; to the improvement of the rich-club
coefficient itself [6], thus allowing one to consider the constraints introduced by the degree sequence of the network.
Together with its implementation on unweighted networks, the concept of rich-club ordering has also been extended
to weighted networks by using various null models which are able to preserve different aspects of the network, including
degree and strength distribution [32, 33]. Other extensions of the rich-club involve dense [34], hierarchical [35]
and interdependent networks [36]. Moreover, other contributions discuss the importance of a rich-club in network
robustness [21], spreading processes [37], as well as the generative processes and dynamics that lead to networks
displaying a rich-club [9, 38].
Finally, another important aspect related to the presence of a rich-club is to measure its size in terms of number
of nodes. This has been achieved through the persistence probability of a random walker in the network [20] and
the number of nodes that are necessary to realize a complete subgraph from the degree sequence of the considered
network [6].
III. EVALUATING RICH-CLUB ORDERING FOR THE NODE DEGREE
Rich-club ordering can be quantified using the coefficient φ(k):
\[
\phi(k) = \frac{2E_{>k}}{N_{>k}(N_{>k}-1)} \tag{1}
\]
where E>k is the number of links among the N>k nodes having degree higher than a given value k and N>k(N>k − 1)/2 is
the maximum possible number of links among the N>k nodes. Therefore, φ(k) measures the fraction of links connecting
the N>k nodes out of the maximum number of links they might possibly share. This implies that φ(k) = 1 when the
N>k nodes are arranged into a clique. When rich-club ordering is investigated, the rich-club coefficient needs to be
compared against a null model in order to evaluate its significance (i.e. to test that the presence of rich-club ordering is
not a natural consequence of the considered degree sequence). The use of null models, and of the normalization of
structural measures, is a widespread practice in the study of complex networks for understanding whether an observed
pattern could have arisen by chance. For this reason, the normalization of the rich-club coefficient, suggested in [30]
and adopted in many further studies [10, 17, 34, 35, 39–41], is necessary in order to assess the significance of this
index. The normalization of φ(k) involves an ensemble of rewired networks that have the same degree sequence as the one
under investigation and that, if generated in a sufficiently large number, provide a null distribution of the rich-club
coefficient. The rewiring procedure itself is simple: at each step it chooses two arbitrary edges, for instance (a,b) and
(c,d), and exchanges their endpoints so as to obtain (a,d) and (c,b) [29]; if one or both of these new links already exist
in the network, the step is aborted and a new pair of links is selected. This procedure has been widely adopted since it
preserves an important network property, namely the node degrees; however, other procedures that aim at preserving other
properties may be adopted [7, 9, 31, 42, 43].
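The rewiring step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's R implementation: the function name `degree_preserving_rewire`, its parameters and the edge-list representation are assumptions made for this example. In addition to rejecting duplicate links, the sketch also rejects swaps that would create self-loops and randomly flips the orientation of one edge, two choices that the text does not discuss.

```python
import random


def degree_preserving_rewire(edges, n_swaps, seed=None, max_tries=None):
    """Return a rewired copy of an undirected simple graph given as an edge list.

    Each step picks two edges (a, b) and (c, d) and tries to replace them
    with (a, d) and (c, b). The step is aborted if it would create a
    self-loop or a link that already exists, so every node keeps its degree.
    Assumes the graph has at least two edges.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    if max_tries is None:
        max_tries = 100 * n_swaps  # guard against graphs with no valid swap
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Flip one edge at random so that both swap orientations are reachable.
        if rng.random() < 0.5:
            c, d = d, c
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # would duplicate an existing link
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list
```

Calling the function repeatedly with different seeds on the same edge list yields an ensemble of networks that share the degree sequence of the original one.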
In general, the normalized rich-club coefficient [30] is defined as:
\[
\phi(k)_{norm} = \frac{\phi(k)}{\phi(k)_{rand}} \tag{2}
\]
where φ(k)rand is the average rich-club coefficient across the set of rewired networks (typically 1000 networks [15, 18,
19]) and we observe rich-club ordering when φ(k)norm >1.
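Equations 1 and 2 translate almost literally into code. The sketch below assumes that the network and each member of the rewired ensemble are given as edge lists; the function names and the convention of returning `None` when fewer than two nodes exceed the threshold are choices made for this illustration, not part of the original method.

```python
def rich_club_coefficient(edges, k):
    """phi(k): density of links among nodes with degree > k (Equation 1)."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    rich = {v for v, d in degree.items() if d > k}
    n_rich = len(rich)
    if n_rich < 2:
        return None  # undefined: fewer than two rich nodes
    e_rich = sum(1 for a, b in edges if a in rich and b in rich)
    return 2 * e_rich / (n_rich * (n_rich - 1))


def normalized_rich_club(edges, rewired_ensemble, k):
    """phi(k)norm = phi(k) / phi(k)rand (Equation 2).

    `rewired_ensemble` is a list of edge lists that share the degree
    sequence of `edges`; phi(k)rand is their average phi(k).
    """
    phi = rich_club_coefficient(edges, k)
    rand = [rich_club_coefficient(r, k) for r in rewired_ensemble]
    if phi is None or any(r is None for r in rand):
        return None
    phi_rand = sum(rand) / len(rand)
    return phi / phi_rand if phi_rand > 0 else None
```

A returned value above 1 for a given k corresponds to rich-club ordering at that degree threshold.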
Additionally, in [7] it is argued that when the maximum degree kmax of the considered network is larger than the cut-off
degree ks [44] (i.e. the degree above which it is impossible to obtain networks with no degree-degree correlation),
the degree-preserving rewiring could produce randomized networks with a rich-club coefficient that is too close to the
original one. This is because the rewiring procedure, in which pairs of links are uniformly sampled, could cause the
disruption of several high-degree to low-degree connections with the consequent creation of high-degree to high-degree
connections. Indeed, since a high proportion of links is attached to hubs, the probability of picking a pair of links
whose endpoints are hubs is relatively high, particularly when there are nodes with degree larger than ks.
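The text does not spell out how ks is computed. A common expression for the structural cut-off of uncorrelated networks is ks ≈ sqrt(⟨k⟩N), which for an undirected network equals sqrt(2E); the sketch below assumes that this is the definition intended in [44]. The assumption is at least consistent with the values reported in Section VIII, where the criminal network (n = 54, m = 315) is said to have ks ≃ kmax = 25 and the airports network (n = 745, m = 4618) to have ks < kmax = 166.

```python
import math


def structural_cutoff(n_nodes, n_edges):
    """Approximate structural cut-off k_s = sqrt(<k> N) = sqrt(2 E).

    Assumed definition; the source only cites [44] for k_s.
    """
    mean_degree = 2 * n_edges / n_nodes
    return math.sqrt(mean_degree * n_nodes)


# Values quoted in the applications section of the paper.
ks_gang = structural_cutoff(54, 315)       # close to kmax = 25
ks_airports = structural_cutoff(745, 4618) # smaller than kmax = 166
```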
IV. NULL MODELS FOR THE EVALUATION OF RICH-CLUB ORDERING
The evaluation of rich-club ordering in the case of degree exploits a null model that rewires the network while
keeping its degree sequence. As the degree can be considered a structural attribute of the node, a null model that
evaluates different network topologies (i.e. alters the original network structure while keeping certain fundamental
properties) constitutes a reasonable choice. The same choice seems to be reasonable also in the case of other structural
properties of the node (such as centrality measures) even if the degree-preserving rewiring doesn’t keep the same value
of centrality over the nodes due to the topology being subject to change.
In the case of non-structural attributes (i.e. node metadata), the structural rewiring does not seem to be the only
option. Indeed, we may be interested in knowing if different arrangements of the node metadata over the same network
structure are able to unveil rich-club ordering as well. In other words, we may also be interested in using a null model
that keeps the original network structure while reshuffling the node metadata.
More intuitively, when we evaluate rich-club ordering with link rewiring we are basically asking the question: does
the considered network possess a topology so unusual that it allows room for rich-club ordering? Alternatively, if
we evaluate rich-club ordering with metadata reshuffling we are basically asking the question: does the considered
network possess an arrangement of node metadata so unusual that it allows room for rich-club ordering?
As an example, let us suppose that a relatively large clique of wealthy nodes is present in a certain social network,
like that in the example of Section I. The two questions from above then become:
1. Is it so peculiar, given the degree of the wealthy nodes, to observe a realization that contains a clique made up
of such nodes?
2. Is it so peculiar, given the network structure, to observe a distribution of the node metadata such that the
wealthy nodes are arranged into a clique?
Consequently, we may be interested in understanding if the metadata distribution is related to the presence of
structural rich-club ordering, and if rich-club ordering, evaluated with respect to node metadata, can be interpreted
as a reinforcement (or a weakening) of the evidence of structural rich-club ordering. Indeed, the exploitation of node
metadata takes into account an additional layer of information that derives from the coupling between the network
structure and the node metadata.
Thus, in order to evaluate rich-club ordering in the case of node metadata we suggest a comparison between the
number of links observed among the rich nodes and the average number of links observed among such rich nodes over
two different ensembles: the former made up of networks and obtained via the rewiring of links; the latter made up
of vectors of metadata and obtained via the reshuffling of node attributes.
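The second ensemble (metadata reshuffling) can be sketched as follows. This is a minimal illustration under stated assumptions: the metadata are given as a dictionary mapping each node to a scalar value, the function names are hypothetical, and the average over shuffles is a plain mean over uniformly random permutations of the values.

```python
import random


def metadata_rich_club(edges, metadata, m):
    """Link density among the nodes whose metadata value exceeds m."""
    rich = {v for v, x in metadata.items() if x > m}
    n_rich = len(rich)
    if n_rich < 2:
        return None  # undefined: fewer than two rich nodes
    e_rich = sum(1 for a, b in edges if a in rich and b in rich)
    return 2 * e_rich / (n_rich * (n_rich - 1))


def reshuffled_normalization(edges, metadata, m, n_shuffles=1000, seed=None):
    """Observed density divided by its mean over reshuffled metadata.

    The topology is left untouched; only the assignment of values to nodes
    is permuted, so the number of rich nodes is the same in every shuffle.
    """
    rng = random.Random(seed)
    phi = metadata_rich_club(edges, metadata, m)
    if phi is None:
        return None
    nodes = list(metadata)
    values = [metadata[v] for v in nodes]
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(values)
        total += metadata_rich_club(edges, dict(zip(nodes, values)), m)
    phi_rand = total / n_shuffles
    return phi / phi_rand if phi_rand > 0 else None
```

The first ensemble (link rewiring) uses the same density function, evaluated on rewired edge lists while the metadata dictionary stays fixed.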
These two methods generate different ensembles in which different aspects of the original network are preserved. In
the first case (link rewiring) we lose the original network topology while we keep its degree sequence and its
degree-attribute correlation. In the second case (metadata reshuffling) we lose the degree-attribute correlation but we
keep the original network structure. Both methods, despite their clear differences, seem to provide a valid basis of
comparison in the case of node metadata. Moreover, when we evaluate GRCO considering node metadata, if the node metadata
and the node degrees do not display a significant correlation (either positive or negative), we could observe a set of
rich nodes with very heterogeneous degrees. This does not imply, however, the absence of a dense subgraph made up of rich nodes.
As an example, if we suppose that a subgraph of 15 nodes has the highest metadata values, then the degree of such
nodes has to be at least 14 in order to realize a clique that is connected to the rest of the graph; thus, considering
a sufficiently large network, such nodes don’t have to be necessarily hubs in order to establish a rich-club from the
metadata point of view. It follows that a positive correlation between node metadata and node degree has an effect
on the rich-club evaluation and that, in certain cases, the presence of rich-club ordering with respect to node degree
could also indicate rich-club ordering with respect to node metadata.
As an extreme case, if we have perfect positive correlation between the considered attribute and the node degree
(ρx,k = 1), the sorting of the node metadata will correspond to the sorting of the node degrees (i.e. to the degree
sequence). Therefore, we will evaluate rich-club ordering with the same sorting of nodes but with implications
that will differ depending on the null model that we choose to adopt. Additionally, we should consider that the
degree-attribute correlation, when significant, represents an important feature of the considered network which, if not
completely dropped by the null model, can be used to further test the presence of rich-club ordering. Therefore,
rich-club ordering in the case of node metadata could be evaluated in two ways: with respect to random reshuffling,
which provides an ensemble of reshuffled labels that are, in general, uncorrelated with the node degrees; and with
respect to a reshuffling procedure that, similarly to [45], keeps the distribution of the degree-attribute correlation
somewhat closer to that of the original network.
In order to address this point, in Section IX we introduce a procedure which, by keeping a certain node-attribute
correlation, aims at further stressing the presence of rich-club ordering by comparing it with a somewhat unfavorable
set of metadata shuffles.
V. EVALUATING RICH-CLUB ORDERING FOR STRUCTURAL MEASURES DIFFERENT FROM
DEGREE
When evaluating rich-club ordering (i.e. when computing φnorm) with respect to structural measures different from node
degree, we should take into account that the degree-preserving rewiring entails the two following aspects:
1. The structural measure of node i may change its value due to rewiring.
2. The number of nodes that retain a value of the considered structural measure above the threshold for which we
evaluate rich-club ordering may change.
In order to address this problem we evaluate rich-club ordering by creating a ranking of nodes; in other words, we
consider the rich-club coefficient as a measure of position. While the degree sequence is fixed across rewired networks,
the rewiring procedure may alter the values of the structural measure associated with each node and, consequently, the
number of nodes with a certain value of such a measure. To consider the same number of nodes at each iteration (which
corresponds to keeping the denominator of φ(p) constant for a given p), both in the original network and in the randomized
ensemble, the nodes of the original network and of each of its randomized instances are ranked in non-decreasing order of
the considered structural measure and assigned a position p ∈ [1, N]. We then consider the number of links among the
nodes that have a rank greater than a given value p. In this way, for each network, the node with the lowest value of the
considered measure is in position 1 while that with the highest value is in position N, despite the possible differences
in the highest/lowest values among different networks.
Therefore, in order to compute φ(p) we compute the density of connections among nodes whose index of position
is greater than p:
\[
\phi(p) = \frac{2E_{>p}}{N_{>p}(N_{>p}-1)} \tag{3}
\]
where E>p is the number of edges among the N>p nodes with centrality value greater than the value in position p
and N>p(N>p − 1)/2 is the maximum possible number of edges among the N>p nodes.
By using this procedure we obtain φ(p)norm = φ(p)/φ(p)rand, where φ(p)rand is the average of φ(p) over the random
ensemble. It is worth mentioning that this way of computing the coefficient φ(p) (as a measure of position) is similar
to that proposed in the paper that originally discussed rich-club ordering [5]. Additionally, this measure is also
related to the rich-club coefficient for weighted networks (i.e. networks with non-binary links) proposed in [33], whose
authors consider structural measures such as the node strength (i.e. the sum of the weights attached to
the links of a certain node) and the average weight (i.e. the ratio between the node strength and the node degree),
and normalize the rich-club coefficient with a method that reshuffles the weights over the links and then the links
themselves.
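The positional coefficient of Equation 3 can be sketched as below. The function name, the dictionary `score` (any structural measure, e.g. eigenvector centrality computed elsewhere) and the tie-breaking rule are assumptions of this illustration: the text does not say how nodes with equal values are ordered, so ties are broken here by node label, which requires labels that can be compared.

```python
def positional_rich_club(edges, score, p):
    """phi(p): link density among the nodes ranked above position p.

    Nodes are sorted in non-decreasing order of `score` and given the
    positions 1..N, so exactly N - p nodes are retained for every network,
    whatever the actual values of the measure are.
    """
    ranked = sorted(score, key=lambda v: (score[v], v))
    rich = set(ranked[p:])  # positions p+1, ..., N
    n_rich = len(rich)
    if n_rich < 2:
        return None  # undefined: fewer than two nodes above position p
    e_rich = sum(1 for a, b in edges if a in rich and b in rich)
    return 2 * e_rich / (n_rich * (n_rich - 1))
```

Applying the same function to the original network and to each rewired instance (with the measure recomputed on each instance) gives the numerator and the terms averaged in φ(p)rand.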
VI. EVALUATING RICH-CLUB ORDERING FOR NON-STRUCTURAL MEASURES
The rich-club coefficient in the case of node metadata can be computed only for scalar metadata as we need a
quantity which, like the node degree or other structural measures, can be sorted in a certain order. The coefficient can
be easily derived from the case of node degree by considering, instead of the degree k, a certain value m corresponding
to the value of the node metadata. Therefore, rich-club ordering can be discovered via the coefficient φ(m):
\[
\phi(m) = \frac{2E_{>m}}{N_{>m}(N_{>m}-1)} \tag{4}
\]
where E>m is the number of edges among the N>m nodes having metadata value higher than a given value m and
N>m(N>m − 1)/2 is the maximum possible number of edges among the N>m nodes.
The normalized rich-club coefficient, φ(m)norm, can be derived by considering m as the value corresponding to a
certain value of the node metadata, whilst considering φ(m)rand from two different perspectives. In other words, in
the case of node metadata, we obtain two values of φ(m)rand that depend on the null model that we use. In the case
of link rewiring, φ(m)rand is called $\phi(m)^{rew}_{rand}$ and we use, as for example in [46], the coefficient:
\[
\phi(m)^{rew}_{norm} = \frac{\phi(m)}{\phi(m)^{rew}_{rand}} \tag{5}
\]
while in the case of metadata reshuffling, φ(m)rand is called $\phi(m)^{resh}_{rand}$ and we use the coefficient:
\[
\phi(m)^{resh}_{norm} = \frac{\phi(m)}{\phi(m)^{resh}_{rand}} \tag{6}
\]
Finally, it is worth adding that, in the case of non-structural measures, neither the rewiring nor the reshuffling
procedure affects the values of the metadata vector, whose entries, in the latter case, are only modified in terms
of position.
VII. A FRAMEWORK FOR THE EVALUATION OF GENERALIZED RICH-CLUB ORDERING
(GRCO)
In Table I we propose a framework for the evaluation of generalized rich-club ordering (GRCO)².
Consider a certain node feature. It can be:
1. Structural
2. Non-structural
If structural, it can be:
   degree: compute φ(k)norm with degree-preserving rewiring, as in Equation 2
   ≠ degree: compute φ(p)norm with degree-preserving rewiring, as in Equation 3
If non-structural:
   compute $\phi(m)^{rew}_{norm}$ with degree-preserving rewiring, as in Equation 5
   compute $\phi(m)^{resh}_{norm}$ with metadata reshuffling, as in Equation 6
TABLE I. Generalized framework for the evaluation of rich-club ordering
VIII. APPLICATION
A. Social Network
We test the introduced framework in the case of a criminal social network [49]. The network is made up of the
relationships (m = 315, which we consider unweighted) among confirmed members (n = 54) of a London street gang
between 2005 and 2009. We choose this network as it comes with various node metadata (which are an important piece
of information in criminal networks [50]) such as age, number of arrests and number of convictions. We compute the
rich-club coefficient for two structural characteristics of nodes, degree and eigenvector centrality, and for three
non-structural characteristics, corresponding to the node metadata, using degree-preserving rewiring (ks ≃ kmax = 25)
and node metadata reshuffling. In Figure 2, we observe that the considered network displays rich-club ordering with
respect to degree (φ(k)norm > 1). Thus, in such a network, hubs happen to be more connected than what we observe, on
average, across the rewired network ensemble. The network also displays what we can call power-club ordering, as the
nodes with the highest eigenvector centrality are also tightly connected. The latter result is, however, expected since
the degree and the eigenvector centrality are, in general, positively correlated [51].
² R code for the evaluation of GRCO exploits the libraries igraph [47] and brainGraph [48] and is available at https://github.com/cinHELLi
FIG. 2. Curves of the coefficient φnorm for the criminal social network. The dashed line occurs in correspondence with
φnorm = 1, the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector,
age (rewiring), age (reshuffling), arrests (rewiring), arrests (reshuffling), convictions (rewiring), convictions (reshuffling).
When we consider the node age and the number of arrests we also observe rich-club ordering. In particular, we observe
that, in the two cases, the metadata reshuffling entails a stronger rich-club ordering than the link rewiring
($\phi(m)^{resh}_{norm} \geq \phi(m)^{rew}_{norm} > 1$). This means that the metadata are arranged in a way that elicits
rich-club ordering and that this arrangement is hard to replicate via random label reshuffling. The fact that the two
measures are both in favour of rich-club ordering denotes that the presence of this phenomenon is far from being random
from different perspectives, thus underlining the importance of the interplay between the node metadata and the network
topology. Conversely, we observe slightly discordant results when the number of convictions is taken into account. For a
certain value of m (m = 9), rich-club ordering appears to be absent from the structural point of view and present from the
metadata point of view. Such a discrepancy is due to the fact that the nodes with the highest number of convictions (> 9 in
this case) have heterogeneous degrees. Indeed, there are five nodes with more than nine convictions, and they
have degrees d = [2, 2, 2, 14, 16]. The maximum number of links that those five nodes could share is ten, while in
the actual network they only share two links. Such a small number of links is also due to the presence of nodes
with very low degree. Moreover, since about 90% of the nodes in the actual network have degree higher than 2 and
the network, being relatively dense, displays several complete subgraphs of size 5, then we should expect reshuffled
instances displaying a rich-club coefficient below one. In other words, we should expect a higher number of links
among highly convicted nodes in randomized networks. In this case, the value of the rich-club coefficient
$\phi(m)^{resh}_{norm}$ indicates that highly convicted elements tend to avoid each other.
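As a check on the figures quoted above for the number of convictions, inserting them into Equation 4 (five nodes with more than nine convictions sharing two links) gives the raw, non-normalized coefficient:

```latex
\[
\phi(m = 9) = \frac{2 \cdot 2}{5\,(5 - 1)} = \frac{4}{20} = 0.2
\]
```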
More technically, this analysis entails that the degree heterogeneity of the nodes that we take into account, when
related to other elements such as the network density, is able to explain the discrepancies observed between the
coefficients $\phi(m)^{rew}_{norm}$ and $\phi(m)^{resh}_{norm}$. A discordant result between the two coefficients is also
partially explained by the low value of the correlation coefficient between the degree and the number of convictions,
which is ρdeg,conv = 0.058.
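A degree-attribute correlation such as the one quoted above can be computed with a short routine. The source does not state which correlation coefficient is used; the sketch below assumes Pearson's coefficient, and the function names and the dictionary-based representation of the metadata are hypothetical. Every node appearing in an edge is assumed to have a metadata entry.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # undefined when one of the sequences is constant
    return cov / math.sqrt(var_x * var_y)


def degree_attribute_correlation(edges, metadata):
    """Correlation between node degree and a scalar node attribute."""
    degree = {v: 0 for v in metadata}  # isolated nodes keep degree 0
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    nodes = list(metadata)
    return pearson([degree[v] for v in nodes], [metadata[v] for v in nodes])
```

A value close to zero, as in the case of convictions, means that the rich nodes selected by the attribute can have very different degrees.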
B. Linguistic Network
We consider the global language network in which each node represents a language and links connect languages
that are likely to be co-spoken [52]. In more detail, languages are connected according to the frequency of book
translations, i.e. two languages are connected if at least one book has been translated from one language to the
other. The data are pre-processed in order to consider only the largest connected component of the language network
compatibly with the availability of node metadata. The resulting network has n = 54 nodes and m = 104 links, and
the node metadata are represented by two elements: the GDP (gross domestic product) per capita for a language and
the number of speakers of a certain language. As described in [52], the GDP per capita for a language is measured
as the average contribution of a single speaker of language l to the world GDP, and is calculated by adding the
contributions of speakers of l to the GDP of every country, and dividing the sum by the number of speakers of l. The
number of speakers of a certain language is computed using the speaker estimates from the June 14, 2012 version
of the Wikipedia Statistics page, as explained in [52]. In Figure 3 we observe how the global language network tends
to display rich-club ordering from both a structural and a non-structural point of view. It is interesting to note that
languages with a relatively high number of speakers are, in some instances, less interconnected than expected, meaning
that certain books written originally in a widely spoken language were not directly translated into the other most spoken
languages. It is likely that such books were first translated into other less spoken languages, which acted as
intermediaries.
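The verbal definition of the GDP per capita for a language given above can be written compactly. The formula below is one possible reading, assuming that the contribution of the speakers of l to the GDP of country c equals the GDP per capita of c multiplied by the number of speakers of l living in c; the symbols G_l, G_c and N_{lc} are introduced here for illustration only, and the exact definition should be checked in [52].

```latex
\[
G_l = \frac{\sum_{c} G_c \, N_{lc}}{\sum_{c} N_{lc}}
\]
```

Here G_c is the GDP per capita of country c and N_{lc} is the number of speakers of language l in country c, so that the denominator is the total number of speakers of l.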
FIG. 3. Curves of the coefficient φnorm for the global language network. The dashed line occurs in correspondence with
φnorm = 1, the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector,
GDP per capita (rewiring), GDP per capita (reshuffling), amount of speakers (rewiring), amount of speakers (reshuffling).
C. Transportation Network
We also consider the US airports network of domestic flights in December 2010 (the network is considered
in its undirected, unweighted version with n = 745, m = 4618 and k_s < k_max = 166; see footnote 3), in which the number of
flights departing from each airport and the number of passengers leaving each airport are used as node metadata.
These two quantities (reasonably) show a very high correlation with the node degrees: ρ(pass, deg) = 0.906 and
ρ(dep, deg) = 0.928. For this reason, when we evaluate rich-club ordering with respect to node metadata, we observe
values of φnorm^rew(m) and φnorm^resh(m) that are both positive but very different. In more detail, in Figure 4, we observe very high
values of φnorm^resh(m), which arise because a random reshuffling of the node metadata completely destroys
the observed correlation between the metadata and the degree. The US airports network displays rich-club ordering
with respect to degree, to eigenvector centrality (from a certain point) and to the metadata values. In the latter case, we observe
that, because of the random reshuffling, the obtained values are clearly on a different scale than those obtained in
the case of link rewiring. In other words, the number of links among nodes with the highest metadata values is far
greater than that expected by chance. This implies that the arrangement of node metadata in the original
network is significant and difficult to replicate (because of the degree-attribute correlation) using the current null
model (i.e. a model that randomly redistributes the node metadata). Moreover, this result confirms the presence and
the significance of interconnections among important airports from a wide array of perspectives, connected both to
the traffic generated by the airports and to the airports themselves.
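The effect of the reshuffling null model can be illustrated with a minimal sketch. It assumes the usual density-based form of the coefficient, φ(m) = 2E_{>m} / (N_{>m}(N_{>m} − 1)), where N_{>m} is the number of nodes with metadata larger than m and E_{>m} is the number of links among them. The function names, the toy graph and the number of reshuffles are hypothetical and are not taken from the analysis reported here.

```python
import random

def rich_club_phi(edges, metadata, threshold):
    """Density of links among nodes whose metadata exceeds `threshold`.

    `edges` is a list of undirected links, each listed once (an assumption).
    """
    rich = {v for v, value in metadata.items() if value > threshold}
    if len(rich) < 2:
        return 0.0
    links = sum(1 for u, v in edges if u in rich and v in rich)
    return 2.0 * links / (len(rich) * (len(rich) - 1))

def phi_norm_resh(edges, metadata, threshold, n_shuffles=1000, seed=0):
    """phi(m) divided by its mean over randomly reshuffled metadata."""
    rng = random.Random(seed)
    nodes = list(metadata)
    values = [metadata[v] for v in nodes]
    observed = rich_club_phi(edges, metadata, threshold)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(values)  # topology untouched, only metadata move
        total += rich_club_phi(edges, dict(zip(nodes, values)), threshold)
    mean_null = total / n_shuffles
    return observed / mean_null if mean_null > 0 else float("nan")

# Toy graph: a 4-clique of "rich" nodes, each with one peripheral neighbour.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("a", "e"), ("b", "f"), ("c", "g"), ("d", "h")]
metadata = {"a": 100, "b": 90, "c": 80, "d": 70, "e": 4, "f": 3, "g": 2, "h": 1}

print(rich_club_phi(edges, metadata, threshold=50))  # 1.0 (the clique is complete)
print(phi_norm_resh(edges, metadata, threshold=50))
```

In this toy case the metadata are strongly aligned with the degree, so reshuffling scatters the high values over sparsely connected nodes and the normalized coefficient ends up well above 1, which is the same mechanism invoked above for the airports network.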
D. Technological Network
We consider a network obtained from the data of the Seventh Framework Programme for Research and Technological
Development (FP7), provided by the European Commission (EC). FP7 ran from 2007 until 2013 with a total
budget of over 50 billion euros. Most of the budget was spent on grants to both European and global research institutions
to co-finance research, technological development and demonstration projects. Among the different lines of funding
of FP7, we consider the data of projects related to the call for environmental issues.
Using such data we first build a bipartite network in which one partition is made up of projects while the other
is made up of participants of projects. A link between the partitions exists if an institution participated in a project.
3 The network is available in the igraphdata package for R [53].
[Figure 4: six panels plotting φnorm(k) against k (degree), φnorm(p) against p (eigenvector), φnorm^rew(m) and φnorm^resh(m) against m (number of departures), and φnorm^rew(m) and φnorm^resh(m) against m (passengers).]
FIG. 4. Curves of the coefficient φnorm for the US airports network. The dashed line occurs in correspondence with φnorm = 1,
the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, number
of departures (rewiring), number of departures (reshuffling), number of passengers leaving (rewiring), number of passengers
leaving (reshuffling).
Then we perform a one-mode projection of the bipartite network in a way such that two institutions are connected if
they participated in the same project. The resulting network has n= 2739 and m= 45667 and we consider as node
metadata the contribution of the EC to each institution, measured in euros.
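The one-mode projection described above can be sketched in a few lines. This is only an illustration under simple assumptions: the bipartite data are given as a mapping from each project to its participants, and the function name and the toy records are hypothetical.

```python
from itertools import combinations

def project_onto_institutions(participation):
    """Link two institutions if they took part in at least one common project.

    `participation` maps a project identifier to an iterable of institutions.
    Each link is returned once, as a tuple (u, v) with u < v.
    """
    edges = set()
    for members in participation.values():
        # Every pair of participants in the same project becomes a link,
        # so each project turns into a complete subgraph (clique).
        for u, v in combinations(sorted(set(members)), 2):
            edges.add((u, v))
    return edges

# Hypothetical toy data: two projects sharing institution "C".
participation = {"P1": ["A", "B", "C"], "P2": ["C", "D"]}
print(sorted(project_onto_institutions(participation)))
# [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'D')]
```

Using a set means that institutions collaborating on several projects are still joined by a single unweighted link.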
The network of institutions that we take into account has a very peculiar structure, in that the participants in
each project are connected in a complete subgraph, while institutions participating in multiple projects connect such
dense substructures. This network, made up of several interconnected cliques, is particularly well suited to the study of
rich-club ordering with respect to node metadata. Indeed, we can foresee how the rewiring
procedure would break up the multiple cliques (each representing a financed project) and would thus provide evidence for
structural rich-club ordering. When we analyze rich-club ordering in terms of the contribution of the EC, i.e. when we
ask ourselves whether the richest nodes (in terms of received funds) are arranged into a rich-club, the reshuffling procedure
represents a more suitable null model since it preserves the network structure while changing the degree-metadata
correlation. In Figure 5, we observe that the considered network displays rich-club ordering from both a structural
and a non-structural point of view, confirming, as also observed in [13] for projects funded by the Engineering and
Physical Sciences Research Council of the United Kingdom, that research funds are allocated to rich-clubs. The fact that
an elite circle of academic institutions tends to over-attract funding [54] represents a major problem in research that
needs to be investigated in other datasets and addressed with proper measures and interventions aimed at reducing
evident inequalities.
IX. AN ALTERNATIVE TO RANDOM RESHUFFLING
Considering the reasoning outlined above, we suggest a procedure that, based on a certain parameter, is able to
reshuffle the node metadata while keeping a degree-metadata correlation profile closer to that of the original network.
Indeed, we aim at investigating GRCO by discerning between two cases: one in which rich-club ordering is discovered due to
a distribution of the node metadata that is significant with respect to an appropriate null model for the considered
case, and one in which rich-club ordering is discovered due to the comparison against networks whose attribute distribution
is too far from the original one.
[Figure 5: four panels plotting φnorm(k) against k (degree), φnorm(p) against p (eigenvector), and φnorm^rew(m) and φnorm^resh(m) against m (EC contribution in euros).]
FIG. 5. Curves of the coefficient φnorm for the FP7 projects network. The dashed line occurs in correspondence with φnorm = 1,
the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, European
Commission contribution (rewiring), European Commission contribution (reshuffling).
The procedure is based on the idea of swapping a pair of metadata values whose corresponding entries in the
metadata vector are at a certain distance s from one another, and it is made up of the following steps:
1. Consider the vector of metadata of length N and choose randomly an entry in position i ∈ [1, N]
2. Select the parameter s ∈ [1, N], which determines the range of the metadata swap. In other words, s determines
the distance of the randomly chosen entry, in position i, from the candidate entry, in position i' = i ± s, which
will be selected for the swap
3. Select the direction, δ ∈ {0, 1}, of the swap with a Bernoulli trial with probability p = 0.5
If δ = 0 set i' = i − s
If δ = 1 set i' = i + s
4. If i − s < 1 and δ = 0 there is no available entry, in position i', for the swap. Thus, pick uniformly at random
one entry in position [1, i − 1] if i ≠ 1, or in position i' = 1 if i = 1. Swap the entries in positions i and i'
5. If i + s > N and δ = 1 there is no available entry, in position i', for the swap. Thus, pick uniformly at random
one entry in position [i + 1, N] if i ≠ N, or in position i' = N if i = N. Swap the entries in positions i and i'
6. Else swap the entry in position i with the entry in position i', where i' = i ± s depending on the value of δ
7. Repeat the steps from 3 to 6 O(M) times
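The steps above can be sketched as follows. This is a minimal illustration rather than the implementation used for the reported results: positions are 0-based instead of 1-based, s is a fixed integer, the function names are hypothetical, and the position i is redrawn at every iteration, which is an interpretation of the procedure and not something the steps state explicitly.

```python
import random

def pick_partner(i, s, delta, n, rng):
    """Return the 0-based position to swap with position i (steps 3 to 6)."""
    j = i - s if delta == 0 else i + s
    if j < 0:
        # Step 4: no entry at distance s on the left; fall back to a
        # uniformly random position to the left of i (or i itself at the edge).
        return rng.randrange(0, i) if i > 0 else 0
    if j >= n:
        # Step 5: no entry at distance s on the right; same idea, mirrored.
        return rng.randrange(i + 1, n) if i < n - 1 else n - 1
    return j  # Step 6: regular swap at distance s

def local_reshuffle(metadata, s, n_iter, seed=None):
    """Apply n_iter distance-limited swaps to a copy of the metadata vector."""
    rng = random.Random(seed)
    values = list(metadata)
    n = len(values)
    if n == 0:
        return values
    for _ in range(n_iter):
        i = rng.randrange(n)       # step 1 (redrawn each time: an assumption)
        delta = rng.randint(0, 1)  # step 3: Bernoulli trial with p = 0.5
        j = pick_partner(i, s, delta, n, rng)
        values[i], values[j] = values[j], values[i]
    return values

# One deterministic iteration with i = 4, s = 2, delta = 1 in 1-based terms.
vector = [10, 7, 1, 9, 6, 14, 1, 2, 27, 1]
j = pick_partner(3, 2, 1, len(vector), random.Random(0))
vector[3], vector[j] = vector[j], vector[3]
print(vector)  # [10, 7, 1, 14, 6, 9, 1, 2, 27, 1]
```

A node-dependent range, such as the square root of the degree of the selected node, would require passing a per-position value of s instead of a single integer.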
In Figure 6 we pictorially display an iteration of the proposed procedure, while in Figure 7 we show three distributions
of degree-attribute correlation in the case of the US airports network, considering as node metadata the passengers
leaving each airport (node). The three represented cases are: random reshuffling; reshuffling with the described
procedure using as a parameter the mean degree, s = ⟨k⟩; reshuffling with the described procedure using as a parameter
the square root of the degree of the selected node, s = √k_i.
It is worth noting that, asymptotically, the proposed procedure and the random shuffling should end up with
somewhat equivalent distributions of the node metadata. Indeed, by iterating the procedure a number of times
that tends to infinity, we should observe reshuffled vectors displaying a degree-attribute correlation which is close to
that of a randomized vector, regardless of the value of s. Nonetheless, since we are comparing the two cases of link
rewiring and metadata reshuffling, we should also consider that in the former case the number of performed rewirings
is, in general, O(M), where M is the number of links. This implies that, in practical contexts, by performing O(M)
iterations the proposed procedure would produce a correlation profile which differs from the random one. Indeed,
in Figure 7 we observe how the proposed procedure, regardless of the chosen parameter, keeps a higher correlation
profile that overlaps with the random one only in its left tail. By using this procedure we can further test the presence
of rich-club ordering on different normalized ensembles, thus obtaining the results of Figure 8. The obtained results
still confirm the presence of rich-club ordering in the case of node metadata but they are clearly on a different scale
with respect to the case of random reshuffling, as displayed in Figure 4. The results of this stress test provide further
evidence for the presence of a tight core in the considered airport network.
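A correlation profile of the kind discussed here can be obtained by computing the degree-metadata correlation over an ensemble of shuffled vectors. The sketch below assumes the Pearson coefficient, vectors aligned so that entry i of both lists refers to the same node, and a pluggable `shuffler` function; all names and the toy data are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_ensemble(degrees, metadata, shuffler, n_vectors=1000, seed=0):
    """Degree-metadata correlation for each of n_vectors shuffled vectors."""
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_vectors):
        shuffled = shuffler(list(metadata), rng)  # shuffle a fresh copy
        correlations.append(pearson(degrees, shuffled))
    return correlations

def random_shuffler(values, rng):
    """Fully random reshuffling of the metadata vector."""
    rng.shuffle(values)
    return values

# Toy data in which metadata grow almost linearly with the degree.
degrees = list(range(1, 51))
metadata = [10 * k + (7 * k) % 5 for k in degrees]

observed = pearson(degrees, metadata)
null = correlation_ensemble(degrees, metadata, random_shuffler, n_vectors=200)
print(round(observed, 3), round(sum(null) / len(null), 3))
```

Any other reshuffling scheme, including a distance-limited one, can be compared by passing a different `shuffler` with the same two-argument signature and plotting the resulting lists as histograms.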
[Figure 6: the metadata vector (10, 7, 1, 9, 6, 14, 1, 2, 27, 1) with i = 4, s = 2, δ = 1 and i' = 6 becomes (10, 7, 1, 14, 6, 9, 1, 2, 27, 1) after the swap.]
FIG. 6. Example iteration of the procedure of metadata reshuffling on a random metadata vector. An entry in position i = 4
is randomly selected and the parameter s is set to s = 2. We suppose that the result of the Bernoulli trial is δ = 1 and thus
i' = i + s = 6. Finally the entries in positions i and i' are switched and another iteration is repeated using the new metadata
vector.
[Figure 7: overlaid histograms of Frequency against Correlation, one for each shuffling scheme in the legend (r, ⟨k⟩, √k).]
FIG. 7. Histograms displaying the frequencies of correlation values computed using 1000 shuffled vectors of node metadata.
We choose 1000 shuffled vectors as we also consider 1000 rewired networks when computing the normalized rich-club coefficient
in the case of structural measures. In the legend, r refers to the random mixing of the node metadata while ⟨k⟩ and √k refer to
the mixing parameters (mean degree and square root of the node degree) of the proposed procedure of metadata shuffling.
X. DISCUSSION
[Figure 8: two panels plotting φnorm^resh(m) against m (passengers), one for each parameter of the reshuffling procedure.]
FIG. 8. Results for the US airports network. We compute GRCO in the case of node metadata (number of passengers leaving)
by using the presented procedure of node metadata reshuffling. We use as parameters of the procedure the average degree (left)
and the square root of the node degree (right).
In this paper we discussed the generalization of the concept of rich-club ordering, considering both node structural
attributes and metadata. This allowed room for the evaluation of such a phenomenon from a number of different
perspectives that embed external information about nodes and that can be useful in the study of real networks. For
instance, when studying economic networks, such as trade networks or interbank networks, one may be interested
in noticing whether the richest agents (in an economic sense) do actually form a rich-club whilst not being hubs, in
other words, whether they tend to saturate their degree by connecting only to other rich members, thus minimizing
their feeder (i.e. rich-club to non rich-club) connections. The study of such feeder connections, whose endpoints are
nodes outside the rich-club, i.e. nodes which can be in a certain proportion considered eligible to join the rich-club,
has proved to be important in confirming the presence of rich-club ordering [6] and it could provide insights for the
understanding of the dynamical properties of the rich-club which, like its growth, are still largely unexplored [20].
Moreover, GRCO can be easily extended to the case of weighted (i.e. networks with edge metadata) and directed
networks by using the right null models for these specific cases [32–34].
This generalization also aims at shedding more light on the relationship that exists between topological and non-
topological patterns in real networks, as well as at emphasizing the importance of node metadata. Given the current
possibility to collect and store increasingly rich datasets and networks, metadata are indeed gaining attention
in Network Science and many topological phenomena such as the Friendship Paradox [55] (which states that your
friends have, on average, more friends than you have), are now being generalized considering the presence of node
characteristics [56]. The use of such metadata has also been extended to other topological network properties, such
as motifs [57], that are now enriched considering their functional aspects when examined in real networks [18].
Additionally, we discussed the importance of testing rich-club ordering with the appropriate null models, which
can provide us with a deeper understanding of the numerous facets of this problem. However, such an approach
always implies a trade-off between what can be kept and what can be dropped regarding the network structure and
its relation to the node metadata.
ACKNOWLEDGMENTS
The author thanks Leto Peel, Raúl J. Mondragón and Antonio Iovanella for their insightful suggestions and comments.
[1] Mark EJ Newman and Michelle Girvan, “Finding and evaluating community structure in networks,” Phys. Rev. E 69,
026113 (2004).
[2] Santo Fortunato and Darko Hric, “Community detection in networks: A user guide,” Physics Reports 659, 1–44 (2016).
[3] Robert M May, “Will a large complex system be stable?” Nature 238, 413 (1972).
[4] Peter Csermely, András London, Ling-Yun Wu, and Brian Uzzi, “Structure and dynamics of core/periphery networks,”
Journal of Complex Networks 1, 93–123 (2013).
[5] Shi Zhou and Raúl J. Mondragón, “The rich-club phenomenon in the internet topology,” IEEE Communications Letters
8, 180–182 (2004).
[6] Matteo Cinelli, Giovanna Ferraro, and Antonio Iovanella, “Rich-club ordering and the dyadic effect: Two interrelated
phenomena,” Physica A: Statistical Mechanics and its Applications 490, 808–818 (2018).
[7] Shi Zhou and Raúl J. Mondragón, “Structural constraints in complex networks,” New Journal of Physics 9, 173 (2007).
[8] Xiao-Ke Xu, Jie Zhang, and Michael Small, “Rich-club connectivity dominates assortativity and transitivity of complex
networks,” Phys. Rev. E 82, 046117 (2010).
[9] Raúl J. Mondragón and Shi Zhou, “Random networks with given rich-club coefficient,” The European Physical Journal B
85, 328 (2012).
[10] Zhi-Qiang Jiang and Wei-Xing Zhou, “Statistical significance of the rich-club phenomenon in complex networks,” New
Journal of Physics 10, 043002 (2008).
[11] Christopher Ansell, Renata Bichir, and Shi Zhou, “Who says networks, says oligarchy? Oligarchies as ‘rich club’ networks,”
Connections (02261766) 35 (2016).
[12] Shi Zhou and Raúl J. Mondragón, “Accurately modeling the internet topology,” Phys. Rev. E 70, 066108 (2004).
[13] Athen Ma, Raúl J. Mondragón, and Vito Latora, “Anatomy of funded research in science,” Proceedings of the National
Academy of Sciences 112, 14760–14765 (2015), http://www.pnas.org/content/112/48/14760.full.pdf.
[14] Giovanna Ferraro and Antonio Iovanella, “Revealing correlations between structure and innovation attitude in inter-
organisational innovation networks,” International Journal of Computational Economics and Econometrics 6, 93–113
(2016).
[15] Manlio De Domenico and Alex Arenas, “Modeling structure and resilience of the dark network,” Phys. Rev. E 95, 022313
(2017).
[16] Martijn P Van Den Heuvel and Olaf Sporns, “Rich-club organization of the human connectome,” Journal of Neuroscience
31, 15775–15786 (2011).
[17] Logan Harriger, Martijn P. van den Heuvel, and Olaf Sporns, “Rich club organization of macaque cerebral cortex and its
role in network communication,” PLOS ONE 7, 1–13 (2012).
[18] Martijn P van den Heuvel, René S Kahn, Joaquín Goñi, and Olaf Sporns, “High-cost, high-capacity backbone for global
brain communication,” Proceedings of the National Academy of Sciences 109, 11372–11377 (2012).
[19] Guusje Collin, Olaf Sporns, René CW Mandl, and Martijn P van den Heuvel, “Structural and functional aspects relating
to cost and benefit of rich club organization in the human cerebral cortex,” Cerebral Cortex 24, 2258–2267 (2014).
[20] Athen Ma and Raúl J. Mondragón, “Rich-cores in networks,” PLOS ONE 10, 1–13 (2015).
[21] Matteo Cinelli, Giovanna Ferraro, and Antonio Iovanella, “Resilience of core-periphery networks in the case of rich-club,”
Complexity 2017 (2017).
[22] David S. Grayson, Siddharth Ray, Samuel Carpenter, Swathi Iyer, Taciana G. Costa Dias, Corinne Stevens, Joel T. Nigg,
and Damien A. Fair, “Structural and functional rich club organization of the brain in children and adults,” PLOS ONE
9, 1–13 (2014).
[23] Mark EJ Newman, “Mixing patterns in networks,” Phys. Rev. E 67, 026126 (2003).
[24] Juyong Park and Albert-László Barabási, “Distribution of node characteristics in complex networks,” Proceedings of the
National Academy of Sciences 104, 17916–17920 (2007).
[25] Ginestra Bianconi, Paolo Pin, and Matteo Marsili, “Assessing the relevance of node features for network structure,”
Proceedings of the National Academy of Sciences 106, 11433–11438 (2009).
[26] Leto Peel, Daniel B Larremore, and Aaron Clauset, “The ground truth about metadata and community detection in
networks,” Science advances 3, e1602548 (2017).
[27] Darko Hric, Tiago P Peixoto, and Santo Fortunato, “Network structure, metadata, and the prediction of missing nodes
and annotations,” Physical Review X 6, 031038 (2016).
[28] Mark EJ Newman and Aaron Clauset, “Structure and inference in annotated networks,” Nature Communications 7, 11863
(2016).
[29] Sergei Maslov and Kim Sneppen, “Specificity and stability in topology of protein networks,” Science 296, 910–913 (2002).
[30] Vittoria Colizza, Alessandro Flammini, M Angeles Serrano, and Alessandro Vespignani, “Detecting rich-club ordering in
complex networks,” Nature physics 2, 110–115 (2006).
[31] Alessandro Muscoloni and Carlo Vittorio Cannistraci, “Rich-clubness test: how to determine whether a complex network
has or doesn’t have a rich-club?” arXiv preprint arXiv:1704.03526 (2017).
[32] M. Ángeles Serrano, “Rich-club vs rich-multipolarization phenomena in weighted networks,” Phys. Rev. E 78, 026101
(2008).
[33] Tore Opsahl, Vittoria Colizza, Pietro Panzarasa, and Jose J Ramasco, “Prominence and control: the weighted rich-club
effect,” Physical review letters 101, 168702 (2008).
[34] Vinko Zlatic, Ginestra Bianconi, Albert Díaz-Guilera, Diego Garlaschelli, Francesco Rao, and Guido Caldarelli, “On the
rich-club effect in dense and weighted networks,” The European Physical Journal B 67, 271–275 (2009).
[35] Julian J McAuley, Luciano da Fontoura Costa, and Tibério S Caetano, “Rich-club phenomenon across complex network
hierarchies,” Applied Physics Letters 91, 084103 (2007).
[36] Lucas Daniel Valdez, Pablo Alejandro Macri, H Eugene Stanley, and Lidia Adriana Braunstein, “Triple point in correlated
interdependent networks,” Physical Review E 88, 050803 (2013).
[37] Kamal Berahmand, Negin Samadi, and Seyed Mahmood Sheikholeslami, “Effect of rich-club on diffusion in complex
networks,” International Journal of Modern Physics B 32, 1850142 (2018).
[38] Máté Csigi, Attila Kőrösi, József Bíró, Zalán Heszberger, Yury Malkov, and András Gulyás, “Geometric explanation of
the rich-club phenomenon in complex networks,” Scientific Reports 7 (2017).
[39] Ed Bullmore and Olaf Sporns, “Complex brain networks: graph theoretical analysis of structural and functional systems,”
Nature Reviews Neuroscience 10, 186–198 (2009).
[40] Olaf Sporns, Dante R Chialvo, Marcus Kaiser, and Claus C Hilgetag, “Organization, development and function of complex
brain networks,” Trends in cognitive sciences 8, 418–425 (2004).
[41] Gorka Zamora-López, Changsong Zhou, and Jürgen Kurths, “Exploring brain function from anatomical connectivity,”
Frontiers in neuroscience 5, 83 (2011).
[42] Shweta Bansal, Shashank Khandelwal, and Lauren Ancel Meyers, “Exploring biological network structure with clustered
random networks,” BMC bioinformatics 10, 1 (2009).
[43] R. J. Mondragón, “Network null-model based on maximal entropy and the rich-club,” Journal of Complex Networks 2,
288–298 (2014).
[44] Marián Boguná, Romualdo Pastor-Satorras, and Alessandro Vespignani, “Cut-offs and finite size effects in scale-free
networks,” The European Physical Journal B 38, 205–209 (2004).
[45] David Laniado, Yana Volkovich, Karolin Kappler, and Andreas Kaltenbrunner, “Gender homophily in online dyadic and
triadic relationships,” EPJ Data Science 5, 19 (2016).
[46] Nahuel Almeira, Ana L Schaigorodsky, Juan I Perotti, and Orlando V Billoni, “Structure constrained by metadata in
networks of chess players,” Scientific reports 7, 15186 (2017).
[47] Gabor Csardi and Tamas Nepusz, “The igraph software package for complex network research,” InterJournal Complex
Systems, 1695 (2006).
[48] Christopher G. Watson, brainGraph: Graph Theory Analysis of Brain MRI Data (2018), R package version 2.6.0.
[49] Thomas U. Grund and James A. Densley, “Ethnic homophily and triad closure: Mapping internal gang struc-
ture using exponential random graph models,” Journal of Contemporary Criminal Justice 31, 354–370 (2015),
https://doi.org/10.1177/1043986214553377.
[50] Salvatore Villani, Michele Mosca, and Mauro Castiello, “A virtuous combination of structural and skill analysis to defeat
organized crime,” Socio-Economic Planning Sciences (2018), https://doi.org/10.1016/j.seps.2018.01.002.
[51] Thomas W Valente, Kathryn Coronges, Cynthia Lakon, and Elizabeth Costenbader, “How correlated are network centrality
measures?” Connections (Toronto, Ont.) 28, 16 (2008).
[52] Shahar Ronen, Bruno Gonçalves, Kevin Z Hu, Alessandro Vespignani, Steven Pinker, and César A Hidalgo, “Links that
speak: The global language network and its association with global fame,” Proceedings of the National Academy of Sciences
111, E5616–E5622 (2014).
[53] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical
Computing, Vienna, Austria (2008), ISBN 3-900051-07-0.
[54] Michael Szell and Roberta Sinatra, “Research funding goes to rich clubs,” Proceedings of the National Academy of Sciences
112, 14749–14750 (2015).
[55] Ezra W. Zuckerman and John T. Jost, “What makes you think you’re so popular? self-evaluation maintenance and the
subjective side of the ‘friendship paradox’,” Social Psychology Quarterly 64, 207–223 (2001).
[56] Young-Ho Eom and Hang-Hyun Jo, “Generalized friendship paradox in complex networks: The case of scientific collabo-
ration,” Scientific reports 4, srep04603 (2014).
[57] Ron Milo, Shai Shen-Orr, Shalev Itzkovitz, Nadav Kashtan, Dmitri Chklovskii, and Uri Alon, “Network motifs: simple
building blocks of complex networks,” Science 298, 824–827 (2002).