All content in this area was uploaded by Matteo Cinelli on Feb 20, 2019

Generalized Rich-Club Ordering in Networks

Matteo Cinelli¹

Department of Enterprise Engineering, Tor Vergata, Via del Politecnico, 1 - 00133 Rome, Italy

Abstract Rich-club ordering refers to the tendency of nodes with a high degree to be more interconnected than expected. In this paper we consider the concept of rich-club ordering when generalized to structural measures that differ from the node degree and to non-structural measures (i.e. to node metadata). The differences in considering rich-club ordering (RCO) with respect to both structural and non-structural measures are then discussed in terms of the employed coefficients and of the appropriate null models (link rewiring vs. metadata reshuffling). Once a framework for the evaluation of generalized rich-club ordering (GRCO) is defined, we investigate this phenomenon in real networks provided with node metadata. By considering different notions of node richness, we compare structural and non-structural rich-club ordering, observing how external information about the network nodes is able to validate the presence of rich-clubs in networked systems.

Keywords rich-club, node metadata, generalized, null model

I. INTRODUCTION

Networks are characterized by a number of topological properties that provide important insights into their functional aspects. Well-known examples are the presence of communities [1], i.e. subgraphs whose nodes have a higher probability to be linked to every node of the subgraph than to any other node of the graph [2], or of core-periphery structures [3], i.e. structures that allow for the partitioning of the network into a set of central and densely connected nodes (the core) and a set of noncentral and sparsely connected nodes (the periphery) [4]. When the hubs of a network are densely interconnected (i.e. they form a tight subgraph often referred to as the core), the network is said to display a rich-club [5]. The presence of a rich-club is quantified through the rich-club coefficient $\phi(k)$, which measures the ratio between the number of links among the nodes having degree higher than a given value $k$ and the maximum possible number of links among such nodes. The ratio between the rich-club coefficient and its expectation over a set of rewired networks with the same degree sequence as the original one is called $\phi(k)_{\mathrm{norm}}$, and a network is said to display rich-club ordering when $\phi(k)_{\mathrm{norm}} > 1$. This phenomenon has been extensively investigated [6–11] and recognized in several real networks [12–15], with a special focus on neuroscience [16–19]. Such developments have fostered further research on the network core, such as its size [6, 20], its contribution to network resilience [21] and its functional role [22].

Here we consider the concept of rich-club ordering by investigating the interconnections between nodes that are considered important from a number of different perspectives. Building on this, generalized rich-club ordering (GRCO) refers to the tendency of important nodes (under a certain declared point of view) to form a core denser than expected. The importance of nodes can be evaluated from a structural point of view, e.g. through the node degree or other nodal centrality measures, and from a non-structural point of view, e.g. through the node metadata. Node metadata are non-structural information, such as social or technical attributes, related to network nodes that possibly display a certain correlation with the observed network structure, and their importance for understanding networked systems is increasingly being recognized [23–28]. Additionally, node metadata represent exogenous information about the network nodes (also in weighted networks) that may be impossible to split over the network links. It follows that the study of GRCO becomes particularly interesting when dealing with networks with various node metadata; as such, we aim to investigate the interrelation between such node metadata and the network structure.

For instance, if we consider a social network in which the individuals' incomes are known, we may find that the nodes with the highest incomes, which are not necessarily hubs, are more interconnected than expected, while those with the highest degree are not. Moreover, it is important to recall that, although rich-club ordering and assortativity are two related concepts, positive assortativity does not necessarily imply rich-club ordering, and vice versa [7].

¹ matteo.cinelli@uniroma2.it

FIG. 1. Two toy networks with the same topology for which rich-club ordering can be evaluated with respect to structural and non-structural measures. On the left, the node labels correspond to their wealth, while on the right, the node labels correspond to their degree.

In the network from Figure 1 we have a slightly wealth-disassortative network ($r_{\mathrm{wealth}} = -0.052$) in which, conversely, the wealthiest nodes (five of them if we set the wealth threshold to $w > 93$) are tightly connected (they share 7 links out of a possible 10) despite the fact that they are not the hubs of the network. Moreover, this network, which displays $r_{\mathrm{degree}} = -0.282$, does not show rich-club ordering with respect to degree for any value of $k$ (i.e. $\phi(k)_{\mathrm{norm}} < 1 \ \forall k$). This means that rich-club ordering with respect to node metadata does not imply rich-club ordering with respect to node degree, and vice versa. More generally, in the case of rich-club ordering in terms of node degree it is easy to compute $\phi(k)_{\mathrm{norm}}$, because we know which null model to use (degree-preserving rewiring [29]), while in terms of node wealth the situation becomes trickier, since wealth cannot be directly considered a structural property. For this reason, in the following sections we provide a framework for the evaluation of GRCO, together with specific null models for evaluating the significance of rich-club ordering in the case of node metadata.
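As a quick check of the numbers quoted for Figure 1, the density of links among the five wealthiest nodes can be written as the ratio of observed to possible links. The symbols $E_{>w}$ and $N_{>w}$ are notation assumed here for the number of links among, and the number of, nodes with wealth above $w$:

```latex
\phi(w > 93) = \frac{2E_{>w}}{N_{>w}(N_{>w}-1)} = \frac{2 \cdot 7}{5 \cdot 4} = 0.7
```

This is the raw (unnormalized) density only; whether it is larger than expected depends on the null model used for comparison.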

II. RELATED WORKS

The concept of rich-club ordering was initially introduced by Zhou and Mondragón [5] in order to analyze the Internet topology at the Autonomous Systems level and to provide a reasonable explanation as to why this kind of network includes tightly interconnected hubs. In order to investigate the presence of a rich-club, the authors of [5] introduced the rich-club coefficient $\phi(r)$ in terms of the rank $r$ of the node (the sequence of ranks reflects the sequence of node degrees arranged in non-increasing order).

After the contribution of [5], Colizza et al. [30] considered the rich-club coefficient $\phi(k)$ in terms of the degree $k$ of the node (here the ordering is the opposite of that of the degree sequence, i.e. the node degrees are arranged in non-decreasing order). Since the work of [30], the rich-club coefficient has mostly been used in its $\phi(k)$ version; however, the two coefficients yield identical results.

Despite the different formulations of the rich-club coefficient, the fundamental contribution of [30] derives from the use of a null model to detect the presence of the rich-club. This null model exploits the procedure of degree-preserving rewiring introduced in [29] in order to compute the expected rich-club coefficient and, from it, the normalized coefficient $\phi(k)_{\mathrm{norm}}$. The reason behind the necessity of a null model for studying rich-club ordering is the observation of the monotonically increasing behavior of the rich-club coefficient $\phi(k)$. This observation was subsequently disputed by the authors of [7], who also discussed the interrelation of the assortativity coefficient [23] with the rich-club structure and introduced a null model able to preserve the density of the rich-club.

Other works concerning the evaluation of rich-club ordering are related to its statistical significance [10, 31], in terms of p-value under different null models; to its effect on other structural measures, such as the clustering coefficient and degree assortativity [8], which are strongly influenced by the rich-club density; and to the improvement of the rich-club coefficient itself [6], which allows one to consider the constraints introduced by the degree sequence of the network.

Together with its implementation on unweighted networks, the concept of rich-club ordering has also been extended to weighted networks by using various null models that are able to preserve different aspects of the network, including the degree and strength distributions [32, 33]. Other extensions of the rich-club involve dense [34], hierarchical [35] and interdependent networks [36]. Moreover, other contributions discuss the importance of a rich-club in network robustness [21] and spreading processes [37], as well as the generative processes and dynamics that lead to networks displaying a rich-club [9, 38].

Finally, another important aspect related to the presence of a rich-club is the measurement of its size in terms of number of nodes. This has been achieved through the persistence probability of a random walker in the network [20] and through the number of nodes that are necessary to realize a complete subgraph from the degree sequence of the considered network [6].

III. EVALUATING RICH-CLUB ORDERING FOR THE NODE DEGREE

Rich-club ordering can be quantified using the coefficient $\phi(k)$:

$$\phi(k) = \frac{2E_{>k}}{N_{>k}(N_{>k}-1)} \qquad (1)$$

where $E_{>k}$ is the number of links among the $N_{>k}$ nodes having degree higher than a given value $k$ and $\frac{N_{>k}(N_{>k}-1)}{2}$ is the maximum possible number of links among the $N_{>k}$ nodes. Therefore, $\phi(k)$ measures the fraction of links connecting the $N_{>k}$ nodes out of the maximum number of links they might possibly share. This implies that $\phi(k) = 1$ when the $N_{>k}$ nodes are arranged into a clique.

When rich-club ordering is investigated, the rich-club coefficient needs to be compared against a null model in order to evaluate its significance (i.e. to test that the presence of rich-club ordering is not a natural consequence of the considered degree sequence). The use of null models and the normalization of structural measures are widely used in complex networks to understand whether an observed pattern could have arisen by chance. For this reason, the normalization of the rich-club coefficient, suggested in [30] and adopted in many further studies [10, 17, 34, 35, 39–41], is a necessary step for assessing the significance of this index.

The normalization procedure of $\phi(k)$ involves an ensemble of rewired networks which have the same degree sequence as the one under investigation and which, if generated in a sufficiently large number, provide a null distribution of the rich-club coefficient. The rewiring procedure itself is simple: at each step it chooses two arbitrary edges, for instance $(a,b)$ and $(c,d)$, and exchanges their endpoints so as to obtain $(a,d)$ and $(c,b)$ [29]; if one or both of these new links already exist in the network, the step is aborted and a new pair of links is selected. This procedure has been widely adopted since it preserves an important network parameter, namely the node degrees; however, other procedures that aim at preserving other parameters may be adopted [7, 9, 31, 42, 43].
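The coefficient of Equation 1 can be sketched in a few lines of plain Python. This is a minimal illustration, not the author's implementation: the adjacency-set representation, the helper names `build_adjacency` and `rich_club_coefficient`, and the toy graph (a 4-clique with one pendant node attached to each clique node) are all assumptions made for the example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Return {node: set(neighbours)} for an undirected edge list (hypothetical helper)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def rich_club_coefficient(adj, k):
    """phi(k) as in Equation 1: link density among nodes with degree > k.

    Returns None when fewer than two nodes have degree > k, since the
    denominator N(N-1) would be zero.
    """
    rich = [v for v in adj if len(adj[v]) > k]
    n_rich = len(rich)
    if n_rich < 2:
        return None
    # Count each link among rich nodes once.
    e_rich = sum(1 for u, v in combinations(rich, 2) if v in adj[u])
    return 2 * e_rich / (n_rich * (n_rich - 1))


# Toy graph (assumed for illustration): clique {0,1,2,3} plus pendants 4..7.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (0, 4), (1, 5), (2, 6), (3, 7)]
adj = build_adjacency(edges)

for k in range(5):
    print(k, rich_club_coefficient(adj, k))
```

In this toy graph the nodes with degree above 1 are exactly the clique nodes, so the coefficient reaches its maximum there; the normalization against a null model is what tells whether such a value is unexpected.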

In general, the normalized rich-club coefficient [30] is defined as:

$$\phi(k)_{\mathrm{norm}} = \frac{\phi(k)}{\phi(k)_{\mathrm{rand}}} \qquad (2)$$

where $\phi(k)_{\mathrm{rand}}$ is the average rich-club coefficient across the set of rewired networks (typically 1000 networks [15, 18, 19]); we observe rich-club ordering when $\phi(k)_{\mathrm{norm}} > 1$.
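The rewiring step and the normalization of Equation 2 can be sketched together as follows. This is an illustrative sketch under stated assumptions: the function names, the `swaps_per_edge` parameter, the cap on attempts and the small default ensemble size are choices made here, and the rejection of self-loops is an addition to the duplicate-link check described in the text. Node labels are assumed to be sortable.

```python
import random
from itertools import combinations


def phi(adj, k):
    """Rich-club coefficient of Equation 1 (None if fewer than two rich nodes)."""
    rich = [v for v in adj if len(adj[v]) > k]
    if len(rich) < 2:
        return None
    links = sum(1 for u, v in combinations(rich, 2) if v in adj[u])
    return 2 * links / (len(rich) * (len(rich) - 1))


def rewire(adj, n_swaps, rng):
    """Degree-preserving rewiring: (a,b),(c,d) -> (a,d),(c,b)."""
    new = {v: set(nb) for v, nb in adj.items()}
    edges = [(u, v) for u in new for v in new[u] if u < v]
    if len(edges) < 2:
        return new
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:  # cap is an assumption
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # random orientation of the second edge
            c, d = d, c
        # Abort the step if a new link already exists or would be a self-loop.
        if a == d or c == b or d in new[a] or b in new[c]:
            continue
        new[a].remove(b); new[b].remove(a)
        new[c].remove(d); new[d].remove(c)
        new[a].add(d); new[d].add(a)
        new[c].add(b); new[b].add(c)
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return new


def phi_norm(adj, k, n_random=100, swaps_per_edge=10, seed=0):
    """Equation 2: observed phi(k) over its mean across rewired networks."""
    observed = phi(adj, k)
    if observed is None:
        return None
    rng = random.Random(seed)
    n_edges = sum(len(nb) for nb in adj.values()) // 2
    values = [phi(rewire(adj, swaps_per_edge * n_edges, rng), k)
              for _ in range(n_random)]
    mean_rand = sum(values) / len(values)
    return observed / mean_rand if mean_rand > 0 else None


# Toy graph (assumed): 4-clique {0,1,2,3} with one pendant node per clique node.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3, 5}, 2: {0, 1, 3, 6}, 3: {0, 1, 2, 7},
       4: {0}, 5: {1}, 6: {2}, 7: {3}}
print(phi_norm(adj, 1, n_random=20))
```

The toy graph is chosen because its degree sequence forces the four hubs into a clique in every rewired instance, so the raw coefficient is maximal while the normalized one does not exceed 1. This is the situation the normalization is meant to detect.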

Additionally, in [7] it is argued that when the maximum degree $k_{max}$ of the considered network is larger than the cut-off degree $k_s$ [44] (i.e. the degree above which it is impossible to obtain networks with no degree-degree correlation), degree-preserving rewiring could produce randomized networks with a rich-club coefficient that is too close to the initial one. This is because the rewiring procedure, in which pairs of links are uniformly sampled, could disrupt several high-degree to low-degree connections and consequently create high-degree to high-degree connections. Indeed, since a high proportion of links is attached to hubs, the probability of picking a pair of links whose endpoints are hubs is relatively high, particularly when there are nodes with degree $k_{max} > k_s$.

IV. NULL MODELS FOR THE EVALUATION OF RICH-CLUB ORDERING

The evaluation of rich-club ordering in the case of degree exploits a null model that rewires the network while keeping its degree sequence. As the degree can be considered a structural attribute of the node, a null model that evaluates different network topologies (i.e. alters the original network structure while keeping certain fundamental properties) constitutes a reasonable choice. The same choice also seems reasonable in the case of other structural properties of the node (such as centrality measures), even though degree-preserving rewiring does not keep the centrality values of the nodes unchanged, because the topology is subject to change.

In the case of non-structural attributes (i.e. node metadata), structural rewiring does not seem to be the only option. Indeed, we may be interested in knowing whether different arrangements of the node metadata over the same network structure are able to unveil rich-club ordering as well. In other words, we may also be interested in using a null model that keeps the original network structure while reshuffling the node metadata.

More intuitively, when we evaluate rich-club ordering with link rewiring we are basically asking the question: does the considered network possess a topology so unusual that it allows room for rich-club ordering? Alternatively, if we evaluate rich-club ordering with metadata reshuffling we are basically asking the question: does the considered network possess an arrangement of node metadata so unusual that it allows room for rich-club ordering?

As an example, let us suppose that a relatively large clique of wealthy nodes is present in a certain social network, like that in the example of Section I. The two questions from above then become:

1. Is it so peculiar, given the degree of the wealthy nodes, to observe a realization that contains a clique made up of such nodes?

2. Is it so peculiar, given the network structure, to observe a distribution of the node metadata such that the wealthy nodes are arranged into a clique?

Consequently, we may be interested in understanding whether the metadata distribution is related to the presence of structural rich-club ordering, and whether rich-club ordering evaluated with respect to node metadata can be interpreted as a reinforcement (or a weakening) of the evidence of structural rich-club ordering. Indeed, the use of node metadata takes into account an additional layer of information that derives from the coupling between the network structure and the node metadata.

Thus, in order to evaluate rich-club ordering in the case of node metadata we suggest a comparison between the number of links observed among the rich nodes and the average number of links observed among such rich nodes over two different ensembles: the former made up of networks obtained via the rewiring of links; the latter made up of vectors of metadata obtained via the reshuffling of node attributes.

These two methods generate different ensembles in which different aspects of the original network are kept. In the first case (link rewiring) we lose the original network topology while we keep its degree sequence and its degree-attribute correlation. In the second case (metadata reshuffling) we lose the degree-attribute correlation but we keep the original network structure. Both methods, despite their clear differences, seem to provide a valid basis of comparison in the case of node metadata. Moreover, when we evaluate GRCO considering node metadata, if the node metadata and the node degrees do not display a significant correlation (either positive or negative), we could observe a set of rich nodes with very heterogeneous degrees. This does not imply, however, the absence of a dense subgraph made up of rich nodes. As an example, if we suppose that a subgraph of 15 nodes has the highest metadata values, then the degree of such nodes has to be at least 14 in order to realize a clique that is connected to the rest of the graph; thus, in a sufficiently large network, such nodes do not necessarily have to be hubs in order to establish a rich-club from the metadata point of view. It follows that a positive correlation between node metadata and node degree has an effect on the rich-club evaluation and that, in certain cases, the presence of rich-club ordering with respect to node degree could also indicate rich-club ordering with respect to node metadata.
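What each ensemble keeps can be checked on a small example. The sketch below is illustrative only: the toy graph, the metadata values and the helper names are assumptions, and the rewired instance is built by hand with a single endpoint swap, (3,4),(2,5) to (3,5),(2,4), rather than by a full rewiring routine.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_attribute_correlation(adj, meta):
    nodes = sorted(adj)
    return pearson([len(adj[v]) for v in nodes], [meta[v] for v in nodes])


def reshuffle_metadata(meta, rng):
    """Keep the structure untouched; permute the metadata values over the nodes."""
    nodes = list(meta)
    values = [meta[v] for v in nodes]
    rng.shuffle(values)
    return dict(zip(nodes, values))


# Toy data (assumed): the metadata are proportional to the degree.
original = adjacency([(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (2, 5)])
rewired = adjacency([(0, 1), (0, 2), (0, 3), (1, 2), (3, 5), (2, 4)])
meta = {0: 30, 1: 20, 2: 30, 3: 20, 4: 10, 5: 10}

# Link rewiring: topology changes, degrees (and hence the correlation) do not.
print(degree_attribute_correlation(original, meta))
print(degree_attribute_correlation(rewired, meta))

# Metadata reshuffling: topology is kept, the correlation is generally lost.
shuffled = reshuffle_metadata(meta, random.Random(0))
print(degree_attribute_correlation(original, shuffled))
```

Because rewiring leaves every node with its original degree and its original attribute, the degree-attribute correlation is preserved by construction, whereas reshuffling only preserves the multiset of metadata values.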

As an extreme case, if we have perfect positive correlation between the considered attribute and the node degree, $\rho_{x,k} = 1$, the sorting of the node metadata will correspond to the sorting of the node degrees (i.e. to the degree sequence). Therefore, we will evaluate rich-club ordering with the same sorting of nodes but with implications that will differ depending on the null model that we choose to adopt. Additionally, we should consider that the degree-attribute correlation, when significant, represents an important feature of the considered network which, if not completely dropped by the null model, provides a further element to stress the presence of rich-club ordering. Therefore, rich-club ordering in the case of node metadata could be evaluated with respect to random reshuffling, which provides an ensemble of reshuffled labels that are, in general, uncorrelated with the network structure in terms of node degrees. It could also be stressed with respect to a reshuffling procedure that, similarly to [45], keeps the distribution of the degree-attribute correlation somewhat closer to that of the original network.

In order to address this point, in Section IX we introduce a procedure which, by keeping a certain node-attribute correlation, aims at further stressing the presence of rich-club ordering by comparing it with a somewhat unfavorable set of metadata shuffles.

V. EVALUATING RICH-CLUB ORDERING FOR STRUCTURAL MEASURES DIFFERENT FROM DEGREE

When evaluating rich-club ordering (i.e. when computing $\phi_{\mathrm{norm}}$) with respect to structural measures different from node degree, we should take into account that degree-preserving rewiring entails the two following aspects:

1. The structural measure of node $i$ may change its value due to rewiring.

2. The number of nodes that retain a value of the considered structural measure above the threshold for which we evaluate rich-club ordering may change.

In order to address this problem we evaluate rich-club ordering by creating a ranking of nodes; in other words, we consider the rich-club coefficient as a measure of position. For each network in the random ensemble, we rank the nodes in non-decreasing order of the considered measure and assign each of them a position $p$ with $p \in [1, N]$. Then we consider the number of links among the nodes that have a rank greater than a given value $p$. The rationale is that, while the degree sequence is fixed across rewired networks, the rewiring procedure may alter the values of the structural measure associated with each node and, consequently, the number of nodes with a certain value of that measure. Ranking allows us to consider the same number of nodes at each iteration (which corresponds to keeping the denominator of $\phi_i$ constant for a certain $i$) both in the original network and in the randomized ensemble. In this way, for each network, the node with the lowest value of the considered measure is in position 1 while that with the highest value is in position $N$, despite possible differences in the highest and lowest values among different networks.

Therefore, in order to compute $\phi(p)$ we compute the density of connections among nodes whose index of position is greater than $p$:

$$\phi(p) = \frac{2E_{>p}}{N_{>p}(N_{>p}-1)} \qquad (3)$$

where $E_{>p}$ is the number of edges among the $N_{>p}$ nodes with centrality value greater than the value in position $p$ and $\frac{N_{>p}(N_{>p}-1)}{2}$ is the maximum possible number of edges among the $N_{>p}$ nodes.

By using this procedure we obtain $\phi(p)_{\mathrm{norm}} = \frac{\phi(p)}{\phi(p)_{\mathrm{rand}}}$, where $\phi(p)_{\mathrm{rand}}$ is the average of $\phi(p)$ over the random ensemble. It is worth mentioning that this way of computing the coefficient $\phi(p)$ (as a measure of position) is similar to that proposed in the paper that originally discussed rich-club ordering [5]. Additionally, this measure is also related to the rich-club coefficient for weighted networks (i.e. networks with non-binary links) proposed in [33]. Indeed, the authors of [33] consider structural measures such as the node strength (i.e. the sum of the weights attached to the links of a certain node) and the average weight (i.e. the ratio between the node strength and the node degree), and they normalize the rich-club coefficient with a method that reshuffles the weights over the links and then the links themselves.
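The position-based coefficient of Equation 3 can be sketched as follows, using eigenvector centrality as the structural measure. This is a minimal sketch under stated assumptions: the function names and the toy graph are hypothetical, ties in the ranking are broken by the stable sort order, and the centrality is approximated by power iteration on $A + I$ (a shift that leaves the leading eigenvector unchanged and avoids oscillations on bipartite graphs).

```python
from itertools import combinations


def eigenvector_centrality(adj, n_iter=200):
    """Approximate leading eigenvector by power iteration on A + I."""
    x = {v: 1.0 for v in adj}
    for _ in range(n_iter):
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = sum(val * val for val in new.values()) ** 0.5
        x = {v: val / norm for v, val in new.items()}
    return x


def rich_club_by_rank(adj, score, p):
    """phi(p) as in Equation 3: link density among nodes in positions > p.

    Nodes are sorted in non-decreasing order of `score` and occupy
    positions 1..N, so exactly N - p nodes are retained for every network.
    """
    ranked = sorted(adj, key=lambda v: score[v])
    rich = ranked[p:]  # positions p+1, ..., N
    n_rich = len(rich)
    if n_rich < 2:
        return None
    e_rich = sum(1 for u, v in combinations(rich, 2) if v in adj[u])
    return 2 * e_rich / (n_rich * (n_rich - 1))


# Toy graph (assumed): 4-clique {0,1,2,3} with one pendant node per clique node.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3, 5}, 2: {0, 1, 3, 6}, 3: {0, 1, 2, 7},
       4: {0}, 5: {1}, 6: {2}, 7: {3}}
score = eigenvector_centrality(adj)

for p in range(len(adj)):
    print(p, rich_club_by_rank(adj, score, p))
```

Because the number of retained nodes depends only on $p$, the denominator is the same in the original network and in every rewired instance, which is the property the ranking is meant to guarantee.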

VI. EVALUATING RICH-CLUB ORDERING FOR NON-STRUCTURAL MEASURES

The rich-club coefficient in the case of node metadata can be computed only for scalar metadata, as we need a quantity which, like the node degree or other structural measures, can be sorted in a certain order. The coefficient can be easily derived from the case of node degree by considering, instead of the degree $k$, a certain value $m$ corresponding to the value of the node metadata. Therefore, rich-club ordering can be discovered via the coefficient $\phi(m)$:

$$\phi(m) = \frac{2E_{>m}}{N_{>m}(N_{>m}-1)} \qquad (4)$$

where $E_{>m}$ is the number of edges among the $N_{>m}$ nodes having metadata value higher than a given value $m$ and $\frac{N_{>m}(N_{>m}-1)}{2}$ is the maximum possible number of edges among the $N_{>m}$ nodes.

The normalized rich-club coefficient, $\phi(m)_{\mathrm{norm}}$, can be derived by considering $m$ as the value corresponding to a certain value of the node metadata, whilst considering $\phi(m)_{\mathrm{rand}}$ from two different perspectives. In other words, in the case of node metadata, we obtain two values of $\phi(m)_{\mathrm{rand}}$ that depend on the null model that we use. In the case of link rewiring, $\phi(m)_{\mathrm{rand}}$ is called $\phi(m)^{\mathrm{rew}}_{\mathrm{rand}}$ and we use, as for example in [46], the coefficient:

$$\phi(m)^{\mathrm{rew}}_{\mathrm{norm}} = \frac{\phi(m)}{\phi(m)^{\mathrm{rew}}_{\mathrm{rand}}} \qquad (5)$$

while in the case of metadata reshuffling, $\phi(m)_{\mathrm{rand}}$ is called $\phi(m)^{\mathrm{resh}}_{\mathrm{rand}}$ and we use the coefficient:

$$\phi(m)^{\mathrm{resh}}_{\mathrm{norm}} = \frac{\phi(m)}{\phi(m)^{\mathrm{resh}}_{\mathrm{rand}}} \qquad (6)$$

Finally, it is worth adding that, in the case of non-structural measures, neither the rewiring nor the reshuffling procedure affects the values of the metadata vector, whose entries, in the latter case, only change position.
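The metadata coefficient of Equation 4 and its reshuffling-based normalization (Equation 6) can be sketched as below. The sketch is illustrative: the function names, the toy graph, the wealth values and the ensemble size are assumptions. The rewiring-based version of Equation 5 would instead keep the metadata fixed and average $\phi(m)$ over degree-preserving rewired networks.

```python
import random
from itertools import combinations


def phi_m(adj, meta, m):
    """phi(m) as in Equation 4: link density among nodes with metadata > m."""
    rich = [v for v in adj if meta[v] > m]
    n_rich = len(rich)
    if n_rich < 2:
        return None
    e_rich = sum(1 for u, v in combinations(rich, 2) if v in adj[u])
    return 2 * e_rich / (n_rich * (n_rich - 1))


def phi_m_resh_norm(adj, meta, m, n_random=1000, seed=0):
    """Equation 6: phi(m) over its mean across reshuffled metadata vectors."""
    observed = phi_m(adj, meta, m)
    if observed is None:
        return None
    rng = random.Random(seed)
    nodes = list(adj)
    values = [meta[v] for v in nodes]
    total = 0.0
    for _ in range(n_random):
        # Reshuffling only moves the values across nodes; the number of
        # nodes above the threshold m is therefore the same in every draw.
        rng.shuffle(values)
        total += phi_m(adj, dict(zip(nodes, values)), m)
    mean_rand = total / n_random
    return observed / mean_rand if mean_rand > 0 else None


# Toy graph (assumed): 4-clique {0,1,2,3} with one pendant node per clique node;
# the hypothetical wealth values place the four richest nodes on the clique.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3, 5}, 2: {0, 1, 3, 6}, 3: {0, 1, 2, 7},
       4: {0}, 5: {1}, 6: {2}, 7: {3}}
wealth = {0: 94, 1: 97, 2: 98, 3: 99, 4: 10, 5: 20, 6: 36, 7: 8}

print(phi_m(adj, wealth, 93))
print(phi_m_resh_norm(adj, wealth, 93, n_random=200))
```

Here the network structure is never modified: only the assignment of values to nodes changes, which is why the threshold selects the same number of nodes in the observed and in the reshuffled cases.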


VII. A FRAMEWORK FOR THE EVALUATION OF GENERALIZED RICH-CLUB ORDERING (GRCO)

In Table I we propose a framework for the evaluation of generalized rich-club ordering (GRCO)².

Consider a certain node feature. It can be:

1. Structural

2. Non-structural

If structural, it can be:

(a) degree: compute $\phi(k)_{\mathrm{norm}}$ with degree-preserving rewiring, as in Equation 2;

(b) different from degree: compute $\phi(p)_{\mathrm{norm}}$ with degree-preserving rewiring, as in Equation 3.

If non-structural:

(a) compute $\phi(m)^{\mathrm{rew}}_{\mathrm{norm}}$ with degree-preserving rewiring, as in Equation 5;

(b) compute $\phi(m)^{\mathrm{resh}}_{\mathrm{norm}}$ with metadata reshuffling, as in Equation 6.

TABLE I. Generalized framework for the evaluation of rich-club ordering
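The decision logic of Table I can be written as a small lookup. This is only a sketch of the table, with a hypothetical function name and string labels chosen for the example; it returns which coefficient to compute, with which null model, and the equation it refers to.

```python
def grco_plan(feature_kind, is_degree=False):
    """Return the (coefficient, null model, equation) entries of Table I.

    feature_kind: "structural" or "non-structural" (labels assumed here).
    is_degree: only meaningful for structural features.
    """
    if feature_kind == "structural":
        if is_degree:
            return [("phi(k)_norm", "degree-preserving rewiring", "Equation 2")]
        return [("phi(p)_norm", "degree-preserving rewiring", "Equation 3")]
    if feature_kind == "non-structural":
        # Node metadata are evaluated against both null models.
        return [
            ("phi(m)^rew_norm", "degree-preserving rewiring", "Equation 5"),
            ("phi(m)^resh_norm", "metadata reshuffling", "Equation 6"),
        ]
    raise ValueError("feature_kind must be 'structural' or 'non-structural'")


for kind, deg in [("structural", True), ("structural", False),
                  ("non-structural", False)]:
    print(kind, deg, grco_plan(kind, deg))
```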

VIII. APPLICATION

A. Social Network

We test the introduced framework in the case of a criminal social network [49]. The network is made up of the relationships ($m = 315$, which we consider unweighted) among confirmed members ($n = 54$) of a London street gang between 2005 and 2009. We choose this network as it comes with various node metadata (which are an important piece of information in criminal networks [50]) such as age, number of arrests and number of convictions. We compute the rich-club coefficient for two structural characteristics of nodes, degree and eigenvector centrality, and for three non-structural characteristics, corresponding to the node metadata, using degree-preserving rewiring ($k_s \simeq k_{max} = 25$) and node metadata reshuffling. In Figure 2, we observe that the considered network displays rich-club ordering with respect to degree ($\phi(k)_{\mathrm{norm}} > 1$). Thus, in this network, hubs happen to be more connected than what we observe, on average, across the rewired network ensemble. The network also displays what we can call power-club ordering, as the nodes with the highest eigenvector centrality are also tightly connected. The latter result is, however, expected since the degree and the eigenvector centrality are, in general, positively correlated [51].

When we consider the node age and the number of arrests we also observe rich-club ordering. In particular, in both cases metadata reshuffling entails a stronger rich-club ordering than link rewiring ($\phi(m)^{\mathrm{resh}}_{\mathrm{norm}} \geq \phi(m)^{\mathrm{rew}}_{\mathrm{norm}} > 1$). This means that the metadata are arranged in a way that elicits rich-club ordering and that this arrangement is hard to replicate via random label reshuffling. The fact that the two measures are both in favour of rich-club ordering indicates that the presence of this phenomenon is far from being random from different perspectives, thus underlining the importance of the interplay between the node metadata and the network topology. Conversely, we observe slightly discordant results when the number of convictions is taken into account. For a certain value of $m$ ($m = 9$), rich-club ordering appears to be absent from the structural point of view and present from the metadata point of view. Such a discrepancy is due to the fact that the nodes with the highest number of convictions (more than nine in this case) also have heterogeneous degrees. Indeed, there are five nodes with more than nine convictions, and they have degrees $d = [2, 2, 2, 14, 16]$. The maximum number of links that those five nodes could share is ten, while in the actual network they only share two links. Such a small number of links is also due to the presence of nodes with very low degree. Moreover, since about 90% of the nodes in the actual network have degree higher than 2 and the network, being relatively dense, displays several complete subgraphs of size 5, we should expect the reshuffled instances to yield a rich-club coefficient below one. In other words, we should expect a higher number of links among highly convicted nodes in randomized networks. In this case, the value of the rich-club coefficient $\phi(m)^{\mathrm{resh}}_{\mathrm{norm}}$ indicates that highly convicted elements tend to avoid each other.

FIG. 2. Curves of the coefficient $\phi_{\mathrm{norm}}$ for the criminal social network. The dashed line marks $\phi_{\mathrm{norm}} = 1$, the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, age (rewiring), age (reshuffling), arrests (rewiring), arrests (reshuffling), convictions (rewiring), convictions (reshuffling).

² R code for the evaluation of GRCO exploits the libraries igraph [47] and brainGraph [48] and is available at https://github.com/cinHELLi

More technically, this analysis entails that the degree heterogeneity of the nodes that we take into account, when related to other elements such as the network density, is able to explain the discrepancies observed between the coefficients $\phi(m)^{\mathrm{rew}}_{\mathrm{norm}}$ and $\phi(m)^{\mathrm{resh}}_{\mathrm{norm}}$. A discordant result between the two coefficients is also partially explained by the low value of the correlation coefficient between the degree and the number of convictions, which is $\rho_{deg,conv} = 0.058$.
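For the five nodes with more than nine convictions, the raw coefficient of Equation 4 follows directly from the counts reported in the text (two observed links out of ten possible):

```latex
\phi(m = 9) = \frac{2E_{>m}}{N_{>m}(N_{>m}-1)} = \frac{2 \cdot 2}{5 \cdot 4} = 0.2
```

This is only the unnormalized value; its interpretation depends on the null-model averages that appear in the denominators of Equations 5 and 6.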

B. Linguistic Network

We consider the global language network in which each node represents a language and links connect languages that are likely to be co-spoken [52]. In more detail, languages are connected according to the frequency of book translations, i.e. two languages are connected if at least one book has been translated from one language to the other. The data are pre-processed in order to consider only the largest connected component of the language network, compatibly with the availability of node metadata. The resulting network has $n = 54$ nodes and $m = 104$ links, and the node metadata are represented by two elements: the GDP (gross domestic product) per capita for a language and the number of speakers of a language. As described in [52], the GDP per capita for a language is measured as the average contribution of a single speaker of language $l$ to the world GDP, and is calculated by adding the contributions of speakers of $l$ to the GDP of every country and dividing the sum by the number of speakers of $l$. The number of speakers of a language is computed using the speaker estimates from the June 14, 2012 version of the Wikipedia Statistics page, as explained in [52]. In Figure 3 we observe how the global language network tends to display rich-club ordering from both a structural and a non-structural point of view. It is interesting that languages with a relatively high number of speakers are, in some instances, less interconnected than expected, meaning that certain books originally written in a widely spoken language were not directly translated into the other most spoken languages. It is likely that such books were first translated into other, less spoken languages, which acted as intermediaries.

FIG. 3. Curves of the coefficient $\phi_{\mathrm{norm}}$ for the global language network. The dashed line marks $\phi_{\mathrm{norm}} = 1$, the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, GDP per capita (rewiring), GDP per capita (reshuffling), number of speakers (rewiring), number of speakers (reshuffling).

C. Transportation Network

We also consider the US airports network of domestic flights in December 2010 (the network is considered in its undirected and unweighted version, with n = 745, m = 4618 and k_s < k_max = 166)³, in which the number of flights departing from each airport and the number of passengers leaving each airport are used as node metadata. These two quantities (reasonably) show a very high correlation with the node degree: ρ(pass, deg) = 0.906 and ρ(dep, deg) = 0.928. For this reason, when we evaluate rich-club ordering with respect to node metadata, we observe values of $\phi^{\mathrm{rew}}_{\mathrm{norm}}(m)$ and $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$ that are both positive but very different. In more detail, in Figure 4 we observe very high values of $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$, which arise because a random reshuffling of the node metadata causes a complete loss of the observed correlation between the metadata and the degree. The US airports network displays rich-club ordering with respect to degree, to eigenvector centrality (from a certain point onwards) and to the metadata values. In the latter case, we observe that, because of the random reshuffling, the obtained values are clearly on a different scale from those obtained in the case of link rewiring. In other words, the number of links among nodes with the highest metadata values is far greater than that expected by chance. This implies that the arrangement of node metadata in the original network is significant and difficult to replicate (because of the degree-attribute correlation) using the current null model (i.e. a model that randomly redistributes the node metadata). Moreover, this result confirms the presence and the significance of interconnections among important airports from a wide array of perspectives, related both to the traffic generated by the airports and to the airports themselves.
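The loss of degree-metadata correlation under random reshuffling can be illustrated with a minimal Python sketch. The data below are synthetic: 745 toy nodes with degrees up to 166 and a made-up metadata value roughly proportional to the degree. They are not the airport data, and `pearson` is a helper written only for this illustration.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)

# Synthetic stand-ins (assumption): degrees and a degree-driven metadata value.
degrees = [rng.randint(1, 166) for _ in range(745)]
passengers = [100 * k + rng.gauss(0, 1500) for k in degrees]

rho_original = pearson(degrees, passengers)

# Random reshuffling: the same metadata values, assigned to nodes at random.
reshuffled = passengers[:]
rng.shuffle(reshuffled)
rho_reshuffled = pearson(degrees, reshuffled)

print(f"original: {rho_original:.3f}, reshuffled: {rho_reshuffled:.3f}")
```

In this toy setting the original correlation is high while the reshuffled one is close to zero, which is the mechanism invoked above to explain why the reshuffling-based values lie on a much larger scale than the rewiring-based ones.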

D. Technological Network

We consider a network obtained from the data of the Seventh Framework Programme for Research and Technological Development (FP7), provided by the European Commission (EC). FP7 ran from 2007 until 2013 with a total budget of over 50 billion euros. Most of the budget was spent on grants to both European and global research institutions to co-finance research, technological development and demonstration projects. Among the different lines of funding of FP7, we consider the data of projects related to the call for environmental issues.

Using such data we first build a bipartite network in which one partition is made up of projects while the other is made up of the institutions participating in them. A link between the partitions exists if an institution participated in a project.

³ The network is available in the igraphdata package for R [53].

[Figure 4 panels: φnorm(k) versus k (degree); φnorm(p) versus p (eigenvector); $\phi^{\mathrm{rew}}_{\mathrm{norm}}(m)$ and $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$ versus m (number of departures); $\phi^{\mathrm{rew}}_{\mathrm{norm}}(m)$ and $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$ versus m (passengers).]

FIG. 4. Curves of the coeﬃcient φnorm for the US airports network. The dashed line occurs in correspondence with φnorm = 1,

the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, number

of departures (rewiring), number of departures (reshuﬄing), number of passengers leaving (rewiring), number of passengers

leaving (reshuﬄing).

Then we perform a one-mode projection of the bipartite network such that two institutions are connected if they participated in the same project. The resulting network has n = 2739 and m = 45667, and we consider as node metadata the contribution of the EC to each institution, measured in euros.
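As an illustration of the one-mode projection, the following Python sketch links two institutions whenever they share at least one project. The function name `project_institutions` and the toy participation table are hypothetical; they are not taken from the FP7 data.

```python
from itertools import combinations

def project_institutions(participation):
    """One-mode projection of a project/institution bipartite network.

    `participation` maps each project to the institutions taking part in it.
    Returns the set of undirected links between institutions, each stored
    as an alphabetically sorted pair.
    """
    edges = set()
    for members in participation.values():
        # All participants of one project become pairwise connected (a clique).
        for a, b in combinations(sorted(set(members)), 2):
            edges.add((a, b))
    return edges

# Toy example (made up): institution "C" takes part in both projects.
toy = {"P1": ["A", "B", "C"], "P2": ["C", "D"]}
print(sorted(project_institutions(toy)))
# → [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'D')]
```

Each project contributes a complete subgraph on its participants, and institutions that join several projects, such as "C" here, are the nodes that connect those cliques.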

The network of institutions that we take into account has a very peculiar structure, in that the participants in each project are connected in a complete subgraph, while institutions participating in multiple projects connect such dense substructures. This network, made up of several interconnected cliques, is particularly well suited to the study of rich-club ordering with respect to node metadata. Indeed, we can foresee that the rewiring procedure would break up the multiple cliques, each of which represents a financed project, and would thus provide evidence for structural rich-club ordering. When we analyze rich-club ordering in terms of the contribution of the EC, i.e. when we ask whether the richest nodes (in terms of received funds) are arranged into a rich-club, the reshuffling procedure represents a more suitable null model, since it preserves the network structure while changing the degree-metadata correlation. In Figure 5, we observe that the considered network displays rich-club ordering from both a structural and a non-structural point of view, confirming, as also observed in [13] for projects funded by the Engineering and Physical Sciences Research Council of the United Kingdom, that research funds are allocated to rich-clubs. The fact that an elite circle of academic institutions tends to over-attract funding [54] represents a major problem in research that needs to be investigated in other datasets and addressed with proper measures and interventions aimed at reducing evident inequalities.
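The reshuffling null model used for node metadata can be sketched in Python as follows. The sketch assumes the usual density-based coefficient, i.e. the fraction of possible links that exist among nodes whose metadata value exceeds a threshold m, and normalizes it by its mean over randomly reshuffled metadata. The function names, the toy graph and the number of samples are illustrative assumptions, not the author's implementation.

```python
import random

def rich_club_density(edges, richness, threshold):
    """Density of links among nodes whose richness exceeds `threshold`."""
    club = {v for v, r in richness.items() if r > threshold}
    n = len(club)
    if n < 2:
        return float("nan")
    links = sum(1 for u, v in edges if u in club and v in club)
    return 2 * links / (n * (n - 1))

def phi_norm_reshuffled(edges, richness, threshold, n_samples=1000, seed=0):
    """Observed density divided by its mean over reshuffled metadata.

    The links are never touched: only the assignment of metadata values
    to nodes is randomized, so the network structure is preserved.
    """
    rng = random.Random(seed)
    observed = rich_club_density(edges, richness, threshold)
    nodes = list(richness)
    values = [richness[v] for v in nodes]
    total = 0.0
    for _ in range(n_samples):
        rng.shuffle(values)
        total += rich_club_density(edges, dict(zip(nodes, values)), threshold)
    mean_null = total / n_samples
    return observed / mean_null if mean_null > 0 else float("nan")

# Toy graph (made up): three "rich" nodes forming a triangle, plus a path.
toy_edges = [(0, 1), (0, 2), (1, 2)] + [(i, i + 1) for i in range(2, 9)]
toy_richness = {v: (100 if v < 3 else 1) for v in range(10)}
print(phi_norm_reshuffled(toy_edges, toy_richness, threshold=50))
```

A value above 1 indicates that the richest nodes are more densely interconnected than under a random assignment of the same metadata values to the same structure.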

IX. AN ALTERNATIVE TO RANDOM RESHUFFLING

Considering the reasoning outlined above, we suggest a procedure that, based on a certain parameter, is able to reshuffle the node metadata while keeping a degree-metadata correlation profile closer to that of the original network. Indeed, we aim to investigate GRCO by distinguishing between two cases: one in which rich-club ordering is discovered because the distribution of the node metadata is significant with respect to an appropriate null model for the considered case, and one in which rich-club ordering is discovered because of the comparison against networks whose attribute distribution is too far from the original one.

[Figure 5 panels: φnorm(k) versus k (degree); φnorm(p) versus p (eigenvector); $\phi^{\mathrm{rew}}_{\mathrm{norm}}(m)$ and $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$ versus m (EC contribution in euros).]

FIG. 5. Curves of the coefficient φnorm for the FP7 projects network. The dashed line occurs in correspondence with φnorm = 1, the threshold above which we observe rich-club ordering. From top-left we compute GRCO for: degree, eigenvector, European Commission contribution (rewiring), European Commission contribution (reshuffling).

The procedure is based on the idea of swapping a pair of metadata values whose corresponding entries in the metadata vector are at a certain distance s from one another, and it is made up of the following steps:

1. Consider the vector of metadata of length N and choose randomly an entry in position i ∈ [1, N]

2. Select the parameter s ∈ [1, N], which determines the range of the metadata swap. In other words, s determines the distance of the randomly chosen entry, in position i, from the candidate entry, in position i′ = i ± s, which will be selected for the swap

3. Select the direction, δ ∈ {0, 1}, of the swap with a Bernoulli trial with probability p = 0.5

If δ = 0 set i′ = i − s

If δ = 1 set i′ = i + s

4. If i − s < 1 and δ = 0 there is no available entry, in position i′, for the swap. Thus, pick uniformly at random one entry in position [1, i − 1] if i ≠ 1, or in position i′ = 1 if i = 1. Swap the entries in position i and i′

5. If i + s > N and δ = 1 there is no available entry, in position i′, for the swap. Thus, pick uniformly at random one entry in position [i + 1, N] if i ≠ N, or in position i′ = N if i = N. Swap the entries in position i and i′

6. Else swap the entry in position i with the entry in position i′, where i′ = i ± s depending on the value of δ

7. Repeat the steps from 3 to 6 O(M) times
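The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the author's code: the function names are hypothetical, s is taken to be a fixed integer, and a new position i is drawn at every iteration (the list only asks to repeat steps 3 to 6, so this is an interpretation). Positions are 1-based, as in the text.

```python
import random

def swap_once(values, i, s, delta, rng):
    """Apply steps 3-6 in place for the 1-based position i.

    Returns the 1-based position i' that was swapped with i.
    """
    n = len(values)
    if delta == 0:
        j = i - s
        if j < 1:
            # Step 4: no entry at distance s on the left.
            j = rng.randint(1, i - 1) if i != 1 else 1
    else:
        j = i + s
        if j > n:
            # Step 5: no entry at distance s on the right.
            j = rng.randint(i + 1, n) if i != n else n
    values[i - 1], values[j - 1] = values[j - 1], values[i - 1]
    return j

def local_reshuffle(metadata, s, n_iter, seed=None):
    """Return a reshuffled copy of `metadata` after `n_iter` swaps."""
    rng = random.Random(seed)
    values = list(metadata)
    if not values:
        return values
    for _ in range(n_iter):
        i = rng.randint(1, len(values))   # assumption: fresh i per iteration
        delta = rng.randint(0, 1)         # Bernoulli trial with p = 0.5
        swap_once(values, i, s, delta, rng)
    return values

# The iteration shown in Figure 6: i = 4, s = 2, delta = 1, hence i' = 6.
vector = [10, 7, 1, 9, 6, 14, 1, 2, 27, 1]
swap_once(vector, i=4, s=2, delta=1, rng=random.Random(0))
print(vector)
# → [10, 7, 1, 14, 6, 9, 1, 2, 27, 1]
```

Since every iteration only exchanges two entries, the multiset of metadata values is preserved, and the parameter s bounds how far a value can travel in a single step.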

In Figure 6 we pictorially display an iteration of the proposed procedure, while in Figure 7 we show three distributions of the degree-attribute correlation in the case of the US airports network, considering as node metadata the passengers leaving each airport (node). The three represented cases are: random reshuffling; reshuffling with the described procedure using as a parameter the mean degree, $s = \bar{k}$; and reshuffling with the described procedure using as a parameter the square root of the degree of the selected node, $s = \sqrt{k_i}$.

It is worth noting that, asymptotically, the proposed procedure and the random shuffling should end up with somewhat equivalent distributions of the node metadata. Indeed, by iterating the procedure a number of times that tends to infinity, we should observe reshuffled vectors displaying a degree-attribute correlation which is close to that of a randomized vector, regardless of the value of s. Nonetheless, since we are comparing the two cases of link rewiring and metadata reshuffling, we should also consider that in the former case the number of performed rewirings is, in general, O(M), where M is the number of links. This implies that, in practical contexts, by performing O(M) iterations the proposed procedure would produce a correlation profile which differs from the random one. Indeed, in Figure 7 we observe how the proposed procedure, regardless of the chosen parameter, keeps a higher correlation profile that overlaps with the random one only in its left tail. By using this procedure we can further test the presence of rich-club ordering on different normalized ensembles, thus obtaining the results of Figure 8. The obtained results still confirm the presence of rich-club ordering in the case of node metadata, but they are clearly on a different scale with respect to the case of random reshuffling, as displayed in Figure 4. The results of this stress test provide further evidence for the presence of a tight core in the considered airport network.

[Figure 6 diagram: with i = 4, s = 2, δ = 1 and i′ = 6, the metadata vector (10, 7, 1, 9, 6, 14, 1, 2, 27, 1) becomes (10, 7, 1, 14, 6, 9, 1, 2, 27, 1).]

FIG. 6. Example iteration of the procedure of metadata reshuffling on a random metadata vector. An entry in position i = 4 is randomly selected and the parameter s is set to s = 2. We suppose that the result of the Bernoulli trial is δ = 1 and thus i′ = i + s = 6. Finally the entries in position i and i′ are switched and another iteration is repeated using the new metadata vector.

[Figure 7: histograms of frequency (0 to 200) versus correlation (−0.1 to 0.4); legend entries: r, $\bar{k}$, $\sqrt{k}$.]

FIG. 7. Histograms displaying the frequencies of correlation values computed using 1000 shuffled vectors of node metadata. We choose 1000 shuffled vectors as we also consider 1000 rewired networks when computing the normalized rich-club coefficient in the case of structural measures. In the legend, r refers to the random mixing of the node metadata while $\bar{k}$ and $\sqrt{k}$ refer to the mixing parameters of the proposed procedure of metadata shuffling.

[Figure 8 panels: $\phi^{\mathrm{resh}}_{\mathrm{norm}}(m)$ versus m (passengers), for $s = \bar{k}$ (left) and $s = \sqrt{k_i}$ (right).]

FIG. 8. Results for the US airports network. We compute GRCO in the case of node metadata (number of passengers leaving) by using the presented procedure of node metadata reshuffling. We use as parameters of the procedure the average degree (left) and the square root of the node degree (right).

X. DISCUSSION

In this paper we discussed the generalization of the concept of rich-club ordering, considering both node structural attributes and metadata. This allowed room for the evaluation of such a phenomenon from a number of different perspectives that embed external information about nodes and that can be useful in the study of real networks. For instance, when studying economic networks, such as trade networks or interbank networks, one may be interested in noticing whether the richest agents (in an economic sense) do actually form a rich-club whilst not being hubs, in other words, whether they tend to saturate their degree by connecting only to other rich members, thus minimizing their feeder (i.e. rich-club to non-rich-club) connections. The study of such feeder connections, whose endpoints are nodes outside the rich-club, i.e. nodes which can be in a certain proportion considered eligible to join the rich-club, has proved to be important in confirming the presence of rich-club ordering [6], and it could provide insights for the understanding of the dynamical properties of the rich-club which, like its growth, are still largely unexplored [20]. Moreover, GRCO can be easily extended to the case of weighted networks (i.e. networks with edge metadata) and directed networks by using the right null models for these specific cases [32–34].

This generalization also aims at shedding more light on the relationship that exists between topological and non-topological patterns in real networks, as well as at emphasizing the importance of node metadata. Given the current possibility to collect and store increasingly rich datasets and networks, metadata are indeed gaining attention in Network Science, and many topological phenomena, such as the Friendship Paradox [55] (which states that your friends have, on average, more friends than you have), are now being generalized considering the presence of node characteristics [56]. The use of such metadata has also been extended to other topological network properties, such as motifs [57], that are now enriched considering their functional aspects when examined in real networks [18].

Additionally, we discussed the importance of testing rich-club ordering with the appropriate null models, which

can provide us with a deeper understanding of the numerous facets of this problem. However, such an approach

always implies a trade-oﬀ between what can be kept and what can be dropped regarding the network structure and

its relation to the node metadata.

ACKNOWLEDGMENTS

The author thanks Leto Peel, Raúl J. Mondragón and Antonio Iovanella for their insightful suggestions and comments.

[1] Mark EJ Newman and Michelle Girvan, “Finding and evaluating community structure in networks,” Phys. Rev. E 69,

026113 (2004).

[2] Santo Fortunato and Darko Hric, “Community detection in networks: A user guide,” Physics Reports 659, 1–44 (2016).

[3] Robert M May, “Will a large complex system be stable?” Nature 238, 413 (1972).

[4] Peter Csermely, András London, Ling-Yun Wu, and Brian Uzzi, “Structure and dynamics of core/periphery networks,” Journal of Complex Networks 1, 93–123 (2013).

[5] Shi Zhou and Raúl J. Mondragón, “The rich-club phenomenon in the internet topology,” IEEE Communications Letters 8, 180–182 (2004).

[6] Matteo Cinelli, Giovanna Ferraro, and Antonio Iovanella, “Rich-club ordering and the dyadic effect: Two interrelated phenomena,” Physica A: Statistical Mechanics and its Applications 490, 808–818 (2018).

[7] Shi Zhou and Raúl J. Mondragón, “Structural constraints in complex networks,” New Journal of Physics 9, 173 (2007).

[8] Xiao-Ke Xu, Jie Zhang, and Michael Small, “Rich-club connectivity dominates assortativity and transitivity of complex networks,” Phys. Rev. E 82, 046117 (2010).

[9] Raúl J Mondragón and Shi Zhou, “Random networks with given rich-club coefficient,” The European Physical Journal B 85, 328 (2012).

[10] Zhi-Qiang Jiang and Wei-Xing Zhou, “Statistical significance of the rich-club phenomenon in complex networks,” New Journal of Physics 10, 043002 (2008).

[11] Christopher Ansell, Renata Bichir, and Shi Zhou, “Who says networks, says oligarchy? Oligarchies as ‘rich club’ networks,” Connections (02261766) 35 (2016).

[12] Shi Zhou and Raúl J. Mondragón, “Accurately modeling the internet topology,” Phys. Rev. E 70, 066108 (2004).

[13] Athen Ma, Raúl J. Mondragón, and Vito Latora, “Anatomy of funded research in science,” Proceedings of the National Academy of Sciences 112, 14760–14765 (2015), http://www.pnas.org/content/112/48/14760.full.pdf.

[14] Giovanna Ferraro and Antonio Iovanella, “Revealing correlations between structure and innovation attitude in inter-

organisational innovation networks,” International Journal of Computational Economics and Econometrics 6, 93–113

(2016).

[15] Manlio De Domenico and Alex Arenas, “Modeling structure and resilience of the dark network,” Phys. Rev. E 95, 022313

(2017).

[16] Martijn P Van Den Heuvel and Olaf Sporns, “Rich-club organization of the human connectome,” Journal of Neuroscience

31, 15775–15786 (2011).

[17] Logan Harriger, Martijn P. van den Heuvel, and Olaf Sporns, “Rich club organization of macaque cerebral cortex and its

role in network communication,” PLOS ONE 7, 1–13 (2012).

[18] Martijn P van den Heuvel, René S Kahn, Joaquín Goñi, and Olaf Sporns, “High-cost, high-capacity backbone for global brain communication,” Proceedings of the National Academy of Sciences 109, 11372–11377 (2012).

[19] Guusje Collin, Olaf Sporns, René CW Mandl, and Martijn P van den Heuvel, “Structural and functional aspects relating to cost and benefit of rich club organization in the human cerebral cortex,” Cerebral cortex 24, 2258–2267 (2014).

[20] Athen Ma and Raúl J. Mondragón, “Rich-cores in networks,” PLOS ONE 10, 1–13 (2015).

[21] Matteo Cinelli, Giovanna Ferraro, and Antonio Iovanella, “Resilience of core-periphery networks in the case of rich-club,”

Complexity 2017 (2017).

[22] David S. Grayson, Siddharth Ray, Samuel Carpenter, Swathi Iyer, Taciana G. Costa Dias, Corinne Stevens, Joel T. Nigg,

and Damien A. Fair, “Structural and functional rich club organization of the brain in children and adults,” PLOS ONE

9, 1–13 (2014).

[23] Mark EJ Newman, “Mixing patterns in networks,” Phys. Rev. E 67, 026126 (2003).

[24] Juyong Park and Albert-László Barabási, “Distribution of node characteristics in complex networks,” Proceedings of the National Academy of Sciences 104, 17916–17920 (2007).

[25] Ginestra Bianconi, Paolo Pin, and Matteo Marsili, “Assessing the relevance of node features for network structure,”

Proceedings of the National Academy of Sciences 106, 11433–11438 (2009).

[26] Leto Peel, Daniel B Larremore, and Aaron Clauset, “The ground truth about metadata and community detection in

networks,” Science advances 3, e1602548 (2017).

[27] Darko Hric, Tiago P Peixoto, and Santo Fortunato, “Network structure, metadata, and the prediction of missing nodes

and annotations,” Physical Review X 6, 031038 (2016).

[28] Mark EJ Newman and Aaron Clauset, “Structure and inference in annotated networks,” Nature Communications 7, 11863

(2016).

[29] Sergei Maslov and Kim Sneppen, “Speciﬁcity and stability in topology of protein networks,” Science 296, 910–913 (2002).

[30] Vittoria Colizza, Alessandro Flammini, M Angeles Serrano, and Alessandro Vespignani, “Detecting rich-club ordering in

complex networks,” Nature physics 2, 110–115 (2006).

[31] Alessandro Muscoloni and Carlo Vittorio Cannistraci, “Rich-clubness test: how to determine whether a complex network

has or doesn’t have a rich-club?” arXiv preprint arXiv:1704.03526 (2017).

[32] M. Ángeles Serrano, “Rich-club vs rich-multipolarization phenomena in weighted networks,” Phys. Rev. E 78, 026101 (2008).

[33] Tore Opsahl, Vittoria Colizza, Pietro Panzarasa, and Jose J Ramasco, “Prominence and control: the weighted rich-club effect,” Physical review letters 101, 168702 (2008).

[34] Vinko Zlatic, Ginestra Bianconi, Albert Díaz-Guilera, Diego Garlaschelli, Francesco Rao, and Guido Caldarelli, “On the rich-club effect in dense and weighted networks,” The European Physical Journal B 67, 271–275 (2009).

[35] Julian J McAuley, Luciano da Fontoura Costa, and Tibério S Caetano, “Rich-club phenomenon across complex network hierarchies,” Applied Physics Letters 91, 084103 (2007).

[36] Lucas Daniel Valdez, Pablo Alejandro Macri, H Eugene Stanley, and Lidia Adriana Braunstein, “Triple point in correlated

interdependent networks,” Physical Review E 88, 050803 (2013).

[37] Kamal Berahmand, Negin Samadi, and Seyed Mahmood Sheikholeslami, “Eﬀect of rich-club on diﬀusion in complex

networks,” International Journal of Modern Physics B 32, 1850142 (2018).

[38] Máté Csigi, Attila Kőrösi, József Bíró, Zalán Heszberger, Yury Malkov, and András Gulyás, “Geometric explanation of the rich-club phenomenon in complex networks,” Scientific Reports 7 (2017).

[39] Ed Bullmore and Olaf Sporns, “Complex brain networks: graph theoretical analysis of structural and functional systems,” Nature Reviews Neuroscience 10, 186–198 (2009).

[40] Olaf Sporns, Dante R Chialvo, Marcus Kaiser, and Claus C Hilgetag, “Organization, development and function of complex brain networks,” Trends in cognitive sciences 8, 418–425 (2004).

[41] Gorka Zamora-López, Changsong Zhou, and Jürgen Kurths, “Exploring brain function from anatomical connectivity,” Frontiers in neuroscience 5, 83 (2011).

[42] Shweta Bansal, Shashank Khandelwal, and Lauren Ancel Meyers, “Exploring biological network structure with clustered random networks,” BMC bioinformatics 10, 1 (2009).

[43] R. J. Mondragón, “Network null-model based on maximal entropy and the rich-club,” Journal of Complex Networks 2, 288–298 (2014).

[44] Marián Boguñá, Romualdo Pastor-Satorras, and Alessandro Vespignani, “Cut-offs and finite size effects in scale-free networks,” The European Physical Journal B 38, 205–209 (2004).

[45] David Laniado, Yana Volkovich, Karolin Kappler, and Andreas Kaltenbrunner, “Gender homophily in online dyadic and

triadic relationships,” EPJ Data Science 5, 19 (2016).

[46] Nahuel Almeira, Ana L Schaigorodsky, Juan I Perotti, and Orlando V Billoni, “Structure constrained by metadata in

networks of chess players,” Scientiﬁc reports 7, 15186 (2017).

[47] Gabor Csardi and Tamas Nepusz, “The igraph software package for complex network research,” InterJournal Complex

Systems, 1695 (2006).

[48] Christopher G. Watson, brainGraph: Graph Theory Analysis of Brain MRI Data (2018), r package version 2.6.0.

[49] Thomas U. Grund and James A. Densley, “Ethnic homophily and triad closure: Mapping internal gang struc-

ture using exponential random graph models,” Journal of Contemporary Criminal Justice 31, 354–370 (2015),

https://doi.org/10.1177/1043986214553377.

[50] Salvatore Villani, Michele Mosca, and Mauro Castiello, “A virtuous combination of structural and skill analysis to defeat

organized crime,” Socio-Economic Planning Sciences (2018), https://doi.org/10.1016/j.seps.2018.01.002.

[51] Thomas W Valente, Kathryn Coronges, Cynthia Lakon, and Elizabeth Costenbader, “How correlated are network centrality

measures?” Connections (Toronto, Ont.) 28, 16 (2008).

[52] Shahar Ronen, Bruno Gonçalves, Kevin Z Hu, Alessandro Vespignani, Steven Pinker, and César A Hidalgo, “Links that speak: The global language network and its association with global fame,” Proceedings of the National Academy of Sciences 111, E5616–E5622 (2014).

[53] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical

Computing, Vienna, Austria (2008), ISBN 3-900051-07-0.

[54] Michael Szell and Roberta Sinatra, “Research funding goes to rich clubs,” Proceedings of the National Academy of Sciences

112, 14749–14750 (2015).

[55] Ezra W. Zuckerman and John T. Jost, “What makes you think you’re so popular? Self-evaluation maintenance and the subjective side of the ‘friendship paradox’,” Social Psychology Quarterly 64, 207–223 (2001).

[56] Young-Ho Eom and Hang-Hyun Jo, “Generalized friendship paradox in complex networks: The case of scientiﬁc collabo-

ration,” Scientiﬁc reports 4, srep04603 (2014).

[57] Ron Milo, Shai Shen-Orr, Shalev Itzkovitz, Nadav Kashtan, Dmitri Chklovskii, and Uri Alon, “Network motifs: simple

building blocks of complex networks,” Science 298, 824–827 (2002).