
Global efficiency of local immunization on complex networks

Laurent Hébert-Dufresne, Antoine Allard, Jean-Gabriel Young & Louis J. Dubé

Département de Physique, de Génie Physique, et d’Optique, Université Laval, Québec (Québec), Canada G1V 0A6.

Epidemics occur in all shapes and forms: infections propagating in our sparse sexual networks, rumours and

diseases spreading through our much denser social interactions, or viruses circulating on the Internet. With

the advent of large databases and efficient analysis algorithms, these processes can be better predicted and

controlled. In this study, we use different characteristics of network organization to identify the influential

spreaders in 17 empirical networks of diverse nature using 2 epidemic models. We find that a judicious

choice of local measures, based either on the network’s connectivity at a microscopic scale or on its

community structure at a mesoscopic scale, compares favorably to global measures, such as betweenness

centrality, in terms of efficiency, practicality and robustness. We also develop an analytical framework that

highlights a transition in the characteristic scale of different epidemic regimes. This allows us to decide which

local measure should govern immunization in a given scenario.

Epidemics never occur randomly. Instead, they follow the structured pathways formed by the interactions

and connections of the host population1,2. The spreading processes relevant to our everyday life take place

on networks of all sorts: social (e.g. epidemics3,4), technological (e.g. computer viruses5,6) or ecological

(cascading extinctions in food webs7). With a network representation, these completely different processes can

be modelled as the propagation of a given agent on a set of nodes (the population) and links (the interactions).

Different systems imply networks with different organizations, just as different agents require different epidemic

models.

There has long been significant interest in identifying the influential spreaders in networks. Which nodes

should be the target of immunization efforts in order to optimally protect the network against epidemics?

Unfortunately, most studies feature two significant shortcomings. Firstly, the proposed methods are often based

on optimization or heuristic algorithms requiring nearly perfect information on a static system8,9; this is rarely the

case. Secondly, methods are usually tested on small numbers of real systems using a particular epidemic

scenario10,11; this limits the scope of possible outcomes.

We first present a numerical study, perhaps the largest of its kind to date, where we argue that, depending on

the nature of the network and of the disease, different immunization tactics have to be taken into consideration. In

so doing, we formalize the notion of node influence and illustrate how local knowledge around a particular node is

usually sufficient to estimate its role in an epidemic. We also show how, in certain cases, the influence of a node is

not necessarily dictated by its number of connections, but rather by its role in the network’s community structure

(see Fig. 1). Far from trivial, it follows that an efficient immunization strategy can be obtained solely from local

measures, which are easily estimated in practice and robust to noisy or incomplete information. We further

develop an analytical formalism ideally suited to test the effects of local immunization on realistic network

structures. Combining the insights gathered from the numerical study and this formalism, we finally formulate

a readily applicable approach which can easily be implemented in practice.

Results

Models and measures. There exist two standard models emulating diverse types of epidemics: the susceptible-

infectious-recovered (SIR) and susceptible-infectious-susceptible (SIS) dynamics. In both, an infectious node has a

given probability of eventually infecting each of its susceptible neighbors during its infectious period, which is

terminated by either death/immunity leading to the recovered state (SIR) or by returning to a susceptible state

(SIS). In the SIR dynamics, for a given transmission probability T, the quantity of interest is the mean fraction Rf of

recovered nodes once a disease, not subject to a stochastic extinction, has finished spreading (i.e. we focus on the

giant component12). Since each edge can only be followed once, this dynamics investigates how a population is

vulnerable to the invasion of a new pathogen. In the SIS dynamics, we are interested in the prevalence I* (fraction

OPEN

SUBJECT AREAS: COMPLEX NETWORKS; APPLIED MATHEMATICS; EPIDEMIOLOGY; PHASE TRANSITIONS AND CRITICAL PHENOMENA

Received 1 May 2013; Accepted 18 June 2013; Published 10 July 2013

Correspondence and requests for materials should be addressed to L.J.D. (Louis.Dube@phy.ulaval.ca)

SCIENTIFIC REPORTS | 3 : 2171 | DOI: 10.1038/srep02171


of infectious nodes) of the disease at equilibrium (equal amounts of

infections and recoveries) as a function of the ratio λ = α/β of

infection rate α and recovery rate β. This particular dynamics

permits the study of how a given network structure can sustain an

already established epidemic.
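To make the SIR quantity concrete, the following minimal sketch simulates a single outbreak in which every infectious node gets exactly one transmission attempt, with probability T, along each of its links. It is an illustration under stated assumptions, not the simulation code used in this study: the adjacency-dictionary format, the function name `sir_final_fraction` and the five-node path are all hypothetical.

```python
import random
from collections import deque

def sir_final_fraction(adj, T, seed_node, rng):
    """Fraction of nodes recovered after one SIR outbreak started at seed_node.

    adj maps each node to a list of neighbours; T is the probability that an
    infectious node eventually infects a given susceptible neighbour.
    """
    recovered = {seed_node}          # every node that has been infected so far
    queue = deque([seed_node])       # infectious nodes still to be processed
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            # One transmission attempt per link leaving an infectious node.
            if neighbour not in recovered and rng.random() < T:
                recovered.add(neighbour)
                queue.append(neighbour)
    return len(recovered) / len(adj)

# Toy example: a path of five nodes (hypothetical data, not from the paper).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
rng = random.Random(1)
full = sir_final_fraction(path, 1.0, 0, rng)   # T = 1 reaches the whole component
none = sir_final_fraction(path, 0.0, 0, rng)   # T = 0 infects only the seed
```

Averaging this fraction over many seeds and runs, and keeping only the outbreaks that do not die out early, would approximate Rf for a given T.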

Should a fraction ε of the population be fully immunized, our

objective is to identify the nodes whose absence would minimize Rf

and I*. The epidemic influence of a node, that is, the effect of its

removal on Rf and I*, depends mainly on its role in the organization

of the network. Hence, to efficiently immunize a population, we

must first understand its underlying structure.

Network organization can be characterized on different scales,

each of which affects the dynamics of propagation. At the microscopic

level, the most significant feature is the degree of a node (its number

of links, noted k) which in turn defines the degree distribution of the

network. The significance of the high-degree nodes (the hubs) for

network structure in general13, for network robustness to random

failure14 and for epidemic control15 has long been recognized.

At the macroscopic level, the role of a node can be described by its

centrality, which may be defined in various ways. Frequently used in

the social sciences is the betweenness centrality (b), quantifying the

contributions of a given node to the shortest paths between every pair

of nodes in the network16. Arguably, this method should be among

the best estimates of a node’s epidemic influence as it directly measures

its role in the different pathways between all other individuals17,

yet at a considerable computational cost. A simpler method, the k-

core (or k-shell) decomposition18,19, assigns nodes to different layers

(or coreness c) effectively defining the core and periphery of a net-

work (high and low c respectively). It has recently been shown that

coreness is well suited to identify nodes that are the most at risk of

being infected during the course of an epidemic20. In light of our

results, we will be able to discuss the distinction between a node’s

vulnerability to infection and its influence on the outcome of an

epidemic.

The mesoscopic scale has recently been the subject of considerable

attention. At this level of organization, the focus is on the redundancy

of connections forming dense clusters referred to as the community

structure of the network21,22. Nodes can be distinguished by their

membership number m, i.e., the number of communities to which

they belong. We will consider that two links of a given node are part

of one community if the neighbours they reach lead to significantly

overlapping neighbourhoods21. This definition is directly relevant to

epidemic dynamics as links within communities do not lead to new

potential infections. We call structural hubs the nodes connecting the

largest number of different communities. These nodes act as bridges

facilitating the propagation of the disease from one dense cluster to

another. Targeting structural hubs to hinder propagation in structured

populations has been previously proposed and investigated10,11,

but has yet to be tested extensively.

Note that the microscopic and mesoscopic levels (as defined

above) are characterized by local measures in the sense that they

do not require a complete knowledge of the network, in contrast to

global measures like the betweenness centrality. Moreover, as we will

see, local measures are less sensitive to incomplete or incorrect

information. Adding, removing or rewiring a link only affects the

degree or membership of nodes directly in the neighbourhood of the

modification; whereas the same alterations can potentially affect

the centrality of nodes anywhere in the network through cascading

effects. Furthermore, even if community detection often requires the

tuning of a global resolution parameter, we will see that this

additional step does not affect the identification of structural hubs,

meaning that local information is sufficient to accurately determine a

node’s memberships.

In our numerical simulations we will have a perfect knowledge of

static networks. This will allow us to use global measures as a ref-

erence to test the efficiency of local measures best suited in practice.

We therefore ask without discrimination: which of the degree k, the

coreness c, the betweenness centrality b or the membership number

m is the best identifier of the most influential nodes on the outcome

of an epidemic? To answer this question, we have simulated SIR and

SIS dynamics with Monte Carlo calculations on 17 real-world networks.

In each case, a fraction ε of the nodes was removed in decreasing

order of the nodes’ score for each of the four different measures.

By comparing their efficiency to reduce Rf or I* as a function of ε, we

are able to establish which measure is best suited for a given scenario

characterized by a network structure, a propagation dynamics and a

disease transmissibility (i.e. probability of transmission).

Case study: a data exchange network. We first illustrate our

methods using the network of users of the Pretty-Good-Privacy

Figure 1 | Protein interactions of S. cerevisiae (subset)22. The three black nodes correspond to the ones with the highest degree, and the three red ones

have the highest membership number. In this particular example, it is readily seen that the latter are structurally more influential.

www.nature.com/scientificreports


algorithm for secure information interchange (hereafter, the PGP

network)23, which could be the host of the propagation of

computer viruses, rumors or viral marketing campaigns. Results

for the 16 other networks are presented and discussed in the next

section as well as in the Supporting Information (SI) document.

Communities in the network are extracted with the link community

algorithm of Ahn et al.21. This algorithm groups links (and

therefore the nodes they join) into communities based on the

overlap of their respective neighbouring nodes. It is this overlap that

reduces the number of new potential infections in a community

structure, as opposed to a random network. This method thus reflects

our understanding of how communities affect disease propagation.

While it may not directly detect the social groups or functional

modules of a network, it identifies significant clusters of redundant

links. This redundancy or overlap is quantified through a Jaccard

coefficient, and two links are grouped into the same community

when their coefficient exceeds a certain threshold. The threshold

value acts as a resolution, enabling one to look at different levels of

organization. As suggested21, the value of the threshold is chosen

to maximize the average density r of the communities (see

Methods). As this choice may seem arbitrary, Fig. 2 investigates

the similarity between the nodes with the highest membership num-

bers, for different thresholds. It suggests that the membership num-

ber is fairly robust around the threshold. Moreover, Fig. 2 also

demonstrates that the effect of the removal of the structural hubs

on an SIS epidemic is very robust to the choice of the threshold. Thus,

we will henceforth use the membership numbers obtained with the

threshold value corresponding to the highest community density.

The differences, if any, between the efficiency of the different

methods are due to the immunized nodes not being the same.

Figure 3 (top) investigates the correlations between the different

properties (k, b, c and m) of each node. Perhaps the most important

result here is that nodes with a high membership number may have

relatively small degree, coreness and betweenness centrality. Hence,

we expect the immunizing method based on community structure to

have a different influence on the outcome of epidemics. Figure 3

(bottom) shows the consistency (or lack thereof) of a given measure,

depending on the quality of the available data. The robustness of local

(micro and meso) measures is of obvious practical advantage. Both

robustness and correlations are further investigated in the SI.
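The robustness test of Fig. 3 (bottom) can be sketched as follows for the degree: rank the nodes on the complete network and on a copy where each link is missing with a given probability, then compare the two top sets with a Jaccard coefficient. This is only a sketch under assumptions: it uses a single thinned realisation instead of an ensemble average, breaks ties by node label instead of at random, and the function names and the toy edge list are hypothetical.

```python
import random

def degrees(edges):
    """Degree of every node that appears in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def top_fraction(score, nodes, fraction):
    """Set of the highest-scoring nodes; ties are broken by node label."""
    count = max(1, int(fraction * len(nodes)))
    ranked = sorted(nodes, key=lambda n: (-score.get(n, 0), n))
    return set(ranked[:count])

def top_set_overlap(edges, missing, fraction, rng):
    """Jaccard coefficient between the top sets of the full and thinned networks."""
    nodes = sorted({n for edge in edges for n in edge})
    # Each link is kept independently with probability 1 - missing.
    kept = [edge for edge in edges if rng.random() >= missing]
    full_top = top_fraction(degrees(edges), nodes, fraction)
    thin_top = top_fraction(degrees(kept), nodes, fraction)
    return len(full_top & thin_top) / len(full_top | thin_top)

# Toy network: a small star attached to a short chain (hypothetical data).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5), (5, 6)]
intact = top_set_overlap(edges, 0.0, 0.2, random.Random(0))
thinned = top_set_overlap(edges, 0.3, 0.2, random.Random(0))
```

With no missing links the overlap is 1 by construction; repeating the thinned case over many realisations would give one point of a robustness curve.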

To study various epidemic scenarios, we consider both SIS and SIR

dynamics (which may behave quite differently) with different values

of the transmission probability (λ and T for SIS and SIR, respectively).

In fact, each network features an epidemic threshold, i.e. critical

values λc24 and Tc25, below which I* and Rf vanish in an

equivalent infinite network ensemble. As we will show, the observed

behavior can differ significantly depending on whether or not λ and T

are close to their critical value.

Figure 4 presents results of different immunization methods

against SIS dynamics for different values of λ. In the top panel,

where λ is near λc, the most successful method of intervention is

to target nodes according to their degree. At low transmissibility, the

disease follows only a very small fraction of all links. The shortest

[Figure 2 panels, PGP network. Top: community density ρ (%) versus Jaccard threshold (%). Bottom: prevalence I* versus Jaccard threshold (%).]

Figure 2 | Robustness of structural hubs in the PGP network. (top)

Community density (ρ) obtained through different Jaccard thresholds.

(middle) Robustness of the structural hubs identification methods.

Element (i,j) gives the overlap (normalized) between the structural hubs

(top 1%) selected with thresholds i and j. The highest line and last column

of the matrix correspond to the case where the membership number equals

the degree. (bottom) Prevalence I* of SIS epidemics with λ = 5 when the

top 1% of structural hubs are removed (compared with the results without

removal in blue or with random targets in orange).

[Figure 3, bottom panel, PGP network: Jaccard coefficient of the top 20% sets versus % of incompleteness, for k, m and b.]

Figure 3 | Difference in immunization targets for the PGP network.

(top) We present correlations between the degree (k, right axis), the

coreness (c, left axis), the betweenness centrality (b, vertical axis) and the

membership number (m, color) for each node. Each measure is

normalized according to the highest value found in the network. Each node

is represented in this 4-dimensional space and a simple triangulation

procedure then yields a more intelligible appearance. Structural hubs (dark

red) can be found even at relatively small degree (< kmax/2), coreness

(< cmax/5) and centrality (< bmax/3). (bottom) Jaccard coefficient between

the ensemble of nodes identified as part of the top 20% according to a given

measure (k, m or b) on two versions of the network: the original complete

network and a network ensemble where a certain percentage of links has

been randomly removed (horizontal axis). The shorter the range of a

measure, the more robust it is to incomplete information.


paths are seldom used and the poor performance of betweenness

centrality follows. Moreover, the disease will not be affected by the

community structure, because even in dense neighbourhoods, most

links will not be travelled. We then say that the disease, unaffected by

link clustering, follows a tree-like structure (without loops), where

community memberships are insignificant. It is therefore better to

simply remove as many links as possible.

As λ increases beyond λc, we see that immunization based on

membership numbers quickly outperforms the other methods. As

more links are travelled, the disease is more likely to follow super-

fluous links in already infected communities. Hubs sharing their

many links within few communities are therefore not as efficient

in causing secondary infections as one might expect. Similarly,

targeting through betweenness centrality also performs better with

higher λ, albeit not as well as membership-targeting in this case. For

λ ≫ λc, immunization based on membership numbers (local) and on

betweenness centrality (global) converge toward similar efficiency,

significantly outperforming degree-based immunization.

Another interesting feature of our results is the poor performance

of immunization based on node coreness. A previous study had

clearly shown that epidemics mostly flourished within the core of

the network (see Fig. 5) because of its density20. Ironically, this den-

sity also implies redundancy. While the core nodes are highly at risk

of being infected, their removal has a limited effect because there exist

alternative paths within their neighbourhood: the core offers a per-

fect environment to the disease and is consequently robust to node

removal. It is therefore more effective to stop the disease from reaching,

or leaving, the core by removing the nodes bridging other

neighbourhoods (i.e. the structural hubs).

Similar conclusions are drawn for the SIR dynamics. As T moves

away from Tc, the most significant level of organisation shifts from

the degree (microscopic) to communities (mesoscopic) as member-

ship-based immunization progressively outperforms the other

strategies.

Results on networks of diverse nature. In this section, we highlight

different behaviours observed in social, technological and commu-

nication networks using 7 other datasets (full results for the 17

datasets are available in the SI): subset of the World Wide Web

(WWW)13, MathSciNet co-authorship network (MathSci)27, Wes-

tern States Power Grid of the United States (Power Grid)28,

Internet Movie Database since 2000 (IMDb)29, cond-mat arXiv

co-authorship network (arXiv)22, e-mail interchanges between members

of the University Rovira i Virgili (Email)30 and Gnutella peer-to-peer

network (Gnutella)31.

The results for the WWW, MathSci and IMDb networks fur-

ther support our previous conclusions, with the exception that

membership-based immunization performs surprisingly better than

the degree-based variant even near the epidemic threshold of the

network (see WWW and MathSci). The betweenness-centrality-based

immunization was not tested on IMDb because of computational

constraints (its computation required over 800 hours with our

available resources and a standard algorithm32), which illustrates a

significant limit of this measure. Approximations could have been

used33, but the intricate (and mostly unknown) relationship between

the efficiency of the measure and the accuracy of the approximation

would have only caused additional uncertainties.

The results presented for the Power Grid network illustrate a

fundamental difference between the SIS and the SIR dynamics: while

we are interested in the fraction of the network sustaining an estab-

lished epidemic in SIS, it is the fraction of nodes invaded by a new

[Figure 4 panels, PGP network: prevalence I* versus fraction of nodes removed, one panel per value of λ.]

Figure 4 | Efficiency of the immunization methods against SIS

epidemics on the PGP network. Nodes are removed in decreasing order of

their score according to each method: coreness (green pentagons), degree

(black circles), betweenness centrality (blue triangles) and memberships

(red diamonds), and the effect of removal is then quantified in terms of the

decrease of the prevalence I*. The prevalence of the epidemics when the

removed nodes are chosen at random (grey squares) has been added for

comparison. Panels are presented in increasing order of transmissibility

(λ) from top to bottom.

Figure 5 | k-core decomposition of the PGP network. Representation

(based on ref. 26) of the k-shells in the PGP network with nodes colored

according to their total infectious period during a given time interval. Red

nodes are more likely to be infectious at any given time than green nodes as

the color is given by the square of the fraction of time spent in the infectious

state. Note how the central nodes (the core) of the network are most at risk.


disease that is relevant in SIR. In fact, the structure of the Power Grid,

a chain of small, easily disconnected modules, enhances the qualitative

discrepancy between the epidemic influence of nodes subjected to

these two dynamics. For the SIS dynamics, the membership-based

intervention is the most efficient because it weakens all modules,

limiting the prevalence of the disease. In contrast, targeting

through betweenness centrality merely separates the modules, so that

they individually remain infected. For the SIR dynamics, separating

the modules is the best approach as it directly stops the infection

from spreading, while weakened (but connected) modules still

provide pathways. This effect is a direct consequence of the particular

structure of the Power Grid and is insignificant on other networks.

Finally, the last set of results, on arXiv, Email and Gnutella, presents

the effect of the community density ρ on the performance of membership-based

immunization. For very small ρ, the paths within

communities do not qualitatively differ from the links bridging

neighborhoods in their effect on the disease propagation. This targeting

method is therefore expected to converge toward degree-based

immunization if m and k are strongly correlated. However, as most

tested networks had fairly dense communities, ρ ≥ 0.3, the relevance

of memberships should not be understated.

Investigation of the epidemic regimes transition. The results of the

previous sections suggest that local information (i.e., degree,

[Figure 6 panels: prevalence I* or final size Rf versus fraction of nodes removed, for several networks.]

Figure 6 | Efficiency of the immunization methods against SIS and SIR epidemics on several networks. Nodes are removed in decreasing order of their

score according to each method: coreness (green pentagons), degree (black circles), betweenness centrality (blue triangles) and memberships (red

diamonds) to measure efficiency by the decrease of I* or Rf. The size of the epidemics for random removal of nodes (gray squares) is added for

comparison. Error bars have been omitted for clarity of the SIR results on the Power Grid, but are shown in the SI.


membership) is often sufficient for a nearly optimal global

immunization. More precisely, we found these two methods to

outperform or to be as efficient as the betweenness centrality (the

global method used for comparison) in 62 of the 68 studied scenarios

(i.e., 17 networks × 2 dynamics × 2 transmissibility regimes). This

implies that membership (e.g., on PGP), degree (e.g., Gnutella) or

both (e.g. MathSci) lead to an immunization at least as efficient as

global methods while having the noteworthy advantage of requiring

much less information and of being less sensitive to incomplete

information. This section focuses on the conditions guiding the

choice between the degree-based or the membership-based

immunization strategy. In this respect, Figs. 4 and 6 provide a

useful insight: the membership-based strategy is more efficient

than the degree-based one when transmissibility is high and/or

when communities are dense. To further our understanding and

test this hypothesis, we introduce a random network model

featuring a community structure, and exactly solve its final state

(Rf) under SIR dynamics using generating functions.

Our model is a slightly modified version of the configuration

model12,34 where nodes are connected either through single links or

through motifs (see Fig. 7 for an example). Motifs are used to simulate

the effect of a community structure, that is, the redundancy of the

neighbourhoods of nodes. Our motifs are composed of M nodes, all

connected to each other, and a node belongs to i motifs and has j

single links with probability p(i,j). This node therefore has a degree

(k) equal to (M − 1)i + j and a membership (m) equal to i + j.

Networks are generated using a stub pairing scheme: a node belonging

to i motifs and having j single links has i “motif stubs” and j “link

stubs”. Motifs and single links are then formed by randomly choosing

M motif stubs and 2 link stubs, respectively, and then by linking

the corresponding nodes to one another. This last step is repeated

until none of the motif and link stubs remains. The distribution

{p(i,j)}, with i, j ∈ ℕ, therefore defines a maximally random network

ensemble, and the results obtained are averaged over this ensemble.

Extending previous work35, we compute the expected value of Rf

for the network ensemble just defined where nodes and links are

randomly removed to simulate immunization and disease transmission

(SIR dynamics), respectively. Full details are given in the SI.

Using typical values for {p(i,j)}, our model illustrates and confirms

our hypothesis by clearly showing in Fig. 8 a transition of efficiency

between the degree-based and the membership-based immunization

strategy. Initially less efficient when the transmissibility is low (i.e.,

higher threshold, lower value of Rf), membership progressively outperforms

degree as the transmissibility increases. As mentioned

above, for lower values of T, the best option is therefore to immunize

the hubs (high k) to shift the degree distribution towards lower

degrees. For higher values of T, targeting structural hubs (high m)

that act as bridges between “independent” neighbourhoods leads to a

more efficient immunization as it reduces the number of paths

between different regions of the network. Note that we do not explicitly

model the effect of community density. This could have been

done by letting links exist independently with a given probability g.

This is however identical to letting the disease propagate with probability

gT. Thus, the value of T in Fig. 8 is related to the density of the

communities, and our conclusions can therefore be extended to the

cases of low/high community densities.
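The construction of the model networks can be sketched as below: each node carries a pair (i, j), its degree and membership follow from k = (M − 1)i + j and m = i + j, and one realisation is built by shuffling the motif stubs and link stubs and grouping them M and 2 at a time. This is a minimal sketch, not the authors' generator: the function names are hypothetical, and dropping the self-loops and repeated links that random pairing can create is an assumption made here for simplicity.

```python
import random
from itertools import combinations

def degree_and_membership(i, j, M):
    """Degree k and membership m of a node in i motifs of size M with j single links."""
    return (M - 1) * i + j, i + j

def motifs_and_links(k, m, M):
    """Invert the relations above to recover (i, j); requires M > 2."""
    return (k - m) // (M - 2), ((M - 1) * m - k) // (M - 2)

def stub_pairing(node_types, M, rng):
    """Build one network realisation from a list of (i, j) pairs, one per node.

    Returns a set of undirected edges. Self-loops and repeated edges produced
    by the random pairing are dropped (an assumption of this sketch).
    """
    motif_stubs = [n for n, (i, _) in enumerate(node_types) for _ in range(i)]
    link_stubs = [n for n, (_, j) in enumerate(node_types) for _ in range(j)]
    if len(motif_stubs) % M or len(link_stubs) % 2:
        raise ValueError("stub counts must be divisible by M and by 2")
    rng.shuffle(motif_stubs)
    rng.shuffle(link_stubs)
    edges = set()
    for start in range(0, len(motif_stubs), M):
        group = motif_stubs[start:start + M]
        for u, v in combinations(group, 2):      # fully connect the motif
            if u != v:
                edges.add((min(u, v), max(u, v)))
    for start in range(0, len(link_stubs), 2):
        u, v = link_stubs[start], link_stubs[start + 1]
        if u != v:
            edges.add((min(u, v), max(u, v)))
    return edges

k, m = degree_and_membership(2, 1, 4)                       # node in 2 motifs, 1 single link
clique = stub_pairing([(1, 0)] * 4, 4, random.Random(0))    # four nodes, one motif
```

In the last line the four nodes each hold one motif stub, so the only possible outcome is a single fully connected motif of size M = 4.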

Discussion

One of the main contributions of this work is to offer a formal

definition of the epidemic influence of a node, i.e. the effect of its

removal on I* or Rf, which is open to diverse methods of approximation.

Our results confirm that standard measures such as the

degree or betweenness centrality are not always the best indicators

of a node’s influence. Moreover, we have highlighted that the core-

ness, which has recently been proposed as an indicator of nodes’

influence20, performs poorly. This has led us to distinguish

between individual risk and global influence. We have also

illustrated how a universal approach is still wanting, since different

networks and different diseases require different methods of inter-

vention.

Consequently, the fact that the numbers of links and/or communities

to which a node belongs are excellent measures of its epidemic

influence (at times better, at times equivalent, but never

much worse than global centrality measures) is a particularly

important result. The fact that they both are local measures is espe-

cially relevant considering that we rarely have access to the exact

network structure of a system, either because it is simply too large

(WWW), too dynamic (email networks) or because the links them-

selves are ill-defined (social networks). Not only are local measures

computable from a limited subset of a network (which is often the

only available information), but a coarse-grained measure like membership

is even more interesting as it is easier to estimate than a

node’s actual degree. For instance, consider how much simpler it is

to enumerate your social groups (work, family, etc.) than the totality

of your acquaintances.

Finally, the existence of a transition between two epidemic regimes

with different characteristic scales may well be the single most

important conclusion of this work. In the first regime, for low trans-

missibility and sparse communities, the microscopic structural

Figure 7 | Synthetic networks with tunable community structure.

Orange links belong to motifs of size M = 4, and single links are shown in

blue. The degree k and membership m of a few selected nodes are indicated.

They belong to i = (k − m)/(M − 2) motifs and have j = [(M − 1)m − k]/

(M − 2) single links.

[Figure 8 panel, SIR on the model networks: final size Rf versus effective transmissibility, for ε = 0.00, 0.05 and 0.10.]

Figure 8 | Results of local immunization methods on synthetic networks.

Final sizes of SIR epidemics after immunization of various fractions ε of

nodes on synthetic networks with M = 4 and a heterogeneous degree

distribution (details in SI). Near the epidemic threshold, targeting by

degree (dotted curves) is the better choice whereas targeting by

memberships (solid curves) should be preferred for higher transmissibility.

Monte Carlo simulations were also performed to validate the formalism

and are indicated on the curves (the case ε = 0.05 is omitted so as not to clutter the

graph) with circles (targeting by degree) and squares (targeting by

membership).


features (i.e. node connectivity or degree) offer the most relevant

information, while for higher transmissibility and denser communities,

mesoscopic features (i.e. node communities or membership)

appear more relevant. We expect to see an equivalent transition

between any pair of measures which oppose the micro and meso

scales (e.g. different range-limited measures of centrality36).

Based on our empirical and analytical results, we thus propose a

simple procedure on how to judge which local measure can be

expected to yield the best results in a given situation. From the

available subset of a given network:

1. Obtain the degree distribution to estimate the transmissibility

of the disease in relation to the epidemic threshold λc24 or Tc25.

2. If easily transmissible (λ ≫ λc or T ≫ Tc), evaluate the network’s

community structure; otherwise, go to 4.

3. If the community density is high (ρ ≳ 0.3), immunize nodes

according to their memberships; otherwise, go to 4.

4. For a transmissibility near the epidemic threshold, or for sparse

communities (low ρ), immunize according to the degree of the

nodes.
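The four steps above can be sketched as a small decision function. Two points are assumptions of this sketch and not statements of the paper: the thresholds are estimated with the standard expressions for uncorrelated random networks (λc ≈ ⟨k⟩/⟨k²⟩ for SIS and Tc ≈ ⟨k⟩/(⟨k²⟩ − ⟨k⟩) for SIR), which may differ from the estimates of refs 24 and 25, and the cut-off `high_ratio` that decides when a disease counts as easily transmissible is a hypothetical parameter. Only the density criterion of about 0.3 comes from the text.

```python
def thresholds_from_degrees(degrees):
    """Estimate (lambda_c, T_c) from a degree sequence.

    Assumption: uncorrelated-network estimates, lambda_c = <k>/<k^2> and
    T_c = <k>/(<k^2> - <k>); not necessarily the formulas of the cited works.
    """
    mean_k = sum(degrees) / len(degrees)
    mean_k2 = sum(k * k for k in degrees) / len(degrees)
    return mean_k / mean_k2, mean_k / (mean_k2 - mean_k)

def choose_local_measure(ratio_to_threshold, community_density,
                         high_ratio=2.0, dense=0.3):
    """Return "membership" or "degree" following steps 2 to 4.

    ratio_to_threshold is lambda/lambda_c (SIS) or T/T_c (SIR).
    high_ratio is a hypothetical cut-off for "well above the threshold".
    """
    if ratio_to_threshold >= high_ratio and community_density >= dense:
        return "membership"
    return "degree"

# Hypothetical example: a 3-regular degree sequence and a disease with T = 0.9.
lambda_c, T_c = thresholds_from_degrees([3, 3, 3, 3])
choice = choose_local_measure(0.9 / T_c, community_density=0.5)
```

Here Tc evaluates to 0.5, so T/Tc = 1.8 falls below the illustrative cut-off and the sketch recommends the degree; a denser or more transmissible scenario would switch the recommendation to memberships.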

The analytical and numerical frameworks used in this work are

expected to guide immunization efforts toward simpler, more precise

and efficient strategies. Likewise, the introduction of a node influence

classification scheme opens a new avenue for finding better local

estimates of a node’s role in the global state of its system.

Methods

Betweenness centrality. For all pairs (a,b) of nodes excluding i, list the $n_{a,b}$ shortest paths between a and b. Let $n_{a,b}(i)$ be the number of these paths containing i. The betweenness centrality $b_i$ of node i is then given by:

$$ b_i = \sum_{(a,b)} \frac{n_{a,b}(i)}{n_{a,b}} . \tag{1} $$
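As an illustration of Eq. (1), the following brute-force sketch counts shortest paths with breadth-first searches on an unweighted, undirected graph. It assumes each unordered pair (a,b) is counted once (the text does not say whether pairs are ordered) and that the graph is given as a hypothetical adjacency dictionary `adj`. It scales roughly as N³ and is meant for small examples only; faster algorithms exist (refs 32, 33).

```python
from collections import deque


def bfs_counts(adj, source):
    """Distances and numbers of shortest paths from source to every reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:              # first visit: one level further
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """b_i = sum over unordered pairs (a, b), both different from i, of n_ab(i) / n_ab."""
    info = {s: bfs_counts(adj, s) for s in adj}
    nodes = list(adj)
    b = {i: 0.0 for i in nodes}
    for idx, a in enumerate(nodes):
        dist_a, sigma_a = info[a]
        for c in nodes[idx + 1:]:
            if c not in dist_a:            # a and c are not connected
                continue
            for i in nodes:
                if i == a or i == c or i not in dist_a:
                    continue
                dist_i, sigma_i = info[i]
                # i is on a shortest a-c path iff the distances add up
                if c in dist_i and dist_a[i] + dist_i[c] == dist_a[c]:
                    b[i] += sigma_a[i] * sigma_i[c] / sigma_a[c]
    return b
```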

Coreness. The coreness of node i is the highest integer $c_i$ such that the node is part of the set of all nodes with at least $c_i$ links within the set.
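The definition of coreness can be computed by repeatedly peeling off the node of lowest remaining degree. The sketch below is a simple quadratic-time version of that idea, assuming a simple undirected graph stored in a hypothetical adjacency dictionary `adj`; refs 18 and 19 describe more efficient core decompositions.

```python
def coreness(adj):
    """Return {node: coreness} by iteratively removing the lowest-degree node."""
    degree = {u: len(adj[u]) for u in adj}   # degree within the remaining set
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        u = min(remaining, key=degree.get)   # node with the fewest remaining links
        k = max(k, degree[u])                # the core index never decreases
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1               # u no longer counts as a neighbor
    return core
```

On a triangle with one pendant node attached, the pendant node gets coreness 1 and the three triangle nodes get coreness 2.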

Community detection. Two links, $e_{ij}$ and $e_{ik}$, from a given node i, are said to belong to the same community if their Jaccard coefficient $J(e_{ij}, e_{ik})$ (similarity measure) is above a given threshold $J_c$:

$$ J(e_{ij}, e_{ik}) = \frac{|n_+(j) \cap n_+(k)|}{|n_+(j) \cup n_+(k)|} > J_c , \tag{2} $$

where $n_+(u)$ is the set containing the neighbors of u including u.
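The pairwise test of Eq. (2) can be written in a few lines. This sketch only evaluates the similarity of two links sharing node i; it does not implement the full link community algorithm of ref. 21. The function names and the adjacency dictionary `adj` are illustrative assumptions.

```python
def link_similarity(adj, i, j, k):
    """Jaccard coefficient of links (i, j) and (i, k), which share node i."""
    n_j = set(adj[j]) | {j}   # inclusive neighborhood n_+(j)
    n_k = set(adj[k]) | {k}   # inclusive neighborhood n_+(k)
    return len(n_j & n_k) / len(n_j | n_k)


def same_community(adj, i, j, k, jaccard_threshold):
    """True if the two links pass the threshold test J > J_c."""
    return link_similarity(adj, i, j, k) > jaccard_threshold
```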

Community density. The density $\rho_i$ of a community i of $n_i > 2$ nodes and $d_i$ links is the proportion of the possible redundant links that do exist; i.e., the fraction of existing links excluding the minimal $n_i - 1$ links that are needed for this community to be connected:

$$ \rho_i = \frac{d_i - (n_i - 1)}{\frac{n_i (n_i - 1)}{2} - (n_i - 1)} . \tag{3} $$

The community density $\rho$ is then calculated according to

$$ \rho = \frac{1}{D} \sum_i d_i \rho_i , \tag{4} $$

where D is the total number of links not belonging to single link communities, for which $\rho_i = 0$ (ref. 21).
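Equations (3) and (4) translate directly into code. The sketch below assumes each community is summarized by a hypothetical pair `(n_i, d_i)` giving its number of nodes and links; communities with two nodes (single links) are skipped, so they contribute neither to the sum nor to D.

```python
def community_density(communities):
    """communities: iterable of (n_i, d_i) pairs; returns the link-weighted density rho."""
    total_links = 0     # D: links outside single-link communities
    weighted = 0.0      # running sum of d_i * rho_i
    for n, d in communities:
        if n <= 2:      # single-link community, rho_i = 0 and excluded from D
            continue
        possible_redundant = n * (n - 1) / 2 - (n - 1)
        rho_i = (d - (n - 1)) / possible_redundant
        weighted += d * rho_i
        total_links += d
    return weighted / total_links if total_links else 0.0
```

As a worked example, a triangle (3 nodes, 3 links) has $\rho_i = 1$ and a 4-node tree (3 links) has $\rho_i = 0$, so together they give $\rho = (3 \cdot 1 + 3 \cdot 0)/6 = 0.5$.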

Immunization. To perform the immunization of a fraction $\epsilon$ of the network according to a certain measure C, we remove the $\epsilon N$ nodes with the highest C. When a choice must be made (nodes with equal C), all decisions are taken randomly and individually for each simulated epidemic.
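A minimal sketch of this targeted removal with random tie-breaking is given below. The function name, the adjacency dictionary `adj` (node to set of neighbors) and the `score` dictionary are illustrative assumptions, and rounding $\epsilon N$ to the nearest integer is a choice made here, not one specified in the text.

```python
import random


def immunize(adj, score, fraction, rng=random):
    """Remove the round(fraction * N) highest-scoring nodes, breaking ties at random.

    Returns the pruned adjacency dictionary and the set of removed nodes.
    """
    nodes = list(adj)
    rng.shuffle(nodes)                                   # random order among equal scores
    nodes.sort(key=lambda u: score[u], reverse=True)     # stable sort keeps that order for ties
    removed = set(nodes[:round(fraction * len(nodes))])  # rounding is an assumption
    pruned = {u: {v for v in adj[u] if v not in removed}
              for u in adj if u not in removed}
    return pruned, removed
```

Calling `immunize` again with a fresh random state for each simulated epidemic reproduces the independent tie-breaking described above.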

Monte Carlo simulations. To investigate the fraction of a network which can sustain an epidemic, SIS simulations start with all nodes in an infectious state and are then relaxed until an equilibrium is reached. To investigate the mean fraction of a network which a disease can invade, SIR simulations start with a single randomly chosen infectious node and run until there are no more infectious nodes. Results shown in the figures are obtained by averaging over the outcome of several numerical simulations until the minimal possible standard deviation (limited by network structure and finite size) is obtained. For the SIR dynamics, only the simulations leading to large-scale epidemics (at least 1% of the nodes) were considered. The complete procedure is given in the SI.
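The SIR part of this procedure can be sketched as follows. The exact dynamics are specified in the SI and are not reproduced here; this illustration assumes the simplest variant, in which each infectious node gets one chance to infect each susceptible neighbor with probability T (the transmissibility) before recovering. The function names, the fixed number of runs and the adjacency dictionary `adj` are hypothetical; only the 1% cutoff is taken from the text.

```python
import random


def sir_outbreak(adj, T, rng):
    """One SIR run from a single random seed; returns the fraction of nodes ever infected."""
    seed = rng.choice(list(adj))
    reached = {seed}            # nodes that are or have been infectious
    infectious = [seed]
    while infectious:           # run until no infectious node is left
        u = infectious.pop()
        for v in adj[u]:
            if v not in reached and rng.random() < T:
                reached.add(v)
                infectious.append(v)
    return len(reached) / len(adj)


def mean_epidemic_size(adj, T, runs=1000, min_fraction=0.01, seed=0):
    """Average final size over the runs that reach at least min_fraction of the nodes."""
    rng = random.Random(seed)
    sizes = [sir_outbreak(adj, T, rng) for _ in range(runs)]
    large = [s for s in sizes if s >= min_fraction]   # keep large-scale epidemics only
    return sum(large) / len(large) if large else 0.0
```

A fixed number of runs is used here for simplicity, whereas the text averages until the standard deviation stops decreasing.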

1. Caldarelli, G. & Vespignani, A. Large Scale Structure and Dynamics of Complex Networks. World Scientific Publishing Company, Singapore (2007).
2. Keeling, M. J. & Rohani, P. Modeling Infectious Diseases in Humans and Animals. Princeton University Press, Princeton (2008).
3. Anderson, R. M. & May, R. M. Infectious Diseases of Humans: Dynamics and Control. Oxford University Press, New York (1991).
4. Keeling, M. J. & Eames, K. T. D. Networks and epidemic models. J R Soc Interface 2, 295–307 (2005).
5. Pastor-Satorras, R. & Vespignani, A. Epidemic Spreading in Scale-Free Networks. Phys. Rev. Lett. 86, 3200–3203 (2001).
6. Gómez-Gardeñes, J., Echenique, P. & Moreno, Y. Immunization of real complex communication networks. Eur. Phys. J. B 49, 259–264 (2006).
7. Dunne, J. A. & Williams, R. J. Cascading extinctions and community collapse in model food webs. Philos Trans R Soc Lond B Biol Sci 364, 1711–1723 (2009).
8. Gallos, L. K., Liljeros, F., Argyrakis, P., Bunde, A. & Havlin, S. Improving immunization strategies. Phys. Rev. E 75, 045104(R) (2007).
9. Chen, Y., Paul, G., Havlin, S., Liljeros, F. & Stanley, H. E. Finding a Better Immunization Strategy. Phys. Rev. Lett. 101, 058701 (2008).
10. Salathé, M. & Jones, J. H. Dynamics and Control of Diseases in Networks with Community Structure. PLoS comp. biol. 6, e1000736 (2010).
11. Masuda, N. Immunization of networks with community structure. New J. Phys. 11, 123018 (2009).
12. Newman, M. E. J., Strogatz, S. H. & Watts, D. J. Random graphs with arbitrary degree distributions and their applications. Phys. Rev. E 64, 026118 (2001).
13. Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
14. Albert, R., Jeong, H. & Barabási, A.-L. Error and attack tolerance of complex networks. Nature 406, 378–382 (2000).
15. Pastor-Satorras, R. & Vespignani, A. Immunization of complex networks. Phys. Rev. E 65, 036104 (2002).
16. Freeman, L. Centrality in social networks: Conceptual clarification. Social Networks 1, 215–239 (1979).
17. Barthélemy, M. Betweenness centrality in large complex networks. Eur. Phys. J. B 38, 163–168 (2004).
18. Batagelj, V. & Zaveršnik, M. Generalized Cores. arXiv:cs/0202039v1.
19. Batagelj, V. & Zaveršnik, M. An O(m) Algorithm for Cores Decomposition of Networks. arXiv:cs/0310049v1.
20. Kitsak, M. et al. Identification of influential spreaders in complex networks. Nature Physics 6, 888–893 (2010).
21. Ahn, Y.-Y., Bagrow, J. P. & Lehmann, S. Link communities reveal multiscale complexity in networks. Nature 466, 761–764 (2010).
22. Palla, G., Derényi, I., Farkas, I. & Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005).
23. Boguñá, M., Pastor-Satorras, R., Díaz-Guilera, A. & Arenas, A. Models of social networks based on social distance attachment. Phys. Rev. E 70, 056122 (2004).
24. Hébert-Dufresne, L., Noël, P.-A., Allard, A., Marceau, V. & Dubé, L. J. Propagation dynamics on networks featuring complex topologies. Phys. Rev. E 82, 036115 (2010).
25. Newman, M. E. J. Spread of epidemic disease on networks. Phys. Rev. E 66, 016128 (2002).
26. Alvarez-Hamelin, I., Dall'Asta, L., Barrat, A. & Vespignani, A. k-core decomposition: A tool for the visualization of large scale networks. Advances in Neural Information Processing Systems 18, 41–50 (2006).
27. Palla, G., Farkas, I. J., Pollner, P., Derényi, I. & Vicsek, T. Fundamental statistical features and self-similar properties of tagged networks. New J. Phys. 10, 123026 (2008).
28. Watts, D. J. & Strogatz, S. H. Collective dynamics of small-world networks. Nature 393, 440–442 (1998).
29. Hébert-Dufresne, L., Allard, A., Marceau, V., Noël, P.-A. & Dubé, L. J. Structural Preferential Attachment: Network Organization beyond the Link. Phys. Rev. Lett. 107, 158702 (2011).
30. Guimera, R., Danon, L., Diaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103(R) (2003).
31. Ripeanu, M., Foster, I. & Iamnitchi, A. Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design. IEEE Internet Computing Journal 6, 50–57 (2002).
32. Brandes, U. A Faster Algorithm for Betweenness Centrality. J. Math. Sociol. 25(2), 163–177 (2001).
33. Madduri, K., Ediger, D., Jiang, K., Bader, D. A. & Chavarría-Miranda, D. G. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets. Third Workshop MTAAP (2009).
34. Newman, M. E. J. Properties of highly clustered networks. Phys. Rev. E 68, 026121 (2003).


35. Allard, A., Hébert-Dufresne, L., Noël, P.-A., Marceau, V. & Dubé, L. J. Bond percolation on a class of correlated and clustered random graphs. J. Phys. A: Math. Theor. 45, 405005 (2012).
36. Ercsey-Ravasz, M., Lichtenwalter, R. N., Chawla, N. V. & Toroczkai, Z. Range-limited centrality measures in complex networks. Phys. Rev. E 85, 066103 (2012).

Acknowledgements

The authors wish to thank Louis Roy for the development of a k-core visualization tool; Yong-Yeol Ahn et al. for their link community algorithm; all the authors of the cited papers for providing their network data; and Calcul Québec for computing facilities. This research was funded by CIHR, NSERC and FRQ-NT.

Author contributions

L.H.-D. and A.A. designed the study. L.H.-D., A.A. and J.-G.Y. performed the computations. All authors have contributed to the analysis and wrote the manuscript.

Additional information

Supplementary information accompanies this paper at http://www.nature.com/scientificreports

Competing financial interests: The authors declare no competing financial interests.

How to cite this article: Hébert-Dufresne, L., Allard, A., Young, J. & Dubé, L.J. Global efficiency of local immunization on complex networks. Sci. Rep. 3, 2171; DOI:10.1038/srep02171 (2013).

This work is licensed under a Creative Commons Attribution-

NonCommercial-ShareAlike 3.0 Unported license. To view a copy of this license,

visit http://creativecommons.org/licenses/by-nc-sa/3.0
