This is the author’s version of the work. It is posted here for personal use only. Not for redistribution. The final version of this work was published in
the 10th International Workshop on Resilient Networks Design and Modeling (RNDM), 2018, DOI: 10.1109/RNDM.2018.8489838.
On the Trade-offs between User-to-Replica Distance
and CDN Robustness to Link Cut Attacks
Carlos Natalino, Amaro de Sousa, Lena Wosinska, and Marija Furdek
Optical Networks Laboratory (ONLab), KTH Royal Institute of Technology, Stockholm, Sweden.
E-mail: {carlosns, wosinska, marifur}@kth.se
Instituto de Telecomunicações/DETI, Universidade de Aveiro, Aveiro, Portugal.
E-mail: asou@ua.pt
Abstract—Content Delivery Networks (CDNs) are a key enabler for geographically-distributed content delivery with high throughput and low latency. As CDNs utilize the underlying optical core network infrastructure, they inherit its vulnerability to targeted link cut attacks, which can cause severe service degradation. One of the fundamental problems in CDN configuration is deciding on the placement of content replicas across the underlying network of data centers, which should balance multiple, often conflicting performance criteria. This paper investigates the implications of minimizing the average distance between the users and the content replicas on the CDN robustness to targeted link cuts.
To this end, we compute Pareto-optimal replica placement solutions with minimal user-to-replica distance and maximal robustness to link cut attacks of the highest damaging potential. The k best replica placement solutions in terms of the user-to-replica distance are calculated with an exact Integer Linear Programming (ILP) formulation. For each replica placement solution, the worst-case link cut attack scenario is identified by defining the Critical Link Set Detection (CLSD) problem, which returns the link set whose cutting disconnects the maximal number of nodes from the content. We develop an ILP model for the CLSD and evaluate the robustness of the resulting CDN attack scenario in terms of mean content accessibility. The approach is evaluated through extensive simulations on real-world reference topologies, indicating that it is possible to improve the robustness to link cuts at the expense of small user-to-replica distance penalties. Moreover, the improvement of robustness is more significant for topologies with smaller average node degree and when cuts involve a larger number of links.
I. INTRODUCTION
The ceaseless growth of network traffic and the proliferation
of services with strict performance requirements are propelling
the growth and significance of the distributed cloud networking
paradigm. A large portion of services, such as on-demand
and live media streaming, social networks, or web content
utilize Content Delivery Networks (CDNs) to bring the content
closer to the end users by replicating it across multiple,
geographically distributed data centers. In this way, CDNs
support scalable traffic growth, reduce network congestion,
and improve network and service performance in terms of,
e.g., throughput, latency and availability.
The optical network infrastructure that underpins CDNs comes with a set of physical-layer vulnerabilities which can be exploited by malicious attackers to disrupt the aggregated upper-layer services. Examples of such malicious attacks include the insertion of harmful optical signals or the disabling of critical network infrastructure [1]. A relatively straightforward method of physical-layer attack on the optical infrastructure is the targeted cutting of optical fiber links [2]. This type of attack can cause wide-area disruption not only by severing the connectivity, but also by overloading the remaining network elements and degrading service performance in terms of, e.g., throughput and latency.
Due to the instantiation of multiple content replicas at diverse locations, to which users can connect in an anycast manner, CDNs intrinsically support higher resiliency to failures. However, if the awareness of the underlying physical-layer vulnerabilities to targeted attacks is not taken into account during CDN planning, the network can still exhibit a great degree of security vulnerability in spite of a high number of deployed replicas [3].
One of the essential problems in CDN planning is the replica placement problem (RPP). It typically encompasses decisions on the number of content replicas to be deployed, and on the data center nodes to host them. The RPP can be solved with respect to multiple objectives and constraints. Minimization of the distance between the users and replicas is often prioritized by the operators in order to reduce communication latency and resource usage (provided there are sufficient resources to accommodate the shortest physical paths). Another operator concern is guaranteeing resiliency to failures and attacks. Network robustness must be high in order to avoid large-scale service disruption and the related losses. These and other CDN design criteria often incur trade-offs with one another, and it is very difficult to address all of them in one multi-objective approach.
The goal of this work is to investigate the impact of prioritizing the distance between the users and the content replicas as the guiding principle of RPP on the overall network robustness to targeted link cut attacks. To achieve a comprehensive evaluation of the trade-offs between distance minimization and robustness to link cut attacks, we formulate two optimization problems as compact Integer Linear Programs (ILPs). We first develop an ILP for the k-best replica placement problem aimed at minimizing the average user-replica distance. We then define the Critical Link Set Detection (CLSD) problem for the obtained replica placement solutions, which allows us to identify the p most critical links whose cutting causes maximum disruption. To quantify the CDN disruption caused by one attack instance, we use the Average Content Accessibility (ACA) [3], while µ-ACA [4] gauges CDN robustness over a set of attack scenarios. Using the described approach, we identify Pareto-optimal solutions for two real-world network topologies and analyze their performance. The simulation results indicate that it is possible to improve the robustness to link cuts at the expense of small user-to-replica distance penalties, and that the robustness improvement is more significant for topologies with smaller average node degree and when cuts involve a larger number of links.
The remainder of the paper is organized as follows. Section II presents an overview of the related work. Section III presents the ILP formulation of the k-best replica placement problem. Section IV introduces the approach for CDN robustness evaluation based on CLSD and µ-ACA. Section V analyses the simulation results, while Section VI concludes the paper.
II. RELATED WORK
The RPP as well as the critical link detection problem have been studied individually in the literature. For instance, the RPP can be modeled so as to optimize the user-to-replica and replica-to-replica resources required to support the CDN [5]. In [6], the authors propose ILP and heuristic solutions for the disaster-aware integrated RPP and Routing Assignment (RA) in networks, where disaster zones are modeled as Shared Risk Groups (SRGs). The work in [7] proposes a model that dynamically adapts the replica placement according to the assessment of the experienced disaster events and to the current user demand. A probabilistic disaster model is assumed, where devices in an SRG fail with a given probability. These works consider the RPP in the context of disaster-aware design of networks, where SRGs are known a priori due to the high (e.g., geographical) correlation of the network elements that can fail at the same time (see [8] for a more detailed survey). However, SRGs are not suitable in the context of targeted attacks because, in attack-aware networking, the importance of each element depends more on the traffic the element carries than on its geographical location.
An assessment of robustness for unicast traffic in several network topologies is presented in [9]. The work in [3] focuses on robustness measures for anycast traffic (more suitable for CDNs), and proposes the ACA measure, evaluating the performance of four replica placement strategies under targeted link cut attacks. The investigated replica placement strategies are based on traditional graph measures or on clustering approaches. In [4], the ACA is extended to evaluate the performance of network topologies over several different attack scenarios. In both works, link betweenness centrality is used to select which links belong to the targeted attacks, in a procedure similar to the community detection presented in [10]. Moreover, these works focus on the definition of new measures to assess the robustness of CDNs, and the RPP solutions considered are non-optimal. Evaluating the robustness of network topologies considering optimal solutions of the RPP as well as the worst-case attack (e.g., by optimally solving the CLSD) is still not addressed in the literature.
Concerning the identification of critical elements in a given topology, the first works addressed the critical node detection problem, defined as the identification of a node set that, if removed from the topology, minimizes a given connectivity metric [11], [12], [13], [14]. More recently, the critical link detection counterpart has also been addressed in different contexts, such as the minimization of the pairwise connectivity of communication networks [15], the minimization of the spread of infections over a population [16], and the influence propagation in social networks [17]. Here, the critical link set detection problem is defined in the context of CDNs and used to model the worst-case link cut attacks.
III. k-BEST REPLICA PLACEMENT PROBLEM
For a given core network, we consider the problem of selecting R nodes to host the replica of a particular content. The quality of a replica placement solution is determined as the average shortest path distance from every node to its closest replica. The objective of the k-best Replica Placement Problem (k-RPP) is to enumerate the best k solutions sorted in a non-decreasing order of their quality. The k replica placement solutions can be computed in k iterations. At iteration i, the i-th solution is obtained by solving an ILP model that takes into consideration all previous solutions.
The ILP model defining the optimal replica placement solution is as follows. Consider a network defined by a directed graph $G=(V,A)$. Set $V$ is the node set, with a number of nodes $n=|V|$. Set $A$ is the arc set, where arc $(i,j) \in A$ represents a directed link from node $i \in V$ to node $j \in V$. Set $V_i$ is the set of nodes adjacent to $i$ in $G$. Consider also a node discrimination parameter $t^i_s$ (with $s \in V$ and $i \in V$) that is equal to 1 if $s=i$, and to 0 if $s \neq i$. The length of each arc $(i,j) \in A$ is denoted as $d_{ij}$. Consider the following binary variables:

$r_q$ is equal to 1 if node $q \in V$ is selected to host a replica, or 0 otherwise;

$y^q_s$ is equal to 1 if the replica hosted in node $q \in V$ is the closest replica to node $s$, or 0 otherwise;

$z^s_{ij}$ is equal to 1 if arc $(i,j) \in A$ is contained in the path from node $s \in V$ to its closest replica, or 0 otherwise.

For the computation of the $i$-th iteration, consider also the set $S$ with $|S|=i-1$, which represents the solutions obtained by solving the $i-1$ previous problems. Each $r' \in S$ represents one solution, where $r'_q$ is equal to 1 if node $q \in V$ is selected to host a replica in the solution $r'$. For a given number of replicas $R$, the replica placement solution that minimizes the average distance from each node to its closest replica is the optimal solution of the following replica placement problem with minimal distance (RPP-minD) model:

RPP-minD$(G,R,S)$:

Minimize
$$\sum_{s \in V} \sum_{(i,j) \in A} d_{ij} \, z^s_{ij} \quad (1)$$

Subject to:
$$\sum_{q \in V} r_q = R \quad (2)$$
$$\sum_{q \in V} y^q_s = 1, \quad \forall s \in V \quad (3)$$
$$y^q_s \leq r_q, \quad \forall s \in V, \; q \in V \quad (4)$$
$$\sum_{j \in V_i} \left( z^s_{ji} - z^s_{ij} \right) = y^i_s - t^i_s, \quad \forall s \in V, \; i \in V \quad (5)$$
$$\sum_{q \in V} r'_q \, r_q \leq R-1, \quad \forall r' \in S \quad (6)$$
$$r_q \in \{0,1\}, \quad \forall q \in V \quad (7)$$
$$y^q_s \in \{0,1\}, \quad \forall s \in V, \; q \in V \quad (8)$$
$$z^s_{ij} \in \{0,1\}, \quad \forall s \in V, \; (i,j) \in A \quad (9)$$
The objective function (1) is the minimization of the sum of the lengths of all arcs contained in all paths. Note that the average shortest path distance from every node to its closest replica is the value of (1) divided by the number of nodes $n$. Constraint (2) guarantees that $R$ replica locations are selected. Constraints (3) guarantee that one location $q \in V$ is selected as the closest replica to each node $s \in V$, and constraints (4) guarantee that the selected location $q \in V$ is a replica location. Constraints (5) are the path conservation constraints for the arcs of the path from each node $s \in V$ to its closest replica location defined by $y^q_s$. Constraints (6) are responsible for removing the solutions obtained in the previous iterations from the current solution. Finally, constraints (7)–(9) are the variable domain constraints.
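To make the formulation concrete, the following is a minimal Python sketch of RPP-minD using the open-source PuLP library with its bundled CBC solver. The paper itself uses a Java tool with CPLEX, so the library choice, the function name rpp_min_d, and the input data structures are illustrative assumptions, not the authors' implementation.

```python
import pulp

def rpp_min_d(nodes, arcs, dist, R, excluded):
    """One RPP-minD instance. nodes: list of node ids; arcs: list of
    directed (i, j) pairs; dist: {(i, j): d_ij}; R: number of replicas;
    excluded: previously found placements, each a set of node ids."""
    prob = pulp.LpProblem("RPP_minD", pulp.LpMinimize)
    r = {q: pulp.LpVariable(f"r_{q}", cat="Binary") for q in nodes}
    y = {(q, s): pulp.LpVariable(f"y_{q}_{s}", cat="Binary")
         for q in nodes for s in nodes}
    z = {(s, i, j): pulp.LpVariable(f"z_{s}_{i}_{j}", cat="Binary")
         for s in nodes for (i, j) in arcs}

    # Objective (1): total length of all node-to-closest-replica paths.
    prob += pulp.lpSum(dist[i, j] * z[s, i, j]
                       for s in nodes for (i, j) in arcs)
    prob += pulp.lpSum(r.values()) == R                      # (2)
    for s in nodes:
        prob += pulp.lpSum(y[q, s] for q in nodes) == 1      # (3)
        for q in nodes:
            prob += y[q, s] <= r[q]                          # (4)
        for i in nodes:
            # (5): flow conservation on the path from s to its closest
            # replica; the parameter t^i_s is 1 iff i == s.
            prob += (pulp.lpSum(z[s, a, b] for (a, b) in arcs if b == i)
                     - pulp.lpSum(z[s, a, b] for (a, b) in arcs if a == i)
                     == y[i, s] - (1 if s == i else 0))
    for prev in excluded:                                    # (6)
        prob += pulp.lpSum(r[q] for q in prev) <= R - 1

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    placement = {q for q in nodes if r[q].value() > 0.5}
    # Divide (1) by n to obtain the average user-to-replica distance.
    return placement, pulp.value(prob.objective) / len(nodes)
```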
The k-best replica placements are then computed with Alg. 1. First, the set of k best replica placements is initialized to an empty set (line 1). For each one of the k best replica placements (line 2), the algorithm solves the i-th RPP-minD problem excluding all the i−1 solutions previously computed (line 3). Once a new solution is computed, it is added to the list of computed solutions (line 4). When all the k problems are solved, the set with all the solutions is returned (line 5). Once the set of k-best replica placements that minimize the user-to-replica distance for the given topology is found, we need to identify the worst-case link cut scenario and quantify the robustness of the obtained solutions.

Algorithm 1: k-best replica placements algorithm
Data: G, R, k
Result: Set of k-best replica placements (S)
1: S ← ∅;
2: for i = 1 to k do
3:   s_i ← RPP-minD(G, R, S);
4:   S ← S ∪ {s_i};
5: return S;
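Alg. 1 translates directly into a short loop around the model above; this sketch reuses the hypothetical rpp_min_d() helper from the previous listing.

```python
# Minimal sketch of Algorithm 1 (k-best replica placements). Each
# iteration excludes all previously found placements via constraint (6).
def k_best_placements(nodes, arcs, dist, R, k):
    solutions = []                                  # line 1: S <- empty set
    for _ in range(k):                              # line 2: for i = 1 to k
        placement, avg_dist = rpp_min_d(            # line 3: solve RPP-minD
            nodes, arcs, dist, R,
            excluded=[p for p, _ in solutions])
        solutions.append((placement, avg_dist))     # line 4: S <- S U {s_i}
    return solutions                                # line 5: return S
```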
IV. MEAN CONTENT ACCESSIBILITY BASED ON CRITICAL LINK SET DETECTION

The problem of identifying the worst-case link cut scenario that incurs the maximum damage in a CDN is formulated as an optimization problem, which we refer to as Critical Link Set Detection (CLSD). The damage from a set of worst-case attacks with different extents is then quantified in terms of the mean content accessibility (µ-ACA) defined in [4].
For a given CDN network and attack intensity $p$, the objective of the CLSD problem is to determine the set of $p$ links whose removal from the network leaves a minimum number of nodes connected to any content replica. We consider the network modeled by an undirected graph $G=(V,E)$, where $V$ is the set of nodes of size $n=|V|$, and $E$ is the set of undirected network links with elements defined by their end nodes $(i,j)$, $i,j \in V$ and $i<j$. In addition, set $E^c$ is the set of non-adjacent node pairs $(i,j)$, $i<j$. Again, $V_i$ is the set of nodes adjacent to $i$ in $G$. Then, set $V_{ij}$ is defined as the set of nodes adjacent to the node with the lower degree between $i$ and $j$ (i.e., set $V_{ij}$ is equal to $V_i$ if $|V_i| \leq |V_j|$, and to $V_j$ otherwise).

The placement of replicas is defined by the set of nodes $D \subseteq V$ which host the content, and which are found, e.g., by solving the RPP-minD problem. Based on $D$, we define set $F$ as the set of node pairs $(i,j)$ such that one node is in $D$ (i.e., it hosts a replica) and the other node is in $V \setminus D$ (i.e., it does not host a replica and needs to connect to one to access the content).
To define the CLSD problem, we use the following binary variables:

$x_{ij}$ is equal to 1 if link $(i,j) \in E$ is included in the critical link set, and 0 otherwise;

$u_{ij}$ is equal to 1 if nodes $i$ and $j$, $i<j$, can be connected when the critical link set is removed from $G$, and 0 otherwise;

$v_i$ is equal to 1 if node $i \in V \setminus D$ can be connected to at least one node in $D$ when the critical link set is removed from $G$, and 0 otherwise.

For the sake of readability, in the following formulation, variables $u_{ij}$ can appear for any $i,j$, but both notations $u_{ij}$ and $u_{ji}$ represent the same variable $u_{ij}$, with $i<j$. For a given number of links $p$, and given replica locations defined by set $D$, CLSD is defined by the following ILP model:

CLSD$(p,D)$:

Minimize
$$\sum_{i \in V \setminus D} v_i \quad (10)$$

Subject to:
$$\sum_{(i,j) \in E} x_{ij} = p \quad (11)$$
$$u_{ij} \geq 1 - x_{ij}, \quad \forall (i,j) \in E \quad (12)$$
$$u_{ij} \geq u_{ik} + u_{jk} - 1, \quad \forall (i,j) \in E^c, \; k \in V_{ij} \quad (13)$$
$$u_{ij} \leq v_i, \quad \forall (i,j) \in F: j \in D \quad (14)$$
$$u_{ij} \leq v_j, \quad \forall (i,j) \in F: i \in D \quad (15)$$
$$x_{ij} \in \{0,1\}, \quad \forall (i,j) \in E \quad (16)$$
$$u_{ij} \in \{0,1\}, \quad i=1,\ldots,n-1, \; j=i+1,\ldots,n \quad (17)$$
$$v_i \in \{0,1\}, \quad \forall i \in V \setminus D \quad (18)$$
The objective function (10) is the minimization of the number of nodes which do not host a replica and can connect to at least one node in $D$. Note that the total number of connected nodes is the value of (10) plus the number of replicas $|D|$, since nodes hosting a replica always have access to the content. Constraint (11) guarantees that the set of identified critical links contains $p$ links. Constraints (12) guarantee that the end nodes $i$ and $j$ of a link $(i,j) \in E$ are connected if the link is not included in the critical link set (i.e., if $x_{ij}=0$). Constraints (13) guarantee that non-adjacent nodes $i$ and $j$ (i.e., $(i,j) \in E^c$) are connected if there exists a node $k$ that is connected to both $i$ and $j$. In general, we can define one constraint (13) for any $k \neq i,j$. We minimize the number of constraints by considering only nodes $k$ adjacent to either $i$ or $j$, whichever has the lower degree, as defined by $V_{ij}$. For each node pair $(i,j) \in F$ such that node $j$ is the replica host, constraints (14) set the value of variable $v_i$ to 1 if node $i$ is connected to $j$ and can, thus, access the content (constraints (15) account for the cases where node $i$ is the replica host). Finally, constraints (16)–(18) are the variable domain constraints.
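As with RPP-minD, the CLSD model can be prototyped with PuLP; again this is a sketch under assumed data structures (node lists, undirected edge lists with $i<j$), not the authors' CPLEX/Java implementation.

```python
import itertools
import pulp

def clsd(nodes, edges, D, p):
    """CLSD(p, D). nodes: list of node ids; edges: undirected links as
    (i, j) pairs with i < j; D: set of replica-hosting nodes; p: number
    of links to cut. Returns (critical link set, objective of (10))."""
    def key(i, j):              # u_{ij} and u_{ji} are the same variable
        return (i, j) if i < j else (j, i)

    edge_set = {key(i, j) for (i, j) in edges}
    adj = {i: set() for i in nodes}
    for (i, j) in edges:
        adj[i].add(j)
        adj[j].add(i)

    prob = pulp.LpProblem("CLSD", pulp.LpMinimize)
    x = {e: pulp.LpVariable(f"x_{e[0]}_{e[1]}", cat="Binary")
         for e in edge_set}
    u = {key(i, j): pulp.LpVariable(f"u_{i}_{j}", cat="Binary")
         for (i, j) in itertools.combinations(nodes, 2)}
    v = {i: pulp.LpVariable(f"v_{i}", cat="Binary")
         for i in nodes if i not in D}

    prob += pulp.lpSum(v.values())                       # objective (10)
    prob += pulp.lpSum(x.values()) == p                  # (11)
    for e in edge_set:                                   # (12)
        prob += u[e] >= 1 - x[e]
    for (i, j) in itertools.combinations(nodes, 2):      # (13)
        if key(i, j) in edge_set:
            continue                                     # only pairs in E^c
        # Restrict intermediate nodes to V_ij: the neighbors of the
        # lower-degree endpoint, as in the paper.
        lower = i if len(adj[i]) <= len(adj[j]) else j
        for k in adj[lower]:
            prob += u[key(i, j)] >= u[key(i, k)] + u[key(j, k)] - 1
    for i in v:                                          # (14)-(15)
        for q in D:
            prob += v[i] >= u[key(i, q)]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    cut = [e for e in edge_set if x[e].value() > 0.5]
    return cut, int(round(pulp.value(prob.objective)))
```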
The evaluation of the robustness of a CDN network solution with $R$ replicas placed according to RPP-minD over multiple link cut attack scenarios is performed by calculating the corresponding Mean Content Accessibility (µ-ACA). For a given network defined by graph $G$, a set $D$ defining the replica placement, and a range of attack intensities defined by a lower and an upper bound on the number of cut links, i.e., $p_{min}$ and $p_{max}$, respectively, we first solve the CLSD problem for each $p$, $p_{min} \leq p \leq p_{max}$. The Mean Content Accessibility (µ-ACA) for the considered set of attacks is then given by:
$$\mu\text{-ACA} = \frac{1}{p_{max} - p_{min} + 1} \sum_{p=p_{min}}^{p_{max}} \left( \mathrm{CLSD}(p,D) + |D| \right) \quad (19)$$
In the above equation, $\mathrm{CLSD}(p,D) + |D|$ is the number of nodes that can connect to a replica location in the worst possible attack of $p$ link cuts. The µ-ACA averages these values over all worst-case attack scenarios for all values of $p$ ranging from $p_{min}$ to $p_{max}$.
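Assuming the clsd() helper sketched above returns the optimal value of (10), equation (19) reduces to a short average:

```python
# Sketch of equation (19): average, over all attack intensities p, the
# worst-case number of nodes that remain connected to some replica.
# (Dividing by the number of nodes n would normalize this to a fraction.)
def mu_aca(nodes, edges, D, p_min, p_max):
    values = [clsd(nodes, edges, D, p)[1] + len(D)
              for p in range(p_min, p_max + 1)]
    return sum(values) / (p_max - p_min + 1)
```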
V. COMPUTATIONAL RESULTS

The simulations whose results are reported in this section were carried out on two publicly available reference networks: the Germany50 topology [18] shown in Fig. 1 and the Coronet Conus topology [19] shown in Fig. 2. To compute link lengths, we consider that each link follows the shortest path over a sphere surface representing Earth (for the determination of geographical distances, see http://www.movable-type.co.uk/scripts/latlong.html describing appropriate methods).
A custom-built Java-based tool was developed, using the CPLEX 12.6.3 callable library to solve all ILP models. In each test case, CPLEX solved the models to optimality (i.e., no gap was allowed), was configured with 4 parallel threads, and used default values for the remaining settings. For a given topology and values of k, R, p_min and p_max, the tool runs the k-best replica placement algorithm with R locations and stores, for each solution, the set of replica locations D and the average shortest-path user-to-replica distance. Then, for each replica placement solution, the tool solves all CLSD(p,D) models, for p = p_min, ..., p_max, and computes the value of µ-ACA as defined in (19). All computational results were obtained using a workstation running Red Hat Enterprise Linux (RHEL) with an 8-core, 16-thread Intel Xeon processor clocked at 3 GHz and 64 GB of RAM.

Fig. 1. Germany50: 50 nodes, 88 links and average node degree 3.52.

Fig. 2. Coronet Conus: 75 nodes, 99 links and average node degree 2.64.
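The great-circle link lengths can be computed with the standard haversine formula, one of the methods described on the movable-type page cited above; a minimal sketch:

```python
from math import radians, sin, cos, asin, sqrt

def great_circle_km(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance in km between two (latitude, longitude)
    points given in degrees, using the haversine formula. The mean Earth
    radius of 6371 km is a common convention assumed here."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))
```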
For both topologies, we consider the replica placement problem for R = 3, 4, 5, and 6 content replicas and compute the 500 best replica placement solutions. In all cases, the µ-ACA is computed based on p_min = 2 (since both topologies are 2-connected, p = 1 would not disconnect any node). Concerning p_max, recall that this parameter is the maximum number of links that are simultaneously cut by a malicious attack. To evaluate the influence of this parameter on CDN robustness, we consider the values of p_max = 6, 9, and 12. For all of the considered test cases, we record the µ-ACA, the average user-to-replica distance, and the computational running times. Among all 500 replica placements of each case, we compute the Pareto-optimal solutions (i.e., the solutions representing different trade-offs between the average shortest path distance and the µ-ACA measure).
Fig. 3. Replica placements (nodes in blue), critical links (lines in red) and disconnected nodes (nodes in red) of the 1st solution (top row; panels (a)-(c) for p = 6, 9, and 12) and the 209th solution (bottom row; panels (d)-(f) for p = 6, 9, and 12), for R = 5 replicas.
Fig. 3 shows two representative solutions for the Germany50 network and R = 5 replicas. The solution shown in the top row of the figure is the one with minimal user-to-replica distance, i.e., the first solution of k-RPP. The solution in the bottom row is the one with the highest robustness to link cut attacks; its replica placement ranks 209th among the 500 best solutions in terms of user-to-replica distance. In each case, the network nodes that host the replicas of the content are denoted in blue. The average shortest path distance to a replica is 115.7 km in the first case and 120.7 km in the second case (representing a distance penalty of 4.3%).

For each of the replica placement solutions, Fig. 3 shows the CLSD solution for the p = 6, 9, and 12 most critical links, respectively. The identified critical links are denoted in red, as are the nodes which are left without access to the content upon cutting the critical links. When the min-distance design is compared to the max-robustness one (among the 500 best replica placement solutions), the number of nodes that cannot access any replica decreases in the latter from 8 to 7 for p = 6, from 14 to 10 for p = 9, and from 21 to 15 for p = 12. These results clearly indicate that a small relaxation of the priority of distance minimization can lead to a substantial increase in the CDN robustness to a worst-case link cut attack.
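The Pareto-optimal subset reported next can be extracted from the 500 (average distance, µ-ACA) pairs with a simple dominance filter; a minimal sketch (lower distance and higher µ-ACA are better):

```python
# Keep a solution only if no other solution is at least as good in both
# criteria and strictly better in one (standard Pareto dominance).
def pareto_front(solutions):
    """solutions: list of (avg_distance, mu_aca) tuples.
    Returns the non-dominated subset."""
    front = []
    for dist_a, aca_a in solutions:
        dominated = any(
            dist_b <= dist_a and aca_b >= aca_a
            and (dist_b < dist_a or aca_b > aca_a)
            for dist_b, aca_b in solutions)
        if not dominated:
            front.append((dist_a, aca_a))
    return front
```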
Fig. 4 shows the Pareto-optimal solutions for all the test cases of the Germany50 and Coronet Conus topologies with R = 3 to 6 replicas and p_max = 6, 9, and 12 critical links.

Fig. 4. Average shortest path distance vs. µ-ACA of Pareto-optimal solutions for (a) Germany50 and (b) Coronet Conus. Line styles identify the maximum number of cut links (p_max) and markers identify the number of replicas (R).

Fig. 5. Average CPLEX runtime to solve the RPP-minD problem for different numbers of replicas (R) as a function of the previous k best replica placements, for (a) Germany50 and (b) Coronet Conus.

Fig. 6. Average CPLEX runtime to solve the CLSD problem for different numbers of replicas (R) as a function of the number of simultaneous link cuts (p), for (a) Germany50 and (b) Coronet Conus.

As a first observation, Fig. 4 shows that for both topologies
the mean content accessibility µ-ACA decreases for higher values of p_max and a fixed number of replicas R, which is expected since µ-ACA is an average over attacks involving a larger number of links. In both topologies, it is almost always possible to trade between user-to-replica distance and robustness to link cuts (the only exception is Germany50 for p_max = 12, where the optimal user-to-replica distance solution is almost optimal in terms of µ-ACA). Another conclusion common to both topologies is that the improvement in robustness, i.e., in µ-ACA values, is higher for more intensive attacks with higher values of p_max.

Comparing the results between the two topologies, the robustness improvements in Coronet Conus are higher and come at the cost of smaller user-to-replica distance penalties. To understand this result, recall that the average node degree of Germany50 (which is 3.52) is much higher than that of Coronet Conus (which is 2.64). This means that Germany50 is better connected, with a larger average number of routing paths between node pairs, and is, therefore, on average, more robust to the same number of link cuts. On the other hand, the Coronet Conus topology is more vulnerable to link cuts and, therefore, trading off on the distance during replica placement yields greater improvements of robustness to link cuts.
The average running times of solving the different ILP models for the Germany50 topology are shown in Figs. 5a and 6a. Fig. 5a shows the evolution of the average CPLEX runtime per solution of the RPP-minD problem for the different numbers of replicas R. Larger values of k increase the average runtime (recall that a new constraint is added to the model at each iteration, making it harder to solve), but the runtime growth becomes almost linear for values of k above 100. Moreover, the runtime evolution is similar for the different numbers of replicas and amounts to around 1.5 seconds per solution for k = 500. The average running time to solve the CLSD problem is shown in Fig. 6a, for p = 2, ..., 12 and R = 3, ..., 6. On average, a CLSD problem instance is solved in less than 2 seconds, while the maximum recorded runtime was 6 seconds.

Fig. 5b shows the evolution of the average runtime per solution of the RPP-minD problem for the different numbers of replicas R in the Coronet Conus network. Similarly to the Germany50 case, the average runtime becomes longer for larger values of k and grows almost linearly for values of k above 100. The runtime evolution is similar for all values of R and amounts to around 2.5 seconds per solution for k = 500. The average running time to solve the CLSD problem, shown in Fig. 6b, remains below 4 seconds for all cases, while the maximum observed runtime was 12 seconds.
VI. CONCLUSIONS

In this paper, we have addressed the content replica placement problem in the context of CDNs, aiming to find solutions providing different trade-offs between average user-to-replica distance and robustness to multiple link cut attacks. To this aim, we first developed ILP models both for the k-best replica placement problem in terms of user-to-replica distance and for the critical link set detection problem, the latter as a means to compute the link cut attack of highest damaging potential. Then, an exact approach based on these ILP models was proposed to compute Pareto-optimal replica placement solutions with minimal user-to-replica distance and maximal robustness to link cut attacks of the highest damaging potential.

We have applied the proposed approach to two real-world reference topologies. The results have shown that it is possible to improve the robustness to link cuts at the cost of small user-to-replica distance penalties. Moreover, the robustness improvement is more significant for topologies with smaller average node degree and when cuts involve a larger number of links. The computational experiments also showed that the proposed ILP models are efficiently solved by standard branch-and-cut techniques, as provided by available solvers.
ACKNOWLEDGMENT

This article is based upon work from COST Action CA15127 ("Resilient communication services protecting end-user applications from disaster-based failures", RECODIS) supported by COST (European Cooperation in Science and Technology) and the Celtic-Plus project SENDATE-EXTEND funded by VINNOVA. Amaro de Sousa has also been supported by FCT, Portugal, under the project UID/EEA/50008/2013.
REFERENCES

[1] N. Skorin-Kapov, M. Furdek, S. Zsigmond, and L. Wosinska, "Physical-layer security in evolving optical networks," IEEE Commun. Mag., vol. 54, no. 8, pp. 110–117, Aug. 2016.
[2] J. Rak, D. Hutchison, E. Calle, T. Gomes, M. Gunkel, P. Smith, J. Tapolcai, S. Verbrugge, and L. Wosinska, "RECODIS: Resilient communication services protecting end-user applications from disaster-based failures," in International Conference on Transparent Optical Networks (ICTON), July 2016, pp. 1–4.
[3] C. Natalino, A. Yayimli, L. Wosinska, and M. Furdek, "Content accessibility in optical cloud networks under targeted link cuts," in International Conference on Optical Network Design and Modeling (ONDM), May 2017, pp. 1–6.
[4] ——, "Link addition framework for optical CDNs robust to targeted link cut attacks," in International Workshop on Resilient Networks Design and Modeling (RNDM), Sept. 2017, pp. 1–7.
[5] A. Muhammad, N. Skorin-Kapov, and M. Furdek, "Manycast, anycast and replica placement (MARP) in optical inter-datacenter networks," IEEE/OSA J. Opt. Commun. Netw., vol. 9, no. 12, pp. 1161–1171, 2017.
[6] M. F. Habib, M. Tornatore, M. De Leenheer, F. Dikbiyik, and B. Mukherjee, "Design of disaster-resilient optical datacenter networks," J. Lightw. Technol., vol. 30, no. 16, pp. 2563–2573, Aug. 2012.
[7] S. Ferdousi, F. Dikbiyik, M. F. Habib, M. Tornatore, and B. Mukherjee, "Disaster-aware datacenter placement and dynamic content management in cloud networks," IEEE/OSA J. Opt. Commun. Netw., vol. 7, no. 7, pp. 681–694, July 2015.
[8] C. Colman-Meixner, C. Develder, M. Tornatore, and B. Mukherjee, "A survey on resiliency techniques in cloud computing infrastructures and applications," IEEE Commun. Surveys Tuts., vol. 18, no. 3, pp. 2244–2281, 2016.
[9] D. F. Rueda, E. Calle, and J. L. Marzo, "Robustness comparison of 15 real telecommunication networks: Structural and centrality measurements," Journal of Network and Systems Management, vol. 25, no. 2, pp. 269–289, Apr. 2017.
[10] M. Girvan and M. E. J. Newman, "Community structure in social and biological networks," Proceedings of the National Academy of Sciences, vol. 99, no. 12, pp. 7821–7826, 2002.
[11] A. Arulselvan, C. W. Commander, L. Elefteriadou, and P. M. Pardalos, "Detecting critical nodes in sparse graphs," Computers & Operations Research, vol. 36, no. 7, pp. 2193–2200, 2009.
[12] A. Veremyev, V. Boginski, and E. L. Pasiliao, "Exact identification of critical nodes in sparse networks via new compact formulations," Optimization Letters, vol. 8, no. 4, pp. 1245–1259, Apr. 2014.
[13] M. Di Summa, A. Grosso, and M. Locatelli, "Branch and cut algorithms for detecting critical nodes in undirected graphs," Computational Optimization and Applications, vol. 53, no. 3, pp. 649–680, Dec. 2012.
[14] D. Santos, A. de Sousa, and P. Monteiro, "Compact models for critical node detection in telecommunication networks," Electronic Notes in Discrete Mathematics, vol. 64, pp. 325–334, 2018, 8th International Network Optimization Conference (INOC 2017).
[15] T. N. Dinh, Y. Xuan, M. T. Thai, P. M. Pardalos, and T. Znati, "On new approaches of assessing network vulnerability: Hardness and approximation," IEEE/ACM Transactions on Networking, vol. 20, no. 2, pp. 609–619, Apr. 2012.
[16] A. K. Nandi and H. R. Medal, "Methods for removing links in a network to minimize the spread of infections," Computers & Operations Research, vol. 69, pp. 10–24, 2016.
[17] C. P. Gillen, A. Veremyev, O. A. Prokopyev, and E. L. Pasiliao, "Critical arcs detection in influence networks," Networks, Aug. 2017.
[18] S. Orlowski, R. Wessäly, M. Pióro, and A. Tomaszewski, "SNDlib 1.0 - survivable network design library," Networks, vol. 55, no. 3, pp. 276–286, May 2010.
[19] J. Simmons, Optical Network Design and Planning, 2nd ed. Springer, 2014.
Abstract—Today’s businesses increasingly rely on cloud computing, which brings both great opportunities and challenges. One of the critical challenges is resiliency: disruptions due to failures (either accidental or because of disasters or attacks) may entail significant revenue losses (e.g., US$ 25.5 billion in 2010 for North America). Such failures may originate at any of the major components in a cloud architecture (and propagate to others): (i) the servers hosting the application, (ii) the network interconnecting them (on different scales, inside a data center, up to wide-area connections), or (iii) the application itself. We comprehensively survey a large body of work focusing on resilience of cloud computing, in each (or a combination) of the server, network, and application components. First, we present the cloud computing architecture and its key concepts. We highlight both the infrastructure (servers, network) and application components. A key concept is virtualization of infrastructure (i.e., partitioning into logically separate units), and thus we detail the components in both physical and virtual layers. Before moving to the detailed resilience aspects, we provide a qualitative overview of the types of failures that may occur (from the perspective of the layered cloud architecture), and their consequences. The second major part of the paper introduces and categorizes a large number of techniques for cloud computing infrastructure resiliency. This ranges from designing and operating the facilities, servers, networks, to their integration and virtualization (e.g., also including resilience of the middleware infrastructure). The third part focuses on resilience in application design and development. We study how applications are designed, installed, and replicated to survive multiple physical failure scenarios as well as disaster failures. Index Terms—Cloud computing, resilience, virtualization, middleware, optical networks, disaster resilience.