NETWORK BANDWIDTH ALLOCATION PROBLEM
FOR CLOUD COMPUTING
Changpeng Yang2, Jintao You2, Xiaoming Yuan1, Pengxiang Zhao1
*Corresponding author
1Department of Mathematics, University of Hong Kong
2Algorithm Innovation Lab, Huawei
E-mails: yangchangpeng@huawei.com, youjintao5@huawei.com,
xmyuan@hku.hk, pengxiangzhao@connect.hku.hk
March 15, 2022
Abstract. Cloud computing enables ubiquitous, convenient, and on-demand network access to a shared pool of computing resources. Cloud computing technologies create tremendous commercial value in various areas, and many scientific challenges have arisen accordingly. The process of transmitting data through networks has some distinctive characteristics: nonlinear, nonconvex, and even noncontinuous cost functions generated by pricing schemes; periodically updated network topology; and replicable data within network nodes. Because of these characteristics, data transfer scheduling is a very challenging problem from both engineering and scientific perspectives. On the other hand, the cost of bandwidth is a major component of the operating cost of cloud providers, and thus saving bandwidth cost is extremely important for supplying service at minimized cost. We propose the Network Bandwidth Allocation (NBA) problem for cloud computing and formulate it as a high-level integer programming model, with which more comprehensive and rigorous scientific studies become possible. We also show that the NBA problem captures some of the major cloud computing scenarios, including the content delivery network (CDN), the live video delivery network (LVDN), the real-time communication network (RTCN), and the cloud wide area network (Cloud-WAN).
Keywords: cloud computing, bandwidth allocation, network, the 95th percentile billing,
integer programming, content delivery network, live video delivery network, real-time
communication, cloud wide area network
arXiv:2203.06725v1 [math.OC] 13 Mar 2022
1 Introduction
Nowadays, cloud computing is the infrastructure of the information technology industry; it has boosted revolutionary technologies in various areas and essentially reshaped some economic ecosystems. The term "cloud" originates from the field of telecommunications, where providers offer Virtual Private Network (VPN) service while being allowed to redirect traffic to balance the load of the overall network; see [1]. Cloud computing extends the VPN service: it enables ubiquitous, convenient, and on-demand network access to a shared pool of computing resources, which may include networks, servers, storage, and applications; see, e.g., [2]. Cloud computing is the pillar of many highly commercialized economic ecosystems, and it creates tremendous commercial value.
Customers, cloud providers, and Internet service providers (ISPs) are the three major stakeholders in cloud computing. A cloud provider targets supplying reliable, customized, and quality-of-service (QoS) guaranteed service to customers, while minimizing the bandwidth cost (paid to ISPs), which is indeed the main component of the overall operating cost; see [3, 4]. Therefore, scheduling data transmission to save bandwidth cost is extremely important for cloud providers. In contrast to other traffic problems such as vehicle routing problems, data transfer through computer networks has the following three distinctive characteristics:
- The cost functions generated by pricing schemes are nonlinear, nonconvex, and noncontinuous;
- The network topology updates periodically;
- Data can be replicated within network nodes.
The first feature is standard in the industry. For example, as mentioned in [5, 6, 7], the well-known 95th percentile billing is a widely used pricing scheme, which charges based on the 95th percentile of the bandwidth distribution over a monthly period. Due to the variability of network status (for example, congestion may occur on some connections), the network topology may be updated periodically to meet requirements such as reliability and low latency; this explains the second characteristic above. As for the third one, digital data is a special kind of merchandise that can be replicated in devices without occupying transfer bandwidth. Therefore, a network node, which usually refers to a server or a data center, only needs to request desired data from one other node, while it can provide the obtained data to more than one node. An underlying consequence is that the egress bandwidth of an intermediary network node contained in the path of transmitting the same content is usually more than its ingress bandwidth. To the best of our knowledge, this is the first time this characteristic has been considered in modelling cloud computing problems. Motivated by the industrial concern of saving bandwidth cost as well as these important attributes of data transfer over computer networks, we propose the Network Bandwidth Allocation (NBA) problem for various cloud computing problems, and initiate the effort of studying it from the optimization perspective.
Formally speaking, the NBA problem is defined on a network $G=(V,E)$ during a billing cycle $P$, in which $V=\{1,2,\ldots,n\}$ is the set of network nodes abstracted from servers or data centers; $E=\{(i,j)\mid i,j\in V,\ i\neq j\}$ is the set of directed edges abstracted from network links; and $P$ is a given period of time separated by a set of sampling time points $T=\{1,2,\ldots,p\}$. The egress bandwidth of the node $i\in V$, denoted by $\phi_i$, is the sum of the bandwidth allocated to the directed edges $\{(u,v)\mid u=i,\ v\in V,\ (u,v)\in E\}$; its ingress bandwidth, denoted by $\psi_i$, is the sum of the bandwidth allocated to the directed edges $\{(u,v)\mid u\in V,\ v=i,\ (u,v)\in E\}$. We denote by $c^{\mathrm{out}}_i$ and $c^{\mathrm{in}}_i$ the admissible maximum egress bandwidth and maximum ingress bandwidth of the node $i\in V$, respectively. During the billing cycle $P$, the pricing scheme is associated with the bandwidth distribution over $T$. For the node $i\in V$, let $b^{\mathrm{out}}_i$ be the 95th percentile of its egress bandwidth distribution $\{\phi^{(t)}_i\}_{t=1}^{p}$, and let $b^{\mathrm{in}}_i$ be the 95th percentile of its ingress bandwidth distribution $\{\psi^{(t)}_i\}_{t=1}^{p}$. The cost incurred on $i$ is given by $u_i\cdot\max\{b^{\mathrm{out}}_i, b^{\mathrm{in}}_i\}$, where $u_i>0$ is the unit price and $\max(\cdot)$ is the operation that returns the maximum element of a set of numbers. Assume that in the time slot $[t,t+1]$, $t\in T$, the set of available edges is $E^{(t)}\subseteq E$; the set of source nodes providing data is $S^{(t)}\subseteq V$; the set of destination nodes requiring data that originates from the source node $s\in S^{(t)}$ is $D^{(t)}_s\subseteq V\setminus\{s\}$; and the bandwidth required to transmit such data with acceptable latency between two nodes is at least $w^{(t)}_s>0$. The NBA problem consists of determining bandwidth allocation plans for the network $G$ that minimize the total bandwidth cost during $P$, while the following conditions are satisfied in every time slot $[t,t+1]$, $t\in T$:
- Data can be transmitted out from all source nodes;
- Data can be transmitted into all destination nodes;
- The egress bandwidth allocated to a node does not exceed the admissible maximum egress bandwidth, and the ingress bandwidth allocated to it does not exceed the admissible maximum ingress bandwidth;
- To transmit data that originates from the same source node, the required ingress bandwidth of a node is no more than the required egress bandwidth unless it is a destination node.
Note that the last condition takes the data replication ability of network nodes
into account. The NBA problem is an abstract and generic model, and it can
be extended flexibly to suit various scenarios in practice. The NBA problem is
challenging because of its distinctive characteristics mentioned above, and also
because of the high dimensionality of its variables in real applications.
In Section 2, we review related works. Then, we focus on the integer programming formulation of the NBA problem in Section 3. In Section 4, we show some real-life cloud computing applications that can be captured by the NBA problem and its extensions. Finally, some discussions are included in Section 5.
2 Literature review
To save bandwidth cost, it is conventional to apply a centralized controller to schedule data traffic in a network. In [8], a centralized software-driven wide area network (SWAN) system is proposed to coordinate the sending rates and network paths among data centers globally. Besides, a centralized traffic engineering system named B4 is proposed in [9] to improve the utilization of Google's global data centers, in which traffic flows are split into multiple paths for load balance. To balance the requirements of decreasing latency and saving cost, a system called the video delivery network (VDN) is proposed in [5] to allow cloud providers to control stream placement and bandwidth allocation dynamically. The proposed VDN incorporates a central controller and local agents into the traditional content delivery network infrastructure, where the central controller generates bandwidth allocation plans based on the current network state and the local agents deal with incoming requests in real time. These very interesting works are essentially based on engineering techniques, and they inspire the need for optimization models with more rigorous theoretical analysis.
The complexity of real-world pricing schemes also brings challenges to saving bandwidth cost. The 95th percentile billing is a widely used pricing scheme of ISPs, which is considered to reflect the required capacity of the connections in a network better than other methods. Because most networks are over-provisioned, there is usually a reserved margin to handle bursting traffic. The 95th percentile billing refers to ignoring the top 5% of samples and charging for the bandwidth used by the remaining 95% of samples. To illustrate, for monthly billing, the samples of consumed bandwidth are sorted at the end of each month, and the top 36 hours (i.e., the top 5% of the overall 720 hours) of peak traffic are not charged. It means that bandwidth can be used for free at a higher rate for up to 72 minutes per day. Therefore, mathematically, the cost function generated by 95th percentile billing is noncontinuous and nonconvex, and researchers have shown that optimizing the burstable billing of the 95th percentile usage is NP-hard; see, e.g., [10, 11]. To save cost charged by 95th percentile billing, a mixed integer linear programming (MILP) model is proposed in [6]. In [7], the authors propose another MILP model to further relax this problem. Note that the aforementioned characteristics of transmitting data through computer networks are not fully considered in these models. For the NBA problem to be proposed, we consider 95th percentile billing and the distinctive characteristics of data delivery through networks. Besides, the NBA problem can be adapted to other pricing schemes if the objective function is appropriately adjusted. Since the attributes of data transmission through networks are considered, feasible solutions of the NBA problem have special properties, which will be delineated in the following sections.
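The billing arithmetic above can be made concrete with a short script. The following is an illustrative sketch (not from the paper; the sample values and the percentile-index convention are our own assumptions) of how a monthly 95th percentile charge is computed:

```python
def percentile_95_charge(samples, unit_price):
    """Charge under 95th percentile billing: sort the bandwidth
    samples, discard the top 5%, and bill the largest remaining one."""
    ordered = sorted(samples)
    # Index of the 95th percentile sample (one common convention).
    idx = max(0, int(0.95 * len(ordered)) - 1)
    return unit_price * ordered[idx]

# A month sampled hourly (720 samples): flat 100 Mbps with 36 hours
# of bursts to 500 Mbps. The bursts fall in the ignored top 5%.
samples = [100.0] * 684 + [500.0] * 36
print(percentile_95_charge(samples, unit_price=2.0))  # 200.0
```

Here the 36 burst hours fall entirely into the ignored top 5%, so the charge is determined by the flat 100 Mbps level, illustrating why the cost function is noncontinuous in the peak usage.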
3 Formulation and model
In this section, we propose the NBA problem with details, and formulate it as an
integer programming model.
3.1 Problem identification
Let the NBA problem be defined on a network $G=(V,E)$ during a billing cycle $P$, where $V=\{1,2,\ldots,n\}$ is the set of network nodes abstracted from servers or data centers; $E=\{(i,j)\mid i,j\in V,\ i\neq j\}$ is the set of edges abstracted from network links; and $P$ is a given period of time separated by a set of sampling time points $T=\{1,2,\ldots,p\}$. Components are given as follows.
Definitions:
- Bandwidth of an edge: the amount of data transmitted over the edge in a given amount of time, measured in megabits per second (Mbps).
- Ingress bandwidth of a node: the ingress bandwidth of the node $i\in V$ is the sum of the bandwidth of the directed edges $\{(u,v)\mid u\in V,\ v=i,\ (u,v)\in E\}$.
- Egress bandwidth of a node: the egress bandwidth of the node $i\in V$ is the sum of the bandwidth of the directed edges $\{(u,v)\mid u=i,\ v\in V,\ (u,v)\in E\}$.
- Ingress bandwidth capacity of a node: its admissible maximum ingress bandwidth.
- Egress bandwidth capacity of a node: its admissible maximum egress bandwidth.
- Bandwidth allocation plan: a feasible solution of the NBA problem.
- Optimal bandwidth allocation plan: the optimal solution of the NBA problem.
Pricing scheme:
During a billing cycle $P$, the pricing scheme is associated with the bandwidth distribution over $T$. For the node $i\in V$, let $b^{\mathrm{out}}_i$ be the 95th percentile of its egress bandwidth distribution $\{\phi^{(t)}_i\}_{t=1}^{p}$; let $b^{\mathrm{in}}_i$ be the 95th percentile of its ingress bandwidth distribution $\{\psi^{(t)}_i\}_{t=1}^{p}$. The cost incurred on $i$ is given by $u_i\cdot\max\{b^{\mathrm{out}}_i, b^{\mathrm{in}}_i\}$, where $u_i>0$ is the unit price and $\max(\cdot)$ is the operation that returns the maximum element of a set of numbers.
Parameters:
- $u_i>0$ is the unit price of bandwidth of the node $i\in V$.
- $c^{\mathrm{in}}_i>0$ is the ingress bandwidth capacity of the node $i\in V$.
- $c^{\mathrm{out}}_i>0$ is the egress bandwidth capacity of the node $i\in V$.
- $E^{(t)}\subseteq E$ is the set of available edges in the time slot $[t,t+1]$, $t\in T$.
- $S^{(t)}\subseteq V$ is the set of source nodes providing data in the time slot $[t,t+1]$, $t\in T$.
- $D^{(t)}_s\subseteq V\setminus\{s\}$ is the set of destination nodes requiring data that originates from the source node $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$.
- $w^{(t)}_s>0$ is the least bandwidth required to deliver data that originates from the source node $s\in S^{(t)}$ between two nodes in the time slot $[t,t+1]$, $t\in T$.
Decision variables:
$f^{(t)}_{sij}=1$ if the edge $(i,j)\in E^{(t)}$ is contained in the bandwidth allocation plan to deliver the data that originates from the source node $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$; and $0$ otherwise.
Objective function:
The total bandwidth cost during $P$ should be minimized. Denote the egress bandwidth distribution of the node $i\in V$ over $T$ by
$$B^{\mathrm{out}}_i=\Big\{\sum_{s\in S^{(t)}}\Big(w^{(t)}_s\cdot\sum_{j\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}\Big)\Big\}_{t=1}^{p},\qquad(1)$$
and the ingress bandwidth distribution of the node $i\in V$ over $T$ by
$$B^{\mathrm{in}}_i=\Big\{\sum_{s\in S^{(t)}}\Big(w^{(t)}_s\cdot\sum_{h\in V:\,(h,i)\in E^{(t)}}f^{(t)}_{shi}\Big)\Big\}_{t=1}^{p}.\qquad(2)$$
Let $Q_{95}(\cdot)$ be the operation that returns the 95th percentile of a set of numbers. The goal is represented by
$$\min\ \sum_{i\in V}u_i\cdot\max\{Q_{95}(B^{\mathrm{out}}_i),\,Q_{95}(B^{\mathrm{in}}_i)\},\qquad(3)$$
where $\max(\cdot)$ is the operation that returns the maximum element of a set of numbers.
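To make the objective concrete, here is a minimal evaluation sketch (hypothetical Python; the dictionary layout of $f$ and $w$ and the $Q_{95}$ index convention are our own assumptions, not part of the paper):

```python
from collections import defaultdict

def q95(values):
    """95th percentile under one common convention: drop the top 5%."""
    ordered = sorted(values)
    return ordered[max(0, int(0.95 * len(ordered)) - 1)]

def nba_cost(V, T, w, f, u):
    """Evaluate the objective (3) for a given 0/1 assignment f, where
    f[(t, s, i, j)] = 1 iff edge (i, j) carries data from source s in
    slot t; w[(t, s)] is the required bandwidth; u[i] is the unit price."""
    out_bw = {i: [] for i in V}  # egress distribution B^out_i over T
    in_bw = {i: [] for i in V}   # ingress distribution B^in_i over T
    for t in T:
        phi, psi = defaultdict(float), defaultdict(float)
        for (tt, s, i, j), val in f.items():
            if tt == t and val == 1:
                phi[i] += w[(t, s)]  # the edge adds w_s to i's egress
                psi[j] += w[(t, s)]  # ... and to j's ingress
        for i in V:
            out_bw[i].append(phi[i])
            in_bw[i].append(psi[i])
    return sum(u[i] * max(q95(out_bw[i]), q95(in_bw[i])) for i in V)
```

For instance, a single persistent 10 Mbps stream over one edge $1\to 2$ for the whole cycle is charged $10\,u_1 + 10\,u_2$, matching a hand evaluation of (3).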
Constraints:
Data can be transmitted out from all source nodes in the time slot $[t,t+1]$, $t\in T$:
$$\sum_{j\in V}f^{(t)}_{ssj}\geq 1,\quad\forall\, s\in S^{(t)}.\qquad(4)$$
Data that originates from a source node can be transmitted into all destination nodes requiring it in the time slot $[t,t+1]$, $t\in T$:
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}\geq 1,\quad\forall\, s\in S^{(t)},\ \forall\, j\in D^{(t)}_s.\qquad(5)$$
The egress bandwidth of a node does not exceed its egress bandwidth capacity:
$$\sum_{s\in S^{(t)}}\Big(w^{(t)}_s\cdot\sum_{j\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}\Big)\leq c^{\mathrm{out}}_i,\quad\forall\, i\in V.\qquad(6)$$
The ingress bandwidth of a node does not exceed its ingress bandwidth capacity:
$$\sum_{s\in S^{(t)}}\Big(w^{(t)}_s\cdot\sum_{h\in V:\,(h,i)\in E^{(t)}}f^{(t)}_{shi}\Big)\leq c^{\mathrm{in}}_i,\quad\forall\, i\in V.\qquad(7)$$
To transmit data that originates from the same source node, the required ingress bandwidth of a node is no more than the required egress bandwidth unless it is a destination node:
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}\leq\sum_{k\in V:\,(j,k)\in E^{(t)}}f^{(t)}_{sjk},\quad\forall\, s\in S^{(t)},\ \forall\, j\in V\setminus D^{(t)}_s.\qquad(8)$$
Observation 1. As data can be replicated within network nodes, it is sufficient
to transmit a copy of data into a node by just one edge. Then, for the optimal
bandwidth allocation plan for transmitting data that is originally from the source
node sS(t)in the time slot [t, t + 1], t T, there is
X
(i,s)E(t)
iV
f(t)
sis = 0,sS(t),(9)
and
X
(i,j)E(t)
iV
f(t)
sij 1,sS(t),jV\ {s}.(10)
Proof. As the node $s\in S^{(t)}$ already contains the transmitted data, (9) holds. To transmit data that originates from the source node $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$, consider the node $j\in V\setminus\{s\}$ and a bandwidth allocation plan. If data is not transmitted over $j$, there holds
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}=0.\qquad(11)$$
If data is transmitted over $j$, assume that
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}=m,\qquad(12)$$
where $m\geq 2$. Then, (12) indicates that there are $m$ edges directed to the node $j$, denoted by $(i_1,j),(i_2,j),\ldots,(i_m,j)$, respectively. Select a node $i_k$, $k\in\{1,2,\ldots,m\}$, and delete the edge $(i_k,j)$; data can still be transmitted into the node $j$ through the remaining edges since $m\geq 2$. Besides, in the time slot $[t,t+1]$, $t\in T$, the egress bandwidth of the node $i_k$ decreases by $w^{(t)}_s$ after the deletion. It also holds that
$$Q_{95}(\{\phi^{(1)}_{i_k},\phi^{(2)}_{i_k},\ldots,\phi^{(t)}_{i_k}-w^{(t)}_s,\ldots,\phi^{(p)}_{i_k}\})\leq Q_{95}(\{\phi^{(1)}_{i_k},\phi^{(2)}_{i_k},\ldots,\phi^{(t)}_{i_k},\ldots,\phi^{(p)}_{i_k}\}),\qquad(13)$$
where $\phi^{(t)}_{i_k}$ is the egress bandwidth of the node $i_k$ in the time slot $[t,t+1]$, $t\in T$. Similarly, in the time slot $[t,t+1]$, $t\in T$, the ingress bandwidth of the node $j$ decreases by $w^{(t)}_s$ after the deletion, and
$$Q_{95}(\{\psi^{(1)}_{j},\psi^{(2)}_{j},\ldots,\psi^{(t)}_{j}-w^{(t)}_s,\ldots,\psi^{(p)}_{j}\})\leq Q_{95}(\{\psi^{(1)}_{j},\psi^{(2)}_{j},\ldots,\psi^{(t)}_{j},\ldots,\psi^{(p)}_{j}\}),\qquad(14)$$
where $\psi^{(t)}_{j}$ is the ingress bandwidth of the node $j$ in the time slot $[t,t+1]$, $t\in T$. According to (3), (13), and (14), we get an equivalent or better bandwidth allocation plan after the deletion. By induction, there is an equivalent or better bandwidth allocation plan than the current one such that
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}=1.\qquad(15)$$
Combining (11) and (15) yields (10).
According to Observation 1, (5) can be reformulated as
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}=1,\quad\forall\, s\in S^{(t)},\ \forall\, j\in D^{(t)}_s,\qquad(16)$$
which means that it is sufficient to transmit a copy of the data that originates from the source node $s\in S^{(t)}$ into each of its destination nodes in $D^{(t)}_s$ through just one edge.
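The inequalities (13) and (14) rest on the fact that lowering a single sample of a distribution can never increase its 95th percentile. A small numeric check (illustrative only; the sample values and the percentile convention are invented):

```python
def q95(values):
    """95th percentile (one common convention: drop the top 5%)."""
    ordered = sorted(values)
    return ordered[max(0, int(0.95 * len(ordered)) - 1)]

# Egress distribution of some node over p = 20 slots; slot 4 includes
# w = 30 Mbps spent feeding a duplicate copy into a downstream node.
phi = [50.0, 60.0, 70.0, 80.0, 90.0] * 4
reduced = phi[:]
reduced[4] -= 30.0                 # deleting the edge lowers that slot by w
assert q95(reduced) <= q95(phi)    # the inequality (13), numerically
print(q95(phi), q95(reduced))
```

In this instance the 95th percentile is unchanged, so the deletion yields an equivalent plan, exactly the case handled in the induction step of the proof.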
3.2 Integer programming model
With the specifications in Section 3.1, the NBA problem can be formulated as the following integer programming model:
$$\min\ \sum_{i\in V}u_i\cdot\max\Big\{Q_{95}\Big(\Big\{\sum_{s\in S^{(t)}}w^{(t)}_s\cdot\sum_{j\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}\Big\}_{t=1}^{p}\Big),\ Q_{95}\Big(\Big\{\sum_{s\in S^{(t)}}w^{(t)}_s\cdot\sum_{h\in V:\,(h,i)\in E^{(t)}}f^{(t)}_{shi}\Big\}_{t=1}^{p}\Big)\Big\}$$
subject to
$$\sum_{j\in V}f^{(t)}_{ssj}\geq 1,\quad\forall\, s\in S^{(t)},\ \forall\, t\in T,\qquad(17)$$
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}=1,\quad\forall\, s\in S^{(t)},\ \forall\, j\in D^{(t)}_s,\ \forall\, t\in T,\qquad(18)$$
$$\sum_{s\in S^{(t)}}\Big(w^{(t)}_s\cdot\sum_{j\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}\Big)\leq c^{\mathrm{out}}_i,\quad\forall\, i\in V,\ \forall\, t\in T,\qquad(19)$$
$$\sum_{s\in S^{(t)}}\Big(w^{(t)}_s\cdot\sum_{h\in V:\,(h,i)\in E^{(t)}}f^{(t)}_{shi}\Big)\leq c^{\mathrm{in}}_i,\quad\forall\, i\in V,\ \forall\, t\in T,\qquad(20)$$
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}-\sum_{k\in V:\,(j,k)\in E^{(t)}}f^{(t)}_{sjk}\leq 0,\quad\forall\, s\in S^{(t)},\ \forall\, j\in V\setminus D^{(t)}_s,\ \forall\, t\in T,\qquad(21)$$
$$f^{(t)}_{sij}\in\{0,1\},\quad\forall\, s\in S^{(t)},\ \forall\,(i,j)\in E^{(t)},\ \forall\, t\in T.\qquad(22)$$
To deal with this problem, the following propositions may be useful for algorithmic
design.
Proposition 1. Denote a bandwidth allocation plan for transmitting data that originates from the source node $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$, by a graph $G^{(t)}_s=(V^{(t)}_s,E^{(t)}_s)$, where $V^{(t)}_s=\{i\mid f^{(t)}_{sij}=1\ \text{or}\ f^{(t)}_{sji}=1,\ i,j\in V\}$ and $E^{(t)}_s=\{(i,j)\mid (i,j)\in E^{(t)},\ f^{(t)}_{sij}=1,\ i,j\in V^{(t)}_s\}$. Then, there is no cycle in $\bar{G}^{(t)}_s$, the underlying undirected graph of $G^{(t)}_s$.

Proof. We prove it by contradiction. Assume that there is a cycle contained in $\bar{G}^{(t)}_s$. Without loss of generality, we denote it by
$$(i_u,i_{u+1})\to\cdots\to(i_v,i_u).\qquad(23)$$
Also, we denote by $V_c$ the set of nodes contained in the cycle (23). Then, it follows from (10) that
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}=1,\quad\forall\, j\in V_c.\qquad(24)$$
By (9) and (24), we have $s\notin V_c$, which means that the nodes in $V_c$ do not contain the transmitted data originally. Then, to transmit data into the cycle (23) from $V^{(t)}_s\setminus V_c$, there must be at least one node $j\in V_c$ such that
$$\sum_{i\in V:\,(i,j)\in E^{(t)}}f^{(t)}_{sij}>1,\qquad(25)$$
which contradicts (24). Therefore, there is no cycle in $\bar{G}^{(t)}_s$; consequently, there is no cycle in $G^{(t)}_s$.
Proposition 2. Denote a bandwidth allocation plan for transmitting data that originates from the source node $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$, by a graph $G^{(t)}_s=(V^{(t)}_s,E^{(t)}_s)$, where $V^{(t)}_s=\{i\mid f^{(t)}_{sij}=1\ \text{or}\ f^{(t)}_{sji}=1,\ i,j\in V\}$ and $E^{(t)}_s=\{(i,j)\mid (i,j)\in E^{(t)},\ f^{(t)}_{sij}=1,\ i,j\in V^{(t)}_s\}$. Then, $G^{(t)}_s$ is a directed tree.

Proof. Consider $\bar{G}^{(t)}_s$, the underlying undirected graph of $G^{(t)}_s$. By Proposition 1, $\bar{G}^{(t)}_s$ is an undirected graph in which any two nodes are connected by exactly one path. Therefore, by the definition of a tree, $\bar{G}^{(t)}_s$ is an undirected tree. Moreover, $G^{(t)}_s$ is a directed tree.
4 Real applications
In this section, we show some major real-world cloud computing scenarios that are applications of the NBA problem. The scenarios include the content delivery network (CDN), the live video delivery network (LVDN), the real-time communication network (RTCN), and the cloud wide area network (Cloud-WAN).
4.1 Content delivery network
4.1.1 Background
The content delivery network (CDN) is a geographically distributed network of
servers. The goal is to decrease latency and thus improve QoS by distributing the
service spatially relative to customers, as a shorter transmission distance usually
means lower latency [12]. Figure 1 shows the logical architecture of the CDN. For
the CDN, there are two major components:
- Server nodes of the cloud provider, whose network is a tree;
- Customers that use various devices.
For the servers of the cloud provider, whose network topology is a tree, there are three categories: (1) the source server; (2) central servers; (3) edge servers. Let the source server be the root of the tree, and let there be $l$ layers. Contents are imported into the CDN through the source server; central servers forward contents layer by layer; and edge servers are responsible for transmitting content
to customers. As contents can be replicated and cached within servers, it is unnecessary to build a transmitting path from the source server to customers at all times.

Figure 1: The logical architecture of the content delivery network (CDN). [Figure: the cloud provider's server tree, with the source server at layer 1, central servers at layers 2 to $l-1$, and edge servers at layer $l$, serving the customers' devices.]

In other words, it only needs to build a transmitting path from a server
that already cached the desired contents to customers. Therefore, the process to
build the transmitting path for a customer can be separated into the following
two steps:
- Step 1: build a connection between this customer and an appropriate edge server (the green arrow in Figure 1);
- Step 2: build connections from the edge server to upper servers recursively until the desired contents are available (the red arrow in Figure 1).
Once the transmitting path is built by these two steps, the desired contents can be delivered to the customer through this path (the blue arrow in Figure 1). For Step 2, the path is deterministic given the start and the end, since any two nodes of a tree are connected by exactly one path. Therefore, decisions on bandwidth allocation are actually made at Step 1 for edge servers. The bandwidth allocation of the source server and central servers can be estimated from the bandwidth allocated to edge servers and the occurrence rate of Step 2. In the CDN, the egress bandwidth of a server is usually greater than its ingress bandwidth; therefore, the amount of bandwidth charged is counted by the egress bandwidth. According to these specific attributes, we extend the NBA problem to the CDN scenario, yielding the CDN Bandwidth Allocation (CDN-BA) problem.
4.1.2 Formulation
Consider the CDN-BA problem defined on $G=(V,E)$ during a billing cycle $P$, where $G$ is a tree; $V=\{1,2,\ldots,n\}$ is the set of servers; $E=\{(i,j)\mid i,j\in V,\ i\neq j\}$ is the set of edges abstracted from network links; and $P$ is a given period of time separated by a set of sampling time points $T=\{1,2,\ldots,p\}$. Components are given as follows.
Parameters:
- $V_s\subseteq V$ is the set consisting of the source server and the central servers; the source server is the root node of $G$, and the central servers are internal nodes of $G$.
- $V_e=V\setminus V_s$ is the set of edge servers; the edge servers are leaf nodes of $G$.
- $D^{(t)}=\{n+1,n+2,\ldots,n+m^{(t)}\}$ is the set of customers in the time slot $[t,t+1]$, $t\in T$.
- $E^{(t)}=\{(i,j)\mid i\in V_e,\ j\in D^{(t)}\}$ is the set of available connections between edge servers and customers in the time slot $[t,t+1]$, $t\in T$.
- $u_i>0$ is the unit price of bandwidth of the server $i\in V$.
- $c_i>0$ is the egress bandwidth capacity of the server $i\in V$.
- $w^{(t)}_i>0$ is the least bandwidth required to deliver the contents between an edge server and the customer $i\in D^{(t)}$ in the time slot $[t,t+1]$, $t\in T$.
- $r^{(t)}_i\in[0,1]$ is the probability that the desired content is not cached at the server $i\in V$ in the time slot $[t,t+1]$, $t\in T$.
Decision variables:
$f^{(t)}_{ij}=1$ if the edge $(i,j)\in E^{(t)}$ is contained in the bandwidth allocation plan to transmit data in the time slot $[t,t+1]$, $t\in T$; and $0$ otherwise.
Objective function:
The total bandwidth cost during $P$ should be minimized. The bandwidth cost of edge servers is measured exactly, while the bandwidth cost of the central servers and the source server in the time slot $[t,t+1]$, $t\in T$, is estimated through $r^{(t)}_i$, $i\in V$. Denote the egress bandwidth distribution of the edge server $i\in V_e$ over $T$ by
$$B_i=\Big\{\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j\cdot f^{(t)}_{ij}\Big\}_{t=1}^{p}.\qquad(26)$$
Let $Q_{95}(\cdot)$ be the operation that returns the 95th percentile of a set of numbers. The cost of all edge servers is represented by
$$\sum_{i\in V_e}u_i\cdot Q_{95}(B_i).\qquad(27)$$
Then, the bandwidth of the central servers and the source server can be estimated layer by layer. For example, let $k\in V_s$ be a server node whose children are edge server nodes. Then, the egress bandwidth of $k$ in the time slot $[t,t+1]$, $t\in T$, is estimated by
$$\sum_{(k,i)\in E}\Big(r^{(t)}_i\cdot\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j\cdot f^{(t)}_{ij}\Big).\qquad(28)$$
For simplicity, we denote the egress bandwidth of $k\in V_s$ in the time slot $[t,t+1]$, $t\in T$, by $b^{(t)}_k$, which is obtained by an operation $R$ given the egress bandwidth of all edge servers and $\{r^{(t)}_i\}_{i\in V}$:
$$b^{(t)}_k=R\Big(k,\Big\{\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j\cdot f^{(t)}_{ij}\Big\}_{i\in V_e},\{r^{(t)}_i\}_{i\in V}\Big).\qquad(29)$$
Then, the total cost of the central servers and the source server is denoted by
$$\sum_{k\in V_s}u_k\cdot Q_{95}(\{b^{(t)}_k\}_{t=1}^{p}).\qquad(30)$$
Therefore, the goal is
$$\min\ \sum_{i\in V_e}u_i\cdot Q_{95}(B_i)+\sum_{k\in V_s}u_k\cdot Q_{95}(\{b^{(t)}_k\}_{t=1}^{p}).\qquad(31)$$
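The operation $R$ is left abstract here; one natural instantiation consistent with (28) propagates cache-miss fractions up the tree: a server must send each child $i$ the fraction $r_i$ of the bandwidth that $i$ itself sends downstream. A sketch (hypothetical Python; the function and argument names are our own):

```python
def estimate_upper_bandwidth(children, r, edge_egress):
    """Estimate b_k for internal servers by recursing down the tree:
    server k sends child i the cache-miss fraction r[i] of the
    bandwidth i itself sends downstream (cf. (28)).
    children: server -> list of child servers (tree edges);
    edge_egress: edge server -> measured egress toward customers."""
    def egress(k):
        if not children.get(k):          # a leaf, i.e. an edge server
            return edge_egress[k]
        return sum(r[i] * egress(i) for i in children[k])
    return egress

# Source 0 -> central 1 -> edge servers 2 and 3; miss rates are exact
# binary fractions so the arithmetic below is exact.
est = estimate_upper_bandwidth({0: [1], 1: [2, 3]},
                               {1: 0.5, 2: 0.25, 3: 0.125},
                               {2: 100.0, 3: 200.0})
print(est(1), est(0))  # 50.0 25.0
```

The estimate shrinks geometrically toward the root, which matches the intuition that caching absorbs most repeated requests before they reach the source server.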
Constraints:
Data can be delivered into all customer nodes from edge servers in the time slot $[t,t+1]$, $t\in T$:
$$\sum_{i\in V_e:\,(i,j)\in E^{(t)}}f^{(t)}_{ij}=1,\quad\forall\, j\in D^{(t)}.\qquad(32)$$
The egress bandwidth of a server node does not exceed its egress bandwidth capacity:
$$\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j f^{(t)}_{ij}\leq c_i,\quad\forall\, i\in V_e;\qquad(33)$$
$$R\Big(k,\Big\{\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j\cdot f^{(t)}_{ij}\Big\}_{i\in V_e},\{r^{(t)}_i\}_{i\in V}\Big)\leq c_k,\quad\forall\, k\in V_s.\qquad(34)$$
Remark 1. For the CDN, there are some servers that have already cached the
desired content. Once the connection between a customer and an edge server is
built, a path to transmit desired contents is established recursively. Therefore, the
constraint (4) in the NBA problem is always satisfied and thus it can be omitted
in the CDN scenario.
Remark 2. As the structure of the CDN is a tree, by Proposition 2, the constraint
(8) in the NBA problem is always satisfied and thus it can also be omitted in the
CDN scenario.
Based on the discussions above, the CDN-BA problem is formulated as
$$\min\ \sum_{i\in V_e}u_i\cdot Q_{95}\Big(\Big\{\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j\cdot f^{(t)}_{ij}\Big\}_{t=1}^{p}\Big)+\sum_{k\in V_s}u_k\cdot Q_{95}\Big(\Big\{R\Big(k,\Big\{\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j\cdot f^{(t)}_{ij}\Big\}_{i\in V_e},\{r^{(t)}_i\}_{i\in V}\Big)\Big\}_{t=1}^{p}\Big)$$
subject to
$$\sum_{i\in V_e:\,(i,j)\in E^{(t)}}f^{(t)}_{ij}=1,\quad\forall\, j\in D^{(t)},\ \forall\, t\in T,\qquad(35)$$
$$\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j f^{(t)}_{ij}\leq c_i,\quad\forall\, i\in V_e,\ \forall\, t\in T,\qquad(36)$$
$$R\Big(k,\Big\{\sum_{j\in D^{(t)}:\,(i,j)\in E^{(t)}}w^{(t)}_j\cdot f^{(t)}_{ij}\Big\}_{i\in V_e},\{r^{(t)}_i\}_{i\in V}\Big)\leq c_k,\quad\forall\, k\in V_s,\ \forall\, t\in T,\qquad(37)$$
$$f^{(t)}_{ij}\in\{0,1\},\quad\forall\,(i,j)\in E^{(t)},\ \forall\, t\in T.\qquad(38)$$
4.2 Live video delivery network
4.2.1 Background
Live video means an online video that is recorded and transmitted over a network
in real-time. Delivery of live video is challenging because of the unavailability of
cache technology and the low latency requirement. With the growth of network bandwidth, many applications related to live video delivery, such as live news, live shows, and live courses, have gradually become reality. Recent surveys in [13, 14] show that Internet users watched an accumulated 1.1 billion hours of live video in 2019 and that the traffic of live video had reached 82% of overall Internet traffic by the end of 2020. Also, it is estimated in [15] that the video streaming market will hit 223.98 billion dollars by 2028. The diversity and volume of live video bring not only business opportunities, but also scientific challenges in saving bandwidth cost for cloud providers.
Figure 2: The logical architecture of the live video delivery network (LVDN). [Figure: video producers upload streams to the cloud provider's servers, which deliver them to video viewers.]
Figure 2 shows the logical architecture of the live video delivery network (LVDN). There are three major stakeholders in the LVDN: (1) video producers; (2) the cloud provider; (3) video viewers. Video producers create live videos in various roles, such as live news reporters, actors in live shows, and teachers in live courses. Cloud providers such as Amazon, Microsoft, Alibaba, and Huawei pay ISPs to deliver live video from video producers to viewers. For example, the red arrows in Figure 2 represent paths to transmit a live video from a video producer to viewers through the cloud provider's servers. In the process of live video delivery, video producers and viewers demand high video quality (e.g., video resolution, video frame rate, color depth), instant accessibility, and low buffering ratios, while the cloud provider targets providing QoS-guaranteed transmission service with minimized bandwidth cost.
To cast the live video delivery problem into the NBA problem, a video producer in the LVDN is a source node, and the related viewers are destination nodes. It means that the servers of cloud providers are intermediary nodes, and their ingress bandwidth is always no more than their egress bandwidth considering the data replication ability of servers. Therefore, the amount of bandwidth charged is counted by the egress bandwidth. Besides, a video producer is allowed to access only one server to upload video in this scenario. By grouping video producers and viewers appropriately, we extend the NBA problem to the LVDN scenario, yielding the LVDN Bandwidth Allocation (LVDN-BA) problem.
4.2.2 Formulation
Consider the LVDN-BA problem defined on $G=(V,E)$ during a billing cycle $P$, where $V=\{1,2,\ldots,n\}$ is the set of servers; $E=\{(i,j)\mid i,j\in V,\ i\neq j\}$ is the set of edges abstracted from inner-domain network links; and $P$ is a given period of time separated by a set of sampling time points $T=\{1,2,\ldots,p\}$. Components are given as follows.
Parameters:
- $S^{(t)}=\{n+1,n+2,\ldots,n+m^{(t)}\}$ is the set of video producers in the time slot $[t,t+1]$, $t\in T$.
- $D^{(t)}_s\subseteq\mathbb{Z}_+\setminus(V\cup S^{(t)})$ is the set of viewers of the video producer $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$, where $\mathbb{Z}_+$ is the set of positive integers.
- $V^{(t)}$ denotes $V\cup S^{(t)}\cup\big(\bigcup_{s\in S^{(t)}}D^{(t)}_s\big)$.
- $E^{(t)}=\{(i,j)\mid i,j\in V^{(t)},\ i\neq j\}$ is the set of available connections among servers, video producers, and video viewers in the time slot $[t,t+1]$, $t\in T$.
- $u_i>0$ is the unit price of bandwidth of the server $i\in V$.
- $c_i>0$ is the egress bandwidth capacity of the server $i\in V$.
- $w^{(t)}_s>0$ is the least bandwidth required to deliver the live video created by the video producer $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$.
Decision variables:
f(t)
sij = 1 if an edge (i, j)E(t)is contained in the bandwidth allocation plan
to deliver the live video created by the video producer sS(t)in the time slot
[t, t + 1], t T; and 0otherwise.
Objective function:
The total bandwidth cost during $P$ should be minimized. Denote the egress
bandwidth distribution of node $i\in V$ over $T$ by
$$B_i=\Big\{\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\Big\}_{t=1}^{p}.\tag{39}$$
Let $Q_{95}(\cdot)$ be the operation to get the 95th percentile of a set of numbers. The
goal is represented by
$$\min\ \sum_{i\in V}u_i\cdot Q_{95}(B_i).\tag{40}$$
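The 95th-percentile rule in the objective can be made concrete with a short sketch: sort the $p$ sampled egress-bandwidth values of a node, discard the top 5%, and bill the largest remaining sample. The exact rank convention varies across billing systems, so the ceiling-based index below is an assumption:

```python
import math

def q95(samples):
    """95th percentile of a list of bandwidth samples: the top 5% of
    samples are discarded and the largest remaining value is billed.
    The ceiling-based rank is an assumed convention; systems differ."""
    ordered = sorted(samples)
    k = math.ceil(0.95 * len(ordered))  # 1-indexed rank to keep
    return ordered[k - 1]

def node_cost(unit_price, bandwidth_series):
    """Cost of one node over a billing cycle: u_i * Q95(B_i)."""
    return unit_price * q95(bandwidth_series)
```

For example, with $p=20$ samples, only the single largest sample is discarded, so occasional one-slot bursts are free while sustained usage is billed in full.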
Constraints:
Data can be transmitted out from video producers in the time slot $[t,t+1]$, $t\in T$:
$$\sum_{j\in V}f_{ssj}^{(t)}=1,\quad\forall s\in S^{(t)}.\tag{41}$$
Note that a video producer may access only one server to upload video in this
scenario.
Data can be transmitted into all viewer nodes in the time slot $[t,t+1]$, $t\in T$:
$$\sum_{(i,j)\in E^{(t)},\,i\in V}f_{sij}^{(t)}=1,\quad\forall s\in S^{(t)},\ \forall j\in D_s^{(t)}.\tag{42}$$
The egress bandwidth of a server node does not exceed its egress bandwidth
capacity:
$$\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\le c_i,\quad\forall i\in V.\tag{43}$$
The ingress bandwidth of a server node on the path transmitting data that
originates from the same video producer is no more than its egress bandwidth:
$$\sum_{(i,j)\in E^{(t)},\,i\in V^{(t)}}f_{sij}^{(t)}\le\sum_{(j,k)\in E^{(t)},\,k\in V^{(t)}}f_{sjk}^{(t)},\quad\forall s\in S^{(t)},\ \forall j\in V.\tag{44}$$
Based on the discussions above, the LVDN-BA problem is formulated as
$$\min\ \sum_{i\in V}u_i\cdot Q_{95}\Big(\Big\{\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\Big\}_{t=1}^{p}\Big)$$
subject to
$$\sum_{j\in V}f_{ssj}^{(t)}=1,\quad\forall s\in S^{(t)},\ \forall t\in T,\tag{45}$$
$$\sum_{(i,j)\in E^{(t)},\,i\in V}f_{sij}^{(t)}=1,\quad\forall s\in S^{(t)},\ \forall j\in D_s^{(t)},\ \forall t\in T,\tag{46}$$
$$\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\le c_i,\quad\forall i\in V,\ \forall t\in T,\tag{47}$$
$$\sum_{(i,j)\in E^{(t)},\,i\in V^{(t)}}f_{sij}^{(t)}-\sum_{(j,k)\in E^{(t)},\,k\in V^{(t)}}f_{sjk}^{(t)}\le 0,\quad\forall s\in S^{(t)},\ \forall j\in V,\ \forall t\in T,\tag{48}$$
$$f_{sij}^{(t)}\in\{0,1\},\quad\forall s\in S^{(t)},\ \forall(i,j)\in E^{(t)},\ \forall t\in T.\tag{49}$$
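To make the constraint system concrete, the sketch below checks whether a candidate allocation for a single time slot satisfies the producer-upload, viewer-delivery, capacity, and replication conditions. The function name and the encoding of the variables $f$ as a set of $(s,i,j)$ triples are our own illustrative choices:

```python
def check_lvdn_slot(f, servers, producers, viewers, w, c):
    """Check one time slot of a candidate LVDN-BA allocation.

    f        : set of (s, i, j) triples with f_sij = 1
    servers  : set V of server nodes
    producers: set S of video producers in this slot
    viewers  : dict s -> set of viewer nodes D_s
    w        : dict s -> least bandwidth w_s of stream s
    c        : dict i -> egress capacity c_i of server i
    """
    # (45): each producer uploads its stream to exactly one server
    for s in producers:
        if sum(1 for (s2, i, j) in f if s2 == s and i == s and j in servers) != 1:
            return False
    # (46): every viewer of s receives the stream exactly once, from a server
    for s in producers:
        for v in viewers[s]:
            if sum(1 for (s2, i, j) in f if s2 == s and i in servers and j == v) != 1:
                return False
    # (47): the egress bandwidth of each server stays within its capacity
    for i in servers:
        if sum(w[s] for (s, i2, j) in f if i2 == i) > c[i]:
            return False
    # (48): per stream, a server's ingress edge count does not exceed its
    # egress edge count (servers replicate, so egress may be larger)
    for s in producers:
        for j in servers:
            ingress = sum(1 for (s2, i, j2) in f if s2 == s and j2 == j)
            egress = sum(1 for (s2, j2, k) in f if s2 == s and j2 == j)
            if ingress > egress:
                return False
    return True
```

For instance, with servers $\{1,2\}$, producer $3$, and viewer $4$, the plan $\{(3,3,1),(3,1,4)\}$ (producer $3$ uploads to server $1$, which forwards to viewer $4$) passes all four checks, while dropping the delivery edge violates (46).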
Remark 3. The process of live video delivery can be separated into two steps: (1)
build connections between customers (video producers and viewers) and servers;
(2) build a path with the given start and end among the servers. These two steps
are intrinsically related, but because solving them as a whole is too challenging,
we may consider solving them individually in a "divide-and-conquer" manner.
4.3 Real-time communication network
4.3.1 Background
Real-time communication (RTC) is a mode of interchanging information in which
participants exchange information instantly or with negligible latency. The trans-
mitted contents include audio, video, text, files, and so on. Developments in
RTC technologies have turned various application scenarios into reality, e.g.,
video chats, live conferences, smart factories, and cloud gaming, and many cloud
computing businesses have thus been created. For cloud providers supplying RTC
services, saving the bandwidth cost is also a major task, along with other goals
such as increasing the reliability and capability of network transmission.
Figure 3 shows the logical architecture of the real-time communication network
(RTCN). As shown in Figure 3, three customers are geographically distributed.
Figure 3: The logical architecture of the real-time communication network
(RTCN).
Assume that all of them are in the same RTC group (e.g., a live conference);
the cloud provider then needs to build peer-to-peer paths for the customers to
exchange information. In Figure 3, example paths are labeled by arrows with
different colors. Besides, an RTC participant may access only one server to
upload contents in this scenario. In analogy with the LVDN, RTC participants
play the roles of both video producers and viewers. Therefore, the LVDN-BA
problem can be applied to an RTC group by regarding every participant as a
video producer and the remaining participants as viewers. Following this idea,
we extend the NBA problem to the RTCN scenario, which we name the RTCN
Bandwidth Allocation (RTCN-BA) problem.
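The reduction described above, treating each participant of a group in turn as a producer whose viewers are the other members, can be sketched in a few lines (the function name is ours):

```python
def rtc_group_to_lvdn(group):
    """Map an RTC group to LVDN-style (producer -> viewers) pairs:
    each participant acts as a video producer, and its viewers are
    all the other members of the same group."""
    return {s: set(group) - {s} for s in group}
```

Applying this to every group $A_g^{(t)}$ yields one LVDN-style delivery task per participant and time slot.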
4.3.2 Formulation
Consider an RTCN-BA problem defined on $G=(V,E)$ during a billing cycle $P$,
where $V=\{1,2,\dots,n\}$ is the set of servers; $E=\{(i,j)\mid i,j\in V,\ i\neq j\}$
is the set of edges abstracted from network links; and $P$ is a given period of time
separated by a set of sampling time points $T=\{1,2,\dots,p\}$. Components are
given as follows.
Parameters:
$S^{(t)}=\{n+1,n+2,\dots,n+m^{(t)}\}$ is the set of all RTC participants in the time slot $[t,t+1]$, $t\in T$.
$A^{(t)}=\{A_1^{(t)},A_2^{(t)},\dots,A_{g^{(t)}}^{(t)}\}$, where $A_1^{(t)},A_2^{(t)},\dots,A_{g^{(t)}}^{(t)}\subset S^{(t)}$ are the RTC groups in the time slot $[t,t+1]$, $t\in T$.
$V^{(t)}$ represents $V\cup S^{(t)}$.
$E^{(t)}=\{(i,j)\mid i,j\in V^{(t)},\ i\neq j\}$ is the set of available connections among RTC participants and servers in the time slot $[t,t+1]$, $t\in T$.
$u_i>0$ is the unit price of bandwidth of the server $i\in V$.
$c_i>0$ is the egress bandwidth capacity of the server $i\in V$.
$w_s^{(t)}>0$ is the least bandwidth required to deliver the contents created by the participant $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$.
Decision variables:
$f_{sij}^{(t)}=1$ if the edge $(i,j)\in E^{(t)}$ is contained in the bandwidth allocation plan to deliver the contents created by the participant $s\in S^{(t)}$ in the time slot $[t,t+1]$, $t\in T$; and $0$ otherwise.
Objective function:
The total bandwidth cost during the billing cycle $P$ should be minimized. Denote
the egress bandwidth distribution of node $i\in V$ over $T$ by
$$B_i=\Big\{\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\Big\}_{t=1}^{p}.\tag{50}$$
Let $Q_{95}(\cdot)$ be the operation to get the 95th percentile of a set of numbers. The
goal is represented by
$$\min\ \sum_{i\in V}u_i\cdot Q_{95}(B_i).\tag{51}$$
Constraints:
Data can be transmitted out from all RTC participants in the time slot $[t,t+1]$, $t\in T$:
$$\sum_{j\in V}f_{ssj}^{(t)}=1,\quad\forall s\in S^{(t)}.\tag{52}$$
Note that an RTC participant may access only one server to upload contents.
Data can be transmitted to the other participants in an RTC group:
$$\sum_{(i,j)\in E^{(t)},\,i\in V}f_{sij}^{(t)}=1,\quad\forall A_g\in A^{(t)},\ \forall s\in A_g,\ \forall j\in A_g\setminus\{s\}.\tag{53}$$
The egress bandwidth of a node does not exceed its egress bandwidth capacity:
$$\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\le c_i,\quad\forall i\in V.\tag{54}$$
The ingress bandwidth of a server node on the path transmitting data that
originates from the same RTC participant is no more than its egress bandwidth:
$$\sum_{(i,j)\in E^{(t)},\,i\in V^{(t)}}f_{sij}^{(t)}\le\sum_{(j,k)\in E^{(t)},\,k\in V^{(t)}}f_{sjk}^{(t)},\quad\forall s\in S^{(t)},\ \forall j\in V.\tag{55}$$
Based on the discussions above, the RTCN-BA problem is formulated as
$$\min\ \sum_{i\in V}u_i\cdot Q_{95}\Big(\Big\{\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\Big\}_{t=1}^{p}\Big)$$
subject to
$$\sum_{j\in V}f_{ssj}^{(t)}=1,\quad\forall s\in S^{(t)},\ \forall t\in T,\tag{56}$$
$$\sum_{(i,j)\in E^{(t)},\,i\in V}f_{sij}^{(t)}=1,\quad\forall A_g\in A^{(t)},\ \forall s\in A_g,\ \forall j\in A_g\setminus\{s\},\ \forall t\in T,\tag{57}$$
$$\sum_{s\in S^{(t)}}\Big(w_s^{(t)}\cdot\sum_{(i,j)\in E^{(t)},\,j\in V^{(t)}}f_{sij}^{(t)}\Big)\le c_i,\quad\forall i\in V,\ \forall t\in T,\tag{58}$$
$$\sum_{(i,j)\in E^{(t)},\,i\in V^{(t)}}f_{sij}^{(t)}-\sum_{(j,k)\in E^{(t)},\,k\in V^{(t)}}f_{sjk}^{(t)}\le 0,\quad\forall s\in S^{(t)},\ \forall j\in V,\ \forall t\in T,\tag{59}$$
$$f_{sij}^{(t)}\in\{0,1\},\quad\forall s\in S^{(t)},\ \forall(i,j)\in E^{(t)},\ \forall t\in T.\tag{60}$$
4.4 Cloud wide area network
4.4.1 Background
A wide area network (WAN) is a telecommunication network that extends over
a large geographic area. The cloud wide area network (Cloud-WAN) enables
cloud providers to supply services to clients that are geographically distributed.
Rapidly increasing demands have inspired cloud providers to focus on the traffic
exchanged between the Cloud-WAN and other networks on the Internet; see,
e.g., [7]. In cloud computing, clients access services from the cloud through
Points of Presence (PoPs). In general, a client's request is served by the PoP
that is geographically closest to the client, because a shorter distance usually
means lower latency and higher quality. However, to balance the bandwidth
utilization and thus save cost, cloud providers may also redirect some requests
to a farther PoP that can still provide service with acceptable QoS.
Figure 4: The logical architecture of the cloud wide area network (Cloud-WAN).
Figure 4 shows the logical architecture of the Cloud-WAN. As shown in Figure 4,
PoPs access the cloud through inner-domain networks, while clients access PoPs
through inter-domain networks. As shown by the red arrow and the blue arrow,
the PoP that responds to a client's request may differ from the PoP that originally
received it; this is controlled by the traffic scheduling system of the cloud provider
to achieve central-level goals such as balancing traffic and saving bandwidth cost.
Besides, the traffic demand of a client can be served by more than one PoP.
Since the traffic transmitted into PoPs is not controlled by the cloud provider,
only the egress bandwidth of PoPs is considered in a traffic scheduling system. In
terms of the NBA problem, the PoPs in a Cloud-WAN are source nodes, while the
clients are destination nodes, so the structure of the data transmission network is
a bipartite graph. By the background of the Cloud-WAN, a destination node may
receive data from more than one source node. According to these specific
attributes, we extend the NBA problem to this scenario, which we name the
Cloud-WAN Bandwidth Allocation (Cloud-WAN-BA) problem.
4.4.2 Formulation
Consider a Cloud-WAN-BA problem defined on $G=(V,E)$ during a billing
cycle $P$, where $G$ is a bipartite graph; $V_d=\{1,2,\dots,m\}$ is the set of
PoPs; $V_r=\{m+1,m+2,\dots,n\}$ is the set of clients, and $V=V_d\cup V_r$;
$E=\{(i,j)\mid i\in V_d,\ j\in V_r\}$ is the set of edges abstracted from network
links; and $P$ is a given period of time separated by a set of sampling time points
$T=\{1,2,\dots,p\}$. Components are given as follows.
Parameters:
$u_i>0$ is the unit price of bandwidth of the PoP $i\in V_d$.
$d_j^{(t)}>0$ is the total traffic demand of the client $j\in V_r$ in the time slot $[t,t+1]$, $t\in T$.
$c_i>0$ is the egress bandwidth capacity of the PoP $i\in V_d$.
$E^{(t)}=\{(i,j)\mid i\in V_d,\ j\in V_r\}$ is the set of available connections between PoPs and clients in the time slot $[t,t+1]$, $t\in T$.
Decision variables:
$f_{ij}^{(t)}\in\mathbb{Z}^+$ represents the amount of traffic assigned to the edge $(i,j)\in E^{(t)}$ in the time slot $[t,t+1]$, $t\in T$, where $\mathbb{Z}^+$ is the set of positive integers.
Objective function:
The total bandwidth cost during the billing cycle $P$ should be minimized. As
only egress bandwidth is considered, we denote the egress bandwidth distribution
of node $i\in V_d$ over $T$ by
$$B_i=\Big\{\sum_{(i,j)\in E^{(t)},\,j\in V_r}f_{ij}^{(t)}\Big\}_{t=1}^{p}.\tag{61}$$
Let $Q_{95}(\cdot)$ be the operation to get the 95th percentile of a set of numbers. The
goal is represented by
$$\min\ \sum_{i\in V_d}u_i\cdot Q_{95}(B_i).\tag{62}$$
Constraints:
The traffic demand of a client has to be fulfilled in the time slot $[t,t+1]$, $t\in T$:
$$\sum_{(i,j)\in E^{(t)},\,i\in V_d}f_{ij}^{(t)}=d_j^{(t)},\quad\forall j\in V_r.\tag{63}$$
The egress bandwidth of the PoP $i\in V_d$ does not exceed its capacity $c_i$:
$$\sum_{(i,j)\in E^{(t)},\,j\in V_r}f_{ij}^{(t)}\le c_i,\quad\forall i\in V_d.\tag{64}$$
Based on the discussions above, the Cloud-WAN-BA problem is formulated as
$$\min\ \sum_{i\in V_d}u_i\cdot Q_{95}\Big(\Big\{\sum_{(i,j)\in E^{(t)},\,j\in V_r}f_{ij}^{(t)}\Big\}_{t=1}^{p}\Big)$$
subject to
$$\sum_{(i,j)\in E^{(t)},\,i\in V_d}f_{ij}^{(t)}=d_j^{(t)},\quad\forall j\in V_r,\ \forall t\in T,\tag{65}$$
$$\sum_{(i,j)\in E^{(t)},\,j\in V_r}f_{ij}^{(t)}\le c_i,\quad\forall i\in V_d,\ \forall t\in T,\tag{66}$$
$$f_{ij}^{(t)}\in\mathbb{Z}^+,\quad\forall(i,j)\in E^{(t)},\ \forall t\in T.\tag{67}$$
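As a simple illustration of the per-slot structure of (65) and (66), the following greedy baseline assigns each client's demand to the cheapest available PoP with remaining capacity. All names are our own choices; this is not the paper's method, and it ignores the cross-slot coupling induced by the 95th-percentile objective:

```python
def greedy_assign(demands, capacity, price, edges):
    """Greedy per-slot baseline for Cloud-WAN-BA (illustration only).

    demands : dict client j -> demand d_j in this slot
    capacity: dict PoP i -> egress capacity c_i
    price   : dict PoP i -> unit price u_i
    edges   : set of available (i, j) pairs in this slot
    Returns a dict (i, j) -> assigned integer traffic.
    """
    residual = dict(capacity)
    flow = {}
    for j, d in demands.items():
        # try the PoPs that can serve client j, cheapest first
        for i in sorted((i for (i, j2) in edges if j2 == j),
                        key=lambda i: price[i]):
            if d == 0:
                break
            x = min(d, residual[i])  # split demand when capacity runs out
            if x > 0:
                flow[(i, j)] = flow.get((i, j), 0) + x
                residual[i] -= x
                d -= x
        if d > 0:
            raise ValueError("demand of client %r cannot be fulfilled" % j)
    return flow
```

For example, a client with demand 10 facing PoPs with capacities 6 and 10 at prices 1 and 2 receives a split assignment of 6 and 4, illustrating that a destination node may be served by more than one source node.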
Considering the structure of (65) and (66), we quote the following proposition.
Proposition 3 (totally unimodular [16]). A matrix $A$ is said to be totally uni-
modular if the determinant of every square submatrix formed from it has value $0$,
$+1$, or $-1$. In the system of equations $Ax=b$, assume that $A$ is totally unimod-
ular and that all elements of $A$ and $b$ are integers. Then all basic solutions have
integer components.
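Total unimodularity can be checked directly on small matrices by enumerating every square submatrix, as the brute-force sketch below does (exponential time, so for illustration only):

```python
from itertools import combinations

def det(M):
    """Integer determinant by Laplace expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * M[0][c] * det(minor)
    return total

def is_totally_unimodular(A):
    """True if every square submatrix of A has determinant 0, +1, or -1."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[r][c] for c in cols] for r in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True
```

The per-slot constraint matrix formed by (65) and (66) has the shape of a bipartite node–edge incidence matrix, which passes this test; this is essentially why the basic solutions of the LP relaxation are integral.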
5 Discussions
We propose the Network Bandwidth Allocation (NBA) problem for cloud com-
puting, and formulate it as an integer programming model. The following three
distinctive characteristics of transmitting data through networks are considered:
(1) the cost functions generated by pricing schemes are nonlinear, nonconvex,
and even noncontinuous; (2) the network topology is updated periodically; (3)
data can be replicated within network nodes. These attributes, along with the
high dimensionality of variables, make it very challenging to mathematically analyze the
NBA problem. The proposed NBA problem is a fundamental, high-level repre-
sentation for modeling the transmission of data through networks while minimizing
the bandwidth cost, and it can be extended flexibly to suit various cloud com-
puting scenarios. Moreover, the NBA problem can be easily modified to suit
other pricing schemes by appropriately adjusting its objective function, which is
based on 95th-percentile billing. We show four real cloud computing applications of
the NBA problem: the content delivery network (CDN), the live video delivery
network (LVDN), the real-time communication network (RTCN), and the cloud
wide area network (Cloud-WAN). It is expected that our first effort of nailing
down the mathematical models for these cloud computing problems can boost
more rigorous and insightful studies from various perspectives.
It is interesting yet very challenging to design efficient algorithms for the integer
programming formulation of the NBA problem. Standard solvers such as SCIP,
CPLEX, and Gurobi turn out not to work well because of the underlying hier-
archical structures, the combinatorial properties, and the high dimensionality of
variables of the corresponding integer programming formulation. We are explor-
ing the integration of conventional optimization techniques and algorithms with
problem-tailored heuristics to design applicable and efficient algorithms. It is
also noted that some applications usually require solving a large set of instances
that are generated by homogeneous datasets yet with different problem param-
eters. Hence, it is promising to apply machine-learning-based methods to ex-
tract valuable information from data. For example, bandwidth consumption
empirically follows a regular distribution over billing cycles because customer
usage habits are strongly correlated with time within a billing cycle. Besides,
since the NBA problem is defined on networks, the graph neural network (GNN)
is a natural deep learning architecture for alleviating the difficulties in solving
the NBA problem, via its capabilities of learning, reasoning, and generalizing
over graph-structured data.
References
[1] Y. Jadeja and K. Modi, “Cloud computing-concepts, architecture and chal-
lenges,” in 2012 International Conference on Computing, Electronics and
Electrical Technologies (ICCEET), pp. 877–880, IEEE, 2012.
[2] P. Mell, T. Grance, et al., "The NIST definition of cloud computing," NIST
Special Publication 800-145, 2011.
[3] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski,
G. Lee, D. Patterson, A. Rabkin, I. Stoica, et al., “A view of cloud comput-
ing,” Communications of the ACM, vol. 53, no. 4, pp. 50–58, 2010.
[4] X. Liu, F. Dobrian, H. Milner, J. Jiang, V. Sekar, I. Stoica, and H. Zhang,
"A case for a coordinated internet video control plane," SIGCOMM Comput.
Commun. Rev., vol. 42, pp. 359–370, Aug. 2012.
[5] M. K. Mukerjee, D. Naylor, J. Jiang, D. Han, S. Seshan, and H. Zhang,
"Practical, real-time centralized control for CDN-based live video delivery,"
SIGCOMM Comput. Commun. Rev., vol. 45, pp. 311–324, Aug. 2015.
[6] Y. Zhan, M. Ghamkhari, H. Akhavan-Hejazi, D. Xu, and H. Mohsenian-Rad,
“Optimal response to burstable billing under demand uncertainty,” 2016.
[7] R. Singh, S. Agarwal, M. Calder, and V. Bahl, “Cost-effective cloud edge
traffic engineering with cascara,” in USENIX NSDI, April 2021.
[8] C.-Y. Hong, S. Kandula, R. Mahajan, M. Zhang, V. Gill, M. Nanduri, and
R. Wattenhofer, “Achieving high utilization with software-driven wan,” in
Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM, SIG-
COMM ’13, (New York, NY, USA), p. 15–26, Association for Computing
Machinery, 2013.
[9] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata,
J. Wanderer, J. Zhou, M. Zhu, J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat,
“B4: Experience with a globally deployed software defined wan,” in Proceed-
ings of the ACM SIGCOMM Conference, (Hong Kong, China), 2013.
[10] M. R. Garey and D. S. Johnson, Computers and Intractability; A Guide to
the Theory of NP-Completeness. USA: W. H. Freeman & Co., 1990.
[11] V. Jalaparti, I. Bliznets, S. Kandula, B. Lucier, and I. Menache, “Dynamic
pricing and traffic engineering for timely inter-datacenter transfers,” in Pro-
ceedings of the 2016 ACM SIGCOMM Conference, SIGCOMM ’16, (New
York, NY, USA), p. 73–86, Association for Computing Machinery, 2016.
[12] J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, and B. Weihl,
“Globally distributed content delivery,” IEEE Internet Computing, vol. 6,
no. 5, pp. 50–58, 2002.
[13] Go-Globe, "The state of live streaming – statistics and trends."
https://www.go-globe.com/live-streaming-statistics/.
[14] A. Yosilewitz, "State of the stream Q4/December 2019."
https://cdn.streamelements.com/static/Q4_December.pdf.
[15] A. Yosilewitz, "Video streaming market worth $223.98 billion by
2028 | CAGR: 21.0%," February 2021.
https://www.go-globe.com/live-streaming-statistics/.
[16] A. Schrijver, Theory of linear and integer programming. John Wiley & Sons,
1998.