arXiv:1702.03065v4 [cs.NI] 3 Mar 2017
Dynamic Switch-Controller Association and
Control Devolution for SDN Systems
Xi Huang1, Simeng Bian1, Ziyu Shao1, Hong Xu2
1School of Information Science and Technology, ShanghaiTech University
2NetX Lab @ City University of Hong Kong
Email: {huangxi,biansm,shaozy}@shanghaitech.edu.cn, henry.xu@cityu.edu.hk
Abstract—In software-defined networking (SDN), as data plane
scale expands, scalability and reliability of the control plane
have become major concerns. To mitigate such concerns, two
kinds of solutions have been proposed separately. One is multi-
controller architecture, i.e., a logically centralized control plane
with physically distributed controllers. The other is control
devolution, i.e., delegating control of some flows back to switches.
Most existing solutions adopt either static switch-controller
association or static devolution, which may not adapt well to the
traffic variation, leading to high communication costs between
switches and controllers, and high computation costs at switches.
In this paper, we propose a novel scheme to jointly consider both
solutions, i.e., we dynamically associate switches with controllers
and dynamically devolve control of flows to switches. Our scheme
is an efficient online algorithm that does not need the statistics
of traffic flows. By adjusting a parameter V, we can make a
trade-off between costs and queue backlogs. Theoretical analysis
and extensive simulations show that our scheme yields much
lower costs and latency compared to static schemes, and balanced
loads among controllers.
I. INTRODUCTION
Software-defined networking (SDN) holds great promise for improving network performance and management. The key idea of SDN is to decouple the control plane from the data plane [1]. The data plane can focus on performing basic functionalities such as packet forwarding at high speed, while the logically centralized control plane manages the network. Usually, switches at the data plane send requests to the control plane for processing flow events, e.g., flow-install events.
The control plane is a potential bottleneck of SDN in terms of scalability and reliability. As the data plane expands, the control plane may not be able to process the increasing number of requests if implemented with a single controller, resulting in unacceptable flow setup latency. Reliability is also an issue, since a single controller is a single point of failure whose breakdown can be disastrous for the control plane and the network.
Existing proposals to address such problems fall broadly into two categories. One is to implement the control plane as a distributed system with multiple controllers [2] [3]. Each switch then associates with certain controllers for fault-tolerance and load balancing [4] [5] [6] [7]. The other is to devolve part of the request-processing load from controllers to switches [8] [9] [10], reducing the workload of controllers.
For switch-controller association, the first category of solutions, the initial design choice was static switch-controller association [2] [3]. However, such static association may overload controllers and increase flow setup latency, due to its inflexibility in handling traffic variations. An elastic distributed controller architecture is proposed in [5], with an efficient protocol to migrate switches across controllers. However, how to determine the switch-controller association remains open. Then [6] took a step further by formulating the switch-controller association problem as a deterministic optimization problem, i.e., an integer linear program with prohibitively high computational complexity. A local search algorithm was proposed to find suboptimal associations within a given time limit (e.g., 30 seconds). In [7], each controller is modeled as an M/M/1 queue (Poisson arrivals and exponential service). Under this assumption, the switch-controller association problem with a steady-state objective function was formulated as a many-to-one stable matching problem with transfers. A novel two-phase algorithm was then proposed to connect stable matching with utility-based game-theoretic solutions, i.e., a coalition formation game with Nash-stable solutions.
For control devolution, the second category of solutions, an obvious design choice is static devolution for certain functions and flows [8] [9] [10]; i.e., switches locally process requests that do not require network-wide state, such as the link-layer discovery service. Static devolution mitigates the load on the control plane to some extent, but it can be inflexible in the face of traffic variations. As an alternative, dynamic devolution allows switches to decide whether to process requests locally or upload them to the control plane, depending on their current loads. However, the design of dynamic devolution with respect to traffic variations remains open.
These observations raise several interesting questions, whose answers will shape our design for SDN networks:
• Instead of deterministic switch-controller association with infrequent re-association [6] [7], can we directly perform dynamic switch-controller association with respect to traffic variations? What benefit can we obtain from fine-grained control at the request level?
• How should dynamic devolution be performed?
• How can we make a trade-off between dynamic switch-controller association and dynamic control devolution?
In this paper, we consider a general SDN network with traffic variations, which result in time-varying requests for handling flow events. We assume each request can either be processed at a switch (incurring computation costs) or be uploaded to certain controllers (incurring communication costs)¹. We aim at reducing the computational cost through control devolution at the data plane, the communication cost through switch-controller association between the data plane and the control plane, and the response time experienced by switches, which is mainly caused by queueing delay at controllers. Under such settings, we provide a new perspective and a novel scheme to answer those questions. Our key results and contributions are summarized as follows:
¹The scenario where some requests can only be processed by controllers is a straightforward extension of our model.
• Problem Formulation: To the best of our knowledge, this paper is the first to study the joint optimization problem of dynamic switch-controller association and dynamic control devolution.
• Finer-Granularity Control: To the best of our knowledge, this paper is the first to make control decisions at request-level granularity. Note that request-level information, such as time-varying queue backlog sizes and numbers of request arrivals, reflects the actual time-varying state of the data plane. Hence it enables more accurate decision making for dynamic association and dynamic devolution than coarse-grained control.
• Online Algorithm Design: We formulate a stochastic network optimization problem, aiming to minimize the long-term average sum of communication cost and computational cost, while keeping the time-average queue backlogs of both switches and controllers small². By employing the Lyapunov drift technique [11] and exploiting the sub-problem structure, we develop an efficient greedy algorithm that achieves optimality asymptotically. Our algorithm is online, which means it needs neither the statistics of traffic workloads nor prior assumptions on the traffic distribution.
• Algorithm Analysis: We show that our algorithm yields a tunable trade-off between an O(1/V) deviation from the minimum long-term average sum of communication cost and computational cost and an O(V) bound on average queue backlogs. We also find that the positive parameter V determines the switches' willingness to upload requests to controllers, i.e., to perform switch-controller association.
• Simulation: We conduct large-scale trace-driven simulations to evaluate the performance of our algorithm within two widely adopted data center network topologies, i.e., the Canonical 3-Tiered topology and the Fat-tree topology. Simulation results verify the effectiveness and the trade-off of our algorithm. In addition, in the extreme case without control devolution, we compare our dynamic association scheme with other association schemes, including Static, Random, and JSQ (Join-the-Shortest-Queue). Simulation results verify the advantages of our scheme.
We organize the rest of the paper as follows. We present the basic idea and formulation in Section II. We then show our algorithm design and the corresponding performance analysis in Section III. In Section IV, we present simulation results. We conclude the paper in Section V.
II. PROBLEM FORMULATION
In this section, we first provide a motivating example for
the dynamic switch-controller association and dynamic control
devolution. Then we introduce the system model and formulate
the problem.
²By applying Little's law, a small queue backlog implies a small queueing delay, i.e., a short response time.
[Figure 1: five panels (a)-(e) showing switches s1, s2, s3, controllers c1, c2, and a global scheduler spanning the data plane and control plane, with "associated with" and "accessible to" links.]
Fig. 1. An example that shows the request-level scheduling process. There are 3 switches (s1, s2, s3), 2 controllers (c1, c2), and 1 global scheduler. Each switch or controller maintains a queue that buffers requests. During each time slot, each controller can serve 2 requests while each switch can serve only 1 request. There is a communication cost per request if switches upload requests to controllers, and a computational cost (2 per request on each switch) for local processing by the switches themselves. At the beginning of time slot t, s1, s2, and s3 generate 3, 2, and 2 requests, respectively. The scheduler then collects system dynamics and decides a switch-controller association (which could be (b) or (c)), aiming to minimize the sum of communication cost (which could be the number of hops, RTTs, etc.) and computational cost, as well as request queueing delay. Each switch chooses to either locally process its requests or send them to controllers according to the scheduling decision.
A. Motivating Example
The example of dynamic association and devolution is
shown in Fig. 1.
First, we focus on the behavior of s3. In Fig. 1(b), s3 chooses to process its requests locally, which incurs a computational cost of 2 per request. In Fig. 1(c), s3 decides to upload requests to c2, which incurs a communication cost of 3 per request. Although the computational cost is less than the communication cost, the decision of local processing leaves one request unprocessed at the end of the time slot. Hence, it is not necessarily a smart decision for a switch to perform control devolution whenever its computational cost is lower than its communication cost. Instead, the scheduler should decide control devolution and switch-controller association jointly.
Second, we focus on the behavior of associations. Fig. 1(b) and (c) show two different associations. Fig. 1(b) shows the switch-controller association with (s1, c1) and (s2, c1), while s3 processes requests locally; we denote it by X1. X1 incurs a total communication and computation cost of 9 but results in uneven queue backlogs, leaving four requests unfinished after time slot t. Fig. 1(c) shows another association with (s1, c1) and (s3, c2), while s2 processes requests locally; we denote it by X2. X2 incurs a total cost of 13 but does better in balancing queue backlogs. Thus there is a non-trivial trade-off between minimizing the total cost of communication and computation and maintaining small queue backlogs at each controller.
B. Problem Formulation
We consider a time slotted network system, indexed by
{0,1,2, . . . }. Its control plane comprises a set Cof physically
distributed controllers, while its data plane consists of a set of
switches S. Each switch i∈ S keeps a queue backlog of size
Qs
i(t)for locally processing requests, while each controller
j∈ C maintains a queue backlog Qc
j(t)that buffers requests
from data plane. We denote [Qc
1(t), . . . , Qc
|C|(t)] as Qc(t)
and [Qs
1(t), . . . , Qs
|S| (t)] as Qs(t). We use Q(t)to denote
[Qs(t),Qc(t)].
At the beginning of time slot t, each switch i∈ S generates
some amounts 0Ai(t)amax of requests. Then each
switch could choose to process its requests either locally or
by sending to its associated controller. We assume that each
switch i∈ S has a service rate 0Ui(t)umax to
process the devoluted requests, while each controller j∈ C
has an available service rate 0Bj(t)bmax. We denote
[A1(t), . . . , A|S |(t)] as A(t),[B1(t), . . . , B|C |(t)] as B(t),
and [U1(t), . . . , U|S| (t)] as U(t). For i∈ S and j∈ C, we
assume that all Ai(t),Bj(t), and Ui(t)are i.i.d.; besides, their
first and second raw moments are all finite.
The scheduler then collects the system dynamics information (A(t), B(t), Q(t)) during the current time slot and makes a scheduling decision, denoted by an association matrix X(t) ∈ {0,1}^{|S|×|C|}. Here X(t)_{i,j} = 1 if switch i is associated with controller j during the current time slot, and 0 otherwise. An association is feasible if it guarantees that each switch is associated with at most one controller during each time slot. We denote the set of feasible associations as A:

A = { X ∈ {0,1}^{|S|×|C|} : Σ_{j∈C} X_{i,j} ≤ 1 for all i ∈ S }    (1)
According to the scheduling decision, each switch i sends its requests to controller j if X_{i,j} = 1. However, if Σ_{j∈C} X_{i,j} = 0, switch i appends its requests to its local queue backlog. Then both switches and controllers serve as many requests in their queues as they can³. As a result, the update equation for Q^s_i(t) at switch i is

Q^s_i(t+1) = [ Q^s_i(t) + (1 − Σ_{j∈C} X_{i,j}(t)) · A_i(t) − U_i(t) ]⁺    (2)

and the update equation for Q^c_j(t) at controller j is given by

Q^c_j(t+1) = [ Q^c_j(t) + Σ_{i∈S} X_{i,j}(t) · A_i(t) − B_j(t) ]⁺    (3)

where [x]⁺ = max(x, 0).
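The per-slot queue updates in (2) and (3) can be sketched in Python as follows; this is a minimal illustration, and all names (update_queues, etc.) are ours rather than the paper's:

```python
# Sketch of the queue updates in Eqs. (2)-(3). Qs[i] / Qc[j] are the
# switch and controller backlogs, X[i][j] = 1 if switch i uploads to
# controller j, A[i] are arrivals, U[i] / B[j] are service rates.

def update_queues(Qs, Qc, X, A, U, B):
    new_Qs = []
    for i, q in enumerate(Qs):
        uploaded = any(X[i])               # sum_j X[i][j] is 0 or 1
        local_arrivals = 0 if uploaded else A[i]
        new_Qs.append(max(q + local_arrivals - U[i], 0))   # [.]^+

    new_Qc = []
    for j, q in enumerate(Qc):
        arrivals = sum(A[i] for i in range(len(Qs)) if X[i][j])
        new_Qc.append(max(q + arrivals - B[j], 0))         # [.]^+
    return new_Qs, new_Qc
```

Here the `[.]⁺` projection is implemented with `max(..., 0)`, matching the definition below Eq. (3).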
Having covered the necessary notation and queueing dynamics, we turn to the objective and constraints of our problem.
1) Time-Average Communication Cost: We define the communication cost between switch i and controller j as W_{i,j}⁴. Accordingly, we have a communication cost matrix W = {W_{i,j}}. Fixing some association X ∈ A, the communication cost within one time slot is

f_X(t) = f̂(X, A(t)) ≜ Σ_{j∈C} Σ_{i∈S} W_{i,j} · X_{i,j} · A_i(t)    (4)

where we can view W_{i,j} as the price of transmitting one request from switch i to controller j. Then, given a series of associations {X_0, X_1, ..., X_{t−1}}, the time-average communication cost is

f̄(t) ≜ (1/t) Σ_{τ=0}^{t−1} E{f_{X_τ}(τ)}    (5)

³Note that we do not fix the serving principle, which can be FIFO, LIFO, etc. In our simulation, we use FIFO.
⁴The communication cost can be the number of hops or the round trip time (RTT).
2) Time-Average Computational Cost: There is a computational cost α_i for each devolved request when switch i appends its requests to its local queue backlog for processing. Given some association X ∈ A, we define the one-time-slot computational cost as

g_X(t) = ĝ(X, A(t)) ≜ Σ_{i∈S} α_i · (1 − Σ_{j∈C} X_{i,j}) · A_i(t)    (6)

Given a series of associations {X_0, X_1, ..., X_{t−1}}, the time-average computational cost is

ḡ(t) ≜ (1/t) Σ_{τ=0}^{t−1} E{g_{X_τ}(τ)}    (7)
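The one-slot costs in (4) and (6) are simple weighted sums over the association matrix; a minimal sketch (function and variable names are ours):

```python
# One-slot communication cost f_X(t) and computational cost g_X(t)
# from Eqs. (4) and (6). X[i][j]: association; A[i]: arrivals;
# W[i][j]: per-request communication cost; alpha[i]: per-request
# computation cost of switch i.

def slot_costs(X, A, W, alpha):
    f = sum(W[i][j] * X[i][j] * A[i]
            for i in range(len(A)) for j in range(len(W[0])))
    g = sum(alpha[i] * (1 - sum(X[i])) * A[i] for i in range(len(A)))
    return f, g
```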
3) Queueing Stability: A queueing process {Q(t)} is stable if the following condition holds:

lim_{t→∞} (1/t) Σ_{τ=0}^{t−1} E{Q(τ)} < ∞    (8)

Accordingly, Q^c(t) is stable if

lim_{t→∞} (1/t) Σ_{τ=0}^{t−1} Σ_{j∈C} E{Q^c_j(τ)} < ∞    (9)

and Q^s(t) is stable if

lim_{t→∞} (1/t) Σ_{τ=0}^{t−1} Σ_{i∈S} E{Q^s_i(τ)} < ∞    (10)

Queueing stability implies that both switches and controllers process buffered requests in a timely manner, so that queueing delay is kept within a small range.
Consequently, our problem formulation is given as follows:

Minimize_{X(t)∈A for t∈{0,1,2,...}}  lim sup_{t→∞} ( f̄(t) + ḡ(t) )
subject to (2), (3), (9), (10).    (11)
III. ALGORITHM DESIGN AND PERFORMANCE ANALYSIS
To design a scheduling algorithm that solves problem (11), we adopt the Lyapunov optimization technique in [11]. Define the quadratic Lyapunov function as

L(Q(t)) = (1/2) [ Σ_{j∈C} (Q^c_j(t))² + Σ_{i∈S} (Q^s_i(t))² ]    (12)
Next, we define the conditional Lyapunov drift over two consecutive time slots as

Δ(Q(t)) ≜ E{ L(Q(t+1)) − L(Q(t)) | Q(t) }    (13)

This conditional difference measures the expected change in the queues' congestion state. We want to push this difference as low as possible, so as to prevent the queues Q^s(t) and Q^c(t) from being overloaded. However, to maintain low queue backlogs, the action we take might incur considerable communication cost f(t) or computational cost g(t), or both. Hence, we should jointly consider queueing stability and f(t) + g(t). Given any feasible association X, we define the one-time-slot conditional drift-plus-penalty function as

Δ_V(Q(t)) ≜ Δ(Q(t)) + V · E{ f_X(t) + g_X(t) | Q(t) }    (14)

where f_X(t) is defined by (4), g_X(t) is defined by (6), and V > 0 is a constant that weights the penalty brought by f_X(t) and g_X(t). By minimizing an upper bound of the drift-plus-penalty expression (14), the time-average communication and computational costs can be minimized while stabilizing the network of request queues. We employ the concept of opportunistically minimizing an expectation in [11], and transform the long-term stochastic optimization problem (11) into the following drift-plus-penalty minimization problem at every time slot t.
Minimize_{X∈A}  V · ( f̂(X, A(t)) + ĝ(X, A(t)) )
  + Σ_{j∈C} Q^c_j(t) · [ Σ_{i∈S} X_{i,j} · A_i(t) ]
  + Σ_{i∈S} Q^s_i(t) · (1 − Σ_{j∈C} X_{i,j}) · A_i(t)    (15)

After rearranging the terms in (15), our optimization problem becomes

Minimize_{X∈A}  Σ_{i∈S} [ V α_i + Q^s_i(t) ] A_i(t)
  + Σ_{i∈S} Σ_{j∈C} [ V W_{i,j} + Q^c_j(t) − V α_i − Q^s_i(t) ] X_{i,j} A_i(t)    (16)
Since the first term Σ_{i∈S} [V α_i + Q^s_i(t)] A_i(t) in (16) does not depend on X, we treat it as a constant and focus on minimizing the second term of (16) only. For each i ∈ S, we split C into two disjoint sets J^i_1 and J^i_2, with J^i_1 ∪ J^i_2 = C:

J^i_1 = { j ∈ C | V W_{i,j} + Q^c_j(t) > V α_i + Q^s_i(t) }    (17)
J^i_2 = { j ∈ C | V W_{i,j} + Q^c_j(t) ≤ V α_i + Q^s_i(t) }    (18)

Note that X ∈ {0,1}^{|S|×|C|} and, for each switch i, Σ_{j∈C} X_{i,j} ≤ 1. Therefore, for each switch i ∈ S,

Σ_{j∈C} [ V W_{i,j} + Q^c_j(t) − V α_i − Q^s_i(t) ] X_{i,j} A_i(t)
  = ( Σ_{j∈J^i_1} [ V W_{i,j} + Q^c_j(t) − V α_i − Q^s_i(t) ] X_{i,j}
    + Σ_{j∈J^i_2} [ V W_{i,j} + Q^c_j(t) − V α_i − Q^s_i(t) ] X_{i,j} ) · A_i(t)    (19)

Next we show how to minimize (19). Given (i, j) ∈ S × C, we define

ω(i, j) = V W_{i,j} + Q^c_j(t) − V α_i − Q^s_i(t)    (20)
Then we consider two different cases.
• If J^i_2 = ∅, i.e., ω(i, j) > 0 for all j ∈ C, then the only way to minimize (19) is to set X*_{i,j} = 0 for all j ∈ C.
• If J^i_2 ≠ ∅, then we handle X_{i,j} for j ∈ J^i_1 and j ∈ J^i_2 separately. For j ∈ J^i_1, it is not hard to see that X*_{i,j} = 0. For j ∈ J^i_2, ω(i, j) ≤ 0, so we should set X*_{i,j*} = 1 for

j* = arg min_{j∈J^i_2} ω(i, j)    (21)

and X*_{i,j} = 0 for j ∈ J^i_2 − {j*}. In this way, given any X ∈ A, the following always holds for switch i:

Σ_{j∈J^i_1} ω(i, j) · X_{i,j} + Σ_{j∈J^i_2} ω(i, j) · X_{i,j}
  ≥ ( Σ_{j∈J^i_1} ω(i, j) ) · 0 + min_{j∈J^i_2} ω(i, j)
  = Σ_{j∈J^i_1} ω(i, j) · X*_{i,j} + Σ_{j∈J^i_2} ω(i, j) · X*_{i,j}    (22)
Therefore, the above process shows how to minimize (16). This yields the following algorithm:

Algorithm 1 Greedy Scheduling Algorithm
Input: During time slot t, the scheduler collects the queue length information from individual controllers and switches, i.e., Q^c(t), Q^s(t), and A(t)
Output: A scheduling association X ⊆ S × C
1: Start with an empty set X ← ∅
2: for each switch i ∈ S do
3:   Split all controllers C into two sets J^i_1 and J^i_2, where J^i_1 = {j ∈ C | ω(i, j) > 0} and J^i_2 = {j ∈ C | ω(i, j) ≤ 0}
4:   If J^i_2 = ∅, then skip the current iteration.
5:   If J^i_2 ≠ ∅, then choose a controller j* ∈ J^i_2 such that j* ∈ arg min_{j∈J^i_2} ω(i, j)
6:   X ← X ∪ {(i, j*)}
7: end for
8: return X

According to X, switches upload requests to controllers or append requests to their local queues. Then switches and controllers update their queue backlogs as in (2) and (3) after serving requests.
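Algorithm 1 can be sketched compactly in Python; this is our illustrative rendering (names such as greedy_schedule are not from the paper), computing ω(i, j) from (20) and keeping only a non-positive minimizer per switch:

```python
# Sketch of Algorithm 1 (Greedy Scheduling). For each switch i it
# evaluates omega(i, j) = V*W[i][j] + Qc[j] - V*alpha[i] - Qs[i]
# (Eq. (20)) and associates i with the controller minimizing omega,
# provided that minimum is non-positive (i.e., J^i_2 is non-empty).

def greedy_schedule(Qs, Qc, W, alpha, V):
    """Return {switch i: controller j}; switches absent from the
    dict process their requests locally."""
    association = {}
    for i in range(len(Qs)):
        best_j, best_w = None, 0.0       # only omega <= 0 qualifies
        for j in range(len(Qc)):
            w = V * W[i][j] + Qc[j] - V * alpha[i] - Qs[i]
            if w <= best_w:              # j belongs to J^i_2
                best_j, best_w = j, w
        if best_j is not None:
            association[i] = best_j      # upload to arg min omega
    return association
```

Note that each switch's decision depends only on W, Q^c(t), Q^s_i(t), and A_i(t), so the loop body could run per switch in parallel, as the complexity discussion below observes.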
Remarks:
• Our algorithm is greedy, since it greedily associates each switch with a controller that either has a small queue backlog or is close to the switch; otherwise, it leaves all requests to be processed locally.
• For switch i, given any controller j far enough from i, i.e., W_{i,j} > α_i, switch i decides to upload requests to j only if ω(i, j) is non-positive and smaller than any other. This requires that switch i itself holds enough requests locally, i.e., Q^s_i(t) ≥ V · (W_{i,j} − α_i) + Q^c_j(t); only then will it upload requests. Thus a smaller V more effectively invokes the willingness of switch i to upload requests to the control plane.
• On the other hand, given any controller j close to switch i, i.e., W_{i,j} < α_i, switch i will process requests locally only if the control plane holds large amounts of requests, i.e., Q^s_i(t) < V · (W_{i,j} − α_i) + Q^c_j(t). Thus, for any large V, controllers will have to hold heavy loads of requests before switches become willing to process requests locally.
• Therefore, the parameter V actually controls the switches' willingness to upload requests to controllers, i.e., to perform switch-controller association. In other words, it controls the trade-off between the communication cost and the computational cost, which are incurred by uploading requests to the control plane and by local processing, respectively.
We now turn to the time complexity analysis of our algorithm. Within each time slot, the algorithm runs |S| iterations in total. For each switch i, it takes O(|C|) steps to split C into two disjoint sets. If the resulting J^i_2 ≠ ∅, the algorithm then calculates ω(i, j) for each controller (|C| controllers in total) and picks j* with the minimum ω(i, j). For each switch-controller pair (i, j), calculating ω(i, j) takes only constant time according to (20). Therefore, during each time slot, our algorithm takes O(|S| × |C|) time to decide the optimal switch-controller association. In fact, our algorithm can also run in a parallel manner: for each switch i, given W, Q^c(t), Q^s_i(t), and A_i(t), the scheduler can decide its associated controller independently of the other switches.
Next we characterize the performance of our algorithm. Let g* and f* be the infima of the time-average computational cost and communication cost that we want to achieve, respectively. We also define d_max = max_{i,j} ( E{B²_j(t)}, E{U²_i(t)}, E{A²_i(t)} ). Then we have the following theorem⁵ on the [O(1/V), O(V)] trade-off between costs and queue backlogs:
Theorem 1: Given parameters V > 0, ε > 0, and a constant K ≥ (1/2) d_max · (|C| + |S| + |S|²), the queue vector process Q(t) is stable; moreover, the time-average expectations of the communication cost and computational cost, as well as the queue backlogs of switches and controllers, satisfy:

lim sup_{t→∞} ( f̄(t) + ḡ(t) ) ≤ f* + g* + K/V

lim sup_{t→∞} (1/t) Σ_{τ=0}^{t−1} [ Σ_{j∈C} E{Q^c_j(τ)} + Σ_{i∈S} E{Q^s_i(τ)} ] ≤ ( K + V · (f* + g*) ) / ε    (23)
IV. SIMULATION RESULTS
A. Basic Settings
Topology: In this section, we evaluate our Greedy scheduling algorithm under two widely adopted topologies: the Canonical 3-Tiered topology and the Fat-tree topology [12]. We show two of their instances in Fig. 2 and Fig. 3, respectively.
⁵See the proof in Appendix B.
[Figure 2: core, aggregate, and edge switch layers with hosts.]
Fig. 2. An instance of the Canonical 3-Tiered topology with k = 4, where k denotes the number of core switches. In this paper, the number of aggregate switches is also set to k, and each connects to k − 1 edge switches. The total number of switches is k² + k. Each edge switch is directly connected to k/2 hosts. Therefore, there are (k³ − k²)/2 hosts in total.
[Figure 3: Fat-tree topology with core, aggregate, and edge layers and hosts.]
Fig. 3. An instance of the Fat-tree topology with k = 4, where k denotes the number of pods. The numbers of core, aggregate, and edge switches are k²/4, k²/2, and k²/2, respectively, and the total number of switches is (5/4)k². Each edge switch is directly connected to k/2 hosts. Accordingly, there are k³/4 hosts in total.
To make our performance analysis comparable across the two topologies, we construct a Fat-tree network and a Canonical 3-Tiered network with almost the same number of switches. For the Canonical 3-Tiered topology, we set the number of core switches to 26; accordingly, the total number of switches is 702. For the Fat-tree topology, we set the port number to 24, and thus there are 720 switches in total. Note that the two resulting topologies are also comparable in size to commercial data centers [13].
In both topologies, we deploy controllers on the hosts (one controller for every two pods⁶), which are denoted by the blue circles in Fig. 2 and Fig. 3.
Traffic Workloads: We conduct trace-driven simulations, where the flow arrival process at each switch follows the distribution of flow inter-arrival times in [13], which is drawn from measurements in real-world data centers. In [13], the average flow inter-arrival time is about 1700 µs. We set the length of each time slot to 10 ms. Accordingly, the average flow arrival rate at each switch is about 5.88 flows per time slot.
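The stated per-slot rate follows directly from the two numbers above, as a quick sanity check shows:

```python
# Sanity check of the arrival-rate conversion: a mean flow
# inter-arrival time of 1700 us within a 10 ms slot gives about
# 5.88 flows per slot.
slot_us = 10_000            # 10 ms slot, in microseconds
inter_arrival_us = 1700     # average flow inter-arrival time
rate = slot_us / inter_arrival_us
print(round(rate, 2))       # about 5.88
```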
In fact, hot spots do exist within pods in real-world data center networks, where switches have significantly higher flow arrival rates. In our simulation, we pick the first pod as a hot spot, and all switches there have a significantly higher flow arrival rate, i.e., 200 flows per time slot. As for controllers, we set their individual capacity to 600 flows per time slot, which is consistent with the capacity of a typical NOX controller [14].
Costs: Given any network topology, we define the communication cost W_{i,j} between switch i and controller j as the length (number of hops) of the shortest path from i to j. Then we set a common computational cost α for all switches, which equals the average hop count between switches and controllers in the underlying topology. In the Fat-tree topology, α = 4.13, while in the 3-Tiered topology, α = 4.81.
⁶In the Canonical 3-Tiered topology, we regard the group of switches affiliated with the same aggregation switch as one pod (including the aggregation switch itself).
[Figure 4: (a) Total cost vs. V; (b) Total queue backlog vs. V.]
Fig. 4. Performance of Greedy under the Fat-tree topology and the Canonical 3-Tiered topology in terms of (a) the sum of total communication cost and computational cost, and (b) total queue backlog.
[Figure 5: (a) Total cost vs. V; (b) Total queue backlog vs. V.]
Fig. 5. Performance of Greedy under the Fat-tree topology with request arrivals that follow Poisson and Pareto processes in terms of (a) the sum of total communication cost and computational cost, and (b) total queue backlog.
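Since W_{i,j} is defined as the shortest-path hop count, it can be precomputed with a breadth-first search over the topology graph; a minimal sketch, with the adjacency-list representation and names being our own:

```python
# Hop count of the shortest path between two nodes, via BFS over an
# adjacency-list graph; used here to illustrate how the per-pair
# communication cost W[i][j] could be precomputed.
from collections import deque

def hop_count(graph, src, dst):
    """Shortest-path length in hops from src to dst, or None if
    dst is unreachable."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if node == dst:
            return dist
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return None
```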
B. Evaluation of the Greedy Algorithm
Fig. 4(a) shows how the sum of the long-term average communication cost and computational cost changes with V in the Fat-tree and 3-Tiered topologies. As V varies from 0 to 3.0×10⁷, we observe that the total cost goes down gradually. This is consistent with our theoretical analysis. The intuition behind this decline is as follows. Recall that V controls the switches' willingness to upload requests. For switches that are close to controllers (their communication cost is less than the average), a large V makes them unwilling to process requests locally unless the controllers become too heavily loaded. As V increases, those switches choose to upload requests to further reduce the costs, since for those switches the communication cost is less than the computational cost. Another observation is that the total cost in the 3-Tiered topology is higher than in the Fat-tree topology. The reason is that the 3-Tiered topology has a higher computational cost (α = 4.81 compared to 4.13), so switches pay more when processing requests locally. In Fig. 5(a), we also show the total cost of Greedy in the Fat-tree topology under two other request arrival processes. The Poisson and Pareto curves exhibit a qualitatively similar decline in total cost, although Pareto incurs a slightly (1.5%) larger queue backlog size than Poisson.
Fig. 4(b) shows the total queue backlog size for different values of V. From the figure, we notice that the total queue backlog size increases until V reaches about 0.75×10⁷ and 1.5×10⁷ in the Fat-tree and 3-Tiered topologies, respectively. This is also consistent with the O(V) queue backlog bound in (23). Recall our analysis of the total cost: a larger V drives most switches to upload requests to the control plane. However, the control plane's service capacity is fixed, so requests keep accumulating. Thus, when V becomes sufficiently large, the control plane eventually holds most of the requests in the system. This explains the increasing queue backlog size in Fig. 4(b). Fig. 5(b) presents the total queue backlog size in the Fat-tree topology when we apply Greedy with request arrivals that follow Poisson and Pareto processes. Note that we do not show the corresponding results for the 3-Tiered topology, because the curves there almost overlap with those in Fig. 5(b).
[Figure 6: (a) Canonical 3-Tiered topology; (b) Fat-tree topology.]
Fig. 6. Communication cost comparison among the four scheduling schemes under the Canonical 3-Tiered topology and the Fat-tree topology, respectively.
C. Comparison with Other Dynamic Association Schemes
In this subsection, we consider the extreme case of setting a common computational cost α = 2.0×10²⁸ for all switches. This means that the cost of locally processing requests is prohibitively high, so in each time slot switches choose to upload requests to controllers. Thus our greedy algorithm degenerates into a dynamic switch-controller association algorithm. We compare its performance with three other schemes: Static, Random, and JSQ (Join-the-Shortest-Queue). In the static scheme, each switch i chooses the controller j with the minimum communication cost W_{i,j} and then fixes this association for all time slots. In the random scheme, the scheduler randomly picks a controller for each switch in each time slot. The JSQ scheme randomly picks one switch i without replacement, round by round, until all switches have chosen their target controllers. In each round, the selected switch i chooses the controller j with the smallest queue backlog size at present and appends all its requests to j's queue.
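The JSQ baseline described above can be sketched as follows; this is our illustrative rendering (function and variable names are not from the paper):

```python
# Sketch of the JSQ baseline: switches are visited in random order,
# and each joins the controller with the currently shortest queue,
# committing its A[i] requests before the next switch decides.
import random

def jsq_schedule(num_switches, Qc, A, rng=random):
    """Qc: current controller backlogs; A[i]: requests of switch i.
    Returns {switch i: controller j}."""
    backlog = list(Qc)                   # work on a copy
    order = list(range(num_switches))
    rng.shuffle(order)                   # random round-by-round order
    association = {}
    for i in order:
        j = min(range(len(backlog)), key=backlog.__getitem__)
        association[i] = j
        backlog[j] += A[i]               # account for queued requests
    return association
```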
Fig. 6 presents a comparison among Static, Random, JSQ, and Greedy in terms of communication cost under the Canonical 3-Tiered topology and the Fat-tree topology, respectively. First, the communication cost under Static is the minimum among all schemes, which is consistent with its goal of minimizing the overall communication cost. Greedy cuts down the communication cost as V increases; eventually, when V is sufficiently large, the communication cost stops decreasing and remains unchanged. Both Random and JSQ exhibit much higher communication costs than Greedy and Static.
Fig. 7 presents a comparison among the four schemes in
terms of the variance of queue backlog size under the Canonical
3-Tiered topology and the Fat-tree topology, respectively. A
smaller variance of queue backlog size indicates a better load-balancing
capability. The variance of Static grows exponentially
with time, showing that Static is incompetent at load balancing.
The reason is that Static greedily associates switches with
their nearest controllers, ignoring the different controllers' loads.
For Random and JSQ, the variance is almost 0,
which shows these two schemes' advantage in load balancing.
The variance of Greedy lies in between the two extremes: it
(a) Canonical 3-Tiered topology (b) Fat-tree topology
Fig. 7. Variance of queue backlog size comparison among four scheduling schemes under Canonical 3-Tiered topology and Fat-tree topology, respectively.
(a) Canonical 3-Tiered topology (b) Fat-tree topology
Fig. 8. Average queue backlog size comparison among four scheduling schemes under Canonical 3-Tiered topology and Fat-tree topology, respectively.
(a) Poisson (b) Pareto
Fig. 9. Communication cost comparison among four scheduling schemes
under Fat-tree topology, when the flow arrival follows Poisson and Pareto,
respectively.
increases at the beginning and then stabilizes after a few
thousand time slots. Furthermore, Greedy exhibits a higher
variance of queue backlog size with larger V, i.e., the load
among controllers is more imbalanced.
Fig. 8 presents a comparison among the four schemes in
terms of the average queue backlog size under the Canonical 3-Tiered
topology and the Fat-tree topology, respectively. The
observations in Fig. 8 are consistent with those of
Fig. 7. Intuitively, the more balanced the controllers' loads
are, the smaller the average queue backlog size. In Fig. 7,
the variance of Static is high while that of Random and JSQ
is almost 0; correspondingly, in Fig. 8 the average queue backlog
size of Static is large while that of Random and JSQ is small. As
for Greedy, its variance increases with V in Fig. 7, so its
average queue backlog size rises as V increases, as
shown in Fig. 8.
In addition to trace-driven simulation, we also conduct
the comparison under two other assumptions on the flow arrival
process, i.e., Poisson and Pareto processes, both of which are
widely adopted in traffic analysis. We only show the simulation
results under the Fat-tree topology, because the simulation results
under the 3-Tiered topology are qualitatively similar to those under the
Fat-tree topology. Fig. 9 shows the communication cost comparison
(a) Poisson (b) Pareto
Fig. 10. Average queue backlog size comparison among four scheduling
schemes under Fat-tree topology, when the flow arrival follows Poisson and
Pareto, respectively.
when the flow arrival process follows Poisson and Pareto,
respectively. Fig. 10 shows the average queue backlog size
comparison under the same two arrival processes. We can see from
these figures that the scheduling policies perform qualitatively
consistently under different arrival processes.
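For reproducibility, per-slot arrival counts under the two assumptions could be generated along the following lines. This is a sketch only: the paper does not specify the rates or tail parameters used, so `rate` and `shape` below are illustrative assumptions, and the Pareto variant is just one common way to produce heavy-tailed counts with a target mean.

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_arrivals(rate, slots):
    # i.i.d. Poisson-distributed request counts, one per time slot
    return rng.poisson(lam=rate, size=slots)

def pareto_arrivals(rate, slots, shape=2.5):
    # Heavy-tailed counts: NumPy's pareto() is Lomax, so (sample + 1) follows
    # Pareto I with minimum 1 and mean shape/(shape-1); rescale so the
    # long-run mean is roughly `rate`, then floor to integer request counts.
    raw = rng.pareto(shape, size=slots) + 1.0
    return np.floor(raw * rate * (shape - 1) / shape).astype(int)
```

With `shape` just above 2 the Pareto counts keep a finite mean but exhibit much burstier slots than the Poisson counts, which is what makes the comparison between the two processes informative.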
In summary, Static sits at one end of the performance
spectrum: it minimizes the communication cost while incurring
extremely large queue backlogs. Random and JSQ sit at
the other end: they minimize the average queue backlog while
incurring much larger communication costs. In contrast, our
Greedy scheme achieves a trade-off between minimizing
communication costs and minimizing queue backlogs.
Through the tunable parameter V, we can achieve different
degrees of balance between cost minimization and latency
(queue backlog) minimization.
V. CONCLUSION
In this paper, we studied the joint optimization problem
of dynamic switch-controller association and dynamic
control devolution for SDN systems. We formulated the
problem as a stochastic network optimization problem,
aiming to minimize the long-term average sum
of the total communication cost and computational cost while
maintaining low time-average queue backlogs. We proposed
an efficient online greedy algorithm, which yields a long-term
average sum of communication and computational costs
within O(1/V) of optimality, with a trade-off in an O(V)
queue backlog size for any positive control parameter V.
Extensive simulation results show the effectiveness and
optimality of our online algorithm, as well as its ability to maintain
a tunable trade-off, compared to other dynamic association
schemes.
ACKNOWLEDGMENT
This work was supported in part by NSFC grant No. 61302114, and Hong Kong RGC CRF-C7036-15G.
APPENDIX A
PROBLEM TRANSFORMATION BY OPPORTUNISTICALLY MINIMIZING AN EXPECTATION
By minimizing the upper bound of the drift-plus-penalty
expression (14), the time average of the communication cost can
be minimized while stabilizing the network of request queues.
We denote the objective function of (16) by $J_t(X)$ and its
optimal solution by $X^* \in \mathcal{A}$.
Therefore, for any other scheduling decision $X \in \mathcal{A}$ made
during time slot $t$, we have
$$ J_t(X^*) \le J_t(X) \tag{24} $$
Taking the expectation on both sides conditioned on $Q^c(t)$, we have
$$ \mathbb{E}\left[ J_t(X^*) \mid Q^c(t) \right] \le \mathbb{E}\left[ J_t(X) \mid Q^c(t) \right] \tag{25} $$
for any $X \in \mathcal{A}$. In this way, instead of directly solving
the long-term stochastic optimization problem (11), we
can opportunistically choose a feasible association by solving
problem (16) during each time slot.
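As an illustration of such per-slot opportunistic minimization, the sketch below assumes the per-switch objective decomposes into a V-weighted cost term plus a backlog-weighted arrival term; the exact form of $J_t(X)$ is given in (16) of the paper, so this decomposition is an assumption made purely for illustration.

```python
def greedy_decision(i, A_i, controllers, W, alpha, Qc, Qs, V):
    """Per-slot choice for switch i under an assumed decomposed objective:
    upload to the controller j minimizing V*W[i][j] + Qc[j]*A_i, or process
    locally at cost V*alpha + Qs[i]*A_i. Returns ('upload', j) or ('local', None).
    This is NOT the paper's exact J_t, only an illustrative stand-in."""
    best_j = min(controllers, key=lambda j: V * W[i][j] + Qc[j] * A_i)
    upload_cost = V * W[i][best_j] + Qc[best_j] * A_i
    local_cost = V * alpha + Qs[i] * A_i
    return ('upload', best_j) if upload_cost <= local_cost else ('local', None)
```

The tunable role of V is visible directly in the expression: a large V makes the V-weighted cost terms dominate (Static-like behavior), while a small V makes the backlog terms dominate (load-balancing behavior).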
APPENDIX B
PROOF OF THEOREM 1
Given $L(Q(t))$ defined in (12), we have
$$
\begin{aligned}
& L(Q(t+1)) - L(Q(t)) \\
&= \frac{1}{2}\Big[ \sum_{j\in\mathcal{C}} \big( Q^c_j(t+1)^2 - Q^c_j(t)^2 \big) + \sum_{i\in\mathcal{S}} \big( Q^s_i(t+1)^2 - Q^s_i(t)^2 \big) \Big] \\
&\le \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ \Big( Q^c_j(t) - B_j(t) + \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 - Q^c_j(t)^2 \Big] \\
&\quad + \frac{1}{2} \sum_{i\in\mathcal{S}} \Big[ \big( Q^s_i(t) - U_i(t) + Y_i A_i(t) \big)^2 - Q^s_i(t)^2 \Big] \\
&= \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ 2 Q^c_j(t) \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) - B_j(t) \Big) + \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) - B_j(t) \Big)^2 \Big] \\
&\quad + \frac{1}{2} \sum_{i\in\mathcal{S}} \Big[ \big( Y_i A_i(t) - U_i(t) \big)^2 + 2 Q^s_i(t) \big( Y_i A_i(t) - U_i(t) \big) \Big] \\
&\le \sum_{j\in\mathcal{C}} \Big[ Q^c_j(t) \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) - B_j(t) \Big) + \frac{\big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \big)^2 + B_j(t)^2}{2} \Big] \\
&\quad + \sum_{i\in\mathcal{S}} \Big[ Q^s_i(t) \big( Y_i A_i(t) - U_i(t) \big) + \frac{\big( Y_i A_i(t) \big)^2 + U_i(t)^2}{2} \Big]
\end{aligned} \tag{26}
$$
Then, with the definition of $\Delta(Q(t))$ in (13), we have
$$
\begin{aligned}
\Delta(Q(t)) &= \mathbb{E}\{ L(Q(t+1)) - L(Q(t)) \mid Q(t) \} \\
&\le \mathbb{E}\Big\{ \sum_{j\in\mathcal{C}} Q^c_j(t) \Big( \sum_{i\in\mathcal{S}} X_{i,j}(t) A_i(t) - B_j(t) \Big) \,\Big|\, Q(t) \Big\}
  + \mathbb{E}\Big\{ \sum_{i\in\mathcal{S}} Q^s_i(t) \big( Y_i(t) A_i(t) - U_i(t) \big) \,\Big|\, Q(t) \Big\} \\
&\quad + \frac{1}{2} \mathbb{E}\Big\{ \sum_{j\in\mathcal{C}} \Big[ \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 + B_j(t)^2 \Big] \,\Big|\, Q(t) \Big\}
  + \frac{1}{2} \mathbb{E}\Big\{ \sum_{i\in\mathcal{S}} \big[ (Y_i A_i(t))^2 + U_i(t)^2 \big] \,\Big|\, Q(t) \Big\} \\
&= \sum_{j\in\mathcal{C}} Q^c_j(t) \, \mathbb{E}\Big\{ \sum_{i\in\mathcal{S}} X_{i,j}(t) A_i(t) - B_j(t) \,\Big|\, Q(t) \Big\}
  + \sum_{i\in\mathcal{S}} Q^s_i(t) \, \mathbb{E}\big\{ Y_i(t) A_i(t) - U_i(t) \mid Q(t) \big\} \\
&\quad + \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 + B_j(t)^2 \Big]
  + \frac{1}{2} \sum_{i\in\mathcal{S}} \big[ (Y_i A_i(t))^2 + U_i(t)^2 \big]
\end{aligned} \tag{27}
$$
The last equality in (27) holds because, conditioned on $Q(t)$, both
$Q^s_i(t)$ and $Q^c_j(t)$ can be regarded as constants. Besides, the
queueing process $\{Q(t)\}$ is independent of the arrival process
$\{A(t)\}$ and the service processes $\{B(t)\}$ and $\{U(t)\}$; hence
the last two terms do not depend on $Q(t)$. Now consider the last
two terms in (27). We have
$$
\begin{aligned}
& \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 + B_j(t)^2 \Big] + \frac{1}{2} \sum_{i\in\mathcal{S}} \big[ (Y_i A_i(t))^2 + U_i(t)^2 \big] \\
&= \frac{1}{2} \Big[ \sum_{j\in\mathcal{C}} B_j(t)^2 + \sum_{i\in\mathcal{S}} U_i(t)^2 \Big]
 + \frac{1}{2} \sum_{j\in\mathcal{C}} \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2
 + \frac{1}{2} \sum_{i\in\mathcal{S}} \Big[ \Big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \Big) A_i(t) \Big]^2
\end{aligned} \tag{28}
$$
Then, by taking the expectation of (28), the following holds:
$$
\begin{aligned}
& \mathbb{E}\Big\{ \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 + B_j(t)^2 \Big] + \frac{1}{2} \sum_{i\in\mathcal{S}} \big[ (Y_i A_i(t))^2 + U_i(t)^2 \big] \Big\} \\
&= \mathbb{E}\Big\{ \frac{1}{2} \Big[ \sum_{j\in\mathcal{C}} B_j(t)^2 + \sum_{i\in\mathcal{S}} U_i(t)^2 \Big]
 + \frac{1}{2} \sum_{j\in\mathcal{C}} \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2
 + \frac{1}{2} \sum_{i\in\mathcal{S}} \Big[ \Big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \Big) A_i(t) \Big]^2 \Big\} \\
&= \frac{1}{2} \Big[ \sum_{j\in\mathcal{C}} \mathbb{E}\{ B_j(t)^2 \} + \sum_{i\in\mathcal{S}} \mathbb{E}\{ U_i(t)^2 \} \Big]
 + \frac{1}{2} \sum_{j\in\mathcal{C}} \mathbb{E}\Big\{ \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 \Big\}
 + \frac{1}{2} \sum_{i\in\mathcal{S}} \mathbb{E}\Big\{ \Big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \Big)^2 A_i(t)^2 \Big\} \\
&= \frac{1}{2} \Big[ \sum_{j\in\mathcal{C}} \mathbb{E}\{ B_j(t)^2 \} + \sum_{i\in\mathcal{S}} \mathbb{E}\{ U_i(t)^2 \} \Big]
 + \frac{1}{2} \sum_{j\in\mathcal{C}} \mathbb{E}\Big\{ \sum_{i\in\mathcal{S}} X_{i,j}^2 A_i(t)^2 + 2 \sum_{i<i'} X_{i,j} X_{i',j} A_i(t) A_{i'}(t) \Big\} \\
&\quad + \frac{1}{2} \sum_{i\in\mathcal{S}} \mathbb{E}\Big\{ \Big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \Big)^2 A_i(t)^2 \Big\}
\end{aligned} \tag{29}
$$
Recall that the request arrival processes $\{A(t)\}$ are independent,
and they are also independent of $X_{i,j}$ for $(i,j) \in \mathcal{S} \times \mathcal{C}$.
Then we have
$$
\begin{aligned}
& \mathbb{E}\Big\{ \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 + B_j(t)^2 \Big] + \frac{1}{2} \sum_{i\in\mathcal{S}} \big[ (Y_i A_i(t))^2 + U_i(t)^2 \big] \Big\} \\
&= \frac{1}{2} \Big[ \sum_{j\in\mathcal{C}} \mathbb{E}\{ B_j(t)^2 \} + \sum_{i\in\mathcal{S}} \mathbb{E}\{ U_i(t)^2 \} \Big] \\
&\quad + \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ \sum_{i\in\mathcal{S}} \mathbb{E}\{ X_{i,j}^2 \} \, \mathbb{E}\{ A_i(t)^2 \}
 + 2 \sum_{i<i'} \mathbb{E}\{ X_{i,j} \} \, \mathbb{E}\{ X_{i',j} \} \, \mathbb{E}\{ A_i(t) \} \, \mathbb{E}\{ A_{i'}(t) \} \Big] \\
&\quad + \frac{1}{2} \sum_{i\in\mathcal{S}} \mathbb{E}\Big\{ \Big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \Big)^2 \Big\} \, \mathbb{E}\{ A_i(t)^2 \} \\
&\le \frac{1}{2} \Big[ C \cdot \max_{j\in\mathcal{C}} \mathbb{E}\{ B_j^2(t) \} + S \cdot \max_{i\in\mathcal{S}} \mathbb{E}\{ U_i^2(t) \}
 + \max_{i\in\mathcal{S}} \mathbb{E}\{ A_i^2(t) \} \Big( \sum_{j\in\mathcal{C}} \mathbb{E}\Big\{ \Big( \sum_{i\in\mathcal{S}} X_{i,j} \Big)^2 \Big\}
 + \sum_{i\in\mathcal{S}} \mathbb{E}\Big\{ \Big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \Big)^2 \Big\} \Big) \Big] \\
&\le \frac{1}{2} \max_{i,j} \big\{ \mathbb{E}\{ B_j^2(t) \}, \mathbb{E}\{ U_i^2(t) \}, \mathbb{E}\{ A_i^2(t) \} \big\}
 \cdot \Big[ C + S + \max_{X\in\mathcal{A}} \Big( \sum_{j\in\mathcal{C}} \Big( \sum_{i\in\mathcal{S}} X_{i,j} \Big)^2 + \sum_{i\in\mathcal{S}} Y_i(t)^2 \Big) \Big]
\end{aligned} \tag{30}
$$
where the first inequality holds because of the following reasoning.
Suppose $i^* \in \arg\max_{i\in\mathcal{S}} A_i(t)$. Then for any $i, i' \in \mathcal{S}$,
$$ \mathbb{E}\{ A_i(t) \} \cdot \mathbb{E}\{ A_{i'}(t) \} \le \big( \mathbb{E}\{ A_{i^*}(t) \} \big)^2 \tag{31} $$
Since $\mathrm{Var}\{ A_{i^*}(t) \} \ge 0$, we further have
$$ \mathbb{E}\{ A_i(t) \} \cdot \mathbb{E}\{ A_{i'}(t) \} \le \mathbb{E}\{ A_{i^*}^2(t) \} \tag{32} $$
Thus the first inequality in (30) holds.
Next, we focus on the upper bound of
$\sum_{j\in\mathcal{C}} \big( \sum_{i\in\mathcal{S}} X_{i,j} \big)^2 + \sum_{i\in\mathcal{S}} \big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \big)^2$
for $X \in \mathcal{A}$.
At each time slot, $X_{i,j} \in \{0, 1\}$, and each switch
$i \in \mathcal{S}$ must decide either to upload its requests to one
of the controllers or to process them locally. Then, among all
$X_{i,j}$ (for $(i,j) \in \mathcal{S} \times \mathcal{C}$) and all
$(1 - \sum_{j\in\mathcal{C}} X_{i,j})$ (for $i \in \mathcal{S}$),
exactly $|\mathcal{S}|$ of them equal one.
Let $a \in [0, |\mathcal{S}|]$ denote the number of switches that decide
to upload requests to the control plane, i.e., there are $a$ terms
among all $X_{i,j}$ (for $(i,j) \in \mathcal{S} \times \mathcal{C}$) that equal one.
Likewise, let $b \in [0, |\mathcal{S}|]$ denote the number of switches
that process requests locally. Accordingly, we know that
$a + b = |\mathcal{S}|$. Besides,
$\sum_{i\in\mathcal{S}} \big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \big)^2 = b$, since
there are exactly $b$ switches such that, for each switch $i$ among
them, $\sum_{j\in\mathcal{C}} X_{i,j} = 0$.
Now we prove that the upper bound of
$\sum_{j\in\mathcal{C}} \big( \sum_{i\in\mathcal{S}} X_{i,j} \big)^2$
is $a^2$, and that the bound is reached when all $a$ switches are
associated with the same controller. We use $\mathcal{R}$ to denote the
set of those $a$ switches, and introduce the indicator $I_{k,l}$ such that
$I_{k,l} = 1$ if switch $k$ and switch $l$ are associated with the same
controller, and $0$ otherwise. Therefore, for any switch-controller
association $\mathcal{N} \subseteq \mathcal{R} \times \mathcal{C}$ such that $|\mathcal{N}| = a$, we have
$$
\begin{aligned}
\sum_{j\in\mathcal{C}} \Big( \sum_{i\in\mathcal{S}} X_{i,j} \Big)^2
&= \sum_{j\in\mathcal{C}} \Big[ \sum_{i:(i,j)\in\mathcal{N}} X_{i,j}^2
 + 2 \sum_{\substack{i,i'\in\mathcal{R}:\, i<i' \\ (i,j),(i',j)\in\mathcal{N}}} X_{i,j} X_{i',j} \Big] \\
&= \sum_{(i,j)\in\mathcal{N}} X_{i,j}^2 + 2 \sum_{j\in\mathcal{C}} \sum_{i,i':\, i<i'} I_{i,i'} X_{i,j} X_{i',j} \\
&= \sum_{(i,j)\in\mathcal{N}} X_{i,j}^2 + 2 \sum_{i,i'\in\mathcal{R}:\, i<i'} I_{i,i'} \\
&= a + 2 \sum_{i,i'\in\mathcal{R}:\, i<i'} I_{i,i'}
\end{aligned} \tag{33}
$$
where the last equality holds because, for any pair of switches
$(i, i')$, $I_{i,i'} = 1$ only when $i$ and $i'$ upload requests to the
same controller. From (33), we know that the upper bound is
reached when $I_{i,i'} = 1$ for all $i, i' \in \mathcal{R}$, i.e., when all switches
in $\mathcal{R}$ are connected to the same controller. In that case, since there
are $\frac{1}{2} a(a-1)$ pairs of distinct switches, the upper bound
of $\sum_{j\in\mathcal{C}} \big( \sum_{i\in\mathcal{S}} X_{i,j} \big)^2$ is $a + a(a-1) = a^2$. Hence,
$$
\sum_{j\in\mathcal{C}} \Big( \sum_{i\in\mathcal{S}} X_{i,j} \Big)^2
 + \sum_{i\in\mathcal{S}} \Big( 1 - \sum_{j\in\mathcal{C}} X_{i,j} \Big)^2
 \le a^2 + b
 = a^2 + |\mathcal{S}| - a
 = \Big( a - \frac{1}{2} \Big)^2 + |\mathcal{S}| - \frac{1}{4} \tag{34}
$$
Since $a$ is a non-negative integer with $0 \le a \le |\mathcal{S}|$,
the upper bound in (34) reaches its maximum value $|\mathcal{S}|^2$ when
$a = |\mathcal{S}|$. In other words, the upper bound reaches its maximum
when all switches in $\mathcal{S}$ upload requests to the same controller.
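That the integer maximizer of $a^2 + |\mathcal{S}| - a$ over $0 \le a \le |\mathcal{S}|$ is $a = |\mathcal{S}|$ follows from the convexity of the bound, and can also be sanity-checked numerically; a quick sketch (the value of `S` is arbitrary):

```python
def bound(a, S):
    # right-hand side of (34): a^2 + |S| - a
    return a * a + S - a

S = 16  # an arbitrary number of switches for the check
vals = [bound(a, S) for a in range(S + 1)]
# the maximum |S|^2 is attained at a = |S|
assert max(vals) == vals[S] == S * S
```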
As a result,
$$
\begin{aligned}
& \mathbb{E}\Big\{ \frac{1}{2} \sum_{j\in\mathcal{C}} \Big[ \Big( \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \Big)^2 + B_j(t)^2 \Big] + \frac{1}{2} \sum_{i\in\mathcal{S}} \big[ (Y_i A_i(t))^2 + U_i(t)^2 \big] \Big\} \\
&\le \frac{1}{2} \max_{i,j} \big\{ \mathbb{E}\{ B_j^2(t) \}, \mathbb{E}\{ U_i^2(t) \}, \mathbb{E}\{ A_i^2(t) \} \big\} \cdot \big( |\mathcal{C}| + |\mathcal{S}| + |\mathcal{S}|^2 \big) \\
&= \frac{d_{\max}}{2} \big( |\mathcal{C}| + |\mathcal{S}| + |\mathcal{S}|^2 \big) = K
\end{aligned} \tag{35}
$$
We assume the whole control plane is capable of handling
all requests from the data plane in the mean sense. Therefore,
for each $j \in \mathcal{C}$, there exists $\epsilon^c_j > 0$ such that
$\mathbb{E}[ B_j(t) - \sum_{i\in\mathcal{S}} X_{i,j} A_i(t) \mid Q^c(t) ] = \epsilon^c_j$.
Likewise, for each $i \in \mathcal{S}$, there exists $\epsilon^s_i > 0$ such that
$\mathbb{E}[ U_i(t) - Y_i A_i(t) \mid Q^c(t) ] = \epsilon^s_i$.
Following (35) and the definition in (14), after taking the expectation
of $\Delta_V(Q(t))$, we have
$$
\begin{aligned}
\mathbb{E}\{ \Delta_V(Q(t)) \}
&\le K + \sum_{j\in\mathcal{C}} \mathbb{E}\Big\{ Q^c_j(t) \cdot \mathbb{E}\Big\{ \sum_{i\in\mathcal{S}} X_{i,j}(t) A_i(t) - B_j(t) \,\Big|\, Q(t) \Big\} \Big\} \\
&\quad + \sum_{i\in\mathcal{S}} \mathbb{E}\Big\{ Q^s_i(t) \cdot \mathbb{E}\Big\{ \Big[ 1 - \sum_{j\in\mathcal{C}} X_{i,j}(t) \Big] A_i(t) - U_i(t) \,\Big|\, Q(t) \Big\} \Big\}
 + V \cdot \mathbb{E}\{ \mathbb{E}\{ f(t) + g(t) \mid Q(t) \} \} \\
&\le K - \epsilon^c \sum_{j\in\mathcal{C}} \mathbb{E}\{ Q^c_j(t) \} - \epsilon^s \sum_{i\in\mathcal{S}} \mathbb{E}\{ Q^s_i(t) \} + V \cdot ( f^* + g^* )
\end{aligned} \tag{36}
$$
where $\epsilon^c = \min_{j\in\mathcal{C}} \{ \epsilon^c_j \}$ and
$\epsilon^s = \min_{i\in\mathcal{S}} \{ \epsilon^s_i \}$. Expanding the
term $\mathbb{E}\{ \Delta_V(Q(t)) \}$, for any time slot $\tau$ we have
$$
\mathbb{E}\{ L(Q(\tau+1)) - L(Q(\tau)) \} + V \cdot \mathbb{E}\{ f(\tau) + g(\tau) \}
\le K - \epsilon^c \sum_{j\in\mathcal{C}} \mathbb{E}\{ Q^c_j(\tau) \}
 - \epsilon^s \sum_{i\in\mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} + V ( f^* + g^* ) \tag{37}
$$
Next, summing over $\tau \in \{0, 1, 2, \ldots, t-1\}$ for some $t > 0$,
we have
$$
\mathbb{E}\{ L(Q(t)) - L(Q(0)) \} + V \cdot \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ]
\le t K - \epsilon^c \sum_{\tau=0}^{t-1} \sum_{j\in\mathcal{C}} \mathbb{E}\{ Q^c_j(\tau) \}
 - \epsilon^s \sum_{\tau=0}^{t-1} \sum_{i\in\mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \}
 + t V ( f^* + g^* ) \tag{38}
$$
By rearranging terms on both sides and dropping non-negative
terms such as $\mathbb{E}\{ L(Q(t)) \}$ and $\mathbb{E}\{ Q^c_j(t) \}$, with
$\epsilon^c, \epsilon^s > 0$ and $V > 0$, we have
$$ V \cdot \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ] \le t V ( f^* + g^* ) + t K + \mathbb{E}\{ L(Q(0)) \} \tag{39} $$
$$ \epsilon^c \cdot \sum_{\tau=0}^{t-1} \sum_{j\in\mathcal{C}} \mathbb{E}\{ Q^c_j(\tau) \} \le t V ( f^* + g^* ) + t K + \mathbb{E}\{ L(Q(0)) \} \tag{40} $$
$$ \epsilon^s \cdot \sum_{\tau=0}^{t-1} \sum_{i\in\mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \le t V ( f^* + g^* ) + t K + \mathbb{E}\{ L(Q(0)) \} \tag{41} $$
Then, dividing both sides of (39) by $V \cdot t$, of (40) by $\epsilon^c \cdot t$,
and of (41) by $\epsilon^s \cdot t$, we have
$$ \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ] \le ( f^* + g^* ) + \frac{K}{V} + \frac{\mathbb{E}\{ L(Q(0)) \}}{V \cdot t} \tag{42} $$
$$ \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j\in\mathcal{C}} \mathbb{E}\{ Q^c_j(\tau) \} \le \frac{V ( f^* + g^* ) + K}{\epsilon^c} + \frac{\mathbb{E}\{ L(Q(0)) \}}{\epsilon^c \cdot t} \tag{43} $$
$$ \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{i\in\mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \le \frac{V ( f^* + g^* ) + K}{\epsilon^s} + \frac{\mathbb{E}\{ L(Q(0)) \}}{\epsilon^s \cdot t} \tag{44} $$
At last, taking the limit as $t \to \infty$ in each inequality, we obtain
the desired results:
$$ \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ] \le f^* + g^* + \frac{K}{V} \tag{45} $$
$$ \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j\in\mathcal{C}} \mathbb{E}\{ Q^c_j(\tau) \} \le \frac{V ( f^* + g^* ) + K}{\epsilon^c} \tag{46} $$
$$ \lim_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{i\in\mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \le \frac{V ( f^* + g^* ) + K}{\epsilon^s} \tag{47} $$
By setting $\epsilon = \min\{ \epsilon^c, \epsilon^s \}$, the following desired
result holds:
$$ \limsup_{t\to\infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \Big[ \sum_{j\in\mathcal{C}} \mathbb{E}\{ Q^c_j(\tau) \} + \sum_{i\in\mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \Big] \le \frac{K + V ( f^* + g^* )}{\epsilon} \tag{48} $$
... A realistic, in-band control plane channel was realized at all times. To realize the switching plane, a number of interconnected Open vSwitch 2 v2.8.2 virtual switches were [38], [39]. Each switch instance in the Internet2 topology is allocated a client, while the fat-tree topology hosts clients at the leaf switches. ...
... The links of the fat- tree topology only posses the inherit processing and queuing delays. The arrival rates of the incoming service embedding re- quests were modeled using a negative exponential distribution [39]. In the fat-tree topology, each leaf-switch was connected to 2 client instances, bringing the total number of clients up to 16. ...
... The resulting controller placement is depicted in Fig. 4a. The SDN controller replicas in the data- center topology were deployed on the leaf-nodes, similar to the controller placement presented in [39]. [5,10,15,20] [s] Availability failure rate ...
Article
Full-text available
Current approaches to tackling the single point of failure in SDN entail a distributed operation of SDN controller instances. Their state synchronization process is reliant on the assumption of a correct decision-making in the controllers. Successful introduction of SDN in the critical infrastructure networks also requires catering to the issue of unavailable, unreliable (e.g. buggy) and malicious controller failures. We propose MORPH, a framework tolerant to unavailability and Byzantine failures, that distinguishes and localizes faulty controller instances and appropriately reconfigures the control plane. Our controller-switch connection assignment leverages the awareness of the source of failure to optimize the number of active controllers and minimize the controller and switch reconfiguration delays. The proposed re-assignment executes dynamically after each successful failure identification. We require 2FM +FA+1 controllers to tolerate FM malicious and FA availability-induced failures. After a successful detection of FM malicious controllers, MORPH reconfigures the control plane to require a single controller message to forward the system state. Next, we outline and present a solution to the practical correctness issues related to the statefulness of the distributed SDN controller applications, previously ignored in the literature. We base our performance analysis on a resource-aware routing application, deployed in an emulated testbed comprising up to 16 controllers and up to 34 switches, so to tolerate up to 5 unique Byzantine and additional 5 availability-induced controller failures (a total of 10 unique controller failures). We quantify and highlight the dynamic decrease in the packet and CPU load and the response time after each successful failure detection.
... Online optimizations [21]- [25] are able to outperform their static (or one-off) counterparts [11]- [20] in systems with rich variations. In [21]- [23], dynamic association of controllers and switches was developed based on OpenFlow 1.3, typically under the assumption of one-off placement of controllers. Every controller remains active and processes flow requests from the switches in its domain, despite some of the controllers are lightly loaded and could be switched off for energy and cost savings [21]- [23]. ...
... In [21]- [23], dynamic association of controllers and switches was developed based on OpenFlow 1.3, typically under the assumption of one-off placement of controllers. Every controller remains active and processes flow requests from the switches in its domain, despite some of the controllers are lightly loaded and could be switched off for energy and cost savings [21]- [23]. Only a few heuristics [24], [25] have been designed to periodically turn on/off controllers and update the association of controllers and switches in a centralized manner, adapting to the variations of requests. ...
... Dynamic controller-switch association was studied in [21]- [23] to balance workloads by adaptively reassigning the switches to the controllers, under the assumption that the controllers remain active once turned on. In [21], the problem was solved in two phases, where the results of a stable matching between switches and controllers in the first phase were regarded as the input of the coalition game in the second phase to achieve a Nash equilibrium. ...
Article
Full-text available
Decentralized orchestration of the control plane is critical to the scalability and reliability of software-defined network (SDN). However, existing orchestrations of SDN are either one-off or centralized, and would be inefficient the presence of temporal and spatial variations in traffic requests. In this paper, a fully distributed orchestration is proposed to minimize the time-average cost of SDN, adapting to the variations. This is achieved by stochastically optimizing the on-demand activation of controllers, adaptive association of controllers and switches, and real-time request processing and dispatching. The proposed approach is able to operate at multiple timescales for activation and association of controllers, and request processing and dispatching, thereby alleviating potential service interruptions caused by orchestration. A new analytic framework is developed to confirm the asymptotic optimality of the proposed approach in the presence of non-negligible signaling delays between controllers. Corroborated from extensive simulations, the proposed approach can save up to 73% the time-average operational cost of SDN, as compared to the existing static orchestration.
... The controllers in the Internet2 scenario were placed so to maximize the system coverage against failures as per [6], [10]. The controllers of the fat-tree topology were placed on the leaf-nodes as per [6], [11]. The state synchronization between the controllers and the resulting switch reconfigurations occur in in-band control mode. ...
... The arrivals for incoming service requests are modeled using n.e.d. [11]. ...
Conference Paper
Full-text available
Distributed Software Defined Networking (SDN) controllers aim to solve the issue of single-point-of-failure and improve the scalability of the control plane. Byzantine and faulty controllers, however, may enforce incorrect configurations and thus endanger the control plane correctness. Multiple Byzantine Fault Tolerance (BFT) approaches relying on Replicated State Machine (RSM) execution have been proposed in the past to cater for this issue. The scalability of such solutions is, however, limited. Additionally, the interplay between progressing the state of the distributed controllers and the consistency of the external reconfigurations of the forwarding devices has not been thoroughly investigated. In this work, we propose an agreement-and-execution group-based approach to increase the overall through-put of a BFT-enabled distributed SDN control plane. We adapt a proven sequencing-based BFT protocol, and introduce two optimized BFT protocols that preserve the uniform agreement, causality and liveness properties. A state-hashing approach which ensures causally ordered switch reconfigurations is proposed, that enables an opportunistic RSM execution without relying on strict sequencing. The proposed designs are implemented and validated for two realistic topologies, a path computation application and a set of KPIs: switch reconfiguration (response) time, signaling overhead, and acceptance rates. We show a clear decrease in the system response time and communication overhead with the proposed models, compared to a state-of-the-art approach.
... The controllers in the Internet2 scenario were placed so to maximize the system coverage against failures as per [6], [10]. The controllers of the fat-tree topology were placed on the leaf-nodes as per [6], [11]. The state synchronization between the controllers and the resulting switch reconfigurations occur in in-band control mode. ...
... The arrivals for incoming service requests are modeled using n.e.d. [11]. ...
Preprint
Full-text available
Distributed Software Defined Networking (SDN) controllers aim to solve the issue of single-point-of-failure and improve the scalability of the control plane. Byzantine and faulty controllers, however, may enforce incorrect configurations and thus endanger the control plane correctness. Multiple Byzantine Fault Tolerance (BFT) approaches relying on Replicated State Machine (RSM) execution have been proposed in the past to cater for this issue. The scalability of such solutions is, however, limited. Additionally, the interplay between progressing the state of the distributed controllers and the consistency of the external reconfigurations of the forwarding devices has not been thoroughly investigated. In this work, we propose an agreement-and-execution group-based approach to increase the overall throughput of a BFT-enabled distributed SDN control plane. We adapt a proven sequencing-based BFT protocol, and introduce two optimized BFT protocols that preserve the uniform agreement, causality and liveness properties. A state-hashing approach which ensures causally ordered switch reconfigurations is proposed, that enables an opportunistic RSM execution without relying on strict sequencing. The proposed designs are implemented and validated for two realistic topologies, a path computation application and a set of KPIs: switch reconfiguration (response) time, signaling overhead, and acceptance rates. We show a clear decrease in the system response time and communication overhead with the proposed models, compared to a state-of-the-art approach.
... The resulting controller placement is depicted in Fig. 3a. SDN controller replicas in the data-center topology are assumed to run on the leaf-nodes, deployed as virtual machines (VMs) (Fig. 3b), similar to the controller placement presented in [33]. ...
... (a) Internet2 topology [31] (b) Fat-tree topology [33] We model the arrival rates of the incoming service embed- ding requests using a negative exponential distribution [34]. To emphasize the effects of the EC on the quality of decision- making in the SDN-LB application, we distribute the total request load non-uniformly across the controller replicas. ...
Article
Full-text available
Scalability of the control plane in a Software Defined Network (SDN) is enabled by means of decentralization of the decision-making logic, i.e. by replication of controller functions to physically or virtually dislocated controller replicas. Replication of a centralized controller state also enables the protection against controller failures by means of primary and backup replicas responsible for managing the underlying SDN data plane devices. In this work, we investigate the effect of the the deployed consistency model on scalability and correctness metrics of the SDN control plane. In particular, we compare the strong and eventual consistency, and make a case for a novel adaptive consistency approach. The existing controller platforms rely on either strong or eventual consistency mechanisms in their state distribution. We show how an adaptive consistency model offers the scalability benefits in terms of the total requesthandling throughput and response time, in contrast to the strong consistency model. We also outline how the adaptive consistency approach can provide for correctness semantics, that are unachievable with the eventual consistency paradigm in practice. The adaptability of our approach provides a balanced and tunable trade-off of scalability and correctness for the SDN application implemented on top of the adaptive framework. To validate our assumptions, we evaluate and compare the different approaches in an emulated testbed with an example of a load balancer controller application. The experimental setup comprises up to five extended OpenDaylight controller instances and two network topologies from the area of service provider and data center networks.
... They also reformulate DCAP as an online optimization to minimize the total cost [27]. Huang et al. [28] propose a novel scheme to jointly consider both static and dynamic switch-controller association and devolution. ProgrammabilityGuardian [29] improves the path programmability of offline flows and maintains low communication overhead by using a middle layer to establish the fine-grained flow-controller mappings. ...
Article
Software-Defined Networking (SDN) provides great opportunities to improve the network performance of Wide Area Networks (WANs). In Software-Defined WANs (SD-WANs), SDN controllers dynamically route flows based on network status by managing underlying switches. However, under controller failures in SD-WANs, existing solutions are unadaptable and thus cannot efficiently map offline switches, which were controlled by failed controllers, to active controllers. Thus, flows, which traverse offline switches, become offline and lose their programmability, which means they cannot be rerouted to accommodate to traffic variation. Consequently, the network programmability degrades. In this paper, we propose Matchmaker, an adaptive solution to recover offline flows under controller failures in SD-WANs. Matchmaker smartly changes the paths of some offline flows to adjust the control cost of offline switches based on given control ability of active controllers. As a result, Matchmaker can efficiently map offline switches to active controllers and increase the number of recovered flows. The simulation results show that Matchmaker outperforms existing solutions by increasing the number of recovered offline flows up to 45% under ATT topology and up to 77% under Belnet topology.
Article
A distributed control plane is more scalable and robust in software defined networking. This paper focuses on controller load balancing using packet-in request redirection, that is, given the instantaneous state of the system, determining whether to redirect packet-in requests for each switch, such that the overall control plane response time (CPRT) is minimized. To address the above problem, we propose a framework based on Lyapunov optimization. First, we use the drift-plus-penalty algorithm to combine CPRT minimization problem with controller capacity constraints, and further derive a non-linear program, whose optimal solution is obtained with brute force using standard linearization techniques. Second, we present a greedy strategy to efficiently obtain a solution with a bounded approximation ratio. Third, we reformulate the program as a problem of maximizing a non-monotone submodular function subject to matroid constraints. We implement a controller prototype for packet-in request redirection, and conduct trace-driven simulations to validate our theoretical results. The results show that our algorithms can reduce the average CPRT by 81.6% compared to static assignment, and achieve a 3× improvement in maximum controller capacity violation ratio.
Article
Full-text available
The biggest challenge for network service providers is the rapid advancement of technology, which makes traditional networks difficult to manage. This advancement has driven vendors to develop, deploy, and migrate services, install new hardware, train personnel, and upgrade infrastructure, all of which involve considerable cost and time. These frequent changes demand a new network architecture that supports future technologies and resolves these issues: software-defined networking. Large amounts of data are generated as we interact with the world through smart devices such as tablets, sensors, and smartphones under the Internet of Things (IoT). Alongside this continuous growth comes an ever-increasing, heterogeneous demand for services, which raises the challenge of network load balancing to meet the demanding requirements (e.g., high performance, low latency, high throughput, and high availability) of IoT and 5G network applications. Various load balancing techniques have been proposed to meet these demands; some require a dedicated load balancer for every service, and others require manual device registration for each new service. In conventional networks, load balancing is performed on the basis of local information. SDN controllers, by contrast, maintain a global view of the network and can support more optimized load balancers. The well-known techniques are therefore time-consuming, expensive, and impractical, and many existing load balancing schemes do not consider service types.
In this paper, the researchers focus on an SDN-based load balancing (SBLB) service that minimizes response time and maximizes resource utilization for users on cloud servers. The proposed scheme consists of an application module that runs alongside an SDN controller, and server pools that connect to the controller through SDN-enabled switches. The application module contains a dynamic load balancing module, a monitoring module, and a service classification module. The controller handles all messages in real time and maintains the host pools. Experimental results validate the performance of the proposed scheme, showing that SBLB significantly decreases average response and reply times.
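The core of such a controller-side load balancer is a server selection rule fed by the monitoring module. A minimal sketch (the field names and score weights are illustrative, not the paper's): pick the server whose observed load score, blending response time and active connections, is lowest.

```python
# Illustrative controller-side server selection for an SDN-based load
# balancer. Field names ("resp_ms", "active_conns") and the 0.5/0.5
# weighting are assumptions for the sketch.
def pick_server(server_stats):
    """Return the server with the lowest observed load score.

    server_stats: dict mapping server name -> {"resp_ms": float,
                                               "active_conns": int}
    """
    def score(stats):
        # Blend recent response time with the number of open connections.
        return 0.5 * stats["resp_ms"] + 0.5 * stats["active_conns"]
    return min(server_stats, key=lambda s: score(server_stats[s]))
```

The controller would then install a flow rule steering the new request toward the chosen server, refreshing the scores as monitoring data arrives.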
Conference Paper
In software-defined networking (SDN) systems, the scalability and reliability of the control plane remain major concerns. Existing solutions adopt either multi-controller designs or control devolution back to the data plane. The former requires a flexible yet efficient switch-controller association mechanism to adapt to workload changes and potential failures, while the latter demands timely decision making with low overhead. An integrated design of both is even more challenging. Meanwhile, the dramatic advancement of machine learning techniques has boosted the practice of predictive scheduling to improve responsiveness in various systems. Nonetheless, little such work has been conducted for SDN systems so far. In this paper, we study the joint problem of dynamic switch-controller association and control devolution, and investigate the benefits of predictive scheduling in SDN systems. We propose POSCAD, an efficient, online, and distributed scheme that exploits predicted future information to minimize the total system cost and the average request response time with a queueing stability guarantee. Theoretical analysis and trace-driven simulation results show that POSCAD requires only a mild amount of future information to achieve near-optimal system cost and near-zero average request response time. Further, POSCAD is robust against mis-prediction in reducing the average request response time.
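The flavor of such a queue-driven devolution decision can be sketched as follows (a hypothetical simplification, not POSCAD itself): a switch compares a drift-plus-penalty style score for handling a request locally against the score for sending it to a controller, with a window of predicted arrivals folded into the local backlog.

```python
# Hypothetical drift-plus-penalty style devolution decision. The score
# structure (backlog + V * cost) mirrors Lyapunov-optimization schemes,
# but all parameter names and values here are illustrative.
def decide(local_backlog, ctrl_backlog, predicted_arrivals,
           local_cost, ctrl_cost, V=10.0):
    """Return "devolve" to process the request at the switch, or
    "to_controller" to forward it, whichever scores lower.

    V trades off cost minimization against queue backlog: a larger V
    weights cost more heavily, at the price of longer queues.
    """
    # Predicted future arrivals inflate the effective local backlog.
    local_score = (local_backlog + predicted_arrivals) + V * local_cost
    ctrl_score = ctrl_backlog + V * ctrl_cost
    return "devolve" if local_score <= ctrl_score else "to_controller"
```

When the controller queue is long, requests stay at the switch; when the local queue (plus predicted work) dominates, they are forwarded instead.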
Article
Full-text available
The advent of software defined networking enables flexible, reliable and feature-rich control planes for data center networks. However, the tight coupling of centralized control and complete visibility leads to a wide range of issues among which scalability has risen to prominence. To address this, we present LazyCtrl, a novel hybrid control plane design for data center networks where network control is carried out by distributed control mechanisms inside independent groups of switches while complemented with a global controller. Our design is motivated by the observation that data center traffic is usually highly skewed and thus edge switches can be grouped according to traffic locality. LazyCtrl aims at bringing laziness to the global controller by dynamically devolving most of the control tasks to independent switch groups to process frequent intra-group events near datapaths while handling rare inter-group or other specified events by the controller. We implement LazyCtrl and build a prototype based on Open vSwitch and Floodlight. Trace-driven experiments on our prototype show that an effective switch grouping is easy to maintain in multi-tenant clouds and the central controller can be significantly shielded by staying lazy, with its workload reduced by up to 82%.
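Grouping edge switches by traffic locality can be sketched with a simple union-find pass over the pairwise traffic matrix (an illustrative toy, not LazyCtrl's actual grouping algorithm; the threshold is a made-up parameter): switch pairs exchanging heavy traffic are merged into one group, so their frequent events stay intra-group.

```python
# Toy traffic-locality grouping via union-find. The traffic matrix and
# threshold are illustrative; LazyCtrl's real grouping maintains groups
# incrementally under constraints such as group size.
def group_switches(pairs_traffic, switches, threshold):
    """Merge switches whose pairwise traffic meets the threshold.

    pairs_traffic: dict mapping (switch, switch) -> traffic volume
    switches:      iterable of all switch names
    threshold:     minimum traffic to place a pair in one group
    """
    parent = {s: s for s in switches}

    def find(x):
        # Find the group root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), traffic in pairs_traffic.items():
        if traffic >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in switches:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())
```

Heavily communicating pairs end up co-grouped, so only the rare inter-group traffic ever reaches the global controller.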
Conference Paper
Full-text available
OpenFlow is a great concept, but its original design imposes excessive overheads. It can simplify network and traffic management in enterprise and data center environments, because it enables flow-level control over Ethernet switching and provides global visibility of the flows in the network. However, such fine-grained control and visibility comes with costs: the switch-implementation costs of involving the switch's control-plane too often and the distributed-system costs of involving the OpenFlow controller too frequently, both on flow setups and especially for statistics-gathering. In this paper, we analyze these overheads, and show that OpenFlow's current design cannot meet the needs of high-performance networks. We design and evaluate DevoFlow, a modification of the OpenFlow model which gently breaks the coupling between control and global visibility, in a way that maintains a useful amount of visibility without imposing unnecessary costs. We evaluate DevoFlow through simulations, and find that it can load-balance data center traffic as well as fine-grained solutions, without as much overhead: DevoFlow uses 10--53 times fewer flow table entries at an average switch, and uses 10--42 times fewer control messages.
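The essence of devolving control in this style is a dispatch rule at the switch: mice flows are handled by local wildcard rules, and only flows crossing a size threshold ("elephants") are reported to the controller. A minimal sketch (the 128 KB threshold is a hypothetical value, not DevoFlow's specification):

```python
# Illustrative elephant-flow dispatch in the DevoFlow spirit: keep mice
# flows in the switch, involve the controller only for large flows.
# The byte threshold is an assumed example value.
ELEPHANT_BYTES = 128 * 1024  # hypothetical detection threshold (128 KB)

def dispatch(flow_bytes):
    """Decide where a flow is handled based on bytes transferred so far."""
    return "controller" if flow_bytes >= ELEPHANT_BYTES else "switch_local"
```

Since most data center flows are small, the vast majority of flows never touch the controller, which is what yields the 10-53x reduction in flow table entries and control messages reported above.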
Conference Paper
Cloud computing realises the vision of utility computing. Tenants can benefit from on-demand provisioning of computational resources according to a pay-per-use model and can outsource hardware purchases and maintenance. Tenants, however, have only limited ...
Conference Paper
Distributed controllers have been proposed for Software Defined Networking to address the issues of scalability and reliability that a centralized controller suffers from. One key limitation of distributed controllers is that the mapping between a switch and a controller is statically configured, which may result in uneven load distribution among the controllers. To address this problem, we propose ElastiCon, an elastic distributed controller architecture in which the controller pool is dynamically grown or shrunk according to traffic conditions and the load is dynamically shifted across controllers. We propose a novel switch migration protocol for enabling such load shifting, which conforms to the OpenFlow standard. We also build a prototype to demonstrate the efficacy of our design.
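The load-shifting decision behind such an architecture can be sketched as a simple rebalancing rule (hypothetical thresholds and selection policy, not ElastiCon's migration protocol): when the load gap between the busiest and idlest controller exceeds a fraction of the busiest one's load, migrate one switch from the former to the latter.

```python
# Toy ElastiCon-style rebalancing decision. The gap threshold and the
# "migrate the lightest switch" policy are illustrative assumptions;
# the actual protocol handles migration safety under OpenFlow semantics.
def rebalance(mapping, loads, gap_threshold=0.2):
    """Possibly migrate one switch from the hottest to the coldest controller.

    mapping: dict switch -> controller
    loads:   dict switch -> request rate
    Returns a (possibly updated) mapping.
    """
    ctrl_load = {}
    for sw, c in mapping.items():
        ctrl_load[c] = ctrl_load.get(c, 0) + loads[sw]
    hot = max(ctrl_load, key=ctrl_load.get)
    cold = min(ctrl_load, key=ctrl_load.get)
    if ctrl_load[hot] - ctrl_load[cold] <= gap_threshold * ctrl_load[hot]:
        return mapping  # balanced enough; no migration needed
    # Move the lightest switch off the hot controller.
    candidates = [s for s, c in mapping.items() if c == hot]
    victim = min(candidates, key=lambda s: loads[s])
    new_mapping = dict(mapping)
    new_mapping[victim] = cold
    return new_mapping
```

Repeated at each monitoring interval, this shifts load away from hotspots while leaving an already balanced mapping untouched.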
Conference Paper
OpenFlow assumes a logically centralized controller, which ideally can be physically distributed. However, current deployments rely on a single controller which has major drawbacks including lack of scalability. We present HyperFlow, a distributed event-based control plane for OpenFlow. HyperFlow is logically centralized but physically distributed: it provides scalability while keeping the benefits of network control centralization. By passively synchronizing network-wide views of OpenFlow controllers, HyperFlow localizes decision making to individual controllers, thus minimizing the control plane response time to data plane requests. HyperFlow is resilient to network partitioning and component failures. It also enables interconnecting independently managed OpenFlow networks, an essential feature missing in current OpenFlow deployments. We have implemented HyperFlow as an application for NOX. Our implementation requires minimal changes to NOX, and allows reuse of existing NOX applications with minor modifications. Our preliminary evaluation shows that, assuming sufficient control bandwidth, to bound the window of inconsistency among controllers by a factor of the delay between the farthest controllers, the network changes must occur at a rate lower than 1000 events per second across the network.
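The passive synchronization described above can be caricatured in a few lines (a toy model with invented data structures, not HyperFlow's actual event system): each controller publishes network-changing events to a shared channel and replays events from its peers to keep its local network-wide view consistent.

```python
# Toy model of HyperFlow-style passive state synchronization. The shared
# list standing in for the event channel, and the key/value view, are
# illustrative simplifications.
class Controller:
    def __init__(self, name, channel):
        self.name = name
        self.channel = channel  # shared, append-only event log
        self.view = {}          # local network-wide view

    def publish(self, key, value):
        # Apply the change locally and announce it to peers.
        self.view[key] = value
        self.channel.append((self.name, key, value))

    def sync(self):
        # Replay peer events to converge the local view.
        for origin, key, value in self.channel:
            if origin != self.name:
                self.view[key] = value
```

Because each controller holds the full view after syncing, it can serve its local switches without consulting peers, which is the source of the low control plane response time.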
Conference Paper
Although there is tremendous interest in designing improved networks for data centers, very little is known about the network-level traffic characteristics of data centers today. In this paper, we conduct an empirical study of the network traffic in 10 data centers belonging to three different categories, including university, enterprise campus, and cloud data centers. Our definition of cloud data centers includes not only data centers employed by large online service providers offering Internet-facing applications but also data centers used to host data-intensive (MapReduce style) applications). We collect and analyze SNMP statistics, topology and packet-level traces. We examine the range of applications deployed in these data centers and their placement, the flow-level and packet-level transmission properties of these applications, and their impact on network and link utilizations, congestion and packet drops. We describe the implications of the observed traffic patterns for data center internal traffic engineering as well as for recently proposed architectures for data center networks.