
arXiv:1702.03065v4 [cs.NI] 3 Mar 2017

Dynamic Switch-Controller Association and Control Devolution for SDN Systems

Xi Huang1, Simeng Bian1, Ziyu Shao1, Hong Xu2

1School of Information Science and Technology, ShanghaiTech University

2NetX Lab @ City University of Hong Kong

Email: {huangxi,biansm,shaozy}@shanghaitech.edu.cn, henry.xu@cityu.edu.hk

Abstract—In software-defined networking (SDN), as the data plane scales up, the scalability and reliability of the control plane have become major concerns. Two kinds of solutions have been proposed separately to mitigate these concerns. One is the multi-controller architecture, i.e., a logically centralized control plane with physically distributed controllers. The other is control devolution, i.e., delegating control of some flows back to switches. Most existing solutions adopt either static switch-controller association or static devolution, which may not adapt well to traffic variation, leading to high communication costs between switches and controllers and high computation costs on switches. In this paper, we propose a novel scheme that jointly considers both solutions: we dynamically associate switches with controllers and dynamically devolve control of flows to switches. Our scheme is an efficient online algorithm that requires no statistics of traffic flows. By tuning a parameter V, we can trade off costs against queue backlogs. Theoretical analysis and extensive simulations show that, compared to static schemes, our scheme yields much lower costs and latency as well as balanced loads among controllers.

I. INTRODUCTION

Software-defined networking (SDN) holds great promise to improve network performance and management. The key idea of SDN is to decouple the control plane from the data plane [1]. The data plane focuses on performing basic functionalities such as packet forwarding at high speed, while the logically centralized control plane manages the network. Typically, switches at the data plane send requests to the control plane for processing flow events, e.g., flow-install events.

The control plane is a potential bottleneck of SDN in terms of scalability and reliability. As the data plane expands, a control plane implemented with a single controller may not be able to process the increasing number of requests, resulting in unacceptable flow setup latency. Reliability is also an issue, since a single controller is a single point of failure whose breakdown would be disastrous for the control plane and the network.

Existing proposals to address these problems fall broadly into two categories. One is to implement the control plane as a distributed system with multiple controllers [2] [3]; each switch then associates with certain controllers for fault tolerance and load balancing [4] [5] [6] [7]. The other is to devolve part of the request-processing load from controllers to switches [8] [9] [10], reducing the workload of controllers.

For switch-controller association, the first category of solutions, the initial design choice was a static switch-controller association [2] [3]. However, static association may overload controllers and increase flow setup latency because it cannot flexibly handle traffic variations. An elastic distributed controller architecture is proposed in [5], with an efficient protocol to migrate switches across controllers; however, how to determine the switch-controller association remains open. Then [6] took a step further by formulating the switch-controller association problem as a deterministic optimization problem, i.e., an integer linear program with prohibitively high computational complexity, and proposed a local search algorithm to find suboptimal associations within a given time limit (e.g., 30 seconds). In [7], each controller is modeled as an M/M/1 queue (Poisson arrivals and exponential service). Under this assumption, the switch-controller association problem with a steady-state objective function is formulated as a many-to-one stable matching problem with transfers, and a novel two-phase algorithm connects stable matching with utility-based game-theoretic solutions, i.e., a coalition formation game with Nash-stable solutions.

For control devolution, the second category of solutions, an obvious design choice is static devolution for certain functions and flows [8] [9] [10]; i.e., switches locally process requests that do not require network-wide state, such as the link-layer discovery service. Static devolution mitigates the load on the control plane to some extent, but it can be inflexible in the face of traffic variations. As an alternative, dynamic devolution allows switches to decide whether to process requests locally or upload them to the control plane, depending on their current loads. However, the design of dynamic devolution under traffic variations remains open.

These observations raise several interesting questions, whose answers will shape the design of SDN networks:

• Instead of deterministic switch-controller association with infrequent re-association [6] [7], can we directly perform dynamic switch-controller association in response to traffic variation? What benefit can we obtain from fine-grained control at the request level?
• How should dynamic devolution be performed?
• How should we trade off dynamic switch-controller association against dynamic control devolution?

In this paper, we consider a general SDN network with traffic variations, which result in varying numbers of requests for handling flow events. We assume each request can either be processed at a switch (incurring computation costs) or be uploaded to certain controllers (incurring communication costs)¹. We aim at reducing the computational cost through control devolution at the data plane, the communication cost through switch-controller association between the data plane and the control plane, and the response time experienced by switches, which is mainly caused by queueing delay at controllers. Under these settings, we provide a new perspective and a novel scheme to answer the above questions.

¹The scenario in which some requests can only be processed by controllers is a straightforward extension of our model.

Our key results and contributions are summarized as follows:

• Problem Formulation: To the best of our knowledge, this paper is the first to study the joint optimization of dynamic switch-controller association and dynamic control devolution.
• Finer-Granularity Control: To the best of our knowledge, this paper is the first to make control decisions at request-level granularity. Request-level information, such as time-varying queue backlog sizes and numbers of request arrivals, reflects the actual time-varying state of the data plane; compared to coarse-grained control, it therefore enables more accurate decisions for dynamic association and dynamic devolution.
• Online Algorithm Design: We formulate a stochastic network optimization problem that minimizes the long-term average sum of communication and computational costs while keeping the time-average queue backlogs of both switches and controllers small². By employing the Lyapunov drift technique [11] and exploiting the sub-problem structure, we develop an efficient greedy algorithm that achieves optimality asymptotically. Our algorithm is online: it needs neither the statistics of traffic workloads nor any prior assumption on the traffic distribution.
• Algorithm Analysis: We show that our algorithm yields a tunable trade-off between an O(1/V) deviation from the minimum long-term average sum of communication and computational costs and an O(V) bound on average queue backlogs. We also find that the positive parameter V determines the switches' willingness to upload requests to controllers, i.e., to perform switch-controller association.
• Simulation: We conduct large-scale trace-driven simulations to evaluate our algorithm under two widely adopted data center network topologies, the Canonical 3-Tiered topology and the Fat-tree topology. Simulation results verify the effectiveness and the trade-off of our algorithm. In addition, in the extreme case without control devolution, we compare our dynamic association scheme against other association schemes, including Static, Random, and JSQ (Join-the-Shortest-Queue); the results verify the advantages of our scheme.

The rest of the paper is organized as follows. We present the basic idea and formulation in Section II, the algorithm design and corresponding performance analysis in Section III, and simulation results in Section IV. We conclude in Section V.

II. PROBLEM FORMULATION

In this section, we first provide a motivating example for dynamic switch-controller association and dynamic control devolution. Then we introduce the system model and formulate the problem.

²By Little's law, a small queue backlog implies a small queueing delay, i.e., a short response time.

Fig. 1. An example of the request-level scheduling process. There are 3 switches (s1, s2, s3), 2 controllers (c1, c2), and 1 global scheduler. Each switch or controller maintains a queue that buffers requests. In each time slot, each controller can serve 2 requests, while each switch can serve only 1 request. Uploading a request to a controller incurs a per-request communication cost, and local processing by a switch incurs a computational cost (2 per request on each switch). At the beginning of time slot t, s1, s2, and s3 generate 3, 2, and 2 requests, respectively. The scheduler collects system dynamics and decides a switch-controller association (either (b) or (c)), aiming to minimize the sum of the communication cost (which could be the number of hops, RTTs, etc.) and the computational cost, as well as the request queueing delay. Each switch then either processes its requests locally or sends them to controllers according to the scheduling decision. (Panels (a)-(e) depict the data plane, the scheduler, and the control plane under the candidate associations.)

A. Motivating Example

Fig. 1 shows an example of dynamic association and devolution.

First, we focus on the behavior of s3. In Fig. 1(b), s3 chooses to process its requests locally, which incurs a computational cost of 2 per request. In Fig. 1(c), s3 decides to upload its requests to c2, which incurs a communication cost of 3 per request. Although the computational cost is lower than the communication cost, the decision to process locally leaves one request unprocessed at the end of the time slot. Hence, it is not necessarily smart for a switch to perform control devolution just because its computational cost is lower than its communication cost. Instead, the scheduler should decide control devolution and switch-controller association jointly.

Second, we focus on the behavior of associations. Fig. 1(b) and (c) show two different associations. Fig. 1(b) shows the association with pairs (s1, c1) and (s2, c1), with s3 processing its requests locally; we denote it by X1. X1 incurs a total communication and computation cost of 9, but results in uneven queue backlogs, leaving four requests unfinished after time slot t. Fig. 1(c) shows another association with (s1, c1) and (s3, c2), with s2 processing its requests locally, denoted by X2. X2 incurs a total cost of 13 but does better at balancing queue backlogs. Thus there is a non-trivial trade-off between minimizing the total communication and computation cost and maintaining small queue backlogs at each controller.
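The cost arithmetic behind X1 and X2 can be reproduced with a short sketch. The per-request communication costs used here are partly our assumptions: the text only fixes the cost of 3 per request from s3 to c2, so we take unit cost for the (s1, c1) and (s2, c1) pairs, which reproduces the stated totals of 9 and 13.

```python
# Sketch of the cost arithmetic in the motivating example (Fig. 1).
# W holds assumed per-request communication costs; only W(s3, c2) = 3
# is given in the text, unit cost elsewhere is our assumption.
arrivals = {"s1": 3, "s2": 2, "s3": 2}   # requests generated at slot t
alpha = 2                                 # computational cost per local request

def total_cost(assoc, W):
    """assoc maps switch -> controller name, or None for local processing."""
    cost = 0
    for s, a in arrivals.items():
        c = assoc.get(s)
        cost += a * (W[s][c] if c else alpha)
    return cost

W = {"s1": {"c1": 1}, "s2": {"c1": 1}, "s3": {"c2": 3}}
X1 = {"s1": "c1", "s2": "c1", "s3": None}   # Fig. 1(b): s3 local
X2 = {"s1": "c1", "s2": None, "s3": "c2"}   # Fig. 1(c): s2 local
print(total_cost(X1, W), total_cost(X2, W))  # 9 and 13, as in the text
```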

B. Problem Formulation

We consider a time-slotted network system with slots indexed by $\{0, 1, 2, \ldots\}$. Its control plane comprises a set $\mathcal{C}$ of physically distributed controllers, while its data plane consists of a set $\mathcal{S}$ of switches. Each switch $i \in \mathcal{S}$ keeps a queue backlog of size $Q^s_i(t)$ for locally processed requests, while each controller $j \in \mathcal{C}$ maintains a queue backlog $Q^c_j(t)$ that buffers requests from the data plane. We denote $[Q^c_1(t), \ldots, Q^c_{|\mathcal{C}|}(t)]$ by $\mathbf{Q}^c(t)$ and $[Q^s_1(t), \ldots, Q^s_{|\mathcal{S}|}(t)]$ by $\mathbf{Q}^s(t)$, and write $\mathbf{Q}(t) = [\mathbf{Q}^s(t), \mathbf{Q}^c(t)]$.

At the beginning of time slot $t$, each switch $i \in \mathcal{S}$ generates some amount $0 \le A_i(t) \le a_{\max}$ of requests. Each switch can then choose to process its requests either locally or by sending them to its associated controller. We assume that each switch $i \in \mathcal{S}$ has a service rate $0 \le U_i(t) \le u_{\max}$ for processing devolved requests, while each controller $j \in \mathcal{C}$ has an available service rate $0 \le B_j(t) \le b_{\max}$. We denote $[A_1(t), \ldots, A_{|\mathcal{S}|}(t)]$ by $\mathbf{A}(t)$, $[B_1(t), \ldots, B_{|\mathcal{C}|}(t)]$ by $\mathbf{B}(t)$, and $[U_1(t), \ldots, U_{|\mathcal{S}|}(t)]$ by $\mathbf{U}(t)$. For $i \in \mathcal{S}$ and $j \in \mathcal{C}$, we assume that all $A_i(t)$, $B_j(t)$, and $U_i(t)$ are i.i.d. over time, with finite first and second raw moments.

The scheduler then collects the system dynamics information $(\mathbf{A}(t), \mathbf{B}(t), \mathbf{Q}(t))$ during the current time slot and makes a scheduling decision, denoted by an association matrix $\mathbf{X}(t) \in \{0,1\}^{|\mathcal{S}| \times |\mathcal{C}|}$, where $X_{i,j}(t) = 1$ if switch $i$ is associated with controller $j$ during the current time slot and $0$ otherwise. An association is feasible if each switch is associated with at most one controller during each time slot. We denote the set of feasible associations by $\mathcal{A}$:

$$\mathcal{A} = \Big\{ \mathbf{X} \in \{0,1\}^{|\mathcal{S}| \times |\mathcal{C}|} \;\Big|\; \sum_{j \in \mathcal{C}} X_{i,j} \le 1 \text{ for all } i \in \mathcal{S} \Big\} \quad (1)$$

According to the scheduling decision, each switch $i$ sends its requests to controller $j$ if $X_{i,j} = 1$; if $\sum_{j \in \mathcal{C}} X_{i,j} = 0$, switch $i$ appends its requests to its local queue backlog. Then both switches and controllers serve as many requests in their queues as they can³. As a result, the update equation for $Q^s_i(t)$ at switch $i$ is

$$Q^s_i(t+1) = \Big[ Q^s_i(t) + \Big(1 - \sum_{j \in \mathcal{C}} X_{i,j}(t)\Big) \cdot A_i(t) - U_i(t) \Big]^+ \quad (2)$$

and the update equation for $Q^c_j(t)$ at controller $j$ is given by

$$Q^c_j(t+1) = \Big[ Q^c_j(t) + \sum_{i \in \mathcal{S}} X_{i,j}(t) \cdot A_i(t) - B_j(t) \Big]^+ \quad (3)$$

where $[x]^+ = \max(x, 0)$.
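The queue updates (2) and (3) can be sketched in a few lines. The encoding is our own: X[i] is the controller index switch i uploads to, or None for local processing; the usage line replays the numbers of Fig. 1(b).

```python
def step(Qs, Qc, X, A, U, B):
    """One time slot of the queue updates (2) and (3).
    X[i] is the controller index switch i uploads to, or None for local."""
    for i in range(len(Qs)):
        local = A[i] if X[i] is None else 0          # (1 - sum_j X_ij(t)) * A_i(t)
        Qs[i] = max(Qs[i] + local - U[i], 0)         # eq. (2), with [.]^+ truncation
    loads = [0] * len(Qc)
    for i, j in enumerate(X):
        if j is not None:
            loads[j] += A[i]                         # sum_i X_ij(t) * A_i(t)
    for j in range(len(Qc)):
        Qc[j] = max(Qc[j] + loads[j] - B[j], 0)      # eq. (3), with [.]^+ truncation
    return Qs, Qc

# Fig. 1(b): s1, s2 -> c1 and s3 local; switches serve 1, controllers serve 2.
Qs, Qc = step([0, 0, 0], [0, 0], [0, 0, None], [3, 2, 2], [1, 1, 1], [2, 2])
print(Qs, Qc)   # [0, 0, 1] [3, 0]: four requests left unfinished, as in Sec. II-A
```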

Having covered the necessary notation and queueing dynamics, we turn to the objective and constraints of our problem.

1) Time-Average Communication Cost: We define the communication cost between switch $i$ and controller $j$ as $W_{i,j}$⁴, which gives a communication cost matrix $\mathbf{W} = \{W_{i,j}\}$.

³Note that we do not fix the serving discipline, which can be FIFO, LIFO, etc.; in our simulations we use FIFO.
⁴The communication cost can be the number of hops or the round-trip time (RTT).

Fixing an association $\mathbf{X} \in \mathcal{A}$, the communication cost within one time slot is

$$f_{\mathbf{X}}(t) = \hat{f}(\mathbf{X}, \mathbf{A}(t)) \triangleq \sum_{j \in \mathcal{C}} \sum_{i \in \mathcal{S}} W_{i,j} \cdot X_{i,j} \cdot A_i(t) \quad (4)$$

where $W_{i,j}$ can be viewed as the price of transmitting one request from switch $i$ to controller $j$. Then, given a series of associations $\{\mathbf{X}_0, \mathbf{X}_1, \ldots, \mathbf{X}_{t-1}\}$, the time-average communication cost is

$$\bar{f}(t) \triangleq \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{ f_{\mathbf{X}_\tau}(\tau) \} \quad (5)$$

2) Time-Average Computational Cost: When switch $i$ appends its requests to its local queue backlog for processing, each devolved request incurs a computational cost $\alpha_i$. Given an association $\mathbf{X} \in \mathcal{A}$, we define the one-slot computational cost as

$$g_{\mathbf{X}}(t) = \hat{g}(\mathbf{X}, \mathbf{A}(t)) \triangleq \sum_{i \in \mathcal{S}} \alpha_i \cdot \Big(1 - \sum_{j \in \mathcal{C}} X_{i,j}\Big) \cdot A_i(t) \quad (6)$$

Given a series of associations $\{\mathbf{X}_0, \mathbf{X}_1, \ldots, \mathbf{X}_{t-1}\}$, the time-average computational cost is

$$\bar{g}(t) \triangleq \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{ g_{\mathbf{X}_\tau}(\tau) \} \quad (7)$$
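The per-slot costs (4) and (6) and their empirical time averages (5), (7) can be sketched as follows; the list-of-lists encoding of X and W is our own convention.

```python
def slot_costs(X, A, W, alpha):
    """Per-slot communication cost f_X(t) (eq. 4) and computational cost
    g_X(t) (eq. 6) for a 0/1 association matrix X of shape |S| x |C|."""
    S, C = len(A), len(W[0])
    f = sum(W[i][j] * X[i][j] * A[i] for i in range(S) for j in range(C))
    g = sum(alpha[i] * (1 - sum(X[i])) * A[i] for i in range(S))
    return f, g

def time_average(per_slot_costs):
    """Empirical counterpart of the time averages (5) and (7)."""
    return sum(per_slot_costs) / len(per_slot_costs)

# Two switches, two controllers: switch 0 uploads to controller 0,
# switch 1 processes its 2 requests locally at cost alpha = 2 each.
f, g = slot_costs([[1, 0], [0, 0]], [3, 2], [[1, 9], [9, 3]], [2, 2])
print(f, g)   # 3 4
```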

3) Queueing Stability: A queueing process $\{Q(t)\}$ is stable if the following condition holds:

$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{Q(\tau)\} < \infty \quad (8)$$

Accordingly, $\mathbf{Q}^c(t)$ is stable if

$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j \in \mathcal{C}} \mathbb{E}\{Q^c_j(\tau)\} < \infty \quad (9)$$

and $\mathbf{Q}^s(t)$ is stable if

$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{i \in \mathcal{S}} \mathbb{E}\{Q^s_i(\tau)\} < \infty \quad (10)$$

Queueing stability implies that both switches and controllers process buffered requests in a timely manner, so that the queueing delay is kept within a small range.

Consequently, our problem formulation is given as follows:

$$\underset{\mathbf{X}(t) \in \mathcal{A} \text{ for } t \in \{0,1,2,\ldots\}}{\text{Minimize}} \quad \limsup_{t \to \infty} \; \bar{f}(t) + \bar{g}(t) \qquad \text{subject to } (2), (3), (9), (10). \quad (11)$$

III. ALGORITHM DESIGN AND PERFORMANCE ANALYSIS

To design a scheduling algorithm that solves problem (11), we adopt the Lyapunov optimization technique in [11]. Define the quadratic Lyapunov function as

$$L(\mathbf{Q}(t)) = \frac{1}{2} \Big[ \sum_{j \in \mathcal{C}} \big(Q^c_j(t)\big)^2 + \sum_{i \in \mathcal{S}} \big(Q^s_i(t)\big)^2 \Big] \quad (12)$$

Next, we define the conditional Lyapunov drift between two consecutive time slots as

$$\Delta(\mathbf{Q}(t)) \triangleq \mathbb{E}\{ L(\mathbf{Q}(t+1)) - L(\mathbf{Q}(t)) \mid \mathbf{Q}(t) \} \quad (13)$$

This conditional difference measures the overall change in the queues' congestion state. We want to push this difference as low as possible, so as to prevent the queues $\mathbf{Q}^s(t)$ and $\mathbf{Q}^c(t)$ from being overloaded. However, to maintain low queue backlogs, the actions we take might incur considerable communication cost $f_{\mathbf{X}}(t)$ or computational cost $g_{\mathbf{X}}(t)$, or both. Hence, we should jointly consider queueing stability and $f_{\mathbf{X}}(t) + g_{\mathbf{X}}(t)$. Given any feasible association $\mathbf{X}$, we define the one-slot conditional drift-plus-penalty function as

$$\Delta_V(\mathbf{Q}(t)) \triangleq \Delta(\mathbf{Q}(t)) + V \cdot \mathbb{E}\{ f_{\mathbf{X}}(t) + g_{\mathbf{X}}(t) \mid \mathbf{Q}(t) \} \quad (14)$$

where $f_{\mathbf{X}}(t)$ is defined in (4), $g_{\mathbf{X}}(t)$ is defined in (6), and $V > 0$ is a constant that weights the penalty from $f_{\mathbf{X}}(t)$ and $g_{\mathbf{X}}(t)$.

By minimizing an upper bound of the drift-plus-penalty expression (14), the time-average cost can be minimized while stabilizing the network of request queues. We employ the concept of opportunistically minimizing an expectation from [11] and transform the long-term stochastic optimization problem (11) into the following drift-plus-penalty minimization problem at every time slot $t$:

$$\underset{\mathbf{X} \in \mathcal{A}}{\text{Minimize}} \;\; V \cdot \big[ \hat{f}(\mathbf{X}, \mathbf{A}(t)) + \hat{g}(\mathbf{X}, \mathbf{A}(t)) \big] + \sum_{j \in \mathcal{C}} Q^c_j(t) \cdot \Big[ \sum_{i \in \mathcal{S}} X_{i,j} \cdot A_i(t) \Big] + \sum_{i \in \mathcal{S}} Q^s_i(t) \cdot \Big(1 - \sum_{j \in \mathcal{C}} X_{i,j}\Big) \cdot A_i(t) \quad (15)$$

After rearranging the terms in (15), our optimization problem becomes

$$\underset{\mathbf{X} \in \mathcal{A}}{\text{Minimize}} \;\; \sum_{i \in \mathcal{S}} \big[ V\alpha_i + Q^s_i(t) \big] A_i(t) + \sum_{i \in \mathcal{S}} \sum_{j \in \mathcal{C}} \big[ V W_{i,j} + Q^c_j(t) - V\alpha_i - Q^s_i(t) \big] X_{i,j} A_i(t) \quad (16)$$

Since the first term $\sum_{i \in \mathcal{S}} [V\alpha_i + Q^s_i(t)] A_i(t)$ in (16) does not depend on $\mathbf{X}$, we treat it as a constant and focus on minimizing the second term of (16) only. For each $i \in \mathcal{S}$, we split $\mathcal{C}$ into two disjoint sets $\mathcal{J}^i_1$ and $\mathcal{J}^i_2$, i.e., $\mathcal{J}^i_1 \,\dot\cup\, \mathcal{J}^i_2 = \mathcal{C}$:

$$\mathcal{J}^i_1 = \{ j \in \mathcal{C} \mid V W_{i,j} + Q^c_j(t) > V\alpha_i + Q^s_i(t) \} \quad (17)$$

$$\mathcal{J}^i_2 = \{ j \in \mathcal{C} \mid V W_{i,j} + Q^c_j(t) \le V\alpha_i + Q^s_i(t) \} \quad (18)$$

Note that $\mathbf{X} \in \{0,1\}^{|\mathcal{S}| \times |\mathcal{C}|}$ and, for each switch $i$, $\sum_{j \in \mathcal{C}} X_{i,j} \le 1$. Therefore, for each switch $i \in \mathcal{S}$,

$$\sum_{j \in \mathcal{C}} \big[ V W_{i,j} + Q^c_j(t) - V\alpha_i - Q^s_i(t) \big] X_{i,j} A_i(t) = \Big[ \sum_{j \in \mathcal{J}^i_1} \big( V W_{i,j} + Q^c_j(t) - V\alpha_i - Q^s_i(t) \big) X_{i,j} + \sum_{j \in \mathcal{J}^i_2} \big( V W_{i,j} + Q^c_j(t) - V\alpha_i - Q^s_i(t) \big) X_{i,j} \Big] A_i(t) \quad (19)$$

Next we show how to minimize (19). For each $(i, j) \in \mathcal{S} \times \mathcal{C}$, we define

$$\omega(i, j) = V W_{i,j} + Q^c_j(t) - V\alpha_i - Q^s_i(t) \quad (20)$$

Then we consider two different cases.

• If $\mathcal{J}^i_2 = \emptyset$, i.e., $\omega(i, j) > 0$ for all $j \in \mathcal{C}$, then the only way to minimize (19) is to set $X^*_{i,j} = 0$ for all $j \in \mathcal{C}$.
• If $\mathcal{J}^i_2 \ne \emptyset$, then we handle $X_{i,j}$ for $j \in \mathcal{J}^i_1$ and $j \in \mathcal{J}^i_2$ separately.
  – For $j \in \mathcal{J}^i_1$, it is not hard to see that $X^*_{i,j} = 0$.
  – For $j \in \mathcal{J}^i_2$, $\omega(i, j) \le 0$. Then we should set $X^*_{i,j^*} = 1$ for the $j^*$ such that

$$j^* = \arg\min_{j \in \mathcal{J}^i_2} \omega(i, j) \quad (21)$$

and $X^*_{i,j} = 0$ for $j \in \mathcal{J}^i_2 \setminus \{j^*\}$. In this way, for any $\mathbf{X}' \in \mathcal{A}$ and any switch $i$, the following always holds:

$$\sum_{j \in \mathcal{J}^i_1} \omega(i, j) \cdot X'_{i,j} + \sum_{j \in \mathcal{J}^i_2} \omega(i, j) \cdot X'_{i,j} \;\ge\; \Big( \sum_{j \in \mathcal{J}^i_1} \omega(i, j) \Big) \cdot 0 + \min_{j \in \mathcal{J}^i_2} \omega(i, j) \;=\; \sum_{j \in \mathcal{J}^i_1} \omega(i, j) \cdot X^*_{i,j} + \sum_{j \in \mathcal{J}^i_2} \omega(i, j) \cdot X^*_{i,j} \quad (22)$$

The above process shows how to minimize (16), which yields the following algorithm:

Algorithm 1 Greedy Scheduling Algorithm
Input: During time slot $t$, the scheduler collects queue length information from individual controllers and switches, i.e., $\mathbf{Q}^c(t)$, $\mathbf{Q}^s(t)$, and $\mathbf{A}(t)$.
Output: A scheduling association $\mathcal{X} \subset \mathcal{S} \times \mathcal{C}$.
1: Start with an empty set $\mathcal{X} \leftarrow \emptyset$
2: for each switch $i \in \mathcal{S}$ do
3:   Split the set of controllers $\mathcal{C}$ into $\mathcal{J}^i_1 = \{ j \in \mathcal{C} \mid \omega(i, j) > 0 \}$ and $\mathcal{J}^i_2 = \{ j \in \mathcal{C} \mid \omega(i, j) \le 0 \}$
4:   If $\mathcal{J}^i_2 = \emptyset$, skip the current iteration.
5:   If $\mathcal{J}^i_2 \ne \emptyset$, choose a controller $j^* \in \mathcal{J}^i_2$ such that $j^* \in \arg\min_{j \in \mathcal{J}^i_2} \omega(i, j)$
6:   $\mathcal{X} \leftarrow \mathcal{X} \cup \{(i, j^*)\}$
7: end for
8: return $\mathcal{X}$

According to $\mathcal{X}$, switches upload requests to controllers or append requests to their local queues. Controllers and switches then update their queue backlogs as in (2) and (3) after serving requests.
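Algorithm 1 can be sketched compactly. The return format, a dictionary from switch index to controller index where omitted switches process their requests locally, is our own convention:

```python
def omega(i, j, Qs, Qc, W, alpha, V):
    """Eq. (20): the per-pair score that the greedy rule minimizes."""
    return V * W[i][j] + Qc[j] - V * alpha[i] - Qs[i]

def greedy_schedule(Qs, Qc, W, alpha, V):
    """Algorithm 1: returns {switch: controller}. Switches missing from the
    result have an empty J2 (all omega > 0) and process requests locally."""
    X = {}
    for i in range(len(Qs)):
        scores = [omega(i, j, Qs, Qc, W, alpha, V) for j in range(len(Qc))]
        j_star = min(range(len(Qc)), key=scores.__getitem__)
        if scores[j_star] <= 0:      # J2 nonempty: associate i with j*
            X[i] = j_star
    return X

# A heavily backlogged switch uploads; a lightly loaded, distant one stays local.
print(greedy_schedule(Qs=[10, 0], Qc=[0, 0], W=[[1, 2], [3, 4]],
                      alpha=[1, 1], V=1))   # {0: 0}
```

Note that, as the parallelism remark below suggests, each switch's decision depends only on its own row of W and its own queue length, so the loop body is trivially parallelizable.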

Remarks:
• Our algorithm is greedy: it associates each switch with a controller that either has a small queue backlog or is close to the switch, and otherwise leaves all of the switch's requests to be processed locally.
• For switch $i$ and any controller $j$ far enough from $i$, i.e., $W_{i,j} > \alpha_i$, switch $i$ uploads requests to $j$ only if $\omega(i, j)$ is non-positive and smaller than that of any other controller. This requires switch $i$ to hold enough requests locally, i.e., $Q^s_i(t) \ge V \cdot (W_{i,j} - \alpha_i) + Q^c_j(t)$; only then will it upload requests. Thus a smaller $V$ more readily invokes switch $i$'s willingness to upload requests to the control plane.
• On the other hand, for any controller $j$ close to switch $i$, i.e., $W_{i,j} < \alpha_i$, switch $i$ will process requests locally only if the control plane holds a large number of requests, i.e., $Q^s_i(t) < V \cdot (W_{i,j} - \alpha_i) + Q^c_j(t)$. Thus for any large $V$, controllers will have to accumulate heavy loads of requests before switches become willing to process requests locally.
• Therefore, the parameter $V$ controls the switches' willingness to upload requests to controllers, i.e., to perform switch-controller association. In other words, it controls the trade-off between the communication cost and the computational cost, incurred by uploading requests to the control plane and by local processing, respectively.

Now we turn to the time complexity of our algorithm. Within each time slot, the algorithm runs $|\mathcal{S}|$ iterations in total. For each switch $i$, it takes $O(|\mathcal{C}|)$ steps to split $\mathcal{C}$ into two disjoint sets. If the resulting $\mathcal{J}^i_2 \ne \emptyset$, the algorithm computes $\omega(i, j)$ for each of the $|\mathcal{C}|$ controllers and picks the $j^*$ with the minimum $\omega(i, j)$. For each switch-controller pair $(i, j)$, computing $\omega(i, j)$ takes only constant time according to (20). Therefore, in each time slot, our algorithm takes $O(|\mathcal{S}| \times |\mathcal{C}|)$ time to decide the optimal switch-controller association. In fact, our algorithm can also run in parallel: for each switch $i$, given $\mathbf{W}$, $\mathbf{Q}^c(t)$, $Q^s_i(t)$, and $A_i(t)$, the scheduler can decide its associated controller independently of the other switches.

Next we characterize the performance of our algorithm. Let $f^*$ and $g^*$ be the infima of the time-average communication cost and computational cost that we want to achieve, respectively, and let $d_{\max} = \max_{i,j} \big( \mathbb{E}[B^2_j(t)], \mathbb{E}[U^2_i(t)], \mathbb{E}[A^2_i(t)] \big)$. Then we have the following theorem⁵ on the $[O(1/V), O(V)]$ trade-off between costs and queue backlogs:

Theorem 1: Given parameters $V > 0$, $\epsilon > 0$, and a constant $K \ge \frac{1}{2} d_{\max} \cdot (|\mathcal{C}| + |\mathcal{S}| + |\mathcal{S}|^2)$, the queue vector process $\mathbf{Q}(t)$ is stable; moreover, the time-average expectations of the communication cost and computational cost, as well as the queue backlogs of switches and controllers, satisfy:

$$\limsup_{t \to \infty} \; \bar{f}(t) + \bar{g}(t) \le f^* + g^* + \frac{K}{V}$$

$$\limsup_{t \to \infty} \; \frac{1}{t} \sum_{\tau=0}^{t-1} \Big[ \sum_{j \in \mathcal{C}} \mathbb{E}\{Q^c_j(\tau)\} + \sum_{i \in \mathcal{S}} \mathbb{E}\{Q^s_i(\tau)\} \Big] \le \frac{K + V \cdot (f^* + g^*)}{\epsilon} \quad (23)$$

IV. SIMULATION RESULTS

A. Basic Settings

Topology: We evaluate our Greedy scheduling algorithm under two widely adopted topologies: the Canonical 3-Tiered topology and the Fat-tree topology [12]. We show an instance of each in Fig. 2 and Fig. 3, respectively.

⁵See the proof in Appendix B.

Fig. 2. An instance of the Canonical 3-Tiered topology with $k = 4$, where $k$ denotes the number of core switches. In this paper, the number of aggregate switches is also set to $k$, and each aggregate switch connects to $k - 1$ edge switches; the total number of switches is $k^2 + k$. Each edge switch is directly connected to $k/2$ hosts, so there are $(k^3 - k^2)/2$ hosts in total. (The figure shows the core, aggregate, and edge switch layers and the host layer.)

Fig. 3. An instance of the Fat-tree topology with $k = 4$, where $k$ denotes the number of pods. The numbers of core, aggregate, and edge switches are $k^2/4$, $k^2/2$, and $k^2/2$, respectively, so the total number of switches is $\frac{5}{4}k^2$. Each edge switch is directly connected to $k/2$ hosts; accordingly, there are $k^3/4$ hosts in total. (The figure shows the core, aggregate, and edge switch layers and the host layer.)

To make our performance analysis comparable across the two topologies, we construct a Fat-tree network and a Canonical 3-Tiered network with almost the same number of switches. For the Canonical 3-Tiered topology, we set the number of core switches to 26, so the total number of switches is 702. For the Fat-tree topology, we set the port number to 24, which gives 720 switches in total. Note that the two resulting topologies are also comparable in size to commercial data centers [13]. In both topologies, we deploy controllers on the hosts (one controller for every two pods⁶), denoted by the blue circles in Fig. 2 and Fig. 3.
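The switch and host counts quoted above follow directly from the formulas in the captions of Fig. 2 and Fig. 3; a quick arithmetic check:

```python
def tiered_counts(k):
    """Canonical 3-Tiered: k core + k aggregate + k*(k-1) edge switches,
    with k/2 hosts per edge switch."""
    switches = k + k + k * (k - 1)           # = k^2 + k
    hosts = k * (k - 1) * (k // 2)           # = (k^3 - k^2) / 2
    return switches, hosts

def fattree_counts(k):
    """Fat-tree with k-port switches: k^2/4 core + k^2/2 aggregate
    + k^2/2 edge switches, and k^3/4 hosts."""
    switches = k * k // 4 + k * k // 2 + k * k // 2   # = (5/4) k^2
    hosts = k ** 3 // 4
    return switches, hosts

print(tiered_counts(26)[0], fattree_counts(24)[0])   # 702 720
```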

Traffic Workloads: We conduct trace-driven simulations in which the flow arrival process on each switch follows the distribution of flow inter-arrival times in [13], drawn from measurements in real-world data centers. In [13], the average flow inter-arrival time is about 1700 µs. We set the length of each time slot to 10 ms; accordingly, the average flow arrival rate on each switch is about 5.88 flows per time slot. Real-world data center networks do contain hot spots within pods, where switches have significantly higher flow arrival rates. In our simulation, we pick the first pod as a hot spot: all of its switches have a significantly higher flow arrival rate of 200 flows per time slot. As for controllers, we set their individual capacity to 600 flows per time slot, consistent with the capacity of a typical NOX controller [14].
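The quoted per-slot arrival rate is simply the ratio of the slot length to the mean inter-arrival time:

```python
slot_us = 10 * 1000          # 10 ms time slot, expressed in microseconds
inter_arrival_us = 1700      # mean flow inter-arrival time from [13]
rate = slot_us / inter_arrival_us
print(round(rate, 2))        # 5.88 flows per time slot, on average
```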

Costs: Given a network topology, we define the communication cost $W_{i,j}$ between switch $i$ and controller $j$ as the length (number of hops) of the shortest path from $i$ to $j$.

⁶In the Canonical 3-Tiered topology, we regard the group of switches affiliated with the same aggregation switch as one pod (including the aggregation switch itself).

Fig. 4. Performance of Greedy under the Fat-tree topology and the Canonical 3-Tiered topology in terms of (a) the sum of total communication cost and computational cost versus V, and (b) total queue backlog versus V.

Fig. 5. Performance of Greedy under the Fat-tree topology with request arrivals following Poisson and Pareto processes, in terms of (a) the sum of total communication cost and computational cost versus V, and (b) total queue backlog versus V.

We then set a common computational cost $\alpha$ for all switches, equal to the average hop count between switches and controllers in the underlying topology: $\alpha = 4.13$ in the Fat-tree topology and $\alpha = 4.81$ in the 3-Tiered topology.

B. Evaluation of the Greedy Algorithm

Fig. 4(a) shows how the sum of the long-term average communication and computational costs changes with $V$ in the Fat-tree and 3-Tiered topologies. As $V$ varies from $0$ to $3.0 \times 10^7$, we observe that the total cost goes down gradually, consistent with our theoretical analysis. The intuition behind this decline is as follows. Recall that $V$ controls the switches' willingness to upload requests. For switches close to controllers (whose communication cost is below the average), a large $V$ makes them unwilling to process requests locally unless the controllers are too heavily loaded. As $V$ increases, these switches choose to upload requests, which further reduces the total cost since, for these switches, the communication cost is less than the computational cost. Another observation is that the total cost in the 3-Tiered topology exceeds that in the Fat-tree topology. The reason is that the 3-Tiered topology has a higher computational cost ($\alpha = 4.81$ versus $4.13$), so local processing costs more. In Fig. 5(a), we also show the total cost of Greedy in the Fat-tree topology under two other request arrival processes. The Poisson and Pareto curves exhibit a qualitatively similar decline in total cost, although Pareto incurs a slightly (~1.5%) larger queue backlog size than Poisson.

Fig. 4(b) shows the total queue backlog size for different values of $V$. We notice that the total queue backlog size increases until $V$ reaches about $0.75 \times 10^7$ and $1.5 \times 10^7$ in the Fat-tree and 3-Tiered topologies, respectively. This is consistent with the $O(V)$ queue backlog bound in (23). Recall our analysis of the total cost: a larger $V$ drives most switches to upload more of their requests to the control plane. However, the control plane's service capacity is fixed, so requests keep accumulating; when $V$ becomes sufficiently large, the control plane eventually holds most of the requests in the system. This explains the increasing queue backlog size in Fig. 4(b). Fig. 5(b) presents the total queue backlog size in the Fat-tree topology when Greedy is applied with Poisson and Pareto request arrivals. We omit the corresponding 3-Tiered results because their curves almost overlap with those in Fig. 5(b).

Fig. 6. Communication cost comparison among the four scheduling schemes under the Canonical 3-Tiered topology and the Fat-tree topology, respectively.

C. Comparison with Other Dynamic Association Schemes

In this subsection, we consider the extreme case of a common computational cost $\alpha = 2.0 \times 10^{28}$ for all switches. This makes the cost of processing requests locally prohibitively high, so in every time slot switches choose to upload their requests to controllers, and our greedy algorithm degenerates into a pure dynamic switch-controller association algorithm. We compare its performance with three other schemes: Static, Random, and JSQ (Join-the-Shortest-Queue). In the Static scheme, each switch $i$ chooses the controller $j$ with minimum communication cost $W_{i,j}$ and fixes this association for all time slots. In the Random scheme, the scheduler randomly picks a controller for each switch in each time slot. In the JSQ scheme, the scheduler randomly picks one switch at a time without replacement, round by round, until all switches have chosen their target controllers; in each round, the selected switch $i$ chooses the controller $j$ with the smallest current queue backlog and appends all its requests to $j$'s queue.
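The three baselines can be sketched as follows; the list/dictionary encodings mirror the conventions used for the greedy scheme and are our own.

```python
import random

def static_assoc(W):
    """Static: each switch fixes the controller of minimum communication
    cost W[i][j], once and for all."""
    return [min(range(len(W[0])), key=lambda j: W[i][j]) for i in range(len(W))]

def random_assoc(num_switches, num_controllers):
    """Random: a fresh uniform controller pick for every switch, every slot."""
    return [random.randrange(num_controllers) for _ in range(num_switches)]

def jsq_assoc(Qc, A, order):
    """JSQ: switches, visited in a random order, join the currently shortest
    controller queue and append all their requests to it."""
    backlog = list(Qc)
    assoc = {}
    for i in order:
        j = min(range(len(backlog)), key=backlog.__getitem__)
        assoc[i] = j
        backlog[j] += A[i]
    return assoc
```

Note the structural difference: Static ignores queue state entirely, JSQ ignores communication cost entirely, while the greedy rule of Algorithm 1 weighs both through $\omega(i, j)$.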

Fig. 6 compares Static, Random, JSQ, and Greedy in terms of communication cost under the Canonical 3-Tiered topology and the Fat-tree topology, respectively. First, the communication cost under Static is the minimum among all schemes, consistent with its goal of minimizing the overall communication cost. Greedy cuts down the communication cost as $V$ increases; eventually, when $V$ is sufficiently large, the communication cost stops decreasing and remains unchanged. Both Random and JSQ exhibit much higher communication costs than Greedy and Static.

Fig. 7 presents a comparison among the four schemes in terms of the variance of queue backlog size under the Canonical 3-Tiered and Fat-tree topologies, respectively. A smaller variance of queue backlog size indicates better load balancing. The variance of Static grows exponentially with time, showing that Static is incompetent at load balancing. The reason is that Static greedily associates switches with their nearest controllers, ignoring the controllers' different loads. For Random and JSQ, the variance is almost 0, which shows these two schemes' advantage in load balancing. The variance of Greedy lies between the other schemes': it

(a) Canonical 3-Tiered topology (b) Fat-tree topology

Fig. 7. Variance of queue backlog size comparison among the four scheduling schemes under the Canonical 3-Tiered and Fat-tree topologies, respectively.

(a) Canonical 3-Tiered topology (b) Fat-tree topology

Fig. 8. Average queue backlog size comparison among the four scheduling schemes under the Canonical 3-Tiered and Fat-tree topologies, respectively.

(a) Poisson (b) Pareto

Fig. 9. Communication cost comparison among the four scheduling schemes under the Fat-tree topology, when the flow arrival follows Poisson and Pareto processes, respectively.

increases at the beginning and then remains stable after a few thousand time slots. Furthermore, Greedy exhibits a higher variance of queue backlog size with larger V, i.e., the controllers' loads are more imbalanced.

Fig. 8 presents a comparison among the four schemes in terms of the average queue backlog size under the Canonical 3-Tiered and Fat-tree topologies, respectively. The observations in Fig. 8 are consistent with those of Fig. 7. Intuitively, the more balanced the controllers' loads are, the smaller the average queue backlog size. In Fig. 7, the variance of Static is high while that of Random and JSQ is almost 0, so the average queue backlog size of Static is large while that of Random and JSQ is small in Fig. 8. As for Greedy, the variance increases with V in Fig. 7, so the average queue backlog size of Greedy rises as V increases, as shown in Fig. 8.

In addition to trace-driven simulation, we also conduct the comparison under two other assumptions on the flow arrival process, i.e., Poisson and Pareto processes, both of which are widely adopted in traffic analysis. We only show the simulation results under the Fat-tree topology, because the results under the 3-Tiered topology are qualitatively similar. Fig. 9 shows the communication cost comparison

(a) Poisson (b) Pareto

Fig. 10. Average queue backlog size comparison among the four scheduling schemes under the Fat-tree topology, when the flow arrival follows Poisson and Pareto processes, respectively.

when the flow arrival process follows Poisson and Pareto processes, respectively. Fig. 10 shows the corresponding comparison of average queue backlog size. We can see from these figures that the scheduling policies perform qualitatively consistently under different arrival processes.

In summary, among the four schemes, Static sits at one end of the performance spectrum: it minimizes communication cost while incurring extremely large queue backlogs. Random and JSQ sit at the other end: they minimize the average queue backlog while incurring much larger communication costs. In contrast, our Greedy scheme achieves a trade-off between minimizing communication costs and minimizing queue backlogs. Through the tunable parameter V, we can achieve different degrees of balance between cost minimization and latency (queue backlog) minimization.

V. CONCLUSION

In this paper, we studied the joint optimization problem of dynamic switch-controller association and dynamic control devolution for SDN systems. We formulated the problem as a stochastic network optimization problem, aiming to minimize the long-term average sum of communication and computational costs while maintaining low time-average queue backlogs. We proposed an efficient online greedy algorithm, which yields a long-term average sum of communication and computational costs within O(1/V) of optimality, at the cost of an O(V) queue backlog size, for any positive control parameter V. Extensive simulation results show the effectiveness and optimality of our online algorithm, and its ability to maintain a tunable trade-off compared to other dynamic association schemes.

ACKNOWLEDGMENT

This work was supported in part by NSFC grant No. 61302114, and Hong Kong RGC CRF-C7036-15G.

REFERENCES

[1] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, "OpenFlow: Enabling innovation in campus networks," ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69-74, 2008.

[2] T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Ramanathan, Y. Iwata, H. Inoue, T. Hama et al., "Onix: A distributed control platform for large-scale production networks," in OSDI, vol. 10, 2010, pp. 1-6.

[3] A. Tootoonchian and Y. Ganjali, "HyperFlow: A distributed control plane for OpenFlow," in Proceedings of the 2010 Internet Network Management Conference on Research on Enterprise Networking, 2010, pp. 3-3.

[4] D. Levin, A. Wundsam, B. Heller, N. Handigol, and A. Feldmann, "Logically centralized?: State distribution trade-offs in software defined networks," in Proceedings of the First Workshop on Hot Topics in Software Defined Networks. ACM, 2012, pp. 1-6.

[5] A. Dixit, F. Hao, S. Mukherjee, T. Lakshman, and R. Kompella, "Towards an elastic distributed SDN controller," ACM SIGCOMM Computer Communication Review, vol. 43, no. 4, pp. 7-12, 2013.

[6] A. Krishnamurthy, S. P. Chandrabose, and A. Gember-Jacobson, "Pratyaastha: An efficient elastic distributed SDN control plane," in Proceedings of the Third Workshop on Hot Topics in Software Defined Networking. ACM, 2014, pp. 133-138.

[7] T. Wang, F. Liu, J. Guo, and H. Xu, "Dynamic SDN controller assignment in data center networks: Stable matching with transfers," in Proc. of INFOCOM, 2016.

[8] A. R. Curtis, J. C. Mogul, J. Tourrilhes, P. Yalagandula, P. Sharma, and S. Banerjee, "DevoFlow: Scaling flow management for high-performance networks," in ACM SIGCOMM Computer Communication Review, vol. 41, no. 4. ACM, 2011, pp. 254-265.

[9] S. Hassas Yeganeh and Y. Ganjali, "Kandoo: A framework for efficient and scalable offloading of control applications," in Proceedings of the First Workshop on Hot Topics in Software Defined Networks. ACM, 2012, pp. 19-24.

[10] K. Zheng, L. Wang, B. Yang, Y. Sun, Y. Zhang, and S. Uhlig, "LazyCtrl: Scalable network control for cloud data centers," arXiv preprint arXiv:1504.02609, 2015.

[11] M. J. Neely, "Stochastic network optimization with application to communication and queueing systems," Synthesis Lectures on Communication Networks, vol. 3, no. 1, pp. 1-211, 2010.

[12] M. Al-Fares, A. Loukissas, and A. Vahdat, "A scalable, commodity data center network architecture," in ACM SIGCOMM Computer Communication Review, vol. 38, no. 4. ACM, 2008, pp. 63-74.

[13] T. Benson, A. Akella, and D. A. Maltz, "Network traffic characteristics of data centers in the wild," in Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement. ACM, 2010, pp. 267-280.

[14] A. Tootoonchian, S. Gorbunov, Y. Ganjali, M. Casado, and R. Sherwood, "On controller performance in software-defined networks," in Proc. of the 2nd USENIX Workshop on Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services, 2012.

APPENDIX A

PROBLEM TRANSFORMATION BY OPPORTUNISTICALLY MINIMIZING AN EXPECTATION

By minimizing the upper bound of the drift-plus-penalty expression (14), the time average of the communication cost can be minimized while stabilizing the network of request queues. We denote the objective function of (16) by $J_t(X)$ and its optimal solution by $X^* \in \mathcal{A}$. Therefore, for any other scheduling decision $X \in \mathcal{A}$ made during time slot $t$, we have
$$J_t(X) \ge J_t(X^*). \qquad (24)$$
Taking conditional expectations on both sides with respect to $\mathbf{Q}^c(t)$, we have
$$\mathbb{E}[J_t(X) \mid \mathbf{Q}^c(t)] \ge \mathbb{E}[J_t(X^*) \mid \mathbf{Q}^c(t)] \qquad (25)$$
for any $X \in \mathcal{A}$. In this way, instead of directly solving the long-term stochastic optimization problem (11), we can opportunistically choose a feasible association that solves problem (16) during each time slot.
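The per-slot decision can be illustrated with a brute-force sketch for tiny instances. This is our own hypothetical illustration, not the paper's algorithm: the slot objective is passed in as a callable `J_t`, and the paper's greedy algorithm avoids this exponential enumeration by exploiting the structure of the objective.

```python
from itertools import product

def per_slot_minimizer(J_t, num_switches, num_controllers):
    """Opportunistically minimize the slot objective J_t over all feasible
    decisions: each switch either uploads to one controller (0..C-1)
    or processes its requests locally (encoded here as -1)."""
    choices = list(range(num_controllers)) + [-1]     # -1 = devolution
    best_x, best_val = None, float("inf")
    for x in product(choices, repeat=num_switches):   # brute force over A
        val = J_t(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

The enumeration visits $(|\mathcal{C}|+1)^{|\mathcal{S}|}$ decisions, so it is only usable for illustration on very small instances.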

APPENDIX B

PROOF OF THEOREM 1

Given $L(\mathbf{Q}(t))$ defined in (12), we have
$$
\begin{aligned}
& L(\mathbf{Q}(t+1)) - L(\mathbf{Q}(t)) \\
&= \frac{1}{2} \sum_{j \in \mathcal{C}} \left[ Q^c_j(t+1)^2 - Q^c_j(t)^2 \right] + \frac{1}{2} \sum_{i \in \mathcal{S}} \left[ (Q^s_i(t+1))^2 - (Q^s_i(t))^2 \right] \\
&\le \frac{1}{2} \sum_{j \in \mathcal{C}} \Bigg[ \Big( Q^c_j(t) - B_j(t) + \sum_{i \in \mathcal{S}} X_{i,j} \cdot A_i(t) \Big)^2 - Q^c_j(t)^2 \Bigg] + \frac{1}{2} \sum_{i \in \mathcal{S}} \Big[ \big( Q^s_i(t) - U_i(t) + Y_i \cdot A_i(t) \big)^2 - (Q^s_i(t))^2 \Big] \\
&= \frac{1}{2} \sum_{j \in \mathcal{C}} \Bigg[ 2 Q^c_j(t) \Big( \sum_{i \in \mathcal{S}} X_{i,j} \cdot A_i(t) - B_j(t) \Big) + \Big( \sum_{i \in \mathcal{S}} X_{i,j} \cdot A_i(t) - B_j(t) \Big)^2 \Bigg] + \frac{1}{2} \sum_{i \in \mathcal{S}} \Big[ \big( Y_i \cdot A_i(t) - U_i(t) \big)^2 + 2 Q^s_i(t) \big( Y_i \cdot A_i(t) - U_i(t) \big) \Big] \\
&\le \sum_{j \in \mathcal{C}} \Bigg[ Q^c_j(t) \Big( \sum_{i \in \mathcal{S}} X_{i,j} \cdot A_i(t) - B_j(t) \Big) + \frac{ \big( \sum_{i \in \mathcal{S}} X_{i,j} \cdot A_i(t) \big)^2 + (B_j(t))^2 }{2} \Bigg] + \sum_{i \in \mathcal{S}} \Bigg[ Q^s_i(t) \big( Y_i \cdot A_i(t) - U_i(t) \big) + \frac{ (Y_i \cdot A_i(t))^2 + (U_i(t))^2 }{2} \Bigg]
\end{aligned} \qquad (26)
$$

Then, with the definition of $\Delta(\mathbf{Q}(t))$ in (13), we have
$$
\begin{aligned}
\Delta(\mathbf{Q}(t)) &= \mathbb{E}\{ L(\mathbf{Q}(t+1)) - L(\mathbf{Q}(t)) \mid \mathbf{Q}(t) \} \\
&\le \mathbb{E}\Bigg\{ \sum_{j \in \mathcal{C}} Q^c_j(t) \Big( \sum_{i \in \mathcal{S}} X_{i,j}(t) A_i(t) - B_j(t) \Big) \,\Bigg|\, \mathbf{Q}(t) \Bigg\} + \mathbb{E}\Bigg\{ \sum_{i \in \mathcal{S}} Q^s_i(t) \big( Y_i(t) A_i(t) - U_i(t) \big) \,\Bigg|\, \mathbf{Q}(t) \Bigg\} \\
&\quad + \frac{1}{2} \mathbb{E}\Bigg\{ \sum_{j \in \mathcal{C}} \Bigg[ \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + (B_j(t))^2 \Bigg] \,\Bigg|\, \mathbf{Q}(t) \Bigg\} + \frac{1}{2} \mathbb{E}\Bigg\{ \sum_{i \in \mathcal{S}} \Big[ (Y_i A_i(t))^2 + (U_i(t))^2 \Big] \,\Bigg|\, \mathbf{Q}(t) \Bigg\} \\
&= \sum_{j \in \mathcal{C}} Q^c_j(t) \cdot \mathbb{E}\Bigg\{ \sum_{i \in \mathcal{S}} X_{i,j}(t) A_i(t) - B_j(t) \,\Bigg|\, \mathbf{Q}(t) \Bigg\} + \sum_{i \in \mathcal{S}} Q^s_i(t) \cdot \mathbb{E}\big\{ Y_i(t) A_i(t) - U_i(t) \mid \mathbf{Q}(t) \big\} \\
&\quad + \frac{1}{2} \mathbb{E}\Bigg\{ \sum_{j \in \mathcal{C}} \Bigg[ \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + (B_j(t))^2 \Bigg] \Bigg\} + \frac{1}{2} \mathbb{E}\Bigg\{ \sum_{i \in \mathcal{S}} \Big[ (Y_i A_i(t))^2 + (U_i(t))^2 \Big] \Bigg\}
\end{aligned} \qquad (27)
$$

The last equality in (27) holds because, conditioned on $\mathbf{Q}(t)$, both $Q^s_i(t)$ and $Q^c_j(t)$ can be regarded as constants. Besides, the queueing process $\{\mathbf{Q}(t)\}$ is independent of the arrival process $\{\mathbf{A}(t)\}$ and the service processes $\{\mathbf{B}(t)\}$, $\{\mathbf{U}(t)\}$; hence the last two terms do not depend on $\mathbf{Q}(t)$. Now consider the last two terms in (27). We have

$$
\begin{aligned}
& \frac{1}{2} \sum_{j \in \mathcal{C}} \Bigg[ \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + (B_j(t))^2 \Bigg] + \frac{1}{2} \sum_{i \in \mathcal{S}} \Big[ (Y_i A_i(t))^2 + (U_i(t))^2 \Big] \\
&= \frac{1}{2} \Bigg[ \sum_{j \in \mathcal{C}} (B_j(t))^2 + \sum_{i \in \mathcal{S}} (U_i(t))^2 \Bigg] + \frac{1}{2} \sum_{j \in \mathcal{C}} \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + \frac{1}{2} \sum_{i \in \mathcal{S}} \Bigg[ \Big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \Big) \cdot A_i(t) \Bigg]^2
\end{aligned} \qquad (28)
$$

Then, by taking the expectation of (28), the following holds:
$$
\begin{aligned}
& \mathbb{E}\Bigg\{ \frac{1}{2} \sum_{j \in \mathcal{C}} \Bigg[ \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + (B_j(t))^2 \Bigg] + \frac{1}{2} \sum_{i \in \mathcal{S}} \Big[ (Y_i A_i(t))^2 + (U_i(t))^2 \Big] \Bigg\} \\
&= \mathbb{E}\Bigg\{ \frac{1}{2} \Bigg[ \sum_{j \in \mathcal{C}} (B_j(t))^2 + \sum_{i \in \mathcal{S}} (U_i(t))^2 \Bigg] + \frac{1}{2} \sum_{j \in \mathcal{C}} \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + \frac{1}{2} \sum_{i \in \mathcal{S}} \Bigg[ \Big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \Big) \cdot A_i(t) \Bigg]^2 \Bigg\} \\
&= \frac{1}{2} \Bigg[ \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ (B_j(t))^2 \big\} + \sum_{i \in \mathcal{S}} \mathbb{E}\big\{ (U_i(t))^2 \big\} \Bigg] + \frac{1}{2} \sum_{j \in \mathcal{C}} \mathbb{E}\Bigg\{ \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 \Bigg\} + \frac{1}{2} \sum_{i \in \mathcal{S}} \mathbb{E}\Bigg\{ \Big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \Big)^2 \cdot (A_i(t))^2 \Bigg\} \\
&= \frac{1}{2} \Bigg[ \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ (B_j(t))^2 \big\} + \sum_{i \in \mathcal{S}} \mathbb{E}\big\{ (U_i(t))^2 \big\} \Bigg] + \frac{1}{2} \sum_{j \in \mathcal{C}} \mathbb{E}\Bigg\{ \sum_{i \in \mathcal{S}} X^2_{i,j} (A_i(t))^2 + 2 \sum_{i < i'} X_{i,j} X_{i',j} A_i(t) A_{i'}(t) \Bigg\} + \frac{1}{2} \sum_{i \in \mathcal{S}} \mathbb{E}\Bigg\{ \Big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \Big)^2 \cdot (A_i(t))^2 \Bigg\}
\end{aligned} \qquad (29)
$$

Recall that the request arrival processes $\{A_i(t)\}$ are mutually independent, and that they are also independent of $X_{i,j}$ for $(i, j) \in \mathcal{S} \times \mathcal{C}$. Then we have

$$
\begin{aligned}
& \mathbb{E}\Bigg\{ \frac{1}{2} \sum_{j \in \mathcal{C}} \Bigg[ \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + (B_j(t))^2 \Bigg] + \frac{1}{2} \sum_{i \in \mathcal{S}} \Big[ (Y_i A_i(t))^2 + (U_i(t))^2 \Big] \Bigg\} \\
&= \frac{1}{2} \Bigg[ \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ (B_j(t))^2 \big\} + \sum_{i \in \mathcal{S}} \mathbb{E}\big\{ (U_i(t))^2 \big\} \Bigg] + \frac{1}{2} \sum_{j \in \mathcal{C}} \Bigg[ \sum_{i \in \mathcal{S}} \mathbb{E}\big\{ X^2_{i,j} \big\} \mathbb{E}\big\{ (A_i(t))^2 \big\} + 2 \sum_{i < i'} \mathbb{E}\{ X_{i,j} \} \mathbb{E}\{ X_{i',j} \} \mathbb{E}\{ A_i(t) \} \mathbb{E}\{ A_{i'}(t) \} \Bigg] \\
&\quad + \frac{1}{2} \sum_{i \in \mathcal{S}} \mathbb{E}\Bigg\{ \Big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \Big)^2 \Bigg\} \cdot \mathbb{E}\big\{ (A_i(t))^2 \big\} \\
&\le \frac{1}{2} \Bigg[ C \cdot \max_{j \in \mathcal{C}} \mathbb{E}\big\{ B^2_j(t) \big\} + S \cdot \max_{i \in \mathcal{S}} \mathbb{E}\big\{ U^2_i(t) \big\} + \max_{i \in \mathcal{S}} \mathbb{E}\big\{ A^2_i(t) \big\} \Bigg( \sum_{j \in \mathcal{C}} \mathbb{E}\Bigg\{ \Big( \sum_{i \in \mathcal{S}} X_{i,j} \Big)^2 \Bigg\} + \sum_{i \in \mathcal{S}} \mathbb{E}\Bigg\{ \Big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \Big)^2 \Bigg\} \Bigg) \Bigg] \\
&\le \frac{1}{2} \max_{i,j} \Big\{ \mathbb{E}\big\{ B^2_j(t) \big\}, \mathbb{E}\big\{ U^2_i(t) \big\}, \mathbb{E}\big\{ A^2_i(t) \big\} \Big\} \cdot \Bigg[ C + S + \max_{X \in \mathcal{A}} \Bigg( \sum_{j \in \mathcal{C}} \Big( \sum_{i \in \mathcal{S}} X_{i,j} \Big)^2 + \sum_{i \in \mathcal{S}} (Y_i(t))^2 \Bigg) \Bigg]
\end{aligned} \qquad (30)
$$

where the first inequality holds by the following reasoning. Suppose $i^* \in \arg\max_{i \in \mathcal{S}} \mathbb{E}\{A_i(t)\}$. Then for any $i, i' \in \mathcal{S}$,
$$\mathbb{E}\{A_i(t)\} \cdot \mathbb{E}\{A_{i'}(t)\} \le \big( \mathbb{E}\{A_{i^*}(t)\} \big)^2. \qquad (31)$$
Since $\mathrm{Var}\{A_{i^*}(t)\} \ge 0$, we have $\big( \mathbb{E}\{A_{i^*}(t)\} \big)^2 \le \mathbb{E}\big\{ A^2_{i^*}(t) \big\}$, and therefore
$$\mathbb{E}\{A_i(t)\} \cdot \mathbb{E}\{A_{i'}(t)\} \le \mathbb{E}\big\{ A^2_{i^*}(t) \big\}. \qquad (32)$$
Thus the first inequality in (30) holds.
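Steps (31) and (32) can be spot-checked numerically. The following is a hypothetical standalone check on empirical moments of a few sampled arrival processes (all names are ours, not from the paper):

```python
import random

random.seed(0)
# Draw samples from four hypothetical arrival processes and estimate
# their first and second empirical moments.
samples = [[random.randint(0, 10) for _ in range(5000)] for _ in range(4)]
mean = [sum(s) / len(s) for s in samples]
second_moment = [sum(v * v for v in s) / len(s) for s in samples]

# i* attains the largest mean arrival rate.
i_star = max(range(4), key=lambda i: mean[i])

for i in range(4):
    for ip in range(4):
        # (31): any product of means is bounded by the largest mean squared.
        assert mean[i] * mean[ip] <= mean[i_star] ** 2 + 1e-9
# (32): since Var >= 0, the squared mean is bounded by the second moment.
assert mean[i_star] ** 2 <= second_moment[i_star] + 1e-9
```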

Next, we focus on the upper bound of $\sum_{j \in \mathcal{C}} \big( \sum_{i \in \mathcal{S}} X_{i,j} \big)^2 + \sum_{i \in \mathcal{S}} \big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \big)^2$ for $X \in \mathcal{A}$. At each time slot, $X_{i,j} \in \{0, 1\}$, and each switch $i \in \mathcal{S}$ must decide either to upload its requests to one of the controllers or to process them locally. Then, among all $X_{i,j}$ (for $(i,j) \in \mathcal{S} \times \mathcal{C}$) and $\big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \big)$ (for $i \in \mathcal{S}$), exactly $|\mathcal{S}|$ of them are equal to one. Let $a \in [0, |\mathcal{S}|]$ denote the number of switches that decide to upload requests to the control plane, i.e., there are $a$ terms among all $X_{i,j}$ (for $(i,j) \in \mathcal{S} \times \mathcal{C}$) that are equal to one. Likewise, let $b \in [0, |\mathcal{S}|]$ denote the number of switches that process requests locally. Accordingly, $a + b = |\mathcal{S}|$. Besides, $\sum_{i \in \mathcal{S}} \big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \big)^2 = b$, since there are exactly $b$ switches such that $\sum_{j \in \mathcal{C}} X_{i,j} = 0$.

Now we prove that the upper bound of $\sum_{j \in \mathcal{C}} \big( \sum_{i \in \mathcal{S}} X_{i,j} \big)^2$ is $a^2$, and that the bound is reached when all $a$ switches are associated with the same controller. We use $\mathcal{R}$ to denote the set of those $a$ switches, and introduce the indicator $I_{k,l}$ such that $I_{k,l} = 1$ if switches $k$ and $l$ are associated with the same controller and $I_{k,l} = 0$ otherwise. Therefore, for any switch-controller association $\mathcal{N} \subseteq \mathcal{R} \times \mathcal{C}$ such that $|\mathcal{N}| = a$, we have
$$
\begin{aligned}
\sum_{j \in \mathcal{C}} \Big( \sum_{i \in \mathcal{S}} X_{i,j} \Big)^2 &= \sum_{j \in \mathcal{C}} \Bigg[ \sum_{i : (i,j) \in \mathcal{N}} X^2_{i,j} + 2 \sum_{\substack{i, i' \in \mathcal{R} : i < i' \\ \text{and } (i,j), (i',j) \in \mathcal{N}}} X_{i,j} X_{i',j} \Bigg] \\
&= \sum_{(i,j) \in \mathcal{N}} X^2_{i,j} + 2 \sum_{j \in \mathcal{C}} \sum_{i, i' : i < i'} I_{i,i'} X_{i,j} X_{i',j} \\
&= \sum_{(i,j) \in \mathcal{N}} X^2_{i,j} + 2 \sum_{i, i' \in \mathcal{R} : i < i'} I_{i,i'} \\
&= a + 2 \sum_{i, i' \in \mathcal{R} : i < i'} I_{i,i'}
\end{aligned} \qquad (33)
$$

where the last equality holds because, for any pair of switches $(i, i')$, $I_{i,i'} = 1$ only when $i$ and $i'$ upload requests to the same controller. From (33), we know that the upper bound is reached when $I_{i,i'} = 1$ for all $i, i' \in \mathcal{R}$, i.e., when all switches in $\mathcal{R}$ are connected to the same controller. In that case, since there are $\frac{1}{2} a (a - 1)$ pairs of distinct switches, the upper bound of $\sum_{j \in \mathcal{C}} \big( \sum_{i \in \mathcal{S}} X_{i,j} \big)^2$ is $a + a(a-1) = a^2$. Hence,
$$
\sum_{j \in \mathcal{C}} \Big( \sum_{i \in \mathcal{S}} X_{i,j} \Big)^2 + \sum_{i \in \mathcal{S}} \Big( 1 - \sum_{j \in \mathcal{C}} X_{i,j} \Big)^2 \le a^2 + b = a^2 + |\mathcal{S}| - a = \Big( a - \frac{1}{2} \Big)^2 + |\mathcal{S}| - \frac{1}{4} \qquad (34)
$$

Since $a$ is a non-negative integer with $0 \le a \le |\mathcal{S}|$, the upper bound in (34) reaches its maximum value $|\mathcal{S}|^2$ when $a = |\mathcal{S}|$. In other words, the upper bound is maximized when all switches in $\mathcal{S}$ upload requests to the same controller.
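The bound can also be verified exhaustively for small instances. The following is a hypothetical standalone check, not part of the proof: it enumerates every feasible decision and confirms that the quantity bounded in (34) never exceeds $|\mathcal{S}|^2$.

```python
from itertools import product

def check_bound(S, C):
    """Enumerate all feasible decisions (each switch uploads to one of C
    controllers, or processes locally, encoded as the value C) and return
    the worst-case value of the quantity bounded in (34)."""
    worst = 0
    for choice in product(range(C + 1), repeat=S):    # value C means "local"
        X = [[1 if choice[i] == j else 0 for j in range(C)] for i in range(S)]
        term1 = sum(sum(X[i][j] for i in range(S)) ** 2 for j in range(C))
        term2 = sum((1 - sum(X[i])) ** 2 for i in range(S))
        worst = max(worst, term1 + term2)
    return worst

# The maximum equals |S|^2, attained when every switch uploads to the
# same controller.
```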

As a result,
$$
\begin{aligned}
& \mathbb{E}\Bigg\{ \frac{1}{2} \sum_{j \in \mathcal{C}} \Bigg[ \Big( \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \Big)^2 + (B_j(t))^2 \Bigg] + \frac{1}{2} \sum_{i \in \mathcal{S}} \Big[ (Y_i A_i(t))^2 + (U_i(t))^2 \Big] \Bigg\} \\
&\le \frac{1}{2} \max_{i,j} \Big\{ \mathbb{E}\big\{ B^2_j(t) \big\}, \mathbb{E}\big\{ U^2_i(t) \big\}, \mathbb{E}\big\{ A^2_i(t) \big\} \Big\} \cdot \big( |\mathcal{C}| + |\mathcal{S}| + |\mathcal{S}|^2 \big) = \frac{d_{\max}}{2} \big( |\mathcal{C}| + |\mathcal{S}| + |\mathcal{S}|^2 \big) \triangleq K
\end{aligned} \qquad (35)
$$

We assume that the control plane as a whole is capable of handling all requests from the data plane in the mean sense. Therefore, for each $j \in \mathcal{C}$, there exists $\epsilon^c_j > 0$ such that $\mathbb{E}\big[ B_j(t) - \sum_{i \in \mathcal{S}} X_{i,j} A_i(t) \mid \mathbf{Q}^c(t) \big] = \epsilon^c_j$. Likewise, for each $i \in \mathcal{S}$, there exists $\epsilon^s_i > 0$ such that $\mathbb{E}\big[ U_i(t) - Y_i A_i(t) \mid \mathbf{Q}^c(t) \big] = \epsilon^s_i$. Following (35) and the definition in (14), after taking the expectation of $\Delta_V(\mathbf{Q}(t))$, we have

$$
\begin{aligned}
\mathbb{E}\{ \Delta_V(\mathbf{Q}(t)) \} &\le K + \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(t) \big\} \cdot \mathbb{E}\Bigg\{ \mathbb{E}\Bigg\{ \sum_{i \in \mathcal{S}} X_{i,j}(t) A_i(t) - B_j(t) \,\Bigg|\, \mathbf{Q}(t) \Bigg\} \Bigg\} \\
&\quad + \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(t) \} \cdot \mathbb{E}\Bigg\{ \mathbb{E}\Bigg\{ \Big[ 1 - \sum_{j \in \mathcal{C}} X_{i,j}(t) \Big] A_i(t) - U_i(t) \,\Bigg|\, \mathbf{Q}(t) \Bigg\} \Bigg\} + V \cdot \mathbb{E}\big\{ \mathbb{E}\{ f(t) + g(t) \mid \mathbf{Q}(t) \} \big\} \\
&= K + \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(t) \big\} \cdot \mathbb{E}\Bigg\{ \sum_{i \in \mathcal{S}} X_{i,j}(t) A_i(t) - B_j(t) \Bigg\} \\
&\quad + \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(t) \} \cdot \mathbb{E}\Bigg\{ \Big[ 1 - \sum_{j \in \mathcal{C}} X_{i,j}(t) \Big] A_i(t) - U_i(t) \Bigg\} + V \cdot \mathbb{E}\{ f(t) + g(t) \} \\
&\le K - \epsilon^c \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(t) \big\} - \epsilon^s \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(t) \} + V \cdot (f^* + g^*)
\end{aligned} \qquad (36)
$$
where $\epsilon^c = \min_{j \in \mathcal{C}} \{ \epsilon^c_j \}$ and $\epsilon^s = \min_{i \in \mathcal{S}} \{ \epsilon^s_i \}$. Expanding the term $\mathbb{E}\{ \Delta_V(\mathbf{Q}(t)) \}$, we have, for any time slot $\tau$,

$$
\mathbb{E}\{ L(\mathbf{Q}(\tau+1)) - L(\mathbf{Q}(\tau)) \} + V \cdot \mathbb{E}\{ f(\tau) + g(\tau) \} \le K - \epsilon^c \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(\tau) \big\} - \epsilon^s \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} + V (f^* + g^*) \qquad (37)
$$

Next, summing over $\tau \in \{0, 1, 2, \ldots, t-1\}$ for some $t > 0$, we have
$$
\mathbb{E}\{ L(\mathbf{Q}(t)) - L(\mathbf{Q}(0)) \} + V \cdot \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ] \le t K - \epsilon^c \sum_{\tau=0}^{t-1} \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(\tau) \big\} - \epsilon^s \sum_{\tau=0}^{t-1} \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} + t V (f^* + g^*) \qquad (38)
$$

By rearranging terms on both sides and dropping non-negative terms such as $\mathbb{E}\{ L(\mathbf{Q}(t)) \}$ and $\mathbb{E}\big\{ Q^c_j(t) \big\}$, with $\epsilon^c, \epsilon^s > 0$ and $V > 0$, we have
$$V \cdot \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ] \le t V (f^* + g^*) + t K + \mathbb{E}\{ L(\mathbf{Q}(0)) \} \qquad (39)$$
$$\epsilon^c \cdot \sum_{\tau=0}^{t-1} \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(\tau) \big\} \le t V (f^* + g^*) + t K + \mathbb{E}\{ L(\mathbf{Q}(0)) \} \qquad (40)$$
$$\epsilon^s \cdot \sum_{\tau=0}^{t-1} \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \le t V (f^* + g^*) + t K + \mathbb{E}\{ L(\mathbf{Q}(0)) \} \qquad (41)$$

Then, dividing both sides of (39) by $V \cdot t$, of (40) by $\epsilon^c \cdot t$, and of (41) by $\epsilon^s \cdot t$, we have
$$\frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ] \le (f^* + g^*) + \frac{K}{V} + \frac{\mathbb{E}\{ L(\mathbf{Q}(0)) \}}{V \cdot t} \qquad (42)$$
$$\frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(\tau) \big\} \le \frac{V (f^* + g^*) + K}{\epsilon^c} + \frac{\mathbb{E}\{ L(\mathbf{Q}(0)) \}}{\epsilon^c \cdot t} \qquad (43)$$
$$\frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \le \frac{V (f^* + g^*) + K}{\epsilon^s} + \frac{\mathbb{E}\{ L(\mathbf{Q}(0)) \}}{\epsilon^s \cdot t} \qquad (44)$$

Finally, taking the limit as $t \to \infty$ in each inequality, we obtain the desired results:
$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[ f(\tau) + g(\tau) ] \le f^* + g^* + \frac{K}{V} \qquad (45)$$
$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(\tau) \big\} \le \frac{V (f^* + g^*) + K}{\epsilon^c} \qquad (46)$$
$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \le \frac{V (f^* + g^*) + K}{\epsilon^s} \qquad (47)$$

By setting $\epsilon = \min\{ \epsilon^c, \epsilon^s \}$, the following desired result holds:
$$\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \Bigg[ \sum_{j \in \mathcal{C}} \mathbb{E}\big\{ Q^c_j(\tau) \big\} + \sum_{i \in \mathcal{S}} \mathbb{E}\{ Q^s_i(\tau) \} \Bigg] \le \frac{K + V \cdot (f^* + g^*)}{\epsilon} \qquad (48)$$