
Towards Optimal Rate Allocation for Data Aggregation

in Wireless Sensor Networks∗

Lu Su, Yan Gao, and Yong Yang

Department of Computer Science

University of Illinois at Urbana-Champaign

Urbana, IL, 61801, USA

{lusu2, yangao3, yang25}@illinois.edu

Guohong Cao

Department of Computer Science & Engineering

The Pennsylvania State University

University Park, PA, 16802, USA

gcao@cse.psu.edu

ABSTRACT

This paper aims at achieving optimal rate allocation for data ag-

gregation in wireless sensor networks. We first formulate this rate

allocation problem as a network utility maximization problem. Due

to its non-convexity, we take a couple of variable substitutions on

the original problem and transform it into an approximate prob-

lem, which is convex. We then apply duality theory to decom-

pose this approximate problem into a rate control subproblem and a

scheduling subproblem. Based on this decomposition, a distributed

algorithm for joint rate control and scheduling is designed, and

proved to approach arbitrarily close to the optimum of the approxi-

mate problem. Finally, we show that our approximate solution can

achieve near-optimal performance through both theoretical analysis

and simulations.

Categories and Subject Descriptors

C.2.1 [Computer-Communication Network]: Network Architec-

ture and Design—Wireless communication; G.1.6 [Mathematics

of Computing]: Optimization—Convex programming

General Terms

Algorithms, Theory

Keywords

Wireless Sensor Networks, Data Aggregation, Rate Allocation,

Scheduling, Network Utility Optimization, Cross Layer Design

1. INTRODUCTION

Data aggregation has been put forward as an essential paradigm

for routing in wireless sensor networks [1]. The idea is to use a

function like average, max or min to combine the data coming from

different sources enroute to eliminate transmission redundancy and

thus save energy as well as bandwidth. In recent years, a large spec-

trum of studies have been conducted on various problems of data

∗This work was supported in part by National Science Foundation

under grant CNS-0916171.


MobiHoc’11, May 16–19, 2011, Paris, France.

Copyright 2011 ACM 978-1-4503-0722-2/11/05 ...$10.00.

aggregation in sensor networks. However, the following fundamental question has not been answered thus far: “Does there exist an optimal rate allocation for data aggregation which maximizes the utilization of network resources and meanwhile maintains certain fairness among all the sources?” This paper gives the answer to this question.

Finding the optimal rate allocation for data aggregation in sen-

sor networks can be regarded as a utility-based resource alloca-

tion problem. In particular, each source is associated with a utility,

which is defined as a function of the source’s sending rate. The

function value can be conceptually regarded as the quality of information provided by the source. For a given aggregation tree, there exists a unique “maximum utility” rate allocation, at which the network resource utilization is optimal. Meanwhile, certain fairness

such as max-min fairness and proportional fairness can be achieved

when we choose appropriate utility functions.

The problem of maximizing the network utilities has been ex-

plored in the context of both wired [2, 3, 4, 5, 6] and wireless [7, 8,

9, 10] networks, for rate control in unicast or multicast. Although

using a similar approach, we show that rate allocation for data ag-

gregation in wireless sensor networks faces unique challenges both

theoretically and practically, making this problem a completely dif-

ferent one to which none of the existing solutions can be applied.

[Figure 1 panels: (a) Topology, with nodes A, B, C, and S; (b) Queues at node C, Scenario I; (c) Queues at node C, Scenario II. Panels (b) and (c) each show three columns, "Packets from A", "Packets from B", and "Aggregated by C", along a timestamp axis running from 1 to 10.]

Figure 1: An example of data aggregation constraint

Challenge I: Theoretically, rate allocation for data aggregation is

not only subject to the network capacity constraint, but also the

data aggregation constraint on the aggregation nodes (i.e., non-

leaf nodes) of the aggregation tree.

Figure 1 provides an intuitive example of the data aggregation

constraint. A simple aggregation tree is shown in Fig. 1(a). In this

case, two source nodes A and B collect and transmit data to node C,

which aggregates the received data and forwards it to the sink S.

Fig. 1(b) and (c) illustrate two different scenarios of data aggrega-

tion. In either scenario, three columns are displayed, which corre-

spond to three queues maintained by node C. The first two queues

store the packets coming from A and B, while the last one (queue

C) keeps the packets generated by aggregating A and B’s packets.


The packets in each of the three queues are sorted by the times-

tamps recorded in their headers. In Fig. 1(b) and (c), the vertical

axis denotes the timestamp, which indicates the time when the car-

ried data packet is collected. In this paper, we assume that only the

data collected at the same time can be aggregated. This assump-

tion is valid in many applications of sensor networks, such as target

localization and fire alarming, since the data collected at the same

time usually contain the information about the same event. For this

reason, an aggregated packet has the same timestamp as the raw

packets involved in its aggregation. Sometimes, a packet coming

from a source node has no coincident packets with the same times-

tamp from other source nodes, such as the packets with timestamp

3 and 7 in queue A. In this case, the aggregation node does nothing

but simply forwards it upwards.

In the first scenario shown in Fig. 1(b), the number of packets stored in queue C is the same as the number of packets in queue A, since the time slots when node B collects data are a subset of the time slots when A collects data. Therefore, to keep the network stable, in other words, to prevent the queue of node C from overflowing, the transmission rate of node C should be no less than A’s rate. However, this does not hold in the second scenario displayed in Fig. 1(c), where the only difference from scenario I is that the timestamps of all the packets in queue B are increased by one, which implies that node B postpones all its data collections by one time slot. Surprisingly, this causes a fundamental change in queue C. In particular, no aggregation can be made, since there are no coincident packets of A and B. As a result, the number of packets in queue C is the sum of the numbers of packets in queues A and B. Therefore, in this scenario, the requirement of stability becomes that C should send no slower than the aggregate rate of A and B.

This example reveals the fact that the transmission rate of an

aggregation node is constrained by not only the rates of its children

but also their packet timestamp patterns. The packet timestamp

pattern of a node includes two components: the intervals between

the timestamps of consecutive packets and the time-offsets of the

packets among the nodes who share the same parent.
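The effect of time-offsets on the aggregated queue can be sketched in a few lines of Python. This is only an illustration: the timestamps below are hypothetical values chosen to mimic the two scenarios (they are not read off Fig. 1), and the helper name `aggregated_timestamps` is an assumption of this sketch. It relies solely on the rule stated above that packets are aggregated only when they carry the same timestamp.

```python
def aggregated_timestamps(*queues):
    """Timestamps in the aggregated queue: one packet per distinct timestamp."""
    merged = set()
    for queue in queues:
        merged |= set(queue)
    return sorted(merged)


# Hypothetical timestamps, chosen only to mimic the two scenarios.
queue_a = [1, 3, 5, 7, 9]
queue_b_scenario_1 = [1, 5, 9]                             # subset of A's slots
queue_b_scenario_2 = [t + 1 for t in queue_b_scenario_1]   # postponed by one slot

queue_c_scenario_1 = aggregated_timestamps(queue_a, queue_b_scenario_1)
queue_c_scenario_2 = aggregated_timestamps(queue_a, queue_b_scenario_2)

# Scenario I: C holds as many packets as A. Scenario II: as many as A plus B.
print(len(queue_c_scenario_1), len(queue_c_scenario_2))
```

With the same per-node rates in both scenarios, only the offset of B changes, yet the load on node C grows from the size of queue A to the combined size of both queues.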

[Figure 2 panels: (a) Topology, with nodes A, B, N, C, and S; (b) Queues at node C, Time t1; (c) Queues at node C, Time t2. Panels (b) and (c) each show three columns, "Packets from A", "Packets from B", and "Aggregated by C", along a timestamp axis running from 1 to 10.]

Figure 2: An example of data availability constraint

Challenge II: Practically, rate allocation for data aggregation has an implicit constraint, which is referred to as the data availability constraint in this paper. Figure 2 gives an illustrative example of this constraint. Similar to the previous example, nodes A and B work as the source nodes. However, B is not directly connected to the aggregation node C; an intermediate node N relays data for B. Suppose at time t1, as shown in Fig. 2(b), A has delivered some data to C, whereas B’s data has not arrived at C since it is delayed at node N. At this moment, although many of A’s packets are waiting in its buffer, node C needs to wait until B’s data arrives at time t2 (for the sake of simplicity, suppose A transmits no data during this period), and only then fulfills its aggregation task, as shown in Fig. 2(c). This is because if C chooses to deliver A’s packets at time t1, it has to do the same job again when B’s packets arrive. As a result, C injects double the amount of traffic into the downstream links. This is not an economical solution from the perspective of network utility maximization.

The main purpose of our work is to address the above challenges.

We first formulate this rate allocation problem as a network utility

maximization problem. Due to its non-convexity, we take a couple

of variable substitutions on the original problem and transform it

into an approximate problem, which is convex. We then apply du-

ality theory to decompose this approximate problem vertically into

a rate control subproblem and a scheduling subproblem that interact

through shadow prices. Based on this decomposition, a distributed

subgradient algorithm for joint rate control and scheduling is de-

signed, and proved to approach arbitrarily close to the optimum of

the approximate problem. Finally, we show that our approximate

solution can achieve near-optimal performance through both the-

oretical analysis and simulations. To the best of our knowledge,

this work is the first one to present a joint design of rate allocation

and scheduling in the context of data aggregation in wireless sensor

networks.

The rest of the paper is organized as follows. Section 2 summa-

rizes the related work. We introduce the system model in Section 3

and formulate the problem in Section 4. In Section 5, the original

problem is transformed into a convex approximate problem, with

the solution given in Section 6. In Section 7, we explain how the

proposed solution is implemented in a decentralized manner. Then,

we discuss some related issues in Section 8, and evaluate the performance of the proposed schemes in Section 9. Section 10 concludes

the paper.

2. RELATED WORK

In this section, we provide brief summaries of the existing work

on sensory data aggregation and network utility maximization re-

spectively, and clarify the novelty of this paper.

Sensory data aggregation has become a research hotspot since the presentation of the seminal work [1]. A large variety of problems regarding this topic have been extensively studied. Representative problems include: how to construct the most energy efficient aggregation tree [11, 12], how to schedule the transmissions of sensor nodes such that the aggregation delay can be minimized [13, 14], how to maximize or bound the lifetime of the aggregation tree [15, 16], how to secure data aggregation [17, 18], how to derive theoretical bounds on aggregation capacity [19, 20], how to achieve fair aggregation [21, 22], etc.

The framework of network utility maximization (NUM) was first

developed in the context of wireline network in [2, 3], followed

by [4, 5, 6]. The main idea of the framework is based on the de-

composition of a system-wide optimization problem. A distributed

congestion control mechanism is designed to drive the solutions of

the decomposed problems towards the global optimum. Later on,

NUM was studied in the context of wireless networks. In wireless

networks, this problem is more difficult because of the interference

nature of wireless communication. To address this challenge, a

scheduling mechanism is integrated into the optimization frame-

work to stabilize the system [7, 8, 9, 10].

This work is the first attempt to utilize the NUM framework to explore the optimal rate allocation for sensory data aggregation. The theoretical and practical challenges described in Section 1 make the problem we target completely different from previous work, and thus none of the existing solutions can be applied.

3. SYSTEM MODEL

In this section, we explain in detail the system model.


3.1 Aggregation Model

We consider an aggregation tree T rooted at the sink. We denote

the set of tree edges (links) by L = L(T). The sensor nodes on

T can be divided into source nodes that collect sensory readings,

and aggregation nodes that aggregate and relay sensory readings.

In the rest of the paper, we assume that the source nodes are only at

the leaf nodes of the aggregation tree. However, it is possible that

a sensor node plays a dual role as both source node and aggrega-

tion node. This problem can be easily addressed through a simple

graph transformation. In particular, we replace each source node at

a non-leaf node by creating a new leaf node and placing the source

node in it, and then connect the new leaf node to the non-leaf node

where the source node is originally located by a link with infinite

capacity¹.

As in most applications of sensor networks, we assume that all

the sensor nodes are equipped with the same sensing and commu-

nicating devices, and furthermore, the maximum collecting rate of

the sensing device is larger than the maximum transmitting rate of

the communicating device. The lifetime of the network is divided

into slots with equal duration, and each slot is further divided into

subslots. The sensing device of each node may or may not col-

lect data at a subslot. Once it decides to work within a subslot,

it always collects data at its maximum speed. The data collected

within a subslot is encapsulated into a packet, which is referred to

as the Basic Transmission Unit (BTU).

3.2 Probabilistic Rate Model

As discussed in Section 1, the transmission rate of an aggregation node is constrained by not only the rates of its children but also their packet timestamp patterns. To characterize the packet timestamp patterns of sensor nodes, we introduce a probabilistic rate model, which is defined and explained as below:

Definition 1. Probabilistic Rate Model. At each subslot, the

sensing device of a node chooses to collect data or not, based on a

probability, which is referred to as the data collection rate of this

node.

Suppose a time slot is composed of 100 subslots, and the data collection rate of a given sensor node is 0.5. Then, it works at roughly 50 randomly selected subslots and sleeps at the rest of time. As a result, it collects around 50 packets (BTU) within a time slot.

We define the data transmission rate of a node as the ratio of the number of packets (BTU) this node delivers within a time slot to the number of subslots in each slot. Data transmission rate is actually a normalized rate. Its relation with the genuine transmission rate, which is equal to the number of bits that a node delivers within a time slot, can be reflected by the formula:

$$x_N = \frac{x_G}{C_B \times N_{sub}},$$

where $x_N$ and $x_G$ denote the normalized and genuine data transmission rate, respectively. In addition, $C_B$ is the size of BTU, and $N_{sub}$ represents the number of subslots in each slot. For example, assume the data transmission rate of a node is 1000bps, $C_B$ is 100bits and $N_{sub}$ is 20. Based on the above formula, the normalized data transmission rate is 0.5.

The probabilistic rate model can be considered as a generalization of a node’s packet timestamp pattern. This can be better understood after we mathematically formulate the problem in the next section. Additionally, in the sensing tasks which require periodic data collection, the probabilistic rate model can capture the long-term expectation of time-varying time-offsets of the packets from different nodes. It is extremely difficult to mathematically model the fixed time-offsets. More detailed discussion can be found in Section 8.2.

¹ In practice, we do not place this link in any independent set, and thus it has no impact on the scheduling problem.
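The rate definitions of this section can be illustrated with a minimal Python sketch. The function names `normalized_rate` and `sample_collection_subslots` are assumptions made for this example; the sampling routine is one straightforward reading of Definition 1 (an independent coin flip per subslot), not necessarily how a sensor node would implement it.

```python
import random


def normalized_rate(genuine_rate_bits, btu_size_bits, subslots_per_slot):
    """Normalized rate x_N = x_G / (C_B * N_sub)."""
    return genuine_rate_bits / (btu_size_bits * subslots_per_slot)


def sample_collection_subslots(collection_rate, subslots_per_slot, rng):
    """Probabilistic rate model: each subslot is used with probability
    `collection_rate`, independently of the other subslots."""
    return [s for s in range(subslots_per_slot) if rng.random() < collection_rate]


# Numbers from the example in the text: 1000 bits per slot, 100-bit BTU, 20 subslots.
x_n = normalized_rate(1000, 100, 20)

# A node with collection rate 0.5 over a slot of 100 subslots.
rng = random.Random(0)
active_subslots = sample_collection_subslots(0.5, 100, rng)
print(x_n, len(active_subslots))
```

The number of active subslots varies from slot to slot around 50, which is the "roughly 50 randomly selected subslots" behavior described above.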

4. PROBLEM FORMULATION

4.1 Terminology

Link flow: We define link flow as the single-hop data traffic going through a link (tree edge) of the aggregation tree. Conceptually, it includes source flow originating at source nodes and aggregation flow coming from aggregation nodes. The set of source flows and the set of aggregation flows are denoted by $F_S$ and $F_A$. In addition, $F = F_S \cup F_A$ represents the set of all the flows on the aggregation tree. For any flow $f \in F$, we use $\pi(f)$ and $C(f)$ to denote its parent flow and set of children flows, respectively. Finally, $f(l)$ implies the flow which goes through a link $l \in L$, and $l(f)$ means the link through which a flow $f \in F$ passes. We denote the normalized rate of each flow $f \in F$ by $x_f$. For a source flow $f \in F_S$, its rate is quantitatively equal to the data collection rate of the corresponding source node. We use $X_f$ to denote the interval in which $x_f$ must lie:

$$X_f := \begin{cases} [m_f, M_f] & \text{if } f \in F_S \\ [0, M_f] & \text{if } f \in F_A \end{cases}$$

where $m_f$ and $M_f$ are the lower and upper bounds of $x_f$.

Queue: At each aggregation node, the packets coming from each of its children flows $f$ is buffered in a queue, denoted by $Q(f)$. The packets in each queue are sorted by their timestamps, as shown in Fig. 1 and Fig. 2. We use $T_t(f)$ to denote the timestamp of the packet on the top of $Q(f)$ (largest timestamp), and $T_b(f)$ to denote the timestamp of the packet at the bottom of $Q(f)$ (smallest timestamp). In addition, the aggregated packets (available data) are stored in a separate queue, waiting for transmission.

4.2 Constraints

In this part, we elaborate on the constraints that our objective function is subject to.

Network Capacity Constraint: Based on the network topology, a conflict graph [23] can be constructed to capture the contention relations among the links. In the conflict graph, each vertex represents a link, and an edge between two vertices implies the contention between the two corresponding links, i.e., they cannot transmit at the same time. Given a conflict graph, we can identify all its independent sets of vertices. The links in an independent set can transmit simultaneously.

Let $I$ denote the set of independent sets. We represent an independent set, $I_i$ ($i = 1, 2, \ldots, |I|$), as a $|L|$-dimensional rate vector, which is $r^i$. In $r^i$, the $l$th entry is

$$r^i_l := \begin{cases} c_l & \text{if } l \in I_i \\ 0 & \text{otherwise} \end{cases}$$

where $c_l$ denotes the capacity of link $l \in L$. Here we should note that this capacity is a normalized capacity, which is defined as the ratio of the maximum number of packets (BTU) that can be delivered through a link within a time slot to the number of subslots in each slot. The feasible capacity region $\Pi$ at the link layer is defined as the convex hull of these rate vectors:

$$\Pi := \Big\{ r \;\Big|\; r = \sum_{i=1}^{|I|} \alpha_i r^i,\ \alpha_i \ge 0,\ \sum_{i=1}^{|I|} \alpha_i = 1 \Big\}.$$

With above notations, now we can formally define the network capacity constraint as follows:

$$x_{f(l)} \le r_l \quad \text{for all } l \in L.$$


Namely, the rate of each flow must not exceed the amount of capacity allocated to the link it passes through. Let $r := \{(r_{l_1}, r_{l_2}, \ldots, r_{l_{|L|}}) \mid l_i \in L\}$, and it should satisfy $r \in \Pi$.

Data Aggregation Constraint: As in the rate control of unicast and multicast scenarios, it is essential to investigate the relationship between the rate of a parent flow and the rates of its children flows so as to stabilize the network. However, the difficulty of achieving this goal in the context of data aggregation is much larger, since a slight change in the packet timestamp pattern of a node may incur significant change in the resulting aggregated packets, as disclosed in Section 1. To overcome this difficulty, we adopt a probabilistic rate model, which is defined in Section 3. Given the rate of a node, this model can generalize all the possibilities of this node’s packet timestamp patterns.

Under the probabilistic rate model, the data aggregation constraint can be formulated as follows:

$$1 - \prod_{f_c \in C(f)} (1 - x_{f_c}) \le x_f \quad \text{for all } f \in F_A.$$

Here we give an interpretation of this constraint which may draw a better understanding. We say a node covers a subslot if there exists a packet transmitted by this node whose timestamp is exactly this subslot. Then, $1 - x_f$ denotes the percentage of the subslots which are not covered by the sending node of $f$. Later, we use the concepts of a flow and its sending node interchangeably. It follows that $\prod_{f_c \in C(f)} (1 - x_{f_c})$ implies the percentage of the subslots not covered by any of $f$’s children nodes. Intuitively, a parent node needs to cover all the subslots covered by at least one child, which is $1 - \prod_{f_c \in C(f)} (1 - x_{f_c})$. By this intuition, the data aggregation constraint is presented.

Data Availability Constraint: From the example shown in Fig. 2, we learn that an aggregation node cannot take any actions before it makes sure that all the packets collected at the same time have arrived. Given a timestamp, after receiving a packet with this timestamp or larger timestamp from each child node, the aggregation node will know that all the packets with this timestamp have arrived. Here we assume that packets arrive in the order of their timestamps. Then, it performs the aggregation and puts the resulting aggregated packet into the queue where the packets available for transmission are stored. Thus, the timestamps of the available packets are constrained within the time interval from the smallest $T_b(f)$ to the smallest $T_t(f)$ among all the queues. Recall that $T_b(f)$ and $T_t(f)$ denote the timestamps of the packets on the bottom and the top of each queue, respectively.

With a binary indicator variable $b_\tau(f)$ defined as below:

$$b_\tau(f) = \begin{cases} 1 & \text{There is a packet with timestamp } \tau \text{ in } Q(f). \\ 0 & \text{otherwise.} \end{cases}$$

In terms of the number of BTUs, we denote the amount of the available data at the sending node of $f$ by $\lambda_f$. $\lambda_f$ can be calculated as follows:

$$\lambda_f = \sum_{\tau = \min_{f_i \in C(f)} T_b(f_i)}^{\min_{f_j \in C(f)} T_t(f_j)} \Big( \bigvee_{f_k \in C(f)} b_\tau(f_k) \Big) \qquad (1)$$

where $\bigvee$ denotes the bitwise operation “OR”.

By this formula, we can easily check that in the example shown in Fig. 2, the amount of the available data at node C (stored in queue C) at time t1 (Fig. 2(b)) and t2 (Fig. 2(c)) are 0 and 6, respectively. Note that in the scenario happens at t2, we do not take into account the packet with timestamp 10 in queue B, since it is still unknown at this moment whether node A also has a packet with timestamp 10, and thus cannot mark this packet to be available.
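Equation (1) can be sketched directly in Python. The function name `available_data` and the queue contents below are hypothetical (the timestamps are not read off Fig. 2), and returning 0 when some child queue is still empty is an assumption of this sketch for the case in which the minimum in the upper limit of the sum is undefined.

```python
def available_data(child_queues):
    """Amount of available data (in BTUs) following Eq. (1).

    child_queues maps each child flow to the list of timestamps in its queue.
    Assumption: if any child queue is empty, nothing can be declared available.
    """
    queues = [set(q) for q in child_queues.values()]
    if any(len(q) == 0 for q in queues):
        return 0
    lower = min(min(q) for q in queues)   # smallest T_b over the children
    upper = min(max(q) for q in queues)   # smallest T_t over the children
    # For each timestamp in [lower, upper], OR the indicators b_tau over children.
    return sum(1 for tau in range(lower, upper + 1)
               if any(tau in q for q in queues))


# Hypothetical queue contents at two instants.
queues_t1 = {"A": [1, 3, 4, 6, 8], "B": []}              # B's data still in transit
queues_t2 = {"A": [1, 3, 4, 6, 8], "B": [2, 4, 8, 10]}   # B's data has arrived

print(available_data(queues_t1), available_data(queues_t2))
```

In the second call the packet with timestamp 10 in queue B is not counted, because the upper limit of the sum is the smallest top-of-queue timestamp, here 8.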

Furthermore, let $a_f$ be the amount of data which can be transmitted by the sending node of $f$. Then, the data availability constraint can be formally defined as follows:

$$a_f \le \lambda_f \quad \text{for all } f \in F_A.$$

In other words, each flow $f \in F_A$ cannot deliver more data than the amount of the available data stored at its sending node. Within a time slot, once the available data are all sent out, the sending node cannot do more transmission even if some of the queues for its children nodes are not empty. The data availability constraint minimizes the amount of packets a node could inject into the network. By this constraint, for each timestamp, there is at most one packet with this timestamp arriving at the sink. More importantly, the data availability constraint is actually the prerequisite of the data aggregation constraint, since the data aggregation constraint implicitly assumes all the packets from different sources collected at the same time are merged into a single packet.
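The data aggregation constraint of this section can be checked numerically under the probabilistic rate model. The sketch below is an illustration only: the function names are assumptions, and the simulation assumes that children choose their subslots independently and that all packets with the same timestamp are merged into one, as the data availability constraint requires.

```python
import math
import random


def min_parent_rate(child_rates):
    """Left-hand side of the data aggregation constraint: 1 - prod(1 - x_fc)."""
    return 1.0 - math.prod(1.0 - x for x in child_rates)


def simulated_parent_rate(child_rates, subslots, rng):
    """Fraction of subslots covered by at least one child."""
    covered = 0
    for _ in range(subslots):
        draws = [rng.random() < x for x in child_rates]
        if any(draws):
            covered += 1
    return covered / subslots


child_rates = [0.3, 0.2]
required = min_parent_rate(child_rates)
estimate = simulated_parent_rate(child_rates, 20000, random.Random(7))
print(required, estimate)
```

The empirical fraction of covered subslots stays close to $1 - (1 - 0.3)(1 - 0.2) = 0.44$, which is the rate the parent flow has to sustain.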

4.3 Problem Formulation

With the terminologies and constraints defined above, we can now formulate the problem to be solved. We associate each source flow $f \in F_S$ with a utility function $U_f(x_f) : \mathbb{R}_+ \to \mathbb{R}_+$. In this paper, we assume $U_f$ is continuously differentiable, increasing, and strictly concave. Our objective is to choose the rate of each flow $x_f$ and the allocated capacity of each link $r_l$ so as to maximize the aggregate utility function. We now formulate the problem of optimal rate allocation for data aggregation in sensor networks as the following constrained nonlinear optimization problem:

$$P: \quad \max \sum_{f \in F_S} U_f(x_f) \qquad (2)$$

subject to

$$x_{f(l)} \le r_l \quad \text{for all } l \in L \qquad (3)$$

$$a_f \le \lambda_f \quad \text{for all } f \in F_A \qquad (4)$$

$$1 - \prod_{f_c \in C(f)} (1 - x_{f_c}) \le x_f \quad \text{for all } f \in F_A \qquad (5)$$

$$x \in X, \quad r \in \Pi$$

In P, the data availability constraint (4) works as the prerequi-

site of the data aggregation constraint (5). However, it is actually

an implicit constraint that need not be considered when solving this

optimization problem, although in practice each aggregation node

works following this constraint. We will give more detailed expla-

nation on this point in Section 8.1.

By choosing appropriate utility functions, the optimal rate allocation can achieve different fairness models among the flows [2, 3]. For instance, if we let $U_f(x_f) = w_f \ln(x_f)$ for $f \in F_S$, the weighted proportional fairness can be achieved.
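To make problem P concrete, the following sketch evaluates objective (2) and checks constraints (3) and (5) for a toy instance shaped like the tree of Fig. 1 (sources A and B, aggregation node C). Everything specific here is an assumption of the example: the three links are taken to conflict pairwise, so every independent set contains a single link and a point of Π is described by time shares `alpha`; each link is named after the flow it carries; all capacities are 1.0; and the bounds in X and the implicit constraint (4) are ignored.

```python
import math


def objective(source_rates, weights):
    """Objective (2) with weighted proportional-fair utilities w_f * ln(x_f)."""
    return sum(weights[f] * math.log(x) for f, x in source_rates.items())


def is_feasible(x, alpha, capacity, children, tol=1e-12):
    """Check constraints (3) and (5) for single-link independent sets."""
    shares = list(alpha.values())
    if any(a < -tol for a in shares) or abs(sum(shares) - 1.0) > 1e-9:
        return False                       # alpha must be a convex combination
    for link, rate in x.items():
        if rate > alpha[link] * capacity[link] + tol:
            return False                   # violates x_f(l) <= r_l
    for parent, kids in children.items():
        needed = 1.0 - math.prod(1.0 - x[k] for k in kids)
        if x[parent] + tol < needed:
            return False                   # violates the aggregation constraint
    return True


capacity = {"A": 1.0, "B": 1.0, "C": 1.0}
children = {"C": ["A", "B"]}
alpha = {"A": 0.25, "B": 0.25, "C": 0.5}

good = {"A": 0.2, "B": 0.2, "C": 0.36}     # C covers 1 - 0.8 * 0.8 = 0.36
bad = {"A": 0.2, "B": 0.2, "C": 0.30}      # C is too slow to drain its queue

weights = {"A": 1.0, "B": 1.0}
value = objective({"A": good["A"], "B": good["B"]}, weights)
print(is_feasible(good, alpha, capacity, children),
      is_feasible(bad, alpha, capacity, children), value)
```

Only the source flows contribute to the objective; the aggregation flow C appears solely through the constraints.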

5. APPROXIMATE PROBLEM

5.1 Variable Substitution

Though we formulate the problem of optimal rate allocation, it turns out to be a non-convex program, due to the non-convexity of the data aggregation constraint (5). To address this problem, we reorganize the data aggregation constraint and take a log transform on both sides: $\ln(1 - x_f) \le \sum_{f_c \in C(f)} \ln(1 - x_{f_c})$. Next, we substitute $x_f$ of each flow by $\tilde{x}_f = -\ln(1 - x_f)$, where we call $\tilde{x}_f$ the transformed rate of $f$. By this variable substitution, the data aggregation constraint becomes: $\sum_{f_c \in C(f)} \tilde{x}_{f_c} \le \tilde{x}_f$. In the rest of this paper, we refer to this constraint as the transformed aggregation constraint. In addition, based on the feasible region of $x_f$ ($f \in F$), we can derive the feasible region of $\tilde{x}_f$ as

$$\tilde{X}_f := \begin{cases} [-\ln(1 - m_f), -\ln(1 - M_f)] & f \in F_S \\ [0, -\ln(1 - M_f)] & f \in F_A \end{cases}$$

where $-\ln(1 - m_f) \ge 0$ and $-\ln(1 - M_f) < \infty$.

By the variable substitution described above, we transform the data aggregation constraint into a linear constraint. However, this substitution has a side effect, namely, it turns the network capacity constraint into $1 - \exp(-\tilde{x}_{f(l)}) - r_l \le 0$, another non-convex constraint. To overcome this problem, we reorganize this constraint and take a log transform on both sides, then we have $\tilde{x}_{f(l)} \le -\ln(1 - r_l)$. Next, we take another variable substitution on $r_l$: $\tilde{r}_l = -\ln(1 - r_l)$ where we call $\tilde{r}_l$ the transformed allocated capacity of link $l \in L$. By this variable substitution, the non-convex constraint is transformed into $\tilde{x}_{f(l)} - \tilde{r}_l \le 0$. We name this constraint as the transformed capacity constraint. Recall that $r_l$ is a normalized capacity allocated to link $l$, and thus it satisfies $0 \le r_l \le 1$. Since $r_l = 1 - \exp(-\tilde{r}_l)$, it can be derived that $0 \le \tilde{r}_l \le \infty$. By substituting $\tilde{r}_l$ for $r_l$, the original capacity region $\Pi$ is transformed into a transformed capacity region $\Pi'$:

$$\Pi' := \{ \tilde{r} \mid \tilde{r}_{l_i} = -\ln(1 - r_{l_i}),\ i = 1, 2, \ldots, |L|,\ r \in \Pi \}$$

However, $\Pi'$ is not a convex region. Figure 3 illustrates this region transformation. Particularly, Fig. 3(a) shows an example of two-dimensional capacity region $\Pi$, and the transformed capacity region $\Pi'$ is drawn in Fig. 3(b). As can be seen, $\Pi'$ (the shaded area) is not a convex region.
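The claim that the substitution $\tilde{x}_f = -\ln(1 - x_f)$ turns the data aggregation constraint (5) into a linear one can be checked on random samples. The sketch below is only a sanity check, not a proof; the function names and the sampling ranges are assumptions, and samples that sit numerically on the boundary of either constraint are skipped to avoid floating-point ties.

```python
import math
import random


def transform(x):
    """Transformed rate x_tilde = -ln(1 - x), defined for 0 <= x < 1."""
    return -math.log(1.0 - x)


def aggregation_margin(parent_rate, child_rates):
    """x_f - (1 - prod(1 - x_fc)); non-negative iff constraint (5) holds."""
    return parent_rate - (1.0 - math.prod(1.0 - x for x in child_rates))


def transformed_margin(parent_rate, child_rates):
    """x_tilde_f - sum(x_tilde_fc); non-negative iff the linear form holds."""
    return transform(parent_rate) - sum(transform(x) for x in child_rates)


def count_disagreements(trials, rng, eps=1e-9):
    """Count samples on which the two forms of the constraint disagree."""
    disagreements = 0
    for _ in range(trials):
        kids = [rng.uniform(0.0, 0.9) for _ in range(rng.randint(2, 4))]
        parent = rng.uniform(0.0, 0.99)
        original = aggregation_margin(parent, kids)
        linear = transformed_margin(parent, kids)
        if abs(original) < eps or abs(linear) < eps:
            continue  # numerically borderline sample
        if (original > 0) != (linear > 0):
            disagreements += 1
    return disagreements


disagreements = count_disagreements(10_000, random.Random(1))
print(disagreements)
```

Both forms accept and reject the same samples, as expected from the monotonicity of the logarithm.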

[Figure 3 panels: (a) $\Pi$; (b) $\Pi'$; (c) $\tilde{\Pi}$; (d) $\tilde{\Pi}'$.]

Figure 3: Region transformation

To tackle this problem, we constitute an approximate transformed capacity region $\tilde{\Pi}$, which is convex. Recall that the original capacity region $\Pi$ is actually the convex hull of the rate vectors of all the independent sets. In fact, these rate vectors are the extreme points of $\Pi$, since each of them cannot be represented by the convex combination of others. In the transformed capacity region $\Pi'$, let $\tilde{r}^i$ denote the point (vector) transformed from the $i$th extreme point $r^i$ (rate vector of $i$th independent set) in $\Pi$. In $\tilde{r}^i$, the $l$th entry is

$$\tilde{r}^i_l := \begin{cases} \tilde{c}_l & \text{if } l \in I_i \\ 0 & \text{otherwise} \end{cases}$$

where $\tilde{c}_l$ is referred to as the transformed capacity, and defined by $\tilde{c}_l = -\ln(1 - c_l)$.

Now, we can define the approximate transformed capacity region $\tilde{\Pi}$ as the convex hull of these transformed rate vectors:

$$\tilde{\Pi} := \Big\{ \tilde{r} \;\Big|\; \tilde{r} = \sum_{i=1}^{|I|} \alpha_i \tilde{r}^i,\ \alpha_i \ge 0,\ \sum_{i=1}^{|I|} \alpha_i = 1 \Big\}.$$

It is not difficult to prove that each $\tilde{r}^i$, $i = 1, 2, \ldots, |I|$ cannot be represented by the convex combination of others either, and thus is an extreme point of $\tilde{\Pi}$. Therefore, for each ($i$th) independent set, there is a one-to-one mapping between its corresponding extreme points in $\Pi$ (i.e., $r^i$) and $\tilde{\Pi}$ (i.e., $\tilde{r}^i$).

Figure 3(c) shows the approximate capacity region $\tilde{\Pi}$ (the shaded area), which corresponds to the original capacity region $\Pi$ in Fig. 3(a). As can be seen, despite of the convexity it achieves, it does not cover all the points of the transformed capacity region $\Pi'$ (the area enclosed by the dashed curve). Furthermore, it includes some points outside the boundary of $\Pi'$, and this implies that $\tilde{\Pi}$ may result in some solutions which are not feasible in the original problem. Actually, if we take a reverse variable substitution (i.e., $r_l = 1 - \exp(-\tilde{r}_l)$) on each point $\tilde{r} \in \tilde{\Pi}$, a new region denoted by $\tilde{\Pi}'$ is attained, and shown in Fig. 3(d) (the shaded area). As one can see, it does have some points outside the original capacity region $\Pi$ (the area enclosed by the dashed curve).

However, in our algorithm that will be introduced in the next section, we do not map the solution in $\tilde{\Pi}$ to $\Pi$ in this way, namely, through $r_l = 1 - \exp(-\tilde{r}_l)$. Instead, we design a safe mapping scheme, which can guarantee that there always exists a feasible point in $\Pi$, which corresponds to the solution attained in the context of $\tilde{\Pi}$.

Based on the definition of $\tilde{\Pi}$, any point in $\tilde{\Pi}$, say $\tilde{r}^0$, can be expressed as $\tilde{r}^0 = \sum_{i=1}^{|I|} \alpha_i \tilde{r}^i$. Its counterpart in the original problem is $r^G := \{(r^G_l) \mid r^G_l = 1 - \exp(-\tilde{r}^0_l),\ l \in L\}$, and it may not be located inside $\Pi$. However, in $\Pi$, we can always find a point, which is $r^L = \sum_{i=1}^{|I|} \alpha_i r^i$ where each $\alpha_i$ equals the $\alpha_i$ in $\tilde{r}^0 = \sum_{i=1}^{|I|} \alpha_i \tilde{r}^i$. By this mapping scheme, wherever the optimal solution is located in $\tilde{\Pi}$, our algorithm can identify a corresponding feasible solution inside $\Pi$. In the rest of this paper, we refer to $r^G$ as the genuine mapping point of $\tilde{r}^0$, and $r^L$ as the linear mapping point of $\tilde{r}^0$. Similarly, given a point $r^0 = \sum_{i=1}^{|I|} \alpha_i r^i$ inside $\Pi$, we can define $\tilde{r}^G := \{(\tilde{r}^G_l) \mid \tilde{r}^G_l = -\ln(1 - r^0_l),\ l \in L\}$ and $\tilde{r}^L = \sum_{i=1}^{|I|} \alpha_i \tilde{r}^i$ as the genuine mapping point and the linear mapping point of $r^0$ in $\tilde{\Pi}$.

Now, we can formally define the approximate problem as follows:

$$\tilde{P}: \quad \max \sum_{f \in F_S} U_f(1 - \exp(-\tilde{x}_f)) \qquad (6)$$

subject to

$$\tilde{x}_{f(l)} - \tilde{r}_l \le 0 \quad \text{for all } l \in L \qquad (7)$$

$$\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f \le 0 \quad \text{for all } f \in F_A \qquad (8)$$

$$\tilde{x} \in \tilde{X}, \quad \tilde{r} \in \tilde{\Pi}$$

According to [24] (Chapter 3.2.4), since $1 - \exp(-\tilde{x}_f)$ is a strictly concave and increasing function, the objective function (6) remains strictly concave and increasing. Thus, $\tilde{P}$ is a convex problem, and always has a unique maximizer. Once we identify this maximizer, we can use its linear mapping point in $\Pi$ as the approximate solution of $P$.

5.2 Approximation Analysis

In this subsection, we provide some theoretical analysis on both the original problem $P$ and the approximate problem $\tilde{P}$.

Theorem 1. The optimal solution of $P$ ($\tilde{P}$) must be attained on the boundary of $\Pi$ ($\tilde{\Pi}$).

PROOF. Here we only show the proof for $P$, since the proof for $\tilde{P}$ is similar. By contradiction, suppose the optimal solution of $P$, denoted by $r^*$, is a strictly interior point of $\Pi$². Since in $\Pi$, the components of $r^*$ only appear in the network capacity constraint (i.e., $x_{f(l)} \le r_l$), we do not need to check other constraints. At optimality, the network capacity constraint may or may not be active (we say a constraint is active if it attains equality). If it is not active, $x_{f(l)}$ will not change if we increase $r_l$. On the other hand, if it is active, $x_{f(l)}$ will go up to some extent with the increase of $r_l$. As a result, the objective value will be improved, since it is strictly increasing with $x_{f(l)}$. Since $r^*$ is an interior point, there must exist some room to increase some components of $r^*$, without changing the others. This contradicts the assumption that $r^*$ is the optimal solution. Therefore, $r^*$ must be located on the boundary of $\Pi$.

In Π (?Π), which is a compact |L|-dimensional polyhedron, each

points. Thus, the optimal solution of P (?P) can be expressed as

solution, which is the linear mapping point of ˜ r∗, is also located at

the boundary of Π.

Now, we are interested in how far our approximate solution is

from the optimal solution. In other words, we want to know the

difference between our approximate objective value and the opti-

mal objective value. We first introduce some notations. For any

point r0 (˜ r0) in Π (?Π), we define P(r0) (?P(˜ r0)) as the optimiza-

P∗(r0) (?P∗(˜ r0)) be the optimal objective value of P(r0) (?P(˜ r0)).

P∗= P∗(r∗) (?P∗=?P∗(˜ r∗)) to denote the global optimal objec-

Then, we investigate the performance of the approximate solu-

tion. Suppose the objective value of our approximate solution is

ˆP∗. In the rest of this section, we first show that the difference be-

tween the global optimal objective value of P (i.e., P∗) andˆP∗is

bounded by?P∗−ˆP∗through Theorem 2, and then give a looser

Theorem 2. The optimal objective value of the original prob-

lem P is upper bounded by the optimal objective value of the ap-

proximate problem?P.

in Fig. 3(a). Thus, r∗can be expressed as r∗=

Suppose its genuine mapping point in?Π is ˜ rG. As can be seen

ways find the linear mapping point of r∗, which is denoted by ˜ rL

and shown in Fig. 3(b). Since the function f(x) = −ln(1 − x)

is strictly convex, it can be derived that for each l ∈ L, ˜ rG

−ln(1 − r∗

Similar to the proof of Theorem 1, we can show that?P∗(˜ rG) ≤

P∗= ?P∗(˜ rG) and?P∗(˜ rL) ≤ ?P∗, it can be concluded that

By Theorem 2, the approximation ratio of our solution can be

bounded by

ˆ P∗

P∗−ˆP∗.

Theorem 3. Suppose that the optimal solution of?P is ˜ r∗, and

2In fact, a solution also includes the rate x, here we only consider

the capacity r simply for the ease of expression.

facet of its boundary is defined by the convex hull of |L| extreme

r∗=?|L|

i=1αiri(˜ r∗=?|L|

i=1αi˜ ri). In addition, the approximate

tion problem P (?P) when r (˜ r) is fixed to be r0(˜ r0). In addition, let

Suppose r∗(˜ r∗) is the global optimal solution in Π (?Π), we use

tive value of P (?P).

but simpler bound by Theorem 3.

PROOF. Let the point in Π which maximizes P be r∗, as shown

?|L|

i=1αiri.

in Fig. 3(b), it may not be inside?Π. However, in?Π, we can al-

l

=

l.

l) ≤

?|L|

i=1αi(−ln(1 − ri

l)) =

?|L|

i=1αi˜ ri

l= ˜ rL

?P∗(˜ rL) by moving each component of ˜ rGtowards ˜ rL. Since

P∗≤?P∗.

?P∗−ˆ P∗

. Next, we give a looser but simpler bound of

its linear mapping point in Π, i.e., the approximate solution is rL.

Furthermore, let ˜ r0 be rL’s corresponding genuine mapping point

in?Π. Then, the value of P∗−ˆP∗is bounded by µα∗T(˜ r∗− ˜ r0),

problem?P(˜ r0).

is shown in Fig. 3(d). Since rLis the approximate solution, by The-

orem 2, P∗− P∗(rL) is bounded by?P∗(˜ r∗) − P∗(rL), which is

tion and sensitivity (Chapter 5.6 in [24]), we denote the perturbed

version of the optimization problem?P(˜ r0) by?P˜ r0, in which the

Here u := (ul,l ∈ L) is the vector of perturbation variables. It is

evident that?P˜ r0coincides with problem?P(˜ r0) when u is a zero

relaxed the transformed capacity constraint of link l.

We denote the optimal objective value of?P˜ r0at u by?P∗

cave,?P∗

tween P∗and P∗(rL) can be bounded as follows: P∗−P∗(rL) ≤

?P∗(˜ r∗) −?P∗(˜ r0) =?P∗

Let f(x) = −ln(1 − x), and thus f−1(y) = 1 − exp(−y).

As previously discussed, ˜ r∗=?|L|

tation of a vector to delegate all of its components.) and rL=

?|L|

˜ r0 = f(rL), it can be inferred that ˜ r∗≥ ˜ r0(i.e., ˜ r∗

fore, each component of ˜ r∗−˜ r0is nonnegative. Furthermore, since

˜ r0 = f(?|L|

where µα∗represents the vector of the optimal dual variables of

PROOF. Ascanbeseen, ˜ r∗and ˜ r0areshowninFig.3(c), andrL

further equal to?P∗(˜ r∗)−?P∗(˜ r0). Based on the theory of perturba-

transformed capacity constraint is replaced by ˜ xf(l)− ˜ r0l ≤ ul.

vector. On the other hand, when ulis positive it means that we have

˜ r0(u).

According to [24] (Chapter 5.6.1), since problem?P(˜ r0) is con-

?P∗

˜ r0).

˜ r0(u) is a concave function of u. It follows that?P∗

˜ r0(u) ≤

˜ r0(0) + µα∗Tu. Therefore, let u = ˜ r∗− ˜ r0, the difference be-

˜ r0(u) −?P∗

˜ r0(0) ≤ µα∗Tu = µα∗T(˜ r∗−

i=1αi˜ ri. It follows that rG=

f−1(?|L|

i=1αiri. Similar to the proof of Theorem 2, it can be proved

that rG≥ rL(i.e., rG

i=1αi˜ ri) (for the sake of simplicity, here we use the no-

l ≥ rL

l). Since f(x) is strict increasing, and

l≥ ˜ r0l). There-

i=1αif−1(˜ ri)), µα∗T(˜ r∗−˜ r0) is a function of µα∗and

˜ ri, i = 1,2,...,|L|.

From this bound, it can be seen that the bound on $P^* - \hat{P}^*$ is proportional to the difference between $\tilde{r}^*$ and $\tilde{r}^0$. Actually, it is not difficult to show that when the capacity of each link decreases, the difference between $\tilde{r}^*$ and $\tilde{r}^0$ will drop accordingly. However, this does not necessarily mean that $\mu^{\alpha*T}(\tilde{r}^* - \tilde{r}^0)$ will also drop, since $\mu^{\alpha*}$ may increase with the decrease of capacities³. In fact, $\mu^{\alpha*}$ depends on the particular utility function we choose, and thus there is no universal conclusion on this point. In Section 9, we will show an example in which $P^* - \hat{P}^*$ drops when the capacity of each link is reduced.
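The two mapping points used in Theorems 2 and 3 can be illustrated numerically. The sketch below is only an illustration under assumed inputs: the two extreme points and the weights are hypothetical, and the combined point is not claimed to be the optimum of either problem. It checks that the gap between the convex combination in the transformed space and the genuine mapping of the linear mapping point is nonnegative on every link, as Jensen's inequality for the convex function f(x) = −ln(1 − x) requires.

```python
import math

def f(x):
    """Variable substitution used in the paper: f(x) = -ln(1 - x)."""
    return -math.log(1.0 - x)

def f_inv(y):
    """Inverse substitution: f^{-1}(y) = 1 - exp(-y)."""
    return 1.0 - math.exp(-y)

def combine(points, weights):
    """Convex combination of points, computed per link."""
    n_links = len(points[0])
    return tuple(sum(w * p[l] for w, p in zip(weights, points))
                 for l in range(n_links))

# Hypothetical two-link example (assumption, not taken from the paper).
extreme_points = [(0.5, 0.0), (0.0, 0.5)]   # r^1, r^2 in the original space
alphas = (0.5, 0.5)                          # convex weights alpha_i

tilde_points = [tuple(f(c) for c in p) for p in extreme_points]

r_lin = combine(extreme_points, alphas)        # r^L = sum_i alpha_i r^i
r_tilde = combine(tilde_points, alphas)        # sum_i alpha_i tilde r^i
r_tilde_0 = tuple(f(c) for c in r_lin)         # genuine mapping of r^L
r_gen = tuple(f_inv(c) for c in r_tilde)       # r^G = f^{-1}(sum_i alpha_i tilde r^i)

# Per-link gap that appears in the bound of Theorem 3.
gap = tuple(a - b for a, b in zip(r_tilde, r_tilde_0))
print("r^L =", r_lin, " r^G =", r_gen, " gap =", gap)
```

With smaller extreme points (lower capacities) the gap shrinks, which matches the discussion of the bound above.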

³ For more detailed explanation on $\mu^{\alpha*}$, please refer to Section 6.

6. CROSS LAYER DESIGN VIA DUAL DECOMPOSITION

6.1 The Dual Problem

Solving $\tilde{P}$ directly requires global coordination of all flows, which is impractical in a distributed environment such as sensor networks. Since $\tilde{P}$ is a convex program with compact feasible region, strong duality can be achieved⁴ (Chapter 5.2.3 in [24]). Therefore, there exists a unique maximizer $(\tilde{x}^*, \tilde{r}^*)$ for $\tilde{P}$, which can be attained by a distributed algorithm derived via formulating and solving the Lagrange dual problem of $\tilde{P}$. In order to achieve this, we first take a look at the Lagrangian of $\tilde{P}$:

$$
L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta}) = \sum_{f \in F_S} U_f\bigl(1 - \exp(-\tilde{x}_f)\bigr) - \sum_{l \in L} \mu^{\alpha}_l \bigl(\tilde{x}_{f(l)} - \tilde{r}_l\bigr) - \sum_{f \in F_A} \mu^{\beta}_f \Bigl(\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f\Bigr).
$$

⁴ Slater's condition can be guaranteed by assuming there exist vectors $\tilde{x} \in \tilde{X}$ and $\tilde{r} \in \tilde{\Pi}$ which satisfy all the constraints, i.e., strictly feasible points exist.

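In $L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta})$, $\mu^{\alpha} := (\mu^{\alpha}_l, l \in L)$ and $\mu^{\beta} := (\mu^{\beta}_f, f \in F)$ are vectors of Lagrangian multipliers, corresponding to the transformed capacity constraint (7) and the transformed aggregation constraint (8), respectively. They are also interpreted as the "shadow prices" of the constraints, which can be understood as the "costs" a flow will be charged if it violates the constraints.

Since it can be derived that

$$
\begin{aligned}
\sum_{f \in F_A} \mu^{\beta}_f \Bigl(\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f\Bigr)
&= \sum_{f \in F_A} \mu^{\beta}_f \sum_{f_c \in C(f)} \tilde{x}_{f_c} - \sum_{f \in F_A} \mu^{\beta}_f \tilde{x}_f \\
&= \sum_{f \in F} \mu^{\beta}_{\pi(f)} \tilde{x}_f - \sum_{f \in F_A} \mu^{\beta}_f \tilde{x}_f \\
&= \sum_{f \in F_S} \mu^{\beta}_{\pi(f)} \tilde{x}_f + \sum_{f \in F_A} \bigl(\mu^{\beta}_{\pi(f)} - \mu^{\beta}_f\bigr) \tilde{x}_f
\end{aligned}
$$

and $\sum_{l \in L} \mu^{\alpha}_l \tilde{x}_{f(l)} = \sum_{f \in F} \mu^{\alpha}_{l(f)} \tilde{x}_f$, we reorganize the Lagrangian as follows: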

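$$
\begin{aligned}
L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta}) ={}& \sum_{f \in F_S} \Bigl[ U_f\bigl(1 - \exp(-\tilde{x}_f)\bigr) - \bigl(\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}\bigr) \tilde{x}_f \Bigr] \\
&+ \sum_{f \in F_A} \Bigl[ \bigl(-\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f\bigr) \tilde{x}_f \Bigr] + \sum_{l \in L} \mu^{\alpha}_l \tilde{r}_l.
\end{aligned}
$$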

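The dual of the primal problem $\tilde{P}$ is:

$$
\tilde{D}:\quad \min_{\mu^{\alpha}, \mu^{\beta} \ge 0} D(\mu^{\alpha}, \mu^{\beta}),
$$

where the dual objective function $D(\mu^{\alpha}, \mu^{\beta})$ is given as

$$
D(\mu^{\alpha}, \mu^{\beta}) := \max_{\tilde{x} \in \tilde{X},\ \tilde{r} \in \tilde{\Pi}} L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta}).
$$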

In the dual objective function, the Lagrangian multipliers

(shadow prices) µαand µβ, serve as the dual variables. Further-

more, D(µα,µβ) can be decomposed into two separate optimiza-

tion problems: D(µα,µβ) = D1(µα,µβ)+D2(µα). D1(µα,µβ)

and D2(µα) are defined below:

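$$
\begin{aligned}
D_1(\mu^{\alpha}, \mu^{\beta}) :={}& \max_{\tilde{x} \in \tilde{X}} \ \sum_{f \in F_S} \Bigl[ U_f\bigl(1 - \exp(-\tilde{x}_f)\bigr) - \bigl(\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}\bigr) \tilde{x}_f \Bigr] \\
&+ \sum_{f \in F_A} \Bigl[ \bigl(-\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f\bigr) \tilde{x}_f \Bigr] \\
D_2(\mu^{\alpha}) :={}& \max_{\tilde{r} \in \tilde{\Pi}} \ \sum_{l \in L} \mu^{\alpha}_l \tilde{r}_l
\end{aligned}
$$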

Among them, D1(µα,µβ) denotes the rate allocation problem,

while D2(µα) is the scheduling problem. In particular, the rate al-

location problem aims at finding the rate of each source node that

maximizes the aggregate utilities of all sources, subject to the con-

straint that the system is stable under some scheduling policy, while

the scheduling problem focuses on finding a scheduling policy that

stabilizes the system, for any rate vector of sources picked by the

rate allocation problem. In the rest of this section, we will first

elaborate on these two problems separately, and then explain how

to develop a cross-layer joint design of them.

6.2 Interpretation of the Prices

Before proceeding with the decoupled problems, we first provide

detailed explanation on the aforementioned shadow prices µαand

µβ. Theoretically, these prices represent the “costs" a flow will

be charged if it violates the constraints. In practice, they imply

the congestion information that the network elements need to share

with each other, so that the traffic rates on different links of the

network can be adjusted appropriately.

Let us first take a look at $\mu^{\alpha}$, which corresponds to the transformed capacity constraint in $\tilde{P}$. When a flow $f$ violates this constraint (i.e., $\tilde{x}_{f(l)} > \tilde{r}_l$), if $f$ increases its rate by $d\tilde{x}_{f(l)}$, a cost of $\mu^{\alpha}_l\, d\tilde{x}_{f(l)}$ should be charged. Next, we give a practical interpretation of this cost in the context of the original problem $P$. Since $\tilde{x}_{f(l)} = -\ln(1 - x_{f(l)})$, it can be derived that $d\tilde{x}_{f(l)} = \frac{1}{1 - x_{f(l)}}\, dx_{f(l)}$. Therefore, the cost charged with respect to $x_{f(l)}$ is $\mu^{\alpha}_l\, d\tilde{x}_{f(l)} = \mu^{\alpha}_l \frac{1}{1 - x_{f(l)}}\, dx_{f(l)}$.

In this paper, we call $\mu^{\alpha}_l \frac{1}{1 - \hat{x}_{f(l)}}$ the link price of flow $f$ when it passes data at a rate of $x_{f(l)} = \hat{x}_{f(l)}$. With link price, when a flow $f$ violates the network capacity constraint (3) in the original problem $P$, i.e., $x_{f(l)} > r_l$, the total cost it needs to pay can be calculated as follows:

$$
\int_{r_l}^{x_{f(l)}} \mu^{\alpha}_l \frac{1}{1 - \hat{x}_{f(l)}}\, d\hat{x}_{f(l)} = \mu^{\alpha}_l \bigl(\tilde{x}_{f(l)} - \tilde{r}_l\bigr).
$$

As can be seen, it is quantitatively equal to the cost calculated in the context of $\tilde{P}$, which is $\mu^{\alpha}_l (\tilde{x}_{f(l)} - \tilde{r}_l)$.

On the other hand, $\mu^{\beta}$ corresponds to the transformed aggregation constraint in $\tilde{P}$. When the aggregate transformed rates of $f$'s children flows are larger than $f$'s transformed rate (i.e., $\sum_{f_c \in C(f)} \tilde{x}_{f_c} > \tilde{x}_f$), the total cost paid by them to $f$ for its efforts of aggregating packets is $\mu^{\beta}_f (\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f)$.

At optimality, according to the Karush-Kuhn-Tucker conditions, only the prices corresponding to active constraints are positive, which implies the price of an uncongested link is zero.

6.3 The Rate Allocation Problem

The rate allocation problem can be further divided as follows:

$$
D_1(\mu^{\alpha}, \mu^{\beta}) = \sum_{f \in F_S} \max_{\tilde{x}_f \in \tilde{X}_f} \Phi(\tilde{x}_f) + \sum_{f \in F_A} \max_{\tilde{x}_f \in \tilde{X}_f} \Psi(\tilde{x}_f)
$$

where

$$
\begin{aligned}
\Phi(\tilde{x}_f) &= U_f\bigl(1 - \exp(-\tilde{x}_f)\bigr) - \bigl(\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}\bigr) \tilde{x}_f \\
\Psi(\tilde{x}_f) &= \bigl(-\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f\bigr) \tilde{x}_f.
\end{aligned}
$$

In other words, the rate allocation problem can be solved through separately solving the optimization problem of each source flow (i.e., $\max_{\tilde{x}_f \in \tilde{X}_f} \Phi(\tilde{x}_f)$), and each aggregation flow (i.e., $\max_{\tilde{x}_f \in \tilde{X}_f} \Psi(\tilde{x}_f)$). Recall that $\mu^{\alpha}$ and $\mu^{\beta}$ are the costs a flow will be charged if it violates the constraints. $\Phi(\tilde{x}_f)$ and $\Psi(\tilde{x}_f)$ actually represent the "net benefit" of a flow.

Let us first study the optimization problem of each source flow $f \in F_S$. As previously discussed, $U_f(1 - \exp(-\tilde{x}_f))$ is strictly concave and twice continuously differentiable. Consequently, $\Phi(\tilde{x}_f)$ is strictly concave and smooth, and thus has a unique maximizer when $\frac{d\Phi(\tilde{x}_f)}{d\tilde{x}_f} = 0$. Thus, given a valid utility function, the optimal solution can be easily identified. For example, assume $U_f(\cdot) = \ln(\cdot)$; it follows that

$$
\frac{d\Phi(\tilde{x}_f)}{d\tilde{x}_f} = \frac{\exp(-\tilde{x}_f)}{1 - \exp(-\tilde{x}_f)} - \bigl(\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}\bigr) = 0,
$$

from which the maximizer can be solved as below:

$$
\tilde{x}^*_f = -\ln\left( \frac{\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}}{\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)} + 1} \right).
$$

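When taking into account the feasible range of $\tilde{x}_f$, which is $\tilde{X}_f = [-\ln(1 - m_f), -\ln(1 - M_f)]$, the optimal value of $\tilde{x}_f$ given $\mu^{\alpha}$ and $\mu^{\beta}$ should be

$$
\tilde{x}_f(\mu^{\alpha}, \mu^{\beta}) = \arg\max_{\tilde{x}_f \in \tilde{X}_f} \Phi(\tilde{x}_f) =
\begin{cases}
\tilde{x}^*_f & \text{if } -\ln(1 - m_f) \le \tilde{x}^*_f \le -\ln(1 - M_f) \\
-\ln(1 - m_f) & \text{if } \tilde{x}^*_f < -\ln(1 - m_f) \\
-\ln(1 - M_f) & \text{if } \tilde{x}^*_f > -\ln(1 - M_f)
\end{cases}
\qquad (9)
$$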
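The source-flow rule in Eqn. (9) can be sketched in a few lines for the logarithmic utility used in the example above. This is a minimal illustration, not the authors' implementation: the function name and argument names are hypothetical, and the handling of a zero price (returning the upper bound, since the objective is then increasing) is an assumption added to avoid taking the logarithm of zero.

```python
import math

def source_rate(price, m_f, M_f):
    """Transformed rate of a source flow for U_f = ln, following Eqn. (9).

    price : mu_alpha_{l(f)} + mu_beta_{pi(f)}, assumed nonnegative
    m_f, M_f : lower and upper bounds on the original rate, 0 < m_f < M_f < 1
    """
    lo = -math.log(1.0 - m_f)   # lower end of the transformed range
    hi = -math.log(1.0 - M_f)   # upper end of the transformed range
    if price <= 0.0:
        # Assumption: with zero price the net benefit is increasing,
        # so the upper bound is the maximizer.
        return hi
    x_star = -math.log(price / (price + 1.0))  # unconstrained maximizer
    return min(max(x_star, lo), hi)            # clip to the feasible range

# Usage: a total price of 1 gives a transformed rate of ln 2,
# i.e. an original rate of 1 - exp(-ln 2) = 0.5.
x_tilde = source_rate(1.0, m_f=0.01, M_f=0.99)
x = 1.0 - math.exp(-x_tilde)
```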

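On the other hand, for each aggregation flow $f \in F_A$, since $\frac{d\Psi(\tilde{x}_f)}{d\tilde{x}_f} = -\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f$ is a constant, it follows that given the feasible range $\tilde{X}_f = [0, -\ln(1 - M_f)]$, together with $\mu^{\alpha}$ and $\mu^{\beta}$, the optimal value of $\tilde{x}_f$ can be calculated as below:

$$
\tilde{x}_f(\mu^{\alpha}, \mu^{\beta}) = \arg\max_{\tilde{x}_f \in \tilde{X}_f} \Psi(\tilde{x}_f) =
\begin{cases}
0 & \text{if } \mu^{\beta}_f < \mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)} \\
-\ln(1 - M_f) & \text{if } \mu^{\beta}_f > \mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)} \\
\text{any value in } \tilde{X}_f & \text{otherwise}
\end{cases}
\qquad (10)
$$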
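Because the objective of an aggregation flow is linear in its transformed rate, Eqn. (10) is a threshold rule. The sketch below illustrates it under stated assumptions: the function and parameter names are hypothetical, and in the tie case, where Eqn. (10) allows any feasible value, the code simply keeps the current rate, which is a choice made here and not one prescribed by the paper.

```python
import math

def aggregation_rate(mu_beta_f, mu_alpha_link, mu_beta_parent, M_f, current=0.0):
    """Transformed rate of an aggregation flow, following Eqn. (10).

    mu_beta_f      : aggregation price of flow f
    mu_alpha_link  : price of the link l(f) that carries f
    mu_beta_parent : aggregation price of f's parent flow pi(f)
    M_f            : upper bound on the original rate, 0 < M_f < 1
    current        : rate kept when the slope is zero (assumption)
    """
    hi = -math.log(1.0 - M_f)
    slope = mu_beta_f - mu_alpha_link - mu_beta_parent  # dPsi / dx
    if slope < 0:
        return 0.0          # sending costs more than it earns
    if slope > 0:
        return hi           # net benefit grows with the rate
    return min(max(current, 0.0), hi)  # tie: any feasible value is optimal

# Usage: income 1.0 from children, cost 0.3 + 0.2 to link and parent.
rate_high = aggregation_rate(1.0, 0.3, 0.2, M_f=0.9)
rate_zero = aggregation_rate(0.1, 0.3, 0.2, M_f=0.9)
```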

As previously discussed, strong duality holds in $\tilde{P}$, and thus there is no duality gap. Thereby, the optimal dual variables (prices) $\mu^{\alpha}$ and $\mu^{\beta}$ exist (Proposition 5.1.4 in [25]), denoted as $\mu^{\alpha*}$ and $\mu^{\beta*}$. If $\mu^{\alpha*} > 0$ and $\mu^{\beta*} > 0$ are dual optimal, then $\tilde{x}_f(\mu^{\alpha*}, \mu^{\beta*})$ is also primal optimal, given that $\tilde{x}_f$ is primal feasible (Proposition 5.1.5 in [25]). In other words, once the optimal prices $\mu^{\alpha*}$ and $\mu^{\beta*}$ are available, the optimal rate $\tilde{x}^*_f$ can be achieved. The role of $\mu^{\alpha}$ and $\mu^{\beta}$ is two-fold. First, they serve as the pricing signal for a flow to adjust its rate. Second, they decouple the primal problem, i.e., the global utility optimization, into individual rate optimization of each flow.

6.4 The Scheduling Problem

We now turn to the scheduling problem $D_2(\mu^{\alpha})$. In general, it is an NP-hard problem, since it is equivalent to the maximum weighted independent set problem over the conflict graph. The conflict graph, in turn, depends on the underlying interference model. In this paper, we consider the node-exclusive interference model, i.e., links that share a common node cannot transmit or receive simultaneously. This model has been used in many existing works [7, 8, 9] on

network utility maximization. With the node exclusive interference

model, the scheduling problem can be reduced to the maximum

weighted matching problem, which is polynomial-time solvable.

However, the existing polynomial-time solution [26] requires cen-

tralized implementation. In [27], a simple distributed approximate

algorithm is presented, which is at most a factor of 2 away from the

maximum, and has a linear running time O(|L|). We utilize this

algorithm to solve the scheduling problem D2(µα) in a distributed

manner.
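To make the matching view of the scheduling problem concrete, the sketch below uses a simple centralized greedy rule: repeatedly pick the heaviest remaining link whose endpoints are both free. This is a well-known heuristic whose total weight is at least half of the maximum weighted matching. It is a stand-in chosen for illustration and is not the distributed algorithm of [27]; the function name, the dictionary input format, and the node labels are all assumptions.

```python
def greedy_matching(link_prices):
    """Greedy weighted matching under the node-exclusive model.

    link_prices : dict mapping a link (u, v) to its weight, e.g. the price
                  mu_alpha_l of that link (hypothetical input format).
    Returns the list of links scheduled in this slot.
    """
    busy = set()     # nodes already used by a chosen link
    chosen = []
    # Visit links from the heaviest to the lightest.
    ordered = sorted(link_prices.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), price in ordered:
        if price > 0 and u not in busy and v not in busy:
            chosen.append((u, v))
            busy.update((u, v))
    return chosen

# Usage on a path a-b-c-d: greedy picks the middle link (weight 1.5),
# while the optimum picks the two outer links (weight 2.0).
prices = {("a", "b"): 1.0, ("b", "c"): 1.5, ("c", "d"): 1.0}
schedule = greedy_matching(prices)
```

The path example is the standard case in which greedy falls short of the optimum while staying within a factor of 2.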

Actually, the rate control strategy proposed in this paper is a gen-

eral framework and thus can be extended to other interference mod-

els. For any interference model, as long as an appropriate algorithm

can be designed to solve the scheduling problem D2(µα), it can be

integrated with our framework.

Additionally, in some applications of sensor networks, the duty cycle of the sensor nodes further complicates the scheduling problem [28, 29]. We will try to address this challenge in our future work.

6.5 Subgradient Algorithm

Now let us see how we can minimize the dual objective function

D(µα,µβ). Gradient-based methods are, in general, attractive ap-

proaches to carry out minimizations of this type. Unfortunately, in

our case, D(µα,µβ) is nondifferentiable, and therefore its gradient

may not always exist. This is because in general, differentiability

of the dual requires a unique primal optimizer, whereas in our case,

the optimal values of ˜ xf (f ∈ FA) can be non-unique. Further-

more, D2(µα) is a piecewise linear function and not differentiable.

Therefore, we choose to use the subgradient method to solve this problem.

The subgradient algorithm that we propose next is based on the subgradient method developed by N. Z. Shor (Chapter 2 in [30]). In our problem, although the dual gradient does not exist, subgradients do. Based on Proposition 6.1.1 of [25], we adjust $\mu^{\alpha}$ and $\mu^{\beta}$ in the opposite direction to the subgradients:

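$$
\begin{aligned}
\mu^{\alpha}_l(t+1) &= \left[ \mu^{\alpha}_l(t) - h(t) \frac{\partial D(\mu^{\alpha}(t), \mu^{\beta}(t))}{\partial \mu^{\alpha}_l} \right]^{+} \\
&= \Bigl[ \mu^{\alpha}_l(t) + h(t) \bigl( \tilde{x}_{f(l)}(\mu^{\alpha}(t), \mu^{\beta}(t)) - \tilde{r}_l(\mu^{\alpha}(t)) \bigr) \Bigr]^{+}
\end{aligned}
\qquad (11)
$$

$$
\begin{aligned}
\mu^{\beta}_f(t+1) &= \left[ \mu^{\beta}_f(t) - h(t) \frac{\partial D(\mu^{\alpha}(t), \mu^{\beta}(t))}{\partial \mu^{\beta}_f} \right]^{+} \\
&= \Bigl[ \mu^{\beta}_f(t) + h(t) \Bigl( \sum_{f_c \in C(f)} \tilde{x}_{f_c}(\mu^{\alpha}(t), \mu^{\beta}(t)) - \tilde{x}_f(\mu^{\alpha}(t), \mu^{\beta}(t)) \Bigr) \Bigr]^{+}
\end{aligned}
\qquad (12)
$$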

In the above formulas, the ˜ xf(µα,µβ) and ˜ rl(µα) are the max-

imizers of D1(µα,µβ) and D2(µα), given µαand µβ; h(t) is a

positive scalar stepsize (note that the unit of t is time slot, not sub-

slot); ‘+’ denotes the projection onto the set R+ of non-negative

real numbers.

Equation (11) reflects the law of supply and demand. If the demand of a flow $f$ for bandwidth $\tilde{x}_{f(l)}$ exceeds its supply $\tilde{r}_l$, the transformed capacity constraint is violated. Thus, the price $\mu^{\alpha}_l$ is raised. Otherwise, $\mu^{\alpha}_l$ is reduced. Similarly, in (12), if the children flows $f_c \in C(f)$ demand an aggregate rate higher than the rate of their parent flow $f$, the transformed aggregation constraint is violated. Thus, the price $\mu^{\beta}_f$ is raised. Otherwise, $\mu^{\beta}_f$ is reduced.
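The two price updates (11) and (12) amount to a projected step along the constraint violation. A minimal sketch follows; the function and argument names are hypothetical, and the rates passed in are assumed to be the maximizers already computed from Eqns. (9) and (10) and from the scheduling step.

```python
def update_link_price(mu_alpha_l, x_tilde, r_tilde, h):
    """Eqn. (11): raise the link price when demand exceeds supply.

    mu_alpha_l : current price of link l
    x_tilde    : transformed rate of the flow carried by link l
    r_tilde    : transformed capacity granted to link l by the scheduler
    h          : positive stepsize
    """
    return max(0.0, mu_alpha_l + h * (x_tilde - r_tilde))

def update_aggregation_price(mu_beta_f, children_x_tilde, x_tilde_f, h):
    """Eqn. (12): raise the price when children outpace the parent flow."""
    return max(0.0, mu_beta_f + h * (sum(children_x_tilde) - x_tilde_f))

# Usage: an overloaded link gets a higher price; an idle one is
# projected back to zero instead of going negative.
raised = update_link_price(0.2, x_tilde=0.9, r_tilde=0.5, h=0.5)
floored = update_link_price(0.1, x_tilde=0.2, r_tilde=0.9, h=1.0)
agg = update_aggregation_price(0.0, [0.3, 0.4], x_tilde_f=0.5, h=1.0)
```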

6.6 Convergence Analysis

In this subsection, we justify the convergence property of the subgradient algorithm. A subgradient may not be a direction of descent, but it makes an angle of less than 90 degrees with all descent directions. Using results on the convergence of the subgradient method [25, 30], we show that, for a constant stepsize $h$, the algorithm is guaranteed to converge to within a neighborhood of the optimal value. We choose a constant stepsize because it is convenient for distributed implementation. Since the usual convergence criterion is not applicable for a subgradient algorithm⁵, we are interested in the asymptotic convergence. Similar to [7], we define $\bar{\mu}^{\alpha}(T) := \frac{1}{T} \sum_{t=1}^{T} \mu^{\alpha}(t)$ and $\bar{\mu}^{\beta}(T) := \frac{1}{T} \sum_{t=1}^{T} \mu^{\beta}(t)$ as the average dual variables by time $T$, and let $\bar{\tilde{x}} := \frac{1}{T} \sum_{t=1}^{T} \tilde{x}(t)$ be the average primal variable by time $T$. The following theorems guarantee the statistical convergence of the subgradient method. The proofs are similar to [7], and are omitted due to the limit of space.

⁵ This is because the dual cost usually will not monotonically approach the optimal value, but wander around it under the subgradient algorithm.

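Theorem 4. Let $\mu^{\alpha*}$ and $\mu^{\beta*}$ be the optimal dual variables; then, for some $0 < B < \infty$, the following inequality holds

$$
\limsup_{T \to \infty} D(\bar{\mu}^{\alpha}, \bar{\mu}^{\beta}) - D(\mu^{\alpha*}, \mu^{\beta*}) \le hB. \qquad (13)
$$

Theorem 5. Let $\tilde{x}^*$ be the optimal rate of $\tilde{P}$; then, for some $0 < B < \infty$, the following inequality holds

$$
\liminf_{T \to \infty} \tilde{P}(\bar{\tilde{x}}) \ge \tilde{P}(\tilde{x}^*) - hB. \qquad (14)
$$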

The above theorems imply that the time-average primal and dual

variables obtained by the subgradient algorithm can be made ar-

bitrarily close to the optimal values if we choose the stepsize h

sufficiently small.
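The interplay of the rate rule and the price rule can be seen on a toy case. The sketch below is an illustration under strong simplifying assumptions that are not from the paper: a single source flow with U = ln, a single link that is scheduled in every slot with fixed capacity, no aggregation price, and hypothetical names and default values. In this smooth toy case the iterates settle at the optimum; in the full problem the scheduling step makes the dual nondifferentiable and the iterates oscillate within a neighborhood, as the theorems above describe.

```python
import math

def run_dual_iteration(capacity=0.5, h=0.1, steps=2000, mu0=0.0, M=0.99):
    """Constant-stepsize price iteration for one flow on one link."""
    r_tilde = -math.log(1.0 - capacity)   # transformed link capacity
    x_max = -math.log(1.0 - M)            # upper end of the rate range
    mu = mu0

    def rate(price):
        # Source rule for U = ln, clipped to the upper bound.
        if price <= 0.0:
            return x_max
        return min(-math.log(price / (price + 1.0)), x_max)

    for _ in range(steps):
        x_tilde = rate(mu)                              # flow reacts to price
        mu = max(0.0, mu + h * (x_tilde - r_tilde))     # link updates price
    x = 1.0 - math.exp(-rate(mu))                       # back to original rate
    return mu, x

mu_final, x_final = run_dual_iteration()
print(mu_final, x_final)
```

For a capacity of 0.5 the fixed point satisfies −ln(µ/(µ+1)) = −ln(0.5), i.e. µ = 1 and an original rate of 0.5, which is what the iteration approaches.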

7. DISTRIBUTED IMPLEMENTATION

In this section, we describe how the subgradient algorithm can be implemented in a real network in a distributed and scalable way. In our design, a source (aggregation) node needs to communicate only with its parent and children nodes. In detail, each node collects the transformed rate $\tilde{x}$ from its children, and updates the prices ($\mu^{\alpha}$ and $\mu^{\beta}$) based on Eqn. (11) and Eqn. (12). Then, it broadcasts the updated prices to its children. Upon receiving the price information from its parent, each node calculates its transformed rate based on Eqn. (9) or Eqn. (10). Then, it forwards its updated rate to its parent. Moreover, the nodes solve the scheduling problem through the distributed algorithm discussed in Section 6.4, and decide who will have a chance to transmit in the next slot. Before convergence, each node transmits at a rate $\hat{x} = \min(x, r)$. At each subslot, it must conform to the data availability constraint.

In our algorithm, in each iteration, an independent set is picked as the solution of the scheduling problem. From a long-term perspective, the algorithm jumps among the extreme points (i.e., $r^i$) of the capacity region (recall that each extreme point corresponds to an independent set), and never touches the inner area. As aforementioned, the optimal solution (i.e., $r^*$) is the convex combination of $|L|$ extreme points (i.e., $r^* = \sum_{i=1}^{|L|} \alpha_i r^i$), located on a facet of the capacity region's boundary. In reality, each $\alpha_i$ is actually the percentage of iterations in which the algorithm picks the $i$th independent set, after the system converges.

8. DISCUSSIONS

8.1 Validity of Data Availability Constraint

As mentioned in Section 4.3, the data availability constraint is

not taken into account when we solve the optimization problem P.

However, this will not cause any problem as long as the rate of

each flow converges to a feasible point. In an aggregation node, it

maintains a queue for each of its children, and one more queue for

the available data. Suppose the packets in each queue are sorted by

their timestamps, as shown in Fig. 1 and Fig. 2. Thus, the height of

each queue is determined by the timestamp (i.e., Tt) of the packet

on the top of this queue. If the aggregation node behaves strictly

following the data availability constraint, the queue of the avail-

able data should have the same height as the shortest child queue.

Clearly, after the optimal solution which satisfies both the network

capacity constraint and data aggregation constraint is attained, the

height of each child queue as well as the queue storing the available

data will not grow infinitely.

Furthermore, our solution is suboptimal, and thus does not utilize the network resource to the extreme. Therefore, the proposed scheme will not overflow any node in the aggregation tree.

8.2 Periodic Data Collection

Some sensing tasks require periodic data collection, namely, the

intervals between the timestamps of consecutive packets are fixed.

In this case, if we further assume synchronized data collection, i.e., all the sources start their collection at the same time, we can achieve the largest time-overlap of the packets, and thus maximize the rate of each source node. However, in practice, the time-offsets of the

packets from different nodes may be time-varying, due to the dy-

namic join (leave) of sensor nodes, and the oscillation of the rates

caused by the variation of the environment as well as the underly-

ing MAC layer scheduling. In this scenario, the proposed algorithm

can be considered as a good approximation, since the probabilistic

rate model can capture the long-term expectation of time-offsets.

For example, in the scenario shown in Fig. 1, nodes A and B both collect data in a periodic pattern, and their rates are $\frac{1}{2}$ and $\frac{1}{4}$. There are two possibilities for the time-offset of the packets from A and B, as shown in Fig. 1(b) and Fig. 1(c). The rates of aggregated packets (i.e., node C's sending rate) in these two cases are $\frac{1}{2}$ and $\frac{3}{4}$, respectively. If both cases are equally likely, the expected rate of the aggregation flow is $\frac{1}{2} \times (\frac{1}{2} + \frac{3}{4}) = \frac{5}{8}$. This exactly equals the lower bound of node C's rate derived by the data aggregation constraint ($1 - (1 - \frac{1}{2})(1 - \frac{1}{4}) = \frac{5}{8}$).

Even if the time-offsets can be controlled, however, in this scenario it is extremely difficult to mathematically model the relationship between a parent flow and its children flows in a convex function. Suppose the data collections are all synchronized; what we can do is to provide some tricks which can improve the objective value after the algorithm converges. In detail, we check the source flows sharing the same parent. If their periods are co-prime to each other, nothing can be improved, since the data aggregation constraint precisely models the aggregation of the source flows with co-prime periods. If the periods of some flows share a greatest common divisor $\alpha$ (let $F_{\alpha}$ be the set of them), we fix their rates as constants in $P$, and use a virtual flow $f_{\alpha}$ to replace them in the data aggregation constraint. $f_{\alpha}$ results from aggregating the flows in $F_{\alpha}$ when they are synchronized, and its rate is the constant $x_{f_{\alpha}} = \frac{1}{\alpha}\bigl(1 - \prod_{f \in F_{\alpha}} (1 - \alpha x_f)\bigr)$. Subsequently, we restart the optimization of $P$. Since $x_{f_{\alpha}}$ is lower than the rate derived by the data aggregation constraint, some network resources are saved, and thus the rates of other flows can be improved. As an example, suppose the rates of the flows from nodes A and B shown in Fig. 1 are $\frac{1}{4}$ and $\frac{1}{6}$; we have $x_{f_{\alpha}} = \frac{1}{3}$ according to the above formula. Obviously, $x_{f_{\alpha}}$ is lower than the rate obtained based on Eqn. (5), which is $\frac{3}{8}$. Thus, some bandwidth can be saved from the flow originating at node C and allocated to its neighboring flows. After reoptimization, the rates of these neighboring flows will be improved.
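The arithmetic in the two examples above can be reproduced with exact fractions. In this sketch the function names are hypothetical, and the unsynchronized bound 1 − ∏(1 − x_f) is written as inferred from the worked numbers in the text (5/8 and 3/8), since Eqn. (5) itself is stated earlier in the paper.

```python
from fractions import Fraction

def aggregated_rate_bound(rates):
    """Rate implied by the data aggregation constraint: 1 - prod(1 - x_f)."""
    prod = Fraction(1)
    for x in rates:
        prod *= (1 - x)
    return 1 - prod

def synchronized_rate(rates, alpha):
    """Rate of the virtual flow f_alpha for synchronized periodic sources
    whose periods share the greatest common divisor alpha."""
    prod = Fraction(1)
    for x in rates:
        prod *= (1 - alpha * x)
    return (1 - prod) / alpha

# Rates 1/2 and 1/4: the constraint gives 5/8.
first = aggregated_rate_bound([Fraction(1, 2), Fraction(1, 4)])

# Rates 1/4 and 1/6 (periods 4 and 6, alpha = 2): 1/3 versus 3/8.
rates = [Fraction(1, 4), Fraction(1, 6)]
virtual = synchronized_rate(rates, alpha=2)
bound = aggregated_rate_bound(rates)
saved = bound - virtual   # bandwidth freed for neighboring flows
```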

8.3 Lossy Link

Due to the unreliable nature of wireless communication, packets

may be lost during transmission. In our scheme, lost packets do

not matter at all, since from the perspective of the receiver, lost

packets look like “nonexistent packets", namely, the source nodes

never collect data at those subslots. Furthermore, if the average

reception probability of each link can be measured, the formulation

of the problem can be easily redefined so as to take it into account.

Retransmissions are not needed in our solution.

8.4 Energy Constraint

Energy scarcity is a major challenge for the design of sensor networks. Our approach can also be adapted to address this problem. The solution is to add an energy constraint to the problem formulation. As a result, the energy budget of each node on the aggregation tree will be considered when the algorithm allocates the resource of the network.

[Figure 4: An aggregation tree]

[Figure 5: Rates of source flows. (a) Rate of flow 10; (b) Rate of flow 14. Axes: Normalized Time (x), Normalized Rate (y).]

8.5 Time Synchronization

The problem of synchronization has been considered and ad-

dressed by the prior work on data aggregation in sensor net-

works [1], and we just borrow the existing solutions.

9. PERFORMANCE EVALUATION

In this section, we provide simulation results to complement the

analysis in the previous sections. We consider a randomly gener-

ated aggregation tree shown in Fig. 4. On this tree, 10 source nodes

(shaded nodes) collect and forward data to the sink S, through 7

aggregation nodes. In addition, the number on each node (edge)

works as the index of this node (flow). First, we assume that all the

links have a normalized capacity of 0.5, and all the source nodes

use the same utility function U(x) = ln(x). Then, we apply our

joint rate control and scheduling algorithm with a fixed stepsize

h = 1 on this aggregation tree, and observe its performance.

Figure 5 shows the evolution of the rates of the source flow 10

and 14. The other source flows have similar behavior and thus

we omit their results. As one can see, they converge quickly to

a neighborhood of the optimal values and oscillate around the opti-

mal values. This oscillating behavior mathematically results from

the nondifferentiability of the dual function and physically can be

interpreted as due to the scheduling process.

[Figure 6: Available data stored in aggregation nodes. (a) Available data of node 5; (b) Available data of node 7. Axes: Normalized Time (x), Available Data (y).]

Figure 6 describes the amount of available data stored in node

5 and 7. The other nodes have similar behavior and thus we omit

their results. In this test, we assume each time slot (length of step)

contains 100 subslots. As can be seen, although the two curves

both fluctuate over time, they remain reasonably bounded. The fluctuation can be ascribed to the underlying scheduling, which prevents an aggregation node from receiving and transmitting packets at the same time.

[Figure 7: Delays of the packets delivered by flows. Curves: Delay of Flow 1, Flow 2, Flow 5, and Flow 8. Axes: Normalized Time (x), Delay (y).]

Figure 7 demonstrates the average delays of the packets deliv-

ered by four flows. Here the concept of delay is defined as the

period from the moment when a packet is generated by a source

to the time when this packet is delivered by a flow. For example,

suppose node 8 generates a packet at time 1. Based on the delay

values shown in Fig. 7, this packet will arrive at node 5 at time 6,

since the delay of flow 8 is roughly 5. Similarly, it will arrive at

node 2 at time 8 since the delay of flow 5 is 7, and node 1 at time

10 since the delay of flow 2 is 9. Finally it will reach the sink at

time 14 since the delay of flow 1 is 13. Here we should note that

this packet may be aggregated during this process. As can be seen

in the figure, the average delay of each flow converges to a stable

point soon after the algorithm is started.

[Figure 8: Optimal Solution vs. Approximate Solution. (a) Optimal Value: Optimal Objective Value against Normalized Capacity, for the Global Optimal Solution and the Approximate Solution; (b) Approximation Ratio against Normalized Capacity.]

Finally, Fig. 8 shows the difference between the optimal objective value and the approximate objective value. In this test, we tune the capacity of each link, and observe its impact on the objective value. As illustrated in Fig. 8(a), both the optimal objective value and the approximate objective value increase as the capacities grow. At first, when the capacity of each link is as low as 0.1, the difference between the two values is negligible, since the approximation ratio, defined as $\frac{|\text{Optimal} - \text{Approximate}|}{|\text{Approximate}|}$, is less than 1%, as shown in Fig. 8(b). As the capacities grow, the difference as well as the approximation ratio increase accordingly. They remain at a low level (less than 10%) until the capacity goes beyond 0.5. Finally, the approximation ratio reaches around 20% when the capacity is increased to 0.9.

To explain this trend, let us observe the function we use in all the variable substitutions, which is $f(x) = -\ln(1 - x)$. The curvature, i.e., the second order derivative of this function, is monotonically increasing with $x$. Thus, when the original variable $x$ is small, the value of the new variable $y = f(x)$ is close to $x$. For this reason, when we decrease the capacity of each link, the boundary of the transformed region on which the approximate solution (i.e., $\tilde{r}^0$ shown in Fig. 3(c)) is located will become closer to the boundary of the approximate region $\tilde{\Pi}$ on which the optimal solution (i.e., $\tilde{r}^*$ shown in Fig. 3(c)) is located. Consequently, $\tilde{r}^* - \tilde{r}^0$ drops accordingly, resulting in a smaller difference between the optimal value and the approximate value, by Theorem 3. Finally, it should be noted that the optimal value shown in this test is actually the optimal value of the approximate problem (i.e., $\tilde{P}^*$), which has been proved to be the upper bound of the real optimal value $P^*$. This implies that in practice, our approximate solution should be even closer to the real optimal solution than we observe in this experiment.

10. CONCLUSIONS

In this paper, we identify the unique challenges of rate allocation

in the context of data aggregation in wireless sensor networks, and

formulate this problem as a network utility maximization problem.

After transforming this problem into a convex approximate prob-

lem, we decompose it based on the duality theory, and propose a

distributed algorithm to solve the decoupled problems. Theoretical

analysis and simulation results demonstrate the near-optimal per-

formance of our scheme.

11.ACKNOWLEDGMENTS

We are grateful to Professor P. R. Kumar for his kind help on
the formulation and solution of the problem. Thanks are also due
to Prof. Angelia Nedich, Prof. Tarek Abdelzaher, I-Hong Hou and
Yuhan Zhang for providing helpful comments and feedback.

12.REFERENCES

[1] S. Madden, M. J. Franklin, J. Hellerstein, and W. Hong,
“TAG: a tiny aggregation service for ad-hoc sensor networks,”
in OSDI, 2002.

[2] F. Kelly, “Charging and rate control for elastic traffic,”
European Transactions on Telecommunications, vol. 8, pp.
33–37, 1997.

[3] F. Kelly, A. Maulloo, and D. Tan, “Rate control in
communication networks: shadow prices, proportional
fairness and stability,” Journal of the Operational
Research Society, vol. 49, 1998.

[4] S. H. Low and D. E. Lapsley, “Optimization flow control-I:
basic algorithm and convergence,” IEEE/ACM Trans. Netw.,
vol. 7, no. 6, pp. 861–874, 1999.

[5] K. Kar, S. Sarkar, and L. Tassiulas, “Optimization based rate
control for multirate multicast sessions,” in INFOCOM,
2001.

[6] Y. Cui, Y. Xue, and K. Nahrstedt, “Optimal resource
allocation in overlay multicast,” IEEE Transactions on
Parallel and Distributed Systems, vol. 17, no. 8, pp. 808–823,
Aug. 2006.

[7] L. Chen, S. H. Low, M. Chiang, and J. C. Doyle,
“Cross-layer congestion control, routing and scheduling
design in ad hoc wireless networks,” in INFOCOM, 2006.

[8] X. Lin and N. B. Shroff, “Joint rate control and scheduling in
multihop wireless networks,” in CDC, 2004.

[9] X. Lin, N. Shroff, and R. Srikant, “A tutorial on cross-layer
optimization in wireless networks,” IEEE Journal on
Selected Areas in Communications, vol. 24, no. 8, pp.
1452–1463, Aug. 2006.

[10] A. Eryilmaz and R. Srikant, “Joint congestion control,
routing, and MAC for stability and fairness in wireless
networks,” IEEE Journal on Selected Areas in Communications,
vol. 24, no. 8, pp. 1514–1524, 2006.

[11] P.-J. Wan, K. Alzoubi, and O. Frieder, “Distributed

construction of connected dominating set in wireless ad hoc

networks,” in INFOCOM, 2002.

[12] Y. Wu, S. Fahmy, and N. B. Shroff, “On the construction of a

maximum-lifetime data gathering tree in sensor networks:

NP-completeness and approximation algorithm,” in

INFOCOM, 2008.

[13] B. Yu, J. Li, and Y. Li, “Distributed data aggregation

scheduling in wireless sensor networks,” in INFOCOM,

2009.

[14] P.-J. Wan, S. C.-H. Huang, L. Wang, Z. Wan, and X. Jia,

“Minimum-latency aggregation scheduling in multihop

wireless networks,” in MobiHoc, 2009.

[15] Y. T. Hou, Y. Shi, and H. D. Sherali, “Rate allocation and

network lifetime problems for wireless sensor networks,”

IEEE/ACM Trans. Netw., vol. 16, no. 2, pp. 321–334, 2008.

[16] Y. Xue, Y. Cui, and K. Nahrstedt, “Maximizing lifetime for

data aggregation in wireless sensor networks,” MONET,

vol. 10, no. 6, pp. 853–864, 2005.

[17] B. Przydatek, D. X. Song, and A. Perrig, “SIA: secure

information aggregation in sensor networks,” in SenSys,

2003.

[18] Y. Yang, X. Wang, S. Zhu, and G. Cao, “SDAP: a secure

hop-by-hop data aggregation protocol for sensor networks,”

in MobiHoc, 2006.

[19] A. Giridhar and P. R. Kumar, “Toward a theory of in-network

computation in wireless sensor networks,” IEEE

Communications Magazine, vol. 44, pp. 98–107, 2006.

[20] D. Marco, E. Duarte-Melo, M. Liu, and D. Neuhoff, “On the

many-to-one transport capacity of a dense wireless sensor

network and the compressibility of its data,” in IPSN, 2003.

[21] S. Chen and Z. Zhang, “Localized algorithm for aggregate

fairness in wireless sensor networks,” in MOBICOM, 2006.

[22] C. T. Ee and R. Bajcsy, “Congestion control and fairness for

many-to-one routing in sensor networks,” in SenSys, 2004.

[23] K. Jain, J. Padhye, V. N. Padmanabhan, and L. Qiu, “Impact

of interference on multi-hop wireless network performance,”

in MOBICOM, 2003.

[24] S. Boyd and L. Vandenberghe, Convex Optimization.

Cambridge University Press, 2004.

[25] D. P. Bertsekas, Nonlinear Programming. Athena
Scientific, 1995.

[26] C. H. Papadimitriou and K. Steiglitz, Combinatorial
Optimization: Algorithms and Complexity. Dover
Publications, 1998.

[27] J.-H. Hoepman, “Simple distributed weighted matchings,”
CoRR, 2004.

[28] L. Su, C. Liu, H. Song, and G. Cao, “Routing in
intermittently connected sensor networks,” in ICNP, 2008.

[29] L. Su, B. Ding, Y. Yang, T. F. Abdelzaher, G. Cao, and J. C.
Hou, “ocast: Optimal multicast routing protocol for wireless
sensor networks,” in ICNP, 2009.

[30] N. Z. Shor, K. C. Kiwiel, and A. Ruszczyński, Minimization
methods for non-differentiable functions. New York, NY,
USA: Springer-Verlag New York, Inc., 1985.