Towards Optimal Rate Allocation for Data Aggregation
in Wireless Sensor Networks∗
Lu Su, Yan Gao, and Yong Yang
Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, IL, 61801, USA
{lusu2, yangao3, yang25}@illinois.edu
Guohong Cao
Department of Computer Science & Engineering
The Pennsylvania State University
University Park, PA, 16802, USA
gcao@cse.psu.edu
ABSTRACT
This paper aims at achieving optimal rate allocation for data aggregation in wireless sensor networks. We first formulate this rate allocation problem as a network utility maximization problem. Due to its non-convexity, we take a couple of variable substitutions on the original problem and transform it into an approximate problem, which is convex. We then apply duality theory to decompose this approximate problem into a rate control subproblem and a scheduling subproblem. Based on this decomposition, a distributed algorithm for joint rate control and scheduling is designed, and proved to approach arbitrarily close to the optimum of the approximate problem. Finally, we show that our approximate solution can achieve near-optimal performance through both theoretical analysis and simulations.
Categories and Subject Descriptors
C.2.1 [Computer-Communication Network]: Network Architecture and Design - Wireless communication; G.1.6 [Mathematics of Computing]: Optimization - Convex programming
General Terms
Algorithms, Theory
Keywords
Wireless Sensor Networks, Data Aggregation, Rate Allocation,
Scheduling, Network Utility Optimization, Cross Layer Design
∗This work was supported in part by National Science Foundation under grant CNS-0916171.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MobiHoc’11, May 16–19, 2011, Paris, France.
Copyright 2011 ACM 978-1-4503-0722-2/11/05 ...$10.00.
1. INTRODUCTION
Data aggregation has been put forward as an essential paradigm for routing in wireless sensor networks [1]. The idea is to use a function like average, max or min to combine the data coming from different sources en route to eliminate transmission redundancy and thus save energy as well as bandwidth. In recent years, a large spectrum of studies have been conducted on various problems of data aggregation in sensor networks. However, the following fundamental question has not been answered thus far: “Does there exist an optimal rate allocation for data aggregation which maximizes the utilization of network resources and meanwhile maintains certain fairness among all the sources?” This paper gives the answer to this question.
Finding the optimal rate allocation for data aggregation in sensor networks can be regarded as a utility-based resource allocation problem. In particular, each source is associated with a utility, which is defined as a function of the source’s sending rate. The function value can be conceptually regarded as the quality of information provided by the source. For a given aggregation tree, there exists a unique “maximum utility” rate allocation, at which the network resource utilization is optimal. Meanwhile, certain fairness such as max-min fairness and proportional fairness can be achieved when we choose appropriate utility functions.
The problem of maximizing the network utilities has been explored in the context of both wired [2, 3, 4, 5, 6] and wireless [7, 8, 9, 10] networks, for rate control in unicast or multicast. Although using a similar approach, we show that rate allocation for data aggregation in wireless sensor networks faces unique challenges both theoretically and practically, making this problem a completely different one to which none of the existing solutions can be applied.
Figure 1: An example of data aggregation constraint. (a) Topology (source nodes A and B, aggregation node C, sink S). (b) Queues at node C, Scenario I. (c) Queues at node C, Scenario II. In (b) and (c) the vertical axis is the packet timestamp (1 to 10), and the three columns are the packets from A, the packets from B, and the packets aggregated by C.
Challenge I: Theoretically, rate allocation for data aggregation is subject not only to the network capacity constraint, but also to the data aggregation constraint on the aggregation nodes (i.e., non-leaf nodes) of the aggregation tree.
Figure 1 provides an intuitive example of the data aggregation constraint. A simple aggregation tree is shown in Fig. 1(a). In this case, two source nodes A and B collect and transmit data to node C, which aggregates the received data and forwards them to the sink S. Fig. 1(b) and (c) illustrate two different scenarios of data aggregation. In either scenario, three columns are displayed, which correspond to three queues maintained by node C. The first two queues store the packets coming from A and B, while the last one (queue C) keeps the packets generated by aggregating A and B’s packets. The packets in each of the three queues are sorted by the timestamps recorded in their headers. In Fig. 1(b) and (c), the vertical axis denotes the timestamp, which indicates the time when the carried data packet is collected. In this paper, we assume that only the data collected at the same time can be aggregated. This assumption is valid in many applications of sensor networks, such as target localization and fire alarming, since the data collected at the same time usually contain the information about the same event. For this reason, an aggregated packet has the same timestamp as the raw packets involved in its aggregation. Sometimes, a packet coming from a source node has no coincident packets with the same timestamp from other source nodes, such as the packets with timestamp 3 and 7 in queue A. In this case, the aggregation node does nothing but simply forwards it upwards.
In the first scenario shown in Fig. 1(b), the number of packets stored in queue C is the same as the number of packets in queue A, since the time slots when node B collects data are a subset of the time slots when A collects data. Therefore, to keep the network stable, in other words, to prevent the queue of node C from overflowing, the transmission rate of node C should be no less than A’s rate. However, this does not hold in the second scenario displayed in Fig. 1(c), where the only difference from Scenario I is that the timestamps of all the packets in queue B are increased by one, which implies that node B postpones all its data collections by one time slot. Surprisingly, this causes a fundamental change in queue C. In particular, no aggregation can be made, since there are no coincident packets of A and B. As a result, the number of packets in queue C is the sum of the numbers of packets in queues A and B. Therefore, in this scenario, the requirement of stability becomes that C should send faster than the aggregate rate of A and B.
This example reveals the fact that the transmission rate of an aggregation node is constrained by not only the rates of its children but also their packet timestamp patterns. The packet timestamp pattern of a node includes two components: the intervals between the timestamps of consecutive packets and the time offsets of the packets among the nodes that share the same parent.
Figure 2: An example of data availability constraint. (a) Topology (source nodes A and B, relay node N, aggregation node C, sink S). (b) Queues at node C, time t1. (c) Queues at node C, time t2. In (b) and (c) the vertical axis is the packet timestamp (1 to 10), and the three columns are the packets from A, the packets from B, and the packets aggregated by C.
Challenge II: Practically, rate allocation for data aggregation has an implicit constraint, which is referred to as the data availability constraint in this paper. Figure 2 gives us an illustrative example of this constraint. Similar to the previous example, nodes A and B work as the source nodes. However, B is not directly connected to the aggregation node C. An intermediate node N relays data for B. Suppose at time t1, as shown in Fig. 2(b), A has delivered some data to C, whereas B’s data have not arrived at C since they are delayed at node N. At this moment, although lots of A’s packets are waiting in its buffer, node C needs to wait until B’s data arrive at time t2 (for the sake of simplicity, suppose A transmits no data during this period), and then fulfills its aggregation task, as shown in Fig. 2(c). This is because if C chooses to deliver A’s packets at time t1, it has to do the same job again when B’s packets arrive. As a result, twice the amount of traffic is injected into the downstream links by C. This is definitely not an economic solution from the perspective of network utility maximization.
The main purpose of our work is to address the above challenges. We first formulate this rate allocation problem as a network utility maximization problem. Due to its non-convexity, we take a couple of variable substitutions on the original problem and transform it into an approximate problem, which is convex. We then apply duality theory to decompose this approximate problem vertically into a rate control subproblem and a scheduling subproblem that interact through shadow prices. Based on this decomposition, a distributed subgradient algorithm for joint rate control and scheduling is designed, and proved to approach arbitrarily close to the optimum of the approximate problem. Finally, we show that our approximate solution can achieve near-optimal performance through both theoretical analysis and simulations. To the best of our knowledge, this work is the first one to present a joint design of rate allocation and scheduling in the context of data aggregation in wireless sensor networks.
The rest of the paper is organized as follows. Section 2 summarizes the related work. We introduce the system model in Section 3 and formulate the problem in Section 4. In Section 5, the original problem is transformed into a convex approximate problem, with the solution given in Section 6. In Section 7, we explain how the proposed solution is implemented in a decentralized manner. Then, we discuss some related issues in Section 8, and evaluate the performance of the proposed schemes in Section 9. Section 10 concludes the paper.
2. RELATED WORK
In this section, we provide brief summaries of the existing work on sensory data aggregation and network utility maximization respectively, and clarify the novelty of this paper.
Sensory data aggregation has become a research hotspot since the presentation of the seminal work [1]. A large variety of problems regarding this topic have been extensively studied. Representative problems include: how to construct the most energy-efficient aggregation tree [11, 12], how to schedule the transmissions of sensor nodes such that the aggregation delay can be minimized [13, 14], how to maximize or bound the lifetime of the aggregation tree [15, 16], how to secure data aggregation [17, 18], how to derive theoretical bounds on aggregation capacity [19, 20], how to achieve fair aggregation [21, 22], etc.
The framework of network utility maximization (NUM) was first developed in the context of wireline networks in [2, 3], followed by [4, 5, 6]. The main idea of the framework is based on the decomposition of a system-wide optimization problem. A distributed congestion control mechanism is designed to drive the solutions of the decomposed problems towards the global optimum. Later on, NUM was studied in the context of wireless networks. In wireless networks, this problem is more difficult because of the interference nature of wireless communication. To address this challenge, a scheduling mechanism is integrated into the optimization framework to stabilize the system [7, 8, 9, 10].
This work is the first attempt to utilize the NUM framework to explore the optimal rate allocation for sensory data aggregation. The theoretical and practical challenges described in Section 1 make the problem we target completely different from previous work, and thus none of the existing solutions can be applied.
3. SYSTEM MODEL
In this section, we explain in detail the system model.
3.1 Aggregation Model
We consider an aggregation tree $T$ rooted at the sink. We denote the set of tree edges (links) by $L = L(T)$. The sensor nodes on $T$ can be divided into source nodes that collect sensory readings, and aggregation nodes that aggregate and relay sensory readings. In the rest of the paper, we assume that the source nodes are only at the leaf nodes of the aggregation tree. However, it is possible that a sensor node plays a dual role as both source node and aggregation node. This problem can be easily addressed through a simple graph transformation. In particular, we replace each source node at a non-leaf node by creating a new leaf node and placing the source node in it, and then connect the new leaf node to the non-leaf node where the source node is originally located by a link with infinite capacity¹.
¹ In practice, we do not place this link in any independent set, and thus it has no impact on the scheduling problem.
As in most applications of sensor networks, we assume that all the sensor nodes are equipped with the same sensing and communicating devices, and furthermore, the maximum collecting rate of the sensing device is larger than the maximum transmitting rate of the communicating device. The lifetime of the network is divided into slots with equal duration, and each slot is further divided into subslots. The sensing device of each node may or may not collect data at a subslot. Once it decides to work within a subslot, it always collects data at its maximum speed. The data collected within a subslot is encapsulated into a packet, which is referred to as the Basic Transmission Unit (BTU).
3.2 Probabilistic Rate Model
As discussed in Section 1, the transmission rate of an aggregation node is constrained by not only the rates of its children but also their packet timestamp patterns. To characterize the packet timestamp patterns of sensor nodes, we introduce a probabilistic rate model, which is defined and explained as below:
Definition 1. Probabilistic Rate Model. At each subslot, the sensing device of a node chooses to collect data or not, based on a probability, which is referred to as the data collection rate of this node.
Suppose a time slot is composed of 100 subslots, and the data collection rate of a given sensor node is 0.5. Then, it works at roughly 50 randomly selected subslots and sleeps the rest of the time. As a result, it collects around 50 packets (BTU) within a time slot.
We define the data transmission rate of a node as the ratio of the number of packets (BTU) this node delivers within a time slot to the number of subslots in each slot. Data transmission rate is actually a normalized rate. Its relation with the genuine transmission rate, which is equal to the number of bits that a node delivers within a time slot, can be reflected by the formula:

$$x_N = \frac{x_G}{C_B \times N_{sub}},$$

where $x_N$ and $x_G$ denote the normalized and genuine data transmission rate, respectively. In addition, $C_B$ is the size of a BTU, and $N_{sub}$ represents the number of subslots in each slot. For example, assume the genuine data transmission rate of a node is 1000 bps, $C_B$ is 100 bits and $N_{sub}$ is 20. Based on the above formula, the normalized data transmission rate is $1000 / (100 \times 20) = 0.5$.
The probabilistic rate model can be considered as a generalization of a node’s packet timestamp pattern. This can be better understood after we mathematically formulate the problem in the next section. Additionally, in the sensing tasks which require periodic data collection, the probabilistic rate model can capture the long-term expectation of time-varying time offsets of the packets from different nodes. It is extremely difficult to mathematically model the fixed time offsets. More detailed discussion can be found in Section 8.2.
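As a minimal sketch of the two notions in Section 3.2, the Python snippet below converts a genuine rate into a normalized rate and draws one slot of collection decisions under the probabilistic rate model. The function names, the fixed seed, and the use of independent per-subslot draws are assumptions made for illustration, not details given in the paper.

```python
import random

def normalized_rate(genuine_rate, btu_bits, subslots_per_slot):
    """x_N = x_G / (C_B * N_sub)."""
    return genuine_rate / (btu_bits * subslots_per_slot)

def sample_collection_subslots(collection_rate, subslots_per_slot, rng):
    """Pick the subslots in which the node senses, one independent draw per subslot."""
    return [s for s in range(subslots_per_slot) if rng.random() < collection_rate]

# Worked example from the text: 1000 bps, 100-bit BTU, 20 subslots per slot.
x_n = normalized_rate(1000, 100, 20)
print(x_n)  # 0.5

# One slot of 100 subslots at collection rate 0.5: about 50 active subslots.
rng = random.Random(0)
active = sample_collection_subslots(0.5, 100, rng)
print(len(active))
```

The list `active` plays the role of the node's packet timestamps for that slot; repeating the draw gives a different pattern with the same expected rate.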
4. PROBLEM FORMULATION
4.1 Terminology
Link flow: We define link flow as the single-hop data traffic going through a link (tree edge) of the aggregation tree. Conceptually, it includes source flow originating at source nodes and aggregation flow coming from aggregation nodes. The set of source flows and the set of aggregation flows are denoted by $F_S$ and $F_A$. In addition, $F = F_S \cup F_A$ represents the set of all the flows on the aggregation tree. For any flow $f \in F$, we use $\pi(f)$ and $C(f)$ to denote its parent flow and set of children flows, respectively. Finally, $f(l)$ denotes the flow which goes through a link $l \in L$, and $l(f)$ denotes the link through which a flow $f \in F$ passes. We denote the normalized rate of each flow $f \in F$ by $x_f$. For a source flow $f \in F_S$, its rate is quantitatively equal to the data collection rate of the corresponding source node. We use $X_f$ to denote the interval in which $x_f$ must lie:

$$X_f := \begin{cases} [m_f, M_f] & \text{if } f \in F_S \\ [0, M_f] & \text{if } f \in F_A \end{cases}$$

where $m_f$ and $M_f$ are the lower and upper bounds of $x_f$.
Queue: At each aggregation node, the packets coming from each of its children flows $f$ are buffered in a queue, denoted by $Q(f)$. The packets in each queue are sorted by their timestamps, as shown in Fig. 1 and Fig. 2. We use $T_t(f)$ to denote the timestamp of the packet on the top of $Q(f)$ (largest timestamp), and $T_b(f)$ to denote the timestamp of the packet at the bottom of $Q(f)$ (smallest timestamp). In addition, the aggregated packets (available data) are stored in a separate queue, waiting for transmission.
4.2 Constraints
In this part, we elaborate on the constraints that our objective function is subject to.
Network Capacity Constraint: Based on the network topology, a conflict graph [23] can be constructed to capture the contention relations among the links. In the conflict graph, each vertex represents a link, and an edge between two vertices implies the contention between the two corresponding links, i.e., they cannot transmit at the same time. Given a conflict graph, we can identify all its independent sets of vertices. The links in an independent set can transmit simultaneously.
Let $I$ denote the set of independent sets. We represent an independent set, $I_i$ ($i = 1, 2, \ldots, |I|$), as a $|L|$-dimensional rate vector, which is $r^i$. In $r^i$, the $l$-th entry is

$$r^i_l := \begin{cases} c_l & \text{if } l \in I_i \\ 0 & \text{otherwise} \end{cases}$$

where $c_l$ denotes the capacity of link $l \in L$. Here we should note that this capacity is a normalized capacity, which is defined as the ratio of the maximum number of packets (BTU) that can be delivered through a link within a time slot to the number of subslots in each slot. The feasible capacity region $\Pi$ at the link layer is defined as the convex hull of these rate vectors:
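
$$\Pi := \Big\{ r \;\Big|\; r = \sum_{i=1}^{|I|} \alpha_i r^i,\ \alpha_i \ge 0,\ \sum_{i=1}^{|I|} \alpha_i = 1 \Big\}.$$

With the above notations, we can now formally define the network capacity constraint as follows:

$$x_{f(l)} \le r_l \quad \text{for all } l \in L.$$
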
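The construction of the capacity region can be illustrated with a small Python sketch. The four-link conflict graph, the capacities, and the helper names below are hypothetical; the brute-force enumeration is only practical for tiny graphs and is not the scheduling method of the paper.

```python
from itertools import combinations

def independent_sets(num_links, conflicts):
    """All non-empty sets of links in which no two links conflict."""
    conflict_pairs = {frozenset(pair) for pair in conflicts}
    found = []
    for size in range(1, num_links + 1):
        for subset in combinations(range(num_links), size):
            pairs = combinations(subset, 2)
            if all(frozenset(p) not in conflict_pairs for p in pairs):
                found.append(subset)
    return found

def rate_vector(ind_set, capacities):
    """r^i: capacity c_l on the links of the independent set, 0 elsewhere."""
    return [c if l in ind_set else 0.0 for l, c in enumerate(capacities)]

def time_share(alphas, vectors):
    """Convex combination of rate vectors, i.e. a point of the region Pi."""
    if any(a < 0 for a in alphas) or abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("alphas must be non-negative and sum to 1")
    return [sum(a * v[l] for a, v in zip(alphas, vectors))
            for l in range(len(vectors[0]))]

def capacity_ok(flow_rates, r):
    """Network capacity constraint: x_f(l) <= r_l on every link."""
    return all(x <= r_l + 1e-12 for x, r_l in zip(flow_rates, r))

# Hypothetical example: links 0-1, 1-2 and 2-3 interfere with each other.
capacities = [0.8, 0.6, 0.8, 0.6]
conflicts = [(0, 1), (1, 2), (2, 3)]
sets = independent_sets(4, conflicts)
r = time_share([0.5, 0.5],
               [rate_vector((0, 2), capacities), rate_vector((1, 3), capacities)])

print(len(sets))                                # 7
print(capacity_ok([0.4, 0.3, 0.3, 0.2], r))     # True
```

Here the schedule alternates between the independent sets {0, 2} and {1, 3}, so each link receives half of its capacity.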
Namely, the rate of each flow must not exceed the amount of capacity allocated to the link it passes through. Let $r := \{(r_{l_1}, r_{l_2}, \ldots, r_{l_{|L|}}) \mid l_i \in L\}$, and it should satisfy $r \in \Pi$.
Data Aggregation Constraint: As in the rate control of unicast and multicast scenarios, it is essential to investigate the relationship between the rate of a parent flow and the rates of its children flows so as to stabilize the network. However, the difficulty of achieving this goal in the context of data aggregation is much larger, since a slight change in the packet timestamp pattern of a node may incur significant change in the resulting aggregated packets, as disclosed in Section 1. To overcome this difficulty, we adopt a probabilistic rate model, which is defined in Section 3. Given the rate of a node, this model can generalize all the possibilities of this node’s packet timestamp patterns.
Under the probabilistic rate model, the data aggregation constraint can be formulated as follows:

$$1 - \prod_{f_c \in C(f)} (1 - x_{f_c}) \le x_f \quad \text{for all } f \in F_A.$$

Here we give an interpretation of this constraint which may help in understanding it. We say a node covers a subslot if there exists a packet transmitted by this node whose timestamp is exactly this subslot. Then, $1 - x_f$ denotes the percentage of the subslots which are not covered by the sending node of $f$. Later, we use the concepts of a flow and its sending node interchangeably. It follows that $\prod_{f_c \in C(f)} (1 - x_{f_c})$ is the percentage of the subslots not covered by any of $f$’s children nodes. Intuitively, a parent node needs to cover all the subslots covered by at least one child, whose percentage is $1 - \prod_{f_c \in C(f)} (1 - x_{f_c})$. By this intuition, the data aggregation constraint is presented.
Data Availability Constraint: From the example shown in Fig. 2, we learn that an aggregation node cannot take any actions before it makes sure that all the packets collected at the same time have arrived. Given a timestamp, after receiving a packet with this timestamp or a larger timestamp from each child node, the aggregation node will know that all the packets with this timestamp have arrived. Here we assume that packets arrive in the order of their timestamps. Then, it performs the aggregation and puts the resulting aggregated packet into the queue where the packets available for transmission are stored. Thus, the timestamps of the available packets are constrained within the time interval from the smallest $T_b(f)$ to the smallest $T_t(f)$ among all the queues. Recall that $T_b(f)$ and $T_t(f)$ denote the timestamps of the packets on the bottom and the top of each queue, respectively.
We define a binary indicator variable $b_\tau(f)$ as follows:

$$b_\tau(f) = \begin{cases} 1 & \text{if there is a packet with timestamp } \tau \text{ in } Q(f) \\ 0 & \text{otherwise.} \end{cases}$$

In terms of the number of BTUs, we denote the amount of the available data at the sending node of $f$ by $\lambda_f$. $\lambda_f$ can be calculated as follows:

$$\lambda_f = \sum_{\tau = \min_{f_i \in C(f)} T_b(f_i)}^{\min_{f_j \in C(f)} T_t(f_j)} \Big( \bigvee_{f_k \in C(f)} b_\tau(f_k) \Big) \tag{1}$$

where $\bigvee$ denotes the bitwise operation “OR”. By this formula, we can easily check that in the example shown in Fig. 2, the amount of the available data at node C (stored in queue C) at time t1 (Fig. 2(b)) and t2 (Fig. 2(c)) are 0 and 6, respectively. Note that in the scenario at t2, we do not take into account the packet with timestamp 10 in queue B, since it is still unknown at this moment whether node A also has a packet with timestamp 10, and thus this packet cannot be marked as available.
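Equation (1) translates directly into a short Python function. The queue contents below are hypothetical: they are chosen so that the counts come out as 0 and 6, matching the values stated for Fig. 2, but they are not the figure's actual data. Returning 0 when some child queue is empty is an assumption consistent with the situation at time t1.

```python
def available_data(child_queues):
    """lambda_f from Eq. (1): timestamps in [min T_b, min T_t] held by at least one child."""
    if any(len(queue) == 0 for queue in child_queues):
        return 0  # assumption: nothing is known to be complete until every child has delivered
    lower = min(min(queue) for queue in child_queues)   # smallest T_b(f_i)
    upper = min(max(queue) for queue in child_queues)   # smallest T_t(f_j)
    return sum(
        1
        for tau in range(lower, upper + 1)
        if any(tau in queue for queue in child_queues)  # OR over b_tau(f_k)
    )

# Hypothetical queue contents (sets of timestamps).
queue_a = {1, 3, 5, 7, 9}
queue_b_at_t1 = set()            # B's packets are still held at relay node N
queue_b_at_t2 = {3, 4, 7, 10}

print(available_data([queue_a, queue_b_at_t1]))  # 0
print(available_data([queue_a, queue_b_at_t2]))  # 6
```

The packet with timestamp 10 in B's queue is not counted because the upper limit of the sum is the smaller of the two top timestamps, here 9.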
Furthermore, let $a_f$ be the amount of data which can be transmitted by the sending node of $f$. Then, the data availability constraint can be formally defined as follows:

$$a_f \le \lambda_f \quad \text{for all } f \in F_A.$$

In other words, each flow $f \in F_A$ cannot deliver more data than the amount of the available data stored at its sending node. Within a time slot, once the available data are all sent out, the sending node cannot do more transmission even if some of the queues for its children nodes are not empty. The data availability constraint minimizes the amount of packets a node could inject into the network. By this constraint, for each timestamp, there is at most one packet with this timestamp arriving at the sink. More importantly, the data availability constraint is actually the prerequisite of the data aggregation constraint, since the data aggregation constraint implicitly assumes all the packets from different sources collected at the same time are merged into a single packet.
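The coverage interpretation of the data aggregation constraint can be checked numerically. The sketch below is a Monte Carlo experiment under the probabilistic rate model; the child rates, the number of subslots, and the seed are hypothetical, and the children are assumed to make independent decisions in each subslot.

```python
import random

def required_parent_rate(child_rates):
    """1 - prod(1 - x_fc): expected fraction of subslots covered by at least one child."""
    uncovered = 1.0
    for x in child_rates:
        uncovered *= 1.0 - x
    return 1.0 - uncovered

def simulated_parent_rate(child_rates, num_subslots, rng):
    """Fraction of subslots in which at least one child produced a packet."""
    covered = 0
    for _ in range(num_subslots):
        hits = [rng.random() < x for x in child_rates]  # one independent draw per child
        if any(hits):
            covered += 1
    return covered / num_subslots

child_rates = [0.5, 0.4]
bound = required_parent_rate(child_rates)
estimate = simulated_parent_rate(child_rates, 100_000, random.Random(0))

print(round(bound, 6))     # 0.7
print(round(estimate, 2))
```

The simulated fraction should be close to 0.7, which is the smallest parent rate that satisfies the constraint for these two children, provided that packets with equal timestamps are merged as the data availability constraint requires.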
4.3 Problem Formulation
With the terminologies and constraints defined above, we can now formulate the problem to be solved. We associate each source flow $f \in F_S$ with a utility function $U_f(x_f) : \mathbb{R}^+ \to \mathbb{R}^+$. In this paper, we assume $U_f$ is continuously differentiable, increasing, and strictly concave. Our objective is to choose the rate of each flow $x_f$ and the allocated capacity of each link $r_l$ so as to maximize the aggregate utility function. We now formulate the problem of optimal rate allocation for data aggregation in sensor networks as the following constrained nonlinear optimization problem:

$$\mathbf{P}: \quad \max \sum_{f \in F_S} U_f(x_f) \tag{2}$$

subject to

$$x_{f(l)} \le r_l \quad \text{for all } l \in L \tag{3}$$

$$a_f \le \lambda_f \quad \text{for all } f \in F_A \tag{4}$$

$$1 - \prod_{f_c \in C(f)} (1 - x_{f_c}) \le x_f \quad \text{for all } f \in F_A \tag{5}$$

$$x \in X, \quad r \in \Pi$$

In P, the data availability constraint (4) works as the prerequisite of the data aggregation constraint (5). However, it is actually an implicit constraint that need not be considered when solving this optimization problem, although in practice each aggregation node works following this constraint. We will give more detailed explanation on this point in Section 8.1.
By choosing appropriate utility functions, the optimal rate allocation can achieve different fairness models among the flows [2, 3]. For instance, if we let $U_f(x_f) = w_f \ln(x_f)$ for $f \in F_S$, the weighted proportional fairness can be achieved.
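To make problem P concrete, the following Python sketch solves a toy instance by brute-force grid search. It is not the algorithm of the paper: it assumes the two-source tree of Fig. 1 (A and B feeding C, which feeds the sink), treats the link capacities as fixed numbers instead of optimizing over the capacity region, ignores the implicit constraint (4), and uses the proportional-fairness utility with unit weights. All numbers are hypothetical.

```python
import math

def feasible(x_a, x_b, r_a, r_b, r_c):
    """Constraints (3) and (5) for the tree A, B -> C -> S with fixed capacities."""
    if x_a > r_a or x_b > r_b:
        return False
    x_c = 1.0 - (1.0 - x_a) * (1.0 - x_b)   # smallest rate C may use under (5)
    return x_c <= r_c + 1e-12

def grid_search(r_a, r_b, r_c, weights=(1.0, 1.0), steps=100):
    """Maximize w_A ln(x_A) + w_B ln(x_B) over a grid of source rates."""
    best_rates, best_utility = None, -math.inf
    for i in range(1, steps):
        for j in range(1, steps):
            x_a, x_b = i / steps, j / steps
            if not feasible(x_a, x_b, r_a, r_b, r_c):
                continue
            utility = weights[0] * math.log(x_a) + weights[1] * math.log(x_b)
            if utility > best_utility:
                best_rates, best_utility = (x_a, x_b), utility
    return best_rates, best_utility

rates, utility = grid_search(r_a=0.9, r_b=0.9, r_c=0.75)
print(rates)  # (0.5, 0.5)
```

With equal weights the bottleneck at C is shared evenly: both sources send at 0.5, and C needs rate 1 - 0.5 × 0.5 = 0.75, which exactly uses its capacity.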
5. APPROXIMATE PROBLEM
5.1 Variable Substitution
Though we formulate the problem of optimal rate allocation, it turns out to be a non-convex program, due to the non-convexity of the data aggregation constraint (5). To address this problem, we reorganize the data aggregation constraint as $1 - x_f \le \prod_{f_c \in C(f)} (1 - x_{f_c})$ and take a log transform on both sides: $\ln(1 - x_f) \le \sum_{f_c \in C(f)} \ln(1 - x_{f_c})$. Next, we substitute $x_f$ of each flow by $\tilde{x}_f = -\ln(1 - x_f)$, where we call $\tilde{x}_f$ the transformed rate of $f$. By this variable substitution, the data aggregation constraint becomes $\sum_{f_c \in C(f)} \tilde{x}_{f_c} \le \tilde{x}_f$. In the rest of this paper, we refer to this constraint as the transformed aggregation constraint. In addition, based on the feasible region of $x_f$ ($f \in F$), we can derive the feasible region of $\tilde{x}_f$ as

$$\widetilde{X}_f := \begin{cases} [-\ln(1 - m_f), -\ln(1 - M_f)] & \text{if } f \in F_S \\ [0, -\ln(1 - M_f)] & \text{if } f \in F_A \end{cases}$$

where $-\ln(1 - m_f) \ge 0$ and $-\ln(1 - M_f) < \infty$.
By the variable substitution described above, we transform the data aggregation constraint into a linear constraint. However, this substitution has a side effect, namely, it turns the network capacity constraint into $1 - \exp(-\tilde{x}_{f(l)}) - r_l \le 0$, another non-convex constraint. To overcome this problem, we reorganize this constraint and take a log transform on both sides, then we have $\tilde{x}_{f(l)} \le -\ln(1 - r_l)$. Next, we take another variable substitution on $r_l$: $\tilde{r}_l = -\ln(1 - r_l)$, where we call $\tilde{r}_l$ the transformed allocated capacity of link $l \in L$. By this variable substitution, the non-convex constraint is transformed into $\tilde{x}_{f(l)} - \tilde{r}_l \le 0$. We name this constraint the transformed capacity constraint. Recall that $r_l$ is a normalized capacity allocated to link $l$, and thus it satisfies $0 \le r_l \le 1$. Since $r_l = 1 - \exp(-\tilde{r}_l)$, it can be derived that $0 \le \tilde{r}_l \le \infty$. By substituting $\tilde{r}_l$ for $r_l$, the original capacity region $\Pi$ is transformed into a transformed capacity region $\Pi'$:

$$\Pi' := \{ \tilde{r} \mid \tilde{r}_{l_i} = -\ln(1 - r_{l_i}),\ i = 1, 2, \ldots, |L|,\ r \in \Pi \}$$

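The substitution of Section 5.1 can be sanity-checked numerically. The Python sketch below, with hypothetical rates and helper names, confirms on two test points that the original aggregation constraint and its transformed linear form give the same verdict, and that the substitution is invertible.

```python
import math

def to_transformed(x):
    """x_tilde = -ln(1 - x), defined for 0 <= x < 1."""
    return -math.log(1.0 - x)

def from_transformed(x_tilde):
    """Inverse substitution: x = 1 - exp(-x_tilde)."""
    return 1.0 - math.exp(-x_tilde)

def aggregation_ok(parent, children):
    """Original constraint: 1 - prod(1 - x_fc) <= x_f."""
    uncovered = 1.0
    for x in children:
        uncovered *= 1.0 - x
    return 1.0 - uncovered <= parent + 1e-12

def transformed_aggregation_ok(parent_t, children_t):
    """Transformed constraint: sum of transformed child rates <= transformed parent rate."""
    return sum(children_t) <= parent_t + 1e-12

children = [0.5, 0.4]                      # hypothetical child rates
children_t = [to_transformed(x) for x in children]

for parent in (0.75, 0.65):                # one feasible and one infeasible parent rate
    original = aggregation_ok(parent, children)
    linear = transformed_aggregation_ok(to_transformed(parent), children_t)
    print(parent, original, linear)
```

The two checks agree for both parent rates (feasible at 0.75, infeasible at 0.65), because the map from $x$ to $-\ln(1 - x)$ is strictly increasing on [0, 1).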
The transformed region $\Pi'$, however, is not convex. Figure 3 illustrates this region transformation. Particularly, Fig. 3(a) shows an example of a two-dimensional capacity region $\Pi$, and the transformed capacity region $\Pi'$ is drawn in Fig. 3(b). As can be seen, $\Pi'$ (the shaded area) is not a convex region.
Figure 3: Region transformation. (a) $\Pi$. (b) $\Pi'$. (c), (d): panel labels garbled in the source.
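The non-convexity of the transformed region can be reproduced with a two-link example in Python. The setting is an assumption made for illustration: two mutually conflicting links with normalized capacity 0.8 each, so that the original region is the triangle with $r_1/c_1 + r_2/c_2 \le 1$ (the idle schedule is taken to be allowed).

```python
import math

C1, C2 = 0.8, 0.8   # hypothetical normalized link capacities

def in_pi(r):
    """Membership in the assumed original region: time sharing between two links."""
    return r[0] >= 0 and r[1] >= 0 and r[0] / C1 + r[1] / C2 <= 1 + 1e-12

def transform(r):
    """r_tilde_l = -ln(1 - r_l), applied entry by entry."""
    return tuple(-math.log(1.0 - v) for v in r)

def inverse(r_tilde):
    """r_l = 1 - exp(-r_tilde_l), applied entry by entry."""
    return tuple(1.0 - math.exp(-v) for v in r_tilde)

def in_pi_prime(r_tilde):
    """A transformed point belongs to Pi' exactly when its preimage belongs to Pi."""
    return in_pi(inverse(r_tilde))

p_t = transform((C1, 0.0))   # transformed extreme point: only link 1 active
q_t = transform((0.0, C2))   # transformed extreme point: only link 2 active
mid_t = tuple((a + b) / 2 for a, b in zip(p_t, q_t))

print(in_pi_prime(p_t), in_pi_prime(q_t))  # True True
print(in_pi_prime(mid_t))                  # False
```

Both transformed extreme points lie in $\Pi'$, but their midpoint maps back to roughly (0.55, 0.55), which needs more than the available airtime. A region that misses the midpoint of two of its own points is not convex.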
To tackle this problem, we constitute an approximate transformed capacity region $\widetilde{\Pi}$, which is convex. Recall that the original capacity region $\Pi$ is actually the convex hull of the rate vectors of all the independent sets. In fact, these rate vectors are the extreme points of $\Pi$, since none of them can be represented by a convex combination of the others. In the transformed capacity region $\Pi'$, let $\tilde{r}^i$ denote the point (vector) transformed from the $i$-th extreme point $r^i$ (rate vector of the $i$-th independent set) in $\Pi$. In $\tilde{r}^i$, the $l$-th entry is

$$\tilde{r}^i_l := \begin{cases} \tilde{c}_l & \text{if } l \in I_i \\ 0 & \text{otherwise} \end{cases}$$

where $\tilde{c}_l$ is referred to as the transformed capacity, and defined by $\tilde{c}_l = -\ln(1 - c_l)$.
Now, we can define the approximate transformed capacity region $\widetilde{\Pi}$ as the convex hull of these transformed rate vectors:

$$\widetilde{\Pi} := \Big\{ \tilde{r} \;\Big|\; \tilde{r} = \sum_{i=1}^{|I|} \alpha_i \tilde{r}^i,\ \alpha_i \ge 0,\ \sum_{i=1}^{|I|} \alpha_i = 1 \Big\}.$$

It is not difficult to prove that each $\tilde{r}^i$, $i = 1, 2, \ldots, |I|$, cannot be represented by a convex combination of the others either, and thus is an extreme point of $\widetilde{\Pi}$. Therefore, for each ($i$-th) independent set, there is a one-to-one mapping between its corresponding extreme points in $\Pi$ (i.e., $r^i$) and $\widetilde{\Pi}$ (i.e., $\tilde{r}^i$).
Figure 3(c) shows the approximate capacity region $\widetilde{\Pi}$ (the shaded area), which corresponds to the original capacity region $\Pi$ in Fig. 3(a). As can be seen, despite the convexity it achieves, it does not cover all the points of the transformed capacity region $\Pi'$ (the area enclosed by the dashed curve). Furthermore, it includes some points outside the boundary of $\Pi'$, and this implies that $\widetilde{\Pi}$ may result in some solutions which are not feasible in the original problem. Actually, if we take a reverse variable substitution (i.e., $r_l = 1 - \exp(-\tilde{r}_l)$) on each point $\tilde{r} \in \widetilde{\Pi}$, a new region denoted by $\widetilde{\Pi}'$ is attained, and shown in Fig. 3(d) (the shaded area). As one can see, it does have some points outside the original capacity region $\Pi$ (the area enclosed by the dashed curve).

However, in our algorithm that will be introduced in the next section, we do not map the solution in $\widetilde{\Pi}$ to $\Pi$ in this way, namely, through $r_l = 1 - \exp(-\tilde{r}_l)$. Instead, we design a safe mapping scheme, which can guarantee that there always exists a feasible point in $\Pi$ corresponding to the solution attained in the context of $\widetilde{\Pi}$.

Based on the definition of $\widetilde{\Pi}$, any point in $\widetilde{\Pi}$, say $\tilde{r}_0$, can be expressed as $\tilde{r}_0 = \sum_{i=1}^{I} \alpha_i \tilde{r}^i$. Its counterpart in the original problem is $r^G := \{(r^G_l) \mid r^G_l = 1 - \exp(-\tilde{r}_{0l}),\ l \in L\}$, and it may not be located inside $\Pi$. However, in $\Pi$, we can always find a point, which is $r^L = \sum_{i=1}^{I} \alpha_i r^i$, where each $\alpha_i$ equals the $\alpha_i$ in $\tilde{r}_0 = \sum_{i=1}^{I} \alpha_i \tilde{r}^i$. By this mapping scheme, wherever the optimal solution is located in $\widetilde{\Pi}$, our algorithm can identify a corresponding feasible solution inside $\Pi$. In the rest of this paper, we refer to $r^G$ as the genuine mapping point of $\tilde{r}_0$, and $r^L$ as the linear mapping point of $\tilde{r}_0$. Similarly, given a point $r_0 = \sum_{i=1}^{I} \alpha_i r^i$ inside $\Pi$, we can define $\tilde{r}^G := \{(\tilde{r}^G_l) \mid \tilde{r}^G_l = -\ln(1 - r_{0l}),\ l \in L\}$ and $\tilde{r}^L = \sum_{i=1}^{I} \alpha_i \tilde{r}^i$ as the genuine mapping point and the linear mapping point of $r_0$ in $\widetilde{\Pi}$.

Now, we can formally define the approximate problem as follows:
$$\widetilde{P}: \quad \max \sum_{f \in F_S} U_f(1 - \exp(-\tilde{x}_f)) \tag{6}$$

subject to

$$\tilde{x}_{f(l)} - \tilde{r}_l \le 0 \quad \text{for all } l \in L \tag{7}$$

$$\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f \le 0 \quad \text{for all } f \in F_A \tag{8}$$

$$\tilde{x} \in \widetilde{X}, \quad \tilde{r} \in \widetilde{\Pi}$$

According to [24] (Chapter 3.2.4), since $1 - \exp(-\tilde{x}_f)$ is a strictly concave and increasing function, the objective function (6) remains strictly concave and increasing. Thus, $\widetilde{P}$ is a convex problem, and always has a unique maximizer. Once we identify this maximizer, we can use its linear mapping point in $\Pi$ as the approximate solution of $P$.

5.2 Approximation Analysis
In this subsection, we provide some theoretical analysis on both the original problem $P$ and the approximate problem $\widetilde{P}$.

Theorem 1. The optimal solution of $P$ ($\widetilde{P}$) must be attained on the boundary of $\Pi$ ($\widetilde{\Pi}$).

PROOF. Here we only show the proof for $P$, since the proof for $\widetilde{P}$ is similar. By contradiction, suppose the optimal solution of $P$,
denoted by $r^*$, is a strictly interior point of $\Pi$ (see footnote 2). Since the components of $r^*$ appear only in the network capacity constraint (i.e., $x_{f(l)} \le r_l$), we do not need to check other constraints. At optimality, the network capacity constraint may or may not be active (we say a constraint is active if it attains equality). If it is not active, $x_{f(l)}$ will not change if we increase $r_l$. On the other hand, if it is active, $x_{f(l)}$ will go up to some extent with the increase of $r_l$. As a result, the objective value will be improved, since it is strictly increasing with $x_{f(l)}$. Since $r^*$ is an interior point, there must exist some room to increase some components of $r^*$ without changing the others. This contradicts the assumption that $r^*$ is the optimal solution. Therefore, $r^*$ must be located on the boundary of $\Pi$.
In $\Pi$ ($\widetilde{\Pi}$), which is a compact $L$-dimensional polyhedron, each facet of its boundary is defined by the convex hull of $L$ extreme points. Thus, the optimal solution of $P$ ($\widetilde{P}$) can be expressed as $r^* = \sum_{i=1}^{L} \alpha_i r^i$ ($\tilde{r}^* = \sum_{i=1}^{L} \alpha_i \tilde{r}^i$). In addition, the approximate solution, which is the linear mapping point of $\tilde{r}^*$, is also located at the boundary of $\Pi$.

Footnote 2: In fact, a solution also includes the rate $x$; here we only consider the capacity $r$ simply for the ease of expression.

Now, we are interested in how far our approximate solution is from the optimal solution. In other words, we want to know the difference between our approximate objective value and the optimal objective value. We first introduce some notations. For any point $r_0$ ($\tilde{r}_0$) in $\Pi$ ($\widetilde{\Pi}$), we define $P(r_0)$ ($\widetilde{P}(\tilde{r}_0)$) as the optimization problem $P$ ($\widetilde{P}$) when $r$ ($\tilde{r}$) is fixed to be $r_0$ ($\tilde{r}_0$). In addition, let $P^*(r_0)$ ($\widetilde{P}^*(\tilde{r}_0)$) be the optimal objective value of $P(r_0)$ ($\widetilde{P}(\tilde{r}_0)$). Suppose $r^*$ ($\tilde{r}^*$) is the global optimal solution in $\Pi$ ($\widetilde{\Pi}$); we use $P^* = P^*(r^*)$ ($\widetilde{P}^* = \widetilde{P}^*(\tilde{r}^*)$) to denote the global optimal objective value of $P$ ($\widetilde{P}$).

Then, we investigate the performance of the approximate solution. Suppose the objective value of our approximate solution is $\hat{P}^*$. In the rest of this section, we first show that the difference between the global optimal objective value of $P$ (i.e., $P^*$) and $\hat{P}^*$ is bounded by $\widetilde{P}^* - \hat{P}^*$ through Theorem 2, and then give a looser but simpler bound by Theorem 3.

Theorem 2. The optimal objective value of the original problem $P$ is upper bounded by the optimal objective value of the approximate problem $\widetilde{P}$.

PROOF. Let the point in $\Pi$ which maximizes $P$ be $r^*$, as shown in Fig. 3(a). Thus, $r^*$ can be expressed as $r^* = \sum_{i=1}^{L} \alpha_i r^i$. Suppose its genuine mapping point in $\widetilde{\Pi}$ is $\tilde{r}^G$. As can be seen in Fig. 3(b), it may not be inside $\widetilde{\Pi}$. However, in $\widetilde{\Pi}$, we can always find the linear mapping point of $r^*$, which is denoted by $\tilde{r}^L$ and shown in Fig. 3(b). Since the function $f(x) = -\ln(1 - x)$ is strictly convex, it can be derived that for each $l \in L$,

$$\tilde{r}^G_l = -\ln(1 - r^*_l) \le \sum_{i=1}^{L} \alpha_i \big(-\ln(1 - r^i_l)\big) = \sum_{i=1}^{L} \alpha_i \tilde{r}^i_l = \tilde{r}^L_l.$$

Similar to the proof of Theorem 1, we can show that $\widetilde{P}^*(\tilde{r}^G) \le \widetilde{P}^*(\tilde{r}^L)$ by moving each component of $\tilde{r}^G$ towards $\tilde{r}^L$. Since $P^* = \widetilde{P}^*(\tilde{r}^G)$ and $\widetilde{P}^*(\tilde{r}^L) \le \widetilde{P}^*$, it can be concluded that $P^* \le \widetilde{P}^*$.

By Theorem 2, the approximation ratio of our solution can be bounded by $\frac{\widetilde{P}^* - \hat{P}^*}{\hat{P}^*}$. Next, we give a looser but simpler bound of $P^* - \hat{P}^*$.

Theorem 3. Suppose that the optimal solution of $\widetilde{P}$ is $\tilde{r}^*$, and its linear mapping point in $\Pi$, i.e., the approximate solution, is $r^L$. Furthermore, let $\tilde{r}_0$ be $r^L$'s corresponding genuine mapping point in $\widetilde{\Pi}$. Then, the value of $P^* - \hat{P}^*$ is bounded by $(\mu^{\alpha*})^T(\tilde{r}^* - \tilde{r}_0)$, where $\mu^{\alpha*}$ represents the vector of the optimal dual variables of problem $\widetilde{P}(\tilde{r}_0)$.

PROOF. As can be seen, $\tilde{r}^*$ and $\tilde{r}_0$ are shown in Fig. 3(c), and $r^L$ is shown in Fig. 3(d). Since $r^L$ is the approximate solution, by Theorem 2, $P^* - P^*(r^L)$ is bounded by $\widetilde{P}^*(\tilde{r}^*) - P^*(r^L)$, which is further equal to $\widetilde{P}^*(\tilde{r}^*) - \widetilde{P}^*(\tilde{r}_0)$. Based on the theory of perturbation and sensitivity (Chapter 5.6 in [24]), we denote the perturbed version of the optimization problem $\widetilde{P}(\tilde{r}_0)$ by $\widetilde{P}_{\tilde{r}_0}$, in which the transformed capacity constraint is replaced by $\tilde{x}_{f(l)} - \tilde{r}_{0l} \le u_l$. Here $u := (u_l,\ l \in L)$ is the vector of perturbation variables. It is evident that $\widetilde{P}_{\tilde{r}_0}$ coincides with problem $\widetilde{P}(\tilde{r}_0)$ when $u$ is a zero vector. On the other hand, when $u_l$ is positive it means that we have relaxed the transformed capacity constraint of link $l$.

We denote the optimal objective value of $\widetilde{P}_{\tilde{r}_0}$ at $u$ by $\widetilde{P}^*_{\tilde{r}_0}(u)$. According to [24] (Chapter 5.6.1), since problem $\widetilde{P}(\tilde{r}_0)$ is concave, $\widetilde{P}^*_{\tilde{r}_0}(u)$ is a concave function of $u$. It follows that $\widetilde{P}^*_{\tilde{r}_0}(u) \le \widetilde{P}^*_{\tilde{r}_0}(0) + (\mu^{\alpha*})^T u$. Therefore, letting $u = \tilde{r}^* - \tilde{r}_0$, the difference between $P^*$ and $P^*(r^L)$ can be bounded as follows:

$$P^* - P^*(r^L) \le \widetilde{P}^*(\tilde{r}^*) - \widetilde{P}^*(\tilde{r}_0) = \widetilde{P}^*_{\tilde{r}_0}(u) - \widetilde{P}^*_{\tilde{r}_0}(0) \le (\mu^{\alpha*})^T u = (\mu^{\alpha*})^T(\tilde{r}^* - \tilde{r}_0).$$

Let $f(x) = -\ln(1 - x)$, and thus $f^{-1}(y) = 1 - \exp(-y)$. As previously discussed, $\tilde{r}^* = \sum_{i=1}^{L} \alpha_i \tilde{r}^i$. It follows that $r^G = f^{-1}(\sum_{i=1}^{L} \alpha_i \tilde{r}^i)$ (for the sake of simplicity, here we use the notation of a vector to denote all of its components) and $r^L = \sum_{i=1}^{L} \alpha_i r^i$. Similar to the proof of Theorem 2, it can be proved that $r^G \ge r^L$ (i.e., $r^G_l \ge r^L_l$). Since $f(x)$ is strictly increasing, and $\tilde{r}_0 = f(r^L)$, it can be inferred that $\tilde{r}^* \ge \tilde{r}_0$ (i.e., $\tilde{r}^*_l \ge \tilde{r}_{0l}$). Therefore, each component of $\tilde{r}^* - \tilde{r}_0$ is nonnegative. Furthermore, since $\tilde{r}_0 = f(\sum_{i=1}^{L} \alpha_i f^{-1}(\tilde{r}^i))$, $(\mu^{\alpha*})^T(\tilde{r}^* - \tilde{r}_0)$ is a function of $\mu^{\alpha*}$ and $\tilde{r}^i$, $i = 1, 2, \ldots, L$.
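The orderings used in the proofs of Theorems 2 and 3 can be illustrated numerically. The following is a minimal sketch under stated assumptions: a hypothetical two-link region whose extreme points each activate one link at capacity 0.5, hypothetical weights `alphas`, and a made-up price vector `mu` that stands in for the optimal dual variables (which would really come from solving the problem).

```python
import math

def f(x):
    """Variable substitution f(x) = -ln(1 - x)."""
    return -math.log(1.0 - x)

def f_inv(y):
    """Inverse substitution f^{-1}(y) = 1 - exp(-y)."""
    return 1.0 - math.exp(-y)

def convex_combination(alphas, points):
    """Component-wise sum_i alpha_i * point_i."""
    dim = len(points[0])
    return [sum(a * p[l] for a, p in zip(alphas, points)) for l in range(dim)]

# Hypothetical extreme points r^i of the original region and their transforms.
extreme_points = [[0.5, 0.0], [0.0, 0.5]]
transformed_points = [[f(v) for v in p] for p in extreme_points]
alphas = [0.6, 0.4]

r_tilde_star = convex_combination(alphas, transformed_points)  # point in the approximate region
r_G = [f_inv(v) for v in r_tilde_star]                         # genuine mapping point
r_L = convex_combination(alphas, extreme_points)               # linear mapping point
r_tilde_0 = [f(v) for v in r_L]                                # genuine mapping point of r_L

# Placeholder prices, NOT computed from the problem; used only to evaluate the bound.
mu = [1.0, 2.0]
bound = sum(m * (a - b) for m, a, b in zip(mu, r_tilde_star, r_tilde_0))

print("r_G >= r_L:", all(g >= l for g, l in zip(r_G, r_L)))
print("r_tilde_star >= r_tilde_0:", all(a >= b for a, b in zip(r_tilde_star, r_tilde_0)))
print("bound term:", bound)
```

In this example the genuine mapping point dominates the linear mapping point component-wise, so every component of the difference entering the bound is nonnegative, as the proof states.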
From this bound, it can be seen that $P^* - \hat{P}^*$ is proportional to the difference between $\tilde{r}^*$ and $\tilde{r}_0$. Actually, it is not difficult to show that when the capacity of each link decreases, the difference between $\tilde{r}^*$ and $\tilde{r}_0$ will drop accordingly. However, this does not necessarily mean that $(\mu^{\alpha*})^T(\tilde{r}^* - \tilde{r}_0)$ will also drop, since $\mu^{\alpha*}$ may increase with the decrease of capacities (see footnote 3). In fact, $\mu^{\alpha*}$ depends on the particular utility function we choose, and thus there is no universal conclusion on this point. In Section 9, we will show an example in which $P^* - \hat{P}^*$ drops when the capacity of each link is reduced.
6. CROSS LAYER DESIGN VIA DUAL DECOMPOSITION
6.1 The Dual Problem
Solving $\widetilde{P}$ directly requires global coordination of all flows, which is impractical in a distributed environment such as sensor networks. Since $\widetilde{P}$ is a convex program with compact feasible region, strong duality can be achieved (see footnote 4) (Chapter 5.2.3 in [24]). Therefore, there exists a unique maximizer $(\tilde{x}^*, \tilde{r}^*)$ for $\widetilde{P}$, which can be attained by a distributed algorithm derived via formulating and solving the Lagrange dual problem of $\widetilde{P}$. In order to achieve this, we first take a look at the Lagrangian of $\widetilde{P}$:

$$L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta}) = \sum_{f \in F_S} U_f(1 - \exp(-\tilde{x}_f)) - \sum_{l \in L} \mu^{\alpha}_l (\tilde{x}_{f(l)} - \tilde{r}_l) - \sum_{f \in F_A} \mu^{\beta}_f \Big(\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f\Big).$$

Footnote 3: For more detailed explanation on $\mu^{\alpha*}$, please refer to Section 6.

Footnote 4: Slater's condition can be guaranteed by assuming there exist vectors $\tilde{x} \in \widetilde{X}$ and $\tilde{r} \in \widetilde{\Pi}$ which satisfy all the constraints, i.e., strictly feasible points exist.
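In $L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta})$, $\mu^{\alpha} := (\mu^{\alpha}_l,\ l \in L)$ and $\mu^{\beta} := (\mu^{\beta}_f,\ f \in F)$ are vectors of Lagrangian multipliers, corresponding to the transformed capacity constraint (7) and the transformed aggregation constraint (8), respectively. They are also interpreted as the "shadow prices" of the constraints, which can be understood as the "costs" a flow will be charged if it violates the constraints.

Since it can be derived that

$$\begin{aligned}
\sum_{f \in F_A} \mu^{\beta}_f \Big(\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f\Big)
&= \sum_{f \in F_A} \mu^{\beta}_f \sum_{f_c \in C(f)} \tilde{x}_{f_c} - \sum_{f \in F_A} \mu^{\beta}_f \tilde{x}_f \\
&= \sum_{f \in F} \mu^{\beta}_{\pi(f)} \tilde{x}_f - \sum_{f \in F_A} \mu^{\beta}_f \tilde{x}_f \\
&= \sum_{f \in F_S} \mu^{\beta}_{\pi(f)} \tilde{x}_f + \sum_{f \in F_A} (\mu^{\beta}_{\pi(f)} - \mu^{\beta}_f) \tilde{x}_f
\end{aligned}$$

and $\sum_{l \in L} \mu^{\alpha}_l \tilde{x}_{f(l)} = \sum_{f \in F} \mu^{\alpha}_{l(f)} \tilde{x}_f$, we reorganize the Lagrangian as follows:

$$L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta}) = \sum_{f \in F_S} \Big[ U_f(1 - \exp(-\tilde{x}_f)) - (\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}) \tilde{x}_f \Big] + \sum_{f \in F_A} \Big[ (-\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f) \tilde{x}_f \Big] + \sum_{l \in L} \mu^{\alpha}_l \tilde{r}_l.$$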
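The dual of the primal problem $\widetilde{P}$ is:

$$\widetilde{D}: \quad \min_{\mu^{\alpha}, \mu^{\beta} \ge 0} D(\mu^{\alpha}, \mu^{\beta}),$$

where the dual objective function $D(\mu^{\alpha}, \mu^{\beta})$ is given as

$$D(\mu^{\alpha}, \mu^{\beta}) := \max_{\tilde{x} \in \widetilde{X},\ \tilde{r} \in \widetilde{\Pi}} L(\tilde{x}, \tilde{r}, \mu^{\alpha}, \mu^{\beta}).$$

In the dual objective function, the Lagrangian multipliers (shadow prices) $\mu^{\alpha}$ and $\mu^{\beta}$ serve as the dual variables. Furthermore, $D(\mu^{\alpha}, \mu^{\beta})$ can be decomposed into two separate optimization problems: $D(\mu^{\alpha}, \mu^{\beta}) = D_1(\mu^{\alpha}, \mu^{\beta}) + D_2(\mu^{\alpha})$. $D_1(\mu^{\alpha}, \mu^{\beta})$ and $D_2(\mu^{\alpha})$ are defined below:

$$D_1(\mu^{\alpha}, \mu^{\beta}) := \max_{\tilde{x} \in \widetilde{X}} \sum_{f \in F_S} \Big[ U_f(1 - \exp(-\tilde{x}_f)) - (\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}) \tilde{x}_f \Big] + \sum_{f \in F_A} \Big[ (-\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f) \tilde{x}_f \Big]$$

$$D_2(\mu^{\alpha}) := \max_{\tilde{r} \in \widetilde{\Pi}} \sum_{l \in L} \mu^{\alpha}_l \tilde{r}_l$$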
Among them, $D_1(\mu^{\alpha}, \mu^{\beta})$ denotes the rate allocation problem, while $D_2(\mu^{\alpha})$ is the scheduling problem. In particular, the rate allocation problem aims at finding the rate of each source node that maximizes the aggregate utility of all sources, subject to the constraint that the system is stable under some scheduling policy, while the scheduling problem focuses on finding a scheduling policy that stabilizes the system for any rate vector of sources picked by the rate allocation problem. In the rest of this section, we will first elaborate on these two problems separately, and then explain how to develop a cross-layer joint design of them.
6.2 Interpretation of the Prices
Before proceeding with the decoupled problems, we first provide a detailed explanation of the aforementioned shadow prices $\mu^{\alpha}$ and $\mu^{\beta}$. Theoretically, these prices represent the "costs" a flow will be charged if it violates the constraints. In practice, they imply the congestion information that the network elements need to share with each other, so that the traffic rates on different links of the network can be adjusted appropriately.

Let us first take a look at $\mu^{\alpha}$, which corresponds to the transformed capacity constraint in $\widetilde{P}$. When a flow $f$ violates this constraint (i.e., $\tilde{x}_{f(l)} > \tilde{r}_l$), if $f$ increases its rate by $d\tilde{x}_{f(l)}$, a cost of $\mu^{\alpha}_l\, d\tilde{x}_{f(l)}$ should be charged. Next, we give a practical interpretation of this cost in the context of the original problem $P$. Since $\tilde{x}_{f(l)} = -\ln(1 - x_{f(l)})$, it can be derived that $d\tilde{x}_{f(l)} = \frac{1}{1 - x_{f(l)}}\, dx_{f(l)}$. Therefore, the cost charged with respect to $x_{f(l)}$ is $\mu^{\alpha}_l\, d\tilde{x}_{f(l)} = \mu^{\alpha}_l \frac{1}{1 - x_{f(l)}}\, dx_{f(l)}$. In this paper, we call $\mu^{\alpha}_l \frac{1}{1 - \hat{x}_{f(l)}}$ the link price of flow $f$ when it passes data at a rate of $x_{f(l)} = \hat{x}_{f(l)}$. With link price, when a flow $f$ violates the network capacity constraint (3) in the original problem $P$, i.e., $x_{f(l)} > r_l$, the total cost it needs to pay can be calculated as follows:

$$\int_{r_l}^{x_{f(l)}} \mu^{\alpha}_l \frac{1}{1 - \hat{x}_{f(l)}}\, d\hat{x}_{f(l)} = \mu^{\alpha}_l (\tilde{x}_{f(l)} - \tilde{r}_l).$$

As can be seen, it is quantitatively equal to the cost calculated in the context of $\widetilde{P}$, which is $\mu^{\alpha}_l (\tilde{x}_{f(l)} - \tilde{r}_l)$.

On the other hand, $\mu^{\beta}$ corresponds to the transformed aggregation constraint in $\widetilde{P}$. When the aggregate transformed rates of $f$'s children flows are larger than $f$'s transformed rate (i.e., $\sum_{f_c \in C(f)} \tilde{x}_{f_c} > \tilde{x}_f$), the total cost paid by them to $f$ for its efforts of aggregating packets is $\mu^{\beta}_f (\sum_{f_c \in C(f)} \tilde{x}_{f_c} - \tilde{x}_f)$.

At optimality, according to the Karush-Kuhn-Tucker conditions, only the prices corresponding to active constraints are positive, which implies the price of an uncongested link is zero.

6.3 The Rate Allocation Problem
The rate allocation problem can be further divided as follows:
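$$D_1(\mu^{\alpha}, \mu^{\beta}) = \sum_{f \in F_S} \max_{\tilde{x}_f \in \widetilde{X}_f} \Phi(\tilde{x}_f) + \sum_{f \in F_A} \max_{\tilde{x}_f \in \widetilde{X}_f} \Psi(\tilde{x}_f)$$

where

$$\Phi(\tilde{x}_f) = U_f(1 - \exp(-\tilde{x}_f)) - (\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}) \tilde{x}_f$$

$$\Psi(\tilde{x}_f) = (-\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f) \tilde{x}_f.$$

In other words, the rate allocation problem can be solved through separately solving the optimization problem of each source flow (i.e., $\max_{\tilde{x}_f \in \widetilde{X}_f} \Phi(\tilde{x}_f)$), and each aggregation flow (i.e., $\max_{\tilde{x}_f \in \widetilde{X}_f} \Psi(\tilde{x}_f)$). Recall that $\mu^{\alpha}$ and $\mu^{\beta}$ are the costs a flow will be charged if it violates the constraints. $\Phi(\tilde{x}_f)$ and $\Psi(\tilde{x}_f)$ actually represent the "net benefit" of a flow.

Let us first study the optimization problem of each source flow $f \in F_S$. As previously discussed, $U_f(1 - \exp(-\tilde{x}_f))$ is strictly concave and twice continuously differentiable. Consequently, $\Phi(\tilde{x}_f)$ is strictly concave and smooth, and thus has a unique maximizer when $\frac{d\Phi(\tilde{x}_f)}{d\tilde{x}_f} = 0$. Thus, given a valid utility function, the optimal solution can be easily identified. For example, assume $U_f(\cdot) = \ln(\cdot)$; it follows that

$$\frac{d\Phi(\tilde{x}_f)}{d\tilde{x}_f} = \frac{\exp(-\tilde{x}_f)}{1 - \exp(-\tilde{x}_f)} - (\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}) = 0,$$

from where the maximizer can be solved as below:

$$\tilde{x}^*_f = -\ln\left(\frac{\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)}}{\mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)} + 1}\right).$$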
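When taking into account the feasible range of $\tilde{x}_f$, which is $\widetilde{X}_f = [-\ln(1 - m_f),\ -\ln(1 - M_f)]$, the optimal value of $\tilde{x}_f$ given $\mu^{\alpha}$ and $\mu^{\beta}$ should be

$$\tilde{x}_f(\mu^{\alpha}, \mu^{\beta}) = \arg\max_{\tilde{x}_f \in \widetilde{X}_f} \Phi(\tilde{x}_f) = \begin{cases} \tilde{x}^*_f & \text{if } -\ln(1 - m_f) \le \tilde{x}^*_f \le -\ln(1 - M_f) \\ -\ln(1 - m_f) & \text{if } \tilde{x}^*_f < -\ln(1 - m_f) \\ -\ln(1 - M_f) & \text{if } \tilde{x}^*_f > -\ln(1 - M_f) \end{cases} \tag{9}$$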
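For the logarithmic utility used in the example above, Eq. (9) reduces to a closed form followed by a clip to the feasible interval. The following is a minimal sketch; the function and parameter names are hypothetical, and the zero-price branch is an added guard (with no price, the net benefit is increasing, so the upper bound is taken).

```python
import math

def source_rate(link_price, parent_price, m_f, M_f):
    """Transformed rate of a source flow for U_f = ln, following Eq. (9).

    link_price   : price of the flow's link (mu^alpha_{l(f)})
    parent_price : aggregation price of the parent flow (mu^beta_{pi(f)})
    m_f, M_f     : bounds on the original rate, 0 < m_f < M_f < 1
    """
    lower = -math.log(1.0 - m_f)
    upper = -math.log(1.0 - M_f)
    total_price = link_price + parent_price
    if total_price <= 0.0:
        return upper  # guard: zero price means no cost, so send at the upper bound
    x_star = -math.log(total_price / (total_price + 1.0))  # unconstrained maximizer
    return min(max(x_star, lower), upper)                  # clip to the feasible range

# Hypothetical prices and bounds.
x_tilde = source_rate(link_price=1.0, parent_price=0.0, m_f=0.01, M_f=0.9)
print(x_tilde, 1.0 - math.exp(-x_tilde))  # transformed rate and the original rate it maps to
```

With a total price of 1, the unconstrained maximizer is $-\ln(1/2)$, which maps back to an original rate of 0.5.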
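On the other hand, for each aggregation flow $f \in F_A$, since $\frac{d\Psi(\tilde{x}_f)}{d\tilde{x}_f} = -\mu^{\alpha}_{l(f)} - \mu^{\beta}_{\pi(f)} + \mu^{\beta}_f$ is a constant, it follows that given the feasible range $\widetilde{X}_f = [0,\ -\ln(1 - M_f)]$, together with $\mu^{\alpha}$ and $\mu^{\beta}$, the optimal value of $\tilde{x}_f$ can be calculated as below:

$$\tilde{x}_f(\mu^{\alpha}, \mu^{\beta}) = \arg\max_{\tilde{x}_f \in \widetilde{X}_f} \Psi(\tilde{x}_f) = \begin{cases} 0 & \text{if } \mu^{\beta}_f < \mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)} \\ -\ln(1 - M_f) & \text{if } \mu^{\beta}_f > \mu^{\alpha}_{l(f)} + \mu^{\beta}_{\pi(f)} \\ \text{any value in } \widetilde{X}_f & \text{otherwise.} \end{cases} \tag{10}$$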
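Because the net benefit of an aggregation flow is linear in its transformed rate, Eq. (10) is a three-way switch on the sign of the slope. A minimal sketch follows; the names are hypothetical, and `tie_value` is an assumed parameter for the tie case, in which Eq. (10) allows any feasible value.

```python
import math

def aggregation_rate(own_price, link_price, parent_price, M_f, tie_value=0.0):
    """Transformed rate of an aggregation flow, following Eq. (10).

    own_price    : aggregation price of this flow (mu^beta_f)
    link_price   : price of the flow's link (mu^alpha_{l(f)})
    parent_price : aggregation price of the parent flow (mu^beta_{pi(f)})
    """
    upper = -math.log(1.0 - M_f)
    slope = own_price - link_price - parent_price  # derivative of the net benefit
    if slope < 0.0:
        return 0.0
    if slope > 0.0:
        return upper
    # Tie: every feasible value is optimal; return the caller's choice, clipped.
    return min(max(tie_value, 0.0), upper)

print(aggregation_rate(0.2, 0.5, 0.1, M_f=0.9))  # negative slope, rate 0
print(aggregation_rate(1.0, 0.5, 0.1, M_f=0.9))  # positive slope, upper bound
```

The tie case is one source of the non-unique primal optimizers that later make the dual function nondifferentiable.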
As previously discussed, strong duality holds in $\widetilde{P}$, and thus there is no duality gap. Thereby, the optimal dual variables (prices) $\mu^{\alpha}$ and $\mu^{\beta}$ exist (Proposition 5.1.4 in [25]), denoted as $\mu^{\alpha*}$ and $\mu^{\beta*}$. If $\mu^{\alpha*} > 0$ and $\mu^{\beta*} > 0$ are dual optimal, then $\tilde{x}_f(\mu^{\alpha*}, \mu^{\beta*})$ is also primal optimal, given that $\tilde{x}_f$ is primal feasible (Proposition 5.1.5 in [25]). In other words, once the optimal prices $\mu^{\alpha*}$ and $\mu^{\beta*}$ are available, the optimal rate $\tilde{x}^*_f$ can be achieved. The role of $\mu^{\alpha}$ and $\mu^{\beta}$ is twofold. First, they serve as the pricing signal for a flow to adjust its rate. Second, they decouple the primal problem, i.e., the global utility optimization, into individual rate optimization of each flow.
6.4 The Scheduling Problem
We now turn to the scheduling problem $D_2(\mu^{\alpha})$. It is actually an NP-hard problem, since it is equivalent to the maximum weighted independent set problem over the conflict graph. The conflict graph depends on the underlying interference model. In this paper, we consider the node-exclusive interference model, i.e., links that share a common node cannot transmit or receive simultaneously. This model has been used in many existing works [7, 8, 9] on network utility maximization. With the node-exclusive interference model, the scheduling problem can be reduced to the maximum weighted matching problem, which is polynomial-time solvable. However, the existing polynomial-time solution [26] requires centralized implementation. In [27], a simple distributed approximate algorithm is presented, which is at most a factor of 2 away from the maximum, and has a linear running time $O(L)$. We utilize this algorithm to solve the scheduling problem $D_2(\mu^{\alpha})$ in a distributed manner.

Actually, the rate control strategy proposed in this paper is a general framework and thus can be extended to other interference models. For any interference model, as long as an appropriate algorithm can be designed to solve the scheduling problem $D_2(\mu^{\alpha})$, it can be integrated with our framework.

Additionally, in some applications of sensor networks, the duty cycle of the sensor nodes further complicates the scheduling problem [28, 29]. We will try to address this challenge in our future work.
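To make the matching view of the scheduling problem concrete, the sketch below uses a centralized greedy heuristic. It is not the distributed algorithm of [27]; it is a swapped-in, well-known greedy rule (repeatedly take the heaviest link whose endpoints are both free), which also stays within a factor of 2 of the maximum weight matching. The link weights are assumed to be the price of each link times its transformed capacity, and the node names and weights are made up.

```python
def greedy_weighted_matching(weighted_links):
    """Pick links in decreasing weight order, skipping links that share a node.

    weighted_links: iterable of (node_u, node_v, weight) tuples.
    Returns the chosen links, which form a matching under the
    node-exclusive interference model.
    """
    chosen = []
    busy_nodes = set()
    for u, v, weight in sorted(weighted_links, key=lambda link: link[2], reverse=True):
        if weight > 0 and u not in busy_nodes and v not in busy_nodes:
            chosen.append((u, v, weight))
            busy_nodes.update((u, v))
    return chosen

# Hypothetical path a-b-c-d with made-up weights.
links = [("a", "b", 2.0), ("b", "c", 3.0), ("c", "d", 2.0)]
schedule = greedy_weighted_matching(links)
print(schedule, sum(w for _, _, w in schedule))
```

On this path the greedy rule selects only the middle link (weight 3), while the best matching takes the two outer links (weight 4), which illustrates the gap an approximate scheduler can leave while remaining within the factor-2 guarantee.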
6.5 Subgradient Algorithm
Now let us see how we can minimize the dual objective function $D(\mu^{\alpha}, \mu^{\beta})$. Gradient-based methods are, in general, attractive approaches to carry out minimizations of this type. Unfortunately, in our case, $D(\mu^{\alpha}, \mu^{\beta})$ is nondifferentiable, and therefore its gradient may not always exist. This is because in general, differentiability of the dual requires a unique primal optimizer, whereas in our case, the optimal values of $\tilde{x}_f$ ($f \in F_A$) can be non-unique. Furthermore, $D_2(\mu^{\alpha})$ is a piecewise linear function and not differentiable. Therefore, we choose to use the subgradient method to solve this problem.

The subgradient algorithm that we propose next is based on the subgradient method developed by N. Z. Shor (Chapter 2 in [30]). In our problem, although the dual gradient does not exist, subgradients do. Based on Proposition 6.1.1 of [25], we adjust $\mu^{\alpha}$ and $\mu^{\beta}$ in the opposite direction to the subgradients:

$$\mu^{\alpha}_l(t+1) = \left[\mu^{\alpha}_l(t) - h(t) \frac{\partial D(\mu^{\alpha}(t), \mu^{\beta}(t))}{\partial \mu^{\alpha}_l}\right]^+ = \Big[\mu^{\alpha}_l(t) + h(t)\big(\tilde{x}_{f(l)}(\mu^{\alpha}(t), \mu^{\beta}(t)) - \tilde{r}_l(\mu^{\alpha}(t))\big)\Big]^+ \tag{11}$$

$$\mu^{\beta}_f(t+1) = \left[\mu^{\beta}_f(t) - h(t) \frac{\partial D(\mu^{\alpha}(t), \mu^{\beta}(t))}{\partial \mu^{\beta}_f}\right]^+ = \Big[\mu^{\beta}_f(t) + h(t)\Big(\sum_{f_c \in C(f)} \tilde{x}_{f_c}(\mu^{\alpha}(t), \mu^{\beta}(t)) - \tilde{x}_f(\mu^{\alpha}(t), \mu^{\beta}(t))\Big)\Big]^+ \tag{12}$$

In the above formulas, $\tilde{x}_f(\mu^{\alpha}, \mu^{\beta})$ and $\tilde{r}_l(\mu^{\alpha})$ are the maximizers of $D_1(\mu^{\alpha}, \mu^{\beta})$ and $D_2(\mu^{\alpha})$, given $\mu^{\alpha}$ and $\mu^{\beta}$; $h(t)$ is a positive scalar stepsize (note that the unit of $t$ is time slot, not subslot); and '+' denotes the projection onto the set $\mathbb{R}^+$ of nonnegative real numbers.

Equation (11) reflects the law of supply and demand. If the demand of a flow $f$ for bandwidth $\tilde{x}_{f(l)}$ exceeds its supply $\tilde{r}_l$, the transformed capacity constraint is violated. Thus, the price $\mu^{\alpha}_l$ is raised. Otherwise, $\mu^{\alpha}_l$ is reduced. Similarly, in (12), if the children flows $f_c \in C(f)$ demand an aggregate rate higher than the rate of its parent flow $f$, the transformed aggregation constraint is violated. Thus, the price $\mu^{\beta}_f$ is raised. Otherwise, $\mu^{\beta}_f$ is reduced.
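The two price updates in Eqs. (11) and (12) are projected additive steps, which can be sketched as follows. The function names and the numeric inputs are hypothetical; all rates and capacities are the transformed quantities.

```python
def update_link_price(price, step, demand, supply):
    """Eq. (11): raise the link price when the transformed rate on the link
    (demand) exceeds the transformed allocated capacity (supply), lower it
    otherwise, and project onto the nonnegative reals."""
    return max(0.0, price + step * (demand - supply))

def update_aggregation_price(price, step, child_rates, parent_rate):
    """Eq. (12): raise the aggregation price when the children's transformed
    rates sum to more than the parent's transformed rate."""
    return max(0.0, price + step * (sum(child_rates) - parent_rate))

# Hypothetical values for one iteration.
print(update_link_price(price=1.0, step=0.1, demand=0.8, supply=0.5))    # price goes up
print(update_link_price(price=0.01, step=1.0, demand=0.1, supply=0.5))   # projected to 0
print(update_aggregation_price(0.5, 0.1, child_rates=[0.3, 0.4], parent_rate=0.5))
```

Each update needs only quantities available at the node and its tree neighbors, which is what allows the updates to run locally.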
6.6 Convergence Analysis
In this subsection, we justify the convergence property of the subgradient algorithm. A subgradient may not be a direction of descent, but makes an angle less than 90 degrees with all descent directions. Using results on the convergence of the subgradient method [25, 30], we show that, for a constant stepsize $h$, the algorithm is guaranteed to converge to within a neighborhood of the optimal value. The reason why we choose a constant stepsize is that it is convenient for distributed implementation. Since the usual convergence criterion is not applicable for a subgradient algorithm (see footnote 5), we are interested in the asymptotical convergence. Similar to [7], we define $\bar{\mu}^{\alpha}(T) := \frac{1}{T} \sum_{t=1}^{T} \mu^{\alpha}(t)$ and $\bar{\mu}^{\beta}(T) := \frac{1}{T} \sum_{t=1}^{T} \mu^{\beta}(t)$ as the average dual variables by time $T$, and let $\bar{x} := \frac{1}{T} \sum_{t=1}^{T} \tilde{x}(t)$ be the average primal variable by time $T$. The following theorems guarantee the statistical convergence of the subgradient method. The proofs are similar to [7], and are omitted due to the limit of space.

Footnote 5: This is because the dual cost usually will not monotonically approach the optimal value, but wander around it under the subgradient algorithm.

Theorem 4. Let $\mu^{\alpha*}$ and $\mu^{\beta*}$ be the optimal dual variables, then, for some $0 < B < \infty$, the following inequality holds

$$\limsup_{T \to \infty} D(\bar{\mu}^{\alpha}, \bar{\mu}^{\beta}) - D(\mu^{\alpha*}, \mu^{\beta*}) \le hB. \tag{13}$$

Theorem 5. Let $\tilde{x}^*$ be the optimal rate of $\widetilde{P}$, then, for some $0 < B < \infty$, the following inequality holds

$$\liminf_{T \to \infty} \widetilde{P}(\bar{x}) \ge \widetilde{P}(\tilde{x}^*) - hB. \tag{14}$$

The above theorems imply that the time-average primal and dual variables obtained by the subgradient algorithm can be made arbitrarily close to the optimal values if we choose the stepsize $h$ sufficiently small.
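The time averages in Theorems 4 and 5 can be illustrated on a deliberately simplified toy case. The sketch assumes a single source flow with logarithmic utility on a single link whose transformed capacity is held fixed, so there is no scheduling and no aggregation price; all names and numbers are hypothetical. For a normalized capacity of 0.5, the optimal price in this toy case is 1 and the optimal transformed rate is $\ln 2$.

```python
import math

def rate_for_price(price, lower, upper):
    """Closed-form rate for U = ln under a single price, clipped to [lower, upper]."""
    if price <= 0.0:
        return upper
    return min(max(-math.log(price / (price + 1.0)), lower), upper)

def run_subgradient(capacity=0.5, step=0.1, iterations=2000, price=0.5):
    """Iterate the price update with a constant stepsize and return time averages."""
    r_tilde = -math.log(1.0 - capacity)          # fixed transformed capacity
    lower = -math.log(1.0 - 0.01)                # assumed rate bounds m = 0.01, M = 0.9
    upper = -math.log(1.0 - 0.9)
    price_sum = 0.0
    rate_sum = 0.0
    for _ in range(iterations):
        x_tilde = rate_for_price(price, lower, upper)
        price_sum += price
        rate_sum += x_tilde
        price = max(0.0, price + step * (x_tilde - r_tilde))  # projected price update
    return price_sum / iterations, rate_sum / iterations

avg_price, avg_rate = run_subgradient()
print(avg_price, avg_rate)
```

This toy dual is smooth, so the iterates settle instead of oscillating; in the full problem the scheduling term makes the dual piecewise linear, and only the averages are guaranteed to approach a neighborhood of the optimum.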
7. DISTRIBUTED IMPLEMENTATION
In this section, we describe how the subgradient algorithm can be implemented in a real network in a distributed and scalable way. In our design, a source (aggregation) node needs to communicate only with its parent and children nodes. In detail, each node collects the transformed rate $\tilde{x}$ from its children, and updates the prices ($\mu^{\alpha}$ and $\mu^{\beta}$) based on Eqn. (11) and Eqn. (12). Then, it broadcasts the updated prices to its children. Upon receiving the price information from its parent, each node calculates its transformed rate based on Eqn. (9) or Eqn. (10). Then, it forwards its updated rate to its parent. Moreover, the nodes solve the scheduling problem through the distributed algorithm discussed in Section 6.4, and decide who will have a chance to transmit in the next slot. Before convergence, each node transmits at a rate $\hat{x} = \min(x, r)$. At each subslot, it must conform to the data availability constraint.

In our algorithm, in each iteration, an independent set is picked as the solution of the scheduling problem. From a long-term perspective, the algorithm jumps among the extreme points (i.e., $r^i$) of the capacity region (recall that each extreme point corresponds to an independent set), and never touches the inner area. As aforementioned, the optimal solution (i.e., $r^*$) is the convex combination of $L$ extreme points (i.e., $r^* = \sum_{i=1}^{L} \alpha_i r^i$), located on a facet of the capacity region's boundary. In reality, each $\alpha_i$ is actually the percentage of iterations in which the algorithm picks the $i$th independent set, after the system converges.
8. DISCUSSIONS
8.1 Validity of Data Availability Constraint
As mentioned in Section 4.3, the data availability constraint is not taken into account when we solve the optimization problem $P$. However, this will not cause any problem as long as the rate of each flow converges to a feasible point. An aggregation node maintains a queue for each of its children, and one more queue for the available data. Suppose the packets in each queue are sorted by their timestamps, as shown in Fig. 1 and Fig. 2. Thus, the height of each queue is determined by the timestamp (i.e., $T_t$) of the packet on the top of this queue. If the aggregation node strictly follows the data availability constraint, the queue of the available data should have the same height as the shortest child queue. Clearly, after the optimal solution which satisfies both the network capacity constraint and the data aggregation constraint is attained, the height of each child queue as well as the queue storing the available data will not grow infinitely.

Furthermore, our solution is suboptimal, and thus does not utilize the network resource to the extreme. Therefore, the scheme proposed in this paper will not overflow any node in the aggregation tree.
8.2 Periodic Data Collection
Some sensing tasks require periodic data collection, namely, the intervals between the timestamps of consecutive packets are fixed. In this case, if we further assume synchronized data collection, i.e., all the sources start their collection at the same time, we can achieve the largest time-overlap of the packets, and thus maximize the rate of each source node. However, in practice, the time-offsets of the packets from different nodes may be time-varying, due to the dynamic join (leave) of sensor nodes, and the oscillation of the rates caused by the variation of the environment as well as the underlying MAC layer scheduling. In this scenario, the proposed algorithm can be considered as a good approximation, since the probabilistic rate model can capture the long-term expectation of time-offsets. For example, in the scenario shown in Fig. 1, nodes A and B both collect data in a periodic pattern, and their rates are $\frac{1}{2}$ and $\frac{1}{4}$. There are two possibilities for the time-offset of the packets from A and B, as shown in Fig. 1(b) and Fig. 1(c). The rates of aggregated packets (i.e., node C's sending rate) in these two cases are $\frac{1}{2}$ and $\frac{3}{4}$, respectively. If either case has the same chance to happen, the expected rate of the aggregation flow is $\frac{1}{2} \times (\frac{1}{2} + \frac{3}{4}) = \frac{5}{8}$. This exactly equals the lower bound of node C's rate derived by the data aggregation constraint ($1 - (1 - \frac{1}{2})(1 - \frac{1}{4}) = \frac{5}{8}$).

Even if the time-offsets can be controlled, however, in this scenario it is extremely difficult to mathematically model the relationship between a parent flow and its children flows in a convex function. Supposing the data collection is all synchronized, what we can do is to provide some tricks which can improve the objective value after the algorithm converges. In detail, we check the source flows sharing the same parent. If their periods are coprime to each other, there is nothing that can be improved, since the data aggregation constraint precisely models the aggregation of the source flows with coprime periods. If the periods of some flows share a greatest common divisor $\alpha$ (let $F_{\alpha}$ be the set of them), we fix their rates as constants in $P$, and use a virtual flow $f_{\alpha}$ to replace them in the data aggregation constraint. $f_{\alpha}$ results from aggregating the flows in $F_{\alpha}$ when they are synchronized, and its rate is the constant $x_{f_{\alpha}} = \frac{1}{\alpha}\big(1 - \prod_{f \in F_{\alpha}} (1 - \alpha x_f)\big)$. Subsequently, we restart the optimization of $P$. Since $x_{f_{\alpha}}$ is lower than the rate derived by the data aggregation constraint, some network resources are saved, and thus the rates of other flows can be improved. As an example, suppose the rates of the flows from nodes A and B shown in Fig. 1 are $\frac{1}{4}$ and $\frac{1}{6}$; we have $x_{f_{\alpha}} = \frac{1}{3}$ according to the above formula. Obviously, $x_{f_{\alpha}}$ is lower than the rate obtained based on Eqn. (5), which is $\frac{3}{8}$. Thus, some bandwidth can be saved from the flow originated at node C and allocated to its neighboring flows. After reoptimization, the rates of these neighboring flows will be improved.
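The arithmetic in the two examples above can be reproduced exactly with rational numbers. This is a small sketch with hypothetical function names; it assumes Eqn. (5) refers to the data aggregation constraint taken at equality, $1 - \prod (1 - x_f)$.

```python
from fractions import Fraction
from math import prod

def independent_aggregate_rate(rates):
    """Rate from the data aggregation constraint at equality: 1 - prod(1 - x_f)."""
    return 1 - prod(1 - x for x in rates)

def synchronized_aggregate_rate(rates, alpha):
    """Rate of the virtual flow for synchronized sources whose periods share
    the greatest common divisor alpha: (1/alpha) * (1 - prod(1 - alpha * x_f))."""
    return (1 - prod(1 - alpha * x for x in rates)) / alpha

# First example: rates 1/2 and 1/4 (coprime case, constraint is exact).
print(independent_aggregate_rate([Fraction(1, 2), Fraction(1, 4)]))   # 5/8

# Second example: rates 1/4 and 1/6, i.e. periods 4 and 6 with gcd 2.
rates = [Fraction(1, 4), Fraction(1, 6)]
print(synchronized_aggregate_rate(rates, alpha=2))                     # 1/3
print(independent_aggregate_rate(rates))                               # 3/8
```

The gap between 3/8 and 1/3 is the bandwidth that the text describes as being released to neighboring flows after reoptimization.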
8.3 Lossy Link
Due to the unreliable nature of wireless communication, packets may be lost during transmission. In our scheme, lost packets do not matter, since from the perspective of the receiver, lost packets look like "nonexistent packets", namely, the source nodes never collect data at those subslots. Furthermore, if the average reception probability of each link can be measured, the formulation of the problem can be easily redefined so as to take it into account. Retransmissions are not needed in our solution.
8.4 Energy Constraint
Energy scarcity is a major challenge for the design of sensor networks. Our approach can also be adapted to address this problem. The solution is to add an energy constraint to the problem formulation. As a result, the energy budget of each node on the aggregation tree will be considered when the algorithm allocates the resource of the network.

Figure 4: An aggregation tree.

Figure 5: Rates of source flows. Panels: (a) Rate of flow 10; (b) Rate of flow 14. Axes: Normalized Time (horizontal), Normalized Rate (vertical).
8.5 Time Synchronization
The problem of synchronization has been considered and addressed by the prior work on data aggregation in sensor networks [1], and we just borrow the existing solutions.
9. PERFORMANCE EVALUATION
In this section, we provide simulation results to complement the analysis in the previous sections. We consider a randomly generated aggregation tree shown in Fig. 4. On this tree, 10 source nodes (shaded nodes) collect and forward data to the sink S, through 7 aggregation nodes. In addition, the number on each node (edge) works as the index of this node (flow). First, we assume that all the links have a normalized capacity of 0.5, and all the source nodes use the same utility function $U(x) = \ln(x)$. Then, we apply our joint rate control and scheduling algorithm with a fixed stepsize $h = 1$ on this aggregation tree, and observe its performance.

Figure 5 shows the evolution of the rates of source flows 10 and 14. The other source flows have similar behavior and thus we omit their results. As one can see, they converge quickly to a neighborhood of the optimal values and oscillate around the optimal values. This oscillating behavior mathematically results from the nondifferentiability of the dual function and physically can be interpreted as due to the scheduling process.
Figure 6: Available data stored in aggregation nodes. Panels: (a) Available data of node 5; (b) Available data of node 7. Axes: Normalized Time (horizontal), Available Data (vertical).
Figure 6 describes the amount of available data stored in nodes 5 and 7. The other nodes have similar behavior and thus we omit their results. In this test, we assume each time slot (length of step) contains 100 subslots. As can be seen, although the two curves both fluctuate over time, they are reasonably bounded. The fluctuation can be ascribed to the underlying scheduling, which prevents an aggregation node from receiving and transmitting packets at the same time.
Figure 7: Delays of the packets delivered by flows. Curves: Delay of Flow 1, Delay of Flow 2, Delay of Flow 5, Delay of Flow 8. Axes: Normalized Time (horizontal), Delay (vertical).
Figure 7 shows the average delays of the packets delivered
by four flows. Here, the delay is defined as the
period from the moment a packet is generated by a source
to the time this packet is delivered by a flow. For example,
suppose node 8 generates a packet at time 1. Based on the delay
values shown in Fig. 7, this packet will arrive at node 5 at time 6,
since the delay of flow 8 is roughly 5. Similarly, it will arrive at
node 2 at time 8, since the delay of flow 5 is 7, and at node 1 at time
10, since the delay of flow 2 is 9. Finally, it will reach the sink at
time 14, since the delay of flow 1 is 13. Note that
this packet may be aggregated during this process. As can be seen
in the figure, the average delay of each flow converges to a stable
point soon after the algorithm starts.
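The arithmetic in this example can be checked with a few lines of code. The sketch below only restates the numbers quoted in the text: each delay is measured from the moment the packet is generated, so every arrival time is the generation time plus the delay of the delivering flow. The names `flow_delay`, `path`, and `arrival_times` are illustrative.

```python
# Approximate delays read off Fig. 7, as quoted in the text.
flow_delay = {8: 5, 5: 7, 2: 9, 1: 13}
# Hop order of the example packet: node 8 -> node 5 -> node 2 -> node 1 -> sink.
path = [(8, "node 5"), (5, "node 2"), (2, "node 1"), (1, "sink")]


def arrival_times(generated_at, path, flow_delay):
    """Return (receiver, arrival time) pairs; delays are measured from generation."""
    return [(receiver, generated_at + flow_delay[flow]) for flow, receiver in path]


arrivals = arrival_times(1, path, flow_delay)
for receiver, t in arrivals:
    print(f"{receiver}: time {t}")
```

For a packet generated at time 1, this gives arrival times 6, 8, 10, and 14 at node 5, node 2, node 1, and the sink, matching the values above.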
[Figure 8: Optimal Solution vs. Approximate Solution. Panels: (a) Optimal Value, showing the Optimal Objective Value of the Global Optimal Solution and the Approximate Solution against Normalized Capacity; (b) Approximation Ratio against Normalized Capacity.]
Finally, Fig. 8 shows the difference between the optimal
objective value and the approximate objective value. In this test, we
tune the capacity of each link and observe its impact on the
objective value. As illustrated in Fig. 8(a), both the optimal
objective value and the approximate objective value increase as
the capacities grow. At first, when the capacity of each link is
as low as 0.1, the difference between the two values is negligible,
since the approximation ratio, defined as
(Optimal − Approximate) / Approximate, is less
than 1%, as shown in Fig. 8(b). As the capacities grow, the
difference and the approximation ratio increase accordingly.
They remain at a low level (less than 10%) until the capacity goes
beyond 0.5. Finally, the approximation ratio reaches around 20%
when the capacity is increased to 0.9.
To explain this trend, let us examine
the function used in all the variable substitutions, which is
f(x) = −ln(1−x). The curvature, i.e., the second-order derivative,
of this function is monotonically increasing in x. Thus, when the
original variable x is small, the value of the new variable y = f(x)
is close to x. For this reason, when we decrease the capacity of
each link, the boundary of the transformed region Π′, on which the
approximate solution (i.e., r̃0 shown in Fig. 3(c)) is located, comes
closer to the boundary of the approximate region Π̃, on which
the optimal solution (i.e., r̃∗ shown in Fig. 3(c)) is located.
Consequently, r̃∗ − r̃0 drops accordingly, resulting in a smaller difference
between the optimal value and the approximate value, by
Theorem 3. Finally, it should be noted that the optimal value shown in
this test is actually the optimal value of the approximate problem
(i.e., P̃∗), which has been proved to be an upper bound of the real
optimal value P∗. This implies that in practice, our approximate
solution should be even closer to the real optimal solution than we
observe in this experiment.
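The behavior of the substitution function can be checked numerically. The sketch below evaluates f(x) = −ln(1−x), its gap to x, and its second derivative 1/(1−x)^2 at three sample points. It illustrates only the function itself; how x relates to the link capacity, and the resulting approximation ratio, follow the argument in the text and are not modeled here. The sample points and helper names are illustrative choices.

```python
import math


def f(x):
    """Substitution function f(x) = -ln(1 - x), defined for 0 <= x < 1."""
    return -math.log(1.0 - x)


def f_second(x):
    """Second derivative of f: d^2/dx^2 [-ln(1 - x)] = 1 / (1 - x)^2."""
    return 1.0 / (1.0 - x) ** 2


points = (0.1, 0.5, 0.9)
gaps = [f(x) - x for x in points]          # how far y = f(x) is from x
curvatures = [f_second(x) for x in points]
for x, gap, curv in zip(points, gaps, curvatures):
    print(f"x={x:.1f}  f(x)-x={gap:.4f}  f''(x)={curv:.2f}")
```

At x = 0.1 the gap between f(x) and x is about 0.005, while at x = 0.9 it exceeds 1.4, which is consistent with the observation that y = f(x) stays close to x only when x is small.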
10. CONCLUSIONS
In this paper, we identify the unique challenges of rate allocation
in the context of data aggregation in wireless sensor networks, and
formulate this problem as a network utility maximization problem.
After transforming this problem into a convex approximate
problem, we decompose it based on duality theory and propose a
distributed algorithm to solve the decoupled problems. Theoretical
analysis and simulation results demonstrate the near-optimal
performance of our scheme.
11. ACKNOWLEDGMENTS
We are grateful to Professor P. R. Kumar for his kind help on
the formulation and solution of the problem. Thanks are also due
to Prof. Angelia Nedich, Prof. Tarek Abdelzaher, I-Hong Hou, and
Yuhan Zhang for providing helpful comments and feedback.
12.
[1] S. M. Michael, M. J. Franklin, J. Hellerstein, and W. Hong,
“Tag: a tiny aggregation service for adhoc sensor networks,”
in OSDI, 2002.
[2] F. Kelly, “Charging and rate control for elastic traffic,”
European Transactions on Telecommunications, vol. 8, pp.
33–37, 1997.
[3] F. Kelly, A. Maulloo, and D. Tan, “Rate control in
communication networks: shadow prices, proportional
fairness and stability,” in Journal of the Operational
Research Society, vol. 49, 1998.
[4] S. H. Low and D. E. Lapsley, “Optimization flow controli:
basic algorithm and convergence,” IEEE/ACM Trans. Netw.,
vol. 7, no. 6, pp. 861–874, 1999.
[5] K. Kar, S. Sarkar, and L. Tassiulas, “Optimization based rate
control for multirate multicast sessions,” in INFOCOM,
2001.
[6] Y. Cui, Y. Xue, and K. Nahrstedt, “Optimal resource
allocation in overlay multicast,” Parallel and Distributed
Systems, IEEE Transactions on, vol. 17, no. 8, pp. 808–823,
Aug. 2006.
[7] L. Chen, S. H. Low, M. Chiang, and J. C. Doyle,
“Crosslayer congestion control, routing and scheduling
design in ad hoc wireless networks,” in INFOCOM, 2006.
[8] X. Lin and N. B. Shroff, “Joint rate control and scheduling in
multihop wireless networks,” in CDC, 2004.
[9] X. Lin, N. Shroff, and R. Srikant, “A tutorial on crosslayer
optimization in wireless networks,” Selected Areas in
REFERENCES
Communications, IEEE Journal on, vol. 24, no. 8, pp.
1452–1463, Aug. 2006.
[10] A. Eryilmaz and R. Srikant, “Joint congestion control,
routing, and mac for stability and fairness in wireless
networks,” Selected Areas in Communications, IEEE Journal
on, vol. 24, no. 8, pp. 1514–1524, 2006.
[11] P.J. Wan, K. Alzoubi, and O. Frieder, “Distributed
construction of connected dominating set in wireless ad hoc
networks,” in INFOCOM, 2002.
[12] Y. Wu, S. Fahmy, and N. B. Shroff, “On the construction of a
maximumlifetime data gathering tree in sensor networks:
Npcompleteness and approximation algorithm,” in
INFOCOM, 2008.
[13] B. Yu, J. Li, and Y. Li, “Distributed data aggregation
scheduling in wireless sensor networks,” in INFOCOM,
2009.
[14] P.J. Wan, S. C.H. Huang, L. Wang, Z. Wan, and X. Jia,
“Minimumlatency aggregation scheduling in multihop
wireless networks,” in MobiHoc, 2009.
[15] Y. T. Hou, Y. Shi, and H. D. Sherali, “Rate allocation and
network lifetime problems for wireless sensor networks,”
IEEE/ACM Trans. Netw., vol. 16, no. 2, pp. 321–334, 2008.
[16] Y. Xue, Y. Cui, and K. Nahrstedt, “Maximizing lifetime for
data aggregation in wireless sensor networks,” MONET,
vol. 10, no. 6, pp. 853–864, 2005.
[17] B. Przydatek, D. X. Song, and A. Perrig, “Sia: secure
information aggregation in sensor networks,” in SenSys,
2003.
[18] Y. Yang, X. Wang, S. Zhu, and G. Cao, “Sdap: a secure
hopbyhop data aggregation protocol for sensor networks,”
in MobiHoc, 2006.
[19] A. Giridhar and P. R. Kumar, “Toward a theory of innetwork
computation in wireless sensor networks,” IEEE
Communications Magazine, vol. 44, pp. 98–107, 2006.
[20] D. Marco, E. DuarteMelo, M. Liu, and D. Neuhoff, “On the
manytoone transport capacity of a dense wireless sensor
network and the compressibility of its data,” in IPSN, 2003.
[21] S. Chen and Z. Zhang, “Localized algorithm for aggregate
fairness in wireless sensor networks,” in MOBICOM, 2006.
[22] C. T. Ee and R. Bajcsy, “Congestion control and fairness for
manytoone routing in sensor networks,” in SenSys, 2004.
[23] K. Jain, J. Padhye, V. N. Padmanabhan, and L. Qiu, “Impact
of interference on multihop wireless network performance,”
in MOBICOM, 2003.
[24] S. Boyd and L. Vandenberghe, Convex Optimization.
Cambridge University Press, 2004.
[25] D. P. Bertsekas, Nonlinear Programming.
Scientific, 1995.
[26] C. H. Papadimitriou and K. Steiglitz, Combinatorial
Optimization: Algorithms and Complexity.
Publications, 1998.
[27] J.H. Hoepman, “Simple distributed weighted matchings,”
CoRR, 2004.
[28] L. Su, C. Liu, H. Song, and G. Cao, “Routing in
intermittently connected sensor networks,” in ICNP, 2008.
[29] L. Su, B. Ding, Y. Yang, T. F. Abdelzaher, G. Cao, and J. C.
Hou, “ocast: Optimal multicast routing protocol for wireless
sensor networks,” in ICNP, 2009.
[30] N. Z. Shor, K. C. Kiwiel, and A. Ruszcay` nski, Minimization
methods for nondifferentiable functions.
USA: SpringerVerlag New York, Inc., 1985.
Athena
Dover
New York, NY,