Delay Performance of a Publish Subscribe system deployed over a Memory-Constrained, Delay Tolerant Network
Lorenzo Bracciale∗, Donato Battaglino∗, Andrea Detti∗, Giuseppe Bianchi∗, Nicola Blefari-Melazzi∗, Andrea Bragagnini†, Maura Santina Turolla†
∗DIE, Università di Roma “Tor Vergata”, Rome, Italy
{lorenzo.bracciale,donato.battaglino,andrea.detti,giuseppe.bianchi,blefari}@uniroma2.it
†Telecom Italia, Turin, Italy
{andrea.bragagnini,maurasantina.turolla}@telecomitalia.it
Abstract: In this work we assess the delay performance of a Publish Subscribe system built on top of a Delay Tolerant Network (DTN) composed of nodes with limited storage capacities. Many DTN routing protocols replicate the same data over several nodes, in order to deliver data to the destination faster or more reliably. Of course, increasing the number of replicas decreases the delivery delay perceived by the users, but increases the use of the system memory.
Our goal is to investigate the tradeoff between reduction of delay and storage requirements when nodes are memory constrained, in a Topic-based, Publish Subscribe system where we have different topics with different popularity.
We provide some insights into this tradeoff, which involves some unanticipated issues, and propose simple rules to dimension the number of replicas per topic. To this end, we derive analytical models and we validate them with simulations.¹
I. INTRODUCTION
Publish-Subscribe is a very popular communication paradigm, mainly because of the loose coupling it introduces between actors (publishers/subscribers), which suits asynchronous and many-to-many communications well. One of the most popular forms of publish-subscribe is the so-called topic-based one [1]. In a topic-based publish-subscribe system, users are interested only in a certain set of events, named topics and represented through keywords (e.g. the topic name). Once subscribers have subscribed to a specific topic, they receive the data-samples emitted by publishers on that topic. In other words, topics are logical channels that connect publishers to subscribers.
The publish-subscribe paradigm is a very convenient service model in Delay Tolerant Networks (DTNs). These are networks characterized by intermittent connectivity and/or long propagation delays between the nodes of the network. Typical application scenarios are deep-space, underwater, sensor and tactical networks, where connectivity cannot always be guaranteed. Moreover, DTNs satisfy the requirements of user communities that do not require real-time services, e.g. a messaging system on a university campus.
¹ This work has been carried out in cooperation with, and with funding from, Telecom Italia.
DTN routing/forwarding schemes employ the so-called store-carry-forward paradigm: when connectivity towards a neighbor node is not available, or is deemed not convenient by the forwarding scheme employed, a (relay) node temporarily buffers messages. These are forwarded at a later time, when the nodes' movement makes a link towards the destination available. Clearly, the end-to-end delay depends on the mobility model of the nodes (speed, distribution). To improve delay performance and delivery probability, the typical strategy consists in creating multiple copies of the same data-sample over several nodes. Of course, increasing the number of replicas decreases the delivery delay perceived by the users (subscribers), but increases the use of the system memory.
In this paper we present an optimization of a topic-based publish-subscribe system that operates on top of a DTN which uses a multiple-copy routing approach. The topics may have different popularity, i.e. a different number of subscribers. The DTN is formed by nodes that have a limited memory to store and carry the published data-samples. This case is of interest for DTNs realized with embedded systems such as [2] or, more generally, when the size of the data-samples is comparable with the storage space available in the nodes. The optimization goal is to minimize the mean delivery delay perceived by the subscribers.
The simultaneous presence of memory constraints and topics with different popularity makes the optimization problem a challenging one. Indeed, as shown later, greedily exploiting all the memory resources available in the network nodes is counterproductive with respect to the system performance. To the best of our knowledge, this issue has not previously been addressed in the literature. Accordingly, we provide convenient formulae that allow the number of replicas per topic to be properly dimensioned. We further address the case of topics with different popularity and we prove that the optimal number of replicas is proportional to the square root of the topic popularity. We verify the effectiveness of our analytical modeling by means of simulations. To simplify the optimization we preliminarily focus our attention on a very simple DTN, employing the spray-and-wait [3] routing scheme, and on random mobility assumptions. However, we believe that our results may also be applicable to other multiple-copy routing schemes where the number of replicas per data-sample is an input parameter.
2011 The 10th IFIP Annual Mediterranean Ad Hoc Networking Workshop
978-1-4577-0899-2/11/$26.00 ©2011 IEEE
II. PROBLEM DESCRIPTION
In this section, we introduce the considered scenario and
assumptions, we state the delay optimization problem, and
we illustrate at a glance the tradeoffs that emerge in the
optimization.
A. Scenario
We consider a wireless network composed of N mobile nodes that move within a service area according to a random mobility model. Each mobile node may host publishing and/or subscribing entities for topics; there are T topics and the i-th topic has $S_i$ subscribing entities. Throughout the paper we refer to a publishing (subscribing) entity with the term publisher (subscriber). Moreover, we refer to the parameter $S_i$ as the topic popularity.
If a publisher and a subscriber for the same topic are within radio coverage range of each other, they can directly exchange data-samples. When this is not the case, we assume that data-samples are forwarded throughout the DTN using the “spray and wait” approach. Specifically, the data-samples for the different topics are replicated into the memories of nodes by using the so-called binary spray [3]. Binary spray spreads the replicas of a data-sample for the i-th topic as follows: the source generates $R_i$ replicas (including the original data-sample); any node A with x > 1 replicas² of the data-sample that meets a node B with no replica forwards ⌊x/2⌋ of its copies to B and keeps the remaining copies.
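The binary spray rule can be illustrated with a minimal Python sketch. The function names `binary_spray_handover` and `spray_until_done` are hypothetical, and the sketch assumes that every holder of more than one copy always finds a node without a replica; it does not model mobility or any memory limit on the receiving node.

```python
def binary_spray_handover(holder_copies, peer_copies):
    """Return (holder_copies, peer_copies) after one contact under binary spray.

    A holder with x > 1 copies that meets a peer with no copy hands over
    floor(x / 2) copies and keeps the rest; otherwise nothing changes.
    """
    if holder_copies > 1 and peer_copies == 0:
        given = holder_copies // 2
        return holder_copies - given, given
    return holder_copies, peer_copies


def spray_until_done(initial_replicas):
    """Idealized spray phase: in each round, every holder with more than one
    copy meets a node without a replica (an assumption of this sketch)."""
    holders = [initial_replicas]
    while any(copies > 1 for copies in holders):
        next_round = []
        for copies in holders:
            kept, given = binary_spray_handover(copies, 0)
            next_round.append(kept)
            if given:
                next_round.append(given)
        holders = next_round
    return holders


if __name__ == "__main__":
    # With 5 initial replicas the spray ends with 5 nodes holding one copy each.
    print(spray_until_done(5))
```

The total number of copies is conserved at every handover, so the spray phase ends with exactly as many single-copy holders as the number of replicas generated by the source.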
Unlike [3], we focus on the case of nodes whose memory made available for the DTN forwarding process is bounded. Specifically, we assume that each node may store at most $C_n$ data-samples. This imposes the additional condition that a data-sample is replicated only when the neighbor node has at least one memory slot available. The introduction of a memory bound in a topic-based system (where data-samples are persistently generated) implies that a suitable mechanism must be introduced to control data obsolescence. Viable approaches consist in either defining data-sample lifetimes, or automatic event-based deletion mechanisms, for instance when a node “sees” a newer data-sample for the same topic in a neighbor node [2]. In what follows, for simplicity, we idealize such a data obsolescence mechanism by assuming that all the data-samples for a given topic are removed from memory when a new data-sample is published for that same topic.
Finally, we assume that the time needed to exchange a data-sample among neighbor nodes is negligible, and that a contact among nodes lasts long enough to exchange all the data-samples that are required by the DTN forwarding rules.
² Note that when a node, because of the spray operation, holds x > 1 replicas of a data-sample, the additional x − 1 copies do not count towards the memory occupation of the node, i.e. only one copy of the data-sample is actually stored in the node memory, together with a counter that keeps track of the number of local replicas.
B. Problem Statement and definitions
We recall that the main characterizing feature of the spray and wait protocol is the number of replicas $R_i$ that are disseminated during the spray phase for each data-sample published to the i-th topic. In this work, we aim at globally optimizing this number of per-topic replicas $R = [R_1 \cdots R_T]$, so as to minimize the average time D between the publication of a data-sample and its reception by subscribers. We call this delay the mean delivery delay. Obviously, this delay depends on the different topic popularities $S_i$. It is readily expressed as the weighted sum of the mean delays $D_i$ perceived by subscribers of topic i, where the weights are the ratio between the number of subscribers for topic i and the total number of subscribers in the system³. In formula:
$$D = \sum_{i=1}^{T} \frac{S_i}{\sum_{j=1}^{T} S_j} \cdot D_i \tag{1}$$
It is convenient to express the number of replicas $R_i$ in terms of two parameters which we refer to as memory utilization (ρ) and sharing factor ($\phi_i$). We define the memory utilization ρ as the ratio between the network-wide memory space used to store data-samples, $C_u = \sum_i R_i$, and the overall available memory $C_{tot} = N C_n$:
$$\rho = \frac{C_u}{C_{tot}} = \frac{\sum_{i=1}^{T} R_i}{N C_n} \tag{2}$$
Moreover, for each topic i we define the sharing factor $\phi_i$ as the fraction of the used memory space $C_u$ assigned to topic i for its replicas:
$$\phi_i = \frac{R_i}{C_u} = \frac{R_i}{\sum_{j=1}^{T} R_j} \tag{3}$$
The number of replicas deployed for a given topic i is readily expressed in terms of these two parameters as:
$$R_i = \rho \cdot \phi_i \cdot C_{tot} = \rho \cdot \phi_i \cdot N C_n \tag{4}$$
The problem thus consists in determining the optimal set of parameters ρ and {$\phi_i$} so that the delay D given in equation (1) is minimized, under the condition that no node is able to locally store more than $C_n$ distinct replicas.
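As an illustration of the parametrization in equations (1) to (4), the following Python sketch converts between per-topic replica counts and the pair (ρ, {φ_i}), and evaluates the popularity-weighted mean delay. The function names and the numeric values in the usage lines are hypothetical and chosen only for the example.

```python
def memory_utilization(replicas, n_nodes, slots_per_node):
    """Equation (2): rho = sum_i R_i / (N * C_n)."""
    return sum(replicas) / (n_nodes * slots_per_node)


def sharing_factors(replicas):
    """Equation (3): phi_i = R_i / sum_j R_j."""
    used = sum(replicas)
    return [r / used for r in replicas]


def replicas_from_parameters(rho, phi, n_nodes, slots_per_node):
    """Equation (4): R_i = rho * phi_i * N * C_n."""
    return [rho * p * n_nodes * slots_per_node for p in phi]


def mean_delivery_delay(subscribers, per_topic_delay):
    """Equation (1): per-topic delays weighted by the share of subscribers."""
    total = sum(subscribers)
    return sum(s / total * d for s, d in zip(subscribers, per_topic_delay))


if __name__ == "__main__":
    replicas = [6, 3, 1]                      # illustrative R_i values
    rho = memory_utilization(replicas, n_nodes=10, slots_per_node=2)
    phi = sharing_factors(replicas)
    print(rho, phi)
    print(replicas_from_parameters(rho, phi, 10, 2))
    print(mean_delivery_delay([3, 1], [10.0, 30.0]))
```

The round trip from replica counts to (ρ, φ) and back reproduces the original counts, which is a quick consistency check on equations (2) to (4).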
C. Discussion
At first glance, the need to account for the memory utilization ρ may not be evident. Indeed, the per-topic average delay $D_i$ clearly depends on the number of deployed replicas $R_i$, and without memory constraints the delay performance would improve with a larger number of replicas. It might therefore naively appear that the sum of the replicas should be set to the maximum value allowed by the global memory constraints, i.e., ρ = 1. However, a closer look reveals that this is not the case. Indeed, the optimal choice derives from trade-offs that involve both the memory utilization and the sharing of the memory space:
• memory utilization - as we increase the overall number of replicas in the system, the probability that the subscriber meets a node holding one of these replicas increases, which tends to reduce the delivery delay. Conversely, as we increase the overall number of replicas in the system, the probability of finding a node with a free memory slot decreases. Consequently, the spray phase lasts longer: the time required to spread a given number of replicas in the system increases, which tends to increase the delivery delay.
• memory sharing - consider a situation where each topic has the same number of replicas, so that all topics share the system memory fairly. As we increase the number of replicas for a given topic i, we have to decrease the number of replicas for another topic j, so as to keep the memory occupancy constant. In doing so, the delivery delay of topic i decreases and the delivery delay of topic j increases. This delay unfairness can reduce the overall mean delivery delay (D) when the number of subscribers $S_i$ of topic i is greater than $S_j$. However, excessive unfairness can impose such a high delay penalty on subscribers of topic j that it cancels out the benefit of the delay reduction obtained by subscribers of topic i.
³ We note that every node may be a subscriber for one or more topics, so the number of subscriptions may be greater than the number of nodes. Throughout this paper we consider one subscriber per subscription, so the number of subscribers may also be greater than the number of nodes.
III. PROBLEM RESOLUTION
In the general case of different topic popularity, a delay optimization jointly targeting the parameters ρ and {$\phi_i$} appears numerically cumbersome, and this could hinder the understanding of the problem and the derivation of practical guidelines. Therefore, we resort to a sub-optimal sequential approach which consists of two subsequent steps:
1) memory utilization optimization: first, we derive the memory utilization ρ which minimizes the delay under the assumption that all the topics have the same popularity and, hence, $\phi_i = 1/T$;
2) sharing factor optimization: second, we derive the sharing factors $\phi_i$, by assuming that the used memory is the one resulting from the previous memory utilization optimization.
To simplify the analytical computation we consider exponentially distributed inter-meeting times between any pair of nodes. This implies that each node meets another (uniformly distributed) random node every 1/λ seconds on average.
A. Memory Utilization Optimization
Under the assumption that all the topics have the same popularity and deploy the same number of per-topic replicas $R = R_i$, the subscribers of the different topics experience the same delay, which is equal to the mean delivery delay, i.e. $D = D_i$. It therefore suffices to study the delivery delay for a single topic, by determining an analytic formula (eq. 8) which expresses the delivery delay as a function of the memory utilization, and to numerically find the value of ρ that yields the minimum delay.
Fig. 1. Markov chain representing the number of replicas in the system for a given data-sample.
To make the computation of D analytically tractable we resort to the following assumptions:
• we approximate the binary spray mechanism as an epidemic diffusion of data-sample replicas that stops when the target number of replicas $R_i$ is reached.
• we assume only one spray at a time; in other words, during the spray phase of the considered topic, the amount of memory used for data-sample replicas of other topics does not change. This is a good approximation when data-samples are emitted infrequently, whereas it should be considered as a worst case if data-sample production becomes frequent.
We model the evolution of spray and wait using the continuous-time absorbing Markov chain represented in figure 1. The state variable of the Markov chain is the number of deployed replicas of the considered topic. When a data-sample is first produced, the chain starts in state 1, meaning that only the producer has a copy of the data-sample. Whenever the producer “meets” another node and replicates the data-sample to it, the chain moves to state 2, meaning that two nodes in the network carry a data-sample replica. When the chain reaches state R, meaning that the data-sample is copied over R nodes, the so-approximated spray phase stops.
In addition, a special state called the “absorption” state models the reception of the data-sample by a selected subscriber for that topic, hereafter referred to as the target subscriber. As such, the average data-sample delivery delay can be simply computed by determining the absorption time in the Markov chain. For ease of notation, we call n the number of nodes in the network excluding the target subscriber (i.e. n = N − 1).
Each state i ∈ (1 : R − 1) in the chain has two outgoing transition rates: one towards the state i + 1, and the other towards the absorption state. These transition rates are derived and justified as follows:
• transition rate {i → i + 1}. When the chain is in state i, a transition to state i + 1 occurs whenever one of the i replica nodes encounters any of the remaining nodes except the target subscriber. Since the rate of encounters for each node is λ, the aggregate rate of encounters for the i nodes storing a data-sample replica is λ·i. Among these encounters, the data-sample is further replicated only when i) the encountered node is not the target subscriber and does not already hold the replica, and ii) the encountered node has at least one memory slot available to accommodate the replica. Upon an actual encounter, condition (i) occurs with probability (n−i)/n, whereas, to model condition (ii), we define $f_R$ to be the probability that the encountered node has at least one memory slot available. $f_R$ is derived in the subsequent expression (7). We thus conclude that the transition rate {i → i + 1} is expressed by the product $i \lambda f_R (n - i)/n$.
• transition rate {i → absorption}. Transition to the absorption state occurs when the target subscriber meets any of the i nodes holding the data-sample. In formula, this is expressed by $i\lambda/n$.
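The two transition rates above are enough to simulate the chain directly. The following Python sketch is a plain Monte Carlo estimate of the mean absorption time and is not the authors' simulator; the function name `simulate_delivery_delay` is hypothetical, and the free-slot probability is passed in as a given number `f_r`.

```python
import random


def simulate_delivery_delay(R, n, lam, f_r, runs=20000, seed=0):
    """Estimate the mean time to absorption of the spray-and-wait chain.

    In state i < R the chain moves to i + 1 at rate i*lam*f_r*(n - i)/n and is
    absorbed at rate i*lam/n; in state R only absorption is possible.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        i, elapsed = 1, 0.0
        while True:
            spray_rate = i * lam * f_r * (n - i) / n if i < R else 0.0
            absorb_rate = i * lam / n
            # Sojourn time in state i is exponential with the total outgoing rate.
            elapsed += rng.expovariate(spray_rate + absorb_rate)
            if rng.random() < absorb_rate / (spray_rate + absorb_rate):
                break  # the target subscriber received the data-sample
            i += 1
        total += elapsed
    return total / runs


if __name__ == "__main__":
    # With a single replica (R = 1) the estimate should be close to n / lam.
    print(simulate_delivery_delay(R=1, n=10, lam=1.0, f_r=0.5))
```

Such an estimate can serve as a sanity check for any closed-form or recursive evaluation of the same chain.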
Using this chain, the mean delivery delay D of the considered topic can be calculated as the mean time to go from state 1 to the absorption state (state 0). We denote by $\Delta_i$ the mean time the system remains in state i and by $\delta_{i,j}$ the average time needed to move from state i to state j.
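According to Markov chain theory, the state equations can be written as:
$$\begin{aligned}
\delta_{1,0} &= \Delta_1 + \frac{f_R (n-1)}{1 + f_R (n-1)}\,\delta_{2,0} \\
\delta_{2,0} &= \Delta_2 + \frac{f_R (n-2)}{1 + f_R (n-2)}\,\delta_{3,0} \\
\delta_{3,0} &= \Delta_3 + \frac{f_R (n-3)}{1 + f_R (n-3)}\,\delta_{4,0} \\
&\;\;\vdots \\
\delta_{R,0} &= \Delta_R
\end{aligned} \tag{5}$$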
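The mean time $\Delta_i$ the system remains in state i is given by the inverse of the sum of the outbound transition rates from that state:
$$\begin{aligned}
\Delta_1 &= \frac{n}{\lambda\,[1 + f_R (n-1)]} \\
\Delta_2 &= \frac{n}{2\lambda\,[1 + f_R (n-2)]} \\
\Delta_3 &= \frac{n}{3\lambda\,[1 + f_R (n-3)]} \\
&\;\;\vdots \\
\Delta_R &= \frac{n}{R\lambda}
\end{aligned} \tag{6}$$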
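Equations (5) and (6) can be evaluated by backward substitution, starting from state R and working down to state 1. The sketch below does this in Python; the function name `delivery_delay_recursive` is hypothetical and the free-slot probability is again supplied as a plain number `f_r`.

```python
def delivery_delay_recursive(R, n, lam, f_r):
    """Mean time from state 1 to absorption, from equations (5) and (6)."""
    delta = n / (R * lam)                    # delta_{R,0} = Delta_R
    for i in range(R - 1, 0, -1):
        out = 1.0 + f_r * (n - i)            # total outgoing rate, in units of i*lam/n
        sojourn = n / (i * lam * out)        # Delta_i from equation (6)
        p_next = f_r * (n - i) / out         # probability of moving to state i + 1
        delta = sojourn + p_next * delta     # one line of equation (5)
    return delta


if __name__ == "__main__":
    # Hand-checkable case: R = 2, n = 3, lam = 1, f_r = 1 gives
    # Delta_1 = 1, transition probability 2/3 and Delta_2 = 1.5, i.e. 2.0.
    print(delivery_delay_recursive(R=2, n=3, lam=1.0, f_r=1.0))
```

When `f_r` is 0 no replication ever happens, so the result reduces to n/λ whatever the value of R, which matches the intuition that a saturated network behaves like a single-copy scheme.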
To conclude the analysis, it remains to derive the probability $f_R$ used in the chain. This is defined as the probability that an encountered node has a free memory slot to accommodate the replica. We remark that $f_R$ does not change with the chain evolution⁴, and can be computed by quantifying the probability that a node has at least one free memory slot before the start of the spray phase, i.e. when there are only R(T − 1) replicas in the system, generated by the remaining topics. Specifically, note that the replicas generated for the considered topic do not affect the probability $f_R$. Indeed, nodes which already hold a replica for the considered topic are not accounted for in the term $f_R$, but are accounted for in the complementary term (n − i)/n within the state transition rate. It follows that $f_R$ can be well approximated by:
$$f_R = 1 - \left(\frac{n}{n+1}\right)^{C_{tot} - R(T-1)} \tag{7}$$
⁴ We recall that we model the process of data-sample distribution for a given topic, while assuming that every remaining topic has already “sprayed” its data-sample and thus reached steady state. A joint modeling of all topics would yield an exponential increase in the state space of the chain.
This formula is obtained by considering that before the start of the spray there are $C_{tot} - R(T-1)$ memory slots available for the considered topic, distributed over the node memories. Considering free memory slots as “tokens” randomly assigned to nodes, the probability that a node has at least one free memory slot is the complement of the probability that the node has not been assigned any free slot.
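The token argument translates directly into code. The following Python sketch evaluates equation (7); the function name `free_slot_probability` and the numbers in the usage lines (N = 100 nodes, 10 slots per node, T = 100 topics) are illustrative assumptions.

```python
def free_slot_probability(R, T, n, c_tot):
    """Equation (7): probability that an encountered node has a free slot.

    The c_tot - R*(T - 1) free slots are treated as tokens, each assigned
    independently and uniformly to one of the n + 1 nodes.
    """
    free_slots = c_tot - R * (T - 1)
    return 1.0 - (n / (n + 1)) ** free_slots


if __name__ == "__main__":
    n, c_tot, T = 99, 1000, 100
    for R in (5, 8, 9, 10):
        # The probability drops sharply as the other topics fill the memory.
        print(R, round(free_slot_probability(R, T, n, c_tot), 4))
```

The steep drop of this probability as R approaches the maximum value allowed by the memory is what lengthens the spray phase when the network is close to saturation.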
Recursively solving equations (5) using the values (6), we obtain the following expression for the mean delivery delay $D = \delta_{1,0}$ as a function of the parameter R:
$$D = \sum_{k=1}^{R-1} \frac{1}{\lambda f_R k} \prod_{w=1}^{k} \frac{f_R\,(n-w+1)}{1 + f_R\,(n-w)} + \frac{n}{\lambda R} \prod_{w=1}^{R-1} \frac{f_R\,(n-w)}{1 + f_R\,(n-w)} \tag{8}$$
which can be rewritten in terms of the memory utilization ρ (and numerically minimized with respect to this single variable) by simply substituting:
$$R = \rho \cdot \frac{C_{tot}}{T} = \rho \cdot \frac{N \cdot C_n}{T}$$
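The numerical minimization can be sketched in Python as a search over the integer values of R allowed by the memory, which is an assumption of this sketch (the text minimizes over ρ). Equation (8) is evaluated with the common factor $f_R$ cancelled in each term of the first sum, which is algebraically equivalent and avoids a division by zero when $f_R = 0$. The function names are hypothetical.

```python
def free_slot_probability(R, T, n, c_tot):
    """Equation (7)."""
    return 1.0 - (n / (n + 1)) ** (c_tot - R * (T - 1))


def delay_eq8(R, T, n, lam, c_tot):
    """Equation (8), with f_R cancelled inside the first sum."""
    f = free_slot_probability(R, T, n, c_tot)
    total = 0.0
    prefix = 1.0  # product over w = 1..k-1 of f(n-w) / (1 + f(n-w))
    for k in range(1, R):
        out = 1.0 + f * (n - k)
        total += prefix * n / (k * lam * out)
        prefix *= f * (n - k) / out
    return total + prefix * n / (R * lam)


def best_replicas(T, n_nodes, slots_per_node, lam):
    """Search integer R in 1..C_tot/T; return (R, rho). Assumes T <= C_tot."""
    c_tot = n_nodes * slots_per_node
    n = n_nodes - 1
    candidates = range(1, c_tot // T + 1)
    best = min(candidates, key=lambda r: delay_eq8(r, T, n, lam, c_tot))
    return best, best * T / c_tot


if __name__ == "__main__":
    # Illustrative values: 100 nodes, 10 slots each, 100 equally popular topics.
    print(best_replicas(T=100, n_nodes=100, slots_per_node=10, lam=1 / 14))
```

A grid search is sufficient here because the number of admissible values of R is small; for larger memories a finer treatment of ρ as a continuous variable may be preferable.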
B. Sharing Factor Optimization
In this section we derive the optimal sharing factors $\phi_i$, under the assumption that the number of memory slots used in the network is fixed to the value $C_u = \rho C_{tot}$, where ρ is the memory utilization value which minimizes expression (8). Although (8) provides a delay expression which accounts for both the spray and the wait phases, to achieve a closed-form solution we approximate the delay by neglecting the spray phase⁵, similarly to [3]. In this case, as shown in [3] and as a corollary of the previous analysis (i.e. considering $D \approx \delta_{R,0} = \Delta_R$), the mean delivery delay for topic i reduces to the very simple expression:
$$D_i = \frac{n}{\lambda R_i} = \frac{n}{\lambda C_u \phi_i} \tag{9}$$
⁵ Note that a more precise analysis, though necessarily carried out numerically, may be developed by leveraging the results derived in the previous section instead of using the approximation (9). In fact, it is easy to recognize that equation (8) is readily extended to the general case of different per-topic delays $D_i$ versus the corresponding $\phi_i$, as it depends only upon the $\phi_i$ value for the considered topic. Such dependence is indeed confined to the exponent in equation (7), where we note that $C_{tot} - R(T-1)$, in the general case of different popularity, can be rewritten as $C_{tot} - C_u + C_u \phi_i$.
Consequently, to compute the sharing factors $\phi_i$ we need to solve the following constrained optimization:
$$\min_{\phi} \; \sum_{i=1}^{T} \frac{n S_i}{\lambda C_u \phi_i \sum_{j} S_j} \tag{10}$$
$$\text{s.t.} \quad \sum_{i} \phi_i = 1 \tag{11}$$
A closed-form solution can be obtained by means of a Lagrange approach, which yields the following simple result:
$$\phi_i = \frac{\sqrt{S_i}}{\sqrt{S_1} + \dots + \sqrt{S_T}} \tag{12}$$
This result states that, when the number of nodes is relatively large compared to the number of deployed replicas, i.e. when the spray time is small with respect to the wait time and hence approximation (9) is tight, the optimal number of deployed replicas per topic is proportional to the square root of the topic popularity, namely the number of topic subscribers.
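The Lagrange step behind result (12) can be spelled out as follows. This is a sketch under approximation (9); the constant $a$ and the multiplier $\mu$ are symbols introduced here only for the derivation.

```latex
% Objective (10) with the constant a = n / (\lambda C_u \sum_j S_j) factored out,
% plus the constraint (11) with multiplier \mu:
\mathcal{L}(\phi, \mu) = \sum_{i=1}^{T} \frac{a\, S_i}{\phi_i}
                         + \mu \left( \sum_{i=1}^{T} \phi_i - 1 \right)

% Stationarity with respect to each \phi_i:
\frac{\partial \mathcal{L}}{\partial \phi_i} = -\frac{a\, S_i}{\phi_i^{2}} + \mu = 0
\quad\Longrightarrow\quad
\phi_i = \sqrt{\frac{a\, S_i}{\mu}} \;\propto\; \sqrt{S_i}

% Imposing \sum_i \phi_i = 1 fixes the proportionality constant:
\sqrt{\frac{a}{\mu}} = \frac{1}{\sum_{j=1}^{T} \sqrt{S_j}}
\quad\Longrightarrow\quad
\phi_i = \frac{\sqrt{S_i}}{\sum_{j=1}^{T} \sqrt{S_j}}
```

Each term $a S_i / \phi_i$ is convex for $\phi_i > 0$, so this stationary point is the minimum of the constrained problem.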
IV. PERFORMANCE EVALUATION
To validate the effectiveness of the proposed optimization and the modeling results, we developed an ad-hoc event-driven simulator. The simulator has been written from scratch and its code is available at [4]. As a reference scenario, we consider N = 100 nodes that move in an area of 500 × 500 m². The radio coverage range is set to 50 m and the simulation duration is 500000 seconds. Every node has enough memory to store up to 10 different data-samples. We used movement traces in which the mean inter-meeting time between a specific pair of nodes is 1400 s (i.e. N/λ = 1400 s, the mean time between two successive meetings of a node with any other node hence being 1/λ = 14 s).
In the simulator, data-samples are generated as follows. For each topic, a first data-sample is generated by a randomly chosen publisher node at the start of the simulation (initial random offsets are used to avoid a synchronized start). Then, every topic proceeds in parallel by independently generating subsequent data-samples. Each new data-sample is generated for a topic only when all the subscribers have received the previous data-sample. Specifically, after a supplementary time gap of 300 s at the end of the previous data-sample delivery phase, a new random publisher is selected for the considered topic.
We carried out three kinds of performance evaluation, aimed at providing insight into the following aspects:
• effectiveness of the overall optimization (memory utilization + sharing factor) with respect to a non-optimized system;
• performance tradeoffs related to the memory utilization;
• performance tradeoffs related to the sharing factor.
A. Optimization Effectiveness
We consider nodes moving according to the Random Way Point (RWP) model [5] with a constant speed of 1 m/s and a constant pause time of 5 s, so as to be sure to have a stationary process [6]. We considered 5 scenarios with different numbers of topics, i.e. T = 30, 50, 70, 90, 100 topics. The topics have different popularities; we group the topics in 10 different groups of equal size and set the popularity of each group according to a Zipf distribution [7], with parameter α = 0.8. The topics of the same group have the same popularity. The number of subscribers per topic is reported in figure 2 as a function of the topic group (for instance, a topic of the first group has 27 subscribers).
Fig. 2. Number of subscribers per topic (x-axis: topic group, 1 to 10; y-axis: number of subscribers; series: Zipf distribution, Uniform distribution).
Fig. 3. Memory utilization configuration versus the number of topics and sharing factors in case of 30 topics (upper plot: memory utilization ρ versus number of topics; lower plot: sharing factor φ versus topic group).
The subscriptions are uniformly assigned to nodes; nodes without any subscription are used only for store-carry-forward DTN operations.
The upper plot of figure 3 shows the values of the memory utilization ρ resulting from the optimization versus the number of topics. We observe that, varying the number of topics, the optimal memory utilization remains in the interval 0.8-0.9; this tells us that it is not convenient to use all the available memory. Setting ρ in this interval can also be considered a good rule of thumb for these intermediate cases, where the number of topics is neither too high nor too low.
The lower plot of figure 3 shows the sharing factors {$\phi_i$} of the different topics in the case of 30 topics. These topics are divided into 10 groups (3 topics per group) and the popularities of the groups are the ones presented in fig. 2. We observe that the sharing factors increase with the topic popularity, being proportional to the square root of the number of subscribers, as stated in eq. (12).
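The square-root rule of eq. (12) and its effect on the wait-only delay of eq. (9) can be checked with a short Python sketch. The function names are hypothetical, and the popularity values in the usage lines are made up for illustration; they are not the values plotted in figure 2.

```python
import math


def sqrt_sharing_factors(subscribers):
    """Eq. (12): phi_i proportional to the square root of S_i."""
    roots = [math.sqrt(s) for s in subscribers]
    total = sum(roots)
    return [r / total for r in roots]


def wait_only_mean_delay(subscribers, phi, n, lam, c_u):
    """Eq. (1) with the per-topic delay of eq. (9), D_i = n / (lam * C_u * phi_i)."""
    total_subs = sum(subscribers)
    return sum(s / total_subs * n / (lam * c_u * p)
               for s, p in zip(subscribers, phi))


if __name__ == "__main__":
    subscribers = [27, 16, 11, 9, 7]          # illustrative popularity values
    phi_sqrt = sqrt_sharing_factors(subscribers)
    phi_uniform = [1 / len(subscribers)] * len(subscribers)
    for name, phi in (("sqrt rule", phi_sqrt), ("uniform", phi_uniform)):
        print(name, wait_only_mean_delay(subscribers, phi, n=99, lam=1 / 14, c_u=850))
```

Under approximation (9) the square-root allocation never yields a larger mean delay than equal sharing, and the two coincide when all topics have the same popularity.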
In figure 4 we report the comparison of the mean delivery delay for an optimized and a non-optimized system. In the non-optimized system, all the available memory is used (ρ = 1) and the memory space is equally distributed among topics (i.e. $\phi_i = 1/T$).
We observe that the performance improvement provided by the sub-optimal configuration increases with the number of topics. However, even if it is not reported in the plot, this improvement does not keep growing with the number of topics. As an example, if we consider the extreme case of 1000 topics, each topic can deploy only 1 replica, thus the optimization returns the same configuration as the non-optimized system (i.e. 1 replica per topic) and the delay performance is the same.
In general, we argue that the presented optimization is not effective in cases of a very small or very large number of topics (with respect to the number of nodes and their capacity), but in the case of a “moderate” number of topics it provides a valuable performance improvement. In fact:
• in the case of very few topics, both the optimized and non-optimized systems deploy a large number of replicas for each topic. This results in good delay performance of the non-optimized system, so the improvement brought by the optimization is very limited (see for example the case of 20 topics in figure 6).
• in the case of many topics, there are few replication options, e.g., we could only choose between 1 or 2 replicas for a given topic. This means that the optimization process decides among a very limited set of possibilities and, as a matter of fact, the performance improvement provided by the optimization is limited. For instance, in the worst case of T = 1000, $C_n$ = 10, N = 100 the only viable solution to diffuse all the data-samples is to use only one replica per topic (i.e., no replication). This configuration is the same for both the optimized and the non-optimized systems.
• in the case of a moderate number of topics, when the previous conditions do not occur, the presented optimization provides valuable performance improvements.
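The size of the design space in each of these cases follows from simple arithmetic on the total memory $N \cdot C_n$. The Python sketch below lists the integer replica counts available per topic, assuming equal sharing among topics; the function name `replication_options` is hypothetical.

```python
def replication_options(n_topics, n_nodes, slots_per_node):
    """Integer replica counts per topic that fit in memory under equal sharing."""
    c_tot = n_nodes * slots_per_node
    return list(range(1, c_tot // n_topics + 1))


if __name__ == "__main__":
    for topics in (20, 100, 500, 1000):
        options = replication_options(topics, n_nodes=100, slots_per_node=10)
        print(topics, "topics ->", len(options), "option(s), max", options[-1])
```

With 1000 topics the list contains the single value 1, which is why the optimized and non-optimized configurations coincide in that case.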
B. Memory Utilization Tradeoffs
In this section we analyze the performance tradeoffs related
to the memory utilization.
Fig. 4. Mean delivery delay with and without optimal configuration, varying the number of topics. (Plot: mean delivery delay (s) versus number of topics, 30 to 100; curves: without optimizations, with optimizations.)
Fig. 5. Mean delivery delay from eq. 8 and from simulations in the case of 100 topics, varying the memory utilization ρ. (Plot: mean delivery delay (s) versus memory utilization, 0.1 to 1; curves: math, 1-10 rep; sim, 1-10 rep.)
We consider scenarios where the topics have the same
popularity, hence φi = 1/T. We feed the simulator with two different
mobility traces that we call “negexp” and “RWP”. The former
trace is derived assuming exponentially distributed
inter-meeting times, and we use it to validate the analytical
results; the latter trace follows the random waypoint mobility
model previously described.
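As an illustration of what a trace with exponentially distributed inter-meeting times looks like, the following minimal sketch generates the meeting instants of a single node pair. The rate, horizon and seed parameters are assumptions chosen for the example; the actual trace format used by the simulator is not described here.

```python
import random


def negexp_meeting_times(rate: float, horizon: float, seed: int = 0) -> list[float]:
    """Meeting instants of one node pair with exponential inter-meeting times.

    rate    -- meetings per second for the pair (hypothetical parameter)
    horizon -- length of the trace in seconds
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate)          # first meeting
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)     # next gap is drawn independently
    return times


meetings = negexp_meeting_times(rate=0.01, horizon=3600.0)
print(len(meetings), meetings[:3])
```

A full trace would repeat this for every pair of nodes, possibly with pair-specific rates.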
Figure 5 compares formula 8 with simulation
results in the case of 100 topics and negexp movements,
varying the memory utilization. The analytical model
fits the simulation curve well. As discussed in Section
II-C, the graph confirms the presence of an optimum point
beyond which the benefits of introducing more replicas in the
system are outweighed by a longer spray phase.
We then switch to the RWP
movement trace. This allows us to test the usefulness of the
optimization on a mobility scenario that is not consistent with
the mobility assumptions of the model. Figure 6 shows the
mean delivery delay varying the memory utilization and the
number of topics. The optimum point varies with the number
of topics, and the performance degradation due to memory
saturation increases with the number of topics, as we can
see for instance by comparing the difference between the optimal
memory utilization and ρ = 1 for the cases of 100 topics and 20
topics.
Fig. 6. Mean delivery delay varying the number of topics and memory utilization ρ. (Plot: mean delivery delay (s) versus memory utilization, 0 to 1; curves: 100 topics, 1-10 rep; 70 topics, 1-14 rep; 50 topics, 1-20 rep; 20 topics, 1-50 rep; arrows mark the analytical optimal utilization.)
The reason lies in the different number of replicas per
topic in these two cases, which ranges over
1-10 and 1-50, respectively. When we add more replicas to the system,
the spray time increases as well. During the spray process,
the first replicas are diffused more rapidly, while the last
ones are diffused more slowly. In fact, data samples can be
replicated only on nodes that have available memory and
are not already holding a replica of that data sample. However,
when the spray time becomes significantly high, several replicas
have already been placed in the case of 20 topics, while just a few
replicas have been spread in the case of 100 topics. For this reason
the performance gap between the optimum point and full
memory occupation is smaller in the case of 20 topics and
bigger in the case of 100 topics. The graph also marks with
arrows the values of memory utilization (i.e., values on the
x-axis) returned by the utilization optimization (Section III-A). As
we can see, the optimization succeeds in finding the optimum
point in all the presented cases. Figure 7 shows the duration of
the spray phase under the same conditions as the previous graph.
Here, beyond a memory utilization of 0.8-0.9, the spray time
curve rises sharply.
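The slowdown of the last replicas can be reproduced with a deliberately simplified toy model, which is not the analytical model of this paper. It assumes that only the source sprays, that the source meets every other node at the same exponential rate, and that a node is eligible as long as it does not already hold a replica; memory saturation caused by other topics is ignored. All names are hypothetical.

```python
def expected_spray_gaps(num_nodes: int, num_replicas: int, meeting_rate: float) -> list[float]:
    """Expected waiting time before each additional replica is handed over.

    Toy model (an assumption, not the paper's equations): the source holds
    the first replica and meets each other node at rate `meeting_rate`.
    The waiting time for the next hand-over is exponential with rate
    meeting_rate * (number of nodes still without a replica).
    """
    if not 1 <= num_replicas <= num_nodes:
        raise ValueError("need 1 <= num_replicas <= num_nodes")
    gaps = []
    for placed in range(1, num_replicas):     # `placed` replicas already exist
        eligible = num_nodes - placed         # nodes still without a replica
        gaps.append(1.0 / (meeting_rate * eligible))
    return gaps


gaps = expected_spray_gaps(num_nodes=100, num_replicas=50, meeting_rate=0.001)
print(f"first gap {gaps[0]:.1f} s, last gap {gaps[-1]:.1f} s, total {sum(gaps):.1f} s")
```

Even in this memory-free setting each further replica takes longer to place than the previous one; with saturated buffers the pool of eligible nodes shrinks further, which is consistent with the sharp rise of the spray time at high utilization.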
C. Sharing Factor Tradeoffs
In this section we analyze the performance tradeoffs related
to the sharing factor.
We consider a scenario with 100 topics equally
divided into two groups, named “popular” and “unpopular”:
topics belonging to the popular group have 50 subscribers,
while topics belonging to the unpopular group have 10
subscribers. Nodes move according to a RWP mobility model.
Figure 8 shows the delivery delay of popular and
unpopular topics and the overall average delivery delay (D), varying
the sharing factor of the popular topics while keeping ρ = 1.
Each point x on the x-axis represents the sharing factor of
one popular topic (1/1000 ≤ x ≤ 9/1000); at that point each
unpopular topic has a sharing factor equal to 10/1000 − x.
Fig. 7. Mean spray duration time varying the number of topics and the memory utilization. (Plot: mean spray delay (s) versus memory utilization, 0 to 1; curves: 100 topics, 1-10 rep; 70 topics, 1-14 rep; 50 topics, 1-20 rep; 20 topics, 1-50 rep.)
Fig. 8. Mean delivery delay for the case of 100 topics equally divided in two groups, named popular and unpopular. (Plot: mean delivery delay (s) versus sharing factor of popular topic, 0.001 to 0.009; curves: popular topic, unpopular topic, overall mean delay.)
We observe that the minimum of the average delivery delay
is achieved with a sharing factor of 7/1000 for popular topics
and 3/1000 for unpopular ones. This result agrees with
formula 12, which places the optimal value at
0.00690983.
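Formula 12 is not reproduced in this section, so the following is only a numerical observation, not a restatement of it. If the 10/1000 budget of a popular/unpopular pair were split in proportion to the square root of the subscriber counts (a hypothetical rule, assumed here), the popular share would coincide with the reported value, and the nearest point on the simulated grid would be 7/1000.

```python
import math


def sqrt_share(subs_a: int, subs_b: int, pair_budget: float) -> float:
    """Share of topic A when a pair budget is split proportionally to the
    square root of the subscriber counts (hypothetical rule, assumed)."""
    wa, wb = math.sqrt(subs_a), math.sqrt(subs_b)
    return pair_budget * wa / (wa + wb)


x_star = sqrt_share(50, 10, 10 / 1000)             # popular: 50 subs, unpopular: 10
grid = [k / 1000 for k in range(1, 10)]            # sharing factors swept in Fig. 8
nearest = min(grid, key=lambda x: abs(x - x_star))
print(round(x_star, 8), nearest)
```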
V. RELATED WORK
Several DTN routing strategies have been proposed so
far. If we classify these strategies according to the number
of replicas introduced in the network, we have single-copy
routing schemes, where only one copy of a given
message is routed and forwarded, and multiple-copy routing
schemes, where multiple copies of the same message are
routed and forwarded independently.
The single-copy routing scheme proposed in [8] belongs
to the first category. In this work, the authors propose a routing
scheme called “seek and focus”, where a single copy of the
message is passed between nodes according to a utility function
that depends on the last time a node encountered the
destination. In the seek phase, if the utility around the node
holding the data is “low”, the node hands over the data to a
neighbor node with probability p, until a node with a high
utility is found. When this node is found, the focus phase starts
and nodes use utility-based forwarding to deliver the message
to the final destination.
With regard to the multiple-copy routing schemes, one
possible approach is to distribute a message to all or almost
all the nodes in the network. This is the case of [9], where
the authors propose the so-called epidemic routing. This approach
consists in distributing messages to nodes called “carriers” that
are within connected portions of ad hoc networks. During
their movements, carriers come into contact with other
connected portions of the network, thus “infecting” additional
islands of connected nodes.
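The “infection” step can be sketched as a pairwise buffer exchange. This is a minimal illustration that assumes unlimited buffers and represents messages by hypothetical string identifiers; it is not the protocol implementation of [9].

```python
def epidemic_exchange(buffer_a: set[str], buffer_b: set[str]) -> None:
    """On contact, each node copies the messages it is missing from the other.

    Buffers are modified in place; unlimited storage is assumed.
    """
    missing_at_a = buffer_b - buffer_a
    missing_at_b = buffer_a - buffer_b
    buffer_a |= missing_at_a
    buffer_b |= missing_at_b


carrier = {"m1", "m2"}
island_node = {"m3"}
epidemic_exchange(carrier, island_node)
print(sorted(carrier), sorted(island_node))
```

After the contact both nodes hold every message, which is exactly what makes this approach memory-hungry when buffers are limited.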
The probabilistic routing approach proposed in [10], called
PROPHET (Probabilistic Routing Protocol using History of Encounters and
Transitivity), aims at overcoming the two main limitations
of the epidemic routing approach: the assumption of infinite
buffer space and bandwidth at the nodes, and the inability to
predict user movements.
To overcome these limitations, PROPHET uses a
forwarding strategy based on a probabilistic metric, the “delivery
predictability”: nodes that are often encountered have a high
delivery predictability. Delivery predictability is also
transitive: if node a frequently
encounters node b, and node b frequently encounters node c,
then node c is probably a good node to forward messages
destined for node a.
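The three update rules usually associated with PROPHET's delivery predictability (encounter, aging, transitivity) can be sketched as follows. The constants are illustrative values assumed for this example, and the function names are hypothetical.

```python
P_INIT = 0.75   # weight of a direct encounter (illustrative value, assumed)
BETA = 0.25     # damping of the transitive contribution (assumed)
GAMMA = 0.98    # aging factor per elapsed time unit (assumed)


def on_encounter(p_old: float) -> float:
    """Raise the predictability towards a node that has just been met."""
    return p_old + (1.0 - p_old) * P_INIT


def age(p_old: float, elapsed_units: int) -> float:
    """Decay the predictability when the node has not been met for a while."""
    return p_old * GAMMA ** elapsed_units


def transitive(p_ac_old: float, p_ab: float, p_bc: float) -> float:
    """Update a's predictability for c through the intermediate node b."""
    return p_ac_old + (1.0 - p_ac_old) * p_ab * p_bc * BETA


p_ab = on_encounter(on_encounter(0.0))     # a has met b twice
p_ac = transitive(0.0, p_ab, 0.8)          # b often meets c
print(round(p_ab, 4), round(p_ac, 4), round(age(p_ab, 10), 4))
```

A message is then forwarded to an encountered node only if that node has a higher predictability for the destination than the current carrier.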
There also exists a multiple-copy version of the
single-copy routing scheme proposed in [8]. This scheme was
presented by the same authors in [3]; it is called “spray
and wait” and it is the one we use in this paper. In the same
work, the authors propose a second scheme, denoted “spray
and focus”, which also consists of two phases:
i) the spray phase, which is the same as the spray phase in
the “spray and wait” scheme, and ii) the focus phase, during which
each of the L message copies is independently routed according
to the single-copy scheme proposed in [8]. In addition, they
provide some insight into how to choose the number of replicas
to spray in the network to achieve a required expected delay
(expressed as a multiple of the optimal delay).
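One way to realize the spray phase is the binary variant described in [3], in which a carrier hands over half of its copies at each useful contact. The sketch below assumes that variant purely for illustration; this section does not state which spraying rule the simulations use, and the function name is hypothetical.

```python
def binary_spray(copies_at_carrier: int) -> tuple[int, int]:
    """Return (kept, handed_over) when a carrier meets a node with no copy.

    Binary spraying: a carrier with n > 1 copies hands over floor(n / 2)
    and keeps the rest. With a single copy the carrier is in the wait
    phase and only delivers directly to the destination.
    """
    if copies_at_carrier <= 1:
        return copies_at_carrier, 0
    handed = copies_at_carrier // 2
    return copies_at_carrier - handed, handed


print(binary_spray(10), binary_spray(7), binary_spray(1))
```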
As regards DTN routing used with the publish-subscribe
communication paradigm, we cite SocialCast [11].
This work relies on the notion of utility U of a node N with
respect to an interest I, which represents how good a carrier N is
for messages matching the interest I. The basic assumption
is that hosts with the same interests spend time co-located,
so SocialCast routing aims at exploiting as message carriers
the hosts that have often been co-located with the
interested ones. Kalman filter forecasting techniques are used to
predict the future evolution of the movement based on previous
observations of some attributes characterizing social behavior
(e.g., connectivity changes, co-location).
VI. CONCLUSIONS
We tackled the problem of how to optimize the mean
delivery delay of a Publish Subscribe system built on top of a
Delay Tolerant Network (DTN). We consider scenarios where
i) the DTN nodes make available only a limited memory space
for routing purposes, and ii) topics have different popularity,
i.e., different numbers of subscribers. We focused on the
spray-and-wait routing scheme, and therefore the only configuration
parameter that could be optimized is the number of replicas
per topic.
Given that a holistic approach proves too
complex, we split the optimization problem in two parts and
provide answers to these basic questions: i) what is the
optimal amount of available memory to use for replication
purposes (memory utilization), and ii) what is the right way
to share this amount of memory among topics (sharing factor).
In doing so, although we derive a suboptimal solution, we obtain
handy and effective formulas. Indeed, excluding the
extreme cases of too few or too many topics, the optimization
provides a reduction of the mean delivery delay of up to 60%
with respect to a non-optimized system.
Finally, we argue that our optimization results can be used
not only with spray-and-wait routing but also with
other multiple-copy routing schemes that take the number
of replicas as an input parameter. Obviously, in these cases the
number of replicas could be just one of the many parameters
to optimize.
REFERENCES
[1] P. T. Eugster, P. A. Felber, R. Guerraoui, A. Kermarrec, The Many Faces
of Publish/Subscribe, in ACM Computing Surveys (CSUR), Volume 35,
Issue 2, 2003.
[2] A. Detti, D. Battaglino, L. Bracciale, F. Lo Piccolo, A. Bragagnini,
M. S. Turolla, N. Blefari Melazzi, A topic-based, publish-subscribe
architecture for intermittently connected 802.15.4 networks, to be
presented in Mobile Summit 2011 and available for peer review at
http://stud.netgroup.uniroma2.it/ lorenzo/mobilesummit.pdf
[3] T. Spyropoulos, K. Psounis, C. S. Raghavendra, Efficient routing in
intermittently connected mobile networks: the multiple-copy case, IEEE/ACM
Transactions on Networking (TON), vol. 16, no. 1, pp. 77-90, February 2008.
[4] http://stud.netgroup.uniroma2.it/ lorenzo/campus++/simulator.tar.bz2
[5] T. Camp, J. Boleng, V. Davies, A Survey of Mobility Models for Ad
Hoc Network Research, Wiley, Wireless Communications & Mobile
Computing, vol. 2, 2002.
[6] J. Yoon, M. Liu, B. Noble, Random waypoint considered harmful, in
Proceedings of INFOCOM 2003, San Francisco, April 2003.
[7] S. Voulgaris, E. Rivière, A.-M. Kermarrec, M. van Steen,
Sub-2-Sub: Self-organizing content-based publish and
subscribe for dynamic and large scale collaborative networks, Research
Report RR-5772, INRIA, Rennes, France, December 2005.
[8] T. Spyropoulos, K. Psounis, C. S. Raghavendra, Single-copy routing
in intermittently connected mobile networks, in Proc. IEEE Conf. Sensor
and Ad Hoc Communications and Networks (SECON), 2004, pp. 235-244.
[9] A. Vahdat, D. Becker, Epidemic routing for partially connected ad
hoc networks, Duke Univ., Durham, NC, Tech. Rep. CS-200006, Apr.
2000.
[10] A. Lindgren, A. Doria, O. Schelen, Probabilistic routing in
intermittently connected networks, SIGMOBILE Mobile Comput. Commun.
Rev., vol. 7, no. 3, 2003.
[11] P. Costa, C. Mascolo, M. Musolesi, G. P. Picco, Socially-Aware Routing
for Publish-Subscribe in Delay-Tolerant Mobile Ad Hoc Networks, in
IEEE Journal on Selected Areas in Communications (JSAC), vol. 26, no.
5, pp. 748-760, June 2008.