
Delay Performance of a Publish Subscribe system deployed over a Memory-Constrained, Delay Tolerant Network

Lorenzo Bracciale∗, Donato Battaglino∗, Andrea Detti∗, Giuseppe Bianchi∗, Nicola Blefari-Melazzi∗, Andrea Bragagnini†, Maura Santina Turolla†

∗DIE, Università di Roma “Tor Vergata”, Rome, Italy
{lorenzo.bracciale,donato.battaglino,andrea.detti,giuseppe.bianchi,blefari}@uniroma2.it

†Telecom Italia, Turin, Italy
{andrea.bragagnini,maurasantina.turolla}@telecomitalia.it

Abstract—In this work we assess the delay performance of a

Publish Subscribe system built on top of a Delay Tolerant Net-

work (DTN) composed of nodes with limited storage capacities.

Many DTN routing protocols replicate the same data over several

nodes, in order to deliver data to destination in a faster or in a

more reliable way. Of course, increasing the number of replicas

has the effect of decreasing the delivery delay perceived by the

users, but increases the use of the system memory.

Our goal is to investigate the trade-off between reduction

of delay and storage requirements when nodes are memory-

constrained, in a Topic-based, Publish Subscribe system where

we have different topics with different popularity.

We provide some insights into this trade-off, which implies some

unanticipated issues, and propose simple rules to dimension the

number of replicas per topic. To this end, we derive analytical

models and we validate them with simulations.¹

I. INTRODUCTION

Publish-Subscribe represents a very popular communication

paradigm mainly because of the loose coupling it introduces

between actors (publishers / subscribers), which well suits

asynchronous and many-to-many communications. One of the

most popular publish-subscribe forms is the so-called topic-based [1]. In a topic-based publish-subscribe system, users

are interested only in a certain set of events, named topics

and represented through keywords (e.g. the topic name). Once

subscribers have subscribed to a specific topic, they will

receive data-samples emitted by publishers on that topic. In

other words, topics are logic channels that connect publishers

to subscribers.

The publish-subscribe paradigm is a very convenient service

model in Delay Tolerant Networks (DTNs). These are net-

works characterized by intermittent connectivity and/or long

propagation delays between the nodes of the network. Typ-

ical application scenarios are deep-space, underwater, sensor

and tactical networks where connectivity cannot always be

guaranteed. Moreover, DTNs well satisfy the requirements of

user communities that do not require real time services, e.g. a

messaging system in a university campus.

¹ This work has been carried out in cooperation with and with funding from Telecom Italia.

DTN routing/forwarding schemes employ the so called

store-carry-forward paradigm: when connectivity towards a

neighbor node is not available, or deemed not convenient by

the forwarding scheme employed, a (relay) node temporarily
buffers messages. These will be forwarded at a later time,
when the nodes’ movement makes a link towards
the destination available. Clearly, the end-to-end delay depends on the

mobility model of the nodes (speed, distribution). To improve

delay performance and delivery probability, the typical strategy

consists in creating multiple copies of a same data-sample over

several nodes. Of course, increasing the number of replicas

has the effect of decreasing the delivery delay perceived by

the users (subscribers) but increases the use of the system

memory.

In this paper we present an optimization of a topic-based

publish-subscribe system that operates on top of a DTN which

uses a multiple copy routing approach. The topics could have

different popularity, i.e. different number of subscribers. The

DTN is formed by nodes that have a limited memory to store

and carry the published data-samples. This case is of interest

for DTNs realized with embedded systems such as [2] or,

more generally, when the size of data-samples is comparable

with the storage space available in the nodes. The optimization

goal is to minimize the mean delivery delay perceived by the

subscribers.

The presence at the same time of memory constraints

and topics with different popularity makes the optimization

problem a challenging one. Indeed, as shown later, greedily

exploiting all the memory resources available in the network

nodes is counter-productive with respect to the system performance.
To the best of our knowledge, this issue has not previously been
addressed in the literature. Accordingly, we provide
convenient formulae for properly dimensioning the

number of replicas per topic. We further address the case

of topics with different popularity and we prove that the

optimal number of replicas is proportional to the square root

of the topic popularity. We verify the effectiveness of our

analytical modeling by means of simulations. To simplify

the optimization we preliminarily focus our attention on a

2011 The 10th IFIP Annual Mediterranean Ad Hoc Networking Workshop

978-1-4577-0899-2/11/$26.00 ©2011 IEEE

very simple DTN, employing the spray-and-wait [3] routing

scheme, and on random mobility assumptions. However, we

believe that our results may be applicable also to other multiple

copy routing schemes where the number of replicas per data-

sample is an input parameter.

II. PROBLEM DESCRIPTION

In this section, we introduce the considered scenario and

assumptions, we state the delay optimization problem, and

we illustrate at a glance the trade-offs that emerge in the

optimization.

A. Scenario

We consider a wireless network composed of N mobile

nodes that move within a service area according to a random

mobility model. Each mobile node may have publishing and/or

subscribing entities for topics; there are T topics and the i-

th topic has Si subscribing entities. Throughout the paper

we refer to a publishing (subscribing) entity with the term

publisher (subscriber). Moreover, we refer to the parameter Si

as topic popularity.

If a publisher and a subscriber for a same topic are within

their radio coverage range, they can directly exchange data-

samples. When this is not the case, we assume that data-

samples are forwarded throughout the DTN using the “spray

and wait” approach. Specifically, the data-samples for the

different topics are replicated into the memories of nodes by

using the so-called binary spray [3]. Binary spray spreads

the replicas of a data-sample for the i-th topic as follows:

the source generates Ri replicas (including the original data-sample);
any node A with x > 1 replicas² of the data-sample
that meets a node B with no replica forwards ⌊x/2⌋ of its
copies to B and keeps the remaining copies.
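The hand-off rule of binary spray can be illustrated with a minimal Python sketch. The function name `binary_spray_handoff` and its arguments are hypothetical names chosen for this illustration, and memory limits are deliberately ignored here.

```python
from typing import Tuple


def binary_spray_handoff(copies_at_a: int, b_has_replica: bool) -> Tuple[int, int]:
    """Return (copies kept by A, copies handed to B) after one encounter.

    Assumption of this sketch: memory limits are ignored, only the
    binary split of the replica counter is modelled.
    """
    if copies_at_a <= 1 or b_has_replica:
        # A holds a single copy (wait phase) or B already has the sample.
        return copies_at_a, 0
    handed = copies_at_a // 2  # floor(x / 2) copies go to B
    return copies_at_a - handed, handed


if __name__ == "__main__":
    print(binary_spray_handoff(8, False))
    print(binary_spray_handoff(5, False))
```

With an odd counter the node that sprays keeps the larger half, which follows directly from handing over the floor of x/2.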

Unlike [3], we focus on the case of nodes whose memory

made available for the DTN forwarding process is bounded.

Specifically, we assume that each node may store at most

Cn data-samples. This sets forth the additional condition that

a data-sample is replicated only whenever a neighbor node

has at least one memory slot available. The introduction of a

memory bound in a topic-based system (where data-samples

are persistently generated) implies that a suitable mechanism

must be introduced to control the data obsolescence. Viable

approaches consist in either defining data-sample lifetimes,

or automatic event-based deletion mechanisms, for instance

when a node “sees” a newer data-sample for the same topic in a

neighbor node [2]. In what follows, for simplicity, we idealize

such data obsolescence mechanism by assuming that all the

data-samples for a given topic are removed from the memory

when a new data-sample is published for that same topic.

Finally, we assume that the time needed to exchange data-samples among neighbor nodes is negligible, and that a contact among nodes lasts for a sufficient amount of time to exchange all the data-samples that are required by the DTN forwarding rules.

² Note that when a node, because of the spray operation, holds x > 1 replicas of a data-sample, the x − 1 additional copies do not count toward the memory occupation of the node, i.e. only one copy of the data-sample is actually stored in the node memory, together with a counter that keeps track of the number of local replicas.
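As an illustration of the memory model described in this subsection, the following Python sketch keeps at most Cn distinct data-samples per node, stores multiple local replicas as one slot plus a counter, and drops a sample when a newer one is published on the same topic. The class `NodeBuffer` and its method names are hypothetical; the paper idealizes obsolescence as a network-wide removal, whereas this sketch only shows the per-node bookkeeping.

```python
class NodeBuffer:
    """Bounded per-node store: at most `capacity` distinct data-samples (Cn)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = {}  # topic -> [sample_id, local replica counter]

    def has_free_slot(self):
        return len(self.slots) < self.capacity

    def accept(self, topic, sample_id, copies):
        """Store `copies` replicas of a sample; they occupy a single slot."""
        if topic in self.slots or not self.has_free_slot():
            return False  # already holds this topic, or memory is full
        self.slots[topic] = [sample_id, copies]
        return True

    def on_new_publication(self, topic, sample_id):
        """Idealized obsolescence: drop any older sample of the same topic."""
        held = self.slots.get(topic)
        if held is not None and held[0] < sample_id:
            del self.slots[topic]


if __name__ == "__main__":
    buf = NodeBuffer(capacity=2)
    print(buf.accept("weather", 1, 4), buf.accept("news", 1, 1))
    print(buf.accept("sport", 1, 1))  # refused: both slots are in use
```

Keeping one sample per topic in a dictionary mirrors the idealized obsolescence rule, under which a node never holds two samples of the same topic.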

B. Problem Statement and definitions

We recall that the main characterizing feature of the spray

and wait protocol is the number of replicas Ri that are

disseminated during the spray phase for each data-sample

published to the i-th topic. In this work, we aim at globally

optimizing said number of per topic replicas R = [R1···RT],

for minimizing the average time D between the publication

of a data-sample and its reception by subscribers. We call this

delay the mean delivery delay. Obviously, this delay depends

on the different topic popularity Si. It is readily expressed

as the weighted sum of the mean delays Di perceived by

subscribers of topic i, where the weights are the ratio between

the number of subscribers for topic i and the total number of

subscribers in the system³. In formula:

$$D = \sum_{i=1}^{T} \frac{S_i}{\sum_{j=1}^{T} S_j} \cdot D_i \tag{1}$$
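A small Python sketch of the weighted mean in equation (1) follows. The function name `mean_delivery_delay` and the sample numbers are assumptions made for illustration only.

```python
def mean_delivery_delay(subscribers, delays):
    """Equation (1): popularity-weighted mean of the per-topic delays.

    subscribers[i] is S_i, delays[i] is D_i (both hypothetical inputs).
    """
    total = sum(subscribers)
    return sum(s / total * d for s, d in zip(subscribers, delays))


if __name__ == "__main__":
    # Two topics: 30 subscribers wait 100 s, 10 subscribers wait 200 s.
    print(mean_delivery_delay([30, 10], [100.0, 200.0]))
```

The popular topic dominates the result, which is why shifting replicas towards popular topics can lower D even though the unpopular topic gets slower.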

It is convenient to express the number of replicas Ri in terms of two parameters which we refer to as memory utilization (ρ) and sharing factor (φi). We define the memory utilization ρ as the ratio between the network-wide memory space used to store data-samples, Cu = Σi Ri, and the overall available memory Ctot = NCn:

$$\rho = \frac{C_u}{C_{tot}} = \frac{\sum_{i=1}^{T} R_i}{N C_n} \tag{2}$$

Moreover, for each topic i we define the sharing factor φi as the fraction of the used memory space Cu assigned to topic i for its replicas:

$$\phi_i = \frac{R_i}{C_u} = \frac{R_i}{\sum_{j=1}^{T} R_j} \tag{3}$$

The number of replicas deployed for a given topic i is readily expressed in terms of these two parameters as:

$$R_i = \rho \cdot \phi_i \cdot C_{tot} = \rho \cdot \phi_i \cdot N C_n \tag{4}$$

The problem thus consists in determining the optimal set of parameters ρ and {φi} so that the delay D given in equation (1) is minimized, under the condition that no node is able to locally store more than Cn distinct replicas.
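The change of variables between the replica vector and the pair (ρ, {φi}) can be sketched in Python as below. The function names `to_rho_phi` and `to_replicas` and the numbers in the example are hypothetical.

```python
def to_rho_phi(replicas, n_nodes, slots_per_node):
    """Equations (2) and (3): replica counts -> (rho, [phi_i])."""
    c_tot = n_nodes * slots_per_node  # overall memory, C_tot = N * C_n
    c_u = sum(replicas)               # used memory, C_u = sum of R_i
    rho = c_u / c_tot
    phi = [r / c_u for r in replicas]
    return rho, phi


def to_replicas(rho, phi, n_nodes, slots_per_node):
    """Equation (4): R_i = rho * phi_i * N * C_n (may be fractional)."""
    return [rho * p * n_nodes * slots_per_node for p in phi]


if __name__ == "__main__":
    rho, phi = to_rho_phi([6, 3, 1], n_nodes=5, slots_per_node=4)
    print(rho, phi)
    print(to_replicas(rho, phi, 5, 4))
```

Note that equation (4) generally yields non-integer values, so a deployment would still have to round them to whole replicas; how to round is not specified in the text.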

C. Discussion

At a first glance, the need to account for the memory

utilization ρ may not be evident. Indeed, the per topic average

delay Di clearly depends on the number of deployed replicas
Ri, and without memory constraints the delay performance
would improve with a larger number of replicas. It could
therefore naively appear that the sum of the replicas should

³ We note that every node may be a subscriber for one or more topics, so the number of subscriptions could be greater than the number of nodes. Throughout this paper we consider one subscriber per subscription, so the number of subscribers could also be greater than the number of nodes.

be set to the maximum value allotted by the global memory

constraints, i.e., ρ = 1. However, a closer look reveals that this

is not the case. Indeed, the optimal choice derives from trade-

offs that involve both the memory utilization and the sharing

of the memory space:

• memory utilization - as we increase the overall num-

ber of replicas in the system, the probability that the

subscriber meets a node having one of these replicas

increases and that tends to reduce the delivery delay.

Conversely, as we increase the overall number of replicas

in the system, the probability of finding a node with a free

memory space decreases. Consequently, the duration of

the spray phase is longer, the time required to spread in

the system a given number of replicas increases and that

tends to increase the delivery delay.

• memory sharing - let us consider a situation where

each topic has the same number of replicas and hence all

topics fairly share the system memory. As we increase

the number of replicas for a given topic i, we have to

decrement the number of replicas for another topic j, so

as to maintain constant the memory occupancy. In doing

so, the delivery delay of topic i decreases and the delivery

delay of topic j increases. This delay unfairness could

reduce the overall mean delivery delay (D), when the

number of subscribers Si of topic i is greater than Sj.

However, excessive unfairness could impose such a high
delay penalty on subscribers of topic j that it cancels the
benefit of the delay reduction obtained by subscribers of
topic i.

III. PROBLEM RESOLUTION

In the general case of different topic popularity, a delay

optimization jointly targeting the parameters ρ and {φi}

appears numerically cumbersome and this could undermine the
understanding of the problem and the derivation of practical

guidelines. Therefore, we resorted to a sub-optimal sequential

approach which consists in two subsequent steps:

1) memory utilization optimization: first, we derive the

memory utilization ρ which minimizes delay in the

assumption that all the topics have the same popularity

and, hence, φi = 1/T;

2) sharing factor optimization: second, we derive the

sharing factors φi, by assuming that the available mem-

ory is the one derived by the previous memory utilization

optimization.

To simplify the analytical computation we consider exponen-

tially distributed inter-meeting time between any pair of nodes.

This implies that each node sees another (uniformly distributed)
random node every 1/λ seconds on average.

A. Memory Utilization Optimization

Under the assumption that all the topics have the same

popularity and deploy the same number of per-topic replicas

R = Ri, the subscribers of the different topics experience

the same delay that is equal to the mean delivery delay, i.e.

D = Di. It therefore suffices to study the delivery delay for a single topic, by determining an analytic formula (eq. 8) which expresses the delivery delay as a function of the memory utilization, and to numerically find the optimal value of ρ that provides the minimum delay value.

[Fig. 1. Markov chain representing the number of replicas in the system for a given data-sample.]

To make the computation of D analytically tractable we

resort to the following assumptions:

• we approximate the binary spray mechanisms as an epi-

demic diffusion of data-sample replicas that stops when

the target number of replicas Ri is reached.

• we assume only one spray at a time; in other words,

during the spray phase of the considered topic, the

amount of memory used for data-sample replicas of other

topics does not change. This is a good approximation

for the case of infrequent emissions of data-samples,

whereas it should be considered as a worst case if data-

samples production becomes frequent.

We model the evolution of spray and wait using the
continuous-time absorbing Markov chain represented in figure

1. The state variable of the Markov chain is the number of

deployed replicas of the considered topic. When a data-sample

is first produced, the chain starts with state 1, meaning that

only the producer has a copy of the data-sample. Whenever

the producer “meets” another node, the chain moves to state

2, meaning that two nodes in the network carry a data-sample

replica. When the chain reaches state R, meaning that the data-

sample is copied over R nodes, the so-approximated spray

phase stops.

In addition, a special “absorption” state models

the reception of the data-sample by a selected subscriber

for that topic, hereafter referred to as target subscriber. As

such, the average data-sample delivery delay can be simply

computed by determining the absorption time in the Markov

chain. For ease of notation, we conveniently call n the number

of nodes in the network excluding the target subscriber (i.e.

n = N − 1).

Each state i ∈ (1 : R − 1) in the chain has two outgoing

transition rates: one towards the state i + 1, and the other

towards the absorption state. These transition rates are derived

and justified as follows:

• transition rate {i → i + 1}. When the chain is in state


i, a transition to state i + 1 occurs whenever one of the

i replica nodes encounters any of the remaining nodes

except the target subscriber. Since the rate of encounters

for each node is λ, the aggregate rate of encounters for

the i nodes storing a data-sample replica is λ·i. Among

these encounters, the data-sample is further replicated

only when i) the encountered node is not the target

subscriber and it does not already hold the replica, and

ii) the encountered node has at least one memory slot

available to accommodate the replica. Upon an actual

encounter, condition (i) occurs with probability (n−i)/n,

whereas, to model condition (ii), we define fR to be

the probability that the encountered node has at least a

memory slot available. fR is derived in the subsequent

expression (7). We thus conclude that the transition rate

{i → i + 1} is expressed by the product iλfR(n − i)/n.

• transition rate {i → absorption}. Transition to the

absorption state occurs when the target subscriber meets

any of the i nodes holding the data-sample. In formula,

this is expressed by iλ/n.
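The two transition rates just derived can be written as a short Python helper. The name `transition_rates` and the numeric values used in the example are assumptions for illustration.

```python
def transition_rates(i, n, lam, f_r):
    """Outgoing rates of state i (i nodes hold a replica).

    n   : number of nodes excluding the target subscriber (n = N - 1)
    lam : per-node encounter rate (lambda)
    f_r : probability that an encountered node has a free memory slot
    Returns (rate towards state i + 1, rate towards absorption).
    """
    spread = i * lam * f_r * (n - i) / n
    absorb = i * lam / n
    return spread, absorb


if __name__ == "__main__":
    spread, absorb = transition_rates(i=2, n=99, lam=1 / 14, f_r=0.5)
    # The inverse of the total outgoing rate is the mean sojourn time.
    print(spread, absorb, 1 / (spread + absorb))
```

For the last state of the spray phase the spreading rate should be taken as zero, since no further replica is created once R nodes hold the data-sample.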

Using this chain, the mean delivery delay D of the con-

sidered topic can be calculated as the delay between the state

1 and the absorption state (state 0). We define ∆i and δi,j as,
respectively, the mean time the system remains in state i
and the average time needed to move from state i to state j.

According to the Markov chain theory, the state equations

can be written as:

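$$\begin{aligned}
\delta_{1,0} &= \Delta_1 + \frac{f_R (n-1)}{1 + f_R (n-1)}\, \delta_{2,0} \\
\delta_{2,0} &= \Delta_2 + \frac{f_R (n-2)}{1 + f_R (n-2)}\, \delta_{3,0} \\
\delta_{3,0} &= \Delta_3 + \frac{f_R (n-3)}{1 + f_R (n-3)}\, \delta_{4,0} \\
&\;\;\vdots \\
\delta_{R,0} &= \Delta_R
\end{aligned} \tag{5}$$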

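The mean time ∆i the system remains in state i is given by the inverse of the total outbound transition rate from that state:

$$\begin{aligned}
\Delta_1 &= \frac{n}{\lambda\,[1 + f_R (n-1)]} \\
\Delta_2 &= \frac{n}{2\lambda\,[1 + f_R (n-2)]} \\
\Delta_3 &= \frac{n}{3\lambda\,[1 + f_R (n-3)]} \\
&\;\;\vdots \\
\Delta_R &= \frac{n}{R\lambda}
\end{aligned} \tag{6}$$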
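Equations (5) and (6) lend themselves to a backward recursion, sketched below in Python. The function name `mean_delay_recursive` is hypothetical and fR is passed in as a plain parameter, so this block does not depend on how fR is obtained.

```python
def mean_delay_recursive(r, n, lam, f_r):
    """Solve system (5) backwards, from delta_{R,0} down to delta_{1,0}.

    r   : target number of replicas R (integer, 1 <= r <= n)
    n   : number of nodes excluding the target subscriber
    lam : per-node encounter rate
    f_r : probability that an encountered node has a free slot
    """
    delta = n / (r * lam)  # delta_{R,0} = Delta_R
    for i in range(r - 1, 0, -1):
        out = 1 + f_r * (n - i)
        sojourn = n / (i * lam * out)  # Delta_i from (6)
        p_next = f_r * (n - i) / out   # probability of moving to i + 1
        delta = sojourn + p_next * delta
    return delta


if __name__ == "__main__":
    # With n = 99 and 1/lambda = 14 s, a single copy waits n/lambda seconds.
    print(mean_delay_recursive(1, 99, 1 / 14, 1.0))
    print(mean_delay_recursive(2, 99, 1 / 14, 1.0))
```

With fR = 0 no replica can ever be placed, and the recursion collapses to the single-copy delay n/λ for any R, which is a convenient sanity check.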

To conclude the analysis, it remains to derive the probability fR used in the chain. This is defined as the probability that an encountered node has a free memory slot to accommodate the replica. We remark that fR does not change with the chain evolution⁴, and can be computed by quantifying the probability that a node has at least one free memory slot before the start of the spray phase, i.e. when there are only R(T − 1) replicas in the system generated by the remaining topics. Specifically, note that the replicas generated for the considered topic do not affect the probability fR. Indeed, nodes which already hold a replica for the considered topic are not accounted for in the term fR, but are accounted for in the complementary term (n − i)/n within the state transition rate. It follows that fR can be well approximated by:

$$f_R = 1 - \left( \frac{n}{n+1} \right)^{C_{tot} - R(T-1)} \tag{7}$$

This formula is obtained considering that before the start of the spray there are Ctot − R(T − 1) memory slots available for the considered topic, distributed over the node memories. Considering free memory slots as “tokens” randomly assigned to nodes, the probability that a node has at least one free memory slot is the complement of the probability that the node has not been assigned any free space.

⁴ We recall that we model the process of data-sample distribution for a given topic, while assuming that every remaining topic has already “sprayed” its data-sample and thus reached steady state. A joint modeling of all topics would yield an exponential increase in the state space of the chain.
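The token argument behind equation (7) translates into a one-line Python function. The name `free_slot_probability` is hypothetical; the example uses N = 100 nodes, Cn = 10 and T = 100 topics, values that also appear in the evaluation section.

```python
def free_slot_probability(n, c_tot, r, t):
    """Equation (7): probability that an encountered node has a free slot.

    n     : number of nodes excluding the target subscriber (N - 1)
    c_tot : overall memory slots in the network (N * C_n)
    r     : replicas per topic (R), t : number of topics (T)
    """
    free_slots = c_tot - r * (t - 1)  # slots not taken by the other topics
    # Each free slot misses a given node with probability n / (n + 1).
    return 1.0 - (n / (n + 1)) ** free_slots


if __name__ == "__main__":
    for r in (1, 8, 9, 10):
        print(r, free_slot_probability(n=99, c_tot=1000, r=r, t=100))
```

In this setting the probability drops steeply as R approaches Ctot/T, which is the mechanism that lengthens the spray phase near full memory utilization.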

Recursively solving equations (5) using the values (6), we obtain the following expression for the mean delivery delay D = δ1,0 as a function of the parameter R:

$$D = \sum_{k=1}^{R-1} \frac{\prod_{w=1}^{k} \frac{f_R (n-w+1)}{1 + f_R (n-w)}}{\lambda f_R k} + \frac{n \prod_{w=1}^{R-1} \frac{f_R (n-w)}{1 + f_R (n-w)}}{\lambda R} \tag{8}$$

which can be rewritten in terms of the memory utilization ρ (and numerically minimized with respect to this single variable) by simply substituting:

$$R = \rho \cdot \frac{C_{tot}}{T} = \rho \cdot \frac{N \cdot C_n}{T}$$
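A possible way to carry out the numerical minimization is an exhaustive scan over integer values of R, sketched below in Python. The function names are hypothetical, and restricting R to integers (so that ρ only takes the values RT/Ctot) is an assumption of this sketch, not a statement about how the authors performed the minimization.

```python
import math


def free_slot_probability(n, c_tot, r, t):
    """Equation (7)."""
    return 1.0 - (n / (n + 1)) ** (c_tot - r * (t - 1))


def mean_delay_closed_form(r, n, lam, f_r):
    """Equation (8) for an integer number of replicas r >= 1."""
    total = 0.0
    for k in range(1, r):
        prod = math.prod(
            f_r * (n - w + 1) / (1 + f_r * (n - w)) for w in range(1, k + 1)
        )
        total += prod / (lam * f_r * k)
    tail = math.prod(f_r * (n - w) / (1 + f_r * (n - w)) for w in range(1, r))
    return total + n * tail / (lam * r)


def best_replica_count(n_nodes, slots_per_node, topics, lam):
    """Scan R = 1 .. C_tot // T and return (R, D, rho) with minimum delay."""
    n = n_nodes - 1
    c_tot = n_nodes * slots_per_node
    best = None
    for r in range(1, c_tot // topics + 1):
        f_r = free_slot_probability(n, c_tot, r, topics)
        delay = mean_delay_closed_form(r, n, lam, f_r)
        if best is None or delay < best[1]:
            best = (r, delay, r * topics / c_tot)
    return best


if __name__ == "__main__":
    print(best_replica_count(n_nodes=100, slots_per_node=10, topics=100, lam=1 / 14))
```

For R = 1 both products are empty and the expression reduces to n/λ, the delay of a single copy that simply waits for the target subscriber.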

B. Sharing Factor Optimization

In this section we derive the optimal sharing factors φi,

under the assumption that the number of memory slots used

in the network is fixed to the value Cu = ρCtot, where ρ

is the memory utilization value which minimizes expression (8). Although (8) provides a delay expression which accounts for both the spray and the wait phases, to achieve a closed-form solution we approximate the delay, similarly to [3], by neglecting the spray phase⁵. In this case, as shown in [3] and as a corollary of the previous analysis (i.e. considering D ≈ δR,0 = ∆R) the mean delivery delay for topic i reduces to the very simple expression:

$$D_i = \frac{n}{\lambda R_i} = \frac{n}{\lambda C_u \phi_i} \tag{9}$$

⁵ Note that a more precise analysis, though necessarily addressed via numerical means, may be developed by leveraging the results derived in the previous section instead of using the approximation (9). In fact, it is easy to recognize that equation (8) is readily extended to the general case of different per topic delays Di versus the corresponding φi, as it depends only upon the φi value for the considered topic. Such dependence is indeed confined to the exponent in equation (7), where we note that Ctot − R(T − 1), in the general case of different popularity, can be rewritten as Ctot − Cu + Cuφi.

Consequently, to compute the sharing factors φi we need to solve the following constrained optimization:

$$\min_{\phi} \sum_{i=1}^{T} \frac{n S_i}{\lambda C_u \phi_i \sum_{j} S_j} \tag{10}$$

$$\text{s.t.} \quad \sum_{i} \phi_i = 1 \tag{11}$$

A closed form solution can be accomplished by means of

a Lagrange approach. As a result, we obtain the following

simple result:

$$\phi_i = \frac{\sqrt{S_i}}{\sqrt{S_1} + \ldots + \sqrt{S_T}} \tag{12}$$

This result states that, for a relatively large number of nodes

compared to the number of deployed replicas, i.e. when the

spray time is small with respect to the wait time and hence

approximation (9) is tight, the number of deployed replicas per

topic is proportional to the square root of the topic popularity,

namely the number of topic subscribers.
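The square-root rule can be checked numerically. Setting the derivative of the Lagrangian of (10)-(11) with respect to each φi to zero makes Si/φi² equal across topics, hence φi is proportional to √Si. The Python sketch below compares this allocation with a uniform and a popularity-proportional split; the function names and the example popularities are hypothetical.

```python
import math


def sqrt_sharing_factors(subscribers):
    """Equation (12): phi_i proportional to sqrt(S_i)."""
    roots = [math.sqrt(s) for s in subscribers]
    total = sum(roots)
    return [r / total for r in roots]


def weighted_wait_delay(subscribers, phi, n, lam, c_u):
    """Objective (10): popularity-weighted delay under approximation (9)."""
    total_subs = sum(subscribers)
    return sum(
        s / total_subs * n / (lam * c_u * p) for s, p in zip(subscribers, phi)
    )


if __name__ == "__main__":
    subs = [16, 4]  # illustrative popularities, not taken from the paper
    args = dict(n=99, lam=1 / 14, c_u=20)
    for name, phi in [
        ("square root", sqrt_sharing_factors(subs)),
        ("uniform", [0.5, 0.5]),
        ("proportional", [0.8, 0.2]),
    ]:
        print(name, weighted_wait_delay(subs, phi, **args))
```

In this example the uniform and the proportional split give the same objective value, and the square-root allocation lies between them and beats both.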

IV. PERFORMANCE EVALUATION

To validate the effectiveness of the proposed optimization

and the modeling results, we developed an ad-hoc event-driven

simulator. The simulator has been written from scratch and its

code is available at [4]. As reference scenario, we consider

N = 100 nodes that move in an area of 500 × 500 m². The
radio coverage range is set to 50 m and the simulation duration is
500000 seconds. Every node has enough memory to store up to

10 different data-samples. We used movement traces in which

the mean inter-meeting time between a specific pair of nodes

is 1400s (i.e. N/λ = 1400, the inter-meeting time between

any pair of nodes hence being 1/λ = 14s).

In the simulator data-samples are generated as follows. For

each topic, a first data-sample is generated by a randomly

chosen publisher node at the start of the simulation (initial

random offsets being used to avoid synchronized start). Then,

every topic proceeds in parallel by independently generating

subsequent data-samples. Each new data-sample is generated

by a topic only when all the subscribers have received the

previous data-sample. Specifically, after a 300s supplementary

time gap at the end of the previous data-sample delivery phase,

a new random publisher is selected for the considered topic.

We carried out three kinds of performance evaluation,
aimed at providing insight into the following aspects:

• effectiveness of the overall optimization (memory utiliza-

tion + sharing factor) with respect to a not-optimized

system;

• performance trade-offs related to the memory utilization;

• performance trade-offs related to the sharing factor.

A. Optimization Effectiveness

We consider nodes moving according to the Random Way-Point (RWP) model [5] with constant speed of 1 m/s and a constant pause time of 5 s to be sure to have a stationary process [6]. We considered 5 scenarios with different number of topics, i.e. T = 30, 50, 70, 90, 100 topics. The topics have different popularities; we group the topics in 10 different groups of equal size and set the popularity of each group according to a Zipf distribution [7], with parameter α = 0.8. The topics of the same group have the same popularity. The number of subscribers per topic is reported in figure 2 as a function of the topic group (for instance, a topic of the first group has 27 subscribers).

[Fig. 2. Number of subscribers per topic. Axes: topic group (1 to 10) versus number of subscribers; series: Zipf distribution, Uniform distribution.]

[Fig. 3. Memory utilization configuration versus the number of topics and sharing factors in case of 30 topics. Upper plot: memory utilization ρ versus number of topics; lower plot: sharing factor φ versus topic group.]

The subscriptions are uniformly assigned to nodes, so that

nodes without any subscriptions are only used for store-carry-

forward DTN operations.

The upper plot of figure 3 shows the values of the

memory utilization ρ resulting from the optimization versus

the different number of topics. We observe that, varying the

number of topics, the optimal memory utilization remains in

the interval 0.8-0.9; this tells us that it is not convenient to use

all the available memory. Setting ρ in this interval can also

be considered as a good rule of thumb for these intermediate

cases where the number of topics is neither too high nor too

low.


The lower plot of figure 3 shows the sharing factors {φi} of the different topics in the case of 30 topics. Those topics are divided into 10 groups (3 topics per group) and the popularities of the groups are the ones presented in fig. 2. We observe that the sharing factors grow with the topic popularity, in proportion to the square root of the number of subscriptions, as stated in eq. 12.

In figure 4 we report the comparison of the mean delivery delay for an optimized and a non-optimized system. In the non-optimized system, all the available memory is used (ρ = 1) and the memory space is equally distributed among topics (i.e. φi = 1/T).

We observe that the performance improvement provided by the sub-optimal configuration increases with the number of topics. However, even if it is not reported in the plot, this improvement does not keep growing with the number of topics. As an example, if we consider the extreme case of 1000 topics, each topic has the possibility of deploying only 1 replica, thus the optimization returns the same configuration as the non-optimized system (i.e. 1 replica per topic) and the delay performance is the same.

In general, we argue that the presented optimization is not effective in cases of a very small or very large number of topics (with respect to the number of nodes and their capacity), but in the case of a “moderate” number of topics it provides a valuable performance improvement. In fact:
• in the case of very few topics, both the optimized and non-optimized systems deploy a large number of replicas for each topic. This results in good delay performance of the non-optimized system, so the improvement brought by the optimization is very limited (see for example the case of 20 topics in figure 6).
• in the case of many topics, there are few replication options, e.g., we could only choose between 1 or 2 replicas for a given topic. This means that the optimization process decides among a very limited set of possibilities and, as a matter of fact, the performance improvement provided by the optimization is limited. For instance, in the worst case of T = 1000, Cn = 10, N = 100 the only viable solution to diffuse all the data-samples is to use only one replica per topic (i.e., no replication). This configuration is the same for both the optimized and the non-optimized systems.
• in the case of a moderate number of topics, when the previous conditions do not occur, the presented optimization provides valuable performance improvements.

B. Memory Utilization Trade-offs

In this section we analyze the performance trade-offs related

to the memory utilization.

We consider scenarios where the topics have the same popularity, so that φi = 1/T. We feed the simulator with two different mobility traces that we call “neg-exp” and “RWP”. The former trace is derived considering exponentially distributed inter-meeting times and we use this trace to validate the analytical results; the latter trace follows the random waypoint mobility model previously described.

[Fig. 4. Mean delivery delay with and without optimal configuration varying the number of the topics. Axes: number of topics versus mean delivery delay (s); series: Without optimizations, With optimizations.]

[Fig. 5. Mean delivery delay between eq 8 and simulations in the case of 100 topics and varying memory utilization ρ. Axes: memory utilization versus mean delivery delay (s); series: math, 1-10 rep; sim, 1-10 rep.]

Figure 5 shows a comparison between formula 8 and simu-

lation results in the case of 100 topics and neg-exp movements,

varying the memory utilization. The analytical model fits
the simulation curve well. As discussed in section

II-C, the graph confirms the presence of an optimum point

after which the benefits of introducing more replicas in the

system are overwhelmed by a longer spray phase.
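The kind of model-versus-simulation check described here can be illustrated with a Monte Carlo run of the absorbing Markov chain itself, compared against the backward recursion of system (5). This is only a sketch of the chain under its own assumptions (fixed fR, exponential sojourn times); it is not the authors' event-driven simulator, and all function names and parameter values are hypothetical.

```python
import random


def mean_delay_recursive(r, n, lam, f_r):
    """Backward solution of system (5) with the sojourn times (6)."""
    delta = n / (r * lam)
    for i in range(r - 1, 0, -1):
        out = 1 + f_r * (n - i)
        delta = n / (i * lam * out) + f_r * (n - i) / out * delta
    return delta


def simulate_mean_delay(r, n, lam, f_r, runs=20000, seed=0):
    """Average absorption time of the chain over `runs` sampled paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        i, elapsed = 1, 0.0
        while True:
            spread = i * lam * f_r * (n - i) / n if i < r else 0.0
            absorb = i * lam / n
            elapsed += rng.expovariate(spread + absorb)  # sojourn in state i
            if rng.random() * (spread + absorb) < absorb:
                break  # the target subscriber met a replica holder
            i += 1     # one more node now carries a replica
        total += elapsed
    return total / runs


if __name__ == "__main__":
    model = mean_delay_recursive(5, 99, 1 / 14, 0.8)
    sim = simulate_mean_delay(5, 99, 1 / 14, 0.8)
    print(model, sim)
```

Agreement between the two numbers only confirms that the recursion matches the chain; it says nothing about how well the chain approximates binary spray on real mobility traces.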

Then we change the mobility trace and consider the RWP

movement trace. This allows us to test the usefulness of the

optimization on a mobility scenario that is not consistent with

the mobility assumptions of the model. Figure 6 shows the

mean delivery delay varying the memory utilization and the

number of topics. The optimum point varies with the number

of topics and the performance worsening due to memory

saturation increases with the number of topics, as we can

see for instance by comparing the difference between optimal

memory utilization and ρ = 1 for the cases of 100 topics and 20 topics.

[Fig. 6. Mean delivery delay varying the number of topics and memory utilization ρ. Axes: memory utilization versus mean delivery delay (s); series: 100 topics, 1-10 rep; 70 topics, 1-14 rep; 50 topics, 1-20 rep; 20 topics, 1-50 rep; arrows mark the analytical optimal utilization.]

The reason lies in the different number of replicas per topic in these two cases, which varies in the ranges 1-10 and 1-50 respectively. When we add more replicas to the system, the spray time increases as well. During the spray process, the first replicas are diffused more rapidly, while the last ones are diffused more slowly. In fact, data-samples can be replicated only on nodes with available memory that are not already holding a replica of that data-sample. However, when the spray time becomes significantly high, several replicas are already deployed in the case of 20 topics, while just a few replicas are spread in the case of 100 topics. For this reason the performance gap between the optimum point and the full memory occupation is smaller in the case of 20 topics and bigger in the case of 100 topics. The graph also shows by arrows the values of memory utilization (i.e. values of the x-axis) returned by the utilization optimization (sec. III-A). As we can see, the optimization succeeds in finding the optimum point in all the presented cases. Figure 7 shows the duration of the spray phase in the same conditions as the previous graph. Here, after a memory utilization of 0.8 - 0.9, the spray time curve rises sharply.

C. Sharing Factor Trade-offs

In this section we analyze the performance trade-offs related

to the sharing factor.

We consider a scenario where we have 100 topics equally divided into two groups, named “popular” and “unpopular”: topics belonging to the popular group have 50 subscribers, while topics belonging to the unpopular group have 10 subscribers. Nodes move according to a RWP mobility model.

Figure 8 represents the delivery delay of popular and unpop-

ular topics and the overall average delivery delay (D) varying

the sharing factor of the popular topics while keeping ρ = 1.

Each point x of the x-axis represents the sharing factor for

one popular topic (1/1000 ≤ x ≤ 9/1000); in that point each

unpopular topic has a sharing factor equal to (10 − x)/1000.

Fig. 7. Mean spray duration time varying the number of topics and the memory utilization (x-axis: memory utilization; y-axis: mean spray delay (s); curves: 100 topics, 1-10 rep; 70 topics, 1-14 rep; 50 topics, 1-20 rep; 20 topics, 1-50 rep)

Fig. 8. Mean delivery delay for the case of 100 topics equally divided in two groups, named popular and unpopular (x-axis: sharing factor of popular topic; y-axis: mean delivery delay (s); curves: popular topic, unpopular topic, overall mean delay)

We observe that the minimum of the average delivery delay is achieved with a sharing factor of 7/1000 for popular topics and 3/1000 for unpopular ones. This result confirms the one given by formula 12, which places the optimal value at 0.00690983.
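As a numerical cross-check, the reported optimum can be reproduced with a few lines of Python. Formula 12 is not restated in this section, so the sketch below relies on an assumption: that the optimal sharing factor is proportional to the square root of the number of subscribers of a topic. It also assumes that the sharing factors of one popular and one unpopular topic sum to 10/1000, as the pair 7/1000 and 3/1000 suggests. The function name and `pair_budget` are hypothetical.

```python
import math


def sqrt_proportional_share(subscribers_a, subscribers_b, pair_budget):
    """Split pair_budget between two topics in proportion to sqrt(subscribers).

    Assumed rule for illustration only; it is not taken from formula 12 itself.
    """
    weight_a = math.sqrt(subscribers_a)
    weight_b = math.sqrt(subscribers_b)
    share_a = pair_budget * weight_a / (weight_a + weight_b)
    return share_a, pair_budget - share_a


# popular topics have 50 subscribers, unpopular topics have 10
popular, unpopular = sqrt_proportional_share(50, 10, pair_budget=10 / 1000)
print(f"{popular:.8f} {unpopular:.8f}")  # prints 0.00690983 0.00309017
```

Under this assumed rule the popular share matches the value 0.00690983 quoted above, and rounding to the grid of Fig. 8 gives the simulated optimum of 7/1000 and 3/1000.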

V. RELATED WORK

Several DTN routing strategies have been proposed so far. If we classify these strategies according to the number of replicas introduced in the network, we have single-copy routing schemes, where only a single copy of a given message is routed and forwarded, and multiple-copy routing schemes, where multiple copies of the same message are routed and forwarded independently.

The single-copy routing scheme proposed in [8] belongs to the first category. In this work, the authors propose a routing scheme called “seek and focus”, where a single copy of the message is passed between nodes according to a utility function that depends on the last time a node encountered the destination. In the seek phase, if the utility around the node holding the data is “low”, the node hands over the data to a neighbor node with probability p, until a node with a high utility is found. When this node is found, the focus phase starts and nodes use utility-based forwarding to deliver the message to the final destination.

With regard to the multiple-copy routing schemes, one possible approach is to distribute a message to all or almost all the nodes in the network. This is the case of [9], where the authors propose the so-called epidemic routing. This approach consists in distributing messages to nodes called “carriers” that lie within a connected portion of an ad-hoc network. During their movements, carriers come into contact with other connected portions of the network, thus “infecting” additional islands of connected nodes.

The probabilistic routing approach proposed in [10], called PROPHET (Probabilistic Routing Protocol using History of Encounters and Transitivity), aims to overcome the two main limitations of the epidemic routing approach: the assumption of infinite buffer space and bandwidth in the nodes, and the inability to predict user movements.

To overcome such limitations, PROPHET exploits a forwarding strategy based on a probabilistic metric, the “delivery predictability”: nodes that are often encountered have a high delivery predictability. Delivery predictability also has a transitive property, according to which if node a frequently encounters node b, and node b frequently encounters node c, then node c probably is a good node to forward messages destined for node a.
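The delivery predictability updates can be sketched as follows. The three rules (increase on encounter, aging, transitivity) follow the form described in [10]; the constants `P_INIT`, `BETA` and `GAMMA` below are illustrative values chosen for this sketch, not parameters used anywhere in this paper, and the function names are hypothetical.

```python
P_INIT = 0.75  # illustrative initialization constant
BETA = 0.25    # illustrative weight of the transitive contribution
GAMMA = 0.98   # illustrative aging constant per time unit


def on_encounter(p_old):
    """Raise the predictability for a node that has just been encountered."""
    return p_old + (1.0 - p_old) * P_INIT


def age(p_old, elapsed_units):
    """Decay the predictability of a node not seen for elapsed_units time units."""
    return p_old * GAMMA ** elapsed_units


def transitive(p_ac_old, p_ab, p_bc):
    """Update P(a, c) using the a-b and b-c predictabilities."""
    return p_ac_old + (1.0 - p_ac_old) * p_ab * p_bc * BETA


p_ab = on_encounter(on_encounter(0.0))  # a has met b twice
p_bc = on_encounter(0.0)                # b has met c once
p_ac = transitive(0.0, p_ab, p_bc)      # a has never met c directly
print(p_ab, p_bc, p_ac, age(p_ab, 10))
```

In this sketch a node that has never met c still obtains a non-zero predictability for c through b, which is the transitive effect described above.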

There also exists a multiple-copy version of the single-copy routing scheme proposed in [8]. This scheme, presented by the same authors in [3], is called “spray and wait” and is the one we used in this paper. In the same work, the authors propose a second scheme, denoted as “spray and focus”, which also consists of two phases: i) the spray phase, which is the same as the spray phase of the “spray and wait” scheme; ii) the focus phase, during which each of the L message copies is independently routed according to the single-copy scheme proposed in [8]. In addition, they provide some insight on how to choose the number of replicas to spray in the network in order to achieve a required expected delay (expressed as a multiple of the optimal delay).
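To make the spray phase concrete, the sketch below shows the binary variant of spray and wait described in [3], in which a carrier holding more than one copy hands half of its copies to a node that has none, and a carrier left with a single copy waits for the destination. This section does not state which spray variant is used in our simulations, so the code is only an illustration; the function names are hypothetical, and `spray_rounds` idealizes the process by letting every carrier find a new node in each round.

```python
def binary_spray_handover(copies_held):
    """Return (kept, given) when a carrier meets a node holding no copy."""
    if copies_held <= 1:
        return copies_held, 0        # wait phase: only direct delivery is allowed
    given = copies_held // 2         # hand over the floor of half the copies
    return copies_held - given, given


def spray_rounds(total_copies):
    """Count idealized rounds until every carrier holds exactly one copy."""
    holders = [total_copies]
    rounds = 0
    while any(c > 1 for c in holders):
        next_holders = []
        for c in holders:
            kept, given = binary_spray_handover(c)
            next_holders.append(kept)
            if given:
                next_holders.append(given)  # a new carrier joins with its share
        holders = next_holders
        rounds += 1
    return rounds, holders


rounds, holders = spray_rounds(10)
print(rounds, holders)
```

With L = 10 copies the idealized process needs 4 rounds, that is, the ceiling of log2(10); in a real DTN each round additionally waits for an encounter with a node that can accept a copy.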

As regards DTN routing used with the publish-subscribe communication paradigm, we cite SocialCast [11]. This work relies on the notion of utility U of a node N with respect to an interest I, which represents how good N is as a carrier for messages matching the interest I. The basic assumption is that hosts which have the same interests spend time co-located, so SocialCast routing aims at exploiting as message carriers those hosts which have often been co-located with interested hosts. Kalman filter forecasting techniques are used to predict the future evolution of the movement, based on previous observations of some attributes characterizing social behavior (e.g., connectivity changes, co-location).

VI. CONCLUSIONS

We tackled the problem of how to optimize the mean delivery delay of a Publish Subscribe system built on top of a Delay Tolerant Network (DTN). We considered scenarios where i) the DTN nodes make available only a limited memory space for routing purposes, and ii) topics have different popularity, i.e. different numbers of subscribers. We focused on the spray-and-wait routing scheme, and therefore the only configuration parameter that could be optimized is the number of replicas per topic.

Given that a holistic approach turns out to be too complex, we split the optimization problem into two parts and we provide answers to these basic questions: i) what is the optimal amount of available memory to use for replication purposes (memory utilization), and ii) what is the right way to share this amount of memory among topics (sharing factor). In doing so, although we derived a sub-optimal solution, we obtained handy and effective formulas. Indeed, with the exclusion of extreme cases of too few or too many topics, the optimization provides a reduction of the mean delivery delay of up to 60% with respect to a non-optimized system.

Finally, we argue that our optimization results can be used not only with spray-and-wait routing but also with other multiple-copy routing schemes that take the number of replicas as an input parameter. Obviously, in these cases the number of replicas could be just one of the many parameters to optimize.

REFERENCES

[1] P. T. Eugster, P. A. Felber, R. Guerraoui, A. Kermarrec, The Many Faces of Publish/Subscribe, in ACM Computing Surveys (CSUR), Volume 35, Issue 2, 2003.

[2] A. Detti, D. Battaglino, L. Bracciale, F. Lo Piccolo, A. Bragagnini, M.S. Turolla, N. Blefari Melazzi, A topic-based, publish-subscribe architecture for intermittently connected 802.15.4 networks, to be presented in Mobile Summit 2011 and available for peer review at http://stud.netgroup.uniroma2.it/ lorenzo/mobilesummit.pdf

[3] T. Spyropoulos, K. Psounis, C. S. Raghavendra, Efficient routing in intermittently connected mobile networks: the multiple-copy case, IEEE/ACM Transactions on Networking (TON), v.16 n.1, p.77-90, February 2008.

[4] http://stud.netgroup.uniroma2.it/ lorenzo/campus++/simulator.tar.bz2

[5] T. Camp, J. Boleng, V. Davies, A Survey of Mobility Models for Ad Hoc Network Research, Wiley, Wireless Communications & Mobile Computing, vol. 2, 2002.

[6] J. Yoon, M. Liu, B. Noble, Random waypoint considered harmful, in Proceedings of INFOCOM 2003, San Francisco, April 2003.

[7] Spyros Voulgaris, Etienne Rivière, Anne-Marie Kermarrec, and Maarten van Steen, Sub-2-sub: Self-organizing content-based publish and subscribe for dynamic and large scale collaborative networks, Research Report RR5772, INRIA, Rennes, France, December 2005.

[8] T. Spyropoulos, K. Psounis, and C. S. Raghavendra, Single-copy routing in intermittently connected mobile networks, in Proc. IEEE Conf. Sensor and Ad Hoc Communications and Networks (SECON), 2004, pp. 235-244.

[9] A. Vahdat and D. Becker, Epidemic routing for partially connected ad hoc networks, Duke Univ., Durham, NC, Tech. Rep. CS-2000-06, Apr. 2000.

[10] A. Lindgren, A. Doria, and O. Schelen, Probabilistic routing in intermittently connected networks, SIGMOBILE Mobile Comput. Commun. Rev., vol. 7, no. 3, 2003.

[11] P. Costa, C. Mascolo, M. Musolesi, G. P. Picco, Socially-Aware Routing for Publish-Subscribe in Delay-Tolerant Mobile Ad Hoc Networks, in IEEE Journal of Selected Areas in Communication (JSAC), vol. 26, no. 5, pp. 748-760, June 2008.
