Distributed Dataset Synchronization in Disruptive
Networks
Tianxiang Li
tianxiang@cs.ucla.edu
UCLA
Zhaoning Kong
jonnykong@cs.ucla.edu
UCLA
Spyridon Mastorakis
smastorakis@unomaha.edu
University of Nebraska, Omaha
Lixia Zhang
lixia@cs.ucla.edu
UCLA
Abstract—Disruptive network scenarios with ad hoc, inter-
mittent connectivity and mobility create unique challenges to
supporting distributed applications. In this paper, we propose
Distributed Dataset Synchronization over disruptive Networks
(DDSN), a protocol which provides resilient multi-party com-
munication in adverse communication environments. DDSN is
designed to work on top of the Named-Data Networking protocol
and utilizes semantically named, and secured, packets to achieve
distributed dataset synchronization through an asynchronous
communication model. A unique design feature of DDSN is
letting individual entities exchange their dataset states directly,
instead of using some compressed form of the states. We have
implemented a DDSN prototype and evaluated its performance
through simulation experimentation under various packet loss
rates. Our results show that, compared to an epidemic routing
based data dissemination solution, DDSN achieves 33-56% lower
data retrieval delays and 40-44% lower overheads, with up to
20% packet losses. When compared to the existing NDN dataset
synchronization protocols, DDSN can lower the state and data
synchronization delays by one-third to two-thirds, and lower
the protocol overhead by up to one-third, with the performance
difference becoming more pronounced as network loss rates go
up.
Index Terms—Disruptive Networks, Named-Data Networking,
Distributed Dataset Synchronization
I. INTRODUCTION
Dataset synchronization is a common need for distributed
applications that share information among multiple partici-
pants, such as group text messaging or collaborative editing.
With today’s TCP/IP networking, such distributed applications
use centralized cloud servers and achieve dataset synchro-
nization at the application layer. This approach matches well
to the traditional client-server application paradigm, but has
two major drawbacks. First, it requires the provisioning of
a centralized server to enable a distributed application, with
the consequential requirement of mitigating single points of failure. Second, it depends on network infrastructure support to
connect all the involved parties to the centralized server, even
when those communicating parties are located next to each
other but far away from the centralized servers.
These drawbacks may be a minor nuisance in a well-provisioned network, but they become roadblocks to supporting distributed applications in adverse environments where infrastructure support is poor (e.g., in developing countries) or simply non-existent (e.g., during disaster recovery). In such situations, it is conceivable to achieve dataset synchronization through peer-to-peer communication via local connectivity; however, one must first address two major challenges. The first one
is what namespace to use to communicate, as IP addresses
identify topological locations and become meaningless in
infrastructure-free scenarios. The second one is how to ensure
communication security: the certificate authority servers may
be unavailable, and the TLS or DTLS setup processes may
become infeasible in a fast changing, disruptive network.
In this paper, we leverage a newly designed Internet ar-
chitecture, Named Data Networking (NDN) [1], to propose a
data-centric solution for distributed dataset synchronization¹,
dubbed Distributed Dataset Synchronization over disruptive
Networks (DDSN). NDN directly uses application level names
to fetch data at network layer, and secures each data packet
directly by a cryptographic signature which binds its name
and content together. Thus NDN provides answers to the two
challenges mentioned above, and DDSN is designed to bridge
the gap between an NDN network’s datagram delivery of
named, secured data packets and the applications’ need for
dataset synchronization among multiple parties.
The contributions of this work are twofold. First, we designed a new dataset synchronization protocol, DDSN,
specifically for disruptive network environments. Taking a
departure from the previous NDN sync protocol designs,
DDSN lets individual entities communicate using dataset state
directly instead of its compressed form. Doing so enables syn-
chronization packets to be processed without any assumption
on the receiver’s state, bringing resiliency to communication
when the state at individual nodes diverges under adverse
conditions. Second, we implemented a prototype of DDSN
and conducted simulation evaluations under various network
loss rates. Our results show that, compared to an epidemic
routing [2] solution for data dissemination, DDSN achieves
33-56% lower data retrieval delays and 40-44% lower over-
heads under 0%-20% packet loss rates. When compared to
previous NDN sync protocols, DDSN achieves 27-77% lower
synchronization delay, 26-76% lower data retrieval delay, and
3-36% lower overhead, with the lower bound representing the
difference under zero loss and the upper bound the difference
under 20% packet loss rate.
The rest of the paper is organized as follows. Section II
provides a brief background on NDN and previous work on
distributed dataset synchronization in NDN, and Section III
summarizes related work. Section IV presents an overview of
¹In the rest of this paper, we use the term “sync” to refer to both the distributed dataset synchronization protocol and the overall synchronization process.
© IEEE, 2019. This is the author’s version of this work. It is posted here by permission of IEEE for your personal use. Not for redistribution. The definitive version will be published in the proceedings of the 16th IEEE International Conference on Mobile Ad-Hoc and Smart Systems (IEEE MASS), 2019.
the DDSN design, and Section V describes the DDSN building
blocks in detail. Section VI reports our evaluation results.
Section VII discusses extensions of the DDSN design, and
Section VIII concludes the paper and discusses future work.
II. BACKGROUND
In this section, we first provide a brief overview of the
NDN architecture, and then define the problem of distributed
dataset synchronization in the NDN context.
A. Named Data Networking
The basic idea of Named Data Networking (NDN) [1]
is to communicate via named packets. In NDN, each piece
of data is assigned a semantically meaningful name. This
name identifies the data itself and is independent of the data
container, its location, or the underlying network connectivity.
To communicate, data consumers express requests, called
Interests, which carry the names of desired data generated by
data producers. Each producer secures all its data at the data
production time, by using a cryptographic signature to bind
the data name to its content.
Each NDN node N utilizes four modules in communication: (i) the Forwarding Information Base (FIB) contains a list of name prefixes together with one or multiple outgoing interfaces for each prefix; (ii) the Pending Interest Table (PIT) keeps track of all Interests that N has forwarded but for which it has not yet received the corresponding data packets; (iii) the Content Store (CS) opportunistically caches passing-by data packets; and (iv) the forwarding Strategy makes Interest forwarding decisions based on multiple inputs including the FIB.
Equipped with the above four modules, each NDN node N forwards Interests based on their names. After N forwards an Interest, it adds an entry to its PIT. Once an Interest reaches a node that has the requested data packet P_data, P_data goes back to the original requester by following the reverse path of the corresponding Interest, based on the state kept at each forwarder’s PIT. Along the way, each forwarder may cache P_data in its Content Store, to be used to satisfy future requests for the same data.
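To make the roles of these modules concrete, here is a minimal C++ sketch of how they might be represented; the types, field names, and strategy logic are our own simplification, not NFD internals:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified representations of the four NDN node modules;
// a real forwarder (e.g., NFD) uses far richer structures.
struct NdnNode {
  // FIB: name prefix -> outgoing interface IDs
  std::map<std::string, std::vector<int>> fib;
  // PIT: pending Interest name -> interfaces the Interest arrived from
  std::map<std::string, std::set<int>> pit;
  // CS: data name -> cached packet payload
  std::map<std::string, std::string> contentStore;

  // Forwarding sketch: satisfy from the CS if possible, otherwise record
  // the Interest in the PIT and let the Strategy forward it via the FIB.
  void onInterest(const std::string& name, int inFace) {
    if (auto it = contentStore.find(name); it != contentStore.end()) {
      sendData(name, it->second, inFace);  // cached copy satisfies the Interest
      return;
    }
    pit[name].insert(inFace);              // remember where to return the data
    forwardViaFib(name);                   // strategy decision based on the FIB
  }

  void sendData(const std::string&, const std::string&, int) { /* link layer */ }
  void forwardViaFib(const std::string&) { /* pick interface(s) from fib */ }
};
```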
B. Distributed Dataset Synchronization in NDN
Sync can be viewed as the transport layer abstraction in
a data-centric network architecture. Sync bridges the gap
between the functionality required by distributed applica-
tions, namely synchronizing the dataset of distributed entities,
and the datagram delivery provided by the underlying NDN
network. Since each NDN data name uniquely identifies a
piece of immutable data, Sync achieves the shared dataset
synchronization among distributed parties by synchronizing
the data namespace of all the involved parties [3]. For example,
in a group messaging application (textchat), each user keeps a local view of the shared dataset (i.e., all the messages generated by the other users in the same chat group). We refer to this local view as the user’s shared dataset state, and to all the users in the chat group as members of a sync group².
²We use the terms nodes and members interchangeably.
Fig. 1: Main sync components (the application publishes new data into the local dataset; the sync node maintains the shared dataset state, detects state changes, and performs state synchronization and data synchronization with the other members in the sync group, fetching data as needed)
Each member running the group text messaging application
is identified by a name. We call this name “member prefix”.
Members use a multicast “group prefix” to communicate with
all other members in a sync group.
Figure 1 shows the main components of sync. Whenever
a piece of new data is published, Sync is responsible for
updating the shared dataset state. Upon the reception of newly
generated data from other members, Sync updates the dataset
state and notifies applications about the new data. The goal of
a sync protocol is to maintain a consistent dataset state among
members in a sync group (i.e., achieve state synchronization),
so that each member has up-to-date knowledge about all the
data that has been generated. Then each member fetches the
data (i.e., achieve data synchronization) based on application
requirements and/or network constraints [4]. In the rest of this
subsection, we discuss three fundamental design aspects of a
sync protocol.
Dataset state representation: The dataset for each application instance is generally named under the prefix “/application/instance”, which is abbreviated to “/[group-prefix]” in this paper. Within an application instance, one approach to efficiently naming all the data generated by each member is to name each data packet (e.g., a chat message) sequentially, i.e., by appending a sequence number to its producer’s name, yielding the data name “/[group-prefix]/[member-prefix]/[data-seq-number]”. Given that sequence numbers monotonically increase, the shared dataset
state among members can be represented by a collection of
such names, with one name for each member, and the sequence
number indicating the latest generated data packet by that
member. Sync protocols have adopted different mechanisms
to encode the dataset state for transmission over the network.
We refer to that as state encoding, and we present specific
examples in Sections III and V-A.
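As a concrete illustration of this naming convention, here is a minimal C++ helper; the example names mirror Figure 4 below, and the function itself is our own:

```cpp
#include <cstdint>
#include <string>

// Build the NDN-style name of the seq-th data packet produced by a member,
// following the "/[group-prefix]/[member-prefix]/[data-seq-number]" convention.
std::string makeDataName(const std::string& groupPrefix,
                         const std::string& memberPrefix,
                         std::uint64_t seq) {
  return groupPrefix + memberPrefix + "/" + std::to_string(seq);
}

// Example: makeDataName("/chatroom", "/A", 112) yields "/chatroom/A/112",
// so the pair (A, 112) in a state vector denotes A's latest data packet.
```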
Dataset state change detection: To detect changes in the
shared dataset, members share their locally maintained state
information with other members via a sync protocol. When a
member produces a piece of new data, it increases its data se-
quence number by one and informs others. A member detects
changes in the dataset namespace by comparing the locally
maintained state information with received state information
from others. If a change is detected (i.e., someone in the
sync group has generated new data), sync notifies the local
application.
State and data synchronization: By running a sync protocol,
members learn the latest data generated by others in the sync
group (state synchronization). After state synchronization,
whether to fetch the newly learned data (data synchronization)
is determined by the application. In other words, the state and
data synchronization processes are decoupled: the sync pro-
tocol running at each member learns the latest data generated
by other members; the application decides whether, or when,
to retrieve the data.
III. RELATED WORK
A number of sync protocols, running over NDN, have been
developed over the last few years [3], [5], [6], [7], [8], [9],
[10]. Two of them have been widely used: ChronoSync [6]
and PSync [7]. Despite their differences, a common design
assumption is the operation in an infrastructure-based envi-
ronment, where stable connectivity is the norm and nodes are
synchronized most of the time.
ChronoSync adopts a cryptographic digest data structure for
the dataset state encoding. However, this representation is not
semantically meaningful, thus it cannot be used to interpret
the difference in the dataset state. As nodes form disconnected
network clusters, they accumulate different state locally over
time. This leads the state information of members to severely
diverge; each member may encounter different members over
time, thus be aware of different data produced by different
members. To recover from such cases of state divergence,
ChronoSync needs to fall back to a recovery process in order
to resolve differences in dataset state. This process requires at
least three, in practice several, rounds of message exchanges.
iSync [5] and PSync use an invertible Bloom Filter
(IBF) [11] to encode dataset state, relying on the IBF sub-
traction operation to infer the state difference. However, the
IBF subtraction allows state comparison between only two
members at a time. This becomes inefficient as the number
of members with diverged state in the network increases.
IBF by design also has certain limitations on the amount of
information it can decode under a single operation due to
the probability of false positive errors [11]. As a result, in
disruptive environments with severe state divergence, it may
take several rounds of exchanges to decode state differences
among encountered members.
VectorSync [8] introduced the idea of using a version
vector [12] for state encoding. Instead of using digest based
state encoding, VectorSync enumerates the most recent data
sequence numbers of all group members. To allow each
member to correctly interpret the vector (which member owns
which sequence number), VectorSync adopted a leader-based
mechanism to maintain a consistent view of the current sync
group members (membership list). DSSN [10] was designed
to synchronize the datasets of sensor groups that are wirelessly
connected. Due to energy conservation, not all sensors are
online all the time, making membership agreement infeasible.
Thus DSSN extended the VectorSync design by explicitly
listing the member name prefix in the version vector. However,
DSSN is designed for stationary sensor groups within a single-
hop communication range, and does not work in ad hoc mobile
scenarios with intermittent multi-hop connectivity.
Fig. 2: High-level example of the DDSN operation between members E and C: (1) E sends a sync Interest named /<group-prefix>/<E’s-state-vector>; (2) C returns a sync Reply with the same name, whose content carries C’s updated state vector; (3) E sends Interests for missing data; (4) C returns the missing data
DDSN inherited a few basic ideas from the previous work,
including: (i) naming data sequentially [6], (ii) using a vector
based approach for state encoding [8], [10], and (iii) using
periodic transmissions for state change detection [6], [7].
However, different from all the previous NDN sync protocols,
DDSN is specifically designed to operate under ad hoc mobile
environments with intermittent connectivity and potentially
severe packet losses.
IV. DESIGN OVERVIEW
The goal of DDSN is to meet the needs of distributed
applications for data sharing in disruptive network scenarios,
such as disaster recovery (Section IV-A). The fundamental
design challenges that DDSN aims to address include:
• How to efficiently recover from state divergence, a situation that is the norm in intermittently connected networks.
• How to asynchronously exchange state information under intermittent ad hoc connectivity in a timely manner, while being resilient to lossy network channels.
• How to prioritize transmissions of up-to-date state information and minimize redundant or outdated state information transmissions under limited network capacity and short-lived connectivity.
To tackle the first challenge, DDSN uses a vector based
structure for state encoding (Section V-A). We refer to this
structure as state vector, which enables lightweight state
divergence recovery between multiple members.
To address the second challenge, DDSN utilizes two dif-
ferent messages, a sync Interest and a sync Reply (messages
1 and 2 in Figure 2), to synchronize the dataset state among
members. By attaching a state vector to the name of sync
Interests, receivers can directly interpret the state information
within a single round of message exchanges (Section V-B).
This minimizes the transmission overhead and makes DDSN resilient to different degrees of network loss. Once members synchro-
nize their state, they may send Interests to retrieve missing data
from each other based on application needs (messages 3 and
4 in Figure 2).
To tackle the third challenge, DDSN’s semantically mean-
ingful naming abstractions can be interpreted by any receiving
member. Members overhearing exchanged Sync Interests and
Sync Replies can interpret the state information directly, and
utilize this information to prioritize their transmissions. As
a result, members with up-to-date state information receive
priority, while members with outdated state information sup-
press their otherwise redundant transmissions (Section V-B).
This ensures that up-to-date state information propagates in
the network without any delay.
DDSN also takes into account the case of node isolation.
DDSN switches between regular and inactive modes based
on whether a member overhears transmissions from others. In
inactive mode, members suppress transmissions and switch to
a probing phase for neighbor discovery (Section V-C).
At the same time, DDSN members perform hop-by-hop
propagation of sync Interests over multiple wireless hops,
and probabilistic forwarding of Interests for missing data
(Section V-D). This enables members to synchronize their state
with other members multiple hops away from each other, as
well as retrieve data available more than one hop away.
To support partial data sync, we extend DDSN’s baseline
state vector design to contain application-defined preference
information. This allows members to build local knowledge
about the data that their neighbors are interested in. As a result,
members prioritize the retrieval of data that themselves and
most of their neighbors are missing (Section V-E).
Throughout the sync process, DDSN leverages the built-in
NDN security mechanisms [13] to secure data directly at the
network layer. All the data packets carry the cryptographic sig-
nature of the data producer, and each data can be authenticated
regardless of where it is stored. Members also authenticate the
sync Interests containing the state vectors of others based on
pre-established common trust anchors [13] (Section V-F).
A. Example Scenario
We use an example scenario throughout this paper to help
illustrate our design. In disaster recovery scenarios, such as
earthquakes, the communication infrastructure may be severely
damaged. As a result, communication needs to be performed
via local network connectivity. In the aftermath of an earth-
quake (Figure 3), first responders (e.g., nodes A, B, C, D and
E) move around on a field to perform rescue operations. We
assume that the first responders run an information sharing
application and an instance of the DDSN protocol on their
devices, and they are members of the same sync group. In this
way, they can collect/generate information about survivors or
the status of the disaster, which they would like to synchronize
and share with all the other responders in the group.
Fig. 3: An example of a disaster recovery scenario (first responders moving between Area 1 and Area 2)
For example, let us assume that E has information about a
survivor that was found in area 1. As E moves around, it
encounters C and exchanges this new survivor information
with C. When C moves into area 2, it further disseminates
this information to B and D. In this way, new information
is shared asynchronously among members of the rescue team
under intermittent connectivity.
Fig. 4: State vector example (the shared dataset under /chatroom, whose latest data packets are /chatroom/A/112, /chatroom/B/130, /chatroom/C/121, and /chatroom/D/119, yields the dataset state A: 112, B: 130, C: 121, D: 119, encoded as the state vector [A: 112, B: 130, C: 121, D: 119])
V. DESIGN BUILDING BLOCKS
A. Dataset State Representation
As mentioned in Section II-B, a sync protocol synchronizes
the namespace of a shared dataset. State divergence among
members raises challenges when it comes to interpreting the
differences of a large state mismatch. This is common in
networks with intermittent connectivity, since members may
not be connected to each other when they generate new data.
To address this challenge, DDSN adopts a vector based state
encoding, called state vector [10]. Each member maintains
locally a state vector that represents the latest data generated
by every other member in the sync group that it is aware
of. The vector contains a set of key-value pairs; the i-th pair (p_i : seq_i) indicates that the latest data packet produced by the member with prefix p_i has a sequence number of seq_i. A
state vector example for a “/chatroom” sync group is shown
in Figure 4, where the sequence number of the latest data
generated by members A, B, C, and D is 112, 130, 121, and
119 respectively.
When a member receives a state vector from another mem-
ber, it first compares the received vector to its local vector
by computing the difference between the sequence numbers
under the same member prefixes. For example, in Figure 3
(after C moves to area 2), let us assume that C’s state vector
is [B: 1, C : 2, D : 2, E : 3] and that C receives D’s state
vector, which is [B: 2, D : 4, E : 1]. By comparing its local
vector to D’s vector, C identifies that B has generated a new
data packet with sequence number 2 and D has generated
two new packets with sequence numbers 3 and 4 respectively.
A member merges the received vector with its local vector
by choosing the pairwise maximum between the sequence
numbers under the same member prefix in the vectors. It also
adds new member prefixes found in the received vector to its
local vector. In the previous example, assuming that D also
receives C’s vector, D’s merged vector will be [B: 2, C :
2, D : 4, E : 3]. A state vector is raw state information
expressed in a condensed way, which is meaningful on its
own. Consequently, we make no assumptions about the receiver’s state or the network topology, allowing members to
recover from any degree of state divergence.
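A minimal C++ sketch of this comparison-and-merge step follows; the map-based representation is our own simplification:

```cpp
#include <cstdint>
#include <map>
#include <string>

using StateVector = std::map<std::string, std::uint64_t>;  // member prefix -> latest seq

// Merge a received state vector into the local one by taking the pairwise
// maximum under each member prefix, and adopting any new member prefixes.
void mergeStateVector(StateVector& local, const StateVector& received) {
  for (const auto& [prefix, seq] : received) {
    auto it = local.find(prefix);
    if (it == local.end())
      local[prefix] = seq;        // new member prefix found in the received vector
    else if (seq > it->second)
      it->second = seq;           // received vector carries newer state
  }
}

// Example from the text: merging D's vector [B:2, D:4, E:1] into C's
// vector [B:1, C:2, D:2, E:3] yields [B:2, C:2, D:4, E:3].
```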
B. State Synchronization
The dataset state synchronization process consists of the
exchange of a sync Interest and a sync Reply. Figure 2 shows
an example of a basic sync Interest-Reply exchange between
members E and C of Figure 3 (while C is still in area 1).
A sync Interest name contains a group prefix, which allows
members in a sync group to multicast³ their sync Interest(s) to
others in the same group. The second part of the sync Interest
name is the member’s state vector, which is used to express the
dataset state of the sender. Members receiving a sync Interest
first merge the state vector in the Interest’s name with their
local state vector, and then send back a sync Reply containing
their updated (merged) state vector.
As members move around, they encounter other members
within their communication range. Encountered members syn-
chronize their dataset state until they converge to a consistent
state. For example, in Figure 3, node C encounters B and D.
To sync their dataset state, one of the nodes, for example C,
initiates the sync process by sending a sync Interest with its
local state vector to let B and D know about its latest state.
1) Detecting State Mismatch: To resiliently detect state
mismatch in a dynamic network environment, DDSN adopts
both an event-driven and a periodic approach for the transmis-
sion of sync Interests. Based on the former mechanism, when
a node generates a new data packet (or after it has accumulated
a number of new data packets, depending on application
requirements), it sends a sync Interest to members within
its communication range. This Interest serves as a proactive
notification of state change. Based on the latter mechanism,
nodes periodically send a sync Interest to detect state mismatch
with others within their communication range. This is needed,
since nodes move in and out of each other’s communication
range, thus, we cannot assume stable state among them.
Specifically, nodes send a sync Interest periodically based on a countdown timer (t_sync_int) that counts down from a maximum value t_max to 0. The generation of new data by a node triggers the transmission of a sync Interest without delay and refreshes t_sync_int to postpone the periodic transmission of state information.
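A rough C++ sketch of this combined event-driven and periodic trigger follows; the class and helper names are illustrative, not DDSN implementation internals:

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical scheduler for sync Interest transmissions: periodic by
// default, immediate (with timer refresh) when new local data is produced.
class SyncInterestScheduler {
public:
  explicit SyncInterestScheduler(std::chrono::seconds tMax)
      : tMax_(tMax), nextSend_(Clock::now() + tMax) {}

  // Called when the local application publishes a new data packet.
  void onNewLocalData() {
    sendSyncInterest();                  // proactive state-change notification
    nextSend_ = Clock::now() + tMax_;    // refresh t_sync_int
  }

  // Called from the node's main loop to drive periodic transmissions.
  void tick() {
    if (Clock::now() >= nextSend_) {
      sendSyncInterest();                // periodic state-mismatch probe
      nextSend_ = Clock::now() + tMax_;
    }
  }

private:
  void sendSyncInterest() { /* attach local state vector to the Interest name */ }
  std::chrono::seconds tMax_;
  Clock::time_point nextSend_;
};
```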
2) Sync State Propagation: To achieve timely synchroniza-
tion of state among members, DDSN has two fundamental
design goals for state propagation: (i) propagate new state
information in the network with minimum delay; (ii) avoid
propagating redundant or outdated state information.
DDSN’s semantically meaningful message names allow
nodes to interpret the received state, and prioritize their
subsequent transmissions to achieve these design goals. The
processing logic of a received sync Interest is illustrated in
Figure 5. The receiver first assesses whether the state vector
in the Interest contains the same, more recent, or older state
information than its local vector. This is achieved by directly
comparing the sequence number under each member prefix in
the received vector to the sequence numbers in its local vector.
If the received vector contains more recent state (i.e., the
sequence number under any member prefix is higher than the
receiver’s local vector or if the received vector contains new
member prefixes), the receiver merges the received vector with
³A member of a sync group broadcasts a frame containing a sync Interest
at the MAC layer. This is a multicast transmission at the network layer, since
this Interest will be handled by the NDN forwarder of a receiving node based
on the Interest’s group prefix. The NDN forwarder passes the sync Interest to
upper layers, including the sync transport layer, only if the Interest’s group
prefix matches the sync group prefix of the node.
Fig. 5: Processing logic of a received sync Interest
its local vector (Section V-A) and sends a new sync Interest
containing its updated state vector without delay. This ensures
that new state is propagated as fast as possible in the network.
If the received vector contains older state than the local vector,
the receiver suppresses itself and does not send a new sync
Interest. In this way, nodes avoid propagating outdated state
information. If the received state is the same as the local state,
the receiver again suppresses itself, since this is treated as an
indication that some other node in the vicinity has previously
transmitted the same state information.
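The decision procedure of Figure 5 can be sketched in C++ as follows, reusing the StateVector type and merge helper from the earlier sketch; the transmission hook is an assumed placeholder:

```cpp
#include <cstdint>
#include <map>
#include <string>

using StateVector = std::map<std::string, std::uint64_t>;  // as in Section V-A

void mergeStateVector(StateVector&, const StateVector&);   // from the merge sketch
void sendSyncInterestNow(const StateVector&);              // assumed transmission hook

// Processing logic of a received sync Interest (Figure 5, simplified).
void onSyncInterest(StateVector& local, const StateVector& received) {
  bool receivedHasNew = false;
  for (const auto& [prefix, seq] : received) {
    auto it = local.find(prefix);
    if (it == local.end() || seq > it->second)
      receivedHasNew = true;             // new member prefix or newer sequence
  }
  if (receivedHasNew) {
    mergeStateVector(local, received);   // adopt the newer state
    sendSyncInterestNow(local);          // propagate updated state without delay
  }
  // Otherwise the received state is older or identical: suppress the sync
  // Interest (a sync Reply may still be scheduled, see Section V-B3).
}
```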
3) Acknowledging State Transmission: A sync Reply acts as an acknowledgment for the reception of a sync Interest and ensures that a bidirectional exchange of state information between members takes place. By comparing the received state vector with its local state vector, the receiver can deduce which data the sync Interest sender is missing, and optionally include that information in the sync Reply. Multiple members may be within communication range of each other and thus receive the same sync Interest. DDSN adopts a prioritization mechanism to avoid redundant replies, making sure that only the member with the most accumulated up-to-date state, compared to the state vector in the received Interest, will reply.
That is, if a receiver of a sync Interest has more recent state than the state in the Interest, it schedules the transmission of a sync Reply based on how much accumulated new state it has under each member prefix compared to the received state vector. This is achieved by selecting an appropriate value for a sync Reply transmission delay timer t_sync_rep. Specifically, if the sync Interest sender’s state vector contains the value pair set {(p_i, seq_i)}, where 0 < i < m, and the receiver’s state vector contains the value pair set {(p_j, seq_j)}, where 0 < j < n, then:

$$ t_{\text{sync\_rep}} = \frac{t_{\text{delay}}}{\sum_{\{i,j \,\mid\, p_i = p_j,\ seq_j > seq_i\}} (seq_j - seq_i) + 1} $$

Note that t_delay is a fixed delay value, and m and n are the numbers of member prefixes in the vectors of the sync Interest’s sender and receiver, respectively.
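A small C++ sketch of this timer computation follows, reusing the StateVector alias from the earlier sketches; t_delay is passed in and duration handling is simplified:

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

using StateVector = std::map<std::string, std::uint64_t>;  // as in Section V-A

// Compute t_sync_rep: the more accumulated new state the receiver holds
// relative to the sync Interest sender, the sooner its sync Reply fires.
std::chrono::milliseconds
syncReplyDelay(const StateVector& sender,             // vector from the sync Interest
               const StateVector& receiver,           // receiver's local vector
               std::chrono::milliseconds tDelay) {    // fixed base delay t_delay
  std::int64_t surplus = 0;
  for (const auto& [prefix, seqReceiver] : receiver) {
    auto it = sender.find(prefix);
    if (it != sender.end() && seqReceiver > it->second)
      surplus += static_cast<std::int64_t>(seqReceiver - it->second);
  }
  return tDelay / (surplus + 1);                      // larger surplus, shorter delay
}
```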
Fig. 6: DDSN sync process example ((a) state sync process among B, C, and D via sync Interest and sync Reply exchanges; (b) fetching missing data via Interests for data and the returned Data packets)
On the other hand, if a receiver of a sync Interest has older state than the state vector in the Interest, it only updates its local state by merging the vector in the Interest with its local vector.
Example: In Figure 6a, C’s sync Interest is received by B
and D. Assuming that B has more accumulative new state
than D compared to C’s state vector, B will receive priority
to send a sync Reply. This is achieved by having B schedule
its t_sync_rep timer to expire before D’s t_sync_rep timer. B will
send a sync Reply first, which includes its own state vector.
D will overhear this reply and cancel the transmission of its
own sync Reply. Then, D sends a sync Interest to let B and C
know about its own state vector. B and C receive D’s Interest,
and C replies first and suppresses B from replying⁴. After B,
C, and D have received each other’s vectors, they all converge
to a consistent dataset state. As mentioned in Section II-B, we
refer to this process of synchronizing the dataset state among
members as state sync, which allows each member to have a
consistent view of the shared dataset.
Once nodes exchange state vectors and converge to a
consistent state, each node can identify the data it is missing
based on the difference between its local dataset (i.e., the
data it has already fetched) and its local state vector. Nodes
start fetching missing data from others by creating a queue
of missing data names and sending Interests to fetch the
corresponding packets. For example, in Figure 6b, we illustrate
the process of node B sending an Interest for missing data,
which is received by C and D. C has the requested data
and responds to this Interest. As mentioned in Section II-B,
we refer to this process of fetching missing data based on
established state and application requirements as data sync,
which is decoupled from the state sync process. State sync has higher priority than data sync in terms of transmission scheduling, since keeping state information up to date facilitates the fetching of missing data.
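For illustration, here is a C++ sketch of deriving the queue of missing data names from the local state vector and the set of already-fetched names; the data structures are our own simplification and sequence numbers are assumed to start at 1:

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <set>
#include <string>

// Enumerate names the node knows about (from its state vector) but has not
// yet fetched, producing the queue of Interests for missing data.
std::deque<std::string>
missingDataNames(const std::map<std::string, std::uint64_t>& stateVector,
                 const std::set<std::string>& localDataset,  // names already fetched
                 const std::string& groupPrefix) {
  std::deque<std::string> queue;
  for (const auto& [memberPrefix, latestSeq] : stateVector) {
    for (std::uint64_t seq = 1; seq <= latestSeq; ++seq) {
      std::string name = groupPrefix + memberPrefix + "/" + std::to_string(seq);
      if (localDataset.count(name) == 0)
        queue.push_back(name);           // schedule an Interest for this packet
    }
  }
  return queue;
}
```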
C. Inactive Mode
In disruptive networks with mobility and intermittent con-
nectivity, nodes commonly endure periods of isolation. During such periods, they are not connected to any other nodes, and thus requests for data become meaningless. It is also important during this period to detect
encountered members to learn whether they have any newer
⁴In this example, both B and C receive equal priority to respond to D’s
sync Interest, since they both have the same state. To avoid collisions in such
cases, B and C use a timer to randomize their sync Reply transmissions.
state [14]. DDSN offers a mechanism to detect when isolation
happens and when nodes get connected again.
When a node does not receive any packet transmissions for a certain period of time (t_activity), it enters the inactive mode. We refer to t_activity as the activity timer. In inactive mode, a node suppresses transmissions, since no receivers are around, and switches to probing through periodic sync Interests that contain its latest state vector⁵. If a node receives/overhears
a packet transmission while in inactive mode, it interprets
that as being connected again with others. Therefore, it exits
inactive mode, resets its activity timer, and continues its
regular operation.
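A compact C++ sketch of this mode switch follows; the probing hook is a placeholder and timer values are application-chosen:

```cpp
#include <chrono>

// Hypothetical inactive-mode tracker: enter inactive mode after t_activity
// without overheard traffic; any received packet restores regular mode.
class ActivityMonitor {
  using Clock = std::chrono::steady_clock;

public:
  explicit ActivityMonitor(std::chrono::seconds tActivity)
      : tActivity_(tActivity), lastHeard_(Clock::now()) {}

  void onPacketOverheard() {             // any reception implies connectivity
    lastHeard_ = Clock::now();
    inactive_ = false;                   // exit inactive mode, resume regular operation
  }

  void tick() {
    if (!inactive_ && Clock::now() - lastHeard_ > tActivity_) {
      inactive_ = true;                  // suppress regular transmissions
      startPeriodicProbing();            // sync Interests with the latest state vector
    }
  }

private:
  void startPeriodicProbing() { /* schedule probing sync Interests */ }
  std::chrono::seconds tActivity_;
  Clock::time_point lastHeard_;
  bool inactive_ = false;
};
```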
D. Communication over Multiple Wireless Hops
DDSN makes use of the event-triggered transmission of sync Interests to propagate new state hop by hop over multiple wireless hops. Each node that receives a state
vector with up-to-date state sends a sync Interest containing
its updated state vector. For the data synchronization process,
requests for data are transmitted over multiple hops based on
a certain forwarding probability at each hop. In the future, we
plan to investigate mechanisms to build soft-state knowledge
about the available data over multiple hops around nodes.
This will allow us to dynamically adjust the probability of
forwarding data requests based on how likely it is that the
requested data is available around nodes.
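As a sketch of the per-hop forwarding decision for data Interests, consider the following; the forwarding probability p is a free parameter, and DDSN does not fix a value here:

```cpp
#include <random>

// Decide at each hop whether to re-forward an Interest for missing data,
// with a configurable forwarding probability p in [0, 1].
bool shouldForwardDataInterest(double p) {
  static std::mt19937 rng{std::random_device{}()};
  std::bernoulli_distribution coin(p);
  return coin(rng);   // true: rebroadcast the Interest toward the next hop
}
```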
E. Partial Data Synchronization
For distributed applications it is common that users may
be interested only in data generated under specific member
prefixes (e.g., users may subscribe to specific news resources).
In such a pub-sub model, users can be notified when a member
they are interested in has generated new data, and fetch the
data accordingly. We call this process partial data synchro-
nization. DDSN offers a set of general-purpose transport layer
abstractions to cover the needs of any distributed multi-party
application under intermittent connectivity by supporting both
full data synchronization and partial data synchronization. In
adverse environments, each member carries and propagates
the state of the data generated by all other members in a sync
group. In this way, DDSN allows as many members as possible
to learn in a timely manner the latest state of the data they
would like to fetch. Each member can then fetch the data based
on its subscription.
The key challenge in this scenario is that members may hold
only a subset of the full dataset (data they have subscribed to).
However, members propagate information about the state of
the entire dataset (state of all other members). This mismatch
between a member’s local dataset and the propagated state
information triggers members to send requests for data that
may not be available around them.
To address this issue, we can add an additional field in the
state vector called preference flag, under each producer prefix.
⁵Alternatively, probing can be done through lightweight probing Interests
under a predefined probing namespace. When a node receives a probing
Interest, it sends a sync Interest to initiate the state sync process.
When a node is subscribed to a particular producer prefix, it
will set the preference flag to 1 for that prefix or 0 otherwise.
For example, the i-th pair of values in the state vector would take the form (p_i, seq_i, flag_i), which represents: the latest data sequence number produced under producer p_i’s prefix is seq_i, and flag_i indicates whether this consumer is subscribed to producer p_i’s data. The preference flag reflects what data is
held by each member, and is used to prioritize the requests for
data. Each node can thus build up a local table regarding which
prefixes its neighbors are subscribed to, by aggregating the
preference flags it received from its neighbors’ Sync Interests.
To ensure the timeliness of this information, each flag in the table has a life timer that expires after a certain period (the flag value is then set to 0). A node refreshes the life timer of a preference flag if it receives a state vector with the same preference flag set to 1. Nodes prioritize the transmission of Interests for data under producer prefixes subscribed to both by themselves and by at least one of their neighbors. This increases the chance of fetching available data, and the fetched data is more likely to benefit other neighbors as well.
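A C++ sketch of the extended vector entry and the soft-state table of neighbor subscriptions follows; the field names and lifetime handling are illustrative:

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Extended state vector entry for partial data sync: (p_i, seq_i, flag_i).
struct VectorEntry {
  std::uint64_t seq;     // latest sequence number under this producer prefix
  bool preference;       // true if the sender subscribes to this producer's data
};

// Soft-state table of neighbor subscriptions, refreshed by overheard
// sync Interests and expired by a per-flag life timer.
class NeighborPreferenceTable {
  using Clock = std::chrono::steady_clock;
  std::map<std::string, Clock::time_point> expiry_;  // producer prefix -> deadline
  std::chrono::seconds lifetime_;

public:
  explicit NeighborPreferenceTable(std::chrono::seconds lifetime)
      : lifetime_(lifetime) {}

  void onVectorEntry(const std::string& producerPrefix, const VectorEntry& e) {
    if (e.preference)                                 // refresh the life timer
      expiry_[producerPrefix] = Clock::now() + lifetime_;
  }

  // True if some neighbor recently advertised interest in this prefix.
  bool neighborSubscribed(const std::string& producerPrefix) const {
    auto it = expiry_.find(producerPrefix);
    return it != expiry_.end() && Clock::now() < it->second;
  }
};
```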
F. Sync Interest Authentication
Received sync Interests change the receiver’s state. This can
cause the receiver to fetch data perceived as missing based on
the received state. Sync Interests could be abused by malicious
nodes for bogus state injection in a sync group, which would
cause nodes to fetch bogus or non-existent data. To deal with
bogus state vectors, nodes sign sync Interests and attach the
signature to the Interests. To sign Interests, each node either
has a pre-configured shared symmetric key per sync group or a
pair of dynamically generated public and private keys. Nodes
authenticate received sync Interests to verify the validity of the
carried state. They also decide whether they trust the node that
signed each Interest based on pre-established trust anchors.
VI. EXPERIMENTAL EVALUATION
In this section, we present our experimental evaluation. We
first describe the experimental setup (Section VI-A), and then
present our experimental results under adverse network condi-
tions (Section VI-B). We first compare the DDSN performance
with a data dissemination solution based on epidemic rout-
ing [2] (Section VI-B1), then compare the DDSN performance
with two existing NDN-based sync protocols (Section VI-B2).
A. Experimental Setup
We have implemented a DDSN prototype in C++ that makes use of the ndn-cxx library [15] to ensure compatibility with
NFD [16]. To evaluate DDSN under a variety of settings
and conditions, we ported our prototype into the ndnSIM
network simulator [17] (our simulation code can be found
at https://github.com/JonnyKong/DDSN-Simulation). ndnSIM
offers integration with the NDN prototype implementations
(ndn-cxx library and NDN forwarder). We use a grid topology
of 800m×800m with 30 mobile nodes (20 sync nodes that
belong to the same sync group and 10 other nodes which
only forward packets) that move based on the Random Walk
Mobility Model. Each node randomly chooses its direction and
speed. The speed ranges from 1m/s to 20m/s, representing both
humans and vehicles, and the direction ranges from 0 to 2π.
Each node moves along the same path for 20s before changing
its direction and speed. Nodes communicate through NDN
over IEEE 802.11b 2.4GHz (transmission rate of 11Mbps)
with a WiFi range of 60m (unless otherwise noted). Each node
generates data following a Poisson distribution with λ = 40s
on average and its data generation process lasts for 800s.
The payload of each data packet consists of random text of
size 100-1024 bytes. Note that we perform 10 trials for all
the experiments mentioned below and we present the 90th
percentile of the collected results (with the exception of CDF
plots that show the full distribution of the results).
We consider the following evaluation metrics: (i) state sync
delay: the time needed for an updated state vector to reach all
the members in the group, (ii) data sync/retrieval delay: the
time needed for newly generated data to reach all members in
the group, and (iii) overhead: the traffic volume (in terms of
bytes or Interests) generated for all the members to retrieve
all the generated data.
Comparison to an epidemic dissemination solution: We
implemented a data dissemination solution based on epidemic
routing in an IP-based network setup, which we compare
to DDSN. Epidemic routing relies on periodic transmissions
of beacon messages to detect when nodes are connected to
others. The reception of a beacon triggers the encountered
nodes to exchange their summary vectors. A summary vector
contains a list of data packets that its sender has. After nodes
exchange their summary vectors, they send to each other data
that they have but the other is missing. Each node also has a
buffer to store and carry data it overhears to other nodes. For
epidemic routing, we use UDP multicast running on top of
a broadcast IP address and we assume that every node has
adequate resources to store and carry all the generated data
packets. For a fair comparison with DDSN, we set the period
of beacons to 8s, i.e., the same as the period of the Sync
Interest transmissions.
Comparison to existing sync protocols in NDN: We ported
the prototype ChronoSync and PSync implementations into
ndnSIM, which we compare to DDSN. To perform a fair
comparison, we let each protocol transmit a sync Interest every 8s and let each member in the sync group fetch the data generated
by all the other members.
Retransmission Strategy: We adopted a data request retrans-
mission strategy for lossy network environments. It is based
on the characteristics of sync protocol operations. For sync,
initially a state update is sourced from the data producer, then
it is propagated to nearby nodes which in turn fetch the new
data. In other words, the state and data propagate together
when there is adequate connection time among nodes, thus
the sender of new state is likely to carry the corresponding
data. To facilitate the exchange of new data under intermittent
connectivity, we place newly generated data requests in a
transmission queue with a short transmission interval (0.5s
in our simulations). If such a request does not bring data
Fig. 7: State and data sync delay, comparison between DDSN & existing NDN sync protocols ((a)-(c) state sync delay under 0%, 5%, and 20% loss; (d)-(f) data sync delay under 0%, 5%, and 20% loss)
Fig. 8: Overhead, DDSN & existing NDN sync protocols ((a) 0% loss rate; (b) 20% loss rate)
back after a certain number of retransmissions (10 in our
simulations), it is placed in a queue with a longer transmission
interval (5s), so that nodes avoid retransmitting repeatedly
to the same neighbors. Compared to a baseline approach
of periodic request retransmissions every 5s, our strategy
reduces the number of transmitted data requests by 35%
and the data sync delay by 45% under a 50% loss rate. This
retransmission strategy is implemented for each sync protocol
in our simulation setup.
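A simplified C++ sketch of the two-queue retransmission strategy described above follows, using the 0.5s/5s intervals and the 10-retry limit from our simulations; the queue mechanics are our own simplification:

```cpp
#include <chrono>
#include <deque>
#include <string>

// Two-queue retransmission strategy: fresh requests retry quickly; requests
// that repeatedly fail move to a slow queue to avoid hammering the same
// neighbors. Intervals and retry limit follow the simulation settings.
struct PendingRequest {
  std::string name;
  int attempts = 0;
};

class RetransmissionScheduler {
  static constexpr auto kFastInterval = std::chrono::milliseconds(500);
  static constexpr auto kSlowInterval = std::chrono::seconds(5);
  static constexpr int kMaxFastRetries = 10;

  std::deque<PendingRequest> fastQueue_;  // newly generated data requests
  std::deque<PendingRequest> slowQueue_;  // requests that repeatedly failed

public:
  void addRequest(const std::string& name) { fastQueue_.push_back({name}); }

  // Called once per fast interval: retransmit, demoting exhausted requests.
  void onFastTimer() {
    for (auto& req : fastQueue_) { sendInterest(req.name); ++req.attempts; }
    while (!fastQueue_.empty() && fastQueue_.front().attempts >= kMaxFastRetries) {
      slowQueue_.push_back(fastQueue_.front());
      fastQueue_.pop_front();
    }
  }

  // Called once per slow interval.
  void onSlowTimer() {
    for (auto& req : slowQueue_) sendInterest(req.name);
  }

  void onDataReceived(const std::string& name) { removeFromQueues(name); }

private:
  void sendInterest(const std::string&) { /* transmit the data request */ }
  void removeFromQueues(const std::string&) { /* erase matching entries */ }
};
```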
B. Experimental Results
Fig. 9: Data retrieval delay, DDSN & epidemic dissemination
1) Comparison with Epidemic Data Dissemination: In this
subsection, we compare the performance of DDSN to a
data dissemination design based on epidemic routing, which
opportunistically disseminates data to nodes in the network
through pair-wise exchange of messages between nodes [2].
Data retrieval delay: In Figure 9, we present the CDF of the
data retrieval delay for DDSN and epidemic data dissemina-
tion. The results demonstrate that DDSN nodes retrieve 90%
of the generated data in less than 140s for loss rates of 0%
and 20% respectively. On the other hand, in epidemic dissem-
ination, nodes need around 220s and 400s to retrieve 90% of
the generated data for loss rates of 0% and 20% respectively.
That is, DDSN achieves 35-56% lower delays than epidemic
dissemination. This is due to the fact that DDSN follows an
event-based approach that propagates without any delay up-
to-date state information multiple hops in the network, so
that nodes can retrieve missing data as quickly as possible.
Epidemic routing relies on the periodic transmission of beacon
messages to trigger message exchange between a pair of
nodes; only when a node receives a beacon do the
encountered nodes exchange their summary vectors and send
data to each other.
Overhead: DDSN generates 1.49×10⁷ and 1.44×10⁷ bytes of overhead for loss rates of 0% and 20% respectively, while epidemic dissemination generates 2.48×10⁷ and 2.58×10⁷ bytes of overhead for the same loss rates. These
results show that DDSN achieves 40-44% lower overheads.
The main reason is that DDSN’s uniform sequential naming
convention of each data (member prefix + sequence number)
allows the state of the entire dataset to be encoded in the state
vector by enumerating all the member prefixes and the newest
sequence number under each prefix. The state vector size
remains the same as the number of data generated continues
to increase. Epidemic routing’s summary vector enumerates
all the data generated, and does not scale as the amount of generated data increases. The state vector size is 150-200 bytes during the experiments; the summary vector, however, grows up to 1600 bytes. Especially when the data generation rate is high, the summary vector must drop some older data information to stay within its size limit. In disruptive environments, this can result in some data never being received by certain nodes.
2) Comparison with Existing NDN Sync Protocols: In this
subsection, we compare DDSN with ChronoSync and PSync
under different loss rates.
State and data sync delay: Figure 7a, 7b, and 7c show
the CDF of the state sync delays under packet loss rate 0%,
5%, and 20% respectively. Overall, DDSN achieves lower
state sync delays compared to ChronoSync and PSync due
to its prioritized propagation of up-to-date state information
in the network. The results also demonstrate that DDSN is
more resilient than ChronoSync and PSync to different rates
of packet loss. For low packet loss rates (0-5%), DDSN
achieves 27-38% lower state sync delays (90th percentile) than
ChronoSync and PSync. For high loss rates (20%), DDSN
achieves 67-77% lower state sync delays (90th percentile)
than ChronoSync and PSync. When the loss rate is high (50%), it becomes difficult for ChronoSync and PSync to propagate the state to more than 50% of the nodes, while DDSN’s performance remains stable. This is due to a number of design
factors. First, DDSN utilizes the successful receipt of a sync
Interest carrying the state vector to ensure state propagation.
The receiver can directly interpret the state information and
resolve state inconsistency based on the received state vector.
On the other hand, ChronoSync and PSync require multiple
rounds of message exchanges for state divergence recovery.
As the loss rate increases, the increase in state sync delay
becomes significant for ChronoSync and PSync due to lower
success rates of completing the recovery process. Secondly,
each DDSN member prioritizes the propagation of new state
information to its neighbors, which disseminates state update
more effectively under intermittent connectivity.
The data sync delays shown in Figures 7d, 7e, and 7f demonstrate that DDSN is again more resilient than ChronoSync
and PSync to packet loss. DDSN is able to distribute data to
more nodes than the other two protocols. This is due to the
fact that data sync follows tightly after state sync. Receiving
new state information enables nodes to send Interests for the
new missing data. For low packet loss rates (0-5%), DDSN
achieves 26-36% lower data sync delays (90th percentile) than
ChronoSync and PSync. For high loss rates (20%), DDSN
achieves 68-76% lower data sync delays (80th percentile) than
ChronoSync and PSync. Under 50% loss rate, ChronoSync and
PSync did not distribute more than 80% of the generated data,
due to the large state synchronization delay.
Overhead: In Figure 8a and 8b, we present the overhead
results for DDSN, ChronoSync, and PSync for loss rates of 0%
and 20% respectively. The results show that DDSN achieves 3-
17% and 5-36% lower overheads than ChronoSync and PSync
respectively. ChronoSync uses a digest to compactly encode
the latest state, thus sync Interests are of small sizes. However,
ChronoSync results in a large number of requests for data,
since its design does not offer detection of members within communication range of each other. In PSync, the dominating
overhead factor is the compressed IBF size, which increases
with the size of the encoded state information.
VII. DISCUSSION
A. State and Data Mismatch
As we mentioned in Section II-B, sync decouples the
state and data sync process by allowing nodes to first sync
the shared dataset state, then fetch data if it is needed by
applications. This is an important design choice considering
the requirements of different applications, and works well in
infrastructure networks, where nodes are typically connected
and have all the data that their state indicates. However, in
environments with intermittent connectivity, nodes might not
have all the data their state indicates. As a result, nodes may
propagate state for data they do not have. Nodes receiving this
state will try to fetch potentially unavailable data, resulting in
a large number of transmitted requests. One feasible solution
would be to extend the state vector design, by adding an
additional data sequence number under each member prefix
to reflect the actual dataset of the state vector sender. Thus,
the state vector reflects the newest sequence generated under
each member prefix and the actual dataset of the sender. As
a tradeoff, this approach would introduce additional overhead
for the state vector.
B. Scalability and Maintenance of State Vector
Depending on the number of sync members (dataset size)
encoded in a state vector, the vector size might grow. For
example, if we assume that the size of each member prefix is
8 bytes, each sequence number is 8 bytes, and the IEEE 802.11
WiFi MTU is 2304 bytes, a state vector will be able to encode
up to about 120-130 members. Although the state vector can be
encoded in binary, its size can still exceed the MTU when there are a few hundred nodes. For the state synchronization process,
members can transmit a partial state vector, containing only the
dataset state information of the members which have recently
generated new data. Approaches to enhance the state vector
scalability can also be explored. For example, compression
can be employed to reduce the vector size. Another candidate
approach might be a multi-dimensional vector structure [18],
which offers multiple dimensions of state aggregation. To
address cases of ever growing state vectors, members need to
have a mechanism to clean up their local vector over time.
Members remove a member prefix from their local vector
after a certain amount of time, if they do not receive a vector
containing a greater sequence number for this prefix compared
to the sequence number in their local vector.
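A brief C++ sketch of this cleanup rule follows; the per-prefix bookkeeping is illustrative, as DDSN does not prescribe a specific structure:

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Per-prefix bookkeeping: last time a strictly greater sequence number
// was observed for this member prefix.
struct PrefixState {
  std::uint64_t seq = 0;
  Clock::time_point lastAdvance = Clock::now();
};

// Remove member prefixes whose sequence number has not advanced within
// the expiry window, bounding the state vector size over time.
void cleanupStateVector(std::map<std::string, PrefixState>& vector,
                        std::chrono::seconds expiry) {
  auto now = Clock::now();
  for (auto it = vector.begin(); it != vector.end();) {
    if (now - it->second.lastAdvance > expiry)
      it = vector.erase(it);      // stale prefix: no newer state observed
    else
      ++it;
  }
}
```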
VIII. CONCLUSION & FUTURE WO RK
In this paper, we presented DDSN, a distributed dataset
synchronization protocol in NDN designed to operate under
adverse network conditions. This work is based on long trials
of design refactoring and performance analysis. Through this
process, we were able to get an in-depth understanding of the
challenges of data synchronization in adverse environments, as
well as how to effectively address these challenges. This paper not only summarizes the lessons learned from numerous experimental attempts, but also distills the design merits and limitations of previous distributed dataset synchronization protocols.
Throughout our design and experimentation process, we
were able to discover aspects useful for future protocol de-
velopment. First, we experimented with different mechanisms
to minimize redundant transmissions and we concluded that,
in highly lossy environments, maximizing the utility of each
message becomes vital to ensure protocol robustness. Second,
we confirmed the resilience of using a state vector to synchro-
nize the distributed dataset, as long as it is feasible for a given
dataset size. Third, we reconfirmed that the old, simple soft-state approach (periodic retransmission of the state vector) offers resilience under adverse conditions, first in lossy wired networks and now in disruptive wireless networks as well.
Our next steps include more thorough simulation and real-
world experimental analysis on the protocol performance in
various settings. We also plan to build new pilot applications
over DDSN to use as driver examples for its further evaluation.
Finally, we plan to compare DDSN with further DTN solu-
tions, explore adaptive ways for communication over multiple
wireless hops, as well as investigate alternative approaches to
enhance the scalability of the state vector structure.
REFERENCES
[1] L. Zhang et al., “Named Data Networking,” ACM Computer Communi-
cation Review, July 2014.
[2] A. Vahdat, D. Becker et al., “Epidemic routing for partially connected
ad hoc networks,” 2000.
[3] W. Shang, Y. Yu, L. Wang, A. Afanasyev, and L. Zhang, “A Survey of
Distributed Dataset Synchronization in Named-Data Networking,” NDN
Project, Technical Report NDN-0053, April 2017.
[4] T. Li, W. Shang, A. Afanasyev, L. Wang, and L. Zhang, “A brief intro-
duction to ndn dataset synchronization (ndn sync),” in MILCOM 2018-
2018 IEEE Military Communications Conference (MILCOM). IEEE,
2018, pp. 612–618.
[5] W. Fu, H. B. Abraham, and P. Crowley, “Synchronizing namespaces
with invertible bloom filters,” in 2015 ACM/IEEE Symposium on Archi-
tectures for Networking and Communications Systems (ANCS). IEEE,
2015, pp. 123–134.
[6] Z. Zhu and A. Afanasyev, “Let’s ChronoSync: Decentralized dataset
state synchronization in named data networking,” in Network Protocols
(ICNP), 2013 21st IEEE International Conference on. IEEE, 2013.
[7] M. Zhang et al., “Scalable Name-based Data Synchronization for Named
Data Networking,” in Proceedings of the IEEE Conference on Computer
Communications (INFOCOM), May 2017.
[8] W. Shang, A. Afanasyev, and L. Zhang, “VectorSync: distributed dataset
synchronization over named data networking,” in Proceedings of the 4th
ACM Conference on Information-Centric Networking. ACM, 2017.
[9] P. de-las Heras-Quirós, E. M. Castro, W. Shang, Y. Yu, S. Mastorakis,
A. Afanasyev, and L. Zhang, “The design of roundsync protocol,”
Technical Report NDN-0048, NDN, Tech. Rep., 2017.
[10] X. Xu, H. Zhang, T. Li, and L. Zhang, “Achieving resilient data
availability in wireless sensor networks,” in Communications Workshops
(ICC), 2018 IEEE International Conference on. IEEE, 2018, pp. 7–11.
[11] D. Eppstein, M. T. Goodrich, F. Uyeda, and G. Varghese, “What’s the
difference?: efficient set reconciliation without prior context,” in ACM
SIGCOMM, 2011.
[12] D. S. Parker, G. J. Popek, G. Rudisin, A. Stoughton, B. J. Walker,
E. Walton, J. M. Chow, D. Edwards, S. Kiser, and C. Kline, “Detection
of mutual inconsistency in distributed systems,” IEEE Transactions on
Software Engineering, no. 3, pp. 240–247, 1983.
[13] Z. Zhang, Y. Yu, H. Zhang, E. Newberry, S. Mastorakis, Y. Li,
A. Afanasyev, and L. Zhang, “An overview of security support in named
data networking,” Technical Report NDN-0057, NDN, Tech. Rep., 2018.
[14] T. Li, S. Mastorakis, X. Xu, H. Zhang, and L. Zhang, “Data synchro-
nization in ad hoc mobile networks,” 2018.
[15] NDN Team, “ndn-cxx,” http://named-data.net/doc/ndn-cxx.
[16] A. Afanasyev, J. Shi et al., “NFD Developer’s Guide,” NDN, Tech. Rep.
NDN-0021, 2015.
[17] S. Mastorakis, A. Afanasyev, and L. Zhang, “On the evolution of
ndnSIM: an open-source simulator for NDN experimentation,” ACM
SIGCOMM Computer Communication Review, vol. 47, no. 3, 2017.
[18] D. Meagher, “Geometric modeling using octree encoding,” Computer
Graphics and Image Processing, vol. 19, no. 2, pp. 129–147, 1982.