Building a Low-latency, Proximity-aware DHT-Based P2P Network

Ngoc Ben DANG, Son Tung VU, Hoai Son NGUYEN
Department of Computer network
College of Technology, Vietnam National University, Hanoi
144 Xuan Thuy, Cau Giay, Hanoi, Vietnam
Abstract—DHT-based P2P networks suffer from topology mismatch,
which causes high latency in message routing. This paper
focuses on reducing routing latency in Chord, a typical
DHT-based P2P network protocol. We propose a simple method to
build a Chord network based on the proximity of nodes in the
underlying physical network. The method combines two
techniques: identifier selection in the node joining phase and
neighbor selection in the network stabilization phase, both
performed based on physical network latency. We have evaluated
the proposed solution by simulation. In a simulated network
with 4096 nodes, our method reduces the average routing
latency by about 30% compared with the conventional Chord
routing method.
Keywords-DHT;Chord;low latency;proximity-aware
I. INTRODUCTION
Peer-to-peer network applications are becoming more and
more popular on the Internet today. Compared with the
traditional client-server network model, peer-to-peer networks
have many advantages, such as scalability, self-organization
and fault tolerance.
Recently, Distributed Hash Table (DHT) algorithms such as
Chord [1], CAN [2] and Pastry [3] have attracted a lot of
attention, since they offer a promising solution for scalable
and efficient message routing on P2P networks. These
algorithms construct a virtual key space on a P2P network in
which each node is responsible for a portion of the key space
and keys are used as destinations to route messages. Each node
also maintains information about a small set of neighbor
nodes, which are selected based on the positions of nodes in
the virtual key space. The neighbor sets are used to route
messages to destination nodes efficiently.
However, since a DHT-based P2P network is constructed without
regard to the physical network topology, the overlay network
topology differs from the physical network topology (a problem
called "topology mismatch"). For example, two nodes that are
near each other in the underlying physical network (i.e., the
latency of message routing between them is small) may be far
away (i.e., several hops away) from each other in the
DHT-based network. In this case, the latency of message
routing between these nodes in the DHT-based network is
significantly higher than the latency between them in the
underlying physical network.
This paper focuses on the problem of reducing the latency of
message routing in Chord, a typical DHT-based P2P network
protocol. There are two main approaches to this problem:
proximity neighbor selection (PNS) [5], [7], [9] and proximity
identifier selection (PIS) [4], [8]. In the PNS approach, a
node chooses its neighbor nodes based on their proximity. In
the PIS approach, each node chooses its node identifier based
on its geographic location.
To realize the PIS approach, existing studies [4], [8] map
participating nodes into a proximity-based two-dimensional
space and assign node identifiers in Chord based on the node
positions in that space, such that nodes that are close in
identifier are also close in the two-dimensional space. This
solution has the drawback of requiring a set of landmarks and
a pre-processing phase in which nodes compute their own
coordinates in the proximity-based two-dimensional space.
This paper proposes a simple solution to the topology mismatch
problem for the Chord protocol. The solution consists of two
modifications of the Chord protocol. First, in the node
joining phase, a new node selects a position in the Chord ring
from a set of randomly self-generated positions such that the
latency to its successor node and predecessor node is the
smallest. Second, in the stabilization phase, each node
selects a neighbor node for each finger table entry from a set
of candidate nodes such that the latency to each neighbor node
is small. Our method is easy to implement and has a small
overhead compared with conventional methods. Although it
requires only minor modifications of the Chord protocol, our
method reduces the latency of message routing compared with
the conventional Chord protocol.
To evaluate the effectiveness of our method, we have built a
simulator that executes our algorithm and measures the latency
of message routing between participating nodes in the network.
The simulation results show that our method reduces the
average latency of message routing by approximately 30%
compared with the conventional Chord protocol. Our method also
has little effect on the random mapping of identifiers in the
Chord ring.
The rest of this paper is structured as follows. Section II
explains the Chord protocol and related works with their pros
and cons. Section III describes our proposed method. Section
IV describes our simulator and the simulation results. Section
V concludes the paper.
II. BACKGROUND
A. Chord protocol
Chord is a typical structured peer-to-peer network protocol.
It uses an m-bit address space containing $2^m$ identifiers
arranged in a circle called the Chord ring (Fig. 1). Using a
consistent hash function, each participating node randomly
chooses an identifier and then takes a position in the ring
through a joining phase. A resource is distributed in a Chord
network as a key-value pair: the key is the resource's
identifier and the value is the resource itself. Each
key-value pair is stored at the first node that follows the
key clockwise on the Chord ring (i.e., the successor of the
key).
In this paper, we focus on the construction of the Chord ring,
which includes a joining phase and a finger table building
phase. The joining phase of a node is rather simple. When a
node wants to join a Chord network, it generates its
identifier by applying a consistent hash function to its own
information, such as its IP/port pair. The joining node then
sends a joining message to a well-known node in the Chord
network (the so-called "bootstrap node") to query for the
successor of its identifier. Once the successor information is
returned, the joining node contacts the successor and takes
the position immediately counter-clockwise of the successor on
the Chord ring. After that, the node performs a finger table
building phase to build its own finger table.
In an m-bit address space, each participating node builds a
finger table with m entries. If the identifier of the current
node is k, the $i$-th entry in the table holds information
about the successor node of identifier $(k + 2^{i-1})$, taken
modulo $2^m$. Finger tables are used to route messages between
nodes in the network. When a node receives a message destined
for an identifier on the Chord ring, it looks up its finger
table to find the node nearest to that identifier in the
counter-clockwise direction (i.e., the closest node preceding
the identifier) and forwards the message to that node. The
receiving node repeats the same operation. The process
continues until the message reaches the successor of the
destination identifier.
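
The finger table and the hop-by-hop forwarding described above
can be sketched in a few lines of Python. This is only an
illustrative sketch, not the authors' code: it assumes a global
sorted list of node identifiers is available (a real node only
knows its own fingers), and the names M, successor,
finger_table and route are hypothetical.

```python
from bisect import bisect_left

M = 6            # identifier bits (small hypothetical value for illustration)
RING = 2 ** M    # number of identifiers on the ring


def successor(node_ids, ident):
    """First node at or after ident going clockwise; node_ids is sorted."""
    idx = bisect_left(node_ids, ident % RING)
    return node_ids[idx % len(node_ids)]


def finger_table(node_ids, k):
    """Entry i (1-based) holds successor(k + 2^(i-1))."""
    return [successor(node_ids, k + 2 ** (i - 1)) for i in range(1, M + 1)]


def in_open_interval(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b


def route(node_ids, start, key):
    """Return the nodes visited when a lookup for key starts at start."""
    path = [start]
    current = start
    target = successor(node_ids, key)
    while current != target:
        fingers = finger_table(node_ids, current)
        # Default: the immediate successor, used when no finger precedes key.
        next_hop = fingers[0]
        # Otherwise take the finger closest to key that still precedes it.
        for finger in reversed(fingers):
            if in_open_interval(finger, current, key):
                next_hop = finger
                break
        current = next_hop
        path.append(current)
    return path
```

For example, with the ten nodes 1, 8, 14, 21, 32, 38, 42, 48,
51 and 56 on a 6-bit ring, a lookup for key 54 starting at node
8 visits nodes 8, 42, 51 and 56 in this sketch.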
In the Chord protocol, a finger table is constructed without
regard to the physical network topology (i.e., the latency
between nodes). Therefore, the latency of message routing
between two nodes in a Chord network may be significantly
higher than the latency between these nodes in the underlying
physical network. Chord also defines other operations, such as
node leaving and network recovery from a node failure. The
interested reader is referred to [1] for these details.
Figure 1. An example of a Chord ring

1) Related works: Several existing solutions for the Chord
protocol aim to reduce the latency of message routing caused
by the differences between the overlay network topology and
the underlying physical network topology.
In the Proximity Neighbor Selection (PNS) approach, each
finger table entry holds the information of a node that is
selected, based on proximity, from a number of nodes belonging
to a predetermined interval of identifiers. The effectiveness
of this approach is shown in [5], [7], [9]. However, the
actual realization of this approach differs between studies.
In [7], the authors propose a method that makes use of latency
information collected from the lookup process on the overlay
network. After a node joins a Chord network, it gradually
obtains a low-latency finger table by updating its finger
table entries based on the collected latency information. In
[9], the authors propose using the Vivaldi algorithm [11] to
calculate the position of a node in a virtual two-dimensional
space so that distances in the space approximate distances in
the underlying physical network. Each node then selects
neighbor nodes that are physically nearby by looking up nodes
that are close to it in the virtual two-dimensional space.
In the Proximity Identifier Selection (PIS) approach, a node
chooses its node identifier based on its geographic location.
Like [9], the work in [4] and Quasi-Chord [8] use a set of
landmarks to map nodes to points in a two-dimensional space.
The identifiers of nodes are selected such that nodes that are
close in the two-dimensional space are close in the identifier
space. Both approaches try to decrease the latency of message
routing on a DHT network. However, they require a set of
landmarks and a pre-processing phase in which nodes compute
their own coordinates in the proximity-based two-dimensional
space.
III. OUR APPROACH
A. Overview
In this section, we propose a new solution to the topology
mismatch problem of the Chord protocol. The solution achieves
the same goals as the related works, but with a different
approach that costs less and can be deployed easily in a
large-scale network.
The idea of our approach is that a node selects the position
at which it joins the Chord ring such that the nodes close to
it in the Chord ring are physically nearby. As a result, nodes
that are close in the Chord ring will also be close in the
physical network.
The idea comes from the observation that the proximity
neighbor selection approach is not effective for the upper
entries of the finger table. This is because the number of
nodes that can be selected for an upper entry of a finger
table (i.e., nodes that belong to a predetermined interval of
identifiers) is often small. Since the very last hops of
message routing in Chord usually use the upper entries of
finger tables, we need to reduce the latency of the last hops
by keeping physically nearby nodes as close together in the
Chord ring as possible.
Unlike existing studies, which force a joining node to
calculate its physical position, our method allows a joining
node to select its identifier from a set of randomly generated
identifiers, each of which corresponds to a position in the
Chord ring. The node selects an identifier as its joining
position such that the latency to the successor node and the
predecessor node of the selected identifier is the smallest in
the set of candidate identifiers.
Our method also includes a proximity neighbor selection phase.
The $i$-th entry of the finger table of a node k is selected
from a set of candidate nodes that are next to the successor
node of the identifier $(k + 2^{i-1})$. The selected node must
satisfy the condition that the latency from the selected node
to node k is the smallest in the set of candidate nodes.
The details of our method are described as follows.
B. Our method
Our method includes two phases.
Proximity-based identifier selection phase
When a new node joins a Chord network, instead of generating
only one identifier, the node generates a set C of
identifiers. The set C is generated randomly by applying a
hash function to the combination of the node's own information
and a random value.
The node then contacts a bootstrap node in the Chord network
to query the successor node and the predecessor node of each
identifier in the set C. We call the size of the set C CHOICE.
The node then measures the network latency from itself to the
successor node and the predecessor node of each identifier.
The latency to a node can be measured simply with a ping
protocol. The node then selects, as the identifier with which
it joins the Chord network, the identifier whose successor
node or predecessor node has the smallest latency to it. The
pseudo-code for the algorithm is shown in Fig. 2.

Figure 2. Pseudo-code of proximity-based identifier selection phase.
CHOICE: the size of set C of identifiers
Figure 3. Pseudo-code of proximity-based neighbor selection phase
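
Since the pseudo-code of Fig. 2 is not reproduced in this text,
the identifier selection step might be sketched in Python as
follows. This is a sketch under stated assumptions rather than
the authors' algorithm: SHA-1 and the 32-bit ring size M are
arbitrary choices, lookup_neighbors and measure_latency are
hypothetical callbacks standing in for the bootstrap query and
the ping measurement, and a candidate is scored by the smaller
of its successor and predecessor latencies, which is one
reading of the selection rule above.

```python
import hashlib
import random

M = 32  # identifier bits (hypothetical; the paper does not fix m here)


def candidate_ids(node_info, choice, rng):
    """Generate `choice` identifiers by hashing node info plus a random value."""
    ids = []
    for _ in range(choice):
        salt = rng.getrandbits(64)
        digest = hashlib.sha1(f"{node_info}|{salt}".encode()).digest()
        ids.append(int.from_bytes(digest, "big") % (2 ** M))
    return ids


def select_identifier(candidates, lookup_neighbors, measure_latency):
    """Return the candidate whose successor or predecessor is nearest.

    lookup_neighbors(ident) -> (successor, predecessor)   # hypothetical
    measure_latency(node)   -> latency in ms, e.g. a ping  # hypothetical
    """
    best_id, best_latency = None, float("inf")
    for ident in candidates:
        succ, pred = lookup_neighbors(ident)
        latency = min(measure_latency(succ), measure_latency(pred))
        if latency < best_latency:
            best_id, best_latency = ident, latency
    return best_id
```

With CHOICE candidates, the joining node issues one neighbor
query per candidate and up to two latency probes per candidate,
which is the source of the extra joining delay discussed later
in this section.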
Proximity-based neighbor selection phase
To update the $i$-th entry in the finger table, a node with
identifier k first queries the successor node s of the
identifier $(k + 2^{i-1})$ using the Chord protocol. Then,
node k queries s and the nodes adjacent to it to obtain a set
of candidate nodes located next to s clockwise and
counter-clockwise on the Chord ring. The candidate set has
2 * EXPANSION + 1 nodes: the successor node s, EXPANSION nodes
next to s counter-clockwise, and EXPANSION nodes next to s
clockwise. We call EXPANSION the spread degree. To avoid
placing the same node in multiple entries of the finger table,
nodes in the set that do not lie between $k + 2^{i-1}$ and
$k + 2^{i+1}$ are not considered. The pseudo-code for the
algorithm is shown in Fig. 3.
By probing the latency from node k to each node in the
candidate set, node k selects for the finger table entry the
candidate with the smallest latency to it.
The update of finger table entries is performed whenever
the network stabilization phase of Chord protocol is run.
Hence, it can adapt to the join/leave of nodes.
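
The pseudo-code of Fig. 3 is not reproduced in this text; a
rough Python sketch of the neighbor selection for one finger
entry could look like the following. It is an illustration
under assumptions, not the authors' code: it uses a global
sorted list of node identifiers instead of querying s and its
neighbors, the latency callback is hypothetical, the node k
itself is excluded, the interval is capped at the ring size,
and falling back to s when no candidate is eligible is an
assumption the text does not state.

```python
from bisect import bisect_left


def select_finger(ring, k, i, expansion, latency, m):
    """Pick the i-th finger (1-based) of node k by latency.

    ring      -- sorted list of node identifiers (global view, for illustration)
    expansion -- the EXPANSION parameter (candidates on each side of s)
    latency   -- hypothetical callback latency(k, node) -> milliseconds
    """
    ring_size = 2 ** m
    lo = (k + 2 ** (i - 1)) % ring_size
    # Width of the interval from k + 2^(i-1) to k + 2^(i+1), as in the text,
    # capped at the ring size so the modular test below stays meaningful.
    span = min(2 ** (i + 1) - 2 ** (i - 1), ring_size)

    # s is the successor of k + 2^(i-1).
    s_index = bisect_left(ring, lo) % len(ring)
    s = ring[s_index]

    # s plus EXPANSION nodes on each side of it on the ring.
    candidates = [
        ring[(s_index + offset) % len(ring)]
        for offset in range(-expansion, expansion + 1)
    ]
    # Keep only candidates inside the interval; never select k itself.
    eligible = {c for c in candidates
                if c != k and (c - lo) % ring_size < span}
    if not eligible:
        return s  # assumption: fall back to the plain Chord finger
    # Smallest latency wins; ties are broken by identifier for determinism.
    return min(eligible, key=lambda c: (latency(k, c), c))
```

With EXPANSION = 0 the candidate set contains only s, so the
sketch reduces to the conventional Chord finger.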
With high probability, selecting node identifiers based on
network latency keeps the latency among neighboring nodes in a
Chord network relatively small. Hence, it takes a short time
to route messages whose destinations are nearby on the Chord
ring. Furthermore, updating finger tables based on network
latency reduces the time needed to forward query messages
among nodes. Therefore, our solution can reduce the latency of
message routing among participating nodes in a Chord network.
The solution has the drawback of delaying the joining process
of new nodes. Compared with the original Chord protocol, the
solution requires extra time to probe the latency from a
joining node to the neighbors of a set of randomly generated
node IDs. Updating the routing table also needs extra time to
select one node from a set of candidate nodes. The more nodes
or identifiers there are in the sets, the lower the latency we
obtain, but the longer it takes for nodes to join the network
or update their routing tables. This is a trade-off, and we
need to run simulations to find reasonable parameter values.
IV. EVALUATION
To evaluate the efficiency of the proposed solution, we
have performed experiments with a simple network simu-
lator. Our simulation focuses on measuring the difference
of message routing latency between the conventional Chord
and our proposed method.
A. Simulator
We built a simulator that generates a physical network
topology of participating nodes and maps it to both the
conventional and the improved Chord overlay networks.
Generated queries are routed among participating nodes based
on routing information, and the returned results are collected
to build statistics.

Figure 4. Average query latency comparison
The simulated network topology is divided into separate
domains. Each domain has a number of participating nodes.
Nodes in the same domain are connected in a star topology
around a central switch. The switches of all domains are
connected directly to each other. Each connection between a
node and its domain's switch, and between two switches, has
its own latency. If nodes A and B are in the same domain, the
latency between them is the sum of the latency between node A
and the domain's switch and the latency between the switch and
node B. Otherwise, the latency is the sum of the latency
between node A and its domain's switch, the latency between
the switch of A's domain and the switch of B's domain, and the
latency between the switch of B's domain and node B.
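
The latency model above can be written as a small Python
function. This is a minimal sketch: the Node class, its field
names and the switch_latency dictionary are hypothetical, and
treating the latency from a node to itself as zero is an
assumption the text does not cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    domain: int
    access_latency: float  # ms between the node and its domain's switch


def latency(a, b, switch_latency):
    """Latency in ms between nodes a and b in the star-of-domains topology.

    switch_latency maps frozenset({domain_x, domain_y}) to the latency
    of the direct link between the two domain switches.
    """
    if a == b:
        return 0.0  # assumption: a node reaches itself at no cost
    if a.domain == b.domain:
        # node A -> switch -> node B
        return a.access_latency + b.access_latency
    # node A -> switch of A's domain -> switch of B's domain -> node B
    between = switch_latency[frozenset((a.domain, b.domain))]
    return a.access_latency + between + b.access_latency
```

For instance, two nodes in the same domain with access
latencies of 5 ms and 12 ms are 17 ms apart, while a 100 ms
switch-to-switch link adds to the two access latencies for
nodes in different domains.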
With the simulator, we can specify the following parameters:
the number of participating nodes, the number of domains, the
number of nodes in a domain, and the latency of connections.
We divide the experiment time into intervals called "rounds".
To simulate the join/leave of nodes, the simulator makes nodes
join and leave the network randomly after each round. The
join/leave timing of nodes is generated based on the Pareto
distribution. Then, the routing tables of all nodes are
updated and the simulator generates queries. The average
latency in each round after the routing tables are updated and
the overall average latency of the experiment are stored to
build statistics for later review.
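
As a rough illustration of Pareto-based churn, the sketch below
draws, for each node, the round after which it leaves. The
shape parameter alpha and the function name are hypothetical;
the paper does not state which Pareto parameters the simulator
uses or exactly how they are applied.

```python
import random


def schedule_departures(node_ids, current_round, rng, alpha=1.5):
    """Map each node to the round after which it leaves the network.

    Session lengths are drawn from a Pareto distribution with the
    hypothetical shape parameter alpha; random.paretovariate returns
    values >= 1, so every node stays for at least one round.
    """
    return {
        node: current_round + int(rng.paretovariate(alpha))
        for node in node_ids
    }
```

A heavy-tailed distribution of this kind gives many short
sessions and a few very long ones, which is a common way to
model churn in P2P simulations.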
B. Simulation results
The first experiment compares the efficiency of our solution
with that of conventional Chord. The simulator creates 4096
nodes divided into 32 domains, with an EXPANSION value of 3
and CHOICE values of 8 and 16, respectively. The latency of
connections between nodes in a domain is assigned randomly
from 1 to 30 ms, and the latency between two switches is
assigned randomly from 50 to 250 ms. The experiment runs for
360 rounds, with about 1000 queries generated in each round.
The average latency of the network in the first 100 rounds is
presented in Fig. 4. The result indicates that our proposed
method keeps the average query latency in each round steady
and significantly lower than that of the conventional Chord
protocol. After 360 rounds, the average latency of our method
with CHOICE=8 is about 759 ms, which is about 67% of the
result for conventional Chord (about 1123 ms). With CHOICE=16,
the average latency improves slightly to 731 ms.

Figure 5. Average query latency comparison
Figure 6. Average latency with CHOICE adjustment
We also performed the same experiment with an increased number
of nodes in the Chord network. Figure 5 shows that our
solution reduces the latency of message routing by about 33%
compared with the conventional Chord protocol, even when the
number of nodes in the network increases.
The second experiment examines how the degree of latency
optimization changes when we adjust the CHOICE value, i.e.,
the number of randomly generated node IDs in the joining
phase. The simulator parameters are the same as in the first
experiment: 4096 nodes are divided into 32 domains and the
EXPANSION value is 3, but there are 3600 rounds with about
1000 queries generated in each round. The result is presented
in Fig. 6.

Figure 7. Average latency with EXPANSION adjustment
Figure 8. The distribution of ID intervals between adjoining nodes in a
Chord ring
The case in which the CHOICE value is 1 corresponds to the
joining phase of conventional Chord, with only the
latency-based selection of routing table entries applied. In
this case, the average latency is 814 ms. The more ID choices
a node has in the joining phase, the better the latency
optimization we can achieve. When the CHOICE value is 32, the
average latency is 660 ms.
The third experiment focuses on examining optimization
degree of the solution when we adjust EXPANSION value
- the number of candidate nodes in the neighbor selection
phase. As the EXPANSION value increases, each node has
more choices to select a node for a routing table entry.
Hence, we get better latency optimization degree (Fig. 7).
The random mapping of identifiers in the Chord ring is one of
the advantages of Chord that we need to keep. The conventional
Chord protocol uses a consistent hash algorithm to assign one
ID to each newly joining node. In our solution, an ID is
assigned to a newly joining node based on node proximity.
Because of this difference between the two joining algorithms,
we need to examine the effect of our solution on the random
mapping of identifiers in the Chord ring. The simulation of
our method is performed with CHOICE=8 and EXPANSION=3. The
simulation results give the statistics of ID intervals between
every pair of adjoining nodes in the Chord ring. As shown in
Fig. 8, with about 2000 nodes in the network, the distribution
of ID intervals between adjoining nodes in the Chord ring
under our method is similar to that under the conventional
Chord protocol. This means that our method has little effect
on the random mapping of identifiers in the Chord ring.
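
The ID intervals between adjoining nodes examined above can be
computed as in the following sketch. The function name and the
handling of a single-node ring are assumptions; the paper does
not describe how its simulator gathers these statistics.

```python
def id_intervals(node_ids, m):
    """Clockwise identifier gaps between adjoining nodes on a 2^m ring."""
    ring = sorted(node_ids)
    size = 2 ** m
    if len(ring) == 1:
        return [size]  # assumption: a lone node owns the whole ring
    return [
        (ring[(i + 1) % len(ring)] - ring[i]) % size
        for i in range(len(ring))
    ]
```

The gaps always sum to the ring size, so comparing the two
joining algorithms comes down to comparing how evenly that
total is spread, for example with a histogram of the returned
list.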
V. CONCLUSION AND FUTURE WORKS
In this paper, we have proposed a method to optimize the Chord
topology based on the latency of the underlying network. The
solution consists of two modifications of the Chord protocol:
selecting, from a set of randomly generated IDs, the ID whose
next and previous nodes have the smallest latency to the
joining node, and updating the routing table by selecting
neighbor nodes based on network latency. Through experiments
with a simulator, we have shown that our method reduces the
average query latency significantly but has little effect on
the random mapping of identifiers in the Chord ring. Moreover,
the solution is rather simple to deploy in Chord-based
applications.
In the near future, we will continue to evaluate the
efficiency of our method in a real network topology and deploy
our method in Chord-based applications.
ACKNOWLEDGMENT
This work is partly supported by research project No.
QC.09.12, granted by Vietnam National University, Hanoi.
REFERENCES
[1] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek and H.
Balakrishnan, "Chord: A Scalable peer-to-peer lookup service
for Internet applications", Proceedings of ACM SIGCOMM'01,
August 2001.
[2] S. Ratnasamy, P. Francis, M. Handley and R. Karp, "A
Scalable Content-Addressable Network", Proceedings of ACM
SIGCOMM'01, August 2001.
[3] A. Rowstron and P. Druschel, "Pastry: Scalable, distributed
object location and routing for large-scale peer-to-peer
systems", Proceedings of IFIP/ACM International Conference on
Distributed Systems Platforms, November 2001.
[4] S. Ratnasamy, M. Handley, R. Karp and S. Shenker,
"Topologically-Aware Overlay Construction and Server
Selection", Proceedings of INFOCOM 2002.
[5] K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S.
Shenker and I. Stoica, "The Impact of DHT Routing Geometry on
Resilience and Proximity", Proceedings of SIGCOMM 2003.
[6] F. Dabek, J. Li, E. Sit, J. Robertson, M. F. Kaashoek and
R. Morris, "Designing a DHT for low latency and high
throughput", Proceedings of 1st Symposium on Networked Systems
Design and Implementation, March 2004.
[7] A. Goel, R. Govindan and H. Zhang, "Improving Lookup
Latency in Distributed Hash Table Systems using Random
Sampling", IEEE/ACM Transactions on Networking, Volume 13,
February 2005.
[8] M. Sun and Z. Zhang, "Quasi-Chord: physical topology aware
structured P2P network", Proceedings of the 11th Joint
Conference on Information Sciences, December 2008.
[9] H. Duan, X. Lu, H. Tang, X. Zhou and Z. Zhao, "Proximity
Neighbor Selection in Structured P2P Network", Proceedings of
The Sixth IEEE International Conference on Computer and
Information Technology, September 2006.
[10] T. S. Eugene Ng and H. Zhang, "Predicting Internet Network
Distance with Coordinates-Based Approaches", Proceedings of
INFOCOM 2002.
[11] R. Cox, F. Dabek, F. Kaashoek, J. Li and R. Morris,
"Practical, distributed network coordinates", Proceedings of
the Second Workshop on Hot Topics in Networks (HotNets-II),
November 2003.
... We can broadly classify these research efforts into two categories -Proximity Neighbor Selection ( PNS ) [9,16,23,26,35,36] and Proximity Identifier Selection (PIS) [10,17 ,39,40, 46,48,50]. Further, some research efforts [12,48] also used a combination of both PNS and PIS scheme. ...
... In [12], authors provide location information by geographical mapping locations in 2-D space. Any incoming new node in the overlay places itself closer to a node with minimum latency. ...
... If the node/link has been removed, lines [8,9] help identify the node's IP address and then remove the node entry from the hash map. Similarly, lines [11][12][13] shows the addition of node entry in the hash-map by the controller at the appropriate position. ...
Article
Full-text available
Structured Peer-to-Peer (P2P) overlay networks are scalable and efficient networks. This efficiency comes at a cost as there is a need for regular maintenance of an overlay structure to handle the continuous churn of peers. This results in huge overhead in terms of maintenance traffic generated by peers. This problem becomes more critical when logical overlay topology differs from actual physical network topology, a common phenomenon in P2P overlay networks. In this paper, we present a novel approach to designing a Distributed Hash Table (DHT) based P2P overlays with the help of Software Defined Networks (SDN) concepts. SDN is an upcoming technology that provides enough flexibility in deciding application-specific network policies and rules dynamically. In our approach, we have attempted to bring the overlay topology closer to actual physical topology and reduce the maintenance traffic as much as possible. We have also tested our work on a hybrid network consisting of both virtual and physical switches. The experimental results suggest and establish the applicability of our approach.
... No multi-path routing. Bidirectional latencysensitive Chord [32], LLChord [15], topology matching [8] Lookup latency. ...
... In randomized Chord [13], a node u takes its ith finger randomly in [u + 2 i−1 , u + 2 i ), decreasing in result the fraction of failed lookups (by 29% in the Gnutella trace [12]). The proximity neighbor selection scheme (PNS) [13] allows reducing the lookup latency; concrete Chord extensions include bidirectional latency-sensitive Chord [32], LLChord [15] 1 , and topology matching [8]. All these approaches also apply to CR-Chord since it does not fix a strategy of neighbor selection. ...
Chapter
Full-text available
Without additional mechanisms conventional DHTs are vulnerable to attacks. In particular, previous research showed that Chord is not well resistant to malicious nodes that joined the DHT. This chapter describes the CR-Chord protocol, our implementation of the cyclic routing algorithm. Using simulations we compare the lookup availability of basic Chord and CR-Chord. The results suggest that CR-Chord improves the lookup availability on the average by 1.4 times. When the number of malicious nodes is small, such as 5%, CR-Chord has almost twice lower lookup failure rate.
Conference Paper
Structured Peer-to-Peer (P2P) overlay networks based on Distributed Hash Table (DHT) provide self-organization and fault tolerance along with good look-up efficiency. This makes them a perfect alternative for designing distributed applications where participating entities (peers) leave or join the system dynamically. However due to continuous churn of peers, periodic maintenance of overlay structure is required resulting in large amount of network traffic. This issue becomes more critical when logical overlay topology is different from actual physical network topology which is a common phenomenon in P2P overlay networks. In this paper, we present a novel approach of designing DHT based P2P overlays using Software Defined Networks (SDN) paradigm. SDN provides flexible ways to tune the policies and rules of underlying physical network dynamically according to the application. In our approach, we have used SDN concepts to bring the overlay topology closer to actual physical topology with an aim to reduce the maintenance traffic. We have implemented an improved SDN based DHT on a smaller scale as proof of concept. The experimental results suggest and establish the applicability of our approach.
Conference Paper
In many structured overlays, nodes strictly maintain their routing tables using node identifiers. However, building routing tables while taking account of the physical network is difficult. We propose proximity-aware flexible routing tables (PFRT) in this paper as a method to systematically design proximity-aware routing algorithms for structured overlays. PFRT was developed by extending the flexible routing tables (FRT) method such that network proximity is considered. Routing tables in FRT-and PFRT-based algorithms are updated according to the order of the set of all routing table patterns. FRT-based algorithms define a total order based on node identifiers, whereas PFRT-based algorithms define two total orders based on node identifiers and network proximity. Because PFRT is a simple extension of FRT, PFRT-based algorithms also have many of the advantageous features of FRT. We extended Chord to design PFRT-Chord, which is a PFRT-based algorithm. Experimental results showed that PFRT-Chord preserves the expected FRT-derived properties and it could take account of network proximity.
Book
Full-text available
The main goal of this book is to cover important issues in optimization of present large-scale P2P systems as well as help in designing future systems. The book contains a comprehensive survey and summary of research results complemented by authors’ own contributions to the field in the recent years. The book is extensively illustrated by examples to help the understanding as well as includes an extensive list of references on P2P articles and resources.
Conference Paper
Distributed Hash Tables (DHTs) attract con- siderable attention as a way of realizing an efficient lookup service in Peer-to-Peer (P2P) applications. However, many of conventional DHTs such as Chord and Kademlia do not take into account the proximity of nodes in the physical network, which causes a significant performance degradation as the size of the underlying network grows. In this paper, we propose a method to construct a proximity- aware DHT based on the notion of virtual coordinate space realized by a network coordinate system (NCS). The performance of the proposed method is evaluated by simulation. The result of simulations indicates that it reduces the average physical distance between adjacent nodes in the P2P overlay by 10% for random or scale-free networks, and by 80% for grid-structured networks.
Article
Distributed Hash Tables (DHTs) provide a useful key-to-value lookup service for many Internet applications. However, without additional mechanisms DHTs are vulnerable to attacks. In particular, previous research showed that Chord is not well resistant to malicious nodes that joined the DHT. We introduce the cyclic routing algorithm as an extension of Chord (CR-Chord). Using simulations we compare the lookup availability of Chord and CR-Chord. The results suggest that CR-Chord improves the lookup availability on the average by 1.4 times. When the number of malicious nodes is small, such as 5%, CR-Chord has almost twice lower lookup failure rate.
Article
Our paper proposes a new approach, called Quasi-Chord, to solve the topology mismatch problem in P2P networks. Quasi-Chord follows the design philosophy of Chord. Our model uses the global network positioning (GNP) system to assign coordinates to hosts on the physical layer and uses the Cantor space-filling curve to map the 2-dimensional geometric space into one dimension. We then build the Quasi-Chord model according to the Cantor value. Simulation experiments show that this method can effectively lower network delay and decrease network traffic.
Conference Paper
A number of large-scale distributed Internet applications could potentially benefit from some level of knowledge about the relative proximity between its participating host nodes. For example, the performance of large overlay networks could be improved if the application-level connectivity between the nodes in these networks is congruent with the underlying IP-level topology. Similarly, in the case of replicated Web content, client nodes could use topological information in selecting one of multiple available servers. For such applications, one need not find the optimal solution in order to achieve significant practical benefits. Thus, these applications, and presumably others like them, do not require exact topological information and can instead use sufficiently informative hints about the relative positions of Internet hosts. In this paper, we present a binning scheme whereby nodes partition themselves into bins such that nodes that fall within a given bin are relatively close to one another in terms of network latency. Our binning strategy is simple (requiring minimal support from any measurement infrastructure), scalable (requiring no form of global knowledge, each node only needs knowledge of a small number of well-known landmark nodes) and completely distributed (requiring no communication or cooperation between the nodes being binned). We apply this binning strategy to the two applications mentioned above: overlay network construction and server selection. We test our binning strategy and its application using simulation and Internet measurement traces. Our results indicate that the performance of these applications can be significantly improved by even the rather coarse-grained knowledge of topology offered by our binning scheme.
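The abstract above does not spell out how a node derives its bin. As a minimal sketch, assume each node measures its round-trip time to every landmark and takes the ordering of landmarks by increasing latency as its bin; the function names and the sample latencies below are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Tuple


def landmark_bin(rtts_ms: List[float]) -> Tuple[int, ...]:
    """Return landmark indices ordered by increasing RTT (assumed bin label).

    Ties are broken by landmark index so the label is deterministic.
    """
    return tuple(sorted(range(len(rtts_ms)), key=lambda i: (rtts_ms[i], i)))


def group_by_bin(measurements: Dict[str, List[float]]) -> Dict[Tuple[int, ...], List[str]]:
    """Group node names by bin label; each node needs only its own RTTs."""
    bins: Dict[Tuple[int, ...], List[str]] = {}
    for node, rtts in measurements.items():
        bins.setdefault(landmark_bin(rtts), []).append(node)
    return bins


# Hypothetical RTTs (ms) from three nodes to three landmarks.
measurements = {
    "a": [10.0, 80.0, 150.0],
    "b": [12.0, 75.0, 160.0],
    "c": [140.0, 90.0, 15.0],
}
bins = group_by_bin(measurements)
```

In this sketch, nodes "a" and "b" see the landmarks in the same latency order and land in the same bin, while "c" lands in a different one. No node needs to communicate with any other node to compute its label, which matches the "completely distributed" property described in the abstract.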
Conference Paper
The various proposed DHT routing algorithms embody several different underlying routing geometries. These geometries include hypercubes, rings, tree-like structures, and butterfly networks. In this paper we focus on how these basic geometric approaches affect the resilience and proximity properties of DHTs. One factor that distinguishes these geometries is the degree of flexibility they provide in the selection of neighbors and routes. Flexibility is an important factor in achieving good static resilience and effective proximity neighbor and route selection. Our basic finding is that, despite our initial preference for more complex geometries, the ring geometry allows the greatest flexibility, and hence achieves the best resilience and proximity performance.
Article
Distributed hash table (DHT) systems are an important class of peer-to-peer routing infrastructures. They enable scalable wide-area storage and retrieval of information, and will support the rapid development of a wide variety of Internet-scale applications ranging from naming systems and file systems to application-layer multicast. DHT systems essentially build an overlay network, but a path on the overlay between any two nodes can be significantly different from the unicast path between those two nodes on the underlying network. As such, the lookup latency in these systems can be quite high and can adversely impact the performance of applications built on top of such systems. In this paper, we discuss a random sampling technique that incrementally improves lookup latency in DHT systems. Our sampling can be implemented using information gleaned from lookups traversing the overlay network. For this reason, we call our approach lookup-parasitic random sampling (LPRS). LPRS converges quickly, and requires relatively few modifications to existing DHT systems. For idealized versions of DHT systems like Chord, Tapestry, and Pastry, we analytically prove that LPRS can result in lookup latencies proportional to the average unicast latency of the network, provided the underlying physical topology has a power-law latency expansion. We then validate this analysis by implementing LPRS in the Chord simulator. Our simulations reveal that LPRS-Chord exhibits a qualitatively better latency scaling behavior relative to unmodified Chord. The overhead of LPRS is one sample per lookup hop in the worst case. Finally, we provide evidence which suggests that the Internet router-level topology resembles power-law latency expansion. This finding implies that LPRS has significant practical applicability as a general latency reduction technique for many DHT systems. This finding is also of independent interest since it might inform the design of latency-sensitive topology models for the Internet.
Conference Paper
This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications. Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming. Each node in the Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Pastry takes into account network locality; it seeks to minimize the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Pastry is completely decentralized, scalable, and self-organizing; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on an emulated network of up to 100,000 nodes confirm Pastry's scalability and efficiency, its ability to self-organize and adapt to node failures, and its good network locality properties.
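The delivery rule stated in the abstract above (a message ends up at the live node whose nodeId is numerically closest to the key) can be illustrated with a small sketch. This is not Pastry's hop-by-hop routing: it assumes a global view of all live nodeIds, uses plain absolute difference as the distance, and the function name and sample identifiers are hypothetical.

```python
from typing import Iterable


def closest_node(key: int, live_node_ids: Iterable[int]) -> int:
    """Return the live nodeId numerically closest to key.

    Assumption: distance is the plain absolute difference; ties go to the
    smaller nodeId. A real node would reach this destination in several
    overlay hops without knowing every live node.
    """
    ids = list(live_node_ids)
    if not ids:
        raise ValueError("no live nodes")
    return min(ids, key=lambda node_id: (abs(node_id - key), node_id))


# Hypothetical live nodeIds in a small identifier space.
live = [5, 40, 90, 200]
destination = closest_node(100, live)  # 90 is nearer to 100 than 200 is
```

If node 90 fails and is removed from the live set, the same key is delivered to the next closest node, which is the behaviour the abstract describes as adapting to node failures.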
Conference Paper
Structured P2P networks offer efficient and fault-tolerant routing, object location and load balancing for upper-layer applications. In this paper, we propose a novel scheme to improve the proximity property of structured P2P networks. Based on 2-dimensional virtual network coordinates, physically nearby nodes are clustered into the same or adjacent regions after mapping from the network coordinate space to the identifier space of the DHT. By searching for the corresponding region id using RPC calls in Chord, nodes obtain their nearby neighbors quickly in a purely distributed way. Results obtained via simulation of large-scale topology models indicate that our scheme has a lower average neighbor delay than a random sampling approach and incurs only a modest additional search overhead.
Article
Designing a wide-area distributed hash table (DHT) that provides high-throughput and low-latency network storage is a challenge. Existing systems have explored a range of solutions, including iterative routing, recursive routing, proximity routing and neighbor selection, erasure coding, replication, and server selection.
Article
Vivaldi is a distributed algorithm that assigns synthetic coordinates to Internet hosts, so that the Euclidean distance between two hosts' coordinates predicts the network latency between them. Each node in Vivaldi computes its coordinates by simulating its position in a network of physical springs. Vivaldi is both distributed and efficient: no fixed infrastructure need be deployed and a new host can compute useful coordinates after collecting latency information from only a few other hosts. Vivaldi can rely on piggy-backing latency information on application traffic instead of generating extra traffic by sending its own probe packets.
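The spring analogy in the abstract above can be sketched as a single coordinate update: a node compares the measured latency to a peer with the Euclidean distance between their coordinates and moves along the line between them to reduce the mismatch. This is a simplified illustration only; the constant step size `delta` and the function name are assumptions, since the abstract does not specify how far a node moves per sample.

```python
import math
from typing import List


def vivaldi_update(xi: List[float], xj: List[float], rtt: float,
                   delta: float = 0.25) -> List[float]:
    """Return new coordinates for node i after one latency sample to node j.

    The spring between i and j has rest length rtt. If the coordinates are
    closer than rtt, i is pushed away from j; if farther, i is pulled toward j.
    delta is an assumed constant step size.
    """
    diff = [a - b for a, b in zip(xi, xj)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        # Coincident coordinates: pick an arbitrary direction to separate them.
        unit = [1.0] + [0.0] * (len(xi) - 1)
    else:
        unit = [d / dist for d in diff]
    error = rtt - dist  # positive means the coordinates are too close
    return [a + delta * error * u for a, u in zip(xi, unit)]


# Node i at the origin, node j at distance 5, measured latency 10.
new_xi = vivaldi_update([0.0, 0.0], [3.0, 4.0], rtt=10.0)
new_dist = math.dist(new_xi, [3.0, 4.0])
```

In this example the predicted distance moves from 5 toward the measured latency of 10. Repeating the update over samples from a few peers is what lets a new host obtain useful coordinates without any fixed infrastructure.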
Article
Hash tables -- which map "keys" onto "values" -- are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales. The CAN design is scalable, fault-tolerant and completely self-organizing, and we demonstrate its scalability, robustness and low-latency properties through simulation.