Conference Paper

Kademlia: A Peer-to-peer Information System Based on the XOR Metric

Authors:
  • Maymounkov Math

Abstract

We describe a peer-to-peer system which has provable consistency and performance in a fault-prone environment. Our system routes queries and locates nodes using a novel XOR-based metric topology that simplifies the algorithm and facilitates our proof. The topology has the property that every message exchanged conveys or reinforces useful contact information. The system exploits this information to send parallel, asynchronous query messages that tolerate node failures without imposing timeout delays on users.
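As a rough illustration of the XOR-based metric named in the abstract, the following Python sketch treats node IDs as integers and measures the distance between two IDs as their bitwise XOR. The 160-bit width, the SHA-1 derivation in `node_id`, and all helper names are assumptions made for this example; the abstract itself does not specify them.

```python
import hashlib

ID_BITS = 160  # assumed ID width for this sketch


def node_id(name: str) -> int:
    """Hypothetical helper: derive a 160-bit ID by hashing a name with SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Distance between two IDs: their bitwise XOR, read as an integer."""
    return a ^ b


def closest(target: int, contacts: list[int], k: int = 3) -> list[int]:
    """Return the k contacts with the smallest XOR distance to target."""
    return sorted(contacts, key=lambda c: xor_distance(c, target))[:k]


if __name__ == "__main__":
    peers = [node_id(f"peer-{i}") for i in range(8)]
    key = node_id("some-key")
    for peer in closest(key, peers):
        print(f"{peer:040x}  distance={xor_distance(peer, key):040x}")
```

XOR is symmetric and gives zero only for identical IDs, so two nodes always agree on how far apart they are. That symmetry is one plausible reading of why each message received can double as useful contact information.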
[Figure: space of 160-bit numbers (used for nodeIDs and keys), from 00...00 to 11...11; 0/1 branch labels omitted]
... A blockchain system relies on an underlying peer-to-peer (P2P) network to propagate information including recent transactions and blocks. The topology of the P2P network is foundational to the blockchain's availability under network partitions, its security against a variety of attacks (e.g., eclipsing targeted nodes [1], denial of individual nodes' service [72,73], and deanonymization of transaction senders [74,75]), and its performance (e.g., mining power utilization [76] and the quality of RPC services [77,78,79]) ... forms a structured DHT network by following Kademlia's protocols [82] for peer discovery (RLPx) and session establishment (DevP2P) [67], and 2) a number of application-specific overlays [83,67], among which the dominant ones are Ethereum blockchains for information propagation. In particular, the Ethereum P2P network hosts multiple blockchain overlays with different "networkIDs" ...
Thesis
This thesis aims to examine the security of a blockchain's communication network. A blockchain relies on a communication network to deliver transactions. Understanding and hardening the security of the communication network against Denial-of-Service (DoS) attacks are thus critical to the well-being of blockchain participants. Existing research has examined blockchain system security in various system components, including mining incentives, consensus protocols, and applications such as smart contracts. However, the security of a blockchain's communication network remains understudied. In practice, a blockchain's communication network typically consists of three services: RPC service, P2P network, and mempool. This thesis examines each service's designs and implementations, discovers vulnerabilities that lead to DoS attacks, and uncovers the P2P network topology. Through systematic evaluations and measurements, the thesis confirms that real-world network services in Ethereum are vulnerable to DoS attacks, leading to a potential collapse of the Ethereum ecosystem. Besides, the uncovered P2P network topology in Ethereum mainnet suggests that critical nodes adopt a biased neighbor selection strategy in the mainnet. Finally, to fix the discovered vulnerabilities, practical mitigation solutions are proposed in this thesis to harden the security of Ethereum's communication network.
... Moreover, instead of using direct connections, we implemented a topology inspired by peer-to-peer networks using the Kademlia topology [17]. A given process connects to the following process ranks at power-of-two strides. ...
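One way to read the power-of-two connection rule in the excerpt above is sketched below in Python. The function name `stride_neighbors`, the wrap-around at the last rank, and the choice to stop once the stride reaches the number of processes are all assumptions of this sketch; the excerpt does not spell them out.

```python
def stride_neighbors(rank: int, size: int) -> list[int]:
    """Ranks reached from `rank` at strides 1, 2, 4, ... (assumed to wrap around)."""
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    neighbors = []
    stride = 1
    while stride < size:
        # Each stride below `size` yields a distinct neighbor modulo `size`.
        neighbors.append((rank + stride) % size)
        stride *= 2
    return neighbors


if __name__ == "__main__":
    for r in range(8):
        print(r, stride_neighbors(r, 8))
```

Under these assumptions each process keeps about log2(size) connections instead of size - 1.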
... accessed on 5 July 2022, is a P2P-distributed file system, designed for storing versioned file data in a decentralized manner [22]. The IPFS has been built on top of the BitTorrent protocol [45] and the Kademlia DHT [46]. BitTorrent is a widely used P2P filesharing system, which in the IPFS enables the efficient relocation of objects between peers composing the infrastructure. ...
Article
Edge computing constitutes a promising paradigm of managing and processing the massive amounts of data generated by Internet of Things (IoT) devices. Data and computation are moved closer to the client, thus enabling latency- and bandwidth-sensitive applications. However, the distributed and heterogeneous nature of the edge as well as its limited resource capabilities pose several challenges in implementing or choosing an efficient edge-enabled storage system. Therefore, it is imperative for the research community to contribute to the clarification of the purposes and highlight the advantages and disadvantages of various edge-enabled storage systems. This work aspires to contribute toward this direction by presenting a performance analysis of three different storage systems, namely MinIO, BigchainDB, and the IPFS. We selected these three systems as they have been proven to be valid candidates for edge computing infrastructures. In addition, as the three evaluated systems belong to different types of storage, we evaluated a wide range of storage systems, increasing the variability of the results. The performance evaluation is performed using a set of resource utilization and Quality of Service (QoS) metrics. Each storage system is deployed and installed on a Raspberry Pi (small single-board computers), which serves as an edge device, able to optimize the overall efficiency with minimum power and minimum cost. The experimental results revealed that MinIO has the best overall performance regarding query response times, RAM consumption, disk IO time, and transaction rate. The results presented in this paper are intended for researchers in the field of edge computing and database systems.
Chapter
Several distributed storage solutions that do not rely on a central server have been proposed over the last few years. Most of them are deployed on public networks on the internet. However, these solutions often do not provide a mechanism for access rights to enable the users to control who can access a specific file or piece of data. In this article, we propose Mutida (from the Latin word “Aditum” meaning “access”), a protocol that allows the owner of a file to delegate access rights to another user. This access right can then be delegated to a computing node to process the piece of data. The mechanism relies on the encryption of the data, public key/value pair storage to register the access control list and on a function executed locally by the nodes to compute the decryption key. After presenting the mechanism, its advantages and limitations, we show that the proposed mechanism has similar functionalities to Wave, an authorization framework with transitive delegation. However, Wave does not require fully trusted nodes. We implement our approach in a Java software program and evaluate it on the Grid’5000 testbed. We compare our approach to an approach based on a protocol relying on Shamir key reconstruction, which provides similar features.
Article
After years of in-depth development of blockchain, various blockchains with different characteristics and suitable for different application scenarios coexist in large numbers. Due to the isolation of blockchains and the high degree of heterogeneity between chains, value transfer and data communication between existing blockchains are facing unprecedented challenges, and the phenomenon of value isolated island is gradually emerging. The cross-chain technology of blockchain is an important technical means to realize the interconnection of blockchains and improve the interoperability and scalability of blockchains. In this paper, the development and application of blockchain cross-chain technology are studied, the background and significance of cross-chain technology are described, the research status of cross-chain technology is expounded, the current mainstream cross-chain technologies and cross-chain projects are introduced, the mentioned cross-chain technologies and cross-chain projects are analyzed and compared. In addition, this paper also summarizes the difficulties existing in the current cross-chain technology and provides solutions for reference, so as to lead to the discussion of the development trend of cross-chain technology, and finally complete the summary of the research content of the full text and the prospect of cross-chain technology. It is hoped that the relevant summary results can help relevant researchers and practitioners quickly grasp the research progress in the field of blockchain interoperability, and obtain relevant knowledge and application methods in this field.
Article
The number of computational devices accessible over the Internet offers an enormous but latent computational power. Nonetheless, the complexity of orchestrating and managing such devices requires dedicated architectures and tools and hinders the exploitation of this vast processing capacity. Over the last years, the paradigm of (Browser-based) Volunteer Computing emerged as a unique approach to harnessing such computational capabilities, leveraging the idea of voluntarily offering resources. This article proposes VFuse, a groundbreaking architecture to exploit the Browser-based Volunteer Computing paradigm via a ready-to-access volunteer network. VFuse offers a modern multi-language programming environment for developing scientific workflows using WebAssembly technology without requiring any local installation or configuration from the user. We equipped our architecture with a secure and transparent rewarding mechanism based on blockchain technology (Ethereum) and distributed P2P file system (IPFS). Further, the use of Non-Fungible Tokens provides a unique, secure, and transparent methodology for recognizing the users’ participation in the network. We developed a prototype of the proposed architecture and four example applications implemented with our system. All code and examples are publicly available on GitHub.
Article
The popularity of peer-to-peer multimedia file sharing applications such as Gnutella and Napster has created a flurry of recent research activity into peer-to-peer architectures. We believe that the proper evaluation of a peer-to-peer system must take into account the characteristics of the peers that choose to participate. Surprisingly, however, few of the peer-to-peer architectures currently being developed are evaluated with respect to such considerations. In this paper, we remedy this situation by performing a detailed measurement study of the two popular peer-to-peer file sharing systems, namely Napster and Gnutella. In particular, our measurement study seeks to precisely characterize the population of end-user hosts that participate in these two systems. This characterization includes the bottleneck bandwidths between these hosts and the Internet at large, IP-level latencies to send packets to these hosts, how often hosts connect and disconnect from the system, how many files hosts share and download, the degree of cooperation between the hosts, and several correlations between these characteristics. Our measurements show that there is significant heterogeneity and lack of cooperation across peers participating in these systems.
Article
Efficiently determining the node that stores a data item in a distributed network is an important and challenging problem. This paper describes the motivation and design of the Chord system, a decentralized lookup service that stores key/value pairs for such networks. The Chord protocol takes as input an m-bit identifier (derived by hashing a higher-level application specific key), and returns the node that stores the value corresponding to that key. Each Chord node is identified by an m-bit identifier and each node stores the key identifiers in the system closest to the node's identifier. Each node maintains an m-entry routing table that allows it to look up keys efficiently. Results from theoretical analysis, simulations, and experiments show that Chord is incrementally scalable, with insertion and lookup costs scaling logarithmically with the number of Chord nodes.
Conference Paper
This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications. Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming. Each node in the Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Pastry takes into account network locality; it seeks to minimize the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Pastry is completely decentralized, scalable, and self-organizing; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on an emulated network of up to 100,000 nodes confirm Pastry’s scalability and efficiency, its ability to self-organize and adapt to node failures, and its good network locality properties.
Conference Paper
Consider a set of shared objects in a distributed network, where several copies of each object may exist at any given time. To ensure both fast access to the objects as well as efficient utilization of network resources, it is desirable that each access request be satisfied by a copy ``close'' to the requesting node. Unfortunately, it is not clear how to achieve this goal efficiently in a dynamic, distributed environment in which large numbers of objects are continuously being created, replicated, and destroyed. In this paper we design a simple randomized algorithm for accessing shared objects that tends to satisfy each access request with a nearby copy. The algorithm is based on a novel mechanism to maintain and distribute information about object locations, and requires only a small amount of additional memory at each node. We analyze our access scheme for a class of cost functions that captures the hierarchical nature of wide-area networks. We show that under the particular cost model considered (i) the expected cost of an individual access is asymptotically optimal, and (ii) if objects are sufficiently large, the memory used for objects dominates the additional memory used by our algorithm with high probability. We also address dynamic changes in both the network and the set of object copies.
Article
A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis and simulations show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.
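The single operation described above, mapping a key onto a node, can be sketched in Python as follows. The sketch assumes the successor rule on an m-bit identifier circle (a key maps to the first node whose identifier is equal to or follows it, wrapping around). This abstract does not state that rule, and the function name and global sorted list are simplifications for illustration. A real node would hold only logarithmic state rather than the full membership list.

```python
from bisect import bisect_left


def successor(node_ids: list[int], key: int, m: int) -> int:
    """Map `key` to the first node ID >= key on a 2**m identifier circle."""
    ring = sorted(node_ids)
    if not ring:
        raise ValueError("at least one node is required")
    key %= 2 ** m  # keep the key inside the m-bit identifier space
    index = bisect_left(ring, key)
    # Past the largest node ID, wrap around to the smallest one.
    return ring[index % len(ring)]


if __name__ == "__main__":
    nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # example IDs, m = 6
    for k in (10, 54, 57):
        print(f"key {k} -> node {successor(nodes, k, 6)}")
```

Storing a key/data item pair then amounts to sending it to `successor(nodes, key, m)` under these assumptions.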
Article
In today's chaotic network, data and services are mobile and replicated widely for availability, durability, and locality. Components within this infrastructure interact in rich and complex ways, greatly stressing traditional approaches to name service and routing. This paper explores an alternative to traditional approaches called Tapestry. Tapestry is an overlay location and routing infrastructure that provides location-independent routing of messages directly to the closest copy of an object or service using only point-to-point links and without centralized resources. The routing and directory information within this infrastructure is purely soft state and easily repaired. Tapestry is self-administering, fault-tolerant, and resilient under load. This paper presents the architecture and algorithms of Tapestry and explores their advantages through a number of experiments.
Article
This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing scheme for wide-area peer-to-peer applications. Pastry provides application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a wide range of peer-to-peer applications like global data storage, global data sharing, and naming. An insert operation in Pastry stores an object at a user-defined number of diverse nodes within the Pastry network. A lookup operation reliably retrieves a copy of the requested object if one exists. Moreover, a lookup is usually routed to the node nearest the client issuing the lookup (by some measure of proximity), among the nodes storing the requested object. Pastry is completely decentralized, scalable, and self-configuring; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on a simulated network of 100,000 nodes confirm Pastry's scalability, its ability to self-configure and adapt to node failures, and its good network locality properties.