Article

Authenticity and availability in PIPE networks

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

We describe a system, which we call a peer-to-peer information preservation and exchange (PIPE) network, for protecting digital data collections from failure. A significant challenge in such networks is ensuring that documents are replicated and accessible despite malicious sites which may delete data, refuse to serve data, or serve an altered version of the data. We enumerate the services of PIPE networks, discuss a threat model for malicious sites, and propose basic solutions for managing these malicious sites. The basic solutions are inefficient, but demonstrate that a secure system can be built. We also sketch ways to improve efficiency.
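As a rough illustration of the basic approach the abstract alludes to (the function names and the use of SHA-256 are our own assumptions, not details from the paper), comparing content hashes of copies held by independent sites lets an honest majority outvote sites that serve altered data:

```python
import hashlib
from collections import Counter

def digest(data: bytes) -> str:
    # Content hash used to compare replicas without shipping full documents.
    return hashlib.sha256(data).hexdigest()

def majority_copy(replicas: list[bytes]) -> bytes:
    # Group replicas by hash and return a copy from the largest group.
    # With k replicas, this tolerates fewer than k/2 altered copies.
    counts = Counter(digest(r) for r in replicas)
    winner, _ = counts.most_common(1)[0]
    return next(r for r in replicas if digest(r) == winner)

# One site serves an altered version; the honest majority prevails.
copies = [b"original document", b"original document", b"tampered!"]
assert majority_copy(copies) == b"original document"
```

This is the inefficient-but-secure flavor of solution the abstract mentions: it requires contacting many sites per read, which is why the paper also sketches ways to improve efficiency.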


... Thirdly, for the applications to run smoothly on the small devices we are aiming for, efficient management of local computing resources is a necessity. Those problems have been addressed in other research works, but mostly in an ad-hoc fashion [5], [12], [13]. When developing new applications, these issues have to be treated repeatedly as there exists no generic infrastructure which addresses all aforementioned problems. ...

... A frequently discussed topic is data availability. [5] argues that "a complete system must ensure that important data remain preserved even if the creator or publisher leaves the system" (p. 2). They developed a reliable preservation service built on a P2P architecture, on top of which digital library applications could be built. ...

... the background knowledge to the technical innovations of P2P computing. It also explores some technical issues associated with current P2P implementations. Of particular interest to us, the book mentions "remembering important information" as "a true sign of P2P intelligence" [15] (p. 145). A frequently discussed topic is data availability. [5] argues that "a complete system must ensure that important data remain preserved even if the creator or publisher leaves the system" (p. 2). They developed a reliable preservation service built on a P2P architecture, on top of which digital library applications could be built. Another strategy, the "dissemination tree", is used in [4 ...
Article
Full-text available
The design of ad-hoc, wireless, peer-to-peer applications for small mobile devices raises a number of challenges for the developer, with object synchronisation, network failure and device limitations being the most significant. In this paper, we introduce a framework for peer-to-peer application development that deals with those problems. In contrast to most current literature, we focus on small peer-to-peer networks for gaming applications. I. INTRODUCTION With small, mobile devices becoming more powerful, peer-to-peer applications on such devices are becoming increasingly popular. Full scalability, mobility and flexibility are desired features in many application domains, which can be achieved via mutual exchange of information and services over ad-hoc, wireless, peer-to-peer networks. Challenging problems arise in the development of such applications. Firstly, wireless, ad-hoc networks face problems such as stability, data integrity, routing, notification of joining and leaving peers, and, in case of peer failure, fault tolerance. In such networks, connections may be highly variable and thus unreliable, as devices may hop from online to offline unpredictably. Secondly, since the peers exist in a collaborative environment without central control, synchronisation of peers and distribution of resources become big issues. Thirdly, for the applications to run smoothly on the small devices we are aiming for, efficient management of local computing resources is a necessity. Those problems have been addressed in other research works, but mostly in an ad-hoc fashion (5), (12), (13). When developing new applications, these issues have to be treated repeatedly as there exists no generic infrastructure which addresses all aforementioned problems.
In this paper, we present a framework called "FRAGme2004" which we designed for developing collaborative mobile applications that achieve the features mentioned above by using a flexible peer-to-peer architecture. The FRAGme2004 framework has a three-layer architecture. The layers inter-communicate via interfaces to achieve clear separation. The bottom layer is the Infrastructure Layer; it consists of the basic building blocks that address the communication requirements. One layer higher is the Object Layer. An object is the smallest entity distributed among the peers; the information and data that need to be shared in the applications are encapsulated into objects, and this layer takes care of the delivery, synchronisation and life-cycle management of objects. The top layer is the Application Layer, where a clearly defined API is provided to developers for easy application development.
... Thirdly, for the applications to run smoothly on the small devices we are aiming for, efficient management of local computing resources is a necessity. Those problems have been addressed in other research works, but mostly in an ad-hoc fashion [5], [12], [13]. When developing new applications, these issues have to be treated repeatedly as there exists no generic infrastructure which addresses all aforementioned problems. ...

... A topic discussed often is data availability. Cooper et al. [5] argue that "a complete system must ensure that important data remain preserved even if the creator or publisher leaves the system" (p. 2). They developed a reliable preservation service built on a P2P architecture, on top of which digital library applications could be built. ...
Article
Full-text available
The design of ad-hoc, wireless, P2P applications for small mobile devices raises a number of challenges for the developer, with object synchronisation, network failure and device limitations being the most significant. In this paper, we introduce the FRAGme2004 framework for mobile P2P application development. To address data availability and stability problems, we devised an agent-based fostering mechanism to protect applications against data losses in cases of peers dropping out. In contrast to most current literature, we focus on small scale P2P applications, especially gaming applications.
Article
For information-sharing peer-to-peer (P2P) systems, document security is an important metric for evaluating performance, so this paper concentrates on optimizing document security in file-sharing P2P systems. Because document security in highly autonomous P2P systems depends mainly on two aspects, the security of the documents' carriers and the document-related mechanisms such as replica management, it cannot be improved through the peers' security alone and must instead rely on the document-related mechanisms. The paper first designs a query protocol sensitive to document security. Based on this protocol, the document-related mechanisms can be formally described as functions, and improving the system's document security can be transformed into mathematical analysis on the function space. From the results of this analysis, a set of replica-management algorithms is designed to improve document security. Theoretical analysis shows that these algorithms achieve optimal document security in the ideal case, and in realistic systems they obtain good results, approaching the optimal level. The algorithms are verified by extensive experiments.
... The use of graphs is quite standard in many scientific fields, including paths in networks for grids [7,18], peer-to-peer networks [5,9], neural and functional networks for artificial intelligence [8,10,11,14,17], automatic differentiation [15], interactive optimization [16], text classification [13], etc. ...
Article
This paper analyzes some graph issues by using the symbolic program Mathematica and its version for the Web, webMathematica. In particular, we consider the problem of graph coloring: the assignment of colors to the vertices/edges of the graph such that adjacent vertices/edges are colored differently. In addition, we address the problem of obtaining the tenacity of binomial trees with Mathematica. Finally, we describe briefly an example of the application of our software to a scheduling problem. (c) 2006 Elsevier B.V. All rights reserved.
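As a minimal illustration of the coloring problem this paper studies (this greedy heuristic is a standard textbook method, not the paper's Mathematica implementation; the names are illustrative):

```python
def greedy_coloring(adj: dict) -> dict:
    # Assign each vertex the smallest color index not already used
    # by one of its colored neighbors.
    colors = {}
    for v in adj:
        used = {colors[u] for u in adj[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

# A 4-cycle is 2-colorable; adjacent vertices must receive different colors.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
coloring = greedy_coloring(cycle)
assert all(coloring[v] != coloring[u] for v in cycle for u in cycle[v])
```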
Conference Paper
The design of ad-hoc, wireless, peer-to-peer applications for small mobile devices raises a number of challenges for the developer, with object synchronisation, network failure and device limitations being the most significant. In this paper, we introduce a framework for peer-to-peer application development that deals with those problems. In contrast to most current literature, we focus on small peer-to-peer networks for gaming applications.
Conference Paper
Full-text available
Future space facilities that could power our planet and expand our horizons will differ vastly from the satellites and space stations familiar today. Characterized by their immense size and the difficulties of human construction in orbit, future space facilities will be assembled in part by robots. This paper profiles Skyworker, a prototype assembly, inspection, and maintenance (AIM) robot designed for large mass payload transport and assembly tasks. Skyworker is an attached mobile manipulator (AMM) capable of walking and working on the structure it is building.
Conference Paper
The design of ad-hoc, wireless, peer-to-peer applications for small mobile devices raises a number of challenges for the developer, with object synchronisation, network failure, and device limitations being the most significant. In this paper, we introduce the FRAGme2004 framework for mobile P2P application development. To address data availability and stability problems, we have devised an agent-based fostering mechanism to protect applications against data losses in cases of peers dropping out. In contrast to most current literature, we focus on small scale P2P applications, especially gaming applications.
Article
Full-text available
This paper describes an application of Byzantine Agreement [DoSt82a, DoSt82e, LyFF82] to distributed transaction commit. We replace the second phase of one of the commit algorithms of [MoLi83] with Byzantine Agreement, providing certain trade-offs and advantages at the time of commit and providing speed advantages at the time of recovery from failure. The present work differs from that presented in [DoSt82b] by increasing the scope (handling a general tree of processes, and multi-cluster transactions) and by providing an explicit set of recovery algorithms. We also provide a model for classifying failures that allows comparisons to be made among various proposed distributed commit algorithms. The context for our work is the Highly Available Systems project at the IBM San Jose Research Laboratory [AAF-KM83].
Conference Paper
Full-text available
A secure timeline is a tamper-evident historic record of the states through which a system goes throughout its operational history. Secure timelines can help us reason about the temporal ordering of system states in a provable manner. We extend secure timelines to encompass multiple, mutually distrustful services, using timeline entanglement. Timeline entanglement associates disparate timelines maintained at independent systems, by linking undeniably the past of one timeline to the future of another. Timeline entanglement is a sound method to map a time step in the history of one service onto the timeline of another, and helps clients of entangled services to get persistent temporal proofs for services rendered that survive the demise or non-cooperation of the originating service. In this paper we present the design and implementation of Timeweave, our service development framework for timeline entanglement based on two novel disk-based authenticated data structures. We evaluate Timeweave's performance characteristics and show that it can be efficiently deployed in a loosely-coupled distributed system of a few hundred services with overhead of roughly 2-8% of the processing resources of a PC-grade system.
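A hash chain gives the flavor of a tamper-evident timeline (a deliberately simplified sketch; Timeweave's actual disk-based authenticated data structures are more elaborate, and all names here are illustrative):

```python
import hashlib

def extend(prev_hash: str, state: bytes) -> str:
    # Each timeline entry commits to the previous one, so rewriting
    # history invalidates every later entry.
    return hashlib.sha256(prev_hash.encode() + state).hexdigest()

def build_timeline(states):
    h = "genesis"
    heads = []
    for s in states:
        h = extend(h, s)
        heads.append(h)
    return heads

honest = build_timeline([b"s1", b"s2", b"s3"])
tampered = build_timeline([b"s1", b"sX", b"s3"])
assert honest[0] == tampered[0]   # the shared prefix agrees
assert honest[2] != tampered[2]   # tampering changes every later head
```

Entanglement, in this picture, amounts to feeding one service's current head into another service's chain, so each timeline vouches for the other's past.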
Article
Full-text available
A key escrow encryption system is an encryption system with a backup decryption capability that allows, under certain prescribed conditions, information access only to authorized persons. This article presents a taxonomy for key escrow encryption systems, providing a structure for describing and categorizing the escrow mechanisms of complete systems as well as various design options. A table is presented which applies the taxonomy to several key escrow products or proposals. The sidebar, 'Glossary and Sources,' identifies key terms, commercial products, and proposed systems.
Article
Full-text available
Increasing performance of CPUs and memories will be squandered if not matched by a similar performance increase in I/O. While the capacity of Single Large Expensive Disks (SLED) has grown rapidly, the performance improvement of SLED has been modest. Redundant Arrays of Inexpensive Disks (RAID), based on the magnetic disk technology developed for personal computers, offers an attractive alternative to SLED, promising improvements of an order of magnitude in performance, reliability, power consumption, and scalability. This paper introduces five levels of RAIDs, giving their relative cost/performance, and compares RAID to an IBM 3380 and a Fujitsu Super Eagle.
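The redundancy idea behind RAID can be sketched with single-parity reconstruction, as used in the parity-based RAID levels (a minimal illustration, not the paper's cost/performance model):

```python
def parity(blocks):
    # RAID-style parity: XOR of all data blocks, byte by byte.
    out = bytes(len(blocks[0]))
    for b in blocks:
        out = bytes(x ^ y for x, y in zip(out, b))
    return out

def reconstruct(surviving, p):
    # A single lost block is the XOR of the parity block with the survivors.
    return parity(surviving + [p])

data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
p = parity(data)
assert reconstruct([data[0], data[2]], p) == data[1]
```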
Article
Full-text available
Peer-to-peer file-sharing networks are currently receiving much attention as a means of sharing and distributing information. However, as recent experience shows, the anonymous, open nature of these networks offers an almost ideal environment for the spread of self-replicating inauthentic files.
Article
Full-text available
This paper describes the design and implementation of the Harp file system. Harp is a replicated Unix file system accessible via the VFS interface. It provides highly available and reliable storage for files and guarantees that file operations are executed atomically in spite of concurrency and failures. It uses a novel variation of the primary copy replication technique [1, 26, 27] that provides good performance because it allows us to trade disk accesses for network communication. In this method, client calls are directed to a single primary server, which communicates with other backup servers and waits for them to respond before replying to the client. The system masks failures by performing a failover algorithm in which an inaccessible server is removed from service. When a primary performs an operation, it must inform enough backups to guarantee that the effects of that operation will survive all subsequent failovers. Harp is one of the first implementations of a primary copy scheme that runs on conventional hardware. It has some novel features that allow it to perform well. The key performance issues are how to provide quick response for user operations and how to provide good system capacity (roughly, the number of operations the system can handle). Harp is intended to be used within a file service in a distributed network; in our current implementation, it is accessed via NFS. Preliminary performance results indicate that Harp provides equal or better response time and system capacity than an unreplicated implementation of NFS that uses Unix files directly.
Article
Full-text available
OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cached anywhere, anytime. Additionally, monitoring of usage patterns allows adaptation to regional outages and denial of service attacks; monitoring also enhances performance through pro-active movement of data. A prototype implementation is currently under development.
Article
Full-text available
This paper describes a new replication algorithm that is able to tolerate Byzantine faults. We believe that Byzantine-fault-tolerant algorithms will be increasingly important in the future because malicious attacks and software errors are increasingly common and can cause faulty nodes to exhibit arbitrary behavior. Whereas previous algorithms assumed a synchronous system or were too slow to be used in practice, the algorithm described in this paper is practical: it works in asynchronous environments like the Internet and incorporates several important optimizations that improve the response time of previous algorithms by more than an order of magnitude. We implemented a Byzantine-fault-tolerant NFS service using our algorithm and measured its performance. The results show that our service is only 3% slower than a standard unreplicated NFS.
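The replica-group sizing that underlies such algorithms can be sketched as follows (a minimal illustration of the standard 3f + 1 bound and f + 1 reply rule, not this paper's protocol; the function names are ours):

```python
def min_replicas(f: int) -> int:
    # A Byzantine-fault-tolerant replica group needs 3f + 1 members
    # to tolerate f arbitrarily faulty (Byzantine) replicas.
    return 3 * f + 1

def reply_is_trusted(matching_replies: int, f: int) -> bool:
    # A client can accept a result once f + 1 replicas agree on it:
    # at least one of those replicas must be correct.
    return matching_replies >= f + 1

assert min_replicas(1) == 4
assert reply_is_trusted(2, f=1)
assert not reply_is_trusted(1, f=1)
```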
Article
Full-text available
An Archival Intermemory solves the problem of highly survivable digital data storage in the spirit of the Internet. In this paper we describe a prototype implementation of Intermemory, including an overall system architecture and implementations of key system components. The result is a working Intermemory that tolerates up to 17 simultaneous node failures, and includes a Web gateway for browser-based access to data. Our work demonstrates the basic feasibility of Intermemory and represents significant progress towards a deployable system.
Article
Over the last 12 years the LOCKSS Program at Stanford has developed and deployed an open source, peer-to-peer system now comprising about 200 LOCKSS boxes in libraries around the world preserving a wide range of web-published content. Initially supported by NSF, and subsequently by the Mellon Foundation, Sun Microsystems and NDIIPP, the program has since 2004 been sustainable, funded by the libraries using it. The program won an ACM award for breakthrough research in fault and attack resistance in peer-to-peer systems. Since it was designed initially for e-journals, the system's design is unusual; it is driven primarily by copyright law. The design principles were: • Minimize changes to existing legal relationships such as subscription agreements. • Reinstate the purchase model of paper. Each library gets its own copy to keep and use for its own readers as long as it wants without fees. • Preserve the original, just what the publisher published, so that future readers will see all the intellectual content, including the full historical context. • Make access to the preserved content transparent to the reader.
Conference Paper
This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications. Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming. Each node in the Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Pastry takes into account network locality; it seeks to minimize the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Pastry is completely decentralized, scalable, and self-organizing; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on an emulated network of up to 100,000 nodes confirm Pastry's scalability and efficiency, its ability to self-organize and adapt to node failures, and its good network locality properties.
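The delivery invariant can be sketched as follows (a deliberately simplified model: real Pastry routes by shared digit prefixes over a circular identifier space, not by plain absolute distance over a global node list):

```python
def closest_node(node_ids, key):
    # Deliver a message for `key` to the live node whose identifier
    # is numerically closest to it.
    return min(node_ids, key=lambda n: abs(n - key))

nodes = [10, 80, 150, 220]
assert closest_node(nodes, 95) == 80
assert closest_node(nodes, 200) == 220
```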
Conference Paper
A transaction is an atomic update which takes a data base from a consistent state to another consistent state. The Transaction Monitoring Facility (TMF), is a component of the ENCOMPASS distributed data management system, which runs on the Tandem [TM] computer system. TMF provides continuous, fault-tolerant transaction processing in a decentralized, distributed environment. Recovery from failures is transparent to user programs and does not require system halt or restart. Recovery from a failure which directly affects active transactions, such as the failure of a participating processor or the loss of communications between participating network nodes, is accomplished by means of the backout and restart of affected transactions. The implementation utilizes distributed audit trails of data base activity and a decentralized transaction concurrency control mechanism.
Conference Paper
We describe the motivation for moving policy enfor cement for access control down to the digital object level. The reas ons for this include handling of item-specific behaviors, adapting to evolution of d igital objects, and permitting objects to move among repositories and portable dev ices. We then describe our experiments that integrate the Fedora architecture for digital objects and reposi- tories and the PoET implementation of security auto mata to effect such object- centric policy enforcement.
Article
In this paper, we seek to answer a simple question: "How prevalent are denial-of-service attacks in the Internet today?". Our motivation is to understand quantitatively the nature of the current threat as well as to enable longer-term analyses of trends and recurring patterns of attacks. We present a new technique, called "backscatter analysis", that provides an estimate of worldwide denial-of-service activity. We use this approach on three week-long datasets to assess the number, duration and focus of attacks, and to characterize their behavior. During this period, we observe more than 12,000 attacks against more than 5,000 distinct targets, ranging from well known e-commerce companies such as Amazon and Hotmail to small foreign ISPs and dial-up connections. We believe that our work is the only publicly available data quantifying denial-of-service activity in the Internet.
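The scaling step at the heart of backscatter analysis can be sketched as follows (a simplified model assuming attack packets carry uniformly random spoofed 32-bit source addresses; the function name is ours):

```python
def estimate_attack_rate(observed_pkts: float, monitored_addrs: int) -> float:
    # Victims of spoofed-source floods send responses to random 32-bit
    # addresses, so a monitor covering a fraction of the address space
    # sees that same fraction of the backscatter. Scale up to estimate
    # the total response rate.
    return observed_pkts * (2**32 / monitored_addrs)

# A /8 network monitors 1/256 of the IPv4 address space.
assert estimate_attack_rate(100, 2**24) == 100 * 256
```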
Conference Paper
The effectiveness of reputation systems for peer-to-peer resource-sharing networks is largely dependent on the reliability of the identities used by peers in the network. Much debate has centered around how closely one's pseudo-identity in the network should be tied to their real-world identity, and how that identity is protected from malicious spoofing. We investigate the cost in efficiency of two solutions to the identity problem for peer-to-peer reputation systems. Our results show that, using some simple mechanisms, reputation systems can provide a factor of 4 to 20 improvement in performance over no reputation system, depending on the identity model used.
Article
The development of public-key cryptography is described, and its principles are elucidated. The discussion covers exponential key exchange, the trap-door knapsack public-key cryptosystem, the Rivest-Shamir-Adleman (RSA) system, and the breaking of the knapsack cryptosystem. Early responses to public-key systems and the problem of key management are examined. Applications and implementations are described. Significant development in multiplying, factoring, and finding prime numbers which have resulted from public-key research are sketched. Directions in public-key research are discussed
Conference Paper
An Archival Repository reliably stores digital objects for long periods of time (decades or centuries). The archival nature of the system requires new techniques for storing, indexing, and replicating digital objects. In this paper we discuss the specialized indexing needs of a write-once archive. We also present a reliability algorithm for effectively replicating sets of related objects. We describe an administrative user interface and a data import utility for archival repositories. Finally, we discuss and evaluate a prototype repository we have built, the Stanford Archival Vault, SAV.
Article
Reliable computer systems must handle malfunctioning components that give conflicting information to different parts of the system. This situation can be expressed abstractly in terms of a group of generals of the Byzantine army camped with their troops around an enemy city. Communicating only by messenger, the generals must agree upon a common battle plan. However, one or more of them may be traitors who will try to confuse the others. The problem is to find an algorithm to ensure that the loyal generals will reach agreement. It is shown that, using only oral messages, this problem is solvable if and only if more than two-thirds of the generals are loyal; so a single traitor can confound two loyal generals. With unforgeable written messages, the problem is solvable for any number of generals and possible traitors. Applications of the solutions to reliable computer systems are then discussed.
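The paper's central bound can be stated directly (a minimal sketch of the oral-messages condition, not the agreement algorithm itself; the function name is ours):

```python
def oral_agreement_possible(n: int, traitors: int) -> bool:
    # With only oral (unsigned) messages, Byzantine agreement is
    # achievable iff more than two-thirds of the generals are loyal,
    # i.e. n > 3t.
    return n > 3 * traitors

assert not oral_agreement_possible(3, 1)  # one traitor confounds two loyal generals
assert oral_agreement_possible(4, 1)
```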
Article
We present a design for a system of anonymous storage which resists the attempts of powerful adversaries to find or destroy any stored data. We enumerate distinct notions of anonymity for each party in the system, and suggest a way to classify anonymous systems based on the kinds of anonymity provided. Our design ensures the availability of each document for a publisher-specified lifetime. A reputation system provides server accountability by limiting the damage caused from misbehaving servers. We identify attacks and defenses against anonymous storage services, and close with a list of problems which are currently unsolved.
Article
Efficiently determining the node that stores a data item in a distributed network is an important and challenging problem. This paper describes the motivation and design of the Chord system, a decentralized lookup service that stores key/value pairs for such networks. The Chord protocol takes as input an m-bit identifier (derived by hashing a higher-level application specific key), and returns the node that stores the value corresponding to that key. Each Chord node is identified by an m-bit identifier and each node stores the key identifiers in the system closest to the node's identifier. Each node maintains an m-entry routing table that allows it to look up keys efficiently. Results from theoretical analysis, simulations, and experiments show that Chord is incrementally scalable, with insertion and lookup costs scaling logarithmically with the number of Chord nodes.
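The successor-based mapping can be sketched on a toy identifier ring (a simplified model: the ring size M and the use of SHA-256 here are our assumptions, and real Chord resolves lookups via per-node finger tables rather than a global sorted node list):

```python
import hashlib

M = 8  # identifier bits (toy-sized ring)

def node_for_key(node_ids, key: str) -> int:
    # Hash the key onto the ring, then pick its successor: the first
    # node id at or after it, wrapping around at 2**M.
    kid = int(hashlib.sha256(key.encode()).hexdigest(), 16) % (2**M)
    candidates = sorted(node_ids)
    for n in candidates:
        if n >= kid:
            return n
    return candidates[0]  # wrap around the ring

nodes = [20, 90, 170, 240]
owner = node_for_key(nodes, "some-file")
assert owner in nodes
```

In the real protocol each node's routing table lets a lookup reach the successor in O(log N) hops, which is the logarithmic scaling the abstract reports.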
Article
Digital archives can best survive failures if they have made several copies of their collections at remote sites. In this paper, we discuss how autonomous sites can cooperate to provide preservation by trading data. We examine the decisions that an archive must make when forming trading networks, such as the amount of storage space to provide and the best number of partner sites. We also deal with the fact that some sites may be more reliable than others. Experimental results from a data trading simulator illustrate which policies are most reliable. Our techniques focus on preserving the "bits" of digital collections; other services that focus on other archiving concerns (such as preserving meaningful metadata) can be built on top of the system we describe here.
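The benefit of trading copies to remote sites can be sketched with a simple reliability calculation (an illustrative model assuming independent site failures, not the paper's trading simulator; the function name is ours):

```python
def loss_probability(site_failure_probs):
    # A collection is lost only if every site holding a copy fails;
    # with independent failures the probabilities multiply.
    p = 1.0
    for q in site_failure_probs:
        p *= q
    return p

# Trading with three fairly reliable partners (each 10% likely to fail)
# beats holding copies at one reliable and one unreliable site.
assert loss_probability([0.1, 0.1, 0.1]) < loss_probability([0.1, 0.3])
```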
V. Reich, D. Rosenthal, LOCKSS (lots of copies keep stuff safe), in: Proceedings of the Preservation 2000, November 2000.
B. Liskov, et al., Replication in the Harp file system, in: Proceedings of the 13th ACM Symposium on Operating Systems Principles (SOSP), 1991.
J. Kubiatowicz, et al., OceanStore: an architecture for global-scale persistent storage, in: Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2000.