Article

Freenet: A Distributed Anonymous Information Storage and Retrieval System


Abstract

We describe Freenet, an adaptive peer-to-peer network application that permits the publication, replication, and retrieval of data while protecting the anonymity of both authors and readers. Freenet operates as a network of identical nodes that collectively pool their storage space to store data files and cooperate to route requests to the most likely physical location of data. No broadcast search or centralized location index is employed. Files are referred to in a location-independent manner, and are dynamically replicated in locations near requestors and deleted from locations where there is no interest. It is infeasible to discover the true origin or destination of a file passing through the network, and difficult for a node operator to determine or be held responsible for the actual physical contents of her own node.
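The "location-independent manner" in which files are referred to can be illustrated with a content-derived key: the identifier is computed from the file bytes themselves, so any node can check that returned data matches the requested key, and the key reveals nothing about where the file is stored. The short Python sketch below only illustrates this idea; the function names are invented for the example, and Freenet's actual key types are more elaborate.

import hashlib

def content_key(data: bytes) -> str:
    # Derive a location-independent identifier from the file contents alone.
    return hashlib.sha256(data).hexdigest()

def verify(key: str, data: bytes) -> bool:
    # Any node holding the bytes can prove they match the requested key.
    return content_key(data) == key

# Only the key circulates in requests; it says nothing about which node stores the file.
document = b"example file contents"
key = content_key(document)
assert verify(key, document)

Because the key is bound to the content, replicas created near requestors stay verifiable no matter which node ends up serving them.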


... In order to provide free flow and retrieval of information over the Internet, some decentralized data retrieval systems [19,23,24,30,75] have been introduced to make Internet censorship more difficult. These decentralized data-retrieval systems are indeed more trustworthy than traditional centralized search systems, since users can exchange and retrieve information without depending on centralized servers. ...
... On the other hand, the unstructured approach [19,23,24,30,60,75] is typically based on gossiping, which uses randomization and requires the nodes to find each other by exchanging messages over existing links. Some unstructured systems like [24] or the random-walk strategy [23] use flooding of requests to find or replicate queries and data; [19] requires nodes that successfully respond to requests to store more metadata and to answer more requests. Quasar [75] uses a rendezvous-less event routing infrastructure to route messages directly to nearby group members using local gradients of aggregated vectors. ...
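To contrast with flooding, the random-walk strategy mentioned above can be pictured as forwarding a query along a single randomly chosen link per hop until it is answered or its time-to-live runs out. The following Python sketch is only illustrative; the Peer class and random_walk_query function are invented names, not code from any of the cited systems.

import random

class Peer:
    def __init__(self, name):
        self.name = name
        self.data = {}         # key -> value stored locally
        self.neighbours = []   # links maintained by the gossip layer

def random_walk_query(start, key, ttl):
    # Follow one random link per hop instead of flooding every link.
    current, path = start, [start.name]
    for _ in range(ttl):
        if key in current.data:
            return current.data[key], path
        if not current.neighbours:
            break
        current = random.choice(current.neighbours)
        path.append(current.name)
    return current.data.get(key), path   # final hop is checked before giving up

Compared with flooding, such a walk sends far fewer messages per query, at the cost of a lower probability of finding rare items within the time-to-live.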
Article
Full-text available
The challenge of a mobile ad-hoc network (MANET) or hybrid wireless network (HWN) is that every node must rely on others to forward its messages, but selfish nodes might refuse to forward those messages or move to other subnetworks at any time, thus degrading network speed and retrieval efficiency. This paper presents a system called “cRedit-based and Reputation Retrieval system” that employs a distributed message transmission protocol to avoid censorship problems, forwarding and reliable routing algorithms to ensure reliable message routing, reputation and credit algorithms to encourage greater cooperation incentives, location and refresh algorithms to deal with nodes’ limited view even in high-churn networks, and a protection algorithm to address the problem of selfish nodes. A series of experiments confirmed the system’s effectiveness in achieving superior network scalability, high retrieval rates, low transmission costs, and mobility resilience in hybrid wireless networks.
... Moreover, there have been significant works on designing file-sharing systems that allow people to anonymously store, publish, and retrieve data (see survey [5] for more information). Peer-to-peer storage and retrieval systems such as Freenet [66], FreeHaven [67] and, later, GnuNet [68], [69] provide anonymous persistent data stores. They use multiple hops to retrieve data associated with a key in a distributed data store. ...
... The adversary uses the patterns of packet inter-arrival times to link network participants' communication patterns [66]. ...
Article
Full-text available
Traffic analysis attacks can counteract end-to-end encryption and use leaked communication metadata to reveal information about communicating parties. With an ever-increasing amount of traffic by an ever-increasing number of networked devices, communication privacy is undermined. Therefore, Anonymous Communication Systems (ACSs) are proposed to hide the relationship between transmitted messages and their senders and receivers, providing privacy properties known as anonymity, unlinkability, and unobservability. This article aims to review research in the ACSs field, focusing on Dining Cryptographers Networks (DCNs). The DCN-based methods are information-theoretically secure and thus provide unconditional unobservability guarantees. Their adoption for anonymous communications was initially hindered because their computational and communication overhead was deemed significant at that time, and scalability problems occurred. However, more recent contributions, such as the possibility to transmit messages of arbitrary length, efficient disruption handling and overhead improvements, have made the integration of modern DCN-based methods more realistic. In addition, the literature does not follow a common definition for privacy properties, making it hard to compare the approaches’ gains. Therefore, this survey contributes to introducing a harmonized terminology for ACS privacy properties, then presents an overview of the underlying principles of ACSs, in particular, DCN-based methods, and finally, investigates their alignment with the new harmonized privacy terminologies. Previous surveys did not cover the most recent research advances in the ACS area or focus on DCN-based methods. Our comprehensive investigation closes this gap by providing visual maps to highlight privacy properties and discussing the most promising ideas for making DCNs applicable in resource-constrained environments.
... Moreover, there have been significant works on designing file-sharing systems that allow people to anonymously store, publish, and retrieve data (see survey [50] for more information). Peer-to-peer storage and retrieval systems such as Freenet [30], FreeHaven [46] and, later, GnuNet [17,68] provide anonymous persistent data stores. They use multiple hops to retrieve data associated with a key in a distributed data store. ...
... Then, the attacker may be able to correlate incoming and outgoing packets through timing analysis [138,50,45]. The adversary uses the patterns of packet inter-arrival times to link network participants' communication patterns [30]. ...
Preprint
Full-text available
Traffic analysis attacks can counteract end-to-end encryption and use the leaked communication metadata to reveal information about the communicating parties. With an ever-increasing amount of traffic from an ever-increasing number of networked devices, this undermines communication privacy and goes against the uptrend of limiting personal data collection. Therefore, Anonymous Communication Systems (ACSs) are proposed to protect users' privacy by hiding the relationship between transmitted messages and their senders and receivers, providing privacy properties known as anonymity, unlinkability and unobservability. This article aims to review the research in the ACSs field based on its applicability in real-world scenarios. First, we present an overview of the underlying principles of ACSs and different methods. Then, we focus on Dining Cryptographers Networks (DCNs) and the methods for anonymous communication that are based on them. We investigate the alignment of ACSs with the privacy terminologies. Most notably, the DCN-based methods are information-theoretically secure and thus provide unconditional unobservability guarantees. Their adoption for anonymous communications was initially hindered because their computational and communication overhead was deemed too significant at the time and scalability problems occurred. However, several more recent contributions, such as the possibility to transmit arbitrary-length messages, efficient handling of disruptors, and improvements in overhead, have made the integration of modern DCN-based methods more realistic. Previous surveys on ACSs did not cover the most recent research advances in this area or did not focus on DCN-based methods. This comprehensive investigation of modern ACSs and DCN-based systems closes this gap.
... In Freenet [43], queries are not flooded, but routed in a way reminiscent of a depth-first search. Each node receiving a query checks if it has stored the queried data; if it does, the data is returned back along the query path to the original sender; if not, the query is forwarded to the adjacent node that is the most likely to have the data. ...
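The depth-first routing sketched here can be made concrete as follows: a node answers from its own store if it can, otherwise it tries neighbours in order of how close the keys they are known for lie to the requested key, backtracks to the next candidate when a branch fails, and caches the data it relays back. The Python below is a simplified illustration under assumed data structures (the store and routing fields, the key_distance metric, and the hops_to_live counter are inventions for the example), not Freenet's actual implementation.

import hashlib

def key_distance(a: str, b: str) -> int:
    # Illustrative closeness metric: absolute difference of hashed keys.
    h = lambda s: int(hashlib.sha1(s.encode()).hexdigest(), 16)
    return abs(h(a) - h(b))

class Node:
    def __init__(self, name: str):
        self.name = name
        self.store = {}    # key -> data held locally
        self.routing = {}  # neighbour Node -> a key that neighbour served before

    def handle_request(self, key: str, hops_to_live: int, visited=None):
        # Depth-first lookup: closest known neighbour first, backtrack on failure.
        visited = visited if visited is not None else set()
        visited.add(self.name)
        if key in self.store:
            return self.store[key]          # local hit: reply travels back along the path
        if hops_to_live <= 0:
            return None
        candidates = sorted(self.routing.items(),
                            key=lambda item: key_distance(key, item[1]))
        for neighbour, _ in candidates:
            if neighbour.name in visited:
                continue
            data = neighbour.handle_request(key, hops_to_live - 1, visited)
            if data is not None:
                self.store[key] = data      # cache on the return path (replication)
                self.routing[neighbour] = key
                return data
        return None                         # every branch failed; the caller backtracks

Caching on the return path is also what makes popular data drift toward the nodes that request it, as described in the paper's abstract.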
... It is straightforward to convert Algorithm 4 for use with an event queue: instead of checking, at each time step, each node that might be involved in a failing certificate, like on lines 4 through 18, a node just has to take all certificates at the top of the priority queue that are failing, and add them to Fail, the set of failing certificates. [A fragment of the pseudocode listing, covering the handlers for req, reply, remove, and add messages, is omitted here.] ...
Thesis
With the development of information technology, computing devices become more available and more connected, and are increasingly used in contexts where mobility plays an important role. Video games enable huge numbers of players to interact in some virtual world, and networks of connected vehicles are envisioned to improve road safety. In such distributed contexts, one cannot assume that each entity can share its position with all the participants. This thesis presents different methods to allow moving entities, which we call nodes and which move in some metric space, to answer queries related to their distances, with guarantees on the accuracy of the approximations. First, we propose a synchronous distributed algorithm that allows two connected nodes to estimate the distance between them, with a guarantee on the relative error. It is proven that when applied to nodes that follow random movements, the algorithm is optimal in terms of the number of exchanged messages. Then, queries returning, for a given node, the set of nodes that are at a distance smaller than a given distance r are studied. We describe a synchronous distributed algorithm for positions on a line that ensures each node is connected with all nodes at a distance r, where r is a fixed value given as input to the algorithm; the answer to the query is thus obtained in O(1). The algorithm needs O(1) communication rounds per node movement, and the local memory cost is of the same order as the worst-case largest size of the set of nodes returned by a query. After that, two algorithms are given for positions in any metric space of constant doubling dimension, where r may vary and is now a parameter of the query. First, a centralized algorithm with a computational cost of O(log Φ) operations per movement of a node (where Φ is the ratio between the maximal and minimal distance between two nodes) and with O(n) memory usage (where n is the number of nodes), and then a distributed algorithm that needs O(1) communication rounds per movement of the nodes, and that uses O(n) memory for a node at worst, but O(n) memory in total.
... On the other hand, object storage software systems which leverage Peer-To-Peer (P2P) mechanisms, can cope better with the proposed edge storage requirements. Clarke et al. [3] propose an epidemic approach for decentralized storage systems which offer data publication, replication and retrieval. However, the proposed approach presents limitations regarding resource discovery. ...
... A PVC is bound to a Persistent Volume (PV) object that is provisioned by the chosen storage provider. This interaction is standardised through the Container Storage Interface (CSI). ...
... Object storage software systems such as those mentioned above, leverage Peer-To-Peer (P2P) mechanisms and are able to cope better with the proposed edge storage requirements. Clarke et al. [25] propose an epidemic approach for decentralized storage systems which offer data publication, replication and retrieval. However, the proposed approach presents limitations regarding resource discovery. ...
... The method can be used iteratively to increase the accuracy of the resulting mesh. A more detailed description of the method can be found in [25]. Ball pivoting by F. Bernardini et al. [65] and Poisson surface reconstruction by M. Kazhdan et al. [66] are other meshing algorithms that need to be mentioned here. ...
Article
Full-text available
In recent years, the emergence of XR (eXtended Reality) applications, including Holography, Augmented, Virtual and Mixed Reality, has resulted in the creation of rather demanding requirements for Quality of Experience (QoE) and Quality of Service (QoS). In order to cope with requirements such as ultra-low latency and increased bandwidth, it is of paramount importance to leverage certain technological paradigms. The purpose of this paper is to identify these QoE and QoS requirements and then to provide an extensive survey on technologies that are able to facilitate the rather demanding requirements of Cloud-based XR Services. To that end, a wide range of enabling technologies are explored. These technologies include e.g. the ETSI (European Telecommunications Standards Institute) Multi-Access Edge Computing (MEC), Edge Storage, the ETSI Management and Orchestration (MANO), the ETSI Zero touch network & Service Management (ZSM), Deterministic Networking, the 3GPP (3rd Generation Partnership Project) Media Streaming, MPEG’s (Moving Picture Experts Group) Mixed and Augmented Reality standard, the Omnidirectional MediA Format (OMAF), ETSI’s Augmented Reality Framework etc.
... There is no absolute standard to select one assignment over another; it rather depends on the priorities. [Fragment of Algorithm 1: For each pk, let alreadyFined[id][pk] = True; 10: Distribute reward vR to witness; 11: Distribute γ-compensation to usr; 12: On new (id, e type, evidence): 13: Charge vA from accuser msg.sender; 14: AccusationVal(id, e type, evidence); 15: On new (pkList, requests): 16: Lock vS from msg.sender for each pk ∈ pkList; 17: Issue a unique identifier id for the request; 18: Store requests, pkList in Journal[id]; 19: for pk ∈ pkList do 20: alreadyFined[id][pk] = false.] Algorithm 1 presents essential functions for implementing the game and payment rules. In Line 15, a user submits queries to a list of servers. ...
... Clarke et al. [12] circumvent the collusion problem in private data retrieval through anonymization. As a result, a privacy breach is not bound to an identified entity. ...
Preprint
Full-text available
For distributed protocols involving many servers, assuming that they do not collude with each other makes some secrecy problems solvable and reduces overheads and computational hardness assumptions in others. While the non-collusion assumption is pervasive among privacy-preserving systems, it remains highly susceptible to covert, undetectable collusion among computing parties. This work stems from an observation that if the number of available computing parties is much higher than the number of parties required to perform a secure computation, collusion attempts could be deterred. We focus on the standard problem of multi-server private information retrieval (PIR), which inherently assumes that servers do not collude. For PIR application scenarios, such as those for blockchain light clients, where the available servers are plentiful, a single server's deviating action is not tremendously beneficial to itself. We can make deviations undesirable through small amounts of rewards and penalties, thus raising the bar for collusion significantly. For any given multi-server 1-private PIR (i.e. the base PIR scheme is constructed assuming no pairwise collusion), we provide a collusion mitigation mechanism. We first define a two-stage sequential game that captures how rational servers interact with each other during collusion, then determine the payment rules such that the game realizes the unique sequential equilibrium: a non-collusion outcome. We also offer privacy protection for an extended period beyond the time the query executions happen, and guarantee user compensation in case of a reported privacy breach. Overall, we conjecture the incentive structure for collusion mitigation to be functional towards relaxing the strong non-collusion assumptions across a variety of multi-party computation tasks.
... Freenet [83] is a file sharing application (text, sounds, data, etc.) that also protects the anonymity of authors and readers. Freenet nodes contain three types of information: keys (which are similar to Web URLs), addresses of other Freenet nodes that are also likely to know similar keys, and possibly data corresponding to these keys. ...
... This model allows the system to maintain the following three properties [67]: Convergence: when the same set of operations is executed on all sites, they will all have the same state; Causality: if an operation O is executed before another operation O', the same execution order is respected on all sites; Intent: for any operation O, the effects of the execution of O on all sites are the same as the intentions of O, and the effect of the execution of O does not change the effects of the independent operations. Other solutions [83,93,99] also rely on mechanisms for caching relevant data according to the meta-data of peer requests. Replacement (or deletion) policies identify which data to move to persistent storage or permanently delete for proper cache management. Using metrics such as complexity in time and space and the traffic effect on the network architecture, existing solutions are compared by analyzing their performance in relation to their own properties. ...
Thesis
Access to the Web of Data is nowadays of real interest for research, mainly in the sense that the clients consuming or processing this data are more and more numerous and have various specificities (mobility, Internet, autonomy, storage, etc.). Tools such as Web applications, search engines, e-learning platforms, etc., exploit the Web of Data to offer end-users services that contribute to the improvement of daily activities. In this context, we are working on Web of Data access, considering constraints such as client mobility and intermittent availability of the Internet connection. We are interested in mobility as our work is oriented towards end-users with mobile devices such as smartphones, tablets, laptops, etc. The intermittency of the Internet connection refers herein to scenarios of unavailability of national or international links that make remote data sources inaccessible. We target a scenario where users form a peer-to-peer network such that anyone can generate information and make it available to everyone else on the network. Thus, we survey and compare several solutions (models, architectures, etc.) dedicated to Web of Data access by mobile contributors, discussed in relation to the underlying network architectures and data models considered. We present a conceptual study of peer-to-peer solutions based on gossip protocols dedicated to designing connected overlay networks, and present a detailed analysis of data replication systems whose general objective is to ensure a system's local data availability. On the basis of this work, we propose an architecture adapted to constrained environments that allows mobile contributors to share an RDF dataset locally, via a browser network. The architecture consists of three levels: single peers, super peers and remote sources. Two main axes are considered for the implementation of this architecture: first, the construction and maintenance of connectivity ensured by the gossip protocol, and second, the high availability of data ensured by a replication mechanism. Our approach has the particularity of considering the location of each participant's neighbours to increase the search perimeter, and of integrating super-peers on which the data graph is replicated, improving data availability. We finally carried out an experimental evaluation of our architecture through extensive simulation configured to capture key aspects of our motivating scenario of supporting data exchange between the participants of a local event.
... Most consistency maintenance methods update files by relying on a structure [9], [10], [11] or on message spreading [12], [13]. Though these methods can generally be applied to all file replication methods, they cannot be exploited to their full potential without considering time-varying and dynamic replica nodes. ...
... In these methods, updates are not guaranteed to be propagated to each replica, and redundant messages generate high overhead. FreeNet [13] replicates a file along the path from the file requester to the target node, and it routes an update to other nodes based on key closeness. ...
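A rough sketch of the update routing described above: an update for a key is pushed only to the known nodes whose identifiers are closest to that key, on the assumption that these are the nodes most likely to hold replicas. The closeness metric, the fan-out value, and the function names in this Python fragment are illustrative assumptions rather than FreeNet's real mechanism.

import hashlib

def closeness(node_id: str, file_key: str) -> int:
    # Illustrative distance between a node identifier and a file key.
    h = lambda s: int(hashlib.sha1(s.encode()).hexdigest(), 16)
    return abs(h(node_id) - h(file_key))

def closest_nodes(known_nodes, file_key, fanout=3):
    # Pick the nodes assumed most likely to hold replicas of the file.
    return sorted(known_nodes, key=lambda n: closeness(n, file_key))[:fanout]

def push_update(known_nodes, file_key, new_version, send):
    # Route the update toward key-closest nodes instead of flooding every peer.
    for node in closest_nodes(known_nodes, file_key):
        send(node, file_key, new_version)

# Example: record the outgoing update messages instead of actually sending them.
outbox = []
push_update(["nodeA", "nodeB", "nodeC", "nodeD"], "file-123", b"version 2",
            lambda node, key, version: outbox.append((node, key, version)))

Restricting propagation to key-closest nodes keeps update traffic low, but, as the surrounding text notes, it does not guarantee that every replica is reached.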
Article
In P2P systems, file replication and replica consistency maintenance are the most widely used techniques for better system performance. Most file replication methods replicate a file in all nodes, at the two ends of a client-server query path, or close to the server, leading to low replica utilization, unnecessary replicas, and hence extra consistency maintenance overhead. Most consistency maintenance methods depend on either message spreading or a structure for update message propagation without considering file replication dynamism, leading to inefficient file updates and outdated file responses. This paper presents an Efficient and Adaptive file Replication and consistency Maintenance (EARM) scheme that combines file replication and consistency maintenance mechanisms and achieves higher query efficiency in file replication and consistency maintenance at a low cost. Instead of passively accepting file replicas and updates, each node determines file replication and update polling by adapting to time-varying file query and update rates. Simulation results demonstrate the effectiveness of EARM in comparison with other approaches.
... The second wave of decentralized solutions was achieved through fully distributed technology; i.e., P2P networks without classical servers but instead using ordinary computers (different from classical cluster/grid parallel computing). There have been multiple attempts to offer P2P web services [24,25], such as Freenet for censorship-resistant communication [26], although broad adoption has mostly been limited to the field of file sharing; e.g., eDonkey, BitTorrent [27]. ...
Article
Full-text available
The current state of the web, which is dominated by centralized cloud services, raises several concerns regarding different aspects such as governance, privacy, surveillance, and security. A way to address these issues is to decentralize the platforms by adopting new distributed technologies, such as IPFS and Blockchain, which follow a full peer-to-peer model. This work proposes a set of guidelines to design decentralized systems, taking the different trade-offs these technologies face with regard to their consistency requirements into consideration. These guidelines are then illustrated with the design of a decentralized questions and answers system. This system serves to illustrate a framework to create decentralized services and applications that uses IPFS and Blockchain technologies and incorporates the discussion and guidelines of the paper, providing solutions for data access, data provenance, and data discovery. Thus, this work proposes a framework to assist in the design of new decentralized systems, proposing a set of guidelines to choose the appropriate technologies depending on the relevant requirements; e.g., considering if Blockchain technology may be required or IPFS might be sufficient.
... There have been several works targeting user anonymity in the field of peer-to-peer content distribution networks, such as Freenet [193] and Free Haven [194]. While Freenet allows data owners to encrypt data with their own names, Free Haven does not provide such an encryption mechanism to protect data confidentiality against storage hosts. ...
Thesis
Cloud-based data storage and sharing services have proven successful over the last decades. The underlying model spares users from spending heavily on hardware to store data while still being able to access and share data anywhere and whenever they desire. In this context, security is vital to protecting users and their resources. Regarding users, they need to be securely authenticated to prove their eligibility to access resources. As for user privacy, showing credentials enables the service provider to detect sharing-related people or build a profile for each. Regarding outsourced data, due to the complexity of deploying effective key management in such services, data is often not encrypted by users but by service providers, which enables the providers to read users' data. In this thesis, we make a set of contributions which address these issues. First, we design a password-based authenticated key exchange protocol to establish a secure channel between users and service providers over an insecure environment. Second, we construct a privacy-enhancing decentralized public key infrastructure which allows building secure authentication protocols while preserving user privacy. Third, we design two revocable ciphertext-policy attribute-based encryption schemes. These provide effective key management systems to help a data owner encrypt data before outsourcing it while still retaining the capacity to securely share it with others. Fourth, we build a decentralized data sharing platform by leveraging blockchain technology and the IPFS network. The platform aims at providing high data availability, data confidentiality, secure access control, and user privacy.
... Another could be something akin to Freenet [111]. Peers could allocate a certain amount of their unused storage space to be used to automatically download, cache, and rehost shards of other datasets. ...
Preprint
Full-text available
The most pressing problems in science are neither empirical nor theoretical, but infrastructural. Scientific practice is defined by coproductive, mutually reinforcing infrastructural deficits and incentive systems that everywhere constrain and contort our art of curiosity in service of profit and prestige. Our infrastructural problems are not unique to science, but reflective of the broader logic of digital enclosure where platformatized control of information production and extraction fuels some of the largest corporations in the world. I have taken lessons learned from decades of intertwined digital cultures within and beyond academia like wikis, pirates, and librarians in order to draft a path towards more liberatory infrastructures for both science and society. Based on a system of peer-to-peer linked data, I sketch interoperable systems for shared data, tools, and knowledge that map onto three domains of platform capture: storage, computation and communication. The challenge of infrastructure is not solely technical, but also social and cultural, and so I attempt to ground a practical development blueprint in an ethics for organizing and maintaining it. I intend this draft as a rallying call for organization, to be revised with the input of collaborators and through the challenges posed by its implementation. I argue that a more liberatory future for science is neither utopian nor impractical -- the truly impractical choice is to continue to organize science as prestige fiefdoms resting on a pyramid scheme of underpaid labor, playing out the clock as every part of our work is swallowed whole by circling information conglomerates. It was arguably scientists looking for a better way to communicate that created something as radical as the internet in the first place, and I believe we can do it again.
... IPFS also strives to be censorship resistant. Approaches such as Freenet [13] and Wuala [41] have similar goals. These work by storing encrypted content across an arbitrary subset of peers. ...
Preprint
Full-text available
Recent years have witnessed growing consolidation of web operations. For example, the majority of web traffic now originates from a few organizations, and even micro-websites often choose to host on large pre-existing cloud infrastructures. In response to this, the "Decentralized Web" attempts to distribute ownership and operation of web services more evenly. This paper describes the design and implementation of the largest and most widely used Decentralized Web platform - the InterPlanetary File System (IPFS) - an open-source, content-addressable peer-to-peer network that provides distributed data storage and delivery. IPFS has millions of daily content retrievals and already underpins dozens of third-party applications. This paper evaluates the performance of IPFS by introducing a set of measurement methodologies that allow us to uncover the characteristics of peers in the IPFS network. We reveal presence in more than 2700 Autonomous Systems and 152 countries, the majority of which operate outside large central cloud providers like Amazon or Azure. We further evaluate IPFS performance, showing that both publication and retrieval delays are acceptable for a wide range of use cases. Finally, we share our datasets, experiences and lessons learned.
... This is an iterative process that is performed until the query is answered or until all of the neighbors have been queried. FreeNet is an example of a P2P system that uses the DFS approach [26,27]. Instead of sending the query to all neighboring peers as BFS does, controlled flooding forwards the query to k arbitrarily selected neighbors. ...
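The controlled flooding mentioned above can be sketched as forwarding the query to at most k randomly selected neighbors per hop rather than to every neighbor, as plain BFS flooding would. The Python sketch below walks the fan-out sequentially for simplicity; the Peer shape, the flood_controlled function, and its default parameters are assumptions made for the example, not code from the cited systems.

import random
from dataclasses import dataclass, field

@dataclass
class Peer:
    data: dict = field(default_factory=dict)        # key -> value stored locally
    neighbours: list = field(default_factory=list)  # adjacent peers

def flood_controlled(node, key, ttl, k=2, seen=None):
    # Forward the query to at most k random neighbours per hop, not to all of them.
    seen = seen if seen is not None else set()
    if id(node) in seen:
        return None
    seen.add(id(node))
    if key in node.data:
        return node.data[key]
    if ttl <= 0 or not node.neighbours:
        return None
    for neighbour in random.sample(node.neighbours, min(k, len(node.neighbours))):
        result = flood_controlled(neighbour, key, ttl - 1, k, seen)
        if result is not None:
            return result
    return None

Choosing k trades message overhead against the chance of reaching a peer that holds the data before the TTL expires.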
... Freenet is a network unknown to most of the public, even to the most avid users of P2P file-sharing programs [Clarke et al. 2001]. It is not a program, but rather a typical deep-web network. ...
Conference Paper
Full-text available
The scope of this article is to trace the similarity between police reports relating to the sharing of child sexual abuse material (CSAM), one of the crimes defined in the Statute of the Child and Adolescent, and its actual online distribution. To that end, a police monitoring software for peer-to-peer (P2P) networks was used, with the methodological scope restricted to occurrences, both of actual sharing of material and of filed police reports, observed only in the State of São Paulo. Finally, the study demonstrates that police reports faithfully reflect the locality of the occurrences, although in smaller numerical proportions compared to the virtual ones. 1. Introduction The popularization of the Internet, as a direct consequence of the digital inclusion promoted by many countries, including Brazil, with the expansion of broadband communication, mobile networks, and the new technologies available in people's hands, has become inevitable. Today there is hardly anyone who does not have at least one smartphone, even among the lowest-income segments of the population [Freitas et al. 2015]. This digital phenomenon has enabled
... Some proposals focus on the way the data is shared through a set of entities, while others focus on centralising the storage on just one point. Examples of these can be the HAT project [20], Freenet [21], the DAT Foundation [22], the reefold Net [23], the ActivityPub [24], the Safe Network [25], or the BBC Databox [26]. However, recently, the SOLID initiative has become one of the most relevant. ...
Article
Full-text available
Personal information has become one of the most valuable coins on the Internet. Companies gather a massive amount of data to create rich profiles of their users, trying to understand how they interact with the platform and what their preferences are. However, these profiles do not follow any standard and are usually incomplete in the sense that users provide different subsets of information to distinct platforms. Thus, the quality and quantity of the data vary between applications and tend toward inconsistency and duplication. In this context, the Social Linked Data (SOLID) initiative proposes an alternative to separate the user’s information from the platforms which consume it, defining a unique and autonomous source of data. Following this line, this study proposes Pushed SOLID, an architecture that integrates SOLID in the user’s smartphone to store and serve their information from a single entity controlled by the users themselves. In this study, we present an implementation of the Pushed SOLID proposal with the aim of experimentally assessing the technical viability of the solution. Satisfactory performance results have been obtained in terms of battery consumption and response time. Furthermore, users have been interviewed about the proposal, and they find the solution attractive and reliable. This solution can improve the way data are stored on the Internet, empowering users to manage their own information and benefiting third-party applications with consistent and updated profiles.
... This technique is useful in combating censorship; seemingly benign traffic to uncensored proxies can be redirected to their real destinations, e.g., censored websites hosted outside the restricted region. Anonymous peer-to-peer file sharing techniques, e.g., Freenet [24] and RetroShare [25], are also useful in circumventing censorship. It is important to note that the degree of anonymity eventually depends on how large the suspect pool is. ...
Thesis
Infrastructureless Delay Tolerant Networks (DTNs) composed of commodity mobile devices have the potential to support communication applications resistant to blocking and censorship, as well as certain types of surveillance. In this thesis we study the utility, practicality, robustness, and security of these networks. We collected two sets of wireless connectivity traces of commodity mobile devices with different granularity and scales. The first dataset is collected through active installation of measurement software on volunteer users' own smartphones, involving 111 users of a DTN microblogging application that we developed. The second dataset is collected through passive observation of WiFi association events on a university campus, involving 119,055 mobile devices. Simulation results show consistent message delivery performances of the two datasets. Using an epidemic flooding protocol, the large network achieves an average delivery rate of 0.71 in 24 hours and a median delivery delay of 10.9 hours. We show that this performance is appropriate for sharing information that is not time sensitive, e.g., blogs and photos. We also show that using an energy efficient variant of the epidemic flooding protocol, even the large network can support text messages while only consuming 13.7% of a typical smartphone battery in 14 hours. We found that the network delivery rate and delay are robust to denial-of-service and censorship attacks. Attacks that randomly remove 90% of the network participants only reduce delivery rates by less than 10%. Even when subjected to targeted attacks, the network suffered a less than 10% decrease in delivery rate when 40% of its participants were removed. Although structurally robust, the openness of the proposed network introduces numerous security concerns. The Sybil attack, in which a malicious node poses as many identities in order to gain disproportionate influence, is especially dangerous as it breaks the assumption underlying majority voting. Many defenses based on spatial variability of wireless channels exist, and we extend them to be practical for ad hoc networks of commodity 802.11 devices without mutual trust. We present the Mason test, which uses two efficient methods for separating valid channel measurement results of behaving nodes from those falsified by malicious participants.
... In this section, we provide some background information to explain our taxonomy and the attacks discussed in this paper. The Tor network [6], which is one of the most widely used anonymity networks today (along with other popular networks such as I2P [17] and Freenet [18]), has been using the concept of onion routing [19]. Tor is an overlay network based on the Transmission Control Protocol (TCP) that builds circuits from a user to the destination server, generally consisting of three volunteer relays. Figures 1 and 2 show the components of a Tor network for a standard circuit and for hidden services, respectively. ...
Article
Full-text available
Anonymity networks are becoming increasingly popular in today’s online world as more users attempt to safeguard their online privacy. Tor is currently the most popular anonymity network in use and provides anonymity to both users and services (hidden services). However, the anonymity provided by Tor is also being misused in various ways. Hosting illegal sites for selling drugs, hosting command and control servers for botnets, and distributing censored content are but a few such examples. As a result, various parties, including governments and law enforcement agencies, are interested in attacks that assist in de-anonymising the Tor network, disrupting its operations, and bypassing its censorship circumvention mechanisms. In this survey paper, we review known Tor attacks and identify current techniques for the de-anonymisation of Tor users and hidden services. We discuss these techniques and analyse the practicality of their execution method. We conclude by discussing improvements to the Tor framework that help prevent the surveyed de-anonymisation attacks.
... Therefore, anonymous communication technology is proposed to protect the communication data and help the user conceal his IP address and other pieces of private information. Anonymous communication technology has developed from the Mix [1] technology to the commonly used Tor (The second-generation Onion Router) [2], I2P (Invisible Internet Project), Freenet [3], and so on. Besides, some blockchain-based anonymous communication techniques have been proposed in recent years. ...
Article
Full-text available
Tor is an anonymous communication network used to hide the identities of both parties in communication. Apart from those who want to browse the web anonymously using Tor for a benign purpose, criminals can use Tor for criminal activities. It is recognized that Tor is easily intercepted by the censorship mechanism, so it uses a series of obfuscation mechanisms to avoid censorship, such as Meek, Format-Transforming Encryption (FTE), and Obfs4. In order to detect Tor traffic, we collect three kinds of obfuscated Tor traffic and then use a sliding window to extract 12 features from the stream according to the five-tuple, including the packet length, packet arrival time interval, and the proportion of the number of bytes sent and received. And finally, we use XGBoost, Random Forest, and other machine learning algorithms to identify obfuscated Tor traffic and its types. Our work provides a feasible method for countering obfuscated Tor network, which can identify the three kinds of obfuscated Tor traffic and achieve about 99% precision rate and recall rate.
... The continuous breaches of Internet-based services and personal data trade scandals in this smart digital world are making privacy and anonymity more compelling than ever [21,28,23]. Practical anonymization services, like The Onion Router [9] (Tor) and others [17,5,7], are of particular interest because they can provide privacy by hiding the user's identity, which is important to support human rights such as the right to freedom of expression, freedom of assembly, freedom of association, and the right to vote. Despite the introduction of legal instruments that recognize this direction, e.g., by the United Nations Human Rights and European Council [18,10,11], there is a lack of safe and generic technical tools to realize anonymous communication. ...
Article
Full-text available
The increasing demand for privacy is driving a notable quest for privacy legal instruments and practical techniques. The Onion Router (Tor) is considered the most practical service for anonymous public communications; however, it is not yet mainstream despite more than one decade of practical use and research. The reason is attributed to the use of message encryption layers, i.e., onions, via relays offered by unknown volunteers, which does not fully protect legitimate users or services. It can also be used for illegal purposes, which hampers its admissibility in many countries. In this paper, we introduce a new ecosystem built on top of Tor to broaden its use by legitimate users and, at the same time, provide provenance when users violate the usage policy terms. We propose Anonymity Service Providers that offer paid relays as a service to users. A user buys this service from different providers to diversify the onion circuit and avoid collusion, and thus disclosure of her identity. Anonymity is maintained as long as the user abides by the policy; otherwise, her identity is disclosed via a reporting system. This accountability reporting system can be implemented over a Smart Contract to make arbitration automatic. This work proposes new use cases and business opportunities that are worth consideration by both the research and business communities.
... P5 [121] also provides anonymous messaging by partitioning the peer-to-peer network into anonymizing broadcast groups. Freenet [122] offers anonymous publication and retrieval of data. The Invisible Internet Project (I2P) [123] is a peer-to-peer low-latency anonymous network built on top of the Internet. ...
Conference Paper
Every modern online application relies on the network layer to transfer information, which exposes the metadata associated with digital communication. These distinctive characteristics encapsulate equally meaningful information as the content of the communication itself and allow eavesdroppers to uniquely identify users and their activities. Hence, by exposing the IP addresses and by analyzing patterns of the network traffic, a malicious entity can deanonymize most online communications. While content confidentiality has made significant progress over the years, existing solutions for anonymous communication which protect the network metadata still have severe limitations, including centralization, limited security, poor scalability, and high-latency. As the importance of online privacy increases, the need to build low-latency communication systems with strong security guarantees becomes necessary. Therefore, in this thesis, we address the problem of building multi-purpose anonymous networks that protect communication privacy. To this end, we design a novel mix network Loopix, which guarantees communication unlinkability and supports applications with various latency and bandwidth constraints. Loopix offers better security properties than any existing solution for anonymous communications while at the same time being scalable and low-latency. Furthermore, we also explore the problem of active attacks and malicious infrastructure nodes, and propose a Miranda mechanism which allows to efficiently mitigate them. In the second part of this thesis, we show that mix networks may be used as a building block in the design of a private notification system, which enables fast and low-cost online notifications. Moreover, its privacy properties benefit from an increasing number of users, meaning that the system can scale to millions of clients at a lower cost than any alternative solution.
Article
Since its inception, The Onion Router (TOR) has been discussed as an anonymizing tool used for nefarious purposes. Past scholarship has focused on publicly available lists of onion URLs containing illicit or illegal content. The current study is an attempt to move past these surface-level explanations and into a discussion of actual use data; a multi-tiered system to identify real-world TOR traffic was developed for the task. The researcher configured and deployed a fully functioning TOR “exit” node for public use. A Wireshark instance was placed between the node and the “naked” internet to collect usage data (destination URLs, length of visit, etc.), but not to deanonymize or otherwise unmask TOR users. For 6 months, the node ran and collected data 24 hr per day, which produced a data set of over 4.5 terabytes. Using Python, the researcher developed a custom tool to filter the URLs into human-readable form and to produce descriptive data. All URLs were coded and categorized into a variety of classifications, including e-commerce, banking, social networking, pornography, and cryptocurrency. Findings reveal that most TOR usage is rather benign, with users spending much more time on social networking and e-commerce sites than on those with illegal drug or pornographic content. Likewise, visits to legal sites vastly outnumber visits to illegal ones. Although most URLs collected were for English-language websites, a sizable number were for Russian and Chinese sites, which may demonstrate the utilization of TOR in countries where internet access is censored or monitored by government actors. Akin to other new technologies which have earned bad reputations, such as the file-sharing program BitTorrent and intellectual property theft, or the cryptocurrency Bitcoin and online drug sales, this study demonstrates that TOR is utilized by offenders and non-offenders alike.
Chapter
In figures, the cybersecurity landscape has been one of the most impactful cross-border trends of recent years, especially after the beginning of the COVID-19 pandemic. By the end of Q4 2021, more than 281 million people had been victims of data breaches and cyber-threats, costing more than $42.96 million per day. A possible explanation is that most network operators do not provide any mechanism that blocks path tracing. Almost anybody with above-average network security knowledge can use public path tracing tools such as traceroute, enabling malicious users and threat actors to easily craft sophisticated cyber-attacks. Therefore, this paper proposes a cross-platform privacy overlay over the SOCKSv5 protocol. We evaluate the proposed solution in terms of latency, average throughput, and transfer rate. Keywords: CSOCKS5, Cloud network access patterns, Obfuscation, Privacy, Security.
Chapter
Full-text available
Aim/Thesis: The aim of these considerations is to develop a proposal for viewing the potential of conducting research within communication and media sciences using digital traces (digital footprints, digital shadows, digital traces) from the perspective of information science and cybersecurity. Concept/Research methodology: The theoretical considerations, based on a review of the subject literature, focus on defining the concept of the digital trace and determining its properties and significance in the context of information science and the concept of cybersecurity. Results and conclusions: Thanks to their form and diverse research field, digital traces allow for methodologically high-quality scientific research, in which they can serve not only as a source of research data but also as a reflection of the information behaviour of Internet users. Depending on the context, these data can be used in various ways, e.g., as an inalienable element of the struggle for power, a presentation of social moods, or a form of art. Originality/Cognitive value: Due to their uniqueness, studies of digital traces can support the development of various kinds of tools protecting the digital activity of network users.
Article
Nowadays, the vast majority of Internet services used to distribute hypermedia content follow a centralized model, which is highly dependent on servers and raises several quality and security concerns. Among other issues, this centralized model creates single points of failure, requires trust on providers to avoid censorship and personal data misuse, and results in a scenario where digital content tends to disappear or be inaccessible over time, for example, when a content creator stops maintaining a site or when the content is moved to another location. To improve this, it is necessary to replicate data and follow more distributed models. Nevertheless, current platforms to distribute content in this way, either do not offer an effective mechanism to maintain the privacy of their users or they offer full-anonymity, which contributes to the dissemination of content that goes beyond the law and moral standards of many users. This paper proposes a novel distributed architecture that enables hypermedia resource distribution ensuring censorship resistance and conditional k-anonymity. In the proposed system, users form groups to share hypermedia content where the anonymity of the publisher is preserved only if the publication follows a set of rules defined by the group. To this end, the proposed system uses threshold discernible ring signatures to enable conditional k-anonymity, the Ethereum blockchain platform to manage groups and user identities, and the InterPlanetary File System to store and share hypermedia resources in a distributed way. This document provides the design for the proposed architecture and protocols, it evaluates system risks and its security properties, and it discusses the proposal in general terms.
Article
Quantum cybersecurity is the study of all facets regarding the security of communication and computation in a distributed network. Significant developments in quantum technologies have outclassed their classical counterparts, thus envisioning the realization of a quantum internet. However, such quantum resources, in the hands of an adversary, can jeopardize network security. In this paper, we study two secure network connectivity concerns, namely privacy and anonymity, in quantum information retrieval systems. To this end, we propose a state-of-the-art single-server multi-user quantum anonymous private information retrieval (QAPIR) protocol. To actualize this, we utilize anonymous entanglement as a quantum resource. We show that the QAPIR protocol not only provides privacy but also introduces anonymity as an added layer of security in quantum networks. Furthermore, we also detail a comparative security analysis that establishes the desirable properties of our proposal.
Chapter
Early Data Base Machines - DBMs were mainframes running database management systems - DBMSs. Data mining on mainframes was considered too costly, since results obtained by data mining were considered of questionable value. The advent of powerful low-cost microprocessors allowed the building of DBMs affording a high degree of parallelism, such as the Teradata DBC/1012, which has been used for data mining and data warehousing. Active disks process higher priority disk accesses for OnLine Transaction Processing - OLTP, while processing data mining requests as no cost freeblock accesses. Disks with a processor per track capability, such as the Relational Associative Processor - RAP, are no longer feasible because of high track densities, but the concept of associating processing power has been applied to flash storage and DRAM. Systems combining OLTP and Online Analytic Processing - OLAP are discussed.
Article
Decentralized, distributed storage offers a way to reduce the impact of data silos as often fostered by centralized cloud storage. While the intentions of this trend are not new, the topic gained traction due to technological advancements, most notably blockchain networks. As a consequence, we observe that a new generation of peer-to-peer data networks emerges. In this survey paper, we therefore provide a technical overview of the next generation data networks. We use select data networks to introduce general concepts and to emphasize new developments. Specifically, we provide a deeper outline of the Interplanetary File System and a general overview of Swarm, the Hypercore Protocol, SAFE, Storj, and Arweave. We identify common building blocks and provide a qualitative comparison. From the overview, we derive future challenges and research goals concerning data networks.
Chapter
The goals of search engines are to support a simple and convenient search for information. However, they lack the ability to consider search results that correspond to the user’s knowledge. A new fully decentralised co-occurrence graph management system, using P2P technology without a controlling authority from a central server, including web search engines called ‘TheBrain’, is proposed to overcome these problems. The key idea of ‘TheBrain’ is to retrieve the possible relevance of search results based on the co-occurrence graphs. Besides, a load balancing mechanism can also be applied to the flow of information. Overall, this decentralised search engine could produce semantic search results related to the search query context.
Chapter
This paper presents a state-of-the-art survey of scientific research related to web search. Essential pillars, such as content discovery and natural language processing, are elaborated in front of a review of the structure of the World Wide Web (WWW) and a taxonomy of search, elaborated from literature. From this, a classification of the Web with respect to structure and search is derived.
Conference Paper
The rigorous analysis of anonymous communication protocols and formal privacy goals has proven difficult to get right. Formal privacy notions as in the current state of the art, based on indistinguishability games, simplify analysis. Achieving them, however, can incur prohibitively high overhead in terms of latency. Definitions based on function views, albeit less investigated, might imply less overhead but aren’t directly comparable to state-of-the-art notions, due to differences in the model. In this paper, we bridge the worlds of indistinguishability-game and function-view based notions by introducing a new game: the “Exists INDistinguishability” (E·IND), a weak notion that corresponds to what is informally sometimes termed Plausible Deniability. By intuition, for every action in a system achieving plausible deniability there exists an equally plausible alternative that results in observations that an adversary cannot tell apart. We show how this definition connects the early formalizations of privacy based on function views to recent game-based definitions. This enables us to link, analyze, and compare existing efforts in the field.
Chapter
Adversaries use many nefarious techniques to stay hidden and anonymous during their online activities, yet there are still loopholes in identity-hiding services that can be exploited at a granular level to deanonymize users. The Internet has gone through significant advancement in the last few decades. This technological change provides useful data and information via websites and blogs, along with add-on services that assist users along the way. For the last few years, such third-party services have continuously grabbed and captured our personal information for their own benefit, such as advertisement or further recommendations, without the consent of users. Furthermore, there is a possibility of this data being shared with black-market traders, where personal information such as email, phone, or physical addresses can be sold and used for illegal access to various systems. Such information may also be used for other atrocious acts. Many anonymity services offered over the Internet provide a basic level of anonymity and privacy for communicated data. However, those services still have flaws that can be easily exploited; such systems are susceptible to various attacks, including exploitation of hardware devices, traffic analysis, footprinting, and many more. Proxy servers, which help alter IP addresses and location details, are also open to data collection by authorities. A virtual private network (VPN), which offers the same features as a proxy server, adds an extra encryption layer. This functionality makes a VPN a more secure technology than a proxy server. Despite this security, VPNs have another serious issue: the untrustworthiness of VPN servers located around the globe. These VPN servers can collect users' data, or they can be used as network security monitoring (NSM) media. This chapter focuses on anonymous systems like the Invisible Internet Project (I2P), Freenet, and JonDonym, services that hide the user's identity from the surface Internet. It also includes an introduction to The Onion Router (TOR) without going into details. Moreover, the chapter demonstrates the configuration of each system and its crucial elements to keep the users' identity safe from different adversaries.
Chapter
The TOR browser is the most popular browser for surfing the Internet anonymously. This paper studies the digital artifacts left behind by the TOR browser over the network and within the host. These artifacts provide crucial forensic evidence for digital investigators to prove any unauthorized or unlawful activity. The paper also presents methods for retrieving more useful artifacts than previous works and additionally examines Firefox, Chrome Incognito, and Internet Explorer. The results show that even the much-acclaimed TOR browser leaves evidence and traces behind.
Chapter
Breast cancer is most common in the middle-aged female population and is the fourth most dangerous cancer overall. In recent years the number of breast cancer patients has increased significantly, so early diagnosis has become a necessary task in cancer research to facilitate subsequent clinical management of patients. The key to limiting breast cancer is early detection of the tumor: early detection can stop tumor growth and save lives. In machine learning classification, cancer cases are classified into two types, benign or malignant. Different preprocessing techniques, such as filling missing values, applying correlation coefficients, the synthetic minority oversampling technique (SMOTE), and ten-fold cross-validation, are implemented and used to obtain the accuracy. The main aim of this study is to identify key features from the dataset and analyze the performance of different machine learning algorithms such as the random forest classifier, logistic regression, support vector machine, decision tree, Gaussian Naive Bayes, and k-nearest neighbors. Based on the results, the classification model with the highest accuracy is used as the best model for cancer prediction.
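Purely as an illustration of the kind of evaluation pipeline this abstract describes (SMOTE oversampling, ten-fold cross-validation, several standard classifiers), a minimal sketch using scikit-learn and imbalanced-learn might look as follows; the dataset loader is a stand-in and none of this is the authors' code.

```python
# Hypothetical sketch of the evaluation pipeline described above; not the authors' code.
from sklearn.datasets import load_breast_cancer   # stand-in dataset
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

X, y = load_breast_cancer(return_X_y=True)

models = {
    "random_forest": RandomForestClassifier(),
    "logistic_regression": LogisticRegression(max_iter=5000),
    "svm": SVC(),
    "decision_tree": DecisionTreeClassifier(),
    "naive_bayes": GaussianNB(),
    "knn": KNeighborsClassifier(),
}

# SMOTE inside the pipeline so oversampling happens on training folds only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    pipe = Pipeline([("smote", SMOTE(random_state=0)), ("clf", model)])
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f}")
```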
Chapter
This chapter presents the main differences between the surface web, Deep Web and Dark Web as well as their dependences. It further discusses the nature of the Dark Web and the structure of the most prominent darknets, namely Tor, I2P and Freenet, and provides technical information regarding the technologies behind these darknets. In addition, it discusses the effects police actions on the surface web can have on the Dark Web, the "dilemma" of usage that anonymity technologies present, as well as the challenges LEAs face while trying to prevent and fight crime and terrorism on the Dark Web.
Chapter
Advertising has become the most important source of income for a significant number of web-based companies. This income usually depends on the personal information that companies gather from their users, which has led them to create very rich user profiles. However, these profiles do not follow any standard and are usually incomplete in the sense that users provide a different subset of information to each platform. Thus, the quality and quantity of the data varies between applications and tends to be inconsistent. In this context, the SOLID initiative proposes an alternative that decentralizes user information, giving users complete ownership of it. In this demo, we propose a proof of concept in which SOLID is used to store the user's information on their mobile device, following the People as a Service paradigm to provide this information as a service to third parties.
Chapter
Vector commitments with subvector openings (SVC) [Lai-Malavolta, Boneh-Bunz-Fisch; CRYPTO'19] allow one to open a committed vector at a set of positions with an opening of size independent of both the vector's length and the number of opened positions. We continue the study of SVC with two goals in mind: improving their efficiency and making them more suitable to decentralized settings. We address both problems by proposing a new notion for VC that we call incremental aggregation, which allows one to merge openings in a succinct way an unbounded number of times. We show two applications of this property. The first is immediate and is a method to generate openings in a distributed way. The second is an algorithm for faster generation of openings via preprocessing. We then proceed to realize SVC with incremental aggregation. We provide two constructions in groups of unknown order that, similarly to that of Boneh et al. (which supports aggregating only once), have constant-size public parameters, commitments and openings. As an additional feature, for the first construction we propose efficient arguments of knowledge of subvector openings, which immediately yields a keyless proof of storage with compact proofs. Finally, we address a problem closely related to that of SVC: storing a file efficiently in completely decentralized networks. We introduce and construct verifiable decentralized storage (VDS), a cryptographic primitive that allows one to check the integrity of a file stored by a network of nodes in a distributed and decentralized way. Our VDS constructions rely on our new vector commitment techniques.
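The paper's constructions are algebraic and achieve constant-size openings; for orientation only, here is a toy Merkle-tree vector commitment in Python with logarithmic-size position openings. Merging two openings in this toy would simply concatenate authentication paths, which is exactly the inefficiency the incremental-aggregation notion above avoids. All names are illustrative assumptions.

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

def commit(vector):
    """Merkle-tree commitment to a vector of byte strings; returns (root, levels)."""
    level = [H(v) for v in vector]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]          # duplicate the last node on odd levels
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return level[0], levels

def open_position(levels, i):
    """Opening for position i: the sibling hashes along the path to the root."""
    path = []
    for level in levels[:-1]:
        sibling = i ^ 1
        if sibling >= len(level):
            sibling = i                          # odd level: the sibling is the duplicate
        path.append(level[sibling])
        i //= 2
    return path

def verify(root, i, value, path):
    node = H(value)
    for sibling in path:
        node = H(node + sibling) if i % 2 == 0 else H(sibling + node)
        i //= 2
    return node == root

root, levels = commit([b"a", b"b", b"c", b"d"])
proof = open_position(levels, 2)
print(verify(root, 2, b"c", proof))  # True
```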
Chapter
Among the four e-commerce business models, the customer-to-customer (C2C) model allows customers to exchange goods or services and receive payment through fully or partly direct transactions. This model usually depends on intermediary companies, e.g. eBay or Craigslist, to solve the main challenges of technology maintenance and quality control. This paper presents a decentralized C2C e-commerce model based on mobile peer-to-peer (P2P) networks. Several advantageous characteristics of the mobile P2P network, including autonomy in control and administration as well as scalability and reliability in peers and resources, not only avoid centralized servers and technology maintenance but also facilitate quality control by selecting high-quality peers and resources through flexible search mechanisms on mobile devices. We provide the design of a Book Trading Service (BTS) using the Gnutella protocol and a prototype of the BTS application on the Android mobile platform. The experimental results show the feasibility and effectiveness of the decentralized C2C e-commerce model and the BTS application, which can also be applied to several other application domains.
Article
A P2P system should use proximity information to reduce the load of file requests and improve efficiency, and clustering peers by physical proximity can also raise file-request performance. However, few existing works cluster peers both by their interests and by physical proximity. Although structured P2P systems provide more efficient file requests than unstructured ones, their strictly defined topology makes such clustering difficult to apply. In this work, we introduce a proximity-aware and interest-clustered P2P file-sharing system (PAIS) built on structured P2P, in which physically close nodes form a cluster and nodes within a cluster that share common interests form sub-clusters in a hierarchical topology. Efficient file querying is important for overall P2P file-sharing performance, and clustering peers by common interest can significantly enhance it. PAIS uses an intelligent file-replication algorithm to further raise file-request efficiency, creating replicas of files that are frequently requested by a group of physically close nodes in their vicinity. In addition, PAIS improves intra-sub-cluster file searching through several approaches. First, it further classifies the interest of a sub-cluster into a number of sub-interests and groups common-sub-interest nodes into a group for file sharing. Second, PAIS builds an overlay for each group that connects lower-capacity nodes to higher-capacity nodes, distributing file requests to prevent node overload. Third, to reduce file-search delay, PAIS uses proactive file information collection so that a requester can know whether its requested file is held by nearby nodes. Fourth, to reduce the overhead of this collection, PAIS uses Bloom filter based file-information collection and a corresponding distributed file search. Fifth, to improve file-sharing efficiency, PAIS ranks the Bloom filter results in order. Sixth, because a recently visited file tends to be visited again, the Bloom filter based approach is refined to check only newly added Bloom filter information, further reducing file-search delay. Experimental results from a real-world PlanetLab deployment show that PAIS significantly reduces overhead and improves file-sharing efficiency. In addition, the experimental results show the high effectiveness of the intra-sub-cluster file-searching approaches in improving file-search efficiency.
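To make the Bloom filter based file-information exchange concrete, the following is a minimal, hypothetical Python sketch of how a node could summarize its file list in a Bloom filter and let physically close neighbors test, cheaply and with a small false-positive rate, whether a requested file is probably held nearby. The parameters and class names are illustrative and not taken from PAIS.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: compact, probabilistic set membership (false positives possible)."""
    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))

# A node advertises a compact summary of its files to physically close neighbors.
my_files = BloomFilter()
for name in ["song.mp3", "paper.pdf", "video.avi"]:
    my_files.add(name)

# A neighbor checks the summary before issuing a more expensive network query.
print("paper.pdf" in my_files)    # True
print("missing.txt" in my_files)  # almost certainly False (small false-positive rate)
```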
Article
Full-text available
Publicly accessible databases are an indispensable resource for retrieving up-to-date information. But they also pose a significant risk to the privacy of the user, since a curious database operator can follow the user's queries and infer what the user is after. Indeed, in cases where the users' intentions are to be kept secret, users are often cautious about accessing the database. It can be shown that when accessing a single database, to completely guarantee the privacy of the user, the whole database must be downloaded; namely n bits should be communicated (where n is the number of bits in the database). In this work, we investigate whether, by replicating the database, more efficient solutions to the private retrieval problem can be obtained. We describe schemes that enable a user to access k replicated copies of a database (k >= 2) and privately retrieve information stored in the database. This means that each individual server (holding a replicated copy of the database) gets no information on the identity of the item retrieved by the user. Our schemes use the replication to gain substantial savings. In particular, we present a two-server scheme with communication complexity O(n^(1/3)).
Article
Full-text available
The exponential growth of the World-Wide Web has transformed it into an ecology of knowledge in which highly diverse information is linked in an extremely complex and arbitrary manner. But even so, as we show here, there is order hidden in the web. We find that web pages are distributed among sites according to a universal power law: many sites have only a few pages, whereas very few sites have hundreds of thousands of pages. This universal distribution can be explained by using a simple stochastic dynamical growth model.
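As a rough numerical illustration of the multiplicative-growth explanation (not the authors' exact model), one can simulate sites whose page counts grow by random multiplicative factors over lifetimes of different lengths and observe the resulting heavy-tailed size distribution:

```python
import math
import random

random.seed(0)

def simulate_site(age_days):
    pages = 1.0
    for _ in range(age_days):
        # Each day the site grows (or shrinks) by a random multiplicative factor.
        pages *= math.exp(random.gauss(0.01, 0.1))
    return max(1, int(pages))

# Sites are founded at different times, so their ages vary widely.
sizes = [simulate_site(random.randint(1, 2000)) for _ in range(5000)]

# Crude check of the heavy tail: count sites per decade of size.
for decade in range(6):
    lo, hi = 10 ** decade, 10 ** (decade + 1)
    count = sum(lo <= s < hi for s in sizes)
    print(f"{lo:>7}-{hi:<7} pages: {count} sites")
```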
Conference Paper
Full-text available
We describe schemes that enable a user to access k replicated copies of a database (k >= 2) and privately retrieve information stored in the database. This means that each individual database gets no information on the identity of the item retrieved by the user. For a single database, achieving this type of privacy requires communicating the whole database, or n bits (where n is the number of bits in the database). Our schemes use the replication to gain substantial savings. In particular, we have: a two-database scheme with communication complexity O(n^(1/3)); a scheme for a constant number, k, of databases with communication complexity O(n^(1/k)); and a scheme for (1/3) log2 n databases with polylogarithmic (in n) communication complexity.
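The schemes above achieve sublinear communication; as a much simpler illustration of the core idea that two non-colluding replicas individually learn nothing, here is the classic toy XOR-based two-server PIR (with linear, not O(n^(1/3)), communication) sketched in Python:

```python
import secrets
from functools import reduce

def xor_of(database, subset):
    """Server-side answer: XOR of the database bits at the queried positions."""
    return reduce(lambda acc, i: acc ^ database[i], subset, 0)

def private_read(database_copy1, database_copy2, i, n):
    # Client: pick a uniformly random subset S of positions, send S to server 1
    # and S XOR {i} to server 2. Each server alone sees a random-looking subset.
    S = {j for j in range(n) if secrets.randbelow(2)}
    S_prime = S ^ {i}                      # symmetric difference toggles position i
    a1 = xor_of(database_copy1, S)         # answer from server 1
    a2 = xor_of(database_copy2, S_prime)   # answer from server 2
    return a1 ^ a2                         # all positions cancel except i

db = [1, 0, 0, 1, 1, 0, 1, 0]
print(private_read(db, db, 3, len(db)))   # recovers db[3] = 1
```

Each server sees only a uniformly random subset of positions, so neither learns which index the client wanted; the two answers differ exactly in the contribution of position i.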
Article
Full-text available
As of this article's publication, the prototype network is processing more than 1 million Web connections per month from more than six thousand IP addresses in twenty countries and in all six main top-level domains. [7]
Article
Full-text available
An Archival Intermemory solves the problem of highly survivable digital data storage in the spirit of the Internet. In this paper we describe a prototype implementation of Intermemory, including an overall system architecture and implementations of key system components. The result is a working Intermemory that tolerates up to 17 simultaneous node failures, and includes a Web gateway for browser-based access to data. Our work demonstrates the basic feasibility of Intermemory and represents significant progress towards a deployable system.
Article
Full-text available
Many complex systems, such as communication networks, display a surprising degree of robustness: while key components regularly malfunction, local failures rarely lead to the loss of the global information-carrying ability of the network. The stability of these complex systems is often attributed to the redundant wiring of the functional web defined by the systems' components. In this paper we demonstrate that error tolerance is not shared by all redundant systems, but it is displayed only by a class of inhomogeneously wired networks, called scale-free networks. We find that scale-free networks, describing a number of systems, such as the World Wide Web, Internet, social networks or a cell, display an unexpected degree of robustness, the ability of their nodes to communicate being unaffected by even unrealistically high failure rates. However, error tolerance comes at a high price: these networks are extremely vulnerable to attacks, i.e. to the selection and removal of a few nodes that play the most important role in assuring the network's connectivity.
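The qualitative finding (tolerance to random failures, fragility under targeted removal of hubs) is easy to reproduce by comparing giant-component sizes after random versus highest-degree node removal; the networkx sketch below is only an illustration of the effect, not the paper's methodology.

```python
import random
import networkx as nx

random.seed(0)

def giant_fraction(G):
    """Fraction of nodes in the largest connected component."""
    if G.number_of_nodes() == 0:
        return 0.0
    return max(len(c) for c in nx.connected_components(G)) / G.number_of_nodes()

def remove(G, fraction, targeted):
    H = G.copy()
    k = int(fraction * H.number_of_nodes())
    if targeted:
        # Attack: delete the highest-degree nodes (the hubs).
        victims = [n for n, _ in sorted(H.degree, key=lambda nd: nd[1], reverse=True)[:k]]
    else:
        # Random failure: delete nodes uniformly at random.
        victims = random.sample(list(H.nodes), k)
    H.remove_nodes_from(victims)
    return H

G = nx.barabasi_albert_graph(2000, 2)  # scale-free test network
for f in (0.01, 0.05, 0.10):
    print(f"remove {f:.0%}: random -> {giant_fraction(remove(G, f, False)):.2f}, "
          f"targeted -> {giant_fraction(remove(G, f, True)):.2f}")
```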
Article
One way to camouflage communication over a public network is to mingle connections from a variety of users and applications to make them difficult to distinguish.
Article
An innovative way to become an invisible user is simply to get lost in the crowd. After all, anonymity loves company.
Chapter
We present the architecture, design issues and functions of a MIX-based system for anonymous and unobservable real-time Internet access. This system prevents traffic analysis as well as flooding attacks. The core technologies include an adaptive, anonymous, time/volume-sliced channel mechanism and a ticket-based authentication mechanism. The system also provides an interface to inform anonymous users about their level of anonymity and unobservability.
Article
A technique based on public key cryptography is presented that allows an electronic mail system to hide who a participant communicates with as well as the content of the communication, in spite of an unsecured underlying telecommunication system. The technique does not require a universally trusted authority. One correspondent can remain anonymous to a second, while allowing the second to respond via an untraceable return address. The technique can also be used to form rosters of untraceable digital pseudonyms from selected applications. Applicants retain the exclusive ability to form digital signatures corresponding to their pseudonyms. Elections in which any interested party can verify that the ballots have been properly counted are possible if anonymously mailed ballots are signed with pseudonyms from a roster of registered voters. Another use allows an individual to correspond with a record-keeping organization under a unique pseudonym which appears in a roster of acceptable clients.
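As a rough sketch of the layered routing idea (not Chaum's actual public-key construction), the toy Python example below wraps a message in one encryption layer per mix so that each mix can strip only its own layer and learns nothing beyond the next hop; Fernet symmetric keys stand in for the public keys, batching, and padding a real mix network would use.

```python
# Toy illustration only: real mixes use public-key sealing, batching and padding.
from cryptography.fernet import Fernet

class Mix:
    def __init__(self, name):
        self.name = name
        self.key = Fernet.generate_key()   # stand-in for the mix's public key
        self.cipher = Fernet(self.key)

    def process(self, envelope):
        """Strip one layer: learn only the next hop and the still-sealed payload."""
        next_hop, _, inner = self.cipher.decrypt(envelope).partition(b"|")
        return next_hop.decode(), inner

def wrap(message, route):
    """Sender wraps the message once per mix, innermost layer first."""
    envelope = message
    hops = route + [None]                  # None marks final delivery
    for mix, nxt in zip(reversed(route), reversed(hops[1:])):
        header = (nxt.name if nxt else "deliver").encode()
        envelope = mix.cipher.encrypt(header + b"|" + envelope)
    return envelope

route = [Mix("A"), Mix("B"), Mix("C")]
envelope = wrap(b"hello", route)
for mix in route:
    hop, envelope = mix.process(envelope)
    print(mix.name, "forwards to", hop)
print("plaintext:", envelope)
```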
Article
Networks of coupled dynamical systems have been used to model biological oscillators, Josephson junction arrays, excitable media, neural networks, spatial games, genetic control networks and many other self-organizing systems. Ordinarily, the connection topology is assumed to be either completely regular or completely random. But many biological, technological and social networks lie somewhere between these two extremes. Here we explore simple models of networks that can be tuned through this middle ground: regular networks 'rewired' to introduce increasing amounts of disorder. We find that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. We call them 'small-world' networks, by analogy with the small-world phenomenon (popularly known as six degrees of separation). The neural network of the worm Caenorhabditis elegans, the power grid of the western United States, and the collaboration graph of film actors are shown to be small-world networks. Models of dynamical systems with small-world coupling display enhanced signal-propagation speed, computational power, and synchronizability. In particular, infectious diseases spread more easily in small-world networks than in regular lattices.
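The effect is easy to reproduce numerically: rewiring a small fraction of edges in a ring lattice collapses the average path length while the clustering coefficient stays high. The networkx sketch below is only an illustration of that observation, not the authors' analysis.

```python
import networkx as nx

n, k = 1000, 10  # 1000 nodes, each joined to its 10 nearest neighbours on a ring
for p in (0.0, 0.01, 0.1, 1.0):
    G = nx.watts_strogatz_graph(n, k, p, seed=42)   # rewire each edge with probability p
    L = nx.average_shortest_path_length(G)
    C = nx.average_clustering(G)
    print(f"p={p:<4}  avg path length={L:6.2f}  clustering={C:.3f}")
```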
Article
The World Wide Web has recently matured enough to provide everyday users with an extremely cheap publishing mechanism. However, the current WWW architecture makes it fundamentally difficult to provide content without identifying yourself. We examine the problem of anonymous publication on the WWW, propose a design suitable for practical deployment, and describe our implementation. Some key features of our design include universal accessibility by pre-existing clients, short persistent names, security against social, legal, and political pressure, protection against abuse, and good performance.
Article
It is a hard problem to achieve anonymity for real-time services in the Internet (e.g. Web access). All existing concepts fail when we assume a very strong attacker model (i.e. an attacker is able to observe all communication links). We also show that these attacks are real-world attacks. This paper outlines alternative models which mostly render these attacks useless. Our present work tries to increase the efficiency of these measures. The perfect anonymous communication system has to prevent the following attacks: (1) message coding attack: if messages do not change their coding during transmission they can be linked or traced; (2) timing attack: an opponent can observe the duration of a specific communication by linking its possible endpoints and waiting for a correlation between the creation and/or release event at each possible endpoint; (3) message volume attack: the amount of transmitted data (i.e. the message length) can be observed. Thus...
Article
We present a design for a system of anonymous storage which resists the attempts of powerful adversaries to find or destroy any stored data. We enumerate distinct notions of anonymity for each party in the system, and suggest a way to classify anonymous systems based on the kinds of anonymity provided. Our design ensures the availability of each document for a publisher-specified lifetime. A reputation system provides server accountability by limiting the damage caused from misbehaving servers. We identify attacks and defenses against anonymous storage services, and close with a list of problems which are currently unsolved.
Article
The Internet was designed to provide a communications channel that is as resistant to denial of service attacks as human ingenuity can make it. In this note, we propose the construction of a storage medium with similar properties. The basic idea is to use redundancy and scattering techniques to replicate data across a large set of machines (such as the Internet), and add anonymity mechanisms to drive up the cost of selective service denial attacks. The detailed design of this service is an interesting scientific problem, and is not merely academic: the service may be vital in safeguarding individual rights against new threats posed by the spread of electronic publishing.
Article
This report describes an algorithm which, if executed by a group of interconnected nodes, will provide a robust key-indexed information storage and retrieval system with no element of central control or administration. It allows information to be made available to a large group of people in a similar manner to the "World Wide Web". Improvements over this existing system include:
- No central control or administration required
- Anonymous information publication and retrieval
- Dynamic duplication of popular information
- Transfer of information location depending upon demand
There is also potential for this system to be used in a modified form as an information publication system within a large organisation which may wish to utilise unused storage space distributed across the organisation. The system's reliability is not guaranteed, nor is its efficiency; however, the intention is that the efficiency and reliability will be sufficient to make the system useful, and demonstrate that...
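A minimal way to picture the key-indexed routing idea (each node greedily forwards a request toward the neighbour whose identifier is closest to the requested key, and caches the data along the successful return path) is the following Python sketch. It is an illustrative simplification under assumed data structures (XOR distance, a flat neighbour list), not the algorithm from the report.

```python
import hashlib

def key_of(name):
    """Location-independent key: hash of the data name."""
    return int(hashlib.sha256(name.encode()).hexdigest(), 16)

def distance(a, b):
    return a ^ b  # XOR metric as a simple notion of key closeness (an assumption)

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}        # key -> data held locally
        self.neighbors = []    # other Node objects this node knows about

    def retrieve(self, key, hops_to_live=10, visited=None):
        visited = visited or set()
        visited.add(self.node_id)
        if key in self.store:
            return self.store[key], self
        if hops_to_live == 0:
            return None, None
        # Greedily forward to the unvisited neighbour whose id is closest to the key.
        candidates = [n for n in self.neighbors if n.node_id not in visited]
        for nxt in sorted(candidates, key=lambda n: distance(n.node_id, key)):
            data, source = nxt.retrieve(key, hops_to_live - 1, visited)
            if data is not None:
                self.store[key] = data   # replicate along the successful path
                return data, source
        return None, None
```

A real deployment would also route inserts toward the closest reachable identifiers so that later greedy searches converge on the data; the sketch covers only retrieval and path replication.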
Article
We describe a system that we have designed and implemented for publishing content on the web. Our publishing scheme has the property that it is very difficult for any adversary to censor or modify the content. In addition, the identity of the publisher is protected once the content is posted. Our system differs from others in that we provide tools for updating or deleting the published content, and users can browse the content in the normal point-and-click manner using a standard web browser and a client-side proxy that we provide. All of our code is freely available.
J. Rosen, "The eroded self," The New York Times, April 30, 2000.
T. Hong, "Performance," in Peer-to-Peer, ed. by A. Oram. O'Reilly: Sebastopol, CA, USA (2001).
M. Waldman, A.D. Rubin, and L.F. Cranor, "Publius: A robust, tamper-evident, censorship-resistant and source-anonymous web publishing system," in Ninth USENIX Security Symposium, Denver, CO, USA (to appear).
Church of Spiritual Technology (Scientology) v. Dataweb et al., Cause No. 96/1048, District Court of the Hague, The Netherlands (1999).
O. Berthold, H. Federrath, and M. Köhntopp, "Project 'Anonymity and unobservability in the Internet'," in Computers, Freedom and Privacy Conference 2000 (CFP 2000) Workshop on Freedom and Privacy by Design (to appear).