Article

A Game Theoretic Approach to the Formation of Clustered Overlay Networks (Extended Version)

Abstract

In many large-scale content sharing applications, participants or nodes are connected with each other based on their content or interests, thus forming clusters. In this paper, we model the formation of such clustered overlays as a strategic game, where nodes determine their cluster membership with the goal of improving the recall of their queries. We study the evolution of such overlays both theoretically and experimentally in terms of stability, optimality, load balance and the required overhead. We show that, in general, decisions made independently by each node using only local information lead to overall cost-effective cluster configurations that are also dynamically adaptable to system updates such as churn and query or content changes.
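
The game described above can be illustrated with a minimal best-response sketch. This is not the authors' model: the cost function (a membership term proportional to cluster size plus the recall lost outside the cluster), the restriction to one cluster per node, the weight `alpha`, and all function names are assumptions made for illustration only.

```python
def node_cost(i, members, results, alpha):
    """Hypothetical cost of node i when its cluster contains `members`.

    results[i][j] is the number of results node j holds for node i's queries.
    The cost is a membership term (alpha * cluster size / n) plus the
    fraction of results that lie outside the cluster (1 - recall).
    """
    n = len(results)
    total = sum(results[i])
    recall = sum(results[i][j] for j in members) / total if total else 1.0
    return alpha * len(members) / n + (1.0 - recall)


def best_response_dynamics(results, assignment, num_clusters, alpha, max_rounds=100):
    """Let nodes take turns moving to their cheapest cluster until no one moves.

    Each node uses only its own query results and the current memberships.
    Convergence is not guaranteed in general, hence the max_rounds cap.
    """
    assignment = list(assignment)
    n = len(results)
    for _ in range(max_rounds):
        changed = False
        for i in range(n):
            def cost_in(c):
                # Members of cluster c if node i were part of it.
                members = {j for j in range(n) if assignment[j] == c} | {i}
                return node_cost(i, members, results, alpha)

            best, best_cost = assignment[i], cost_in(assignment[i])
            for c in range(num_clusters):
                cost = cost_in(c)
                if cost < best_cost:  # move only on a strict improvement
                    best, best_cost = c, cost
            if best != assignment[i]:
                assignment[i] = best
                changed = True
        if not changed:
            break
    return assignment


# Toy workload: nodes 0 and 1 answer each other's queries, as do nodes 2 and 3.
results = [
    [0, 4, 0, 0],
    [4, 0, 0, 0],
    [0, 0, 0, 4],
    [0, 0, 4, 0],
]
final = best_response_dynamics(results, [0, 1, 0, 1], num_clusters=2, alpha=2.0)
print(final)
```

In this toy instance, starting from a mixed assignment, the nodes regroup so that each pair with matching interests shares a cluster. With a much smaller `alpha`, the same dynamics would instead settle on a single large cluster, which shows how the membership cost shapes the stable configuration.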

... Each node plays by selecting which clusters to join. Each node makes this selection individually in order to minimize the utility function, which depends on the cluster membership cost and on the cost of evaluating queries outside the clusters the node belongs to [1]. ...
... Daabaj et al. [11] proposed a load-balanced routing algorithm in which parent selection is based on the residual power of the intermediate nodes and the channel state. An approach to forming clustered overlay networks based on game theory is proposed in [12] for data sharing, while the authors of [13] discuss data aggregation algorithms for sensed data in vehicular environments. Sandagopan et al. [1, 14] proposed a decentralized utility-based method for building a balanced data-gathering tree when in-network aggregation can be applied, as they consider nodes with similar amounts of data. ...
Article
A utility-based distributed data routing algorithm is proposed and evaluated for heterogeneous wireless sensor networks. It is energy efficient and is based on a game-theoretic heuristic load-balancing approach. It runs on a hierarchical graph arranged as a tree with parents and children. Sensor nodes are considered heterogeneous in terms of their generated traffic, residual energy, data transmission rate, and the bandwidth they provide to their children for communication. The proposed method generates a data routing tree in which child nodes are joined to parent nodes in an energy-efficient way. The principles of the Stackelberg game, in which parents act as leaders and children as followers, are used to support the distributed nature of sensor networks. In this context, parents behave cooperatively and help other parents to adjust their loads, while children act selfishly. Simulation results indicate that the proposed method produces, on average, more load-balanced trees, resulting in over 30% longer network lifetime compared with the cumulative algorithm proposed in the literature. Copyright © 2014 John Wiley & Sons, Ltd.
Article
In wireless sensor networks, achieving load balancing in an energy-efficient manner to extend the network lifetime as much as possible is still a challenging problem, because the only energy resource for sensor nodes in such networks is their battery supply. This paper proposes a game-theoretic solution in the form of a distributed algorithm for constructing load-balanced routing trees in wireless sensor networks. In our algorithm, load balancing is realized by evening out the number of children among parents as much as possible, where child adjustment is considered as a game between the parent and child nodes; parents are considered cooperative players, and children are considered selfish players. The utility gained by each node is determined by utility functions defined per role, which in turn determine the behavior of nodes in each role. When the game is over, each node gains the maximum benefit on the basis of its utility function, and the balanced tree is constructed. The proposed method provides additional benefits when in-network aggregation is applied. Analytical and simulation results are provided, demonstrating that our proposed algorithm outperforms two recently proposed benchmark algorithms [1, 2] in terms of the time complexity and communication overhead required for constructing the load-balanced routing trees. Copyright © 2012 John Wiley & Sons, Ltd.
... Our proposal is evaluated with the network simulator NS2, into which we integrated a standard version of OLSR (UM-OLSR-0.8.8 [15,16]) developed by MASIMUM (MANET Simulation and Implementation at the University of Murcia). Our simulation parameters are as follows. ...
Article
The radio link between a pair of wireless nodes is affected by a set of random factors such as transmission range, node mobility, and environmental conditions. The properties of such radio links are observed continually as a node's status alternates between reachable and unreachable; on completion of each observation, the statistical distribution of link lifetime is updated. This aspect is especially important in mobile ad hoc networks deployed in fields that require intelligent processing of data, such as the aerospace domain.
... Simulations are done in NS2 [33] (network simulator) version 2.35, in which we have integrated a standard version of OLSR (UM-OLSR-0.8.8 [34,35]) developed by MANET Simulation and Implementation at the University of Murcia (MASIMUM). ...
Article
When evaluating the performance of QoS protocols, a number of factors have a major impact on the results. Notably, QoS is emphasized when mobile ad hoc networks (MANETs) are deployed in aerospace fields. Some of these factors are a particular manifestation of characteristics of the MANET environment, such as mobility. Our proposal is a novel multipoint relay (MPR) scheme based on a hybrid cost function that takes QoS criteria into account and avoids the effect of node mobility, especially for nodes selected as MPRs. A comprehensive simulation study was conducted to evaluate the performance of the proposed scheme. Performance results show that RQMPR outperforms the existing MPR heuristic adopted in the ad hoc routing protocols OLSR and QOLSR in terms of packet delivery and average end-to-end delay.
Chapter
Game theory constitutes a mathematical method for rational decision making in competitive and conflicting situations under specified rules, and thus is closely associated with decision theory. The applicability and usefulness of game theory have already been demonstrated in the research area of peer-to-peer (P2P) networks and network optimization in general. P2P networks consist of autonomous nodes that not only collaborate in sharing and consuming resources, but also act independently and are governed by selfish motives. Thus, game-theoretic solutions lend themselves well to problems arising in P2P networks. Game-theoretic approaches have also recently been employed to exploit the benefits of cloud infrastructures. The proposed work surveys recent developments in game-theoretic approaches to P2P networks and cloud systems, and provides a classification of approaches dealing with a variety of problems encountered in the design and deployment of P2P and cloud systems.
Article
Different clustering approaches consider different aspects of quality, including accuracy of cluster formation, speed of execution, and resource requirements. A major issue in data clustering is accounting for the density of clusters and their structure. A challenge is to relax any assumption on cluster shape, such as sphericity, which is common in most partitioning approaches such as K‐means. In this article, with the help of coalitional game theory and the concept of geodesic distance, a density‐based shape‐independent clustering approach is proposed. In addition to the emphasis on the application of game theory, we also pay attention to the relative neighbourhood of data, which is captured using geodesic distance. Geodesic distance is a well‐known measure for finding the manifold of data. The new approach aims to discover any embedded structure in the data rather than finding only spherical clusters. The proposed approach is evaluated on a number of standard University of California, Irvine (UCI) datasets, and the results show its effectiveness in comparison with several other approaches.
Article
Many systems or applications have been developed for distributed environments with the goal of attaining multiple objectives in the face of environmental challenges such as high dynamics/hostility, or severe resource constraints (e.g., energy or communications bandwidth). Often the multiple objectives are conflicting with each other, requiring optimal tradeoff analyses between the objectives. This work is mainly concerned with how to model multiple objectives of a system and how to optimize their performance. We first conduct a comprehensive survey of the state-of-the-art modeling and solution techniques to solve multi-objective optimization problems. In addition, we discuss pros and cons of each modeling and optimization technique for in-depth understanding. Further, we classify existing approaches based on the types of objectives and investigate main problem domains, critical tradeoffs, and key techniques used in each class. We discuss the overall trends of the existing techniques in terms of application domains, objectives, and techniques. Further, we discuss challenging issues based on the inherent nature of MOO problems. Finally, we suggest future work directions in terms of what critical design factors should be considered to design and analyze a system with multiple objectives.
Conference Paper
Recently, considerable research has been reported on the design of rescue frameworks, but it mostly relies on teleoperated robots or teams of wireless robots. This paper presents a cooperative rescue framework using wireless sensor and actor networks (WSANs), where the objective of the actors is to reach the event area with the aid of sensors. When an emergency event is detected, the sensor nodes transmit an event report to inform the actors. Considering actor mobility and the duty cycle of sensor nodes, the data forwarding process from sensor nodes to mobile actors is described as an agent-based search. An improved ballooning strategy is then proposed to control the search process. After the actors receive the event report, a potential-based controller is proposed to make the actors reach the event area while avoiding collisions between actors. Finally, simulation results are provided to demonstrate the effectiveness of the proposed method.
Article
The purpose of this paper is to evaluate the performance of cooperative diversity in distributed grouped two-hop and multi-hop relaying networks consisting of single-antenna terminals with constellation rearrangement. Simulation results for phase shift keyed and quadrature amplitude modulation signals in the single-carrier Rayleigh channel show that the use of cooperative diversity with a constellation rearrangement scheme improves the error probability performance in a power-constrained, grouped multi-hop relaying network operating in a multi-path fading environment. By using constellation rearrangement, the performance of the system is improved without reducing the transmission rate. Finally, the proposed scheme is evaluated through NS2 simulation in terms of throughput in order to investigate the interaction between the physical layer and the MAC layer.
Conference Paper
In a mobile network, energy consumption may be much higher for some nodes acting as relays in an OLSR network than for others. This can have a considerable effect on the network lifetime. Our approach is presented in the EDCR protocol, an extended version of OLSR that aims to increase the residual energy of the network by distributing forwarding tasks among MPRs. To that end, we change the MPR selection procedure to favor nodes with the largest number of MPR selectors. Our hypothesis is tested in NS2 and does result in lower energy consumption.
Article
We consider the problem of network formation in a distributed fashion. Network formation is modeled as a strategic-form game, where agents represent nodes that form and sever unidirectional links with other nodes and derive utilities from these links. Furthermore, agents can form links only with a limited set of neighbors. Agents trade off the benefit from links, which is determined by a distance-dependent reward function, and the cost of maintaining links. When each agent acts independently, trying to maximize its own utility function, we can characterize “stable” networks through the notion of Nash equilibrium. In fact, the introduced reward and cost functions lead to Nash equilibria (networks), which exhibit several desirable properties such as connectivity, bounded-hop diameter, and efficiency (i.e., minimum number of links). Since Nash networks may not necessarily be efficient, we also explore the possibility of “shaping” the set of Nash networks through the introduction of state-based utility functions. Such utility functions may represent dynamic phenomena such as establishment costs (either positive or negative). Finally, we show how Nash networks can be the outcome of a distributed learning process. In particular, we extend previous learning processes to so-called “state-based” weakly acyclic games, and we show that the proposed network formation games belong to this class of games.
Article
Current peer-to-peer systems are targeted at information sharing, file storage, searching and indexing, often using an overlay network. In this paper we expand the scope of peer-to-peer systems to include the concept of "communities". Communities are like interest groups, modeled after human communities, and can overlap. They can also exist without anyone knowing about their existence. Communities are created implicitly when one or more entities claim an interest in the same topic. Our work focuses on efficient methods to discover the formation of these self-configuring communities. We investigate the behavior of randomly created communities and model the complexity of discovery algorithms. Discovering communities on the fly is essential to being able to perform community-directed searching. In addition, efficient discovery algorithms allow us to manage quickly changing community structures (dynamic communities, failures, mobile nodes and so on). We use simulations to discover the architecture of randomly created communities and then study techniques for discovering communities.
Conference Paper
We use knowledge discovery techniques to guide the creation of efficient overlay networks for peer-to-peer file sharing. An overlay network specifies the logical connections among peers in a network and is distinct from the physical connections of the network. It determines the order in which peers will be queried when a user is searching for a specific file. To better understand the role of the network overlay structure in the performance of peer-to-peer file sharing protocols, we compare several methods for creating overlay networks. We analyze the networks using data from a campus network for peer-to-peer file sharing that recorded anonymized data on 6,528 users sharing 291,925 music files over an 81-day period. We propose a novel protocol for overlay creation based on a model of user preference identified by latent-variable clustering with hierarchical Dirichlet processes (HDPs). Our simulations and empirical studies show that the clusters of songs created by HDPs effectively model user behavior and can be used to create desirable network overlays that outperform alternative approaches.
Conference Paper
Online social networking sites like Orkut, YouTube, and Flickr are among the most popular sites on the Internet. Users of these sites form a social network, which provides a powerful means of sharing, organizing, and finding content and contacts. The popularity of these sites provides an opportunity to study the characteristics of online social network graphs at large scale. Understanding these graphs is important, both to improve current systems and to design new applications of online social networks. This paper presents a large-scale measurement study and analysis of the structure of multiple online social networks. We examine data gathered from four popular online social networks: Flickr, YouTube, LiveJournal, and Orkut. We crawled the publicly accessible user links on each site, obtaining a large portion of each social network's graph. Our data set contains over 11.3 million users and 328 million links. We believe that this is the first study to examine multiple online social networks at scale. Our results confirm the power-law, small-world, and scale-free properties of online social networks. We observe that the indegree of user nodes tends to match the outdegree; that the networks contain a densely connected core of high-degree nodes; and that this core links small groups of strongly clustered, low-degree nodes at the fringes of the network. Finally, we discuss the implications of these structural properties for the design of social network based systems.
Conference Paper
In this paper we present an empirical study of a workload gathered by crawling the eDonkey network, a dominant peer-to-peer file sharing system, for over 50 days. We first confirm the presence of some known features, in particular the prevalence of free-riding and the Zipf-like distribution of file popularity. We also analyze the evolution of document popularity. We then provide an in-depth analysis of several clustering properties of such workloads. We measure the geographical clustering of peers offering a given file. We find that most files are offered mostly by peers of a single country, although popular files don't have such a clear home country. We then analyze the overlap between contents offered by different peers. We find that peer contents are highly clustered according to several metrics of interest. We propose to leverage this property by allowing peers to search for content without server support, by querying suitably identified semantic neighbours. We find via trace-driven simulations that this approach is generally effective, and is even more effective for rare files. If we further allow peers to query both their semantic neighbours and, in turn, their neighbours' neighbours, we attain hit rates of over 55% for neighbour lists of size 20.
Conference Paper
We introduce a novel game that models the creation of Internet-like networks by selfish node-agents without central design or coordination. Nodes pay for the links that they establish, and benefit from short paths to all destinations. We study the Nash equilibria of this game, and prove results suggesting that the "price of anarchy" [4] in this context (the relative cost of the lack of coordination) may be modest. Several interesting extensions are suggested.
Conference Paper
Peer-to-peer sharing systems are becoming increasingly popular and are an exciting new class of innovative, internet-based data management systems. In these systems, users contribute their own resources (processing units and storage devices) and content (i.e., documents) to the P2P community. We focus on the management of content and resources in such systems. Our goal is to harness all available resources in the P2P network so that users can access all available content efficiently. Efficiency is considered both (i) from the point of view of the system, in that we strive to ensure fair load distribution among all peer nodes, and (ii) from the point of view of the users, in that we strive to ensure low user-request response times.
Conference Paper
When joining information provider peers to a peer-to-peer network, an arbitrary distribution is sub-optimal. In fact, clustering peers by their characteristics enhances search and integration significantly. Currently, super-peer networks such as the Edutella network provide no sophisticated means for such a "semantic clustering" of peers. We introduce the concept of semantic overlay clusters (SOC) for super-peer networks, enabling a controlled distribution of peers to clusters. In contrast to the recently announced semantic overlay network approach, designed for flat, pure peer-to-peer topologies and for limited metadata sets such as simple filenames, we allow a clustering of complex heterogeneous schemes known from relational databases and use the advantages of super-peer networks, such as efficient search and broadcast of messages. Our approach is based on predefined policies defined by human experts. Based on such policies, a fully decentralized broadcast-and-matching approach distributes the peers automatically to super-peers. Thus we are able to automate the integration of information sources in super-peer networks and reduce flooding of the network with messages.
Conference Paper
Many distributed systems can be modeled as network games: a collection of selfish players that communicate in order to maximize their individual utilities. The performance of such games can be evaluated through the costs of the system equilibria: the system states in which no player can increase her utility by unilaterally changing her behavior. However, assuming that all players are selfish, and in particular that all players have the same utility function, may not always be appropriate. Hence, several extensions that also incorporate altruistic and malicious behavior in addition to selfishness have been proposed in recent years. In this paper, we seek to go one step further and study arbitrary relationships between participants. In particular, we introduce the notion of the social range matrix and explore the effects of the social range matrix on the equilibria in a network game. In order to derive concrete results, we propose a simplistic network creation game that captures the effect of social relationships among players.
Conference Paper
Super-peer architectures exploit the heterogeneity of nodes in a P2P network by assigning additional responsibilities to higher-capacity nodes. In the design of a super-peer network for file sharing, several issues have to be addressed: how client peers are related to super-peers, how super-peers locate files, how the load is balanced among the super-peers, and how the system deals with node failures. In this paper we introduce a self-organizing super-peer network architecture (SOSPNET) that solves these issues in a fully decentralized manner. SOSPNET maintains a super-peer network topology that reflects the semantic similarity of peers sharing content interests. Super-peers maintain semantic caches of pointers to files which are requested by peers with similar interests. Client peers, on the other hand, dynamically select super-peers offering the best search performance. We show how this simple approach can be employed not only to optimize searching, but also to solve generally difficult problems encountered in P2P architectures such as load balancing and fault tolerance. We evaluate SOSPNET using a model of the semantic structure derived from the 8-month traces of two large file-sharing communities. The obtained results indicate that SOSPNET achieves close-to-optimal file search performance, quickly adjusts to changes in the environment (node joins and leaves), survives even catastrophic node failures, and efficiently distributes the system load taking into account peer capacities.
Article
The current approach to web searching, i.e., using centralized search engines, raises issues that call its future applicability into question: 1) coverage and scalability, 2) freshness, and 3) information monopoly. Performing web search using a P2P architecture that consists of the actual web servers has the potential to tackle those issues. In order to achieve the desired performance and scalability, as well as to enhance search quality relative to centralized search engines, semantic overlay networks (SONs) connecting peers that store semantically related information can be employed. The lack of global content/topology knowledge in a P2P system is the key challenge in forming SONs, and this paper describes an unsupervised approach for decentralized and distributed generation of SONs (DESENT). Through simulations and analytical cost models we verify our claims regarding performance, scalability, and quality.
Article
We present SETS, an architecture for efficient search in peer-to-peer networks, building upon ideas drawn from machine learning and social network theory. The key idea is to arrange participating sites in a topic-segmented overlay topology in which most connections are short-distance, connecting pairs of sites with similar content. Topically focused sets of sites are then joined together into a single network by long-distance links. Queries are matched and routed to only the topically closest regions. We discuss a variety of design issues and tradeoffs that an implementor of SETS would face. We show that SETS is efficient in network traffic and query processing load.
Article
An extensive analysis of user traffic on Gnutella shows a significant amount of free riding in the system. By sampling messages on the Gnutella network over a 24-hour period, we established that nearly 70% of Gnutella users share no files, and nearly 50% of all responses are returned by the top 1% of sharing hosts. Furthermore, we found that free riding is distributed evenly between domains, so that no one group contributes significantly more than others, and that peers that volunteer to share files are not necessarily those who have desirable ones. We argue that free riding leads to degradation of the system performance and adds vulnerability to the system. If this trend continues, copyright issues might become moot compared to the possible collapse of such systems.
Article
Peer-to-peer protocols play an increasingly instrumental role in Internet content distribution. It is therefore important to gain a complete understanding of how these protocols behave in practice and how their operating parameters affect overall system performance. This paper presents the first detailed experimental investigation of the peer selection strategy in the popular BitTorrent protocol. By observing more than 40 nodes in instrumented private torrents, we validate three protocol properties that, though believed to hold, have not been previously demonstrated experimentally: the clustering of similar-bandwidth peers, the effectiveness of BitTorrent's sharing incentives, and the peers' high uplink utilization. In addition, we observe that BitTorrent's modified choking algorithm in seed state provides uniform service to all peers, and that an underprovisioned initial seed leads to absence of peer clustering and less effective sharing incentives. Based on our results, we provide guidelines for seed provisioning by content providers, and discuss a tracker protocol extension that addresses an identified limitation of the protocol.
Article
We study Nash equilibria in the setting of network creation games introduced recently by Fabrikant, Luthra, Maneva, Papadimitriou and Shenker. In this game we have a set of selfish node players, each creating some incident links, and the goal is to minimize $\alpha$ times the cost of the created links plus the sum of the distances to all other players. Fabrikant et al. proved an upper bound of $O(\sqrt{\alpha})$ on the price of anarchy, i.e., the relative cost of the lack of coordination. Albers, Eilts, Even-Dar, Mansour, and Roditty show that the price of anarchy is constant for $\alpha = O(\sqrt{n})$ and for $\alpha \ge 12 n \lceil \lg n \rceil$, and that the price of anarchy is $15\left(1 + (\min\{\alpha^2/n,\ n^2/\alpha\})^{1/3}\right)$ for any $\alpha$. The latter bound shows the first sublinear worst-case bound, $O(n^{1/3})$, for all $\alpha$. But no better bound is known for $\alpha$ between $\omega(\sqrt{n})$ and $o(n \lg n)$. Yet $\alpha \approx n$ is perhaps the most interesting range, for it corresponds to considering the average distance (instead of the sum of distances) to other nodes to be roughly on par with link creation (effectively dividing $\alpha$ by $n$). In this paper, we prove the first $o(n^{\varepsilon})$ upper bound for general $\alpha$, namely $2^{O(\sqrt{\lg n})}$. We also prove a constant upper bound for $\alpha = O(n^{1-\varepsilon})$ for any fixed $\varepsilon > 0$, substantially reducing the range of $\alpha$ for which constant bounds have not been obtained. Along the way, we also improve the constant upper bound by Albers et al. (with the lead constant of 15) to 6 for $\alpha < (n/2)^{1/2}$ and to 4 for $\alpha < (n/2)^{1/3}$. Next we consider the bilateral network variant of Corbo and Parkes, in which links can be created only with the consent of both end points and the link price is shared equally by the two. Corbo and Parkes show an upper bound of $O(\sqrt{\alpha})$ and a lower bound of $\Omega(\lg \alpha)$ for $\alpha \le n$. In this paper, we show that in fact the upper bound $O(\sqrt{\alpha})$ is tight for $\alpha \le n$, by proving a matching lower bound of $\Omega(\sqrt{\alpha})$. For $\alpha > n$, we prove that the price of anarchy is $\Theta(n/\sqrt{\alpha})$.
Finally we introduce a variant of both network creation games, in which each player desires to minimize $\alpha$ times the cost of its created links plus the maximum distance (instead of the sum of distances) to the other players. This variant of the problem is naturally motivated by considering the worst case instead of the average case. Interestingly, for the original (unilateral) game, we show that the price of anarchy is at most 2 for $\alpha \ge n$, $O(\min\{4^{\sqrt{\lg n}},\ (n/\alpha)^{1/3}\})$ for $2\sqrt{\lg n} \le \alpha \le n$, and $O(n^{2/\alpha})$ for $\alpha < 2\sqrt{\lg n}$. For the bilateral game, we prove matching upper and lower bounds of $\Theta(n/(\alpha+1))$ for $\alpha \le n$, and an upper bound of 2 for $\alpha > n$.
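
To make the unilateral (sum-of-distances) cost concrete, here is a small sketch that evaluates a player's cost, $\alpha$ times the number of links it buys plus the sum of its hop distances to all other players, and checks by exhaustive search whether a configuration is a Nash equilibrium. The function names and the dictionary encoding of strategies are assumptions for illustration. The exhaustive check is exponential in the number of players and is only meant for tiny instances; it is not a technique from the paper, which proves analytical bounds.

```python
import math
from collections import deque
from itertools import combinations


def hop_distances(src, adj):
    """Breadth-first search: hop distance from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def player_cost(i, strategies, alpha):
    """alpha * (links bought by i) + sum of distances from i to all players.

    strategies[u] is the set of nodes that u buys a link to; a bought link
    can be used in both directions. Unreachable players give infinite cost.
    """
    n = len(strategies)
    adj = {u: set() for u in range(n)}
    for u, bought in strategies.items():
        for v in bought:
            adj[u].add(v)
            adj[v].add(u)
    dist = hop_distances(i, adj)
    if len(dist) < n:
        return math.inf
    return alpha * len(strategies[i]) + sum(dist.values())


def is_nash(strategies, alpha):
    """True if no player can strictly lower its cost by changing its links."""
    n = len(strategies)
    for i in range(n):
        current = player_cost(i, strategies, alpha)
        others = [v for v in range(n) if v != i]
        for k in range(n):
            for alt in combinations(others, k):
                trial = dict(strategies)
                trial[i] = set(alt)
                if player_cost(i, trial, alpha) < current:
                    return False
    return True


# Star on 4 nodes: each leaf buys one link to the center (node 0).
star = {0: set(), 1: {0}, 2: {0}, 3: {0}}
# Complete graph on 4 nodes: each link is bought by its lower-numbered endpoint.
complete = {0: {1, 2, 3}, 1: {2, 3}, 2: {3}, 3: set()}

print(player_cost(1, star, 2))  # 2*1 link + distances 1 + 2 + 2
print(is_nash(star, 2), is_nash(complete, 2))
```

With $\alpha = 2$, the star is stable, while in the complete graph a player can drop a link, saving $\alpha$ and lengthening only one distance by one hop.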
Conference Paper
The popularity of blogs has been increasing dramatically over the last couple of years. As topics evolve in the blogosphere, keywords align together and form the heart of various stories. Intuitively we expect that in certain contexts, when there is a lot of discussion on a specific topic or event, a set of keywords will be correlated: the keywords in the set will frequently appear together (pair-wise or in conjunction), forming a cluster. Note that such keyword clusters are temporal (associated with specific time periods) and transient. As topics recede, associated keyword clusters dissolve, because their keywords no longer appear frequently together. In this paper, we formalize this intuition and present efficient algorithms to identify keyword clusters in large collections of blog posts for specific temporal intervals. We then formalize problems related to the temporal properties of such clusters. In particular, we present efficient algorithms to identify clusters that persist over time. Given the vast amounts of data involved, we present algorithms that are fast (can efficiently process millions of blogs with multiple millions of posts) and take special care to make them efficiently realizable in secondary storage. Although we instantiate our techniques in the context of blogs, our methodology is generic enough to apply equally well to any temporally ordered text source. We present the results of an experimental study using both real and synthetic data sets, demonstrating the efficiency of our algorithms, both in terms of performance and in terms of the quality of the keyword clusters and associated temporal properties we identify.
Conference Paper
In a typical overlay network for routing or content sharing, each node must select a fixed number of immediate overlay neighbors for routing traffic or content queries. A selfish node entering such a network would select neighbors so as to minimize the weighted sum of expected access costs to all its destinations. Previous work on selfish neighbor selection has built intuition with simple models where edges are undirected, access costs are modeled by hop-counts, and nodes have potentially unbounded degrees. However, in practice, important constraints not captured by these models lead to richer games with substantively and fundamentally different outcomes. Our work models neighbor selection as a game involving directed links, constraints on the number of allowed neighbors, and costs reflecting both network latency and node preference. We express a node's "best response" wiring strategy as a k-median problem on asymmetric distance, and use this formulation to obtain pure Nash equilibria. We experimentally examine the properties of such stable wirings on synthetic topologies, as well as on real topologies and maps constructed from PlanetLab and AS-level Internet measurements. Our results indicate that selfish nodes can reap substantial performance benefits when connecting to overlay networks constructed by naive nodes. On the other hand, in overlays that are dominated by selfish nodes, the resulting stable wirings are optimized to such great extent that even uninformed newcomers can extract near-optimal performance through naive wiring strategies.
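The best-response idea in this abstract, picking k neighbors to minimize a weighted sum of access costs, can be sketched as an exhaustive k-median search. This is only an illustration under stated assumptions: the onward path costs are taken as precomputed, all names (`best_response`, `link_cost`, `onward`, `weight`) are hypothetical, and brute-force enumeration is a stand-in that is not necessarily the method used in the paper.

```python
from itertools import combinations

def best_response(candidates, k, link_cost, onward, weight):
    """Return (cost, neighbors) for the k-subset of candidates that minimizes the
    weighted sum, over destinations, of the cheapest route via a chosen neighbor."""
    best = (float("inf"), ())
    for chosen in combinations(candidates, k):
        total = 0.0
        for dest, w in weight.items():
            # Route cost = direct link to neighbor v + v's onward cost to dest.
            total += w * min(link_cost[v] + onward[v][dest] for v in chosen)
        best = min(best, (total, chosen))
    return best

# Hypothetical instance; onward costs are asymmetric (a->b differs from b->a).
link_cost = {"a": 1, "b": 2, "c": 1}
onward = {
    "a": {"a": 0, "b": 5, "c": 5},
    "b": {"a": 1, "b": 0, "c": 1},
    "c": {"a": 5, "b": 5, "c": 0},
}
weight = {"a": 1, "b": 1, "c": 1}
print(best_response(["a", "b", "c"], 1, link_cost, onward, weight))
```

With a single allowed neighbor, the well-connected node "b" wins even though its direct link is the most expensive. Enumeration is exponential in k, so it only suits small instances.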
Article
In many large-scale content sharing applications, participants or peers are grouped together forming clusters based on their content or interests. In this paper, we deal with the maintenance of such clusters in the presence of updates. We model the evolution of the system as a game, where peers determine their cluster membership based on a utility function of the query recall. Peers are guided either by selfish or altruistic motives: selfish peers aim at improving the recall of their own queries, whereas altruistic peers aim at improving the recall of the queries of other peers. We study the evolution of such clusters both theoretically and experimentally under a variety of conditions. We show that, in general, local decisions made independently by each peer enable the system to adapt to changes and maintain the overall recall of the query workload.
Article
Many P2P systems are only proven efficient for static environments. However, in practice, P2P systems are often very dynamic in the sense that peers can join and leave a system at any time and concurrently. In the first part of my talk, I will present a DHT we have developed recently in our group which maintains desirable properties under worst-case churn. In the second part of my talk, we will briefly look at another challenge of prime importance in P2P computing, namely selfishness. Concretely, some results are presented concerning the impact of selfish behavior on the performance of P2P topologies.
Stefan Schmid, Thomas Moscibroda, and Roger Wattenhofer. On the Topologies Formed by Selfish Peers. In Anthony D. Joseph, Ralf Steinmetz, and Klaus Wehrle, editors, Peer-to-Peer-Systems and -Applications, Dagstuhl Seminar Proceedings 06131, ISSN 1862-4405. Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany, 2006. http://drops.dagstuhl.de/opus/volltexte/2006/643. Keywords: Churn, Selfishness, P2P Topologies.
Conference Paper
Recently, clustered overlays in which peers are grouped based on the similarity of their content or interests have been proposed to improve performance in peer-to-peer systems. Since such systems are highly dynamic, the overlay network needs to be updated frequently to cope with changes. In this paper, we introduce an approach for updating a clustered overlay based on local decisions made by individual peers. We model the cluster-reformulation problem as a game where peers determine their cluster membership based on potential gains in the recall of their queries. We also define global criteria for the overall quality of the system and propose strategies for peer relocation that consider different behavioral patterns for the peers. Our preliminary experimental evaluation shows that our strategies cope well with changes in the overlay network.
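A peer's local relocation decision based on recall gain might look like the following minimal sketch. It assumes recall is measured as the fraction of a peer's query results held inside a cluster, and it uses a hypothetical `min_gain` threshold as a stand-in for any relocation cost; neither the function names nor the threshold come from the paper.

```python
def recall(results_by_cluster, cluster):
    """Fraction of a peer's query results that are held inside `cluster`."""
    total = sum(results_by_cluster.values())
    return results_by_cluster.get(cluster, 0) / total if total else 0.0

def choose_cluster(results_by_cluster, current, min_gain=0.0):
    """Move to the highest-recall cluster only if the gain exceeds `min_gain`."""
    if not results_by_cluster:
        return current  # no information about results, so stay put
    best = max(results_by_cluster, key=lambda c: recall(results_by_cluster, c))
    gain = recall(results_by_cluster, best) - recall(results_by_cluster, current)
    return best if gain > min_gain else current

# Hypothetical counts of results for this peer's queries, per cluster.
results = {"jazz": 6, "rock": 3, "pop": 1}
print(choose_cluster(results, current="rock"))                # relocates
print(choose_cluster(results, current="rock", min_gain=0.5))  # gain too small, stays
```

Each peer needs only its own result counts per cluster, which matches the abstract's emphasis on decisions made locally by individual peers.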
Conference Paper
The success of a P2P file-sharing network highly depends on the scalability and versatility of its search mechanism. Two particularly desirable search features are scope (ability to find infrequent items) and support for partial-match queries (queries that contain typos or include a subset of keywords). While centralized-index architectures (such as Napster) can support both these features, existing decentralized architectures seem to support at most one: prevailing unstructured P2P protocols (such as Gnutella and FastTrack) deploy a "blind" search mechanism where the set of peers probed is unrelated to the query; thus they support partial-match queries but have limited scope. On the other extreme, the recently proposed distributed hash tables (DHTs) such as CAN and CHORD couple index location with the item's hash value, and thus have good scope but cannot effectively support partial-match queries. Another hurdle to DHT deployment is their tight control of the overlay structure and the information (part of the index) each peer maintains, which makes them more sensitive to failures and frequent joins and disconnects. We develop a new class of decentralized P2P architectures. Our design is based on unstructured architectures such as Gnutella and FastTrack, and retains many of their appealing properties including support for partial-match queries, and relative resilience to peer failures. Yet, we obtain orders of magnitude improvement in the efficiency of locating rare items. Our approach exploits associations inherent in human selections to steer the search process to peers that are more likely to have an answer to the query. We demonstrate the potential of associative search using models, analysis, and simulations.
Conference Paper
In a peer-to-peer (P2P) system, nodes typically connect to a small set of random nodes (their neighbors), and queries are propagated along these connections. Such query flooding tends to be very expensive. We propose that node connections be influenced by content, so that for example, nodes having many "Jazz" files will connect to other similar nodes. Thus, semantically related nodes form a Semantic Overlay Network (SON). Queries are routed to the appropriate SONs, increasing the chances that matching files will be found quickly, and reducing the search load on nodes that have unrelated content. We have evaluated SONs by using an actual snapshot of music-sharing clients. Our results show that SONs can significantly improve query performance while at the same time allowing users to decide what content to put in their computers and to whom to connect.
Article
Locating content in decentralized peer-to-peer systems is a challenging problem. Gnutella, a popular file-sharing application, relies on flooding queries to all peers. Although flooding is simple and robust, it is not scalable. In this paper, we explore how to retain the simplicity of Gnutella, while addressing its inherent weakness: scalability. We propose a content location solution in which peers loosely organize themselves into an interest-based structure on top of the existing Gnutella network. Our approach exploits a simple, yet powerful principle called interest-based locality, which posits that if a peer has a particular piece of content that one is interested in, it is very likely that it will have other items that one is interested in as well. When using our algorithm, called interest-based shortcuts, a significant amount of flooding can be avoided, making Gnutella a more competitive solution. In addition, shortcuts are modular and can be used to improve the performance of other content location mechanisms including distributed hash table schemes.
Article
We describe an efficient peer-to-peer information retrieval system, pSearch, that supports state-of-the-art content- and semantic-based full-text searches. pSearch avoids the scalability problem of existing systems that employ centralized indexing, or index/query flooding. It also avoids the nondeterminism that is exhibited by heuristic-based approaches. In pSearch, documents in the network are organized around their vector representations (based on modern document ranking algorithms) such that the search space for a given query is organized around related documents, achieving both efficiency and accuracy.
Article
Efficiently determining the node that stores a data item in a distributed network is an important and challenging problem. This paper describes the motivation and design of the Chord system, a decentralized lookup service that stores key/value pairs for such networks. The Chord protocol takes as input an m-bit identifier (derived by hashing a higher-level application specific key), and returns the node that stores the value corresponding to that key. Each Chord node is identified by an m-bit identifier and each node stores the key identifiers in the system closest to the node's identifier. Each node maintains an m-entry routing table that allows it to look up keys efficiently. Results from theoretical analysis, simulations, and experiments show that Chord is incrementally scalable, with insertion and lookup costs scaling logarithmically with the number of Chord nodes.
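The key-to-node mapping and the m-entry routing table described here can be sketched in a few lines. This is a centralized illustration, not the distributed protocol: it assumes the standard Chord rule that a key is stored at its successor (the first node identifier at or after the key on a ring of size 2^m), and that routing entry i of node n points to the successor of n + 2^i. The function names are hypothetical.

```python
from bisect import bisect_left

def successor(node_ids, key, m):
    """First node identifier at or clockwise after `key` on a ring of size 2**m."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, key % (2 ** m))
    return ring[idx % len(ring)]  # wrap around past the largest identifier

def finger_table(node_ids, n, m):
    """The m routing entries of node n: successor(n + 2**i) for i in 0..m-1."""
    return [successor(node_ids, (n + 2 ** i) % (2 ** m), m) for i in range(m)]

# Hypothetical 3-bit ring (identifiers 0..7) with nodes 0, 1 and 3.
nodes = [0, 1, 3]
print(successor(nodes, 2, m=3))     # key 2 is stored at node 3
print(successor(nodes, 6, m=3))     # key 6 wraps around to node 0
print(finger_table(nodes, 0, m=3))  # routing entries of node 0
```

In the real system no node holds the full sorted list; each lookup is forwarded along routing-table entries, which is what yields the logarithmic lookup cost mentioned in the abstract.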
E. Tardos and T. Wexler, "Network Formation Games," Algorithmic Game Theory, Cambridge Univ. Press, 2007.
A. Crespo and H. Garcia-Molina, "Semantic Overlay Networks for P2P Systems," Technical Report, Computer Science Department, Stanford University, 2002.
C. Doulkeridis, K. Norvag, and M. Vazirgiannis, "DESENT: Decentralized and Distributed Semantic Overlay Generation in P2P Networks," JSAC, 25(1):25–34, 2007.
E. Adar and B. Huberman, "Free Riding on Gnutella," First Monday.
A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee, "Measurement and Analysis of Online Social Networks."