Self-optimizing Peer-to-Peer Networks with Selfish Processes
Amlan Bhattacharya
University of Iowa
bhattach@cs.uiowa.edu
Sukumar Ghosh
University of Iowa
ghosh@cs.uiowa.edu
Abstract
Request patterns in peer-to-peer networks are not uni-
form, and the cost of communication depends on the traffic
flows among peers. This paper illustrates how processes
in an overlay network can use the information about traf-
fic flow pattern and spontaneously adjust the topology to
minimize their communication costs. Four different self-
optimization protocols are presented. The selfish protocols
allow peers to modify their routing tables to suit their indi-
vidual needs, and are easy to implement, but the improve-
ments are limited. Compared to this, the altruistic proto-
cols that allow peers to adjust the routing tables based on
the needs of other processes, promise a better performance.
Since selfish peers may not comply, a penalty mechanism is
proposed to discourage selfishness.
1 Introduction
Peer-to-peer systems derive their power from the collab-
oration of numerous peers who contribute their resources
to provide a scalable service without any central authority.
For all such networks, efficient routing is a key concern.
Gummadi et al. [5] showed that access patterns follow a Zipf-like
distribution for web objects, and that the pattern is tempered for
immutable objects such as those shared on KaZaA. The bottom line is
that the access pattern is skewed. Well-known structured p2p networks
like Chord and Pastry have rigid topologies with no tuning parameters,
and are clearly inefficient (as measured by the total number of
routing hops needed to communicate with peers) in dealing with skewed
demands. Unstructured networks, on the other hand, have more freedom
in tuning the network topology to skewed traffic demands. Our goal is
to study mechanisms by which the topology of peer-to-peer networks
can adapt to the changing communication requirements of the peers.
One approach is content caching, which has a large space overhead but
a wider scope. An alternative approach with a more limited goal is to
tweak the network topology and the neighbor selection strategy so
that the hop count to the frequently accessed nodes becomes smaller.
This is called route caching. Route caching addresses only the
reduction of communication cost, and leaves congestion unchanged. In
this note, we focus on adaptation via route caching.
Some significant related work includes: (1) the reduction of
communication cost in Gnutella due to Wang et al. [10], (2) Shen and
Xu's elastic routing tables to deal with skewed lookups in DHT-based
networks [9], and (3) flow-adaptivity in structured p2p networks via
request profiling and the identification of interest-based clusters
(Ghosh and Bejan [1]). Adaptation strategies are challenged by
(1) churn and (2) the selfishness of the peers. Fabrikant et al. [4]
presented a game-theoretic view of network creation by selfish
agents, and studied the existence of Nash equilibria under various
cost models. Subsequently, many researchers (for example [3], [7])
extended this work. The simple cost model did not consider skewed
demands. Moscibroda, Schmid and Wattenhofer [7] showed that P2P
topologies formed by selfish peers under Fabrikant's cost model may
not stabilize, even if there is no churn. Extreme selfishness in
neighbor selection can be detrimental, since it can cause network
partition. Therefore, our base topology is a network that preserves
connectivity.
We assume that a set of n peers (machines), numbered 0 through n−1,
resides in a metric space. Each peer hosts one or more objects, and
their locations are determined by a known hashing function. By
definition, each peer i has a link to peer (i+1) mod n. On top of the
underlying unidirectional ring, each peer has k additional links to k
distinct peers. The number of neighbors that a peer can support will
depend on the capacity and the quality of its network connection. The
set of k neighbors defines the strategy of the peer. The strategy
will be determined by its communication requirement, specified as
follows:
Let the flow f(i,j) denote the number of packets from i to j
delivered per unit time, and let d(i,j) be the latency (as measured
by the number of hops) for one such communication after each peer has
added its strategy edges. Define two cost functions for the peers:
Private cost. For peer i, $C_i = \sum_{j=0}^{n-1} f(i,j) \cdot d(i,j)$.
Total cost. $C = \sum_{i=0}^{n-1} C_i$.
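
As a rough illustration of the two cost functions, the following Python sketch computes C_i and C for a small overlay. It assumes that d(i,j) is the shortest-path hop count over the ring and strategy edges, found by breadth-first search; this is a simplification, since the latency actually experienced depends on the routing mechanism. The function names are hypothetical and not part of the protocols described here.

```python
from collections import deque


def build_overlay(n, strategies):
    """Out-links of each peer: the ring edge (i+1) mod n plus its strategy edges."""
    return [{(i + 1) % n} | set(strategies[i]) for i in range(n)]


def hop_distances(links, src):
    """Breadth-first search: shortest hop count from src to every reachable peer."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in links[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def private_cost(links, flow, i):
    """C_i = sum over j of f(i,j) * d(i,j), with d assumed to be the BFS hop count."""
    dist = hop_distances(links, i)
    return sum(flow[i][j] * dist[j] for j in range(len(links)) if j != i)


def total_cost(links, flow):
    """C = sum of the private costs of all peers."""
    return sum(private_cost(links, flow, i) for i in range(len(links)))
```

Because the ring edge is always present, every peer is reachable from every other peer, so each lookup `dist[j]` is defined.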
These costs should be as low as possible, and they interplay in
different ways. A low total cost reflects an efficient network.
However, individual processes may have selfish goals too. A peer
responsible for routing packets between other peers may adopt a
routing policy that minimizes its own cost, possibly at the expense
of others. With the growing trend of sharing audio (e.g. Skype) or
video, or managing streaming RSS feeds over a P2P network, each node
is faced with dual responsibilities: forwarding others' data versus
forwarding its own data or queries. Determining priorities is a
classic scheduling problem in P2P computing. Given the flows between
pairs of nodes, the goal of self-optimization is to minimize the
communication cost(s) by letting each peer choose an appropriate
strategy and an appropriate routing algorithm.
Flows between peers change over time. We assume that each peer
records the flow using an inexpensive profiling algorithm running in
the background, and periodically self-optimizes by adapting its
strategy to the new flow. The problem under consideration involves a
subtle interplay between cooperation and competition. Peers cooperate
with one another to guarantee that every packet eventually reaches
its destination, but compete to minimize their private costs at the
expense of other processes. Based on Fabrikant's work, one can
formulate this as a non-cooperative game: the target configuration
will reflect a Nash equilibrium, where no peer can decrease its
private cost by unilaterally changing its strategy. However, this
requires each peer to have knowledge of the routing tables of all
other peers, which is unrealistic. As an alternative, we propose four
different adaptation protocols, each defined by a strategy and a
routing mechanism. The first strategy is selfish: for each peer, the
neighbors are the destinations with the highest flows originating
from itself. The second strategy is altruistic: each peer selects as
neighbors the destinations of the highest flows originating from
other peers. We demonstrate that when the top flow destinations are
uniformly distributed, the altruistic policy is much more effective
at lowering the communication cost than its selfish counterpart.
Finally, despite a prescription of policies, some peers will cheat.
To deal with cheating, we propose a penalty mechanism that will
promote compliance.
The paper is organized as follows: Section 2 formalizes
the model. Section 3 introduces the adaptation protocols,
and analyzes the communication costs. Section 4 describes
the penalty mechanism for mandating compliance. Finally,
Section 5 contains some concluding remarks.
2 Preliminaries
A P2P network is an overlay network whose topology is a directed
graph G = (V,E), where V = {0,1,···,n − 1}. Each i ∈ V denotes a
process (peer), and every directed edge (i,j) ∈ E represents an
overlay link from i to j. Each peer can choose a subset of other
peers as its immediate neighbors, and this choice is called its
strategy. The strategy space of a peer i is represented by
$S_i = 2^{V \setminus \{i\}}$, and $s_i \in S_i$ denotes a specific
strategy of peer i. A collection of the strategies of all peers
$(s_0, s_1, \cdots, s_{n-1}) \in (S_0 \times S_1 \times \cdots \times S_{n-1})$
defines the topology G.
The strategy of each peer has a fixed component and a variable
component. The fixed component is externally mandated, and its
violation may jeopardize the connectivity of G. In the present case,
by definition, the fixed component of the strategy of peer i is
(i + 1) mod n, so the strategy space shrinks to
$2^{V \setminus \{i, i+1\}}$. In the variable part, each peer can
arbitrarily choose k peers to minimize its private cost. The actual
value of k may vary from one peer to another, but for the purpose of
analysis, we will assume that all peers use the same value of k.
Structured networks have fixed strategies, and there is no variable
component.
Let $f : V \times V \rightarrow Z^*$ denote the network flow. Given
the flow, the network topology and the routing policy, one can
compute the communication cost. For each peer i, we model the
non-uniformity of the network flow by an ordered set of h destination
nodes (h < n) $\{i_1, i_2, \cdots, i_h\}$. By definition,
(1) $\forall j : 1 \le j < h : f(i, i_j) \ge f(i, i_{j+1})$, and
(2) each such flow is substantially larger than the flow towards the
remaining (n−h) destinations, so that the packets to the remaining
destinations contribute minimally to the communication cost. It is
well known that most often each peer communicates frequently with a
small number of peers based on its interests. This is also a
simplistic way of modeling a Zipf-like query pattern for analytical
purposes. We assume h > k; otherwise each peer will choose its top
flow destinations as elements of its strategy set, and the costs
become trivially optimal. By modifying the variable component of the
topology, the communication cost of a peer-to-peer network can be
optimized, and this is the essence of adaptation.
Given a flow f, a viable adaptation protocol must satisfy three
requirements:
Stabilization. The strategies must stabilize, so that the routing
tables converge to a final configuration, and remain in that
configuration unless the system records another change of flow.
Reachability. The terminal configuration of the routing tables must
guarantee that each packet sent by a node reaches its destination in
a bounded number of hops.
Cost optimization. The communication cost for the given flow f must
be as low as possible, and certainly lower than what it was for the
unoptimized network.
Lemma 2.1. Given a flow, the problem of finding an optimal topology
that minimizes the communication cost is NP-hard.
This follows from a possible reduction of this problem
to the network creation game [4] known to be NP-hard.
3 Adaptation protocols
We deviate from the classic game-theoretic formulation [4] on the
ground that peers do not have an easy mechanism for inspecting the
residual graph when a fraction of the strategy edges have been added.
We present four different protocols for flow adaptation. Each
protocol is characterized by a strategy and a routing mechanism.
These are: (1) selfish strategy with last-hop routing, (2) selfish
strategy with greedy routing, (3) altruistic strategy with last-hop
routing, and (4) altruistic strategy with greedy routing.
3.1 Selfish protocols
The selfish protocols are based on the following strategy:
$\forall i \in V : s_i = \{i_1, \cdots, i_k\}$. Using this strategy,
packets directed towards k of the h top flow destinations will arrive
in a single hop regardless of the routing mechanism chosen, but the
cost of routing packets to the remaining (h − k) destinations
deserves a close look, and will depend on the routing mechanism. We
consider two different cases.
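
The selfish strategy above amounts to a top-k selection over a peer's own outgoing flows. A minimal Python sketch follows, assuming each peer holds a profiled list of its outgoing flows indexed by destination; the function name and the tie-breaking rule (lower peer id first) are illustrative assumptions.

```python
def selfish_strategy(flow_row, i, n, k):
    """Return the k destinations with the largest flow originating from peer i.

    flow_row[j] is the profiled flow f(i, j). Peer i itself and its ring
    successor (the fixed component of the strategy) are excluded.
    """
    successor = (i + 1) % n
    candidates = [j for j in range(n) if j != i and j != successor]
    # Python's sort is stable, so equal flows keep ascending peer-id order.
    candidates.sort(key=lambda j: flow_row[j], reverse=True)
    return set(candidates[:k])
```

A peer would rerun this selection whenever its profiler reports a changed flow pattern.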
Case 1. Selfish strategy with last-hop routing. Routing is last-hop
when each packet from node i is always forwarded to the next node
(i + 1) mod n unless there is a link to the final destination.
Theorem 3.1. For the selfish protocol using last-hop routing, if the
top flow destinations are randomly distributed, then packets from a
peer i will arrive at the top k-out-of-h destinations in a single
hop, and reach the remaining (h − k) destinations in an expected
number of $\frac{n-1}{k}$ hops.
Since k is much less than n, the expected O(n) hop count is
unacceptably large. Thus this protocol performs poorly.
Case 2. Selfish strategy with greedy routing. With greedy routing, a
packet towards a destination j will be forwarded by every peer i to
the neighbor $i_t$ ($0 \le t \le k$) for which $d(i_t, j)$ is the
smallest.
If the top flow destinations are distributed around the ring with
uniform random probability, then the diameter of the graph is
approximately log n. However, Kleinberg [6] showed that no efficient
decentralized algorithm exists for finding such paths¹. Routing in
polylog time is possible when the probability that an edge exists
between nodes i and j is proportional to $\frac{1}{d(i,j)^r}$ (for a
ring, r = 1). In [8], Shah showed that with k such edges, the routing
time is $O\left(\frac{\log^2 n}{k}\right)$. This is a special case
and is unlikely to help out the selfish strategy in general.
¹ The best decentralized algorithm has an expected routing time of
$O(n^{2/3})$.
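
The two routing mechanisms can be contrasted with a short Python sketch. It assumes that `links[i]` holds the strategy edges of peer i, that the ring edge to (i + 1) mod n is always available, and that greedy routing measures closeness by clockwise ring distance; the last point is an assumption made for illustration, since the text defines the choice in terms of d(i_t, j) without fixing how a peer estimates it.

```python
def ring_distance(a, b, n):
    """Clockwise hop distance from a to b on the unidirectional ring."""
    return (b - a) % n


def last_hop_route(src, dst, links, n):
    """Forward along the ring until some peer on the way has a direct link to dst."""
    hops, cur = 0, src
    while cur != dst:
        cur = dst if dst in links[cur] else (cur + 1) % n
        hops += 1
    return hops


def greedy_route(src, dst, links, n):
    """At each peer, forward to the out-neighbor closest to dst (clockwise)."""
    hops, cur = 0, src
    while cur != dst:
        neighbors = links[cur] | {(cur + 1) % n}
        # The ring successor always reduces the clockwise distance by one,
        # so the chosen neighbor is strictly closer and the loop terminates.
        cur = min(neighbors, key=lambda v: ring_distance(v, dst, n))
        hops += 1
    return hops
```

Averaging either function over randomly drawn strategy sets gives an empirical hop count that can be set against the bounds quoted above.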
3.2 Altruistic protocols
In the altruistic strategy, each node i ranks the flows originating
from other nodes j ≠ i (and routed through i) in descending order,
and chooses the destinations of the top k flows as its strategy.
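
A minimal Python sketch of the altruistic selection follows. It assumes that peer i has recorded the transit traffic it forwarded as (source, destination, flow) triples, and that flows sharing a destination are added together before ranking; both the record format and the aggregation are illustrative assumptions rather than details fixed by the protocol.

```python
from collections import Counter


def altruistic_strategy(transit_flows, i, n, k):
    """Return the k destinations carrying the most transit flow through peer i.

    transit_flows: iterable of (source, destination, flow) triples that
    peer i observed while forwarding packets for other peers.
    """
    successor = (i + 1) % n
    demand = Counter()
    for src, dst, flow in transit_flows:
        # Skip the peer's own traffic and destinations it already links to.
        if src != i and dst != i and dst != successor:
            demand[dst] += flow
    return {dst for dst, _ in demand.most_common(k)}
```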
Case 1. Altruistic strategy with last-hop routing. The al-
truistic strategy defines the routing tables of the peers. Us-
ing last-hop routing, if the destination of a packet does not
belong to the strategy of a given peer, then the packet is
forwarded to the next peer (i + 1) mod n.
Theorem 3.2. The altruistic strategy with last-hop routing delivers
packets to the top h destinations in an average of
$O\left(\frac{h}{k}\right)$ hops.
Proof. (Hint only) Consider a global order of all flows in the
network: q.1 ≥ q.2 ≥ q.3 ≥ ··· If the routing tables stabilize, then
the nodes that generate the flows q.1 through q.k must have their
packets delivered within at most two steps, and the nodes that
generate the flows q.((m−1)k+1) through q.(mk) must have their
packets delivered in at most (m + 1) steps. As a result, all packets
directed to the n·h top destinations (for n nodes) will be delivered
within $1 + \frac{n \cdot h}{k}$ steps. Hence the result.
Theorem 3.3. Using the altruistic strategy and last-hop routing, the
routing tables stabilize in $O(n \cdot k)^2$ steps.
We conjecture that the average stabilization time for the
routing tables is much lower than the worst case figure.
Case 2. Altruistic strategy with greedy routing. For the altruistic
strategy with greedy routing, we first show that if there is only one
peer sending out requests, then packets to its top h destinations are
delivered in an expected number of $\frac{\log h}{\log k}$ hops. To
understand this, consider the k-ary tree with node i as the root.
After r hops, packets to the top $(k + k^2 + \cdots + k^{r-1})$
destinations will be delivered. Thus
$h = (k + k^2 + \cdots + k^{r-1})$, which leads to the result. This
is a lower bound. The following theorem computes the expected hop
count in the general case of multiple peers.
Theorem 3.4. If the top flow destinations are uniformly distributed,
then the altruistic strategy with greedy routing delivers packets to
the top h destinations in an expected number of $\frac{h}{k}$ hops.
Proof. A packet from a node i will be delivered to its destination by
another node j ≠ i if the flow to that destination ranks among the
top k of all flows passing through j. Since there are n·h total
contenders, the expected number of contenders at a single node is h,
and the probability of making the cut is k/h. If this packet fails to
make the cut, then it is forwarded to another node. The probability
of making the cut after r hops is
$\frac{k}{h} \cdot \left(1 - \frac{k}{h}\right)^{r-1}$. So, the
expected routing time is $\frac{h}{k}$ hops.
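
The last step of the proof is the mean of a geometric distribution with success probability p = k/h, which equals 1/p = h/k. As a quick numerical check, the Python sketch below sums the series directly; the function name and the truncation length are arbitrary choices made for illustration.

```python
def expected_hops(h, k, terms=10_000):
    """Approximate sum over r >= 1 of r * p * (1 - p)**(r - 1), with p = k / h.

    The series is truncated after `terms` terms; the neglected tail is
    negligible as long as p is not extremely small.
    """
    p = k / h
    return sum(r * p * (1 - p) ** (r - 1) for r in range(1, terms + 1))
```

For moderate values of h and k the truncated sum should agree with h/k to within floating-point error.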
The selfish protocol with the strategy edges satisfying Kleinberg's
distribution can outperform the altruistic protocol when
$\frac{\log^2 n}{k} < \frac{h}{k}$, i.e. $h > \log^2 n$.
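
To get a feel for where this crossover lies, the comparison can be evaluated directly. The Python sketch below assumes base-2 logarithms and ignores the constants hidden in the O-notation, so it indicates only the order of magnitude; the function name is hypothetical.

```python
import math


def selfish_kleinberg_wins(n, h, k):
    """True when (log2 n)^2 / k < h / k, i.e. when h > (log2 n)^2."""
    return (math.log2(n) ** 2) / k < h / k
```

With n = 1024 the threshold is h = 100, so a peer would need more than about a hundred heavily used destinations before the Kleinberg-style selfish edges pay off under these assumptions.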
4 Protocol enforcement
Despite the attempt to mandate the altruistic protocol, a
fraction β (β < 1) of all peers will try to derive undue ad-
vantage by switching to the selfish protocol, while expect-
ing others to be altruistic. Restricting the game to a choice
between the selfish and the altruistic protocols, the follow-
ing lemma quantifies the advantage of the cheaters.
Lemma 4.1. In a network of n peers, if a small fraction β cheat by
using the selfish protocol, then the average hop count of the
cheaters will be
$\frac{1}{h}\left(k + (1-\beta)(h-k)\frac{h}{k}\right)$, and the
average hop count of the loyal peers will be $\frac{h}{k(1-\beta)}$.
The advantage of the cheaters will gradually fade away
as β increases, making cheating less lucrative. However,
due to the concern that the cheaters may get undue advan-
tage, in a stable equilibrium, all peers will eventually revert
to the selfish protocol. This leads to the following theorem:
Theorem 4.2. The Nash equilibrium corresponds to all
peers using the selfish protocol.
Since the selfish strategy leads to an expected average routing cost
of $O\left(\frac{n}{k}\right)$, and the altruistic strategy reduces
this to $O\left(\frac{h}{k}\right)$ (or perhaps lower if we use the
formulation from [4]), adoption of the selfish strategy raises the
communication cost by a factor of at least
$O\left(\frac{n}{h}\right)$.
Mandating compliance with the altruistic protocol needs (1) a
mechanism for the detection of cheating, and (2) a deterrent for
penalizing the cheaters. The following lemma is the cornerstone for
the detection of cheating.
Lemma 4.3. In the altruistic protocol, no loyal peer should
route its packets in a single hop, unless the destination of
the packet matches with a top destination of another peer.
This event is detectable by the destination peer, unless there is a
coalition of cheaters working to suppress such observations. The
penalty for this action is the addition of an amount ζ to the hop
count for packets originating from the potential offender, so that
peers are discouraged from cheating. In practice, ζ will translate to
an additional delay. Consider the flows from a cheater i to its h top
destinations. Assuming these flows to be identical, the total cost
for routing (including the penalty) will be
$Cost_1 = k + (h-k)\left(\frac{h}{k} + \zeta\right)$.
If peer i did not cheat, then the cost for routing to the top h
destinations would have been $h \cdot \frac{h}{k} = \frac{h^2}{k}$.
However, there is a small probability that some of the top
destinations of i will match with a peer in its strategy set, and
routing to these destinations will cause i to be unfairly penalized.
Assuming uniform distribution of the top flow destinations, the
probability of each such event is $O\left(\frac{k}{n}\right)$. Thus
the routing cost without cheating will be
$Cost_2 = h\left(1 - \frac{k}{n}\right) \cdot \frac{h}{k} + h \cdot \frac{k}{n} \cdot \zeta$.
To be a deterrent, $Cost_1 > Cost_2$. Since k << h << n, ζ > 1.
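
The deterrence condition can be checked numerically. The Python sketch below evaluates both cost expressions and solves Cost1 = Cost2 for ζ, which is possible in closed form because both costs are linear in ζ. It treats the O(k/n) matching probability as exactly k/n, which is an assumption, and the function names are hypothetical.

```python
def cost_cheating(h, k, zeta):
    """Cost1 = k + (h - k) * (h / k + zeta)."""
    return k + (h - k) * (h / k + zeta)


def cost_loyal(n, h, k, zeta):
    """Cost2 = h * (1 - k/n) * (h/k) + h * (k/n) * zeta."""
    return h * (1 - k / n) * (h / k) + h * (k / n) * zeta


def break_even_penalty(n, h, k):
    """Value of zeta at which Cost1 equals Cost2.

    Expanding both costs gives
        Cost1 - Cost2 = (k - h + h*h/n) + zeta * (h - k - h*k/n),
    so cheating costs more than compliance exactly when zeta exceeds the
    returned value (assuming h - k > h*k/n, which holds for k << h << n).
    """
    return (h - k - h * h / n) / (h - k - h * k / n)
```

For k << h << n the break-even value sits just below 1, which is consistent with the requirement ζ > 1.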
5 Conclusion
Our analysis disregarded churn. Intuitively, churn tends to
destabilize the routing tables generated by the altruistic protocols.
The ability to combat churn will depend on the speed at which the
network adapts to the new flow pattern, as well as on the churn rate.
If the mean time between the arrivals or departures of peers is
shorter than the mean time to stabilize the routing tables, then the
protocol will fail to deliver the promised performance. One method is
to first adapt the network to the reputation of the peers [2], and
then run our protocol on top of this. This will prevent or minimize
adversarial churn. The impact of random churn is likely to be much
gentler.
References
[1] Bejan, A., Ghosh, S. Self-Optimizing DHTs using Request
Profiling. OPODIS 2004, pp. 87-95 (also in LNCS 3544,
140-153).
[2] Condie T., Kamvar, S.D., and Garcia-Molina H. Adaptive
Peer-to-Peer Topologies. Proc. 4th International Conference
on Peer-to-Peer Computing (P2P’04) 2004.
[3] Eidenbenz S, Kumar V, and Zust S. Equilibria in Topology
Control Games for Ad Hoc Networks. DIALM-POMC, 2003.
[4] Fabrikant A, Luthra A, Maneva E, Papadimitriou C.H., and
Shenker, S. On a Network Creation Game. ACM PODC
2003, 347-351.
[5] Gummadi P.K., Dunn R.J., Saroiu S., Gribble S.D., Levy
H.M., Zahorjan J. Measurement, Modeling, and Analysis of
a Peer-to-Peer File-sharing Workload. SOSP 2003, 314-329.
[6] Kleinberg, J. The Small-World Phenomenon: An Algorith-
mic Perspective. STOC 2000.
[7] Moscibroda T, Schmid S, Wattenhofer, R. On the Topology
formed by Selfish Peers. ACM PODC 2006.
[8] Shah G. Distributed Data Structures for Peer-to-Peer Sys-
tems. Ph.D. thesis, Yale University, 2003.
[9] Shen H, Xu C-Z: Elastic Routing Table with Provable Per-
formance for Congestion Control in DHT Networks. ICDCS
2006.
[10] Wang, C., Xiao, L., Lin Y., and Zheng, P. Distributed Caching
and Adaptive Search in Multilayer P2P Networks. ICDCS
2004.