Survey on
Peer-to-Peer Systems: Incentives for Cooperation
Smita Rai
Guides: Prof. Dipak Ghosal, Prof. Xin Liu
Abstract
Peer-to-peer computation, because of its unique advantages over the common client / server
model, has given rise to several killer applications like Napster, Gnutella and KaZaA. The central
tenet of P2P systems is cooperation. However, most users are not altruistic and have some
natural disincentives to cooperate. Thus, incentive mechanisms that motivate users to contribute
resources may be critical to the eventual success of such systems. This report looks at some
well-known peer-to-peer applications and the problems posed by “free-riders” in such systems.
We then survey some of the incentive based schemes proposed to overcome this problem.
ECS289
Survey Report
Table of Contents
Introduction
    The Tragedy of the Commons
    Gnutella: A Case Study
Incentive based Schemes
    Quantifying disincentives in P2P Systems
    Rationality and Self Interest in P2P Networks
    Peer-Approved Incentive Mechanism
    Incentives for cooperation in Peer-to-Peer Networks
    Addressing the Non-cooperation Problem in P2P Systems
Conclusions
References
Introduction
The appearance of new forms of Peer-to-Peer (P2P) network applications such
as Gnutella [Gn00a], KaZaA [Kazaa] and FreeNet [Fr00], holds promise for the
emergence of fully distributed information sharing systems. These systems, inspired by
Napster [Na00], will allow users worldwide access and provision of information while
enjoying a level of privacy not possible in the present client-server architecture of the
Web. The traditional client / server architecture has the following limitations:
a) Hard to achieve scalability.
b) Single point of failure.
c) Administrative requirements.
d) Unused resources at the edges of the network.
P2P computing, which aims to avoid the above problems, is defined as the
sharing of computer resources and services by direct exchange between systems [P2P].
These resources and services can include the exchange of information, processing
cycles, cache storage, and disk storage for files. P2P computing takes advantage of the
existing computing power, computer storage and networking connectivity, allowing users
to leverage their collective power for the ‘benefit’ of all. Keeping in mind the central tenet
of co-operation, this gives rise to the issue of securing enough cooperation in such large
and autonomous systems so that they become truly useful. There is a possibility that
users will stop producing and only consume. This free riding behavior is the result of a
social dilemma that all users of such systems confront and may result in “The Tragedy of
the Commons” [Hardin68] for the system.
The Tragedy of the Commons
This term was first coined by G. Hardin [Hardin68] to denote the situation in
which a group of people attempts to utilize a common good in the absence of central
authority. In the context of P2P applications this common good can be the provision of a
very large library of files, music and other documents to the user community. The
dilemma for each individual is then to either contribute to the common good, or to shirk
and free ride on the work of others.
Hardin used a simple example of an open pasture to demonstrate how the
tragedy develops. It is to be expected that each herdsman will try to keep as many cattle
as possible on the commons. As a rational being, each herdsman seeks to maximize his
gain. His utility for adding one more animal to his herd has one negative and one
positive component.
1. The positive component is a function of the increment of one animal. Since the
herdsman receives all the proceeds from the sale of the additional animal, the positive
utility is nearly +1.
2. The negative component is a function of the additional overgrazing created by one
more animal. Since, however, the effects of overgrazing are shared by all the herdsmen,
the negative utility for any particular decision-making herdsman is only a fraction of -1.
Adding together the component partial utilities, the rational herdsman concludes
that the only sensible course for him to pursue is to add another animal to his herd.
However, this is the conclusion reached by each and every rational herdsman sharing
the commons. “Therein is the tragedy. Each man is locked into a system that compels
him to increase his herd without limit -- in a world that is limited. Ruin is the destination
toward which all men rush, each pursuing his own best interest in a society that believes
in the freedom of the commons” [Hardin68].
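The herdsman's reasoning can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the overgrazing cost of one extra animal is exactly 1 and is split evenly among all herdsmen; Hardin only says that the individual share is "a fraction of -1". The function name and the even split are assumptions and do not come from [Hardin68].

```python
from fractions import Fraction


def marginal_utility(n_herdsmen: int) -> Fraction:
    """Net utility to one herdsman of adding one animal.

    Assumes (for illustration only) a gain of +1 kept entirely by the
    owner and an overgrazing cost of -1 shared evenly by all herdsmen.
    """
    if n_herdsmen < 1:
        raise ValueError("need at least one herdsman")
    private_gain = Fraction(1)
    shared_cost = Fraction(1, n_herdsmen)
    return private_gain - shared_cost


for n in (1, 2, 10, 100):
    print(n, marginal_utility(n))
```

Under these assumptions the net utility is positive whenever more than one herdsman shares the pasture, which is why each of them keeps adding animals.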
In the following section we look at a typical P2P file sharing system and how the
“free-riders” create problems that limit the utility of the system.
Gnutella: A Case Study
The architecture for the Gnutella network [Gn00a] is as follows:
Fig 1: Gnutella
(Courtesy: http://computer.howstuffworks.com/file-sharing.htm)
1. No central servers.
2. In order to join the system, a user initially connects to one of the several
known hosts that are almost always available.
3. The user uses an application that adheres to the Gnutella protocol. Each
instance of this application is called a peer. A peer can act as a client
(consumer of information) or server (a supplier of information).
4. Peers broadcast query messages for a file, each carrying a time-to-live (TTL). A peer
that receives a query message either sends a query response (if it has the file) or
forwards the query to its neighbors, unless the TTL has expired.
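The query step in item 4 can be sketched as a TTL-limited flood over the overlay graph. This is a simplified simulation, not the Gnutella wire protocol: the function name, the dictionary-based graph, and the suppression of duplicate queries through a `seen` set are assumptions made for illustration.

```python
from collections import deque


def flood_query(graph, files, origin, wanted, ttl):
    """Return the set of peers that answer a TTL-limited flooded query.

    graph:  dict mapping peer -> list of neighbor peers
    files:  dict mapping peer -> set of file names it shares
    A peer that has the file responds and does not forward; any other
    peer forwards to its neighbors while the TTL allows.
    """
    responders = set()
    seen = {origin}  # assumed duplicate suppression
    queue = deque((nbr, ttl) for nbr in graph.get(origin, []))
    while queue:
        peer, hops_left = queue.popleft()
        if peer in seen or hops_left <= 0:
            continue  # duplicate query, or TTL exhausted
        seen.add(peer)
        if wanted in files.get(peer, set()):
            responders.add(peer)
            continue
        for nbr in graph.get(peer, []):
            if nbr not in seen:
                queue.append((nbr, hops_left - 1))
    return responders
```

In a chain A-B-C-D where only D shares the file, a query from A reaches D with a TTL of 3 but dies at C with a TTL of 2, which illustrates how the TTL bounds the search horizon.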
Since files on Gnutella are treated like a public good and the users are not
charged in proportion to their use, it appears rational for people to download music files
without contributing by making their own files accessible to other users. Because every
individual can reason this way and free ride on the efforts of others, the whole system's
performance can degrade considerably, which makes everyone worse off.
The second problem caused by free riding is that it makes the system vulnerable and
exposes individual users to risk. If only a few individuals contribute to the public good,
these few peers effectively act as centralized servers. Users in such an environment
thus become vulnerable to lawsuits, denial of service attacks, and potential loss of
privacy.
Extensive analysis of user traffic on Gnutella shows a significant amount of free
riding in the system [AdHu00]. By sampling messages on the Gnutella network, the
authors find that almost 70% of Gnutella users share no files, and nearly 50% of all
responses are returned by the top 1% of sharing hosts.
The top               Share        As percent of the whole
333 hosts (1%)        1,142,645    37%
1,667 hosts (5%)      2,182,087    70%
3,334 hosts (10%)     2,692,082    87%
5,000 hosts (15%)     2,928,905    94%
6,667 hosts (20%)     3,037,232    98%
8,333 hosts (25%)     3,082,572    99%
Table 1 – Statistics for the Gnutella Network [AdHu00]
This study also differentiates between two kinds of free riders:
1. Peers that do not provide files for download by others.
2. Peers that provide downloadable content that is not desirable. Essentially, a
quality versus quantity issue. This poses a social dilemma when there is a cost to
the provider to make desirable files available to others.
Thus, the case of the Gnutella network demonstrates the need for providing
incentives for the users to cooperate in similar P2P applications. In the next section we
look at some of the proposals that seek to mitigate selfish behavior of the users to
promote the utility of the overall system.
Incentive based Schemes
Quantifying disincentives in P2P Systems
In this paper [Feldman03], the authors attempt to quantify the performance-based disincentive a user in a typical P2P file-sharing system may have. They use the
average latency of a file transfer as the performance metric. The authors try to capture
how the performance experienced by a user varies as a function of:
a) the sharing level
b) whether a user shares files or not
c) the asymmetry in the host incoming and outgoing bandwidths
d) the system load.
Fig 2: Local view of one host downloading from another [Feldman03]
(Server S, with link bandwidths bSin and bSout, sends data to client C, with link
bandwidths bCin and bCout; C returns TCP ACKs to S.)
The authors make the following assumptions about their model:
a) Download time is dominated by transfer time.
b) The bottleneck is always at the edge of the network.
c) Traffic follows the TCP protocol.
d) Searches experience no delay and require negligible bandwidth.
e) Files have the same size, popularity and spatial distribution.
f) Generated load is evenly distributed.
g) The number of uploads per node is proportional to its outgoing bandwidth.
The authors distinguish between potential and actual disincentives. The potential
disincentive arises when users believe that their downloads will be delayed by their
uploads. The reason is that the throughput of a TCP connection depends on the
interactions between the data and the ACK flows. The authors show, by simulating a
three-node network, that as a result of uploads, ACKs from a node are delayed
sufficiently (in contending with the uploaded data on the outgoing link) to result in the
following utilization (fraction of the incoming link used for downloads):
            in bw       out bw      link utilization
ADSL        1.5Mb/s     128Kb/s     0.2
Ethernet    10Mb/s      10Mb/s      0.8
Thus, there is a high potential disincentive for a node to allow uploads,
particularly in the ADSL case.
However, the authors show theoretically, based on their model, that the actual
disincentive depends on the location of the bottleneck in transmission. If the server is the
bottleneck, which occurs for a low level of sharing, there is no actual disincentive for the
client to share. If the client is the bottleneck, which occurs when the level of sharing in
the network is high, the client has a disincentive to share.
The authors’ simulation results substantiate their theoretical results. For
homogeneous systems there is no disincentive to share, regardless of the sharing level.
For heterogeneous systems, the nodes with ADSL experience an actual disincentive to
share, but only at a high level of sharing in the network.
Fig 3: Latency experienced by nodes in a heterogeneous system (95% ADSL, 5%
Ethernet nodes) [Feldman03]
To remove this disincentive, the authors propose that TCP ACKs be prioritized
over normal data flows on the outgoing link. The authors consider the simulation results
inconclusive, however, since prioritization has a positive effect on the receiver’s incoming
throughput but a negative effect on the sender’s outgoing throughput.
Rationality and Self Interest in P2P Networks
This paper [Shn03] is largely theoretical. It is included in the survey because it
offers an interesting perspective and proposes using an emerging field of computer
science and artificial intelligence to address the problem of rational behavior in
P2P systems. The paper has three objectives:
a) To convince the reader that rationality is a real issue in peer-to-peer networks.
b) To introduce Algorithmic Mechanism Design (AMD) and Distributed Algorithmic
Mechanism Design (DAMD) as tools, which can be used when designing
networks with rational nodes.
c) To describe three open problems that are relevant in the peer to peer setting but
are unsolved in existing AMD/DAMD work.
To establish that rational behavior arises in any P2P system, the authors give
examples from different kinds of P2P applications, from peer-to-peer search, as in
KaZaA [Kazaa], to peer-to-peer computation, as in the SETI@home project.
Proposing the use of the field of Mechanism Design (MD), the authors give an
overview of the objectives of MD, which are of interest to the designer of a P2P system.
The idea in MD is to define the strategic situation, or “rules of the game”, so that the
system as a whole exhibits good behavior in equilibrium when self-interested nodes
pursue self-interested strategies. Formally, a mechanism is a specification of possible
player strategies and a mapping from the set of played strategies to outcomes. MD can
be thought of as inverse game theory – where game theory reasons about how agents
will play a game, MD reasons about how to design games that produce desired
outcomes.
MD assumes that the players feed their calculated strategies to a special
obedient center that performs the mechanism calculation and declares the outcome.
A famous example of a good mechanism (with a center) is the second-price sealed-bid
auction (Vickrey Auction). As opposed to MD, the field of DAMD assumes that the
mechanism calculation is carried out via a distributed computation.
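As an illustration of a mechanism with a center, the second-price sealed-bid auction mentioned above can be sketched in a few lines. The function name, the dictionary of bids, and the tie-breaking rule are assumptions chosen for this example; they are not taken from [Shn03].

```python
def vickrey_auction(bids):
    """Second-price sealed-bid auction run by an obedient center.

    bids: dict mapping bidder name -> sealed bid (a number).
    Returns (winner, price): the highest bidder wins but pays the
    second-highest bid. Ties are broken by bidder name (an assumption).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda item: (-item[1], item[0]))
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price
```

Because the price paid does not depend on the winner's own bid, a self-interested bidder gains nothing by misreporting its valuation, which is the sense in which the mechanism produces good behavior in equilibrium.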
Finally, the authors raise three open problems that are unsolved in the AMD /
DAMD work but are relevant to the P2P setting:
a) Open Problem #1: What effect does network topology have on message passing
in a centralized mechanism running on a peer-to-peer network? What about in a
decentralized mechanism?
b) Open Problem #2: What are the bounds on the guarantees that mechanism
design can provide in a distributed setting, and what is the minimum set of helper
technologies that must be employed in concert with DAMD ideas in distributed
networks?
c) Open Problem #3: How can assumptions about the distribution (but not the
identity) of various node strategy types help to create mechanisms with good
properties?
Peer-Approved Incentive Mechanism
The authors in [Rang03] model the problem of co-operation in P2P systems as a
Multi-Person Prisoner’s Dilemma (MPD) [MPD]. The following four conditions define an
MPD:
1. There are n players in the system, each with the same binary choice and payoffs.
2. Each player has the same preferred choice, which does not change, no matter what
other players do.
3. A player is always better off if more among the others choose the un-preferred
alternative.
4. For a certain k > 1, if k or more players choose the un-preferred alternative, they are
better off than if all players had chosen the preferred alternative.
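The four conditions can be checked on a toy payoff model. The sketch below assumes linear payoffs that are not taken from [Rang03] or [MPD]: every player gains b for each other player who shares, and sharing costs the sharer c, with b > 0 and c > 0. Not sharing is then always the preferred choice (condition 2), everyone benefits when more of the others share (condition 3), and the function finds the smallest k for condition 4.

```python
def defect_payoff(others_cooperating, b):
    """Payoff for the preferred choice (not sharing)."""
    return b * others_cooperating


def cooperate_payoff(others_cooperating, b, c):
    """Payoff for the un-preferred choice (sharing), which costs c."""
    return b * others_cooperating - c


def smallest_k(n, b, c):
    """Smallest coalition size k satisfying condition 4, or None.

    Each of k cooperators has k - 1 other cooperators, and must beat
    the payoff when all n players defect, which is defect_payoff(0, b).
    """
    all_defect = defect_payoff(0, b)
    for k in range(2, n + 1):
        if cooperate_payoff(k - 1, b, c) > all_defect:
            return k
    return None
```

For example, with b = 1 and c = 2.5 a coalition of four sharers already does better than universal defection, even though each member would individually still prefer to stop sharing.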
They classify incentivizing schemes as using either pricing policies or non-pricing policies, and they compare one scheme from each category and give simulation results:
a) Token Exchange - A form of pricing scheme, in which a consumer
   must transfer a token to the supplier prior to a file download. To enable
   newcomers to use the system, each first-time user might be allotted a
   fixed number of tokens, but once these run out, the user has to serve
   files to earn tokens.
b) Peer-Approved - A reputation system is used to maintain ratings for
   users, who are allowed to download files only from others with a lower
   or equal rating. This strategy motivates users to increase their rating in
   order to gain access to more files. User ratings can be based on
   different metrics: e.g., the number of files advertised by a user or the
   number of file-requests served by a user. First-time users without files
   to share should be allowed to download a small number of files so that
   they can enter the system and build their rating.
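The two policies can be sketched as simple admission rules. This is a minimal illustration of the rules as described above, not the simulator used in [Rang03]; the class and function names and the size of the initial token grant are hypothetical, and the newcomer allowance of Peer-Approved is left out.

```python
class TokenExchange:
    """Pricing policy: one token moves from consumer to supplier per download."""

    def __init__(self, initial_tokens=5):  # starting grant is a made-up value
        self.initial_tokens = initial_tokens
        self.tokens = {}

    def join(self, user):
        """Give a first-time user the fixed initial allotment."""
        self.tokens[user] = self.initial_tokens

    def download(self, consumer, supplier):
        if self.tokens[consumer] < 1:
            return False  # consumer must serve files to earn tokens first
        self.tokens[consumer] -= 1
        self.tokens[supplier] += 1
        return True


def peer_approved_allows(consumer_rating, supplier_rating):
    """Non-pricing policy: download only from peers rated lower or equal."""
    return supplier_rating <= consumer_rating
```

Under Token Exchange a user is constrained on every download, whereas under Peer-Approved a user's rating determines, once and for all downloads, which suppliers are accessible.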
The authors believe the second, non-pricing scheme is more flexible, since users
do not have to make a decision each time they want a file; this parallels the argument for
flat pricing over usage-based pricing. However, the authors assume that an underlying
reliable and secure mechanism for implementing the above schemes is already in place,
and focus on the policies.
The authors use simulations to compare the above two strategies and a modification of
Peer-Approved, Peer-Approved Tier (in which only a limited number of user rating
categories are allowed). In the simulation, a heterogeneous set of users changes the
number of files they share in each iteration, depending on the perceived benefits.
Each user has 50 files, which are assumed to be equally popular. All users have the
same bandwidth and storage space. Each user initially advertises only a percentage of
his files, according to a Zipf distribution. For the Peer-Approved schemes, the rating of a
user is the number of files currently advertised. The results are illustrated in Fig 4.
Fig 4: Simulation results for a Zipf file advertising distribution [Rang03]
Thus, the performance of Peer-Approved is comparable to that of Token Exchange,
and so the authors conclude that it is a useful scheme in scenarios where pricing
schemes like Token Exchange are not preferred.
However, it is to be noted that the authors ignore the existence of the second
type of free riders, as shown in the Gnutella study, those who advertise content that is
not desired by others. So, even if a user is advertising files he may choose to advertise
files which are of no use to others, and thus manipulate the system.
Incentives for cooperation in Peer-to-Peer Networks
In this paper [Lai03], the authors use and extend the Evolutionary Prisoner’s
Dilemma [EPD] to study co-operation in a P2P system. The EPD extends the classical
Prisoner’s Dilemma by introducing repetition of games and the building of reputation. The
authors’ extended version, called the Asymmetric EPD (AEPD), works as follows:
a) AEPD consists of players who meet for games.
b) A player can be a client in one game and a server in another.
c) The server has a choice between co-operation and defection.
d) Players decide depending on a strategy.
e) They may maintain histories of other players’ actions.
f) As a result of client and server’s actions, the payoffs from a payoff matrix are
   added to their scores.
g) A round consists of one game by each player in the system as a client and a
   server.
h) A generation consists of r rounds.
i) After a generation, all history is cleared.
j) Players evolve from their current strategies to higher scoring strategies in
   proportion to the difference between the average scores of the two strategies,
   after a generation.
They assume three types of players at the start of the game: 100% Co-operators, 100% Defectors, and Reciprocatives, who use the decision function
$$P(\text{co-operation with } X) = \min\left(\frac{\text{co-operation } X \text{ gave}}{\text{co-operation } X \text{ received}},\ 1\right).$$
They give simulation results for different mix proportions of the initial population.
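The Reciprocative decision function can be sketched as follows. The function and parameter names are illustrative, and the handling of a peer with no history (nothing received yet, so the ratio is undefined) is an assumption made here; how to treat such strangers is exactly what the stranger policies address.

```python
import random


def cooperation_probability(gave, received, stranger_default=1.0):
    """P(cooperate with X) = min(gave / received, 1).

    gave:      cooperation X has given to others
    received:  cooperation X has received from others
    stranger_default is an assumed value for a peer with no history
    (received == 0); the formula itself is undefined in that case.
    """
    if received == 0:
        return stranger_default
    return min(gave / received, 1.0)


def decide(gave, received, rng=random):
    """Cooperate with probability cooperation_probability(gave, received)."""
    return rng.random() < cooperation_probability(gave, received)
```

A peer that has given half as much as it has received is served with probability 0.5, while a peer that has given at least as much as it has received is always served.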
They also compare the performance of different stranger policies:
a. 100% Defect.
b. 100% Co-operate.
c. Adaptive:
   $$P_c^{t+1} = (1 - \mu)\, P_c^{t} + \mu\, C_t$$
   where $C_t = 1$ if the last stranger co-operated and 0 otherwise, and
   $P_c^{t}$ is the probability of co-operating with a stranger at time t.
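The adaptive stranger policy is an exponentially weighted update, which can be sketched as a single function. The function name is illustrative, and the restriction of mu to the interval [0, 1] is an assumption made here so that the result stays a valid probability; the survey does not state the values used in [Lai03].

```python
def update_stranger_probability(p_c, mu, last_stranger_cooperated):
    """One step of P_c(t+1) = (1 - mu) * P_c(t) + mu * C_t."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu is assumed to lie in [0, 1]")
    c_t = 1.0 if last_stranger_cooperated else 0.0
    return (1.0 - mu) * p_c + mu * c_t


# Example run: two co-operating strangers followed by a defecting one.
p = 0.5
for cooperated in (True, True, False):
    p = update_stranger_probability(p, 0.1, cooperated)
```

A larger mu makes the policy react faster to the most recent stranger, while a smaller mu gives more weight to accumulated experience.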
In summary, their simulation results show the following:
1. Incentive techniques relying on private history of other players’ actions fail as
population size increases.
2. Shared history scales to large populations but requires supporting infrastructure,
and is subject to collusion. Collusion occurs when a group of players conspire to
report false history about their defector friends, claiming that they co-operated.
3. Incentive techniques that adapt to the behavior of strangers converge to
complete co-operation despite no centralized identity allocation.
Addressing the Non-cooperation Problem in P2P Systems
The authors in [Kamvar03] look at the problem of cooperation with a fresh
perspective. They make the following observations:
a) In a P2P system where users gain from answering a query, free riding
   is an unlikely problem. For example, in a pay-per-transaction
   file-sharing system where peers get paid for uploading files, peers will
   want to share files, because this generates income. In an auction
   system, where the auction advertisement is analogous to a query and
   bids analogous to query responses, peers will want to submit bids. Not
   only are peers eager to provide services (e.g. share files), but they are
   in competition with other peers to provide their services.
b) This competition creates another problem for networks like Gnutella,
   which depend on peers to forward queries to other peers, since a peer
   may drop the query to improve its chances of winning the auction
   (being selected to answer the query).
The authors propose the Right To Respond (RTR) protocol for tackling the
second issue. They propose to run this protocol on top of Gnutella. They assume the
existence of an efficient micropayment scheme for their proposal.
The RTR protocol works as follows:
• At the core of the protocol is the concept of a right to respond, or RTR. An RTR is
  simply a token signifying that a peer has a right to respond to a query message.
• A query is really a commodity. Peers should pay to receive the query, because
  that in turn brings in potential business. If a peer never receives any queries,
  then it can never provide its service to anyone. A real-life analogue is companies
  that buy lists of e-mail addresses or referrals from other companies, so that they
  have a new pool of potential customers.
• Once a peer buys an RTR for a given query, it may do one or both of the
  following: (a) respond to the query and hope that it is chosen to provide its
  services, (b) sell the RTR to other peers. (It can still respond to the query even
  if it resells the RTR.)
• Peers can buy and sell RTRs with their neighbors only.
In this framework, selling an RTR is equivalent to forwarding a query. Hence, there is
built-in incentive to forward queries, since peers get paid to do so. Of course, some
peers may still choose to not forward any queries in order to increase the probability that
they will be chosen to provide the service. However, their actions will be offset by those
peers who hedge their risk by selling a few RTRs, and by those peers who speculate in
RTRs (buying RTRs simply to resell them).
Basic Implementation of RTR
An RTR has the following format:
$$\mathrm{RTR} = \{Q,\ ts,\ query\}_{SK_Q}$$
where Q is the identity of the querying peer, ts is the timestamp at which the query was
first issued, and query is the actual query string. These three values are signed with the
querying peer's secret key $SK_Q$, so that RTRs cannot be forged. Hence, each query
requires a single signature generation, and one verification per forward. When a peer A
forwards a query to a neighbor B, it first sends an offer containing partial RTR
information and a price:
Offer = {rep(Q), ts, query, price}
where rep(Q) is the reputation of the querying peer. The authors do not specify how this
reputation is exactly determined.
The offer contains enough information for B to determine whether to purchase
the RTR, and whether the RTR is a duplicate B has seen before. However, because the
identity of Q is not revealed, B cannot actually answer the query without purchasing the
full RTR. If B decides not to purchase the RTR, he will simply drop the offer. Otherwise,
B will send a purchase request to A, and peer A will forward the full RTR to B.
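The offer and purchase exchange between A and B can be sketched with two small record types. Everything beyond the fields named above is an assumption made for illustration: the class and function names are hypothetical, the signature is carried as opaque bytes and never verified, duplicates are detected by the pair (ts, query), and B's buying rule (any new RTR at or below a maximum price) is invented, since [Kamvar03] leaves the purchasing strategy to the peer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RTR:
    querier: str       # Q, identity of the querying peer
    ts: float          # timestamp at which the query was first issued
    query: str         # the actual query string
    signature: bytes   # stand-in for the signature under SK_Q (not verified here)


@dataclass(frozen=True)
class Offer:
    reputation: float  # rep(Q); how it is computed is left open
    ts: float
    query: str
    price: float


def make_offer(rtr, reputation, price):
    """Build the offer A sends to B; it deliberately omits Q's identity."""
    return Offer(reputation, rtr.ts, rtr.query, price)


class Buyer:
    """Peer B: buys an RTR if it is new and affordable (assumed policy)."""

    def __init__(self, max_price):
        self.max_price = max_price
        self.seen = set()

    def wants(self, offer):
        key = (offer.ts, offer.query)
        if key in self.seen or offer.price > self.max_price:
            return False  # drop the offer
        self.seen.add(key)
        return True       # send a purchase request to A
```

Only after `wants` returns True would A forward the full RTR, including Q's identity, so B cannot answer the query from the offer alone.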
The RTR protocol also allows the use of filters to restrict the RTRs received by a
peer. These filters, in turn, can be used by the querying node to judge how desirable an
RTR is to its neighbor and to set an appropriate price. Peers also have the option of
disconnecting from neighbors who are either bad sellers (sell uninteresting RTRs) or bad
buyers (do not buy RTRs).
The authors propose to study the performance of the protocol using simulation in
future studies.
Conclusions
The peer-to-peer networking paradigm promises to revolutionize the way we
design, build and use the communication networks of tomorrow. The fundamental
premise of peer-to-peer systems is that individual peers voluntarily contribute resources
to the system. However, the inherent tension between universal cooperation for optimal
overall utility, individual incentive to defect, and rational behavior leads to suboptimal
utility in such systems. This problem has recently come into sharp focus with the
revealing study of Gnutella by [AdHu00]. The alternative is to provide incentives for the
users to cooperate in such systems.
Most of the schemes surveyed in this report use game-theoretic models to
analyze the problem. One proposal introduces the field of inverse game theory to design
P2P systems. Others propose the use of pricing mechanisms to act as incentives to
mitigate selfish behavior. However, the effectiveness of these schemes can only be
tested after a full-fledged implementation. Most of them rely on certain assumptions
about P2P systems that need to be verified by an actual implementation. The
growing popularity of P2P systems, evidenced by the fact that there is at present
more KaZaA traffic than Web traffic (!) [RossInfocom], demands urgent attention to
the issues threatening their survival.
References
[Gn00a] The Gnutella home page, http://www.gnutella.com/
[Na00] The Napster home page, http://www.napster.com/
[Fr00] The FreeNet home page, http://freenet.sourceforge.net/
[Kazaa] The KaZaA home page, http://www.kazaa.com/us/
[P2P] http://www-sop.inria.fr/mistral/personnel/Robin.Groenevelt/Publications/Peer-toPeer_Introduction_Feb.ppt
[Kamvar03] S. Kamvar, B. Yang, and H. Garcia-Molina, "Addressing the Non-Cooperation
Problem in Competitive P2P Systems," Workshop on Economics of Peer-to-Peer Systems, June 2003.
[Shn03] J. Shneidman and D. Parkes, "Rationality and Self-Interest in Peer-to-Peer
Networks," Proceedings of 2nd Int. Workshop on Peer-to-Peer Systems, February 2003.
[Lai03] K. Lai, M. Feldman, I. Stoica, and J. Chuang, "Incentives for Cooperation in
Peer-to-Peer Networks," Workshop on Economics of Peer-to-Peer Systems, June 2003.
[Hardin68] Hardin, G. The Tragedy of the Commons. Science 162 (1968), 1243–1248.
[EPD] Axelrod, R. The Evolution of Cooperation. Basic Books, 1984.
[Rang03] K. Ranganathan, M. Ripeanu, A. Sarin, and I. Foster, "To Share or Not to
Share: An Analysis of Incentives to Contribute in Collaborative File Sharing
Environments," Workshop on Economics of Peer-to-Peer Systems, June 2003.
[MPD] Schelling, T.C., Micromotives and Macrobehavior. 1978: W.W.Norton &
Company.
[Feldman03] M. Feldman, K. Lai, J. Chuang, and I. Stoica, "Quantifying Disincentives in
Peer-to-Peer Networks," 1st Workshop on Economics of Peer-to-Peer Systems, June
2003.
[AdHu00] Adar, E. and B.A. Huberman, Free Riding on Gnutella, 2000, First Monday.
http://www.firstmonday.dk/issues/issue5_10/adar/
[RossInfocom] http://cis.poly.edu/~ross/papers/P2PtutorialInfocom.pdf