A Measurement Study of
Peer-to-Peer File Sharing
Systems
Stefan Saroiu, P. Krishna
Gummadi, Steven D. Gribble
Presented by Zhengxiang Pan
March 18th, 2003
Introduction
• Napster & Gnutella
• Population of users
• Bottleneck bandwidth of hosts & latencies
• How long peers remain connected
• Number of files shared & downloaded
Methodology-architecture
• Napster’s architecture
– A cluster of central servers
– Each peer connects to one server
– Servers cooperate to process queries
• Gnutella’s architecture
– No centralized servers
– Peers form overlay network
– Send a query by a controlled flood
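The "controlled flood" can be pictured with a small simulation. This is only a sketch under stated assumptions: the overlay is a hypothetical adjacency dictionary, `flood_query` is an illustrative name, and "controlled" is modelled as a TTL that drops by one per hop, plus suppression of duplicate deliveries.

```python
from collections import deque

def flood_query(overlay, origin, ttl):
    """Return the peers that receive a query flooded from `origin`.

    `overlay` maps each peer to a list of its neighbours (hypothetical data).
    The TTL is decremented at every hop; a peer that has already seen the
    query does not forward it again.
    """
    reached = {origin}
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward further
        for neighbor in overlay[peer]:
            if neighbor not in reached:
                reached.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return reached - {origin}

# Toy overlay, invented for illustration only.
toy_overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
```

With this toy overlay, a query from "a" with TTL 1 stops at its direct neighbours, while a TTL of 3 reaches every peer.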
Methodology-crawler
• Napster crawler
– A large number of connections to a single server
– Issue popular queries in parallel
– Captured 40%-60% of the local users
• Gnutella crawler
– Iteratively send ping messages with large TTLs
– Discover new hosts from the pong messages received
– Captured 25%-50% of the total population
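The Gnutella crawl loop described above can be sketched as follows. This is not the authors' crawler: `crawl`, `make_simulated_ping`, and the chain overlay are hypothetical names and data. Here `ping(host)` stands in for sending a ping with a large TTL and collecting the hosts named in the returned pongs.

```python
from collections import deque

def make_simulated_ping(overlay, ttl):
    """Build a stand-in for a network ping: returns the hosts that would
    answer with a pong, i.e. every host within `ttl` hops (simulation only)."""
    def ping(host):
        if host not in overlay:
            return set()  # unreachable host: no pongs come back
        reached = {host}
        frontier = deque([(host, ttl)])
        while frontier:
            peer, remaining = frontier.popleft()
            if remaining == 0:
                continue
            for neighbor in overlay[peer]:
                if neighbor not in reached:
                    reached.add(neighbor)
                    frontier.append((neighbor, remaining - 1))
        return reached - {host}
    return ping

def crawl(ping, seeds, max_rounds=10):
    """Iteratively ping newly discovered hosts until nothing new appears
    or `max_rounds` is reached. Returns every host seen."""
    known = set(seeds)
    to_ping = list(seeds)
    for _ in range(max_rounds):
        if not to_ping:
            break
        discovered = set()
        for host in to_ping:
            discovered.update(ping(host))
        to_ping = sorted(discovered - known)  # only ping hosts not seen before
        known |= discovered
    return known

# Hypothetical five-host chain: a - b - c - d - e
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
```

Cutting the crawl short (a small `max_rounds`) returns only part of the overlay, which is one way to picture why a time-limited crawl captures a fraction of the population rather than all of it.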
Methodology-directly measured characteristics
• Latency
– Measure the round-trip time of a 40-byte TCP packet exchanged with the peer.
• Lifetime
– Offline: does not respond to TCP SYN packets
– Inactive: responds with TCP RST
– Active: accepts the connection
• Bottleneck bandwidth
– Used as an approximation of available bandwidth
– Actively measure upstream and downstream using a few TCP packets
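The three lifetime states map naturally onto the outcomes of a TCP connection attempt. The sketch below assumes an ordinary `connect()` is an acceptable stand-in for a raw SYN probe. `classify_host` and its `connector` parameter are illustrative names, not part of the study's tooling. Treating every non-refusal error (timeout, unreachable network) as "offline" is a simplifying assumption.

```python
import socket

def classify_host(host, port, timeout=5.0, connector=socket.create_connection):
    """Classify a peer as 'active', 'inactive', or 'offline'.

    active   - the TCP connection is accepted
    inactive - the host answers with RST (connection refused)
    offline  - no usable answer to the SYN (timeout or other OS-level error;
               lumping these together is an assumption of this sketch)
    `connector` is injectable so the logic can be exercised without a network.
    """
    try:
        conn = connector((host, port), timeout)
    except ConnectionRefusedError:
        return "inactive"
    except OSError:  # includes socket.timeout
        return "offline"
    conn.close()
    return "active"
```

Probing the same address periodically and recording the sequence of states would give a per-host uptime trace of the kind the availability results rely on.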
Results-bandwidth
Downstream & upstream bottleneck bandwidth
-50% of Napster peers & 60% of Gnutella peers use broadband connections
-25% of Napster peers & 8% of Gnutella peers use modems
-20% of Napster peers & 30% of Gnutella peers have high bandwidth (>3Mbps)
Result-reported bandwidth
22% of Napster peers report “unknown” bandwidth
Result- latency
Latencies for Gnutella users
-Unstructured, ad-hoc overlay; a substantial fraction of peers suffer from high latency
-Difference in trans-oceanic peers
Result- availability
-Only 20% of peers had an IP-level uptime of 93% or more
-Median session duration: 60 minutes
Result-files
-25% of Gnutella peers do not share any files
-40%-60% of peers share only 5%-20% of the shared files
Result-download & upload
the percentage of peers in each bandwidth class is roughly
the same as the percentage of files shared by that
bandwidth class.
Result- cooperate
-30% of the users that report their bandwidth as 64 Kbps or
less actually have a significantly greater bandwidth.
-10% of the users reporting high bandwidth (3Mbps or higher)
in reality have significantly lower bandwidth.
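A check of reported against measured bandwidth could look like the sketch below. The slide does not define "significantly", so the factor of 2 used here is purely an assumption, and `misreport` is a hypothetical helper name.

```python
def misreport(reported_kbps, measured_kbps, factor=2.0):
    """Compare a peer's self-reported bandwidth with a measured value.

    `factor` is an assumed threshold for a "significant" difference;
    the source gives no numeric definition.
    """
    if measured_kbps >= reported_kbps * factor:
        return "under-reported"  # peer has much more bandwidth than it claims
    if measured_kbps * factor <= reported_kbps:
        return "over-reported"   # peer claims much more than it has
    return "consistent"
```

For example, a peer reporting 64 Kbps but measured at 1500 Kbps would be flagged as under-reporting under this threshold.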
Result-resilience of Gnutella overlay
Although highly resilient in the face of random
breakdowns, Gnutella is nevertheless highly vulnerable in
the face of well-orchestrated, targeted attacks.
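The contrast between random breakdowns and targeted attacks can be illustrated on a toy graph. The hub-and-spoke overlay below is a deliberately exaggerated, hypothetical example, not Gnutella's measured topology. Function names are illustrative.

```python
def largest_component_size(adj, removed=frozenset()):
    """Size of the largest connected component after deleting `removed` nodes."""
    alive = set(adj) - set(removed)
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        # Depth-first traversal of one component
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

def highest_degree_nodes(adj, k):
    """The k best-connected nodes: the natural targets of an orchestrated attack."""
    return sorted(adj, key=lambda n: len(adj[n]), reverse=True)[:k]

# Hypothetical overlay: one hub connected to eight low-degree peers.
hub_overlay = {"hub": [f"p{i}" for i in range(8)]}
for i in range(8):
    hub_overlay[f"p{i}"] = ["hub"]
```

Losing an arbitrary leaf such as "p0" leaves the other eight nodes connected, whereas removing the single highest-degree node splits the overlay into isolated peers.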
Conclusion
• Heterogeneity of hosts
– Carefully delegate responsibilities
• Clear evidence of client-like and server-like behaviors
• Peers tend to misreport information if there
is an incentive to do so
– Built-in incentive for telling the truth
– Verify reported information