Peer-to-Peer Filesystems
Tom Roeder
CS414 2005sp
Nature of P2P Systems

We discussed this a little in 415 on Friday



In some sense, P2P is older than the name



P2P: communicating peers in the system
normally an overlay in the network
many protocols used symmetric interactions
not everything is client-server
What’s the real definition?


no-one has a good one, yet
depends on what you want to fit in the class
Nature of P2P Systems

Standard definition

symmetric interactions between peers
no distinguished server

Minimally: is the Web a P2P system?

We don’t want to say that it is
but it is, under this definition
I can always run a server if I want: no asymmetry

There must be more structure than this

Let’s try again
Nature of P2P Systems

Recent definition

No distinguished initial state
Each server has the same code
servers cooperate to handle requests
clients don’t matter: servers are the P2P system

Try again: is the Web P2P?

No, not under this def: servers don’t interact

Is the Google server farm P2P?

Depends on how it’s set up? Probably not.
Overlays


Recall: two types of overlays
Unstructured



No infrastructure set up for routing
Random walks, flood search
Structured




Small World Phenomenon: Kleinberg
Set up enough structure to get fast routing
We will see O(log n)
For special tasks, can get O(1)
Overlays: Unstructured

(figure from Gribble: a common unstructured overlay)

look at connectivity
more structure than it seems at first
Overlays: Unstructured

Gossip: state synchronization technique

Instead of forced flooding, share state
Do so infrequently with one neighbor at a time
Original insight from epidemic theory

Convergence of state is reasonably fast

with high probability for almost all nodes
good probabilistic guarantees

Trivial to implement (see the sketch below)

Saves bandwidth and energy consumption
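
To make the gossip round concrete, here is a minimal Python sketch of push-pull state exchange between random neighbors. The random overlay, the set-valued state, and the round count are illustrative assumptions.

import random

class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []       # overlay links to other Node objects
        self.state = set()        # the state being synchronized, e.g. known update ids

    def gossip_round(self):
        """Pick one random neighbor and exchange state (push-pull)."""
        if not self.neighbors:
            return
        peer = random.choice(self.neighbors)
        merged = self.state | peer.state
        self.state, peer.state = merged, set(merged)

# One update injected at a single node reaches almost all nodes within a few
# rounds, with high probability, without flooding the whole overlay.
nodes = [Node(i) for i in range(50)]
for n in nodes:
    n.neighbors = [m for m in random.sample(nodes, 5) if m is not n]
nodes[0].state.add("update-1")
for _ in range(10):
    for n in nodes:
        n.gossip_round()
print(sum("update-1" in n.state for n in nodes), "of", len(nodes), "nodes have the update")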
Overlays: Structured

Need to build up long distance pointers


think of routing within levels of a namespace
e.g. the namespace is 10-digit numbers in base 4



0112032101
then you can hop levels to find other nodes
This is the most common structure imposed
Distributed Hash Tables

One way to do this structured routing





Assign each node an id from the space
e.g. 128 bits: a salted SHA-1 hash of the IP address
build up a ring: circular hashing
assign nodes into this space
Value



diversity of neighbors
even coverage of space
less chance of attack?
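
As an illustration of circular hashing, here is a minimal Python sketch that hashes node addresses into a 128-bit ring and assigns each key to its clockwise successor. The salt string, the example IPs, and the successor rule are assumptions for the sketch, not Pastry's exact scheme.

import hashlib

RING_BITS = 128
RING_SIZE = 1 << RING_BITS

def node_id(ip: str, salt: str = "salt") -> int:
    """Salted SHA-1 hash of the IP address, truncated to the 128-bit id space."""
    digest = hashlib.sha1((salt + ip).encode()).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

def object_id(name: str) -> int:
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % RING_SIZE

def responsible_node(key: int, node_ids: list) -> int:
    """Clockwise successor on the ring: the first node id >= key, wrapping around."""
    for nid in sorted(node_ids):
        if nid >= key:
            return nid
    return min(node_ids)

ids = [node_id(f"10.0.0.{i}") for i in range(1, 6)]
print(hex(responsible_node(object_id("some-object"), ids)))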
Distributed Hash Tables

Why “hash tables”?

Store named objects by hash code
Route the object to the nearest location in the id space
key idea: nodes and objects share an id space

How do you find an object without its name?

Close names don’t help because of hashing

Cost of churn?

In most P2P apps, many joins and leaves

Cost of freeloaders?
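
A toy Python sketch of the shared id space: each object is stored at the node whose id is numerically closest to the hash of the object's name. The node names and the closeness rule are illustrative assumptions; note that lookups need the exact name, since similar names hash to unrelated ids.

import hashlib

def to_id(text: str) -> int:
    return int.from_bytes(hashlib.sha1(text.encode()).digest(), "big")

class TinyDHT:
    def __init__(self, node_names):
        self.nodes = {to_id(n): {} for n in node_names}   # node id -> local store

    def _closest(self, key_id: int) -> int:
        """The node whose id is numerically closest to the key's id."""
        return min(self.nodes, key=lambda nid: abs(nid - key_id))

    def put(self, name: str, value: bytes):
        self.nodes[self._closest(to_id(name))][name] = value

    def get(self, name: str):
        return self.nodes[self._closest(to_id(name))].get(name)

dht = TinyDHT([f"node-{i}" for i in range(8)])
dht.put("/home/alice/notes.txt", b"hello")
print(dht.get("/home/alice/notes.txt"))    # b'hello'
print(dht.get("/home/alice/notes.TXT"))    # None: you need the exact name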
Distributed Hash Tables

Dangers



Sybil attacks: one node becomes many
id attacks: can place your node wherever
Solutions hard to come by



crypto puzzles / money for IDs?
Certification of routing and storage?
Many routing frameworks in this spirit


Very popular in late 90s early 00s
Pastry, Tapestry, CAN, Chord, Kademlia
Applications of DHTs

Almost anything that involves routing





illegal file sharing: obvious application
backup/storage
filesystems
P2P DNS
Good properties



O(log N) hops to find an id
Non-fate-sharing id neighbors
Random distribution of objects to nodes
Pastry: Node state
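
The figure of Pastry's per-node state on this slide is not reproduced. As a stand-in, here is a minimal Python sketch of the three structures the following slides refer to (routing table, leaf set, neighborhood set); the base-4, 10-digit id format mirrors the earlier namespace example, and the exact sizes and layout are assumptions.

from dataclasses import dataclass, field

B = 2                # digits are base 2**B = 4, as in the earlier namespace example
ID_DIGITS = 10       # 10-digit ids, e.g. "0112032101"

@dataclass
class PastryNode:
    node_id: str
    # routing_table[row][col] holds a node that shares `row` digits of prefix
    # with node_id and whose next digit is `col`
    routing_table: list = field(
        default_factory=lambda: [[None] * (2 ** B) for _ in range(ID_DIGITS)])
    leaf_set: list = field(default_factory=list)          # numerically closest node ids
    neighborhood_set: list = field(default_factory=list)  # geographically nearby nodes

n = PastryNode("0112032101")
print(len(n.routing_table), "routing table rows of", len(n.routing_table[0]), "entries each")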
Pastry: Node Joins

Find another geographically nearby node







Hash IP address to get Pastry id
Try to route a join message to this id
get routing tables from each hop and dest
select neighborhood set from nearby node
get the leaf set from the destination
Give info back to nodes so they can add you
Assuming the Pastry ring is well set up, this procedure will give good parameters
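
As a rough illustration of where each piece of state comes from during a join, here is a hedged Python sketch. The Hop class, its fields, and the print-based notification are stand-ins for the nodes on the join route, not Pastry's actual messages.

from dataclasses import dataclass, field

@dataclass
class Hop:
    node_id: str
    routing_table: list                                    # one row per prefix length
    leaf_set: list = field(default_factory=list)
    neighborhood_set: list = field(default_factory=list)

    def add_to_tables(self, new_id: str):
        print(f"{self.node_id}: adding {new_id} to my tables")

def assemble_join_state(new_id: str, nearby: Hop, route: list) -> dict:
    """route is the list of nodes the join message visited; route[-1] is the
    destination, the node whose id is numerically closest to new_id."""
    return {
        # row i of the routing table comes from the i-th hop, which shares
        # (roughly) an i-digit prefix with the new id
        "routing_table_rows": {i: hop.routing_table[i] for i, hop in enumerate(route)},
        "neighborhood_set": list(nearby.neighborhood_set),  # from the nearby node
        "leaf_set": list(route[-1].leaf_set),               # from the destination
    }

route = [Hop(f"node{i}", routing_table=[[] for _ in range(10)]) for i in range(3)]
state = assemble_join_state("0112032101", route[0], route)
for hop in route:            # give info back so existing nodes can add the new node
    hop.add_to_tables("0112032101")
print(sorted(state))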
Pastry: Node Joins

Consider what happens from node 0



bootstraps itself
next node to come adds itself and adds this node
Neighborhood information will be bad for a while



need a good way to discover network proximity
This is a current research problem
On node leaves, do the reverse


If a node leaves suddenly, the failure must be detected
the detecting node then removes it from the tables
Pastry: Routing

The key idea: grow common prefix




given an object id, try to send to a node with at least one more digit in common
if not possible, send to a node that is closer numerically
if not possible, then you are the destination
Gives O(log N) hops


Each step gets closer to destination
Guaranteed to converge
Pastry: Routing
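
The routing example figure on this slide is not reproduced. In its place, a minimal Python sketch of the next-hop rule just described, using string ids in the earlier 10-digit base-4 namespace; the flat list of known nodes is an assumption standing in for the real routing table and leaf set.

def shared_prefix_len(a: str, b: str) -> int:
    n = 0
    while n < len(a) and a[n] == b[n]:
        n += 1
    return n

def next_hop(my_id: str, key: str, known: list):
    """Return the node to forward to, or None if this node is the destination."""
    here = shared_prefix_len(my_id, key)
    # 1. prefer a node that shares at least one more prefix digit with the key
    longer = [n for n in known if shared_prefix_len(n, key) > here]
    if longer:
        return max(longer, key=lambda n: shared_prefix_len(n, key))
    # 2. otherwise a node numerically closer to the key than we are
    closer = [n for n in known
              if abs(int(n, 4) - int(key, 4)) < abs(int(my_id, 4) - int(key, 4))]
    if closer:
        return min(closer, key=lambda n: abs(int(n, 4) - int(key, 4)))
    # 3. otherwise we are responsible for the key
    return None

print(next_hop("0112032101", "0112210000",
               ["0112212301", "3201001230", "0110000000"]))   # -> 0112212301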
PAST: Pastry Filesystem

Now a simple filesystem follows:

to get a file, hash its name and look up in Pastry
to store a file, store it in Pastry

Punt on metadata/discovery

Can implement directories as files (see the sketch below)
Then just need to know the name of the root

Shown to give reasonable utilization of storage space
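
To make "directories as files" concrete, here is a small Python sketch; the in-memory dictionary stands in for Pastry/PAST, and the JSON listing format and path names are illustrative assumptions.

import hashlib, json

store = {}                                 # hash(name) -> bytes, stands in for PAST

def file_id(name: str) -> str:
    return hashlib.sha1(name.encode()).hexdigest()

def put_file(name: str, data: bytes):
    store[file_id(name)] = data

def get_file(name: str) -> bytes:
    return store[file_id(name)]

# A directory is just a file whose contents list the names it contains;
# only the name of the root ("/") needs to be known in advance.
put_file("/papers/pastry.pdf", b"...pdf bytes...")
put_file("/papers", json.dumps(["/papers/pastry.pdf"]).encode())
put_file("/", json.dumps(["/papers"]).encode())

for entry in json.loads(get_file("/")):
    print(entry, "->", json.loads(get_file(entry)))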
PAST: File Replication

Since any one node might fail, replicate




Uses the neighbor set for k-way storage
Keeps the same file at each neighbor
Diversity of neighbors helps avoid fate-sharing
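
A minimal Python sketch of k-way replication, assuming k = 4 and that the replicas sit on the nodes whose ids are numerically closest to the file's id; both the value of k and the id helpers are assumptions for the sketch.

import hashlib

K = 4                                       # replication factor (assumed)

def to_id(text: str) -> int:
    return int.from_bytes(hashlib.sha1(text.encode()).digest(), "big")

def replica_nodes(file_name: str, node_ids: list) -> list:
    """The K node ids closest to the file's id each keep a copy of the file."""
    fid = to_id(file_name)
    return sorted(node_ids, key=lambda nid: abs(nid - fid))[:K]

nodes = [to_id(f"node-{i}") for i in range(16)]
print([hex(n)[:10] for n in replica_nodes("/papers/pastry.pdf", nodes)])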
Certification

Each node signs a certificate



Says that it stored the file
Client will retry storage if not enough certificates
OK guarantees
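
A hedged sketch of the retry loop implied above: keep asking storage nodes until enough of them return a certificate. The HMAC-based "signature", the quorum of three, and the node interface are illustrative assumptions, not PAST's actual certificate format.

import hashlib, hmac

REQUIRED_CERTS = 3                          # how many acknowledgements the client wants

def storage_certificate(node_key: bytes, file_name: str, data: bytes) -> bytes:
    """A node's statement that it stored this file (HMAC as a stand-in for a signature)."""
    digest = hashlib.sha1(data).digest()
    return hmac.new(node_key, file_name.encode() + digest, hashlib.sha1).digest()

class FakeNode:
    def __init__(self, name, key):
        self.name, self.key = name, key
    def store(self, file_name, data):       # returns a certificate (or None on failure)
        return storage_certificate(self.key, file_name, data)

def store_with_certificates(file_name, data, candidate_nodes, max_rounds=3):
    certs = {}
    for _ in range(max_rounds):
        for node in candidate_nodes:
            if node.name in certs:
                continue
            cert = node.store(file_name, data)
            if cert is not None:
                certs[node.name] = cert
        if len(certs) >= REQUIRED_CERTS:
            return certs                     # enough replicas acknowledged the store
    raise RuntimeError("could not collect enough storage certificates")

nodes = [FakeNode(f"n{i}", bytes([i]) * 16) for i in range(4)]
print(len(store_with_certificates("/papers/pastry.pdf", b"...", nodes)), "certificates collected")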
PAST: Tradeoffs

No explicit FS structure:

Could build any sort of system by storing files
Basically variable-sized block storage mechanism
This buys simplicity at the cost of optimization

Speed vs. storage

See Beehive for this tradeoff
Makes it an explicit formula; can be tuned

Ease of use vs. security

Hashes make file discovery non-transparent
Rationale and Validation

Backing up on other systems



no fate sharing
automatic backup by storing the file
But





Cost much higher than regular filesystem
Incentives: why should I store your files?
How is this better than tape backup?
How is this affected by churn/freeloaders?
Will anyone ever use it?
PAST: comparison to CFS

CFS: a filesystem built on Chord/DHash


Pastry is MSR, Chord/DHash is MIT
Very similar routing and storage
PAST: comparison to CFS

PAST stores files, CFS blocks


Thus CFS can use storage space at a finer granularity
but fetching a file can take much longer (one lookup per block)


CFS claims: ftp-like speed




get each block: must go through routing for each
Could imagine much faster: get blocks in parallel (see the sketch below)
thus routing is slowing them down
Remember: hops here are overlay hops, not Internet hops
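
To illustrate the parallel-fetch point, here is a small Python sketch that issues block lookups concurrently instead of one routed lookup at a time; the lookup_block function, the simulated latencies, and the block naming are assumptions standing in for real overlay lookups.

import concurrent.futures, random, time

def lookup_block(block_name: str) -> bytes:
    time.sleep(random.uniform(0.05, 0.15))   # stand-in for O(log N) overlay hops
    return f"<data for {block_name}>".encode()

blocks = [f"file.dat/block{i}" for i in range(32)]

start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    data = list(pool.map(lookup_block, blocks))
print(f"fetched {len(data)} blocks in {time.time() - start:.2f} s")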
Load balancing in CFS

storing blocks rather than whole files gives predictable storage requirements per file per node
References



A. Rowstron and P. Druschel, "Pastry: Scalable, distributed
object location and routing for large-scale peer-to-peer systems".
IFIP/ACM International Conference on Distributed Systems
Platforms (Middleware), Heidelberg, Germany, pages 329-350,
November, 2001.
A. Rowstron and P. Druschel, "Storage management and caching
in PAST, a large-scale, persistent peer-to-peer storage utility",
ACM Symposium on Operating Systems Principles (SOSP'01),
Banff, Canada, October 2001.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and
Hari Balakrishnan, Chord: A Scalable Peer-to-peer Lookup
Service for Internet Applications, ACM SIGCOMM 2001, San
Diego, CA, August 2001, pp. 149-160.
References



Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris,
and Ion Stoica, Wide-area cooperative storage with CFS, ACM
SOSP 2001, Banff, October 2001.
Stefan Saroiu, P. Krishna Gummadi, and Steven D. Gribble. A
Measurement Study of Peer-to-Peer File Sharing Systems,
Proceedings of Multimedia Computing and Networking 2002
(MMCN'02), San Jose, CA, January 2002.
J. Kleinberg, "The Small-World Phenomenon: An Algorithmic
Perspective", Proceedings of the 32nd ACM Symposium on Theory of
Computing (STOC), 2000.
C. G. Plaxton, R. Rajaraman, and A. W. Richa. Accessing nearby
copies of replicated objects in a distributed environment. In
Proceedings of the 9th Annual ACM Symposium on Parallel
Algorithms and Architectures, Newport, Rhode Island, pages
311-320, June 1997.
Conclusions

Tradeoffs are critical

Why are you using it?
What sort of security/anonymity guarantees?

DHT applications

Think of a good one and become famous

PAST

caches whole files
Saves some routing overhead
Harder to implement a true filesystem