Chord: A Scalable Peer-to-peer Lookup Protocol
for Internet Applications
Xiaozhou Li
COS 461: Computer Networks (precept 04/06/12)
Princeton University
Background
• We studied P2P file sharing in class
– Napster
– Gnutella
– KaZaA
– BitTorrent
• Today, let’s learn more!
– Chord: a scalable P2P lookup protocol
– CFS: a distributed file system built on top of Chord
– http://pdos.csail.mit.edu/chord
Review: distributed hash table
[Diagram: a distributed application calls put(key, data) and get(key) on a distributed hash table that spans many nodes; the DHT returns the data.]
A DHT provides the lookup service for P2P applications:
• Nodes uniformly distributed across key space
• Nodes form an overlay network
• Nodes maintain list of neighbors in routing table
• Decoupled from physical network topology
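Purely as an illustration of this put/get interface, here is a toy sketch in Python. The ToyDHT class, its global view of the ring, and the sample node addresses are all invented for this example; a real DHT routes between nodes instead of keeping a global list.

```python
import hashlib
from bisect import bisect_left

def key_id(value: str) -> int:
    """Hash a string into the 160-bit SHA-1 identifier space."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest(), "big")

class ToyDHT:
    """Toy DHT with a global view of the ring; real DHTs route instead."""
    def __init__(self, node_ids):
        self.ring = sorted(node_ids)              # node IDs around the ring
        self.store = {n: {} for n in self.ring}   # each node's local data

    def successor(self, ident: int) -> int:
        """First node clockwise from ident, wrapping around the ring."""
        i = bisect_left(self.ring, ident)
        return self.ring[i % len(self.ring)]

    def put(self, key: str, data: bytes):
        self.store[self.successor(key_id(key))][key] = data

    def get(self, key: str):
        return self.store[self.successor(key_id(key))].get(key)

dht = ToyDHT(key_id(f"10.0.0.{i}") for i in range(8))   # 8 made-up nodes
dht.put("beat it", b"MP3 data...")
print(dht.get("beat it"))                               # -> b'MP3 data...'
```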
The lookup problem
[Diagram: a publisher stores key = "beat it", value = MP3 data... at one of several nodes (N1-N6); a client elsewhere on the Internet issues Lookup("beat it"). Which node holds the key?]
Centralized lookup (Napster)
[Diagram: the publisher at N4 registers SetLoc("beat it", N4) with a central database DB; the client sends Lookup("beat it") to the DB, then fetches key = "beat it", value = MP3 data... directly from N4.]
Simple, but O(N) state and a single point of failure
Flooded queries (Gnutella)
[Diagram: the publisher at N4 holds key = "beat it", value = MP3 data...; the client floods Lookup("beat it") to its neighbors, which forward it hop by hop until it reaches N4.]
Robust, but worst case O(N) messages per lookup
Routed queries (Chord)
[Diagram: the publisher stores key = "beat it", value = MP3 data... at a node; the client's Lookup("beat it") is forwarded along a routed path through a few intermediate nodes to the node responsible for the key.]
Routing challenges
• Define a useful key nearness metric
• Keep the hop count small
• Keep the tables small
• Stay robust despite rapid change
• Chord: emphasizes efficiency and simplicity
Chord properties
• Efficient: O(log(N)) messages per lookup
– N is the total number of servers
• Scalable: O(log(N)) state per node
• Robust: survives massive failures
• Proofs are in paper / tech report
– Assuming no malicious participants
Chord overview
• Provides peer-to-peer hash lookup:
– Lookup(key) → returns the IP address of the node responsible for the key
– Chord does not store the data
• How does Chord route lookups?
• How does Chord maintain routing tables?
Chord IDs
• Key identifier = SHA-1(key)
• Node identifier = SHA-1(IP address)
• Both are uniformly distributed
• Both exist in the same ID space
• How to map key IDs to node IDs?
– The heart of the Chord protocol is consistent hashing
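A minimal sketch of this ID mapping in Python, assuming a small m = 7 ring (the value used in the finger-table example later); Chord itself uses the full 160-bit SHA-1 space.

```python
import hashlib

def chord_id(value: str, m: int = 7) -> int:
    """Map a key or an IP address into the same m-bit identifier ring."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

print(chord_id("beat it"))       # a key ID somewhere in [0, 128)
print(chord_id("128.112.7.56"))  # a node ID derived from its IP address
```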
Review: consistent hashing
for data partitioning and replication
[Diagram: nodes A-F placed on a unit ring (0 at the top, 1/2 at the bottom); hash(key1) and hash(key2) fall onto the ring and are stored at the next node clockwise; with replication factor N = 3, the following nodes hold copies.]
A key is stored at its successor: the node with the next-higher ID
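To make the successor rule concrete, here is a small Python sketch; the node IDs come from the example ring on the next slide, and the replication factor n = 3 mirrors the N = 3 in the diagram.

```python
from bisect import bisect_left

def successor_index(ring, ident):
    """Index of the first node with ID >= ident, wrapping past the top."""
    return bisect_left(ring, ident) % len(ring)

def replica_nodes(ring, ident, n=3):
    """The successor plus the next n - 1 nodes hold copies of the key."""
    i = successor_index(ring, ident)
    return [ring[(i + k) % len(ring)] for k in range(n)]

ring = sorted([4, 8, 15, 20, 32, 35, 44, 58])   # example node IDs
print(replica_nodes(ring, 37))                  # -> [44, 58, 4]
```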
Identifier to node mapping example
• Node 8 maps [5, 8]
• Node 15 maps [9, 15]
• Node 20 maps [16, 20]
• …
• Node 4 maps [59, 4]
• Each node maintains a pointer to its successor
[Diagram: ring with nodes 4, 8, 15, 20, 32, 35, 44, 58.]
Lookup
• Each node maintains a pointer to its successor
• Route a packet (ID, data) to the node responsible for ID using successor pointers (see the sketch below)
[Diagram: lookup(37) starts at node 4 and follows successor pointers around the ring until it reaches node 44, which is responsible for ID 37.]
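The successor-pointer walk can be written in a few lines of Python; the Node class and between helper are invented for this sketch, and the ring uses the node IDs from the figure.

```python
class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.successor = None      # set once the ring is built
        self.predecessor = None    # used by join/stabilize on later slides

def between(x, a, b):
    """True if x lies in the half-open ring interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start: Node, ident: int) -> Node:
    """Follow successor pointers until reaching the node that owns ident."""
    n = start
    while not between(ident, n.node_id, n.successor.node_id):
        n = n.successor
    return n.successor

# Build the ring 4 -> 8 -> 15 -> 20 -> 32 -> 35 -> 44 -> 58 -> 4
ids = [4, 8, 15, 20, 32, 35, 44, 58]
nodes = {i: Node(i) for i in ids}
for a, b in zip(ids, ids[1:] + ids[:1]):
    nodes[a].successor = nodes[b]

print(lookup(nodes[4], 37).node_id)   # -> 44, matching the figure
```

Each hop moves only one node forward, so this takes O(N) hops in the worst case; the finger tables later reduce it to O(log N).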
Join Operation
• Node with id = 50 joins the ring via node 15
• Node 50: sends join(50) to node 15
• Node 44: returns node 58 (the join request is routed to node 44, which precedes 50 on the ring)
• Node 50: updates its successor to 58
[Diagram: join(50) travels from node 50 via node 15 to node 44; node 50 goes from succ = nil, pred = nil to succ = 58, pred = nil, while node 58 still has pred = 44 and node 44 still has succ = 58.]
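Continuing the same toy Python sketch (the Node class, lookup, and nodes ring from the lookup slide), a join amounts to one lookup plus setting the successor pointer; fixing predecessors is deferred to stabilization.

```python
def join(new: Node, bootstrap: Node):
    """Minimal join sketch: ask any existing node for our successor.
    The predecessor stays nil; stabilize repairs the rest of the ring."""
    new.predecessor = None
    new.successor = lookup(bootstrap, new.node_id)

n50 = Node(50)
join(n50, nodes[15])               # join via node 15
print(n50.successor.node_id)       # -> 58, as on the slide
```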
Periodic Stabilize
• Node 50: runs periodic stabilize, asking its successor 58 for its predecessor (succ.pred = 44)
• 44 does not lie between 50 and 58, so node 50 keeps succ = 58
• Node 50: sends notify(node = 50) to node 58
• Node 58: updates its predecessor from 44 to 50
[Diagram: stabilize(node=50) and notify(node=50) messages flow from node 50 to node 58.]
Periodic Stabilize
• Node 44: runs periodic stabilize, asking 58 for its predecessor (now 50)
• 50 lies between 44 and 58, so node 44 updates its successor from 58 to 50
[Diagram: stabilize(node=44) message from node 44 to node 58.]
Periodic Stabilize
• Node 44 has a new successor (50)
• Node 44 sends a notify(node = 44) message to node 50
• Node 50: updates its predecessor from nil to 44
[Diagram: notify(node=44) message from node 44 to node 50.]
Periodic Stabilize Converges!
• This completes the joining operation!
[Diagram: final ring state: node 50 has succ = 58 and pred = 44; node 44 has succ = 50; node 58 has pred = 50.]
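The stabilize/notify exchange of the last four slides can be replayed with the same toy Python sketch; strictly_between, notify, and stabilize are invented names here, and the logic follows the paper's pseudocode only loosely.

```python
def strictly_between(x, a, b):
    """True if x lies in the open ring interval (a, b)."""
    return (a < x < b) if a < b else (x > a or x < b)

def notify(successor: Node, n: Node):
    """successor adopts n as predecessor if n is closer than the old one."""
    p = successor.predecessor
    if p is None or strictly_between(n.node_id, p.node_id, successor.node_id):
        successor.predecessor = n

def stabilize(n: Node):
    """Adopt our successor's predecessor if it sits between us and the
    successor, then notify the successor about ourselves."""
    x = n.successor.predecessor
    if x is not None and strictly_between(x.node_id, n.node_id,
                                          n.successor.node_id):
        n.successor = x
    notify(n.successor, n)

nodes[58].predecessor = nodes[44]   # ring state just after node 50 joined
stabilize(n50)                      # slide 16: node 58 sets pred = 50
stabilize(nodes[44])                # slides 17-18: 44 sets succ = 50,
                                    #               50 sets pred = 44
print(n50.predecessor.node_id, nodes[44].successor.node_id)   # -> 44 50
```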
Achieving Efficiency: finger tables
Say m = 7, so IDs live in [0, 2^7) = [0, 128).

Finger table at node 80:

i     : 0   1   2   3   4   5   6
ft[i] : 96  96  96  96  96  112 20

The i-th entry at the node with id n is the first node with id >= (n + 2^i) mod 2^m. At node 80: the targets 80 + 2^0 through 80 + 2^4 all fall before node 96, so those entries point to 96; 80 + 2^5 = 112 points to node 112; and (80 + 2^6) mod 2^7 = 16 wraps around to node 20.

[Diagram: ring with nodes 20, 32, 45, 80, 96, 112 and the finger targets 80 + 2^0, ..., 80 + 2^6 marked.]

• Each node only stores O(log N) entries
• Each lookup takes at most O(log N) hops (see the sketch below)
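The table above can be reproduced with a short self-contained Python sketch; the ring node IDs (20, 32, 45, 80, 96, 112) and m = 7 come from the slide, while the function names are made up.

```python
from bisect import bisect_left

M = 7                                        # IDs live in [0, 2^7) = [0, 128)
ring = sorted([20, 32, 45, 80, 96, 112])     # node IDs from the slide

def successor(ident: int) -> int:
    """First node with id >= ident, wrapping around the ring."""
    return ring[bisect_left(ring, ident % (2 ** M)) % len(ring)]

def finger_table(n: int) -> list[int]:
    """ft[i] = successor((n + 2^i) mod 2^M), for i = 0 .. M-1."""
    return [successor((n + 2 ** i) % (2 ** M)) for i in range(M)]

print(finger_table(80))   # -> [96, 96, 96, 96, 96, 112, 20]
```

A lookup then forwards to the largest finger that precedes the target, roughly halving the remaining ring distance on each hop, which is where the O(log N) hop bound comes from.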
Achieving Robustness
• What if nodes FAIL?
• Ring robustness: each node maintains a list of its k (> 1) immediate successors instead of only one
– If the first successor does not respond, substitute the second entry in the successor list
– It is unlikely that all k successors fail simultaneously
• Requires modifications to the stabilize protocol (see the paper!)
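A sketch of the successor-list fallback, with invented names and k = 3 chosen arbitrarily for the example:

```python
class RingNode:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.failed = False
        self.successor_list = []        # k immediate successors, nearest first

def alive_successor(n: RingNode) -> RingNode:
    """Skip failed entries; all k failing at once is assumed unlikely."""
    for s in n.successor_list:
        if not s.failed:
            return s
    raise RuntimeError("all k successors failed")

a, b, c, d = (RingNode(i) for i in (8, 15, 20, 32))
a.successor_list = [b, c, d]            # k = 3
b.failed = True                         # first successor dies
print(alive_successor(a).node_id)       # -> 20: the second entry substitutes
```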
Example of Chord-based storage system
[Diagram: a client and two servers each run the same three-layer stack: Storage App on top of Distributed Hashtable on top of Chord.]
Cooperative File System (CFS)
[Diagram: CFS layering: file-system features (block storage, availability/replication, authentication, caching, consistency, server selection, keyword search) sit above the DHash distributed block store, which uses Chord for lookup.]
• Powerful lookup simplifies other mechanisms
Cooperative File System (cont.)
• Block storage
– Split each file into blocks and distribute those blocks
over many servers
– Balance the load of serving popular files
• Data replication
– Replicate each block on k servers
– Increase availability
– Reduce latency (fetch from the server with the lowest latency)
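As a rough illustration of block storage (not CFS's actual on-disk format), a file can be split into fixed-size blocks that are then placed by hashing each block; the 8 KB block size here is an assumption for the example.

```python
import hashlib, os

BLOCK_SIZE = 8192   # assumed block size; CFS's real parameters differ

def file_to_blocks(data: bytes) -> dict[str, bytes]:
    """Split a file into blocks keyed by a hash of their content, so the
    blocks (and the load of serving a popular file) spread across servers."""
    return {
        hashlib.sha1(data[off:off + BLOCK_SIZE]).hexdigest():
            data[off:off + BLOCK_SIZE]
        for off in range(0, len(data), BLOCK_SIZE)
    }

blocks = file_to_blocks(os.urandom(50_000))   # a 50 kB file of random bytes
print(len(blocks))                            # -> 7 blocks
```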
Cooperative File System (cont.)
• Caching
– Caches blocks along the lookup path
– Avoid overloading servers that hold popular data
• Load balance
– Different servers may have different capacities
– A physical server may act as multiple virtual servers by being hashed to several different IDs
Q&A