Routing and Location in P2P Networks
Klaus Marius Hansen
University of Aarhus
2003/09/23
Routing and location in previously introduced systems. Routing in structured overlay
networks: Pastry and Chord.





Material
Routing and Location in Introduced P2P Systems
Pastry
Chord
Summary
Material


Previous systems introduced in the course
(Rowstron & Druschel 2001)
  o Rowstron, A. & Druschel, P. (2001), Pastry: Scalable, distributed object
    location and routing for large-scale peer-to-peer systems, in IFIP/ACM International
    Conference on Distributed Systems Platforms (Middleware 2001), pp. 329-350.
  o P2P routing and location in a structured overlay network taking into account
    network locality
(Stoica, Morris, Liben-Nowell, Karger, Kaashoek, Dabek & Balakrishnan 2003)
  o Stoica, I., Morris, R., Liben-Nowell, D., Karger, D. R., Kaashoek, M. F.,
    Dabek, F. & Balakrishnan, H. (2003), Chord: A scalable peer-to-peer lookup protocol for
    internet applications, IEEE/ACM Transactions on Networking 11(1), 17-32.
  o Algorithms for location in P2P networks. Provable correctness and
    performance.
Routing and Location in Introduced P2P Systems
Background


Routing: the process of moving a data packet to a location
Central Questions
  o How do we efficiently locate a peer?
  o When a peer is located, how do we efficiently route messages to and from that
    peer?
  o ... of course all in the context of no global network knowledge, and frequent
    joins and leaves by peers...
A Traceroute Example
Indexed

Using index servers for location, IP for routing
  o Napster
  o SETI@Home
  o ICQ
Walking/flooding

(Unstructured) walks based on neighbor sets
  o Gnutella
  o Kazaa (hybrid)
  o (JXTA)
Key-Proximity

Route based on (unstructured) narrowing down of the difference in keys
  o Freenet
  o Windows P2P
Weaknesses



Single point of failure for indexed
Potentially low performance for walking/flooding
Hard to prove correctness/performance/space requirements for routing protocols
Pastry
Overview

Effective, distributed object location and routing substrate for P2P networks
  o "Effective": O(log N) routing hops
  o "Distributed": no servers, routing and location distributed to nodes, only
    limited knowledge (routing tables of size O(log N)) at nodes
  o "Substrate": not an application itself; rather, it provides an Application
    Programming Interface (API) to be used by applications
Runs on all nodes joined in a Pastry network
Each node has a unique identifier (nodeId)
Given a key and a message, Pastry routes the message to the node with nodeId
numerically closest to the key
Takes into account network locality
Pastry API

Pastry exports
  o nodeId = pastryInit(Credentials, Application): makes the
    local node join/create a Pastry network. Credentials are used for authorization. An
    object used for callbacks is passed through the Application parameter
  o route(msg, key): routes a message to the live node D with nodeId
    numerically closest to the key (at the time of delivery)
Application interface to be implemented by applications using Pastry
  o deliver(msg, key): called on the application at the destination node for
    the given key
  o forward(msg, key, nextId): invoked on applications when the
    underlying node is about to forward the given message to the node with nodeId = nextId
(Actually using the open-source Java implementation, FreePastry 1.3, is slightly more involved)
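The division of labour between Pastry and the application can be illustrated with a minimal Python sketch. Everything in it is a toy stand-in written for illustration: the names Application, ToyPastryNode, Recorder, and the shared network table are assumptions and not FreePastry classes, and route simply picks the numerically closest nodeId from a global table instead of doing prefix routing.

```python
from typing import Dict, List, Tuple


class Application:
    """Callback interface an application implements (hypothetical Python rendering)."""

    def deliver(self, msg: str, key: int) -> None:
        raise NotImplementedError

    def forward(self, msg: str, key: int, next_id: int) -> None:
        pass  # default: let the message pass through unchanged


class ToyPastryNode:
    """Toy node: it knows every other node, so it 'routes' in a single step."""

    def __init__(self, node_id: int, app: Application,
                 network: Dict[int, "ToyPastryNode"]) -> None:
        self.node_id = node_id
        self.app = app
        self.network = network
        network[node_id] = self  # joining = registering in the shared table

    def route(self, msg: str, key: int) -> None:
        # Destination: the live node whose nodeId is numerically closest to the key.
        dest = min(self.network, key=lambda n: abs(n - key))
        if dest != self.node_id:
            self.app.forward(msg, key, dest)  # upcall before handing the message on
        self.network[dest].app.deliver(msg, key)


class Recorder(Application):
    """Example application that records what it is asked to deliver."""

    def __init__(self) -> None:
        self.delivered: List[Tuple[str, int]] = []

    def deliver(self, msg: str, key: int) -> None:
        self.delivered.append((msg, key))


network: Dict[int, ToyPastryNode] = {}
apps = {node_id: Recorder() for node_id in (10, 40, 90)}
nodes = {node_id: ToyPastryNode(node_id, app, network) for node_id, app in apps.items()}

nodes[10].route("hello", 47)  # nodeId 40 is numerically closest to key 47
```

The sender never names a destination node; it only names a key, and the message ends up at whichever node is closest to that key at the time of delivery.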
Assumptions and Guarantees

Each node is assigned a 128-bit nodeId
  o nodeIds are assumed to be uniformly distributed in the 128-bit id space =>
    numerically close nodeIds belong to diverse nodes
  o can be achieved, e.g., using a cryptographic hash of the IP address of a node
Pastry can route to the numerically closest node in ceiling(log_{2^b} N) steps (b is a
configuration parameter)
  o If fewer than |L|/2 (|L| is a configuration parameter) nodes with adjacent nodeIds fail
    concurrently, eventual delivery is guaranteed
Join, leave in O(log N)
Maintains locality based on an application-defined scalar proximity metric
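As a rough illustration of the two quantities above, the following Python sketch derives a 128-bit nodeId from an IP address and evaluates the hop bound ceiling(log_{2^b} N). The use of SHA-1 truncated to 16 bytes and the function names are assumptions made for the example, not something the slides prescribe.

```python
import hashlib


def node_id_from_ip(ip: str) -> int:
    """128-bit nodeId: first 16 bytes of a SHA-1 digest of the IP (assumed choice)."""
    digest = hashlib.sha1(ip.encode("ascii")).digest()
    return int.from_bytes(digest[:16], "big")


def max_routing_hops(n_nodes: int, b: int) -> int:
    """ceiling(log_{2^b} N), computed with integers to avoid rounding issues."""
    hops = 0
    reach = 1  # number of nodes distinguishable after `hops` digits of b bits each
    while reach < n_nodes:
        reach <<= b
        hops += 1
    return hops


example_id = node_id_from_ip("192.0.2.1")  # documentation-range address
hops_million = max_routing_hops(10**6, 4)  # b = 4, i.e. hexadecimal digits
```

With b = 4, each routing step resolves one hexadecimal digit of the key, which is why a larger b gives fewer hops at the price of larger routing tables.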
Example Applications

SCRIBE: group communication/event notification
  o Groups can be created and joined
      - Members of a group may multicast messages to all members of the group
        (delivered using best-effort)
      - Each group has a unique id, groupId (from a hash of the group name
        and the creator's name)
  o The node with nodeId numerically closest to groupId acts as rendezvous for
    the group
      - Group creation is handled by sending a CREATE message to the node
        with id groupId
      - Nodes wishing to join send a JOIN message to this node
  o To send a message, a node sends a MULTICAST message to the rendezvous
      - In principle, the rendezvous might then just send messages to all joined
        nodes. (SCRIBE actually builds a multicast tree rooted in the rendezvous for
        optimization)
PAST: Archival storage
  o Each file inserted gets a 160-bit fileId (from a hash of the file name, the owner's public
    key, and a random salt)
  o Pastry routes the file to the k nodes that are numerically closest to the first 128
    bits of the fileId
  o Lookup ensures that the file is found as long as at least one of the k nodes is alive
SQUIRREL
  o co-operative web caching
SplitStream
  o high-bandwidth content distribution
...
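The PAST placement rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-1 as the hash, "first 128 bits" read as the 128 most significant bits, plain numerical distance without wrap-around, and the helper names file_id and replica_nodes are all choices made for the example.

```python
import hashlib
from typing import List


def file_id(name: str, owner_public_key: bytes, salt: bytes) -> int:
    """160-bit fileId from a hash of file name, owner's public key, and salt."""
    h = hashlib.sha1()
    h.update(name.encode("utf-8"))
    h.update(owner_public_key)
    h.update(salt)
    return int.from_bytes(h.digest(), "big")


def replica_nodes(fid: int, node_ids: List[int], k: int) -> List[int]:
    """The k nodeIds numerically closest to the first 128 bits of the fileId."""
    key = fid >> 32  # keep the 128 most significant of the 160 bits
    return sorted(node_ids, key=lambda n: abs(n - key))[:k]


# Hand-made fileId whose first 128 bits equal 1000, to make the result easy to check.
toy_fid = (1000 << 32) | 5
replicas = replica_nodes(toy_fid, [900, 990, 1005, 2000], k=2)
```

Because nodeIds are uniformly distributed, the k chosen nodes are likely to be spread over the network, which is what makes the "one of k alive" guarantee useful.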
Routing Table
Routing Example
Pastry Routing Algorithm
Pastry Routing Algorithm - Analysis

Observation: Either 1), 2), or 3) must hold
  o For 3): the leaf set must contain nodes numerically closer to the key with the same
    shared prefix as us (otherwise, we are the closest node)
  o ... unless |L|/2 nodes in the leaf set have failed simultaneously
Termination
  o 1) Directly terminates at the chosen node
  o 2) The node routed to shares a longer prefix with the key
  o 3) The node routed to shares a prefix of the same length but is numerically closer
    to the key
(Expected) performance
  o 1) Destination is one hop away
  o 2) The set of possible nodes with a longer prefix match is reduced by a factor of 2^b
  o 3) Only one extra routing step is needed (with high probability)
      - Given accurate routing tables, the probability for 3) is the probability
        that a node with the given prefix does not exist and that the key is not covered by the
        leaf set
  o => expected performance is O(log N)
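The three cases can be made concrete with a small Python sketch of a single routing decision. It is a simplified reading of the algorithm, not the reference implementation: ids are short hexadecimal strings (b = 4), the id space is treated as a line rather than a circle, the neighborhood set is left out of case 3), and the routing table is modelled as a dict keyed by (row, digit). All names are made up for the example.

```python
from typing import Dict, List, Tuple


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two equal-length hex ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def dist(a: str, b: str) -> int:
    """Numerical distance between two hex ids (no wrap-around in this sketch)."""
    return abs(int(a, 16) - int(b, 16))


def next_hop(node_id: str, key: str, leaf_set: List[str],
             routing_table: Dict[Tuple[int, str], str]) -> str:
    """Return the id to forward to, or node_id if this node is the destination."""
    if key == node_id:
        return node_id
    # Case 1): key lies within the range covered by the leaf set.
    members = leaf_set + [node_id]
    values = [int(n, 16) for n in members]
    if min(values) <= int(key, 16) <= max(values):
        return min(members, key=lambda n: dist(n, key))
    # Case 2): routing table entry sharing one more digit with the key.
    l = shared_prefix_len(node_id, key)
    entry = routing_table.get((l, key[l]))
    if entry is not None:
        return entry
    # Case 3): any known node with an equally long prefix that is numerically closer.
    known = leaf_set + list(routing_table.values())
    closer = [n for n in known
              if shared_prefix_len(n, key) >= l and dist(n, key) < dist(node_id, key)]
    return min(closer, key=lambda n: dist(n, key)) if closer else node_id


leaf = ["6590", "65c0"]
hop_case1 = next_hop("65a1", "65bf", leaf, {})
hop_case2 = next_hop("65a1", "d46a", leaf, {(0, "d"): "d13d"})
hop_case3 = next_hop("65a1", "d46a", leaf, {(0, "c"): "c000"})
```

In every case the returned node is either the destination or strictly "better" than the current one (longer shared prefix, or same prefix length and numerically closer), which is the termination argument given above.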
Self-Organization - Node Arrival






New node, X, needs to know an existing, nearby node, A (can be achieved using, e.g.,
multicast on the local network)
X asks A to route a "join" message with key equal to X
  o Pastry routes this message to the node Z with nodeId numerically closest to X
  o All nodes en route to Z return their state to X
X updates its state based on the returned state
  o neighborhood set = neighborhood set of A (since A is close to X)
  o leaf set is based on the leaf set of Z (since Z has the id closest to the id of X)
  o Rows of the routing table are initialized based on rows of the routing tables of nodes
    visited en route to Z (since these share common prefixes with X)
X calibrates its routing table and neighborhood set based on data from the nodes
referenced therein
X sends its state to all the nodes mentioned in its leaf set, routing table, and
neighborhood set
O(log_{2^b} N) messages exchanged
Self-Organization - Node Departure


Assumption: A node that can no longer be communicated with has failed
Repair of leaf set
  o Contact the live node with the largest index on the side of the failed node and
    request its leaf set
  o The returned leaf set will contain an appropriate node to insert
  o Contacting works unless |L|/2 nodes with adjacent nodeIds have failed
Repair of routing table
  o Contact another node on the same row to check if this node has a replacement
    node (the contacted node may have a replacement node on the same row of its routing
    table)
  o If not, contact a node on the next row of the routing table
Repair of neighborhood set
  o Neighborhood set is normally not used in routing => contact its members periodically to
    check for liveness
  o If a neighbor is not responding, check with other neighbors for close nodes
Locality

Routing performance is based on a small number of routing hops and on "good" locality
of routing with respect to the underlying network
Scalar proximity metric (e.g., number of IP routing hops, geographical distance, or
available bandwidth)
  o Applications are responsible for providing proximity metrics
Join protocol maintains closeness invariant
Handling Malicious Nodes?

Choose randomly between nodes satisfying the criteria of the routing protocol...
Experimental Results - Routing Performance
Experimental Results - Routing Distance
Experimental Results - Quality of Routing Tables
Summary

Pastry is a P2P content location and routing substrate
  o Structured overlay network
  o Usable for building various P2P applications
Space and time requirements (expected) in O(log N), N = number of nodes in network
Takes locality into account
Chord
Overview

One operation
  o IP address = lookup(key): Given a key, find the node responsible for
    that key
Goals: Load balancing, decentralization, scalability, availability, flexible naming
Performance and space usage
  o Lookup in O(log N)
  o Each node needs information about O(log N) other nodes
Example Applications

Cooperative File System (CFS)
  o Building a distributed hash table on top of Chord (DHash)
  o Storing blocks using DHash, lookup using Chord
Distributed Indices
  o Derive keys from desired keywords
  o Let values be servers holding documents matching the desired keywords
Use of Consistent Hashing in Chord (1)

Keys are assigned to nodes with consistent hashing
  o Hash function balances load
  o Rebalancing (when a node joins or leaves) requires moving only an O(1/N) fraction of the keys
Nodes and keys are assigned m-bit identifiers
  o Using SHA-1 on nodes' IP addresses and on keys
  o m should be big enough to make collisions improbable
"Ring-based" assignment of keys to nodes
  o Identifiers are ordered on an identifier circle modulo 2^m
  o A key k is assigned to the first node n whose identifier is equal to or follows k,
    i.e., n = successor(k)
Chord improves on consistent hashing by only requiring knowledge about O(log N)
other nodes at each node
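The ring-based assignment can be sketched in Python as follows. The sketch assumes a tiny identifier space (m = 6) so that results are easy to check by hand, reduces a SHA-1 digest modulo 2^m to obtain identifiers, and uses an illustrative ten-node ring; the ring and the function names are assumptions for the example only.

```python
import hashlib
from bisect import bisect_left
from typing import List

M = 6  # identifier bits; deliberately tiny for illustration


def chord_id(name: str, m: int = M) -> int:
    """m-bit identifier derived from SHA-1 (reduction modulo 2^m is an assumed shortcut)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(k: int, node_ids: List[int]) -> int:
    """First node whose identifier is equal to or follows k on the identifier circle."""
    ring = sorted(node_ids)
    i = bisect_left(ring, k)
    return ring[i % len(ring)]  # past the largest id, wrap around to the smallest


RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # illustrative node identifiers
owner_of_10 = successor(10, RING)
owner_of_38 = successor(38, RING)
owner_of_60 = successor(60, RING)
```

A key that coincides with a node identifier is stored at that node, and keys larger than every node identifier wrap around to the node with the smallest identifier.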
Use of Consistent Hashing in Chord (2)
Use of Consistent Hashing in Chord (3)

Designed to let nodes enter and leave the network easily
  o Node n leaves: all of n's assigned keys are assigned to successor(n)
  o Node n joins: keys k <= n assigned to successor(n) become assigned to n
  o Compare "traditional hashing", e.g., h(x) = ax + b (mod p), in which p
    changes...
Example: node 26 joins => key 24 becomes assigned to node 26
(Each physical node runs a number of virtual nodes, each with its own identifier, to
balance load)
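The join example can be replayed with a short Python sketch. Only node 26 and key 24 come from the slide; the surrounding ring and the other keys are assumed for illustration, as are the helper names.

```python
from bisect import bisect_left
from typing import Dict, List


def successor(k: int, node_ids: List[int]) -> int:
    """First node whose identifier is equal to or follows k on the circle."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, k) % len(ring)]


def assign(keys: List[int], node_ids: List[int]) -> Dict[int, int]:
    """Map every key to the node responsible for it."""
    return {k: successor(k, node_ids) for k in keys}


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # assumed ring before the join
keys = [10, 24, 30, 38, 54]                     # assumed stored keys

before = assign(keys, nodes)
after = assign(keys, nodes + [26])              # node 26 joins

# Keys whose responsible node changed: {key: (old node, new node)}
moved = {k: (before[k], after[k]) for k in keys if before[k] != after[k]}
```

Only the key between the new node and its predecessor changes hands; every other key stays where it was, which is the property that "traditional hashing" with a changing modulus lacks.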
Simple Key Location


(a) Simple key location (following successor pointers only) can be implemented in time O(N) and space O(1)
(b) Example: Node 8 performs a lookup for key 54
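A minimal Python sketch of simple key location, where each node knows only its successor, is given below. Node 8 and key 54 are taken from the example; the rest of the ring is assumed for illustration, and the successor pointers are built from a global list purely for convenience.

```python
from typing import Dict, List, Tuple


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps around zero (or covers the whole circle)


def simple_lookup(start: int, key: int, node_ids: List[int]) -> Tuple[int, List[int]]:
    """Walk successor pointers until the key falls between a node and its successor."""
    ring = sorted(node_ids)
    succ: Dict[int, int] = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}
    path = [start]
    n = start
    while not in_half_open(key, n, succ[n]):
        n = succ[n]  # one hop per node passed: linear in the number of nodes
        path.append(n)
    return succ[n], path


RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # assumed ring, m = 6
owner, path = simple_lookup(8, 54, RING)
```

The query visits every node between the start and the key, which is why the finger tables of the following slides are needed to get down to O(log N) hops.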
Scalable Key Location (1)

Uses finger tables
  o n.finger[i] = successor(n + 2^(i-1)), 1 <= i <= m
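The finger definition can be evaluated directly. The Python sketch below computes the table for node 8 on an assumed ten-node ring with m = 6; the ring and the function names are illustrative, and a real node would of course fill its table by lookups rather than from a global list.

```python
from bisect import bisect_left
from typing import List


def successor(k: int, node_ids: List[int]) -> int:
    """First node whose identifier is equal to or follows k on the circle."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, k) % len(ring)]


def finger_table(n: int, node_ids: List[int], m: int) -> List[int]:
    """finger[i] = successor((n + 2^(i-1)) mod 2^m) for i = 1..m (index 0 holds finger 1)."""
    return [successor((n + 2 ** (i - 1)) % 2 ** m, node_ids) for i in range(1, m + 1)]


RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # assumed ring
fingers_of_8 = finger_table(8, RING, m=6)
# finger starts are 9, 10, 12, 16, 24, 40
```

The first finger is always the node's immediate successor, and the targets double in distance, so the table covers nearby nodes densely and distant nodes sparsely.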
Scalable Key Location (2)

If the successor is not found, search the finger table to find the node n' whose ID most
immediately precedes id
  o Rationale: of all nodes in the finger table, n' will know the most about the region of
    the identifier circle around id
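A sketch of the finger-based lookup in Python follows. It is a single-process simulation under stated assumptions: the ring is an illustrative ten-node example with m = 6, finger tables are recomputed from a global list instead of being stored per node, and the names are made up for the example.

```python
from bisect import bisect_left
from typing import List, Tuple

M = 6
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # assumed ring


def successor(k: int) -> int:
    """First node whose identifier is equal to or follows k on the circle."""
    return RING[bisect_left(RING, k) % len(RING)]


def fingers(n: int) -> List[int]:
    """Finger table of node n (index 0 is the immediate successor)."""
    return [successor((n + 2 ** (i - 1)) % 2 ** M) for i in range(1, M + 1)]


def in_open(x: int, a: int, b: int) -> bool:
    """x in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x: int, a: int, b: int) -> bool:
    """x in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def closest_preceding_node(n: int, key: int) -> int:
    """Finger of n that most immediately precedes key (scan from the farthest finger)."""
    for f in reversed(fingers(n)):
        if in_open(f, n, key):
            return f
    return n


def find_successor(start: int, key: int) -> Tuple[int, List[int]]:
    """Return (node responsible for key, nodes that handled the query)."""
    n, path = start, [start]
    while not in_half_open(key, n, fingers(n)[0]):
        n = closest_preceding_node(n, key)
        path.append(n)
    return fingers(n)[0], path


owner, path = find_successor(8, 54)
```

Compared with walking successor pointers one at a time, the same query for key 54 from node 8 now passes through three nodes, because each hop jumps to the finger that gets closest to the key without overshooting it.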
Scalable Key Location (3)

Performance is O(log N) with high probability
  o Each node can forward a query at least halfway along the remaining distance
    => at most m steps to find the node
  o After 2 log N steps, the distance is at most 2^m/2^(2 log N) = 2^m/N^2, and the
    probability that another node lies in such an interval is at most 1/N, i.e., negligible
Space required is O(log N) with high probability
  o As above: for i <= m - 2 log N, the i'th finger of the node will be the node's
    immediate successor with high probability
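The halving argument can be written out as a short derivation. It assumes logarithms to base 2 and node identifiers distributed uniformly at random on the circle; the symbol d_t for the remaining distance after t forwarding steps is introduced here for the example.

```latex
\begin{align*}
d_t &\le \frac{2^m}{2^t}
  && \text{(each step at least halves the remaining distance)} \\
d_{2\log_2 N} &\le \frac{2^m}{2^{2\log_2 N}} = \frac{2^m}{N^2} \\
\mathbb{E}\bigl[\text{nodes in an interval of length } 2^m/N^2\bigr]
  &= N \cdot \frac{2^m/N^2}{2^m} = \frac{1}{N}
\end{align*}
```

Since an interval of that length is expected to contain only 1/N nodes, the query has almost certainly reached the key's predecessor after 2 log N steps, which gives the O(log N) bound.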
Self-organization - Node failures

Chord maintains successor lists to cope with node failures
  o A node leaving could be viewed as a failure
  o If a node leaves voluntarily, it may notify its successor and predecessor
Experimental Results - Path Length
Summary

Decentralized lookup of nodes responsible for storing keys
  o Based on distributed, consistent hashing
Performance and space in O(log N) for stable networks
Simple; provable performance and correctness
Summary

"First generation" routing and location in P2P networks
  o Largely application-specific
  o Hard to analyse
"Second generation" routing and location in P2P networks
  o Based on structured network overlays
  o Typically expected O(log N) time and space requirements