Peer-to-Peer Service

Peer-to-Peer Systems
ECE7650
Outline
 What and Why P2P?
 Examples of P2P Applications
 File Sharing
 Voice over IP
 Streaming
 Overview of P2P Architecture
What is P2P?
 “P2P is a class of applications that take
advantage of resources – storage, cycles,
content, human presence – available at the
edges of the Internet. Because accessing
these decentralized resources means
operating in an environment of unstable and
unpredictable IP addresses, P2P nodes must
operate outside the DNS system and have
significant, or total autonomy from central
servers”
Clay Shirky (www.shirky.com)
Defining Characteristics
1. Significant autonomy from central servers
2. Exploits resources at the edges of the Internet
    storage and content
    CPU cycles
    human presence
3. Resources at the edge have intermittent connectivity, being added & removed
Examples of P2P Usages
 File Sharing and Content Delivery
   BitTorrent, eDonkey, Gnutella, etc.
 P2P Communication and application-level multicast
   Voice over IP and Instant Messaging: Skype
   Video Streaming: PPLive
 Distributed Processing
   SETI@Home, PlanetLab, etc.
 Distributed databases
 Collaboration and distributed games
 Ad hoc networks
Current Status (as of 10/2010)
 Cisco Visual Networking Index projections:
   P2P traffic, mainly file sharing, is expected to double by 2014, growing to 7+ petabytes per month
   Its share drops from 75% of all bandwidth on the Internet 5 years ago to 39% at the end of 2009
   Roughly 46% of all traffic in 2014 will be attributable to Internet video
Client/Server Architecture
 Well-known, powerful, reliable server is a data source
 Clients request data from server
 Very successful model
   WWW (HTTP), FTP, Web services, etc.
[Figure: a server reachable over the Internet serving many clients]
* Figure from http://project-iris.net/talks/dht-toronto-03.ppt
Client/Server Limitations
 Scalability is hard to achieve
 Presents a single point of failure
 Requires administration
 Unused resources at the network edge
 P2P systems try to address these
limitations
P2P Overlay Network
 All nodes are both clients and servers
   Provide and consume data
   Any node can initiate a connection
 No centralized data source
 Overlay graph
   Virtual edge
     TCP connection, or simply a pointer to an IP address
   Overlay maintenance
     Periodically ping to make sure neighbor is still alive
     Or verify liveness while messaging
     If neighbor goes down, may want to establish new edge
     New node needs to bootstrap
Overlay networks
[Figure: an overlay network built as a logical layer on top of the IP network]
Overlays: all in the application layer
 Tremendous design flexibility
   Topology, maintenance
   Message types
   Protocol
   Messaging over TCP or UDP
 Underlying physical net is transparent to developer
   But some overlays exploit proximity
Main Components of a P2P app
 A web portal of the application, aka login server
   Checks the availability of services as well as peers
 Directory server, furnishing peer availability info, aka tracker
 Peers, which join and leave at will, obey behavior rules and implement the business logic of the app
   Discovery: find out if a particular service is available, its data (meta-data about the service), and the peers holding the actual data
   Location: acquire location info about the tracker of a service, and info about peers having the data; report its own location and data already possessed
   Data transfer: push vs pull approaches for data exchange; structured vs unstructured interconnect network of peers
P2P Goals and Benefits
 Service availability: efficient use of resources
   Unused bandwidth, storage, and processing power at the “edge of the network”
 Performance and Scalability
   No central information, communication, or computation bottleneck
   A certain rate of population growth can be supported while maintaining a stable level of performance
 Reliability
   Replicas, geographic distribution
   No single point of failure
 Robustness to change (or dynamism)
   Probability that the P2P system can continue to provide a certain level of performance subject to a particular model of peer dynamics
 Ease of administration
   Nodes self-organize
   Built-in fault tolerance, replication, and load balancing
 Increased autonomy
 Trust: Security, Privacy (Anonymity), Incentives
Outline
 What and Why P2P?
 Examples of P2P Applications
 File Sharing
 Voice over IP
 Streaming
 Overview of P2P Architecture
P2P file sharing
Example
 Alice runs P2P client
application on her
notebook computer
 Intermittently
connects to Internet;
gets new IP address
for each connection
 Asks for “Hey Jude”
 Application displays
other peers that have
copy of Hey Jude.
 Alice chooses one of
the peers, Bob.
 File is copied from
Bob’s PC to Alice’s
notebook: HTTP
 While Alice downloads,
other users uploading
from Alice.
 Alice’s peer is both a
Web client and a
transient Web server.
All peers are servers =
highly scalable!
Key Issues in P2P File Sharing
 Search: the file-sharing system has to support a convenient and accurate file-search user interface.
 Peer Selection: The file-sharing system has to
support an efficient peer selection mechanism so
as to minimize the download time.
 Connection. Peers should be able to set up more or
less stable data transfer connections so that file
data packets can be exchanged efficiently.
 Performance. The key performance metrics are
download time and availability.
How Did it Start?
 A killer application: Napster
 Free music over the Internet
 Key idea: share the storage and
bandwidth of individual (home) users
Main Challenges
 Find where a particular file is stored
 Note: problem similar to finding a particular
page in web caching
 Nodes join and leave dynamically
[Figure: peers A–F; a peer floods the query “E?” to find which peer stores file E]
P2P file sharing Architectures
 Centralized Directory:
 Central Directory keeps track of peer IPs and their
shared content
 Example: Napster and Instant Messaging
 Distributed Query Flooding:
Peers keep their own shared directories; content is located by querying nearby peers.
 Example: Gnutella protocol
 Distributed Heterogeneous Peers
 proprietary protocol, group leaders with high bandwidth
act as central directories searched by connected peers
 Example: KaZaA
P2P: centralized directory
Original “Napster” design
1) when a peer connects, it informs the central server of its:
   IP address
   content
2) Alice queries for “Hey Jude”
3) Alice requests the file from Bob
[Figure: Bob and other peers register with the centralized directory server (step 1); Alice queries the server (step 2) and requests the file directly from Bob (step 3)]
Napster: Example
[Figure: peers A–F on machines m1–m6 register their content with the central directory; the query “E?” is answered with the machine holding E (m5), and the file is then fetched directly from that peer]
Napster: History
 history:
 5/99: Shawn Fanning (freshman, Northeastern U.) founds Napster Online music service
 12/99: first lawsuit
 3/00: 25% of UWisc traffic is Napster
 2000: est. 60M users
 2/01: US Circuit Court of Appeals: Napster knew users were violating copyright laws
 7/01: # simultaneous online users: Napster 160K, Gnutella 40K
 Now: trying to come back
http://www.napster.com
P2P: problems with centralized directory
 Single point of failure: if
the central directory
crashes the whole
application goes down.
 Performance bottleneck:
the central server
maintains a large
database
 Copyright infringement
 File transfer is decentralized, but locating content is highly centralized
Query flooding: Gnutella
 fully distributed
 no central server
 public domain protocol
 many Gnutella clients
implementing protocol
 Peers discover other
peers through Gnutella
hosts that maintain
and cache list of
available peers.
Discovery is not part
of the Gnutella
protocol.
overlay network: graph
 edge between peer X
and Y if there’s a TCP
connection
 all active peers and edges form the overlay network
 Edge is not a physical
link but logical link
 Given peer will
typically be connected
with < 10 overlay
neighbors
Gnutella: Example
 Assume: m1’s neighbors are m2 and m3;
m3’s neighbors are m4 and m5;…
[Figure: the query “E?” is flooded from m1 to its neighbors m2 and m3, then on to m4 and m5, and so on, until the peer holding E is reached]
Gnutella: Peer joining or leaving
1. Joining peer X must find some other peer in the Gnutella network: use a list of candidate peers
2. X sequentially attempts to make TCP connections with peers on the list until a connection is set up with some peer Y
3. X sends a Ping message to Y; Y forwards the Ping message. The frequency of Ping messages is not part of the protocol, but they should be minimized.
4. All peers receiving the Ping message respond with a Pong message containing the number of files shared and their size in kbytes
5. X receives many Pong messages. It can then set up additional TCP connections
6. When a peer leaves the network, other peers try to connect sequentially to others
Gnutella Scoped Flooding
Searching by flooding:
 If you don’t have the file
you want, query 7 of your
neighbors.
 If they don’t have it,
they contact 7 of their
neighbors, for a maximum
hop count of 10.
 Requests are flooded, but
there is no tree
structure.
 No looping but packets
may be received twice.
 Reverse path forwarding
* Figure from http://computer.howstuffworks.com/file-sharing.htm
Gnutella protocol Query
 A Query message (each with a MessageID) is sent over existing TCP connections.
 Peers forward the Query message, keep track of the socket the message arrived on together with its message ID, and decrement the peer-count field.
 A QueryHit message is sent back over the reverse path using the message ID, so that peers can remove QueryHit messages from the network.
 Limited-scope query flooding has been implemented: the peer-count field of the query is decremented when it reaches a peer; when it reaches 0, the query is returned to the sender and not forwarded further.
 File transfer: HTTP
[Figure: Query messages fan out across the overlay; QueryHit messages retrace the reverse path to the requester]
 The maximum possible number of edges of a Gnutella overlay network with N nodes is N(N-1)/2 (a fully connected graph); real overlays are much sparser, since each peer keeps only a few neighbors.
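The scoped flooding and reverse-path QueryHit routing described on the last few slides can be summarized in a few lines of code. This is a minimal, illustrative sketch only, not the actual Gnutella wire protocol: plain Python objects (Peer, receive, send_hit are invented names) stand in for real peers and TCP sockets.

# Minimal sketch of Gnutella-style scoped flooding: TTL-limited forwarding,
# duplicate suppression by message ID, and QueryHits routed back along the
# reverse path toward the originator.
import uuid

class Peer:
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)        # file names this peer shares
        self.neighbors = []            # overlay neighbors (Peer objects)
        self.seen = {}                 # message_id -> peer the query came from

    def start_query(self, filename, ttl=10):
        msg_id = uuid.uuid4().hex
        self.seen[msg_id] = None       # None marks us as the originator
        for n in self.neighbors:
            n.receive(msg_id, filename, ttl, from_peer=self)

    def receive(self, msg_id, filename, ttl, from_peer):
        if msg_id in self.seen:        # duplicate query: drop it
            return
        self.seen[msg_id] = from_peer  # remember the reverse-path hop
        if filename in self.files:     # hit: answer along the reverse path
            self.send_hit(msg_id, filename, holder=self)
        elif ttl > 1:                  # still within scope: keep flooding
            for n in self.neighbors:
                n.receive(msg_id, filename, ttl - 1, from_peer=self)

    def send_hit(self, msg_id, filename, holder):
        prev = self.seen[msg_id]
        if prev is None:               # reached the originator
            print(f"{self.name}: '{filename}' found at {holder.name}")
        else:
            prev.send_hit(msg_id, filename, holder)

# usage: build a tiny overlay A - B - C and search it
a, b, c = Peer("A", []), Peer("B", []), Peer("C", ["heyjude.mp3"])
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]
a.start_query("heyjude.mp3")           # prints: A: 'heyjude.mp3' found at C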
Gnutella vs Napster
 Distribute file location and decentralize lookup.
 Idea: multicast the request
 How to find a file:
   Send the request to all neighbors
   Neighbors recursively multicast the request
   Eventually a machine that has the file receives the request, and it sends back the answer
 Advantages:
   Totally decentralized, highly robust
 Disadvantages:
   Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL)
Recap: P2P file sharing Arch
 Centralized Directory:
 Central Directory keeps track of peer IPs and their
shared content
 Example: Napster and Instant Messaging
 Distributed Query Flooding:
 Peers keep their own shared directories; content is located by querying nearby peers.
 Example: Gnutella protocol
 Distributed Heterogeneous Peers
 proprietary protocol, group leaders with high bandwidth
act as central directories searched by connected peers
 Example: KaZaA
Exploiting heterogeneity: KaZaA
 Proprietary protocol,
encrypts the control traffic
but not the data files
 Each peer is either a group
leader or assigned to a group
leader.
   TCP connection between a peer and its group leader.
   TCP connections between some pairs of group leaders.
 Group leader tracks the content in all its children.
[Figure legend: ordinary peer; group-leader peer; neighboring relationships in overlay network]
KaZaA: Querying
 Each file has a hash and a descriptor
 Client sends keyword query to its group
leader
 Group leader responds with matches:

For each match: metadata, hash, IP address
 If the group leader forwards the query to other group leaders, they respond with matches; limited-scope query flooding is also implemented by KaZaA.
 Client then selects files for downloading

HTTP requests using hash as identifier sent to
peers holding desired file
KaZaA tricks to improve performance
 Request queuing: each peer can limit the
#simultaneous uploads (~3-7) to avoid long
delays
 Incentive priorities: the more a peer
uploads the higher his priority to download
 Parallel downloading of a file across peers: a peer can download different portions of the same file from different peers using the byte-range header of HTTP.
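The byte-range trick above is easy to demonstrate with ordinary HTTP. The sketch below is illustrative only: it assumes hypothetical peer URLs that all serve the same file over plain HTTP with Range support, which stands in for KaZaA's proprietary transfer protocol.

# Sketch of parallel download via HTTP byte-range requests: split the file
# into ranges, fetch ranges from several peers concurrently, reassemble.
import concurrent.futures
import requests

def fetch_range(url, start, end):
    # ask one peer for bytes [start, end] of the file
    r = requests.get(url, headers={"Range": f"bytes={start}-{end}"}, timeout=30)
    r.raise_for_status()
    return start, r.content

def parallel_download(peer_urls, file_size, chunk=256 * 1024):
    ranges = [(off, min(off + chunk, file_size) - 1)
              for off in range(0, file_size, chunk)]
    data = bytearray(file_size)
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(peer_urls)) as ex:
        futures = [ex.submit(fetch_range, peer_urls[i % len(peer_urls)], s, e)
                   for i, (s, e) in enumerate(ranges)]   # round-robin over peers
        for f in concurrent.futures.as_completed(futures):
            start, blob = f.result()
            data[start:start + len(blob)] = blob          # place range in the file
    return bytes(data)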
BitTorrent
 P2P file-sharing communication protocol
 130 million installations, as of 2008.Q1
 torrent: group of peers exchanging chunks of a file
 tracker: each torrent has an infrastructure node, which keeps a record of the peers participating in the torrent
[Figure: a joining peer obtains a list of peers from the tracker, then trades chunks with the other peers in the torrent]
BitTorrent (1)
 file divided into 256KB chunks
 peer joining torrent:
   has no chunks, but will accumulate them over time
   registers with the tracker to get a list of peers, connects to a subset of peers (“neighbors”) concurrently over TCP
 Alice’s neighboring peers may fluctuate over time
 Alice periodically asks each of her neighbors for the list of chunks they have (pull chunks)
 while downloading, a peer uploads chunks to other peers
 peers may come and go
 once a peer has the entire file, it may (selfishly) leave or (altruistically) remain
BitTorrent (2)
Which chunk to pull first?
 at any given time, different peers have different subsets of file chunks
 periodically, a peer (Alice) asks each neighbor for the list of chunks that they have
 Alice issues requests for her missing chunks
   rarest first

Which request to respond to first: tit-for-tat trading
 Alice sends chunks to the four neighbors who are currently sending her chunks at the highest rate
   re-evaluate the top 4 every 10 secs
 every 30 secs: randomly select another peer, start sending it chunks
   the new peer may join the top 4
   random selection allows new peers to get chunks, so they can start to trade
 the trading algorithm helps eliminate the free-riding problem
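The two heuristics above are simple enough to sketch directly. The code below is a simplified illustration under assumed data structures (sets of chunk IDs, a dict of measured download rates), not the real BitTorrent client logic.

# Sketch of rarest-first chunk selection and tit-for-tat unchoking.
import random
from collections import Counter

def rarest_first(my_chunks, neighbor_chunks):
    """Pick the missing chunk held by the fewest neighbors."""
    counts = Counter()
    for chunks in neighbor_chunks.values():     # neighbor -> set of chunk IDs
        counts.update(chunks - my_chunks)       # count only chunks we still miss
    if not counts:
        return None
    return min(counts, key=counts.get)          # the rarest missing chunk

def choose_unchoked(download_rate, k=4):
    """Tit-for-tat: serve the k neighbors giving us the best download rate,
    plus one random 'optimistic unchoke' so newcomers can start trading."""
    top = sorted(download_rate, key=download_rate.get, reverse=True)[:k]
    others = [p for p in download_rate if p not in top]
    if others:
        top.append(random.choice(others))
    return top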
File Distribution: Server-Client vs P2P
Question: How much time does it take to distribute a file of size F from one server to N peers?
 us: server upload bandwidth
 ui: peer i upload bandwidth
 di: peer i download bandwidth
[Figure: a server with upload bandwidth us and N peers with upload/download bandwidths ui, di, connected through a network with abundant bandwidth]
File distribution time: server-client
 server sequentially sends N copies: NF/us time
 client i takes F/di time to download

Time to distribute F to N clients using the client/server approach:
   dcs = max { NF/us , F/min(di) }
which increases linearly in N (for large N)
File distribution time: P2P
 server must send one copy: F/us time
 client i takes F/di time to download
 NF bits must be downloaded in aggregate
 fastest possible aggregate upload rate: us + Σui

   dP2P = max { F/us , F/min(di) , NF/(us + Σui) }
Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us
[Plot: minimum distribution time (hours) vs. N (0–35) for the client-server and P2P approaches under these parameters]
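The plotted comparison can be reproduced by evaluating the two formulas directly. The sketch below plugs in the stated example parameters (F/u = 1 hour, us = 10u, dmin ≥ us, so the download term is at most F/us); the helper names are mine.

# Evaluate dcs and dP2P from the formulas above with F/u = 1 hour, us = 10u.
# Because dmin >= us, the F/min(di) term is at most F/us and never dominates.
def d_cs(N, F=1.0, us=10.0):
    return max(N * F / us, F / us)                       # max { NF/us, F/dmin }

def d_p2p(N, F=1.0, us=10.0, u=1.0):
    return max(F / us, F / us, N * F / (us + N * u))     # max { F/us, F/dmin, NF/(us + N*u) }

for N in (5, 10, 20, 30):
    print(N, round(d_cs(N), 2), round(d_p2p(N), 2))
# client-server time grows linearly with N; the P2P time stays below F/u = 1 hour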
Issues with P2P
 Free Riding (Free Loading)
 Two types of free riding
• Downloading but not sharing any data
• Not sharing any interesting data
 On Gnutella
• 15% of users contribute 94% of content
• 63% of users never responded to a query
– Didn’t have “interesting” data
 No ranking: what is a trusted source?
 “spoofing”
P2P Case study: Skype
 P2P (pc-to-pc, pc-to-phone, phone-to-pc) Voice-over-IP (VoIP) application
   also IM
 proprietary application-layer protocol (inferred via reverse engineering)
 hierarchical overlay
[Figure: Skype login server, supernodes (SN), and Skype clients (SC)]
 Founded by the same people as Kazaa
 Acquired by eBay in 2005 for $2.6B
 300 million accounts by 2008.Q1
Skype: making a call
 User starts Skype
 SC registers with a supernode (SN)
   using a cached SN list, or a hardwired list of bootstrap SNs
   any node could be an SN
 SC logs in (authenticates) with the Skype login server
 Call: SC contacts its SN with the callee ID
   SN contacts other SNs (unknown protocol, maybe flooding) to find the address of the callee; returns the address to the SC
 SC directly contacts the callee, over TCP
 Up/down link bandwidth of ~5 kbytes/sec
 Over 60% of clients are behind NAT. How to call them?
NAT: Network Address Translation
NAT translation table (at router 138.76.29.7):
   WAN side addr: 138.76.29.7, 5001  |  LAN side addr: 10.0.0.1, 3345

1: host 10.0.0.1 sends a datagram to 128.119.40.186, 80 (S: 10.0.0.1, 3345 / D: 128.119.40.186, 80)
2: NAT router changes the datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, and updates its table (S: 138.76.29.7, 5001 / D: 128.119.40.186, 80)
3: the reply arrives with dest. address 138.76.29.7, 5001 (S: 128.119.40.186, 80 / D: 138.76.29.7, 5001)
4: NAT router changes the datagram dest addr from 138.76.29.7, 5001 back to 10.0.0.1, 3345 (S: 128.119.40.186, 80 / D: 10.0.0.1, 3345)
[Figure: NAT router with public address 138.76.29.7 fronting private hosts 10.0.0.1–10.0.0.4]
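The four steps above amount to one table lookup in each direction. The toy sketch below (class and method names are invented for illustration) mimics the translation with the same example addresses; it is not a real NAT implementation.

# Toy sketch of NAT: outgoing packets get (private addr, port) rewritten to
# (public addr, fresh port) and recorded; replies are mapped back via the table.
import itertools

class NAT:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.ports = itertools.count(5001)      # next free WAN-side port
        self.table = {}                          # wan_port -> (lan_ip, lan_port)

    def outbound(self, src, dst):
        lan_ip, lan_port = src
        wan_port = next(self.ports)
        self.table[wan_port] = (lan_ip, lan_port)          # step 2: update table
        return (self.public_ip, wan_port), dst             # rewritten source addr

    def inbound(self, src, dst):
        wan_ip, wan_port = dst
        return src, self.table[wan_port]                    # step 4: back to LAN addr

nat = NAT("138.76.29.7")
s, d = nat.outbound(("10.0.0.1", 3345), ("128.119.40.186", 80))   # steps 1-2
print(s, d)                                      # ('138.76.29.7', 5001) ...
print(nat.inbound(("128.119.40.186", 80), s))    # steps 3-4: back to ('10.0.0.1', 3345)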
NAT traversal problem in Skype
 Relay approach:
   NATed client establishes a connection to the relay
   External client connects to the relay
   Relay bridges packets between the two connections
[Figure: 1. connection to relay initiated by the NATted host (10.0.0.1 behind NAT router 138.76.29.7); 2. connection to relay initiated by the external client; 3. relaying established]
Teleconference in Skype
 Involve multiple parties
in a VoIP session
 Mixing operation is
required: merging
several streams of voice
packets for delivery to
the receivers
 Possible mixing approaches:
   Each user has its own multicast tree
   One designated node handles mixing and subsequent operations
   Single multicast tree
   Decoupled: one tree for mixing and another for distribution
Skype Outage in Dec 2010
 On Dec 22, 2010, the P2P Skype network became unstable and suffered a critical failure
 It lasted about 24 hours. Causes:
   A cluster of support servers responsible for offline instant messaging became overloaded.
   As a result, some Skype clients received delayed responses from the overloaded servers.
   In one version of the Skype for Windows client (version 5.0.0152), the delayed responses from the overloaded servers were not properly processed, causing Windows clients running the affected version to crash.
   Around 50% of all Skype users globally were running this Skype version, and the crashes caused approximately 40% of those clients to fail. These clients included 25–30% of the publicly available supernodes, which also failed as a result of this problem.
Skype Outage in Dec 2010
 Causes (cont’)
 The failure of 25–30% of supernodes in the P2P network
resulted in an increased load on the remaining
supernodes.
 Supernodes have a built-in mechanism to protect themselves, and to avoid adverse impact on the systems hosting them, when operational parameters do not fall into expected ranges.
 The increased load in supernode traffic led to some of these parameters exceeding normal limits, and as a result more supernodes started to shut down. This further increased the load on the remaining supernodes and caused a positive feedback loop, which led to the near-complete failure that occurred a few hours after the triggering event.
 How to recover from this outage?
Skype Reading List
 Baset, S. A. and Schulzrinne, H. G. An Analysis of the Skype
Peer-to-Peer Internet Telephony Protocol. In Proceedings of
INFOCOM 2006.
 Caizzone, G. Et al. Analysis of the Scalability of the Overlay
Skype System. In Proceedings of ICC 2008.
 Kho, W., Baset, S. A., and Schulzrinne, H. G. Skype Relay Calls: Measurements and Experiments. In Proceedings of INFOCOM 2008.
 Gu, X., et al. peerTalk: A Peer-to-Peer Multiparty Voice-over-IP System. IEEE TPDS, 19(4):515–528, 2008.
 Lars Rabbe, CIO update: Post-mortem on the Skype outage,
http://blogs.skype.com/en/2010/12/cio_update.html
P2P Video Streaming
 A strong case for P2P
   YouTube costs $1B+ for network bandwidth
   >1 Mbps last-mile download bandwidth, and several hundred Kbps of upload bandwidth, are widely available
 Examples of the systems
   Tree-based push approach: peers are organized into a tree structure for data delivery, with each packet being disseminated using the same overlay structure.
   Pull-based data-driven approach: nodes maintain a set of partners, and periodically exchange data availability info with the partners
     • SplitStream, CoolStreaming, PPStream, PPLive
Video Streaming
 A generic arch for retrieving, storing, and playing
back video packets
   The original video packet stream is broken down into chunks, each with a unique chunk ID
   Buffer map: indicates the presence/absence of video chunks at the node
   A peer requests missing packets from its connected peers.
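The buffer-map idea above can be made concrete with a small sketch. The code below is illustrative only: chunk IDs are integers, the map is a plain bitmap over a fixed window, and the names (make_buffer_map, missing_chunks, WINDOW) are invented, not PPLive's actual format.

# Sketch of buffer maps: a peer advertises which chunks it holds as a bitmap
# and requests from a partner the chunks it is missing.
WINDOW = 64   # number of chunk slots covered by one buffer map

def make_buffer_map(have, base):
    """Bitmap for chunks [base, base+WINDOW): 1 = present, 0 = missing."""
    return [1 if (base + i) in have else 0 for i in range(WINDOW)]

def missing_chunks(my_map, partner_map, base):
    """Chunk IDs the partner has but we are missing."""
    return [base + i for i in range(WINDOW)
            if my_map[i] == 0 and partner_map[i] == 1]

# usage: exchange maps, then schedule requests (sequential, rarest-first, ...)
my_have = {100, 101, 103}
partner_have = {100, 102, 103, 104}
mine = make_buffer_map(my_have, base=100)
theirs = make_buffer_map(partner_have, base=100)
print(missing_chunks(mine, theirs, base=100))   # -> [102, 104]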
Key Issues
 Chunk size: determines the size of the buffer map, which needs to be exchanged frequently
 Replication strategies: how videos/chunks should be cached, dependent on the video coding
   Selection and replacement of videos
   Prefetching
 Chunk selection: which missing chunks should be downloaded first: sequential, rarest first, anchor first
   Anchor first: get all the chunks located at predefined anchor points
 Transmission strategies: to maximize the download rate and to minimize overheads
   Parallel download from multiple peers, but this may cause duplication
   Selective download of different chunks from multiple peers
   Sequential download of chunks from peer to peer
Scalable Coding
 Scalable Video Coding (SVC): split images into different hierarchical layers, each successive layer improving the image quality
   Extension of the H.264/MPEG-4 video compression standard
 Multiple Description Coding (MDC): a video stream is encoded into multiple substreams (aka descriptions), with different importance as to restoring the original content at the viewer’s machine
   E.g., in MPEG-2, the I-frames can be encoded as the first layer, the first P-frames as the second layer, the second P-frames as the third layer, and the B-frames as even higher layers
 MDC creates independent descriptions (bitstreams), while SVC creates hierarchical dependent layers
P2P Streaming Example: PPTV
 PPLive: free P2P-based IPTV
 As of January 2006, the PPLive network provided
   200+ channels with 400,000 daily users on average.
   The bit rates of video programs mainly range from 250 Kbps to 400 Kbps, with a few channels as high as 800 Kbps.
   The video content is mostly feeds from TV channels in Mandarin.
   The channels are encoded in two video formats: Windows Media Video (WMV) or Real Video (RMVB).
   The encoded video content is divided into chunks and distributed to users through the PPLive P2P network.
 Cached contents can be uploaded to other peers watching the same channel.
   A peer may upload cached video chunks to multiple peers.
   Received video chunks are reassembled in order and buffered in the queue of the PPLive TV engine, forming a local streaming file in memory.
   When the streaming file length crosses a predefined threshold, the PPLive TV engine launches the media player, which downloads video content from the local HTTP streaming server.
   After the buffer of the media player fills up to the required level, the actual video playback starts.
   When PPLive starts, the PPLive TV engine downloads media content from peers aggressively to minimize playback start-up delay.
   When the media player receives enough content and starts to play the media, the streaming process gradually stabilizes.
   The PPLive TV engine then streams data to the media player at the media playback rate.
Measurement setup
 “Insights into PPLive: A Measurement Study of a Large-Scale P2P IPTV System” by X. Hei et al.
 One residential and one campus PC “watched” channel CCTV3
 The other residential and campus PC “watched” channel CCTV10
 Each of these four traces lasted about 2 hours.
 From the PPLive web site, CCTV3 is a popular channel with a 5-star popularity grade and CCTV10 is less popular with a 3-star popularity grade.
Session durations
 Signaling versus video sessions
 All sessions are TCP based
 The median video session is about 20 seconds, and about 10% of video sessions last for 15 minutes or more.

Video traffic breakdown among sessions
Start-up delays
 Two types of start-up delay:
 the delay from when one channel is selected until the
streaming player pops up;
 the delay from when the player pops up until the playback
actually starts.
 The player pop-up delay is in general 10–15 seconds, and the player buffering delay is around 10–15 seconds.
 Therefore, the total start-up delay is around 20–30 seconds.
 Nevertheless, some less popular channels have a total start-up delay of up to 2 minutes.
Upload-download rates
Estimating the redundancy ratio
 It is possible to download the same video blocks more than once
 Excluding TCP/IP headers, determine the total streaming payload of the downloaded traffic.
 Using a video-traffic filtering heuristic (packet size > 1200 B), extract the video traffic.
 Given the playback interval and the media playback speed, obtain a rough estimate of the media segment size.
 Compute the redundant traffic as the difference between the total received video traffic and the estimated media segment size.
 Define the redundancy ratio as the ratio between the redundant traffic and the estimated media segment size.
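The estimate above is a straightforward computation once the trace is filtered. The sketch below follows the same steps; the function name, packet list, and the example numbers are illustrative, not values from the measurement study.

# Sketch of the redundancy-ratio estimate described above.
def redundancy_ratio(packet_sizes, playback_secs, playback_rate_bps):
    # 1. keep only (likely) video packets: payload larger than 1200 bytes
    video_bytes = sum(p for p in packet_sizes if p > 1200)
    # 2. rough media segment size from playback interval and playback speed
    segment_bytes = playback_secs * playback_rate_bps / 8
    # 3. redundant traffic = received video traffic - estimated segment size
    redundant = max(0.0, video_bytes - segment_bytes)
    # 4. redundancy ratio = redundant traffic / estimated segment size
    return redundant / segment_bytes

# e.g. a 2-hour trace of a 350 Kbps channel (made-up packet trace)
print(redundancy_ratio([1400] * 250_000, 2 * 3600, 350_000))   # ~0.11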
Dynamics of video participants
Peer arrivals & departures
Geographic distribution of peers
Video Streaming Reading
 “Opportunities and Challenges of Peer-to-
Peer Internet Video Broadcast” by J. Liu
et al. IEEE Proceedings, 2007.
 “Insights into PPLive: A Measurement Study of a Large-Scale P2P IPTV System” by X. Hei et al.
Unstructured vs Structured P2P
 The systems we described do not offer any
guarantees about their performance (or even
correctness)
 Structured P2P
   Scalable guarantees on the number of hops to answer a query
   Maintain all other P2P properties (load balance, self-organization, dynamic nature)
 Approach: Distributed Hash Tables (DHT)
Distributed Hash Tables (DHT)
A hash table is a data structure that uses a hash function to map identifying values (keys) to their associated values
 Stores (key, value) pairs
   The key is like a filename
   The value can be the file contents, or a pointer to its location
 Goal: efficiently insert/lookup/delete (key, value) pairs
 Each peer stores a subset of the (key, value) pairs in the system
 Core operation: find the node responsible for a key
   Map key to node
   Efficiently route insert/lookup/delete requests to this node
 Allow for frequent node arrivals/departures
DHT Design Goals
 An “overlay” network with:
   Flexible mapping of keys to physical nodes
   Small network diameter
   Small degree (fanout)
   Local routing decisions
   Robustness to churn
   Routing flexibility
   Decent locality (low “stretch”)
 A “storage” or “memory” mechanism with:
   No guarantees on persistence
   Maintenance via soft state
Basic Ideas
 keys are associated with globally unique
IDs

integers of size m (for large m)
 key ID space (search space) is uniformly
populated - mapping of keys to IDs using
(consistent) hashing
 a node is responsible for indexing all the
keys in a certain subspace (zone) of the ID
space
 nodes have only partial knowledge of other nodes’ responsibilities
DHT ID Assignment
 Assign an identifier to each peer in the range [0, 2^n − 1]; each identifier can be represented by n bits.
 Require each key to be an integer in the same range.
 Rule: assign a key to the peer that has the closest ID.
[Figure: consistent hashing maps node IDs and data IDs into the same identifier space; any metric space will do]
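A minimal sketch of consistent hashing, assuming the "closest successor" rule used on the circular-DHT slides that follow: node IDs and keys are hashed into the same m-bit circular space, and a key is stored at the first node whose ID is at or after the key's ID. Class and peer names are illustrative.

# Consistent hashing sketch: map node names and keys onto one ID circle.
import bisect
import hashlib

M = 16                                           # ID space is [0, 2^M - 1]

def chash(name):
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)

class Ring:
    def __init__(self, node_names):
        self.ids = sorted(chash(n) for n in node_names)
        self.by_id = {chash(n): n for n in node_names}

    def successor(self, key):
        kid = chash(key)
        i = bisect.bisect_left(self.ids, kid)    # first node ID >= key ID
        node_id = self.ids[i % len(self.ids)]    # wrap around the circle
        return self.by_id[node_id]

ring = Ring(["peerA", "peerB", "peerC", "peerD"])
print(ring.successor("Hey Jude.mp3"))            # peer responsible for this key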
Circular DHT (1)
[Figure: ring of peers with IDs 1, 3, 4, 5, 8, 10, 12, 15]
 Each peer is only aware of its immediate successor and predecessor.
 “Overlay network”
Circular DHT (2)
 O(N) messages on average to resolve a query, when there are N peers
 Define “closest” as the closest successor
[Figure: ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111; the query “Who’s responsible for key 1110?” is forwarded around the ring until peer 1111 answers “I am”]
Circular DHT with Shortcuts
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15 with shortcut links; the query “Who’s responsible for key 1110?” now takes fewer hops]
 Each peer keeps track of the IP addresses of its predecessor, successor, and shortcuts.
 Reduced from 6 to 2 messages.
Design Space
 How many shortcut neighbors should each peer
have, and which peers should be these shortcut
neighbors
 Possible to design shortcuts so that each peer has O(log N) neighbors and a query takes O(log N) messages
   E.g., Chord [Stoica 2001]
   High overhead for routing table maintenance in a large-scale network
 How to reduce maintenance overhead without compromising lookup efficiency?
   Constant-degree DHT with degree O(1) and O(log N) query efficiency
   E.g., Cycloid [Shen’06]
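A hedged sketch of the Chord-style shortcut idea referenced above, not the full Chord join/stabilize protocol: peer p keeps "fingers" pointing at the successors of p + 2^i, so each hop roughly halves the remaining distance and a lookup finishes in O(log N) messages. All function names and the example ring are illustrative.

# Finger-table lookup sketch over a small ID circle.
M = 6                                            # identifier space [0, 2^M)

def in_interval(x, a, b):
    """x in the circular interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def successor(ids, x):
    """First peer ID clockwise from x (ids is sorted)."""
    return next((n for n in ids if n >= x), ids[0])

def fingers(ids, p):
    """Shortcut table of peer p: successor(p + 2^i) for i = 0..M-1."""
    return [successor(ids, (p + 2 ** i) % 2 ** M) for i in range(M)]

def lookup(ids, start, key):
    """Route from 'start' toward the peer responsible for 'key', counting hops."""
    p, hops = start, 0
    while True:
        if p == key:                             # a peer whose ID equals the key owns it
            return p, hops
        nxt = successor(ids, (p + 1) % 2 ** M)   # p's immediate successor
        if in_interval(key, p, nxt):             # successor owns the key: final hop
            return nxt, hops + 1
        # otherwise jump along the farthest finger that does not pass the key
        cands = [f for f in fingers(ids, p) if in_interval(f, p, key)]
        p = max(cands, key=lambda f: (f - p) % 2 ** M) if cands else nxt
        hops += 1

peers = sorted([1, 3, 4, 5, 8, 10, 12, 15, 20, 33, 40, 52, 60])
print(lookup(peers, start=1, key=56))            # -> (60, 3): far fewer than O(N) hops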
Peer Churn
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]
• To handle peer churn, require each peer to know the IP address of its two successors.
• Each peer periodically pings its two successors to see if they are still alive.
 Peer 5 abruptly leaves
 Peer 4 detects; makes 8 its immediate successor;
asks 8 who its immediate successor is; makes 8’s
immediate successor its second successor.
 What if peer 13 wants to join?
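The repair rule above (peer 4 reacting to peer 5's abrupt departure) is simple to sketch. The Node class below is a toy stand-in for real peers; it only models the two-successor bookkeeping, not joins or data migration.

# Sketch of churn handling with a two-entry successor list.
class Node:
    def __init__(self, ident):
        self.id = ident
        self.succ1 = None                         # immediate successor
        self.succ2 = None                         # successor's successor
        self.alive = True

    def ping(self):
        return self.alive

    def check_successors(self):
        if not self.succ1.ping():                 # e.g. peer 5 abruptly leaves
            self.succ1 = self.succ2               # peer 4 makes 8 its immediate successor
            self.succ2 = self.succ1.succ1         # and asks 8 for its immediate successor

# ring fragment 4 -> 5 -> 8 -> 10
n4, n5, n8, n10 = Node(4), Node(5), Node(8), Node(10)
n4.succ1, n4.succ2 = n5, n8
n5.succ1, n5.succ2 = n8, n10
n8.succ1, n8.succ2 = n10, None
n5.alive = False                                  # peer 5 fails
n4.check_successors()
print(n4.succ1.id, n4.succ2.id)                   # -> 8 10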
Content Addressable Network (CAN): 2D Space
 Space is divided between nodes
 Together, all nodes cover the entire space
 Each node covers either a square or a rectangular area with side ratio 1:2 or 2:1
 Example:
   Node n1:(1, 2) is the first node that joins → it covers the entire space
[Figure: the whole 2-d coordinate space (0–7 on each axis) owned by n1]
CAN Example: 2D Space
 Node n2:(4, 2) joins → the space is divided between n1 and n2
[Figure: coordinate space split between n1 and n2]
CAN Example: 2D Space
 Node n3:(3, 5) joins → the space is divided again
[Figure: coordinate space now split among n1, n2, and n3]
CAN Example: 2D Space
 Nodes n4:(5, 5) and n5:(6, 6) join
[Figure: coordinate space now split among n1–n5]
CAN Example: 2D Space
 Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
 Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5)
 Each item is stored by the node that owns its mapping in the space
[Figure: items f1–f4 placed inside the zones of the nodes that own them]
CAN: Query Example
 Each node knows its neighbors in the d-dimensional space
 Forward the query to the neighbor that is closest to the query ID
 Example: assume n1 queries for f4
 Can route around some failures
   some failures require local flooding
[Figure: the query for f4 is forwarded greedily from n1 toward the zone containing f4:(7, 5)]
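The greedy-forwarding rule above can be sketched in a few lines. The code below is illustrative only: zones, neighbor lists, and names (Zone, route) are invented, it uses Euclidean distance to zone centres, and it omits CAN's join, split, and failure-recovery machinery.

# CAN-style greedy routing in a 2-d coordinate space: forward the query to the
# neighbor whose zone centre is closest to the query point, until the query
# lands in a zone that contains it.
import math

class Zone:
    def __init__(self, name, x0, y0, x1, y1):
        self.name, self.box = name, (x0, y0, x1, y1)
        self.neighbors = []

    def contains(self, p):
        x0, y0, x1, y1 = self.box
        return x0 <= p[0] < x1 and y0 <= p[1] < y1

    def centre(self):
        x0, y0, x1, y1 = self.box
        return ((x0 + x1) / 2, (y0 + y1) / 2)

def route(start, point):
    node, path = start, [start.name]
    while not node.contains(point):
        node = min(node.neighbors, key=lambda n: math.dist(n.centre(), point))
        path.append(node.name)
    return path

# a tiny 2-zone example: n1 owns the left half, n2 the right half of [0,8)x[0,8)
n1, n2 = Zone("n1", 0, 0, 4, 8), Zone("n2", 4, 0, 8, 8)
n1.neighbors, n2.neighbors = [n2], [n1]
print(route(n1, (7, 5)))     # query for an item stored at (7, 5) -> ['n1', 'n2']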
DHT Reading List
 Stoica, et al., Chord: a scalable peer-to-peer lookup service for Internet applications, Proc. 2001 ACM SIGCOMM
 Ratnasamy, et al., A scalable content-addressable network, Proc. 2001 ACM SIGCOMM
 Shen, et al., Cycloid: A constant-degree lookup efficient P2P network, Proc. of IPDPS’04. (Performance Evaluation, 2006)
 H. Shen and C. Xu, “Locality-aware and churn-resilient load balancing algorithms in structured peer-to-peer networks,” IEEE TPDS, vol. 18(6):849-862, June 2007.
 H. Shen and C. Xu, “Hash-based proximity clustering for efficient load balancing in heterogeneous DHT networks,” JPDC, vol. 68(5):686-702, May 2008.
 H. Shen and C. Xu, “Elastic routing table with provable performance for congestion control in DHT networks,” IEEE TPDS, vol. 21(2):242-256, February 2010.
 H. Shen and C. Xu, “Leveraging a compound graph based DHT for multi-attribute range queries with performance analysis,” IEEE Transactions on Computers, 2011 (accepted)
Examples of Network Services
 E-mail
 Web and DNS
 Instant messaging
 Remote login
 P2P file sharing
 Multi-user network games
 Streaming stored video clips
 Internet telephone
 Real-time video conference
 Cloud computing and storage