Pure P2P architecture

❒ no always-on server
❒ arbitrary end systems directly communicate
❒ peers are intermittently connected and change IP addresses
❒ Three topics:
  ❍ File sharing
  ❍ File distribution
  ❍ Searching for information
  ❍ Case studies: BitTorrent and Skype

P2P: centralized directory

original "Napster" design
1) when peer connects, it informs central server:
  ❍ IP address
  ❍ content
2) Alice queries for "Hey Jude"
3) Alice requests file from Bob

[Figure: peers, including Alice and Bob, register with and query a centralized directory server]
P2P: problems with centralized directory

❒ single point of failure
❒ performance bottleneck
❒ copyright infringement: "target" of lawsuit is obvious

file transfer is decentralized, but locating content is highly centralized

Query flooding: Gnutella

❒ fully distributed
  ❍ no central server
❒ public domain protocol
❒ many Gnutella clients implementing protocol

overlay network: graph
❒ edge between peer X and Y if there's a TCP connection
❒ all active peers and edges form overlay net
❒ edge: virtual (not physical) link
❒ a given peer is typically connected with < 10 overlay neighbors
Gnutella: protocol

❒ Query message sent over existing TCP connections
❒ peers forward Query message
❒ QueryHit sent over reverse path
❒ File transfer: HTTP
❒ Scalability: limited-scope flooding

[Figure: Query messages flood the overlay; the matching QueryHit travels back along the reverse path to the querying peer]

Hierarchical Overlay

❒ between the centralized index and query flooding approaches
❒ each peer is either a group leader or assigned to a group leader
  ❍ TCP connection between peer and its group leader
  ❍ TCP connections between some pairs of group leaders
❒ group leader tracks content in its children

[Figure: ordinary peers, group-leader peers, and neighboring relationships in the overlay network]
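To make Gnutella's limited-scope query flooding (above) concrete, here is a small Python sketch that is not part of the lecture: the overlay graph, file placement, and TTL value are invented for illustration, and real Gnutella messages carry descriptors rather than plain filenames.

```python
# Simplified sketch of limited-scope query flooding over a Gnutella-style
# overlay (illustrative only; peer names, overlay, files, and TTL are made up).

OVERLAY = {            # adjacency list: peer -> overlay neighbors
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
FILES = {"E": {"hey_jude.mp3"}, "C": {"something_else.mp3"}}

def flood_query(origin, filename, ttl=3):
    """Forward the Query to neighbors, decrementing TTL at each hop.
    Returns (peer_with_file, reverse_path) for the first QueryHit, or None."""
    frontier = [(origin, [origin], ttl)]
    visited = {origin}
    while frontier:
        peer, path, hops_left = frontier.pop(0)
        if filename in FILES.get(peer, set()):
            return peer, list(reversed(path))   # QueryHit travels the reverse path
        if hops_left == 0:
            continue                            # limited scope: stop flooding here
        for neighbor in OVERLAY[peer]:
            if neighbor not in visited:         # don't re-forward the same Query
                visited.add(neighbor)
                frontier.append((neighbor, path + [neighbor], hops_left - 1))
    return None

print(flood_query("A", "hey_jude.mp3"))   # ('E', ['E', 'D', 'B', 'A'])
```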
Distributed Hash Table (DHT)

❒ DHT = distributed P2P database
❒ Database has (key, value) pairs;
  ❍ key: ss number; value: human name
  ❍ key: content type; value: IP address
❒ Peers query DB with key
  ❍ DB returns values that match the key
❒ Peers can also insert (key, value) pairs

DHT Identifiers

❒ Assign integer identifier to each peer in range [0, 2^n - 1].
  ❍ Each identifier can be represented by n bits.
❒ Require each key to be an integer in same range.
❒ To get integer keys, hash original key.
  ❍ e.g., key = h("Led Zeppelin IV")
  ❍ This is why they call it a distributed "hash" table
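As a small illustration of the last point, the sketch below hashes an original key into the n-bit identifier space; the use of SHA-1 and n = 4 are arbitrary choices for the example, not something the slides prescribe.

```python
# Sketch: hashing an original key into the n-bit DHT identifier space
# [0, 2^n - 1]. SHA-1 and n = 4 are illustrative choices.
import hashlib

def dht_id(original_key, n=4):
    digest = hashlib.sha1(original_key.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** n)   # integer in [0, 2^n - 1]

print(dht_id("Led Zeppelin IV"))   # some integer between 0 and 15
```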
How to assign keys to peers?

❒ Central issue:
  ❍ Assigning (key, value) pairs to peers.
❒ Rule: assign key to the peer that has the closest ID.
❒ Convention in lecture: closest is the immediate successor of the key.
❒ Ex: n = 4; peers: 1, 3, 4, 5, 8, 10, 12, 14;
  ❍ key = 13, then successor peer = 14
  ❍ key = 15, then successor peer = 1

Circular DHT (1)

[Figure: peers 1, 3, 4, 5, 8, 10, 12, 15 arranged on a circle]

❒ Each peer only aware of immediate successor and predecessor.
❒ "Overlay network"
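The successor rule fits in a few lines of Python. This sketch just replays the slide's example (n = 4; peers 1, 3, 4, 5, 8, 10, 12, 14) and is not a full DHT implementation.

```python
# Sketch of the "closest = immediate successor" rule, using the slide's example.

def successor_peer(key, peers):
    """Return the peer responsible for `key`: the first peer ID >= key,
    wrapping around the identifier circle."""
    candidates = [p for p in sorted(peers) if p >= key]
    return candidates[0] if candidates else min(peers)   # wrap around

peers = [1, 3, 4, 5, 8, 10, 12, 14]
print(successor_peer(13, peers))   # 14
print(successor_peer(15, peers))   # 1 (wraps around the circle)
```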
Circular DHT (2)

❒ Define closest as closest successor
❒ O(N) messages on average to resolve a query, when there are N peers

[Figure: peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111 on a circle; peer 0001 asks "Who's responsible for key 1110?", the query is forwarded peer by peer, and 1111 answers "I am"]

Circular DHT with Shortcuts

❒ Each peer keeps track of the IP addresses of its predecessor, successor, and shortcuts.
❒ Reduced from 6 to 2 messages.
❒ Possible to design shortcuts so each peer has O(log N) neighbors and queries take O(log N) messages

[Figure: same circle, but peer 0001 also has shortcut links, so the query for key 1110 is resolved in 2 messages]
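The following sketch (not from the lecture) replays the 4-bit example: it resolves key 1110 starting from peer 0001, with and without shortcuts, and reproduces the 6-message vs. 2-message counts. The shortcut rule used here (links to the successor of id + 2^i, as in Chord) is one illustrative way to get O(log N) neighbors; the slides do not fix a particular rule.

```python
# Sketch: resolving a key on a circular DHT, with and without shortcuts.
# Ring and key match the slide's 4-bit example.

N_BITS = 4
RING = sorted([0b0001, 0b0011, 0b0100, 0b0101, 0b1000, 0b1010, 0b1100, 0b1111])

def successor(ident):
    """First peer whose ID is >= ident, wrapping around the circle."""
    return next((p for p in RING if p >= ident % 2 ** N_BITS), RING[0])

def ring_successor(peer):
    """The next peer clockwise after `peer`."""
    return RING[(RING.index(peer) + 1) % len(RING)]

def lookup(start, key, use_shortcuts):
    """Forward the query until some peer knows who is responsible for `key`
    (itself or its immediate successor). Returns (responsible peer, #messages)."""
    responsible = successor(key)
    peer, msgs = start, 0
    while peer != responsible and ring_successor(peer) != responsible:
        if use_shortcuts:
            known = {successor(peer + 2 ** i) for i in range(N_BITS)}  # Chord-style
        else:
            known = {ring_successor(peer)}        # only the immediate successor
        dist = lambda p: (p - peer) % 2 ** N_BITS
        # forward to the farthest known peer that does not overshoot the key
        peer = max((p for p in known if dist(p) <= dist(key)), key=dist)
        msgs += 1
    return responsible, msgs

print(lookup(0b0001, 0b1110, use_shortcuts=False))  # (15, 6): 6 messages
print(lookup(0b0001, 0b1110, use_shortcuts=True))   # (15, 2): 2 messages
```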
Peer Churn

• To handle peer churn, require each peer to know the IP address of its two successors.
• Each peer periodically pings its two successors to see if they are still alive.

[Figure: circle of peers 1, 3, 4, 5, 8, 10, 12, 15]

❒ Peer 5 abruptly leaves
❒ Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8's immediate successor its second successor.
❒ What if peer 13 wants to join?
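A minimal sketch of this repair step, assuming each peer simply stores references to its two successors (the Peer class and method name are invented for illustration):

```python
# Sketch of the churn-handling rule on the slide: each peer knows its two
# successors and repairs its successor list when the first one fails.

class Peer:
    def __init__(self, ident):
        self.id = ident
        self.successors = []          # [immediate successor, second successor]

    def handle_failed_successor(self):
        """Immediate successor stopped answering pings: promote the second
        successor and ask it for its own immediate successor."""
        _, second = self.successors
        self.successors = [second, second.successors[0]]

# Build the slide's ring: 1, 3, 4, 5, 8, 10, 12, 15
ids = [1, 3, 4, 5, 8, 10, 12, 15]
peers = {i: Peer(i) for i in ids}
for k, i in enumerate(ids):
    peers[i].successors = [peers[ids[(k + 1) % len(ids)]],
                           peers[ids[(k + 2) % len(ids)]]]

# Peer 5 abruptly leaves; peer 4 detects the failed ping and repairs:
peers[4].handle_failed_successor()
print([p.id for p in peers[4].successors])   # [8, 10]
```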
P2P Case study: Skype

❒ inherently P2P: pairs of users communicate.
❒ proprietary application-layer protocol (inferred via reverse engineering)
❒ hierarchical overlay with Supernodes (SNs)
❒ Index maps usernames to IP addresses; distributed over SNs

[Figure: Skype clients (SC) attach to Supernodes (SN); a Skype login server sits alongside the overlay]
Peers as relays

❒ Problem when both Alice and Bob are behind "NATs".
  ❍ NAT prevents an outside peer from initiating a call to an inside peer
❒ Solution:
  ❍ Using Alice's and Bob's SNs, a relay is chosen
  ❍ Each peer initiates a session with the relay.
  ❍ Peers can now communicate through NATs via the relay

File Distribution: Server-Client vs P2P

Question: How much time to distribute a file of size F from one server to N peers?

❒ us: server upload bandwidth
❒ ui: peer i upload bandwidth
❒ di: peer i download bandwidth

[Figure: a server with upload rate us and N peers with rates (u1, d1), ..., (uN, dN), connected by a network with abundant bandwidth]
File distribution time: server-client

❒ server sequentially sends N copies:
  ❍ NF/us time
❒ client i takes F/di time to download

Time to distribute F to N clients using the client/server approach:

  dcs = max { NF/us, F/min(di) }

increases linearly in N (for large N)

File distribution time: P2P

❒ server must send one copy: F/us time
❒ client i takes F/di time to download
❒ NF bits must be downloaded (aggregate)
❒ fastest possible upload rate: us + Σ ui

  dP2P = max { F/us, F/min(di), NF/(us + Σ ui) }
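As a quick numerical check of the two bounds, the sketch below evaluates them for identical clients, using the parameters of the example on the next slide (F/u = 1 hour, us = 10u, and dmin taken at its smallest allowed value, us). Variable names are ours, not the slides'.

```python
# Sketch: evaluating dcs and dP2P for identical clients.

def d_cs(F, us, dmin, N):
    """Client/server bound: N server copies, slowest client download."""
    return max(N * F / us, F / dmin)

def d_p2p(F, us, u, dmin, N):
    """P2P bound: one server copy, slowest download, and NF aggregate bits
    pushed through the total upload capacity us + N*u."""
    return max(F / us, F / dmin, N * F / (us + N * u))

u = 1.0                                   # client upload rate (arbitrary unit)
F, us, dmin = 1.0 * u, 10 * u, 10 * u     # F/u = 1 hour, us = 10u, dmin = us
for N in (5, 10, 20, 30):
    print(f"N={N:2d}  d_cs={d_cs(F, us, dmin, N):.2f} h  "
          f"d_p2p={d_p2p(F, us, u, dmin, N):.2f} h")
# d_cs grows linearly (3.00 h at N=30); d_p2p levels off below 1 hour
```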
Server-client vs. P2P: example

Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us

[Figure: minimum distribution time (hours) versus N (0–35); the client-server curve grows roughly linearly toward 3.5 hours, while the P2P curve levels off below 1 hour]

File distribution: BitTorrent

❒ P2P file distribution
❒ torrent: group of peers exchanging chunks of a file
❒ tracker: tracks peers participating in torrent

[Figure: a joining peer contacts the tracker to obtain a list of peers, then starts trading chunks with peers in the torrent]
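A toy sketch of the tracker interaction described above: the tracker only keeps the set of participating peers and hands a random subset to a joining peer. This is illustrative, not the real BitTorrent tracker (HTTP announce) protocol, and the addresses are made up.

```python
# Toy tracker: register peers in a torrent and return a peer list on request.
import random

class Tracker:
    def __init__(self):
        self.peers = set()                      # peers participating in torrent

    def announce(self, peer, max_peers=50):
        """Register `peer` and return a random subset of the other peers."""
        others = list(self.peers - {peer})
        self.peers.add(peer)
        return random.sample(others, min(max_peers, len(others)))

tracker = Tracker()
for addr in ("10.0.0.1:6881", "10.0.0.2:6881", "10.0.0.3:6881"):
    tracker.announce(addr)
print(tracker.announce("10.0.0.4:6881", max_peers=2))   # two existing peers
```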
BitTorrent-like systems

❒ File split into many smaller pieces
❒ Pieces are downloaded from both seeds and downloaders
❒ Distribution paths are dynamically determined

[Figure: a torrent with peer arrivals and departures; a peer spends its download time as a downloader, then a seed residence time as a seed]

Download using BitTorrent
Background: Incentive mechanism

❒ Establish connections to a large set of peers
  ❍ At each time, only upload to a small (changing) set of peers
  ❍ Based on data availability
❒ Rate-based tit-for-tat policy
  ❍ Downloaders give upload preference to the downloaders that provide the highest download rates
  ❍ Highest download rates: pick top four (x downloaders; y seeds)
  ❍ Optimistic unchoke: pick one at random
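The unchoking rule above can be sketched as: rank neighbors by the download rate they currently provide, upload to the top four, and optimistically unchoke one more at random. The peer names and rates below are made up.

```python
# Sketch of rate-based tit-for-tat with an optimistic unchoke.
import random

def choose_unchoked(download_rates, top=4):
    """download_rates: dict peer -> observed download rate from that peer."""
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = ranked[:top]                       # rate-based tit-for-tat
    others = ranked[top:]
    if others:
        unchoked.append(random.choice(others))    # optimistic unchoke
    return unchoked

rates = {"p1": 120, "p2": 80, "p3": 300, "p4": 10, "p5": 55, "p6": 0, "p7": 210}
print(choose_unchoked(rates))   # e.g. ['p3', 'p7', 'p1', 'p2', 'p6']
```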
Download using BitTorrent
Background: Piece selection

[Figure: peers 1, 2, ..., N each hold a subset of pieces 1, 2, 3, ..., k, ..., K; counting copies over the neighbor set gives each piece's availability, e.g. (1) (2) (1) ... (2) (3) (2)]

❒ Rarest-first piece selection policy
  ❍ Achieves high piece diversity
❒ Request pieces that
  ❍ the uploader has;
  ❍ the downloader is interested in (wants); and
  ❍ are the rarest among this set of pieces
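A compact sketch of rarest-first selection, assuming each neighbor advertises a bitfield of the pieces it holds (the example bitfields are made up):

```python
# Rarest-first: among pieces the uploader has and we still want, request the
# one with the fewest copies in the neighbor set.
from collections import Counter

def pick_piece(uploader_has, we_want, neighbor_bitfields):
    """Return the rarest eligible piece index, or None if nothing is eligible."""
    availability = Counter(p for bf in neighbor_bitfields for p in bf)
    eligible = uploader_has & we_want
    if not eligible:
        return None
    return min(eligible, key=lambda p: availability[p])

neighbors = [{1, 2}, {2, 3}, {2}]          # pieces held by each neighbor
print(pick_piece(uploader_has={1, 2, 3}, we_want={1, 2, 3},
                 neighbor_bitfields=neighbors))   # 1 or 3 (one copy each), not 2
```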
Live Streaming using BT-like systems

[Figure: pieces are uploaded/downloaded from the Internet and placed in the media player queue/buffer; pieces are exchanged only within a buffer window]

❒ Live streaming (e.g., CoolStreaming)
  ❍ All peers at roughly the same play/download position
    • High-bandwidth peers can easily contribute more ...
  ❍ (relatively) small buffer window
    • Within which pieces are exchanged
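A minimal sketch of the buffer-window idea, assuming pieces are indexed by playback position and only pieces within a small window ahead of the play position are eligible for exchange (window size and positions are illustrative):

```python
# Sketch: restrict piece exchange to a small buffer window ahead of playback.

def exchangeable_pieces(play_position, have, window=8):
    """Pieces in [play_position, play_position + window) that we still miss."""
    window_pieces = set(range(play_position, play_position + window))
    return sorted(window_pieces - have)

have = {100, 101, 103, 104, 107}
print(exchangeable_pieces(play_position=100, have=have))   # [102, 105, 106]
```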
Peer-assisted VoD streaming
Some research questions ...

❒ Can BitTorrent-like protocols provide scalable on-demand streaming?
❒ How sensitive is the performance to the application configuration parameters?
  ❍ Piece selection policy
  ❍ Peer selection policy
  ❍ Upload/download bandwidth
❒ What is the user-perceived performance?
  ❍ Start-up delay
  ❍ Probability of disrupted playback

ACM SIGMETRICS 2008; IFIP Networking 2007; IFIP Networking 2009