2011 - IDA

TDDD36: Peer-to-peer

Client-server architecture
server:
❍ always-on host
❍ permanent IP address
❍ server farms for scaling
clients:
❍ communicate with server
❍ may be intermittently connected
❍ may have dynamic IP addresses
❍ do not communicate directly with each other
[Figure: client/server]

Pure P2P architecture
❒ no always-on server
❒ arbitrary end systems directly communicate
❒ peers are intermittently connected and change IP addresses
❒ Topics:
❍ File sharing
❍ File distribution
❍ Searching for information
❍ Case Studies: BitTorrent and Skype
[Figure: peer-peer]

Hybrid of client-server and P2P
Skype
❍ voice-over-IP P2P application
❍ centralized server: finding address of remote party
❍ client-client connection: direct (not through server)
Instant messaging
❍ chatting between two users is P2P
❍ centralized service: client presence detection/location
• user registers its IP address with central server when it comes online
• user contacts central server to find IP addresses of buddies

P2P file sharing
Example
❒ Alice runs P2P client application on her notebook computer
❒ intermittently connects to Internet; gets new IP address for each connection
❒ asks for “Hey Jude”
❒ application displays other peers that have copy of Hey Jude.
❒ Alice chooses one of the peers, Bob.
❒ file is copied from Bob’s PC to Alice’s notebook: HTTP
❒ while Alice downloads, other users can download from Alice.
❒ Alice’s peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!

P2P: centralized directory
original “Napster” design
1) when peer connects, it informs central server:
❍ IP address
❍ content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob
[Figure: peers, including Alice and Bob, around a centralized directory server; arrows numbered 1, 2, 3 mark the steps above]

P2P: problems with centralized directory
❒ single point of failure
❒ performance bottleneck
❒ copyright infringement: “target” of lawsuit is obvious
file transfer is decentralized, but locating content is highly centralized
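The three steps of the centralized directory design can be illustrated with a small in-memory sketch. This is not Napster's actual protocol: the class name `CentralDirectory`, its methods, and the IP addresses are hypothetical, and step 3 (the peer-to-peer file transfer itself) is left out.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy stand-in for a centralized directory server (hypothetical API)."""

    def __init__(self):
        self._index = defaultdict(set)  # title -> set of peer IP addresses

    def register(self, ip, titles):
        # Step 1: a connecting peer reports its IP address and its content.
        for title in titles:
            self._index[title].add(ip)

    def unregister(self, ip):
        # Peers connect intermittently, so their entries must be removable.
        for peers in self._index.values():
            peers.discard(ip)

    def query(self, title):
        # Step 2: return the peers that currently hold the title.
        return sorted(self._index.get(title, set()))


directory = CentralDirectory()
directory.register("10.0.0.2", ["Hey Jude", "Let It Be"])  # e.g. Bob
directory.register("10.0.0.3", ["Hey Jude"])               # another peer
print(directory.query("Hey Jude"))
```

Every lookup goes through this one object, which is exactly why the design has a single point of failure and a performance bottleneck.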
Query flooding: Gnutella
❒ fully distributed
❍ no central server
❒ public domain protocol
❒ many Gnutella clients implementing protocol
overlay network: graph
❒ edge between peer X and Y if there’s a TCP connection
❒ all active peers and edges form overlay net
❒ edge: virtual (not physical) link
❒ given peer typically connected with < 10 overlay neighbors

Gnutella: protocol
❒ Query message sent over existing TCP connections
❒ peers forward Query message
❒ QueryHit sent over reverse path
File transfer: HTTP
Scalability: limited scope flooding
[Figure: Query messages flood through the overlay; QueryHit messages return along the reverse path]
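Limited-scope query flooding can be sketched as a breadth-first traversal of the overlay graph with a hop limit. This is a simplified model, not the Gnutella wire protocol: the function name `flood_query`, the `ttl` parameter, and the example overlay are assumptions made for illustration.

```python
from collections import deque


def flood_query(overlay, holders, source, ttl):
    """Flood a Query from `source`, limited to `ttl` overlay hops.

    Returns {peer: hops} for every reached peer that holds the content;
    each such peer would answer with a QueryHit along the reverse path.
    """
    hops = {source: 0}
    queue = deque([source])
    hits = {}
    while queue:
        peer = queue.popleft()
        if peer in holders and peer != source:
            hits[peer] = hops[peer]
        if hops[peer] == ttl:
            continue  # limited scope: do not forward past the hop limit
        for neighbor in overlay[peer]:
            if neighbor not in hops:  # each peer handles a Query only once
                hops[neighbor] = hops[peer] + 1
                queue.append(neighbor)
    return hits


overlay = {
    "Alice": ["B", "C"],
    "B": ["Alice", "D"],
    "C": ["Alice", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
holders = {"D", "E"}
print(flood_query(overlay, holders, "Alice", ttl=2))  # {'D': 2}
print(flood_query(overlay, holders, "Alice", ttl=3))  # {'D': 2, 'E': 3}
```

A smaller hop limit means fewer messages, but content that sits farther away (peer E here) is not found.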
Gnutella: Peer joining
1. joining peer Alice must find another peer in Gnutella network: use list of candidate peers
2. Alice sequentially attempts TCP connections with candidate peers until connection setup with Bob
3. Flooding: Alice sends Ping message to Bob; Bob forwards Ping message to his overlay neighbors (who then forward to their neighbors…)
❒ peers receiving Ping message respond to Alice with Pong message
4. Alice receives many Pong messages, and can then set up additional TCP connections

Hierarchical Overlay
❒ between centralized index, query flooding approaches
❒ each peer is either a group leader or assigned to a group leader.
❍ TCP connection between peer and its group leader.
❍ TCP connections between some pairs of group leaders.
❒ group leader tracks content in its children
[Figure legend: ordinary peer; group-leader peer; neighboring relationships in overlay network]
Distributed Hash Table (DHT)
❒ DHT = distributed P2P database
❒ Database has (key, value) pairs;
❍ key: social security number; value: human name
❍ key: content type; value: IP address
❒ Peers query DB with key
❍ DB returns values that match the key
❒ Peers can also insert (key, value) pairs

DHT Identifiers
❒ Assign integer identifier to each peer in range [0, 2^n - 1].
❍ Each identifier can be represented by n bits.
❒ Require each key to be an integer in same range.
❒ To get integer keys, hash original key.
❍ e.g., key = h(“Led Zeppelin IV”)
❍ This is why they call it a distributed “hash” table
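A minimal sketch of the hash step h(·) that maps an original key into the n-bit identifier range. The slides do not name a hash function; SHA-1 and the helper name `dht_id` are assumptions chosen for illustration.

```python
import hashlib


def dht_id(original_key, n_bits):
    """Map an arbitrary string key to an integer in [0, 2**n_bits - 1]."""
    # SHA-1 is an assumed choice; any well-mixing hash would serve here.
    digest = hashlib.sha1(original_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** n_bits)


key = dht_id("Led Zeppelin IV", n_bits=4)
print(key)  # some integer between 0 and 15
```

Because the hash is deterministic, every peer that hashes the same original key obtains the same integer key and therefore looks in the same place.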
How to assign keys to peers?
❒ Central issue:
❍ Assigning (key, value) pairs to peers.
❒ Rule: assign key to the peer that has the closest ID.
❒ Convention in lecture: closest is the immediate successor of the key.
❒ Ex: n=4; peers: 1,3,4,5,8,10,12,14;
❍ key = 13, then successor peer = 14
❍ key = 15, then successor peer = 1

Circular DHT (1)
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]
❒ Each peer only aware of immediate successor and predecessor.
❒ “Overlay network”
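The successor convention from the example (n = 4; peers 1, 3, 4, 5, 8, 10, 12, 14) can be checked with a short sketch. The function name `successor_peer` is hypothetical, and assigning a key that equals a peer ID to that same peer is an assumption.

```python
from bisect import bisect_left


def successor_peer(key, peer_ids):
    """Return the peer responsible for `key`: the first peer ID >= key,
    wrapping around to the smallest ID when the key exceeds every peer ID."""
    ring = sorted(peer_ids)
    index = bisect_left(ring, key)
    return ring[index % len(ring)]


peers = [1, 3, 4, 5, 8, 10, 12, 14]
print(successor_peer(13, peers))  # 14
print(successor_peer(15, peers))  # 1
```

This version uses a global view of all peer IDs, which no single peer in the overlay actually has; it only illustrates the assignment rule.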
Circle DHT (2)
O(N) messages on avg to resolve query, when there are N peers
Define closest as closest successor
[Figure: ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111; the query “Who’s responsible for key 1110?” is passed from successor to successor until peer 1111 answers “I am”]

Circular DHT with Shortcuts
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15 with shortcut links; query “Who’s responsible for key 1110?”]
❒ Each peer keeps track of IP addresses of predecessor, successor, short cuts.
❒ Reduced from 6 to 2 messages.
❒ Possible to design shortcuts so O(log N) neighbors, O(log N) messages in query
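The message count without shortcuts can be reproduced with a small sketch of successor-by-successor forwarding. Starting the query at peer 0011 is an assumption that matches the "6 messages" figure; the function name `ring_lookup` is hypothetical, the responsible peer is computed from a global view for brevity, and the "I am" reply is not counted.

```python
def ring_lookup(start, key, peer_ids):
    """Forward a query clockwise, one successor at a time, until it reaches
    the peer responsible for `key`. Returns (responsible peer, messages)."""
    ring = sorted(peer_ids)
    # Responsible peer: first ID >= key, wrapping to the smallest ID.
    responsible = next((p for p in ring if p >= key), ring[0])
    position = ring.index(start)
    messages = 0
    while ring[position] != responsible:
        position = (position + 1) % len(ring)  # hand the query to the successor
        messages += 1
    return responsible, messages


peers = [0b0001, 0b0011, 0b0100, 0b0101, 0b1000, 0b1010, 0b1100, 0b1111]
print(ring_lookup(0b0011, 0b1110, peers))  # (15, 6)
```

In the worst case the query visits almost every peer, which is where the O(N) cost comes from and what shortcuts are meant to avoid.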
Peer Churn
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]
• To handle peer churn, require each peer to know the IP address of its two successors.
• Each peer periodically pings its two successors to see if they are still alive.
❒ Peer 5 abruptly leaves
❒ Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8’s immediate successor its second successor.
❒ What if peer 13 wants to join?

P2P Case study: Skype
❒ inherently P2P: pairs of users communicate.
❒ proprietary application-layer protocol (inferred via reverse engineering)
❒ hierarchical overlay with Supernodes (SNs)
❒ Index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), Supernodes (SN), and the Skype login server]
Peers as relays
❒ Problem when both Alice and Bob are behind “NATs”.
❍ NAT prevents an outside peer from initiating a call to an inside peer
❒ Solution:
❍ Using Alice’s and Bob’s SNs, Relay is chosen
❍ Each peer initiates session with relay.
❍ Peers can now communicate through NATs via relay

Scalable Content Delivery
Motivation
❒ Use of Internet for content delivery is massive … and becoming more so (e.g., recent projection that by 2013, 90% of all IP traffic will be video content)
❒ How to make scalable and efficient?
❍ Here: Cost efficiency (e.g., energy efficiency) ...
❒ Variety of approaches: broadcast/multicast, batching, replication/caching (e.g. CDNs), P2P, peer-assisted, …
❒ In these slides (only examples of a few peer-assisted techniques):
❍ BitTorrent (peer-to-peer)
❍ Peer-assisted streaming
File Distribution: Server-Client vs P2P
Question: How much time to distribute file from one server to N peers?
u_s: server upload bandwidth
u_i: peer i upload bandwidth
d_i: peer i download bandwidth
[Figure: server holding a file of size F with upload bandwidth u_s; peers 1..N with upload bandwidths u_1..u_N and download bandwidths d_1..d_N; network (with abundant bandwidth)]
File distribution time: server-client
❒ server sequentially sends N copies:
❍ NF/u_s time
❒ client i takes F/d_i time to download
Time to distribute F to N clients using client/server approach:
$d_{cs} = \max\{\, NF/u_s,\ F/\min_i d_i \,\}$
increases linearly in N (for large N)
File distribution time: P2P
❒ server must send one copy: F/u_s time
❒ client i takes F/d_i time to download
❒ NF bits must be downloaded (aggregate)
❒ fastest possible upload rate: u_s + Σ_i u_i
$d_{P2P} = \max\{\, F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \,\}$

Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, u_s = 10u, d_min ≥ u_s
[Figure: Minimum Distribution Time (vertical axis, 0 to 3.5) versus N (horizontal axis, 0 to 35), with one curve for P2P and one for Client-Server]
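The two distribution-time bounds can be evaluated directly for the example parameters (F/u = 1 hour, u_s = 10u). The sketch below takes d_min equal to u_s, which is an assumption consistent with d_min ≥ u_s; the function names and the chosen values of N are illustrative.

```python
def client_server_time(n, file_size, server_up, min_down):
    """d_cs = max(N*F/u_s, F/d_min)."""
    return max(n * file_size / server_up, file_size / min_down)


def p2p_time(n, file_size, server_up, min_down, peer_ups):
    """d_P2P = max(F/u_s, F/d_min, N*F/(u_s + sum of u_i))."""
    return max(
        file_size / server_up,
        file_size / min_down,
        n * file_size / (server_up + sum(peer_ups)),
    )


u = 1.0        # client upload rate (file units per hour)
F = 1.0        # file size, chosen so that F/u = 1 hour
u_s = 10 * u   # server upload rate
d_min = u_s    # assumed: slowest download rate equals u_s

for n in (5, 10, 20, 30):
    cs = client_server_time(n, F, u_s, d_min)
    p2p = p2p_time(n, F, u_s, d_min, [u] * n)
    print(f"N={n:2d}  client-server={cs:.2f} h  P2P={p2p:.2f} h")
```

At N = 30 this gives 3.00 hours for client-server against 0.75 hours for P2P: the client-server time grows linearly in N, while the P2P time N/(10 + N) stays below one hour.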
File distribution: BitTorrent
❒ P2P file distribution
tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file
[Figure: a peer obtains a list of peers from the tracker, then trades chunks with them]

BitTorrent-like systems
❒ File split into many smaller pieces
❒ Pieces are downloaded from both seeds and downloaders
❒ Distribution paths are dynamically determined
❍ Based on data availability
[Figure: torrent with x downloaders and y seeds; peers arrive as downloaders, become seeds after the download time, and depart after the seed residence time]
Background: Peer discovery in BitTorrent
❒ Torrent file
❍ “announce” URL
❒ Tracker
❍ Register torrent file
❍ Maintain state information
❒ Peers
❍ Obtain torrent file
❍ Announce
❍ Report status
❍ Peer exchange (PEX)
❒ Issues
❍ Central point of failure
❍ Tracker load
Swarm = Torrent

Background: Multi-tracked torrents
❒ Torrent file
❍ “announce-list” URLs
❒ Trackers
❍ Register torrent file
❍ Maintain state information
❒ Peers
❍ Obtain torrent file
❍ Choose one tracker at random
❍ Announce
❍ Report status
❍ Peer exchange (PEX)
❒ Issue
❍ Multiple smaller swarms
Swarm ⊆ Torrent
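Why random tracker choice leads to multiple smaller swarms can be seen in a short simulation. This follows only the "choose one tracker at random" rule from the slides, not the full multi-tracker behavior of real clients; the tracker names, the peer count, and the function name `assign_to_trackers` are hypothetical.

```python
import random
from collections import Counter


def assign_to_trackers(num_peers, announce_list, seed=0):
    """Each peer independently picks one tracker from the announce-list.

    Returns the resulting swarm size per tracker.
    """
    rng = random.Random(seed)  # seeded so that repeated runs agree
    return Counter(rng.choice(announce_list) for _ in range(num_peers))


trackers = ["tracker-a.example", "tracker-b.example", "tracker-c.example"]
swarms = assign_to_trackers(300, trackers)
for tracker, size in sorted(swarms.items()):
    print(tracker, size)
```

Each tracker only knows the peers that announced to it, so the 300 peers of the torrent end up split into separate swarms that are each a subset of the torrent.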
Download using BitTorrent
Background: Incentive mechanism
❒ Establish connections to large set of peers
❍ At each time, only upload to a small (changing) set of peers
❒ Rate-based tit-for-tat policy
❍ Downloaders give upload preference to the downloaders that provide the highest download rates
[Figure: Highest download rates: pick top four. Optimistic unchoke: pick one at random]
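The rate-based tit-for-tat selection (top four by download rate, plus one optimistic unchoke picked at random) can be sketched as follows. The function name `choose_unchoked`, the peer names, and the rates are made up for illustration; real clients re-run this selection periodically, which is not modeled here.

```python
import random


def choose_unchoked(download_rates, top_n=4, rng=random):
    """Pick the peers to upload to.

    Returns (regular, optimistic): the `top_n` peers that currently give us
    the highest download rates, and one other peer chosen at random.
    """
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = ranked[:top_n]
    others = ranked[top_n:]
    optimistic = rng.choice(others) if others else None
    return regular, optimistic


rates = {"p1": 120, "p2": 80, "p3": 300, "p4": 10, "p5": 95, "p6": 0, "p7": 40}
regular, optimistic = choose_unchoked(rates, rng=random.Random(1))
print(regular)     # ['p3', 'p1', 'p5', 'p2']
print(optimistic)  # one of 'p7', 'p4', 'p6'
```

The random pick gives a peer that has not yet uploaded anything (p6 here) a chance to receive data and start reciprocating.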
Download using BitTorrent
Background: Piece selection
[Figure: peers 1, 2, …, N each hold some of the pieces 1, 2, 3, …, k, …, K; each piece in the neighbor set is annotated with its number of copies, e.g. (1), (2), (3)]
❒ Rarest first piece selection policy
❍ Achieves high piece diversity
❒ Request a piece that
❍ the uploader has;
❍ the downloader is interested in (wants); and
❍ is the rarest among this set of pieces

Live Streaming using BT-like systems
[Figure: piece upload/downloads from and to the Internet fill a buffer window, which feeds the media player queue/buffer]
❒ Live streaming (e.g., CoolStreaming)
❍ All peers at roughly the same play/download position
• High bandwidth peers can easily contribute more …
❍ (relatively) Small buffer window
• Within which pieces are exchanged
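The rarest-first piece selection policy described above can be sketched with a counter over the pieces held in the neighbor set. The function name `rarest_first` and the example piece sets are hypothetical, and breaking ties by the lowest piece index is an arbitrary choice.

```python
from collections import Counter


def rarest_first(wanted, uploader_has, neighbor_pieces):
    """Pick the piece to request from one uploader.

    Candidates are pieces the uploader has and the downloader still wants;
    among them, return the one held by the fewest neighbors.
    """
    counts = Counter(
        piece for pieces in neighbor_pieces.values() for piece in pieces
    )
    candidates = wanted & uploader_has
    if not candidates:
        return None
    # Tie-break by piece index (arbitrary choice for this sketch).
    return min(candidates, key=lambda piece: (counts[piece], piece))


neighbors = {
    "peer1": {1, 2, 3},
    "peer2": {1, 3},
    "peer3": {3, 4},
}
wanted = {2, 3, 4}
print(rarest_first(wanted, neighbors["peer1"], neighbors))  # 2
print(rarest_first(wanted, neighbors["peer3"], neighbors))  # 4
```

Piece 3 is held by all three neighbors, so it is requested last; fetching the scarce pieces 2 and 4 first is what keeps piece diversity high.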
Peer-assisted VoD streaming
Some research questions ...
❒ Can BitTorrent-like protocols provide scalable on-demand streaming?
❒ How sensitive is the performance to the application configuration parameters?
❍ Piece selection policy
❍ Peer selection policy
❍ Upload/download bandwidth
❒ What is the user-perceived performance?
❍ Start-up delay
❍ Probability of disrupted playback
ACM SIGMETRICS 2008; IFIP Networking 2007; IFIP Networking 2009

A final note …
❒ If you are interested in one (or more) topic(s) in this course, please do not hesitate to contact me ...
❒ There are always lots of interesting problems to work on (for a thesis project (exjobb), for example)!!!
❒ You can also find out more about my research here: www.ida.liu.se/~nikca/