3rd Edition: Chapter 2

Chapter 2
Application Layer
Part 5: P2P
Computer Networking: A Top Down Approach
6th edition
Jim Kurose, Keith Ross
Addison-Wesley
March 2012
2: Application Layer
1
Chapter 2: Application layer
 2.1 Principles of network applications
 2.2 Web and HTTP
 2.3 FTP
 2.4 Electronic Mail: SMTP, POP3, IMAP
 2.5 DNS
 2.6 P2P applications
 2.7 Socket programming with UDP
 2.8 Socket programming with TCP
Pure P2P architecture
 no always-on server
 arbitrary end systems communicate directly, peer to peer
 peers are intermittently connected and change IP addresses
 in the simplest case a peer sends an entire file to another peer; some systems (like BitTorrent) break files up into pieces
Pure P2P architecture
 We’ll consider three examples:
  • File distribution
  • Searching for information
  • Case study: Skype
File Distribution: Server-Client vs P2P
Question: How much time does it take to distribute a file from one server to N peers?
Distribution time: the time it takes to get a copy of the file to all N peers.
Assumptions:
1. The Internet core has abundant bandwidth, so all bottlenecks are in the access networks.
2. The server and the peers are not doing anything else on the Internet.
File Distribution: Server-Client vs P2P
Notation (from the figure: one server and N peers attached to a network with abundant bandwidth):
 F: size of the file
 us: server upload bandwidth
 ui: peer i upload bandwidth
 di: peer i download bandwidth
File Distribution: Server-Client vs P2P
Question: How much time to distribute file from one server to N peers?
 Time to upload file from server for first peer: F/us
 Time to upload file from server for second peer: F/us
 …
 Time to upload file from server for Nth peer: F/us
 Time to download file to first peer: F/d1 (could be long)
 Time to download file to second peer: F/d2
 The smaller di, the larger F/di and the longer the download time.
 A peer could start downloading before the entire file has been uploaded by the server.
File distribution time: server-client
 server sequentially sends N copies: NF/us time
 client i takes F/di time to download
 Time to distribute F to N clients using the client/server approach:
  $d_{cs} \ge \max\{ NF/u_s,\ F/\min_i d_i \}$
 The NF/us term increases linearly in N (for large N)
File distribution time: P2P
 server must send one copy: F/us time
 client i takes F/di time to download
 NF bits must be downloaded (aggregate)
 fastest possible upload rate: us + Σui
  • In other words, if every node had a copy of the file and every node uploaded the file simultaneously, then the total rate of upload would be the sum of all the rates.
File Distribution: Server-Client vs P2P
Question: How much time to distribute file from one server to N peers?
 Time to upload file from server for first peer: F/us
 Time to upload file from server for second peer: F/us
 …
 Time to upload file from server for Nth peer: F/us
 Total time to download file to first peer (from many other peers): F/d1 (all the bits have to be downloaded)
 Time to download file to second peer: F/d2
 The smaller di, the larger F/di and the longer the download time.
File distribution time: P2P
 server must send one copy: F/us time
 client i takes F/di time to download
 NF bits must be downloaded (aggregate)
 fastest possible upload rate: us + Σui
 $d_{P2P} \ge \max\{ F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \}$
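The client-server and P2P lower bounds can be evaluated directly. The following is a minimal Python sketch: the function names `client_server_time` and `p2p_time` are illustrative and not from the slides, and any rates passed in are made-up sample values.

```python
def client_server_time(F, u_s, d):
    """Lower bound on distribution time for the client-server approach.

    F   : file size (e.g. in Mbit)
    u_s : server upload bandwidth (same units per second)
    d   : list of peer download bandwidths d_i
    """
    N = len(d)
    # The server must push N copies; the slowest peer must pull one copy.
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, u, d):
    """Lower bound on distribution time for the P2P approach.

    u : list of peer upload bandwidths u_i (same length as d)
    """
    N = len(d)
    # One copy leaves the server, the slowest peer pulls one copy,
    # and N*F bits in total are served at the aggregate upload rate.
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))
```

As a usage example with hypothetical numbers, `client_server_time(1000, 100, [50, 100, 200])` gives 30.0 seconds, limited by the server upload, while `p2p_time(1000, 100, [10, 20, 30], [50, 100, 200])` gives 20.0 seconds, limited by the slowest downloader.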
Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us
[Figure: Minimum Distribution Time (y-axis, 0 to 3.5) versus N (x-axis, 0 to 35), with one curve for P2P and one for Client-Server]
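The two curves can be worked out from the bounds, reading the example as every peer having the same upload rate u (so the sum of the peer upload rates is Nu). With F/u = 1 hour, us = 10u and dmin ≥ us, the download term F/dmin is at most 0.1 hour and never dominates:

```latex
\begin{aligned}
d_{cs}  &\ge \max\left\{ \frac{NF}{10u},\ \frac{F}{d_{\min}} \right\}
         = \frac{N}{10}\ \text{hours} \quad (N \ge 1), \\
d_{P2P} &\ge \max\left\{ \frac{F}{10u},\ \frac{F}{d_{\min}},\ \frac{NF}{(10+N)\,u} \right\}
         = \max\left\{ 0.1,\ \frac{N}{10+N} \right\}\ \text{hours}.
\end{aligned}
```

At N = 30 this gives 3 hours for client-server and 0.75 hours for P2P. The P2P bound stays below 1 hour for every N, while the client-server bound grows linearly.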
File distribution: BitTorrent
 P2P file distribution
 tracker: tracks peers participating in torrent
 torrent: group of peers exchanging chunks of a file
[Figure: a peer obtains a list of peers from the tracker, then trades chunks with those peers]
BitTorrent (1)
 file divided into 256KB chunks.
 peer joining torrent:
  • has no chunks, but will accumulate them over time
  • registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
 while downloading, peer uploads chunks to other peers.
 peers may come and go
 once peer has entire file, it may (selfishly) leave or (altruistically) remain
BitTorrent (2)
Pulling Chunks
 at any given time, different peers have different subsets of file chunks
 periodically, a peer (Alice) asks each neighbor for the list of chunks that it has.
 Alice sends requests for her missing chunks
  • rarest first
Sending Chunks: tit-for-tat
 Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
  • re-evaluate top 4 every 10 secs
 every 30 secs: randomly select another peer, start sending it chunks (“optimistically unchoke”)
  • newly chosen peer may join top 4
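The "rarest first" rule for pulling chunks can be sketched as follows. This is an illustrative Python sketch, not BitTorrent's actual implementation: the function name `rarest_first` and the data layout (sets of chunk indices per neighbor) are assumptions made for the example.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order Alice's missing chunks so the rarest among her neighbors come first.

    my_chunks       : set of chunk indices Alice already has
    neighbor_chunks : dict mapping neighbor name -> set of chunk indices it has
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        # Count only chunks Alice is still missing.
        counts.update(chunks - my_chunks)
    # Fewest holders first; ties broken by chunk index so the order is deterministic.
    return sorted(counts, key=lambda c: (counts[c], c))
```

For example, if Alice has chunk 0 and her neighbors hold {0, 1, 2}, {1, 2} and {2, 3}, the request order is chunk 3 (one holder), then chunk 1 (two holders), then chunk 2 (three holders).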
BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With higher upload rate,
can find better trading
partners & get file faster!
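The tit-for-tat selection can be sketched in a few lines. This is a simplified Python illustration under stated assumptions: the names `top_providers` and `pick_optimistic` are hypothetical, rates are plain numbers measured elsewhere, and the 10-second and 30-second timers from the slides are left to the caller.

```python
import random


def top_providers(rates, top_n=4):
    """Return the neighbors currently sending Alice chunks at the highest rates.

    rates : dict mapping neighbor name -> rate at which it is sending to Alice.
    Intended to be re-evaluated periodically (every 10 secs in the slides).
    """
    return set(sorted(rates, key=rates.get, reverse=True)[:top_n])


def pick_optimistic(rates, unchoked, rng=random):
    """Randomly pick one neighbor outside the current top set to unchoke.

    Intended to be called on a slower timer (every 30 secs in the slides).
    Returns None when every neighbor is already unchoked.
    """
    candidates = sorted(set(rates) - set(unchoked))
    return rng.choice(candidates) if candidates else None
```

If the optimistically unchoked peer reciprocates at a high enough rate, it shows up in the next `top_providers` result, which is how a newly chosen peer can join the top four.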
Distributed Hash Table (DHT)
 DHT = distributed P2P database
 Database has (key, value) pairs:
   key: social security number; value: human name
    • E.g., (123-45-6789, George Bush)
   key: content type; value: IP address
    • E.g., (Newsboys Shine, 203.17.123.38)
 Peers query DB with key
   DB returns values that match the key
   Example: query with 123-45-6789, get “George Bush”
   Example: query with “Newsboys Shine”, get 203.17.123.38
 Peers can also insert (key, value) pairs
How do we store the DB?
 One central server
   Early Napster
   Defeats the purpose in some ways
 Randomly distribute pieces
   Each peer maintains a list of the IP addresses of all participating peers.
   Not scalable
DHT Identifiers
 Better: assign an integer identifier to each peer in range [0, 2^n - 1].
   Each identifier can be represented by n bits.
 Require each key in DB to be an integer in the same range.
   There are different types of keys, e.g., SSN or band name
   Doesn’t matter: every key will be an integer in this range.
 Problem: keys are not necessarily integers!
   To get integer keys, hash the original key.
   e.g., key = h(“Led Zeppelin IV”) = some integer.
   Hash function can ensure the result is in our range
   This is why they call it a distributed “hash” table
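One way to turn an arbitrary key into an integer in the identifier range is to hash it and reduce the result modulo 2^n. The sketch below is an assumption for illustration: the slides do not say which hash function h is used, so SHA-1 from Python's standard library stands in, and the name `key_to_id` is hypothetical.

```python
import hashlib


def key_to_id(key: str, n: int) -> int:
    """Map an original key (e.g. a band name or SSN) to an integer in [0, 2^n - 1]."""
    # SHA-1 is an assumed stand-in for the hash function h in the slides.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Interpret the digest as a big integer and keep the low n bits.
    return int.from_bytes(digest, "big") % (2 ** n)
```

Calling `key_to_id("Led Zeppelin IV", 4)` always returns the same integer between 0 and 15, so every peer that hashes the same original key arrives at the same DHT key.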
How to assign keys to peers?
 Central issue:
   Assigning (key, value) pairs to peers.
    • i.e., which peer will store which (key, value) pairs?
   Recall each peer is assigned an identifier
 Rule: assign key to the peer that has the closest ID.
 Convention in lecture: closest is the immediate successor of the key.
How to assign keys to peers?
 Example: n = 4; then all peer identifiers are in range [0, 15]
 peers: 1, 3, 4, 5, 8, 10, 12, 14
 key = 13, then successor peer = 14
 key = 8, then successor peer = 8
 key = 15, then successor peer = 1
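The successor rule in this example can be sketched as a lookup in a sorted list of peer identifiers that wraps around past the largest one. The function name `successor_peer` is illustrative, and the sketch assumes a global view of all peers, which a real DHT peer does not have.

```python
import bisect


def successor_peer(key, peers):
    """Return the peer responsible for key: the first peer ID >= key, wrapping around."""
    ring = sorted(peers)
    # Index of the first peer whose ID is not smaller than the key.
    i = bisect.bisect_left(ring, key)
    # If the key is larger than every peer ID, wrap to the smallest peer.
    return ring[i % len(ring)]
```

With peers 1, 3, 4, 5, 8, 10, 12, 14, this reproduces the three cases on the slide: key 13 maps to peer 14, key 8 to peer 8, and key 15 wraps around to peer 1.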
Circular DHT (1)
[Figure: peers 1, 3, 4, 5, 8, 10, 12, 15 arranged in a ring]
 Each peer is only aware of its immediate successor and predecessor.
 “Overlay network”:
   the peers form their own “network”
   a successor/predecessor may be many hops away
Circular DHT (2)
 Disadvantage: O(N) messages on average (actually N/2) to resolve a query when there are N peers
 Advantage: little info kept in each peer
 Define closest as closest successor
[Figure: ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111. The query “Who’s resp for key 1110?” is passed from successor to successor around the ring until peer 1111 answers “I am”.]
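The successor-by-successor lookup can be simulated to count messages. This is a sketch under assumptions: the names `ring_lookup` and `in_half_open` are hypothetical, the ring is built from a global list purely for the simulation, and the query is assumed to start at peer 0011 (3).

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    # The interval wraps past zero (also covers a ring with a single peer).
    return x > a or x <= b


def ring_lookup(key, start, peers):
    """Forward a query successor by successor; return (responsible peer, messages)."""
    ring = sorted(peers)
    succ = {p: ring[(i + 1) % len(ring)] for i, p in enumerate(ring)}
    pred = {p: ring[i - 1] for i, p in enumerate(ring)}
    current, messages = start, 0
    # A peer is responsible for the keys between its predecessor and itself.
    while not in_half_open(key, pred[current], current):
        current = succ[current]
        messages += 1
    return current, messages
```

With peers 1, 3, 4, 5, 8, 10, 12, 15 and a query for key 14 (1110) starting at peer 3, `ring_lookup(14, 3, [1, 3, 4, 5, 8, 10, 12, 15])` returns `(15, 6)`: the query visits every peer between 3 and 15, which is the O(N) cost noted on the slide.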
Circular DHT with Shortcuts
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15 with shortcut links; the query “Who’s resp for key 1110?” follows the shortcuts]
 Each peer keeps track of IP addresses of predecessor, successor, and shortcuts.
 Reduced from 6 to 2 messages.
 Possible to design shortcuts so O(log N) neighbors, O(log N) messages in query
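One known way to get O(log N) neighbors and O(log N) messages is a Chord-style layout, where peer p keeps shortcuts to the successors of p + 1, p + 2, p + 4, and so on. The slides do not say which shortcut layout they use, so the sketch below is an assumption, and all function names are hypothetical.

```python
import bisect


def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def build_shortcuts(peers, n):
    """For each peer p, list the successors of p + 2^k (mod 2^n) for k = 0..n-1."""
    ring, size = sorted(peers), 2 ** n

    def successor_of(ident):
        i = bisect.bisect_left(ring, ident % size)
        return ring[i % len(ring)]

    return {p: [successor_of(p + 2 ** k) for k in range(n)] for p in ring}


def shortcut_lookup(key, start, peers, n):
    """Return (responsible peer, messages) using the assumed Chord-style shortcuts."""
    shortcuts = build_shortcuts(peers, n)
    current, messages = start, 0
    while True:
        succ = shortcuts[current][0]
        if in_half_open(key, current, succ):
            # The immediate successor is responsible; one last message reaches it.
            return succ, messages + 1
        # Otherwise jump to the farthest shortcut that does not overshoot the key.
        nxt = succ
        for s in reversed(shortcuts[current]):
            if in_open(s, current, key):
                nxt = s
                break
        current = nxt
        messages += 1
```

Under these assumptions, `shortcut_lookup(14, 3, [1, 3, 4, 5, 8, 10, 12, 15], 4)` returns `(15, 2)`: peer 3 jumps straight to peer 12, whose successor 15 is responsible. The count agrees with the slide's "6 to 2" only for this assumed layout and starting peer.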
Peer Churn
• To handle peer churn, require each peer to know the IP address of its two successors.
• Each peer periodically pings its two successors to see if they are still alive.
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]
 Peer 5 abruptly leaves
 Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8’s immediate successor its second successor.
 What if peer 13 wants to join?
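The repair step after a departure can be sketched as one round of the periodic check. This is a simplified Python illustration: the `Peer` class and the function names are hypothetical, pinging is modeled by an `alive` flag, and the case where both successors fail at once is not handled.

```python
class Peer:
    def __init__(self, ident):
        self.ident = ident
        self.alive = True
        self.successors = []  # [immediate successor, second successor]


def build_ring(ids):
    """Create peers and give each one its two successors on the ring."""
    order = sorted(ids)
    peers = {i: Peer(i) for i in order}
    for k, i in enumerate(order):
        peers[i].successors = [
            peers[order[(k + 1) % len(order)]],
            peers[order[(k + 2) % len(order)]],
        ]
    return peers


def check_successors(peer):
    """One round of the periodic ping: repair the successor list if needed."""
    first, second = peer.successors
    if not first.alive:
        # Promote the second successor, then ask it for its immediate successor.
        first = second
        second = first.successors[0]
    elif not second.alive:
        # Ask the immediate successor who follows it now; this only helps
        # once that peer has repaired its own list, otherwise retry next round.
        second = first.successors[0]
    peer.successors = [first, second]
```

Replaying the slide's scenario: after peer 5 is marked as not alive, `check_successors` on peer 4 leaves it with successors 8 and 10, and a later check on peer 3 leaves it with successors 4 and 8.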
P2P Case study: Skype
 inherently P2P: pairs of users communicate.
 proprietary application-layer protocol (inferred via reverse engineering)
 hierarchical overlay with supernodes (SNs)
 Index maps usernames to IP addresses; distributed over SNs
 Skype is proprietary, but the guess is that it uses a DHT
[Figure: Skype clients (SC), supernodes (SN), and the Skype login server]
Peers as relays
 Problem when both Alice and Bob are behind “NATs”.
   NAT prevents an outside peer from initiating a call to an inside peer
 Solution:
   Using Alice’s and Bob’s SNs, a relay is chosen
   Each peer initiates a session with the relay.
   Peers can now communicate through NATs via the relay