Chapter 2
Application Layer
Part 5: P2P
Computer Networking: A Top Down Approach
6th edition
Jim Kurose, Keith Ross
Addison-Wesley
March 2012
2: Application Layer
Chapter 2: Application layer
2.1 Principles of network applications
2.2 Web and HTTP
2.3 FTP
2.4 Electronic Mail: SMTP, POP3, IMAP
2.5 DNS
2.6 P2P applications
2.7 Socket programming with UDP
2.8 Socket programming with TCP
Pure P2P architecture
no always-on server
arbitrary end systems
directly communicate peer-peer
peers are intermittently
connected and change IP
addresses
In the simplest schemes, a peer sends an entire file to another peer.
Some systems (like BitTorrent) break files up into pieces.
Pure P2P architecture
We’ll consider three examples:
File distribution
Searching for information
Case Study: Skype
File Distribution: Server-Client vs P2P
Question: How much time does it take to distribute a file from one server to N peers?
Distribution time: the time it takes to get a copy of the file to all N peers.
Assumptions:
1. The Internet core has abundant bandwidth; therefore all bottlenecks are in network access.
2. Server and peers are not doing anything else on the Internet.
File Distribution: Server-Client vs P2P
Notation (from the figure):
F: size of the file held by the server
us: server upload bandwidth
ui: peer i upload bandwidth
di: peer i download bandwidth
The server and the N peers all attach to a network with abundant bandwidth.
File Distribution: Server-Client vs P2P
Server-client case:
Time to upload file from server for first peer: F/us
Time to upload file from server for second peer: F/us
…
Time to upload file from server for Nth peer: F/us
Time to download file to first peer: F/d1 (could be long)
Time to download file to second peer: F/d2
A peer could start downloading before the entire file is uploaded by the server.
The smaller di, the larger F/di and the longer the download time.
File distribution time: server-client
server sequentially sends N copies: NF/us time
client i takes F/di time to download
Time to distribute F to N clients using client/server approach:
$d_{cs} \ge \max\{\, NF/u_s,\ F/\min_i d_i \,\}$
increases linearly in N (for large N)
File distribution time: P2P
server must send one copy: F/us time
client i takes F/di time to download
NF bits must be downloaded (aggregate)
fastest possible upload rate: $u_s + \sum_i u_i$
In other words, if every node had a copy of the file and every node uploaded the file simultaneously, then the total rate of upload would be the sum of all the rates.
File Distribution: Server-Client vs P2P
P2P case:
Time to upload file from server for first peer: F/us
Time to upload file from server for second peer: F/us
…
Time to upload file from server for Nth peer: F/us
Total time to download file to first peer (from many other peers): F/d1 (all the bits have to be downloaded)
Time to download file to second peer: F/d2
The smaller di, the larger F/di and the longer the download time.
File distribution time: P2P
server must send one copy: F/us time
client i takes F/di time to download
NF bits must be downloaded (aggregate)
fastest possible upload rate: $u_s + \sum_i u_i$
$d_{P2P} \ge \max\{\, F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \,\}$
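The two lower bounds can be compared numerically. The sketch below is only an illustration: the function names `client_server_time` and `p2p_time` and the rates in the usage note are hypothetical, not taken from the slides.

```python
def client_server_time(F, u_s, d):
    """Lower bound on client-server distribution time.

    F: file size in bits; u_s: server upload rate in bits/s;
    d: list of peer download rates in bits/s (one entry per peer).
    """
    N = len(d)
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, u, d):
    """Lower bound on P2P distribution time.

    u: list of peer upload rates in bits/s, aligned with d.
    """
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


if __name__ == "__main__":
    # Hypothetical numbers: 1 Gbit file, 10 Mbps server, 4 identical peers.
    F, u_s = 1e9, 1e7
    u = [1e6] * 4
    d = [5e6] * 4
    print(client_server_time(F, u_s, d))  # seconds
    print(p2p_time(F, u_s, u, d))         # seconds
```

With these made-up rates the client-server bound is set by the server's N uploads, while the P2P bound is set by the aggregate upload capacity, which grows as peers are added.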
Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us
Figure: Minimum Distribution Time (hours, axis from 0 to 3.5) versus N (axis from 0 to 35), with two curves: P2P and Client-Server.
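Substituting the example's parameters into the two bounds shows where the curves come from. This is a worked sketch assuming every peer has the same upload rate $u$; since $d_{\min} \ge u_s$, the term $F/d_{\min}$ is at most $F/u_s = 0.1$ hour and never dominates.

```latex
\begin{align*}
d_{cs}  &\ge \max\left\{ \frac{NF}{u_s},\ \frac{F}{d_{\min}} \right\}
         = \max\left\{ \frac{NF}{10u},\ \frac{F}{d_{\min}} \right\}
         = \frac{N}{10}\ \text{hours} \quad (N \ge 1) \\[4pt]
d_{P2P} &\ge \max\left\{ \frac{F}{u_s},\ \frac{F}{d_{\min}},\ \frac{NF}{u_s + Nu} \right\}
         = \max\left\{ \frac{1}{10},\ \frac{N}{10 + N} \right\}\ \text{hours} \\[4pt]
N = 30: \quad d_{cs} &\ge 3\ \text{hours}, \qquad d_{P2P} \ge \frac{30}{40} = 0.75\ \text{hours}
\end{align*}
```

The client-server bound grows linearly in N, while the P2P bound stays below 1 hour for every N because $N/(10+N) < 1$.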
File distribution: BitTorrent
P2P file distribution
tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file
Figure: a joining peer obtains a list of peers from the tracker, then trades chunks with those peers.
BitTorrent (1)
file divided into 256 KB chunks.
peer joining torrent:
has no chunks, but will accumulate them over time
registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
while downloading, peer uploads chunks to other peers.
peers may come and go
once peer has entire file, it may (selfishly) leave or (altruistically) remain
BitTorrent (2)
Pulling Chunks
at any given time, different peers have different subsets of file chunks
periodically, a peer (Alice) asks each neighbor for the list of chunks that they have.
Alice sends requests for her missing chunks, rarest first
Sending Chunks: tit-for-tat
Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
re-evaluate top 4 every 10 secs
every 30 secs: randomly select another peer and start sending it chunks (“optimistically unchoke”)
newly chosen peer may join top 4
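The rarest-first rule for pulling chunks can be sketched as follows. This is a minimal illustration, not BitTorrent's actual implementation: the function name `rarest_first`, the use of sets of chunk indices, and the tie-break by chunk index are all assumptions.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order Alice's missing chunks so the rarest among her neighbors comes first.

    my_chunks: set of chunk indices Alice already has.
    neighbor_chunks: dict mapping neighbor name -> set of chunk indices it has.
    Chunks that no neighbor holds are not returned (they cannot be requested).
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks - my_chunks)  # count only chunks Alice is missing
    # Fewest copies first; ties broken by chunk index (an arbitrary choice).
    return sorted(counts, key=lambda c: (counts[c], c))


if __name__ == "__main__":
    neighbors = {"Bob": {0, 1, 2}, "Carol": {1, 2, 3}, "Dave": {1}}
    print(rarest_first({0}, neighbors))
```

Requesting the rarest chunks first spreads scarce chunks through the torrent sooner, which helps if the few peers holding them leave.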
BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With higher upload rate, can find better trading partners & get file faster!
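The selection step behind tit-for-tat can be sketched like this. It is a simplified illustration: the function name `choose_unchoked` and the rate dictionary are hypothetical, and the 10-second and 30-second timers are left to the caller.

```python
import random


def choose_unchoked(rates, top_n=4, rng=random):
    """Pick the neighbors Alice uploads to.

    rates: dict mapping neighbor -> rate at which that neighbor is currently
    sending chunks to Alice (any consistent unit).
    Returns (top, optimistic): the top_n fastest senders, plus one randomly
    chosen other neighbor to optimistically unchoke (None if there is none).
    """
    ranked = sorted(rates, key=rates.get, reverse=True)
    top = ranked[:top_n]
    others = ranked[top_n:]
    optimistic = rng.choice(others) if others else None
    return top, optimistic


if __name__ == "__main__":
    rates = {"B": 50, "C": 40, "D": 30, "E": 20, "F": 10, "G": 5}
    top, optimistic = choose_unchoked(rates)
    print(top, optimistic)
```

If the optimistically unchoked neighbor reciprocates at a high enough rate, it shows up in `top` the next time the ranking is recomputed, which is the mechanism in steps (1) to (3) above.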
Distributed Hash Table (DHT)
DHT = distributed P2P database
Database has (key, value) pairs;
key: social security number; value: human name
• E.g., (123-45-6789, George Bush)
key: content name; value: IP address
• E.g., (Newsboys Shine, 203.17.123.38)
Peers query DB with key
DB returns values that match the key
Example: query with 123-45-6789 returns “George Bush”
Example: query with “Newsboys Shine” returns 203.17.123.38
Peers can also insert (key, value) pairs
How do we store the DB?
One central server
Early Napster
Defeats the purpose in some ways
Randomly distribute pieces
Each peer maintains a list of the IP addresses
of all participating peers.
Not scalable
DHT Identifiers
Better: Assign integer identifier to each peer in range $[0, 2^n - 1]$.
Each identifier can be represented by n bits.
Require each key in DB to be an integer in same range.
There are different types of keys, e.g., SSN or band name
Doesn’t matter: every key will be an integer in this range.
Problem: keys are not necessarily integers!
To get integer keys, hash original key.
e.g., key = h(“Led Zeppelin IV”) = some integer.
Hash function can ensure result is in our range
This is why they call it a distributed “hash” table
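One way to turn an arbitrary key into an n-bit integer is sketched below. The slides do not name a hash function; SHA-1 and the helper name `key_to_id` are assumptions made for illustration.

```python
import hashlib


def key_to_id(key, n_bits):
    """Map a string key to an integer in [0, 2**n_bits - 1].

    SHA-1 is used here only as an example hash; reducing the digest
    modulo 2**n_bits keeps the result inside the identifier range.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** n_bits)


if __name__ == "__main__":
    print(key_to_id("Led Zeppelin IV", 4))  # some integer between 0 and 15
```

Every peer must use the same hash function, so that the same original key always maps to the same integer key.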
How to assign keys to peers?
Central issue:
Assigning (key, value) pairs to peers.
• i.e., which peer will store which (key, value) pairs?
Recall each peer is assigned an identifier
Rule: assign key to the peer that has the
closest ID.
Convention in lecture: closest is the
immediate successor of the key.
How to assign keys to peers?
Example: n=4;
Then all peer identifiers are in range [0, 15]
peers: 1,3,4,5,8,10,12,14;
key = 13, then successor peer = 14
key = 8, then successor peer = 8
key = 15, then successor peer = 1 (wraps around, since no peer has ID 15)
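The successor rule in this example can be checked with a few lines of code. The function name `successor_peer` is hypothetical; the peer IDs and keys are the ones from the example.

```python
from bisect import bisect_left


def successor_peer(key, peer_ids):
    """Return the peer responsible for key: the first peer ID >= key.

    If no peer ID is >= key, wrap around to the smallest peer ID.
    """
    ids = sorted(peer_ids)
    i = bisect_left(ids, key)   # index of first ID >= key (len(ids) if none)
    return ids[i % len(ids)]    # modulo handles the wrap-around case


if __name__ == "__main__":
    peers = [1, 3, 4, 5, 8, 10, 12, 14]
    for key in (13, 8, 15):
        print(key, "->", successor_peer(key, peers))
```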
Circular DHT (1)
Figure: peers 1, 3, 4, 5, 8, 10, 12, 15 arranged in a ring in order of ID.
Each peer only aware of immediate successor and predecessor.
“Overlay network”:
the peers form their own “network”
A successor/predecessor may be many hops away
Circular DHT (2)
Figure: on the ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111, the query “Who’s responsible for key 1110?” is passed from successor to successor until peer 1111 answers “I am”.
Define closest as closest successor.
Advantage: little info kept in each peer
Disadvantage: O(N) messages on average (actually N/2) to resolve a query, when there are N peers
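Successor-by-successor forwarding can be simulated to count messages. This is a sketch with hypothetical helper names (`in_range`, `resolve`), and it assumes the query starts at peer 3 (0011), which is consistent with the six-message count quoted for this example.

```python
def in_range(key, start, end):
    """True if key lies in the circular interval (start, end]."""
    if start < end:
        return start < key <= end
    return key > start or key <= end  # interval wraps past zero


def resolve(ring, start_peer, key):
    """Forward a query around the ring until the responsible peer is found.

    ring: peer IDs sorted in increasing order.
    Returns (responsible peer, number of forwarding messages).
    """
    i = ring.index(start_peer)
    messages = 0
    # A peer is responsible for the keys in (predecessor, itself].
    while not in_range(key, ring[i - 1], ring[i]):
        i = (i + 1) % len(ring)  # hand the query to the immediate successor
        messages += 1
    return ring[i], messages


if __name__ == "__main__":
    ring = [1, 3, 4, 5, 8, 10, 12, 15]
    print(resolve(ring, 3, 0b1110))  # key 1110 is 14 in decimal
```

Because each peer knows only its neighbors on the ring, the number of messages grows in proportion to the number of peers between the asker and the responsible peer.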
Circular DHT with Shortcuts
Figure: the ring of peers 1, 3, 4, 5, 8, 10, 12, 15 with shortcut links added; query: “Who’s responsible for key 1110?”
Each peer keeps track of IP addresses of predecessor, successor, and shortcuts.
In this example, the query is reduced from 6 messages to 2.
Possible to design shortcuts so that each peer has O(log N) neighbors and a query takes O(log N) messages
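One known way to get logarithmic lookups is to give each peer shortcuts at power-of-two distances, in the style of the Chord DHT. The slides do not say which shortcuts their figure uses, so the sketch below is an assumption rather than a reconstruction of that figure, and the names `successor`, `between`, and `lookup` are hypothetical.

```python
from bisect import bisect_left


def successor(ids, x):
    """First peer ID >= x on the ring (ids sorted), wrapping around."""
    return ids[bisect_left(ids, x) % len(ids)]


def between(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(ids, n_bits, start, key):
    """Greedy lookup using power-of-two shortcuts; returns (peer, messages)."""
    size = 2 ** n_bits
    peer, messages = start, 0
    while True:
        pred = ids[ids.index(peer) - 1]
        if between(key, pred, peer):  # this peer is responsible for the key
            return peer, messages
        # Shortcuts: the successors of peer + 1, peer + 2, peer + 4, ...
        shortcuts = [successor(ids, (peer + 2 ** i) % size) for i in range(n_bits)]
        nxt = shortcuts[0]            # fall back to the immediate successor
        for s in shortcuts:
            if between(s, peer, key):  # farthest shortcut that does not pass the key
                nxt = s
        peer = nxt
        messages += 1


if __name__ == "__main__":
    ring = [1, 3, 4, 5, 8, 10, 12, 15]
    print(lookup(ring, 4, 3, 14))
```

Each peer stores at most n shortcuts, and each hop covers at least half of the remaining distance to the key, which is where the O(log N) behavior comes from.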
Peer Churn
1
•To handle peer churn, require
3
15
4
12
5
10
each peer to know the IP address
of its two successors.
• Each peer periodically pings its
two successors to see if they
are still alive.
8
Peer 5 abruptly leaves
Peer 4 detects; makes 8 its immediate successor;
asks 8 who its immediate successor is; makes 8’s
immediate successor its second successor.
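Peer 4's repair step can be sketched with a small in-memory table. The pinging itself is not simulated, and the names `two_successors` and `handle_failed_first_successor` are hypothetical.

```python
def two_successors(ring):
    """Build each peer's [first, second] successor list from sorted peer IDs."""
    n = len(ring)
    return {p: [ring[(i + 1) % n], ring[(i + 2) % n]] for i, p in enumerate(ring)}


def handle_failed_first_successor(successors, peer):
    """Repair peer's list after its immediate successor stops answering pings."""
    _, second = successors[peer]
    new_first = second  # the old second successor becomes the immediate one
    # Ask the new immediate successor who its own immediate successor is.
    new_second = successors[new_first][0]
    successors[peer] = [new_first, new_second]


if __name__ == "__main__":
    table = two_successors([1, 3, 4, 5, 8, 10, 12, 15])
    print(table[4])                      # before peer 5 leaves
    del table[5]                         # peer 5 abruptly leaves
    handle_failed_first_successor(table, 4)
    print(table[4])                      # after the repair
```

Knowing two successors is what lets peer 4 recover without help: when its first successor disappears, it still has a live peer to ask.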
What if peer 13 wants to join?
P2P Case study: Skype
inherently P2P: pairs of users communicate.
proprietary application-layer protocol (inferred via reverse engineering)
hierarchical overlay with supernodes (SNs)
Index maps usernames to IP addresses; distributed over SNs
Skype is proprietary, but the guess is that it uses a DHT
Figure labels: Skype clients (SC), Supernode (SN), Skype login server
Peers as relays
Problem when both Alice and Bob are behind “NATs”.
NAT prevents an outside peer from initiating a call to an inside peer
Solution:
Using Alice’s and Bob’s SNs, a relay is chosen
Each peer initiates a session with the relay.
Peers can now communicate through NATs via the relay