P2P Content Distribution

Peer-Assisted
Content Distribution
Pablo Rodriguez
Christos Gkantsidis
Traditional Content Distribution
Server Farm
Often, large content needs to
be distributed to millions of
clients:
• Currently:
• Huge server farms
• Infrastructure-based
solutions (e.g. Akamai)
slow, expensive, non-scalable
Content Distribution Evolution
[Figure: content distribution technologies (Caching, IP Multicast, Layer-7 Switches, Satellite CDNs, CDNs, Akamai, Enterprise CDNs, P2P) plotted over 1999-2004 across the phases Hype, Disappointment, Realism, and Growth.]
Peer-Assisted Content
Distribution
Peer-Assisted Content Distribution
Server Farm
Desktop PCs can help each other!
• Clients become new servers
• Capacity increases with the
number of clients
• Limitless scalability and fast
speeds at extremely low cost!!
[Figure: number of clients served (log scale, 10 to 10,000,000) versus time (0 to 57 sec) for Cooperative and Client/Server distribution. 4 MB file, server 100 Mbps, client 1 Mbps.]
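The claim that capacity grows with the number of clients can be illustrated with the standard fluid lower bounds on distribution time. This is a minimal sketch, not necessarily the model behind the figure: it uses the parameters stated above (4 MB file, 100 Mbps server, 1 Mbps clients) and assumes each client can also upload at 1 Mbps. The function names are illustrative.

```python
def client_server_time(n_clients, file_mbit, server_mbps, client_down_mbps):
    """Lower bound (sec) when the server alone uploads every copy."""
    return max(n_clients * file_mbit / server_mbps,
               file_mbit / client_down_mbps)


def peer_assisted_time(n_clients, file_mbit, server_mbps,
                       client_down_mbps, client_up_mbps):
    """Lower bound (sec) when clients also upload to each other."""
    total_upload = server_mbps + n_clients * client_up_mbps
    return max(file_mbit / server_mbps,            # server must send one copy
               file_mbit / client_down_mbps,       # slowest download link
               n_clients * file_mbit / total_upload)  # aggregate upload capacity


FILE_MBIT = 4 * 8  # 4 MB file expressed in megabits

for n in (100, 10_000, 1_000_000):
    cs = client_server_time(n, FILE_MBIT, 100, 1)
    pa = peer_assisted_time(n, FILE_MBIT, 100, 1, 1)  # 1 Mbps upload is an assumption
    print(f"{n:>9} clients: client/server >= {cs:.0f} s, peer-assisted >= {pa:.0f} s")
```

Under these assumptions the client/server bound grows linearly with the number of clients, while the peer-assisted bound stays at the time a single 1 Mbps client needs to download the file.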
Examples
• Updates/Critical Patches
– Adding large servers and egress capacity to absorb peak load is
quite expensive
– Alternative solution is to delay clients
» Patches do not arrive on time
• Software Distribution
• TV On-Demand. Movie/Music downloads
• PodCasting
• Enterprise content distribution
P2P Content Distribution
• Benefits:
– Dramatically improves speed
– Limitless scalability
– Minimum server requirements
– Very cheap
• Challenges:
– Requires incentives for cooperation
– Hard to ensure end2end full connectivity
– Security
– Manageability
– Lack of locality increases transit costs for ISPs
– Asymmetric links (traffic engineering)
– Variable bandwidth, peers come and go
– Need for more sophisticated distribution algorithms
P2P Swarming
• File is divided into many small pieces for distribution
• Clients request different pieces from the server or from other clients
• Clients become servers for those pieces downloaded
• When all pieces are downloaded, clients can re-construct the whole file
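[Figure: Server with pieces 1 2 3 4 5 6; clients holding partial sets of pieces (1, 3, 5 6, 2, 4) and a client holding the complete set 1 2 3 4 5 6.]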
[Rodriguez, Biersack, Infocom’00]
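The swarming steps above can be sketched in a few lines. This is a toy, in-memory illustration only: the piece size, the peer contents, and the helper names (`split_into_pieces`, `fetch_all`, `reassemble`) are hypothetical and not taken from any real swarming system.

```python
from typing import Dict, List


def split_into_pieces(data: bytes, piece_size: int) -> Dict[int, bytes]:
    """Divide the file into fixed-size pieces keyed by piece index."""
    return {i // piece_size: data[i:i + piece_size]
            for i in range(0, len(data), piece_size)}


def fetch_all(n_pieces: int, holders: List[Dict[int, bytes]]) -> Dict[int, bytes]:
    """Request each piece from the first holder (peer or server) that has it."""
    got: Dict[int, bytes] = {}
    for idx in range(n_pieces):
        for holder in holders:
            if idx in holder:
                got[idx] = holder[idx]
                break
    return got


def reassemble(pieces: Dict[int, bytes], n_pieces: int) -> bytes:
    """Rebuild the file once every piece index is present."""
    if len(pieces) != n_pieces:
        raise ValueError("some pieces are still missing")
    return b"".join(pieces[i] for i in range(n_pieces))


data = bytes(range(60))                      # stand-in for the file
server = split_into_pieces(data, 10)         # six pieces, as in the figure
peer_a = {i: server[i] for i in (0, 2)}      # peers hold partial sets
peer_b = {i: server[i] for i in (1, 3)}

# A new client asks peers first and falls back to the server.
client = fetch_all(len(server), [peer_a, peer_b, server])
restored = reassemble(client, len(server))
```

Once `client` holds a piece it could itself be added to another client's `holders` list, which is the sense in which clients become servers for the pieces they have downloaded.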
The Challenge
If there are many users,
deciding which piece is best to
download can be very hard!
• Incorrect decisions result in low throughput, nodes unable to finish, wasted bandwidth, etc.
• Solutions that require full knowledge of who has what are not scalable
[Figure: Server with pieces 1 2 3 4 5 6; clients holding partial sets of pieces (1, 3, 5 6, 2, 4) and a client holding the complete set 1 2 3 4 5 6.]
Avalanche:
Improving file swarming using Coding Techniques
Goal
• Provide a very fast and robust Peer-Assisted
solution for the distribution of legal content
• Current problems in existing file-swarming
solutions:
– Rare blocks are hard to obtain
– Tit-for-tat incentive mechanisms decrease speeds
– Arrival of new users slows down old users
– Heterogeneous nodes do not interact well
– Same information travels repeatedly over bottleneck links
– Too much dependence on seeds
– Sudden departures can prevent peers from finishing
The Problem of Efficient
Scheduling of Information
[Figure: a Source and Nodes A, B, and C exchanging Block 1 and Block 2. Which block should be sent next: Block 1, or 2, or 1⊕2?]
The Avalanche Magic
• To solve problems of existing P2P file distribution
solutions, Avalanche uses special encoding algorithms
• Each encoded piece has the “DNA” of all pieces in the file.
=> A given encoded piece can be used by any peer in place of any piece
• Encoded pieces are created using linear equations that involve all
pieces in the file
• Reconstructing the file requires collecting enough encoded pieces
and solving the set of mathematical equations
Coding in general
• Assume file: $F = [x_1\ x_2]$, where $x_i$ is a block.
• Define code $E_i(a_{i,1}, a_{i,2}) = a_{i,1} x_1 + a_{i,2} x_2$, where
$a_{i,1}, a_{i,2}$ are numbers.
• “Infinite” number of $E_i$’s.
• Any two linearly independent $E_i(a_{i,1}, a_{i,2})$ can
recover $[x_1\ x_2]$.
– Similar to solving a system of linear equations.
• Operations in finite fields [such as $GF(2^{16})$].
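As a small worked instance of the two-block code, take the illustrative values $x_1 = 4$, $x_2 = 9$ and the coefficient pairs $(1, 1)$ and $(1, 2)$; none of these numbers come from the slides. Ordinary integer arithmetic is used here for readability, whereas a real system would carry out the same steps in a finite field.

```latex
\begin{aligned}
E_1 &= 1 \cdot x_1 + 1 \cdot x_2 = 13, \\
E_2 &= 1 \cdot x_1 + 2 \cdot x_2 = 22, \\
x_2 &= E_2 - E_1 = 9, \\
x_1 &= 2E_1 - E_2 = 4.
\end{aligned}
```

A pair such as $(1, 1)$ and $(2, 2)$ would not work: the second equation is just twice the first, so the two encoded blocks are linearly dependent and cannot determine both $x_1$ and $x_2$.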
Avalanche Coding
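[Figure: Avalanche coding. File blocks B1, B2, ..., Bn at the Server; coefficient vectors (a1, a2, ..., an) and (b1, b2, ..., bn); encoded blocks E1 and E2 at Client A; weights w1, w2; encoded block E3 at Client B.]
• Content is encoded at the server
• Clients can produce new encoded packets out of partial files [Chou et al., ’03]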
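The encode, re-encode, and decode steps can be sketched as follows. This is a toy under stated assumptions, not the Avalanche implementation: arithmetic is done modulo the prime 65537 as a simple stand-in for a field such as GF(2^16), blocks are short lists of integers, coefficients are fixed so the result is reproducible, and the function names are hypothetical.

```python
P = 65537  # prime modulus; a stand-in for GF(2^16), chosen for simplicity


def combine(packets, weights):
    """Linear combination of packets; each packet is (coeffs, payload)."""
    n, m = len(packets[0][0]), len(packets[0][1])
    coeffs = [sum(w * c[j] for w, (c, _) in zip(weights, packets)) % P
              for j in range(n)]
    payload = [sum(w * p[k] for w, (_, p) in zip(weights, packets)) % P
               for k in range(m)]
    return coeffs, payload


def server_encode(blocks, weights):
    """Server side: combine the original blocks B1..Bn with given weights."""
    n = len(blocks)
    originals = [([int(i == j) for j in range(n)], blocks[i]) for i in range(n)]
    return combine(originals, weights)


def decode(packets, n):
    """Gauss-Jordan elimination mod P; returns blocks, or None if rank < n."""
    rows = [list(c) + list(p) for c, p in packets]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None  # not enough linearly independent packets yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # inverse via Fermat's little theorem
        rows[col] = [v * inv % P for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(v - f * pv) % P for v, pv in zip(rows[r], rows[col])]
    return [rows[i][n:] for i in range(n)]


blocks = [[10, 20], [30, 40], [50, 60]]   # B1, B2, B3
e1 = server_encode(blocks, [1, 2, 3])     # coefficients a1..a3
e2 = server_encode(blocks, [4, 5, 6])     # coefficients b1..b3
e3 = combine([e1, e2], [2, 1])            # Client A re-encodes with w1, w2
e4 = server_encode(blocks, [1, 0, 0])
e5 = server_encode(blocks, [0, 1, 1])
recovered = decode([e3, e4, e5], 3)       # Client B decodes
```

Client A never holds the whole file, yet `e3` is a valid encoded packet whose coefficient vector is `2*a + b`. Note that `decode([e1, e2, e3], 3)` returns `None`: `e3` lies in the span of `e1` and `e2`, so those three packets carry only two independent equations.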
Avalanche Robustness
[Figure: Avalanche versus typical file-swarming systems.]
If the server suddenly goes down (after serving the full file once), all Avalanche users are able to
complete the download. Only 10% of users using typical file-swarming techniques are able to
complete.
Finish Times
[Figure: download time per node (nodes sorted by order of arrival) for Avalanche and typical swarming; some peers using typical file-swarming techniques did not finish.]
=> Much lower and more predictable download times
Finish Times
No need for nodes to stay around…
[Figure: finish times per node (nodes sorted by order of arrival) when nodes stay forever versus when nodes leave immediately.]
• With Avalanche, there is no need for nodes to stay after they finish the
download to help other nodes (the performance remains unchanged)
Minimum Server Requirements
Less than half the server requirements compared to systems based
on current file-swarming techniques.
Decoding Performance
Avalanche trades more processing power at each node
for better speeds and lower server load
File Size (MB)   Blocks   Time
10               100      5 sec
50               100      37 sec
100              100      2 min 21 sec
200              100      3 min 38 sec
Note: Pentium III, 650MHz, 512MB RAM.
Decoding time is less than 4% of the total download time
Summary
• Adding resources in an arbitrary fashion is not
efficient or cost effective
• We are witnessing a new revolution
• Peer-assisted solutions can be used by
content providers to provide hugely scalable
and very fast distribution of legal content at
low cost