Scale and Performance in the CoBlitz Large-File Distribution Service
KyoungSoo Park
Vivek S. Pai
Princeton University
Large-file Distribution
• Increasing demand for large files
• Movies or software release
• On-line movie downloads
• Linux distribution
• Files range from 100 MB to a couple of GB
• One-to-many downloads
• Nice to use a CDN, but…
KyoungSoo Park
NSDI 2006
2
Why Not Web CDNs?
• Whole file caching
• Optimized for 10KB objects
• 2GB = 200,000 x 10KB
• Memory pressure
• Working sets do not fit in memory
• Disk access 1000 times slower
• Waste of resources
• More servers needed
• Provisioning is a must
Peer-to-Peer?
• BitTorrent takes up ~30% of Internet bandwidth
• Custom software
• Deployment is a must
• Configurations needed
• Companies may want managed service
• Handles flash crowds
• Handles long-lived objects
What We’d like is
Large-file Service with
No custom client
No custom server
No prepositioning
No rehosting
No manual provisioning
CoBlitz: Scalable large-file CDN
• Reducing the problem to small-file CDN
• Split large files into chunks
• Distribute chunks across proxies
• Aggregate memory/cache
• HTTP needs no deployment
• Benefits
• Faster than BitTorrent by 55-86% (~500%)
• One copy from origin serves 43-55 nodes
• Incremental build on existing CDNs
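The chunking step above can be sketched in a few lines. This is a minimal illustration rather than CoBlitz's actual code: the 60 KB chunk size and the helper names `chunk_ranges` and `range_header` are assumptions made for the example. The idea is that each chunk maps to one byte range, so it can be fetched and cached like any small web object.

```python
from typing import List, Tuple

CHUNK_SIZE = 60 * 1024  # assumed chunk size, for illustration only


def chunk_ranges(file_size: int, chunk_size: int = CHUNK_SIZE) -> List[Tuple[int, int]]:
    """Split a file of file_size bytes into inclusive (start, end) byte ranges."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, file_size) - 1)
        for start in range(0, file_size, chunk_size)
    ]


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"
```

For example, `chunk_ranges(250, 100)` yields three ranges, the last one shorter than the others, and each range can be sent to a proxy as a separate HTTP range request.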
How it works
CDN = Redirector + Reverse Proxy
Only the reverse proxy (CDN) caches the chunks!
[Figure: clients and their agents, DNS (coblitz.codeen.org), CDN nodes holding chunks 1 through 5, and the origin server reached by HTTP range query]
Smart Agent
• Preserves HTTP semantics
• Parallel chunk requests
• Sliding window of "chunks"
[Figure: an HTTP client connected to the agent, which keeps a window of chunk requests to CDN nodes, each marked "done" or "waiting"]
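The sliding window on this slide can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the agent's real implementation: the class name `ChunkWindow` and its methods are hypothetical, and network I/O is left out. The agent requests several chunks in parallel but hands data to the HTTP client strictly in order, which is how HTTP semantics are preserved.

```python
from typing import Dict, List


class ChunkWindow:
    """In-order delivery over a sliding window of outstanding chunk requests."""

    def __init__(self, total_chunks: int, window: int) -> None:
        self.total = total_chunks
        self.window = window
        self.base = 0             # lowest chunk not yet delivered to the client
        self.next_to_request = 0  # lowest chunk not yet requested
        self.done: Dict[int, bytes] = {}

    def requests_to_issue(self) -> List[int]:
        """Return chunk indexes that may be requested without exceeding the window."""
        limit = min(self.base + self.window, self.total)
        issued = list(range(self.next_to_request, limit))
        self.next_to_request = max(self.next_to_request, limit)
        return issued

    def on_chunk_done(self, index: int, data: bytes) -> List[bytes]:
        """Record a finished chunk and return the data now deliverable in order."""
        self.done[index] = data
        deliverable = []
        while self.base in self.done:
            deliverable.append(self.done.pop(self.base))
            self.base += 1  # the window slides forward
        return deliverable
```

A chunk that finishes early stays buffered ("done") until every earlier chunk has arrived; only then does the window advance and free slots for new requests.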
Operation & Challenges
• Public service running for over 2 years
• http://coblitz.codeen.org:3125/URL
• Challenges
• Scalability & robustness
• Peering set difference
• Load to the origin server
Unilateral Peering
• Independent peering decision
• No synchronized maintenance problem
• Motivation
• Partial network connectivity
• Internet2, CANARIE nodes
• Routing disruption
• Isolated nodes
• Improve both scalability & robustness
Peering Set Difference
• No perfect clustering by design
• Assumption
• Close nodes share common peers
[Figure: two nearby nodes and their peers, marked as peers both can reach and peers only one can reach]
Peering Set Difference
• Highly variable App-level RTTs
• 10x the variance of ICMP
• High rate of change in peer set
• Close nodes share less than 50%
• Low cache hit
• Low memory utility
• Excessive load to the origin
Peering Set Difference
• How to fix?
• Avg RTT → min RTT
• Increase # of samples
• Increase # of peers
• Hysteresis
• Close nodes share more than 90%
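Two of the fixes listed above, using the minimum RTT instead of the average and adding hysteresis, can be sketched as follows. The slides do not say how hysteresis was implemented, so the `margin` penalty for newcomers, the function names, and the sample values are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def min_rtts(samples: Dict[str, List[float]]) -> Dict[str, float]:
    """Summarize each peer by its minimum observed RTT rather than the average."""
    return {peer: min(values) for peer, values in samples.items() if values}


def select_peers(rtt: Dict[str, float], current: Set[str], k: int, margin: float) -> Set[str]:
    """Pick the k closest peers, favoring incumbents by `margin` (hysteresis)."""

    def score(peer: str) -> float:
        # A newcomer must beat an incumbent by more than `margin` to displace it.
        return rtt[peer] if peer in current else rtt[peer] + margin

    ranked = sorted(rtt, key=lambda peer: (score(peer), peer))
    return set(ranked[:k])
```

With samples such as `{"a": [10, 80, 12], "b": [30, 31, 29], "c": [28, 200, 35]}`, one slow application-level response inflates the average for `c` but leaves its minimum untouched, and a nonzero margin keeps `b` in the peer set even though `c` is marginally closer.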
Reducing Origin Load
• Still have peering set difference
• Critical in traffic to origin
• Proximity-based routing (cf. P2P: key-based routing)
• Converge exponentially fast
• 3-15% do one more hop
• Implicit overlay tree
• Rerun hashing
[Figure label: Origin server]
• Result
• Origin load reduction by 5x
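The "rerun hashing" and "implicit overlay tree" bullets can be illustrated with a small sketch. The slides do not name the hash scheme, so this example assumes highest-random-weight (rendezvous) hashing; `hrw_owner`, `route`, and the node names are hypothetical. Each node that receives a chunk request reruns the hash over its own peer set and forwards the request one more hop if it sees a better owner.

```python
import hashlib
from typing import Dict, List, Set


def hrw_owner(chunk_key: str, candidates: Set[str]) -> str:
    """Return the candidate with the highest hash weight for this chunk."""

    def weight(node: str) -> int:
        digest = hashlib.sha1(f"{node}|{chunk_key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(sorted(candidates), key=weight)


def route(chunk_key: str, start: str, peer_sets: Dict[str, Set[str]],
          max_hops: int = 8) -> List[str]:
    """Follow rehashing hops until a node picks itself as the chunk's owner."""
    path = [start]
    node = start
    for _ in range(max_hops):
        owner = hrw_owner(chunk_key, peer_sets[node] | {node})
        if owner == node:
            break  # this node fetches from the origin (or serves from cache)
        path.append(owner)
        node = owner
    return path
```

Because every hop moves to a node with a strictly higher weight for the chunk, the forwarding paths cannot loop, and requests that start at different nodes merge as they climb toward the same owners. When all nodes share one peer set, every request settles on a single owner within one hop.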
Scale Experiments
• Use all live PlanetLab nodes as clients
• 380~400 live nodes at any time
• Simultaneous fetch of 50MB file
• Test scenarios
• Direct
• BitTorrent Total/Core
• CoBlitz uncached/cached/staggered
• Out-of-order numbers in paper
Throughput Distribution
[Figure: CDF of per-node throughput. x-axis: Throughput (Kbps), 0 to 10000; y-axis: Fraction of Nodes <= X (CDF). Curves: Direct, BT-total, BT-core, In-order uncached, In-order staggered, In-order cached, Out-of-order staggered. Annotation: 55-86%]
Downloading Times
95th percentile: 1000+ secs faster
[Figure: CDF of download time. x-axis: Download Time (sec), 0 to 2000; y-axis: Fraction of Nodes <= X. Curves: In-order cached, In-order staggered, In-order uncached, BT-core, BT-total, Direct]
Synchronized Workload
[Figure labels: Congestion, Origin Server]
Addressing Congestion
• Proximity-based multi-hop routing
• Overlay tree for each chunk
• Dynamic chunk-window resizing
• Increase by 1/log(x), where x is the window size, if a chunk finishes faster than average
• Decrease by 1 if a retry kills the first chunk
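The window-resizing rule above can be written out as a short sketch. Several details are assumptions because the slide does not specify them: the natural logarithm, the cap of one chunk per increase (which also covers windows at or below 1, where log(x) is zero or negative), the minimum window of 1, and the function names.

```python
import math


def grow(window: float) -> float:
    """Increase the window by 1/log(x), where x is the current window size."""
    if window <= 1.0:
        return window + 1.0  # log(x) <= 0 here, so fall back to one chunk (assumption)
    # Cap the step at one chunk so small windows do not jump too far (assumption).
    return window + min(1.0, 1.0 / math.log(window))


def shrink(window: float, min_window: float = 1.0) -> float:
    """Decrease the window by one chunk, never going below min_window."""
    return max(min_window, window - 1.0)


def update(window: float, chunk_time: float, average_time: float,
           retry_killed_first_chunk: bool) -> float:
    """Apply one resizing step after a chunk event."""
    if retry_killed_first_chunk:
        return shrink(window)
    if chunk_time < average_time:
        return grow(window)
    return window
```

Growth slows as the window gets larger, while a retry on the first chunk in the window cuts it by a full chunk, so the agent backs off faster than it ramps up.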
Number of Failures
[Figure: bar chart of Failure Percentage (%) for Direct, BitTorrent, and CoBlitz; value labels 5.7, 4.3, and 2.1]
Performance after Flash Crowds
CoBlitz: 70+% > 5 Mbps
BitTorrent: 20% > 5 Mbps
[Figure: Fraction of Nodes > X versus Throughput (Kbps), 0 to 35000. Curves: BitTorrent, In-order CoBlitz]
Data Reuse
7 fetches for 400 nodes, 98% cache hit
[Figure: bar chart of Utility (# of nodes served / copy) for Shark, BitTorrent, and CoBlitz; value labels 55, 35, and 7.7]
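The headline numbers on this slide can be checked with simple arithmetic. The sketch below takes the round figures at face value and assumes each of the 400 nodes requests the data once; the function names are hypothetical, and the charted utility comes from measured node counts, so it need not match this estimate exactly.

```python
def cache_hit_rate(origin_fetches: int, requesting_nodes: int) -> float:
    """Fraction of node requests served without going to the origin."""
    return 1.0 - origin_fetches / requesting_nodes


def nodes_per_copy(origin_fetches: int, requesting_nodes: int) -> float:
    """Utility: how many nodes each origin copy ends up serving."""
    return requesting_nodes / origin_fetches
```

With 7 fetches and 400 nodes, the hit rate comes to about 98% and each copy from the origin serves roughly 57 nodes.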
Comparison with Other Systems
• Shark [NSDI05]
• Median throughput 0.96 Mbps with 185 clients
• CoBlitz: 3.15 Mbps with 380~400 clients
• Bullet, Bullet' [SOSP03, USENIX05]
• Using UDP, avg 7 Mbps with 41 nodes
• CoBlitz: slightly better (7.4 Mbps) with only TCP connections
Real-world Usage
• Fedora Core official mirror
• http://coblitz.planet-lab.org/
• US-East/West, England, Germany, Korea, Japan
• CiteSeer repository (50,000+ links)
• PlanetLab researchers
• Stork(U of Arizona) + ~10 others
Usage in Feb 2006
Number of Requests
[Figure: number of requests (log scale, 1 to 10^7) by file size, in buckets from 0~2MB to 2~4GB]
Number of Bytes Served
[Figure: Total Bytes Served (GB), 0 to 600, by file size, in buckets from 0~2MB to 2~4GB. Annotations: CD ISO, DVD ISO]
Fedora Core 5 Release
• March 20th, 2006
• Peaks over 700Mbps
[Figure: bandwidth plot. Annotation: Release point 10am]
Conclusion
• Scalable large-file transfer service
• Evolution under real traffic
• Up and running 24/7 for over 2 years
• Unilateral peering, multi-hop routing, window size adjustment
• Better performance than P2P
• Better throughput, download time
• Far less origin traffic
Thank you!
More information:
http://codeen.cs.princeton.edu/coblitz/
How to use:
http://coblitz.codeen.org:3125/URL*
*Some content restrictions apply
See Web site for details
Contact me if you want full access!