An Architecture for Internet Data Transfer
Niraj Tolia
Michael Kaminsky*, David G. Andersen, and Swapnil Patil
Carnegie Mellon University and *Intel Research Pittsburgh
Innovation in Data Transfer is Hard
[Figure: timeline, 2001 to 2006, of data transfer systems: CAN, Bittorrent, DataStaging, Riverbed, Pastry, CASPER, Segank, OpenDHT, BlueFS, USPS, ALM-FastReplica, Chord, IBP, Value-Caching, Bullet, LBFS, Avalanche, CoBlitz, Lookaside Caching]
• Imagine: You have a novel data transfer technique
• How do you deploy?
1. Update HTTP. Talk to IETF. Modify Apache, IIS, Firefox,
Netscape, Opera, IE, Lynx, Wget, …
2. Update SMTP. Talk to IETF. Modify Sendmail, Postfix, Outlook…
3. Give up in frustration
Niraj Tolia © May 2006
Barriers to Innovation in Data Transfer
• Applications bundle:
• Content Negotiation: What data to send
– Naming (URLs, directories, …)
– Languages
– Identification
–…
• Data Transfer: Getting the bits across
• Both are tightly coupled (e.g., HTTP, SMTP)
• Hinders innovation and evolution of new
services
Solution: A Data Transfer Service
[Figure: a Sender and a Receiver, each with an Xfer Service. The Application Protocol flows between the applications; the Data flows between the Xfer Services.]
• Decouple content negotiation from data transfer
• Applications perform negotiation as before
• But hand data objects to the Transfer Service
• The Transfer Service is shared by applications
Extensible Transfer Architecture
[Figure: Sender and Receiver exchange the Application Protocol; their Xfer Services move the data through plugins: Bittorrent, USB Keychain, Local Cache]
• Application-independent cache
• New network features
• Non-networked transfers
Transfer Service Benefits
• Apps can reuse available transfer techniques
– No reimplementation needed
• Easier deployment of new technologies
– Applications need no modification
• Provides for cross-application sharing
– Can interpose on all data transfers
• Handles transient disconnections
Outline
• Motivation
• Data Oriented Transfer (DOT) service
• Evaluation
• Open Issues and Future Work
• Conclusion
10,000 Foot View of Transfers using DOT
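[Figure: the Receiver asks the Sender for File X. The Sender calls put(X) on its Xfer Service; the Receiver calls read() on its Xfer Service and gets data back. The steps between the two Xfer Services are marked "?".]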
• How does the transfer service name data?
• How does the transfer service locate data?
DOT: Object Naming
• Application defined names are not portable
• Use content-naming for globally unique names
• Objects represented by an OID
[Figure: File Foo.txt → Cryptographic Hash → OID]
• Objects are further sub-divided into “chunks”
[Figure: a File divided into chunks, described by Desc1, Desc2, Desc3]
• Each OID corresponds to a list of descriptors
• Descriptor lists allow for partial transfers
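The naming scheme above can be sketched in a few lines. This is only an illustration under stated assumptions: the slides say "cryptographic hash" without naming one, so SHA-1 is an assumption, and the fixed chunk size, the `Descriptor` fields, and the `describe` helper are hypothetical names, not DOT's actual interface.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow (assumption)


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical descriptor: names one chunk by the hash of its contents."""
    chunk_hash: str
    offset: int
    length: int


def describe(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (OID, descriptor list) for an object, using SHA-1 as the hash."""
    oid = hashlib.sha1(data).hexdigest()
    descriptors = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        descriptors.append(
            Descriptor(hashlib.sha1(chunk).hexdigest(), offset, len(chunk)))
    return oid, descriptors


oid_a, descs_a = describe(b"hello world!")
oid_b, descs_b = describe(b"hello WORLD!")

# Only chunks whose descriptors differ would need to be transferred.
changed = [i for i, (a, b) in enumerate(zip(descs_a, descs_b))
           if a.chunk_hash != b.chunk_hash]
print(changed)  # [1, 2]
```

The first chunk ("hell") is identical in both objects, so a receiver that already holds one object only has to fetch the other two chunks, which is the partial-transfer property the descriptor list provides.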
DOT: Object Location
• Data transfers in DOT are receiver driven
• Receiver has better idea of available resources
• Senders specify ‘hints’ - potential data locations
– dot://sender.example.com:12000/
– dht://opendht.org/
–…
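A hint reads like a URL whose scheme selects a plugin and whose remainder tells that plugin where to look. A minimal sketch of how a receiver might split the two example hints above; the `Hint` tuple and `parse_hint` function are hypothetical names, and DOT's real hint format may carry more fields.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class Hint(NamedTuple):
    """Hypothetical parsed form of a hint."""
    plugin: str          # URL scheme, e.g. "dot" or "dht"
    host: str
    port: Optional[int]  # None when the hint gives no port


def parse_hint(hint: str) -> Hint:
    """Split a hint into the plugin that handles it and the location data."""
    parts = urlsplit(hint)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not a usable hint: {hint!r}")
    return Hint(parts.scheme, parts.hostname, parts.port)


print(parse_hint("dot://sender.example.com:12000/"))
print(parse_hint("dht://opendht.org/"))
```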
A Transfer using DOT
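[Figure: the Receiver requests File X. The Sender calls put(X) on its Xfer Service and gets back (OID, Hints), which it returns to the Receiver. The Receiver calls get(OID, Hints) on its Xfer Service, the two Xfer Services transfer the data through Transfer Plugins, and the Receiver calls read() to obtain the data.]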
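The put / get / read sequence can be mimicked with an in-memory toy. Everything here is an assumption made for illustration: the `TransferService` class, the class-level `registry` that stands in for the network, the use of SHA-1 for the OID, and the synchronous calls (the real service is event-driven and works chunk by chunk).

```python
import hashlib


class TransferService:
    """Toy stand-in for a per-host transfer service (hypothetical API)."""

    registry = {}  # hint -> TransferService; replaces the real network

    def __init__(self, hint: str):
        self.hint = hint
        self.store = {}  # OID -> bytes
        TransferService.registry[hint] = self

    def put(self, data: bytes):
        """Sender side: store the object, return (OID, hints)."""
        oid = hashlib.sha1(data).hexdigest()
        self.store[oid] = data
        return oid, [self.hint]

    def get(self, oid: str, hints):
        """Receiver side: fetch the object from the first hint that has it."""
        if oid in self.store:  # already cached locally, nothing to transfer
            return
        for hint in hints:
            peer = TransferService.registry.get(hint)
            if peer is None or oid not in peer.store:
                continue
            data = peer.store[oid]
            if hashlib.sha1(data).hexdigest() == oid:  # content must match its name
                self.store[oid] = data
                return
        raise LookupError(f"no hint could supply {oid}")

    def read(self, oid: str) -> bytes:
        return self.store[oid]


sender = TransferService("dot://sender.example.com:12000/")
receiver = TransferService("dot://receiver.example.com:12000/")

oid, hints = sender.put(b"contents of file X")  # the application sends only (OID, hints)
receiver.get(oid, hints)                        # receiver-driven fetch
print(receiver.read(oid))
```

Because the object is named by its content, the receiver can check what it fetched against the OID no matter which hint supplied it.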
DOT’s Modular Architecture
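[Figure: the Application calls DOT through (1) the Application API. DOT calls Transfer Plugins, which use the Network, through (2) the Transfer Plugin API, and Storage Plugins, which use Local Storage, through (3) the Storage Plugin API.]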
Transfer Plugin API
• Simple API
• get_descriptor_list( OID, hints )
• get_chunks( descriptor_list, hints )
• cancel_chunks( chunk_list )
• Transfer plugin chaining is easy
• e.g., multipath plugin
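To make the chaining idea concrete, here is a sketch of the three calls as an abstract interface, plus a multipath plugin that is itself a transfer plugin wrapping other plugins. The method names come from the slide; the argument and return types, the `InMemoryPlugin` leaf, and the round-robin split are assumptions (the real plugin is asynchronous and balances load dynamically).

```python
from abc import ABC, abstractmethod


class TransferPlugin(ABC):
    """The three calls listed on the slide; types are assumptions."""

    @abstractmethod
    def get_descriptor_list(self, oid, hints): ...

    @abstractmethod
    def get_chunks(self, descriptor_list, hints): ...

    @abstractmethod
    def cancel_chunks(self, chunk_list): ...


class InMemoryPlugin(TransferPlugin):
    """Hypothetical leaf plugin that serves chunks from dictionaries."""

    def __init__(self, objects, chunks):
        self.objects = objects  # OID -> list of descriptors
        self.chunks = chunks    # descriptor -> bytes
        self.served = []        # descriptors this plugin was asked for

    def get_descriptor_list(self, oid, hints):
        return self.objects[oid]

    def get_chunks(self, descriptor_list, hints):
        self.served.extend(descriptor_list)
        return {d: self.chunks[d] for d in descriptor_list}

    def cancel_chunks(self, chunk_list):
        pass  # nothing is in flight in this synchronous sketch


class MultipathPlugin(TransferPlugin):
    """A plugin built from sub-plugins: chaining through the same interface."""

    def __init__(self, sub_plugins):
        self.sub_plugins = list(sub_plugins)

    def get_descriptor_list(self, oid, hints):
        return self.sub_plugins[0].get_descriptor_list(oid, hints)

    def get_chunks(self, descriptor_list, hints):
        result = {}
        n = len(self.sub_plugins)
        for i, plugin in enumerate(self.sub_plugins):
            share = descriptor_list[i::n]  # round-robin split (assumption)
            if share:
                result.update(plugin.get_chunks(share, hints))
        return result

    def cancel_chunks(self, chunk_list):
        for plugin in self.sub_plugins:
            plugin.cancel_chunks(chunk_list)


objects = {"oid-x": ["d1", "d2", "d3", "d4"]}
chunks = {"d1": b"aa", "d2": b"bb", "d3": b"cc", "d4": b"dd"}
dsl = InMemoryPlugin(objects, chunks)
wireless = InMemoryPlugin(objects, chunks)
multi = MultipathPlugin([dsl, wireless])

descs = multi.get_descriptor_list("oid-x", [])
data = multi.get_chunks(descs, [])
payload = b"".join(data[d] for d in descs)
print(payload, dsl.served, wireless.served)
```

The caller sees only a `TransferPlugin`, so it does not need to know whether one path or several sit behind it.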
[Figure: DOT calls a MultiPath Plugin, which in turn calls several Transfer Plugins that use the Network]
Implementation
• In C++ using libasync event-driven library
• One storage plugin:
– In-memory hash tables, disk backed.
• Three transfer plugins:
• Default Xfer-Xfer plugin
• Portable Storage plugin
• Multipath plugin
• Applications
• gcp, an scp-like tool for file transfers
• A DOT-enabled Postfix email server
– Included a socket-like adapter library
Current DOT Prototype
[Figure: prototype setup with a SENDER, a RECEIVER, and a MIRROR connected over the Internet. Labels: Xfer, NET and USB plugins, Multipath, cache, wireless and DSL links]
• Application-independent cache
• Multipath and Mirror support
• Non-networked transfers
Evaluation
• Standard file transfer
• Portable Storage
• Multi-Path
• Case Study: Postfix Email Server
– Capture and analysis of email trace
– Evaluation of DOT-enabled SMTP server
– Integration effort
Standard File Transfer Setup
[Figure: two machines connected through a Network Emulator]
• Two DOT-enabled machines
• Network Emulator
• Evaluate various bandwidth + delay combinations
• Use gcp for the file transfers
• Used 40MB, 4MB, 400KB, 40KB, 4KB files
• Presenting 40MB here
Standard File Transfer
[Figure: bar chart of Transfer Time (sec) for a 40 MB file with wget, scp, scp w/o encr., and gcp at 1000 Mb/s and 100 Mb/s. Plotted values range from 0.5 to 4.1 seconds.]
• Overhead: hashing, extra RTT
• No noticeable overheads with latency
Portable Storage Experiment
[Figure: emulated DSL link, 2 Mbit/s]
• 255 MB transfer over emulated DSL
• Based on Virtual Machine transfers at Carnegie Mellon
• DOT preemptively copies data onto Flash drive
• Wait 5 minutes, plug flash drive into receiver
• Two drive speeds
• 8MB/s - 1GB
• 20MB/s - 2GB
Portable Storage Results
[Figure: transfer progress over time, annotated "Device Inserted" and "1126s (~19 min)"]
Multipath Plugin: Load Balancing
[Figure: Network Emulator setup; labels: Gigabit, Experimental links]
• Varied capacity + delay of experimental links
• Compare fastest link alone with multipath plugin
on both links; what speedup?
• Transferred 40MB file
• 128 KB socket buffer sizes
Multipath Plugin is Effective
[Table: transfer time in seconds for the 40 MB file; each link is given as capacity (Mbit/s) / delay (ms)]
Link 1    Link 2    Single    Multipath    Savings
100/0     100/0     3.59      1.90         47%
100/0     10/0      3.59      3.54         1.4%
100/66    100/66    43.33     23.20        46%
100/66    10/66     43.33     22.97        47%
100/66    1/66      43.33     38.25        12%
• 40 MB @ 100 Mbit/s ideal: 3.2 seconds
• Multipath plugin nearly doubles throughput
• TCP effects dominate. Pipe not full.
• Multipath plugin doubles by adding second stream. Actual capacity irrelevant.
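The percentages on this slide are consistent with savings defined as the relative reduction in transfer time, and the "ideal" figure with a 40 MB file taken as 40 × 10^6 bytes. Both definitions are assumptions inferred from the numbers, not stated on the slide; the helper names are hypothetical.

```python
def savings(single_s: float, multipath_s: float) -> float:
    """Fraction of transfer time saved relative to the fastest single link."""
    return 1.0 - multipath_s / single_s


def ideal_seconds(megabytes: float, mbit_per_s: float) -> float:
    """Transfer time if the link ran at full capacity with no overhead."""
    return megabytes * 8 / mbit_per_s


print(f"{savings(3.59, 1.90):.0%}")      # 47%
print(f"{savings(3.59, 3.54):.1%}")      # 1.4%
print(f"{savings(43.33, 22.97):.0%}")    # 47%
print(f"{savings(43.33, 38.25):.0%}")    # 12%
print(f"{ideal_seconds(40, 100):.1f}")   # 3.2
```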
Postfix Email Trace Replay
• Generated 10,000 email messages from trace
• Random data matched to chunk hash data
• Preserves some similarity between messages
• Replayed through Postfix to a single local server
Program          Seconds    Bytes Sent
Postfix          468        172 MB
Postfix + DOT    468        117 MB (68%)
• Postfix disk bound… DOT CPU overhead negligible
• Savings due to duplication within emails
Postfix Integration
• Integrated DOT with the Postfix mail server
Program    LoC       Added LoC    %
GTC Lib    --        421
Postfix    70,824    184          0.3%
smtpd      6,413     107          1.7%
smtp       3,378     71           2.1%
• 1 part-time week, 1 student new to Postfix
• Includes time to write generic adapter library
Discussion on Deployment
• Application Resilience
• DOT is a service - it’s outside the control of the
application.
• Our Postfix falls back to normal SMTP if
– No Transfer Service contact
– Transfer keeps failing
• In the short term, a simple fallback is encouraged.
However, this could interfere with some functions
– DOT-based virus scanner…
• In the long term, DOT would be a part of a
system’s core infrastructure
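The fallback behaviour described above might look roughly like the following. This is a sketch, not the Postfix patch: `deliver`, `TransferServiceError`, the retry limit, and the use of `ConnectionError` to signal "no Transfer Service contact" are all assumptions.

```python
class TransferServiceError(Exception):
    """Raised by the hypothetical DOT client when a transfer fails."""


def deliver(message: bytes, send_via_dot, send_via_smtp, max_attempts: int = 3) -> str:
    """Try the transfer service first; fall back to plain SMTP if it is
    unreachable or keeps failing. Returns the path that was used."""
    for _ in range(max_attempts):
        try:
            send_via_dot(message)
            return "dot"
        except ConnectionError:
            break      # no Transfer Service contact: retrying will not help
        except TransferServiceError:
            continue   # transfer failed: retry until max_attempts is reached
    send_via_smtp(message)
    return "smtp"


def unreachable(message: bytes) -> None:
    raise ConnectionError("transfer service not running")


sent_by_smtp = []
print(deliver(b"hello", unreachable, sent_by_smtp.append))  # smtp
```

A fallback like this keeps mail flowing, but as the slide notes it also lets a message bypass anything that relies on seeing all traffic inside the transfer service.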
Future Work
• Security
• Application encrypts before DOT
- No block-based caching, reuse, mirroring, …
• No encryption
- Resembles the status quo
• In progress: Convergent encryption
– Requires integration with DOT chunking
• Application Preferences
• Encryption, QoS, priorities, …
– DOT might benefit from application input
• Need an extensible way to express these
Conclusion
• DOT separates app. logic from data transfer
• Makes it easier to extend both
• Architecture works well
• Overhead low (especially in wide-area)
• Major benefits
• Caching
• Flexibility to implement new transfer techniques
Backup Slides
Normal SMTP
[Figure: message sequence between the SMTP Client and the Server: EHLO, 250 Hello, MAIL FROM: user, …, DATA, 250 OK]
DOT-Enabled SMTP
[Figure: message sequence between the SMTP Client and the Server: EHLO, 250 Hello, MAIL FROM: user, …, X-DOT-DATA (OID+Hints), 250 OK; an Xfer Service is also shown]
Convergent Encryption
[Figure: a File split into chunks, each with its own hash: Hash1, Hash2, Hash3]
• Chunk_i is encrypted using Hash_i
• All identical cleartext blocks will map to the same encrypted block
• Hash_i is further encrypted using a private key
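A small sketch of why identical cleartext chunks end up as identical ciphertext. The cipher here is a toy XOR keystream built from SHA-256, chosen only to keep the example within the standard library; it is an assumption, it is not secure, and it is not what the DOT work uses. All function names are hypothetical.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts and decrypts (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def convergent_encrypt(chunk: bytes):
    """Key each chunk with the hash of its own cleartext."""
    key = hashlib.sha256(chunk).digest()
    return key, xor_cipher(key, chunk)


key1, ct1 = convergent_encrypt(b"shared attachment bytes")
key2, ct2 = convergent_encrypt(b"shared attachment bytes")
key3, ct3 = convergent_encrypt(b"a different chunk")

print(ct1 == ct2)  # True: same cleartext, same ciphertext, so caching still works
print(ct1 == ct3)  # False

# The per-chunk key is then protected with a secret only the owner holds.
owner_secret = b"hypothetical private key material"
wrapped_key = xor_cipher(owner_secret, key1)
recovered = xor_cipher(xor_cipher(owner_secret, wrapped_key), ct1)
print(recovered)
```

Because the chunk key depends only on the chunk contents, this only works if encryption happens per chunk, which is why the Future Work slide ties it to DOT's chunking.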
Standard File Transfer
[Figure: bar chart, log scale, of Transfer Time (sec) for a 40 MB file with wget, scp, scp w/o encr., and gcp under five settings: 1000 Mb/s; 100 Mb/s; 100 Mb/s with 66 ms; 20 Mb/s with 33 ms; 5 Mb/s with 66 ms. Plotted values range from 0.5 to 72.3 seconds.]
Mail Server Evaluation
• Trace: 159 days at low volume academic
mail server
• 458,861 messages
• hash, size of: message, headers, body
• Message chunks
• hash and size of each chunk
• Static chunking and Rabin fingerprinting
(Content-based block division)
DOT chunk caching benefits email
Method          Total Bytes    Percent Bytes
SMTP Default    6800 MB        -
DOT body        5876 MB        86.41%
Rabin body      5056 MB        74.35%
Rabin whole     5496 MB        80.81%
Default GTC-GTC Transfer Protocol
[Figure: message sequence between Receiver and Sender: GET_DESCRIPTORS(OID), Desc list 1,2,…, GET_CHUNKS(…), Chunk 1, Chunk 2, GET_CHUNKS(…)]
• GTC-GTC protocol mirrors transfer plugins
• Implemented as RPC calls
• (Fetches are actually pipelined)
Rabin Fingerprinting
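[Figure: Rabin Fingerprints (4, 7, 8, 2, 8) computed over the File Data. A Natural Boundary is declared wherever the fingerprint equals the Given Value, 8, and each resulting chunk is hashed (Hash 1, Hash 2).]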
Rabin Fingerprinting: Examples of Edits
1. Original File
2. Addition in chunk
– Changes only one hash
3. Addition creating a new breakpoint
4. Deletion changing size of chunk
Figure from “A Low-bandwidth Network File System”
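The effect of an edit on content-defined chunks can be tried out with a simplified chunker. Note the substitution: this sketch uses a plain polynomial rolling hash over a sliding window, not true Rabin fingerprints, and the window size, divisor, and target value are arbitrary assumptions. The property it shares with the real scheme is that a boundary depends only on the last few bytes, so chunks away from an edit keep their hashes.

```python
import hashlib
import random

WINDOW = 16          # bytes the rolling hash looks at (assumption)
BASE = 257
MOD = 1_000_000_007
DIVISOR = 64         # boundaries roughly every 64 bytes on random data
TARGET = 8           # "given value" that marks a boundary (assumption)


def split(data: bytes) -> list:
    """Cut data wherever the rolling hash of the last WINDOW bytes hits TARGET."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the byte leaving the window
        h = (h * BASE + byte) % MOD                 # add the byte entering it
        if i + 1 >= WINDOW and h % DIVISOR == TARGET:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                 # trailing partial chunk
    return chunks


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(4000))
edited = original[:2000] + b"inserted text" + original[2000:]

old_hashes = {hashlib.sha1(c).hexdigest() for c in split(original)}
new_hashes = {hashlib.sha1(c).hexdigest() for c in split(edited)}
lost = old_hashes - new_hashes
print(f"{len(old_hashes)} chunks before the edit, {len(lost)} of them changed")
```

With fixed-size chunking the same insertion would shift every later chunk and change every later hash; here only chunks that overlap the edit, or whose boundary window touches it, can change.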
DOT Object Naming
• Objects represented by an OID
• Divided into “chunks”, each with a descriptor
• Each OID corresponds to a list of descriptors
• Data is fetched using descriptor lists
• Supports partial transfers
Innovation in Data Transfer is Hard
• Imagine: You have a novel data transfer technique
• Say… Bittorrent, a P2P protocol for sharing large files
• How do you deploy?
1. Update HTTP. Talk to IETF. Modify Apache, IIS,
Firefox, Netscape, Opera, IE, Lynx, Wget, …
2. Update SMTP. Talk to IETF. Modify Sendmail, Postfix,
Exchange, Mail.app, Eudora, …
3. Give up in frustration
DOT’s Modular Architecture
[Figure: the Application calls DOT through (1) the Application API. DOT calls several Transfer Plugins, which use the Network, through (2) the Transfer Plugin API, and Storage Plugins, which use Local Storage, through (3) the Storage Plugin API.]
DOT Plugins
• Multipath plugin
• List of sub-plugins
• Balances load
• Portable storage plugin
• Sender: Copies new data onto USB flash device
• Receiver: Scans USB flash device for blocks
– naïve filesystem layout, unoptimized
• but effective
DOT email chunk caching evaluation
• Step 1: Caching analysis (infinite cache)
• SMTP default: What was really sent
• DOT body: Whole-body only caching, headers sent separately
• Rabin body: Headers sent separately, Rabin fingerprint chunking of body
• Rabin whole: Headers+body chunked together
• Easiest to implement for application. Just send data…
• Step 2: Trace replay through Postfix
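Step 1 amounts to replaying the trace against a cache that never evicts and counting only the bytes of chunks not seen before. A minimal sketch, assuming the trace is available as per-message lists of (chunk hash, size) pairs; the `bytes_sent` helper and the three-message trace are made up for illustration, and descriptor and protocol overhead are ignored.

```python
def bytes_sent(messages):
    """Return (bytes without caching, bytes with an infinite chunk cache).

    messages: iterable of messages in arrival order, each a list of
    (chunk_hash, size) pairs.
    """
    seen = set()
    total = unique = 0
    for chunk_list in messages:
        for chunk_hash, size in chunk_list:
            total += size
            if chunk_hash not in seen:  # first time this content is sent
                seen.add(chunk_hash)
                unique += size
    return total, unique


# Made-up trace: three messages sharing some chunks.
trace = [
    [("h1", 100), ("h2", 300)],
    [("h1", 100), ("h3", 50)],
    [("h2", 300)],
]
total, unique = bytes_sent(trace)
print(total, unique, f"{unique / total:.0%}")  # 850 450 53%
```

The four methods in the analysis differ only in how each message is turned into chunks before this counting step.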
Related Work
• BEEP
• Proxy-based data interposition approaches
– RON, X-Bone, OCALA
• Other Content-Addressable Systems
– Bittorrent, DHTs, DTNs, EMC’s Centera
• Using Content-Addressability to save on data transfers
– CASPER, LBFS, Rhea et al., Spring et al.
• Portable Storage
– Lookaside Caching, BlueFS
• Other transfer protocols
– GridFTP, IBP, HTTP, etc.
Transfer plugin API
• get_descriptors( OID, hints )
• get_chunks( descriptor, hints )
• cancel_chunks( chunk_list )
• Hints specify a plugin + data
• gtc://sender.example.com:12000/
• dht://opendht.org/
• …
• Transfer plugin chaining is easy
• e.g., multipath plugin