TRODS Transparent Recovery for Object Delivery Services Wyatt Lloyd

Presented at DSN-DCCS 2011 in Hong Kong on 6/28/11
TRODS
Transparent Recovery for
Object Delivery Services
Wyatt Lloyd, Michael J. Freedman
Princeton University
2
3
4
[Diagram: a Client connected to a Service of four Servers; caption: Connection Recovered!]
5
Object Delivery Services
• Read-Only
• Static Content
• Webpages, Images, Videos
6
Work Now
• Can’t Modify Clients
7
Key Idea
• Coerce client to help
– To identify connections that need recovery
– To reliably store information
• Yet client is unmodified and unaware
– Exploit TCP spec to control client’s stack
8
Object Delivery Cluster
[Diagram: Service containing a Load Balancer, a Liveness Monitor, and four Servers]
9
Failure
[Diagram: Service containing a Load Balancer, a Liveness Monitor, and four Servers]
10
TRODS
[Diagram: Client, marked "?", and a Service containing a Load Balancer, a Liveness Monitor, and three Servers]
11
TRODS
[Diagram: Client, marked "?", and a Service containing a Load Balancer, a Liveness Monitor, three Servers, and a Store]
12
Road to Recovery
Step: Technique
• Redirect to live server: Liveness monitor updates load balancer
• Induce client to send packet: Coerce client’s TCP stack
• Continue Connection
– Determine Phase: Use packet + stored info
– Identify Object: Stored info
– Find Offset: Use packet + stored info
13
Coercing Clients
• Always Leave A Packet Unacknowledged
• Exploit TCP Spec for Recovery Initiation!
[Diagram: packet exchange between Client and Server: SYN, SYN/ACK, ACK, Request, ACK, Response, FIN, FIN/ACK. One Retransmit Queue holds SYN, Request, FIN/ACK; the other holds SYN/ACK, Response, FIN. Callout: Always Something Here]
14
Continuing the Connection
• Determine Phase:
1) TCP Setup
2) HTTP Setup
3) HTTP Download
4) TCP Teardown
• TRODS Saves Info
15
Continuing the Download
• HTTP ObjectID
• Offset = TCP Ack – HTTP ObjectISN
[Diagram: server byte stream laid out as SYN, HTTP Resp Header, HTTP Object; markers on the stream: TCP ISN, HTTP ObjectISN, TCP Ack]
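The offset formula above can be sketched in a few lines. This is a minimal illustration, not the TRODS kernel code: the function name `object_offset` is hypothetical, and the sketch assumes that HTTP ObjectISN is the TCP sequence number of the first byte of the object body. Because TCP sequence numbers are 32-bit and wrap around, the subtraction is taken modulo 2^32.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def object_offset(tcp_ack: int, object_isn: int) -> int:
    """Return how many bytes of the object the client has acknowledged.

    Assumption: object_isn is the sequence number of the first byte of the
    HTTP object (the byte right after the response header).
    """
    return (tcp_ack - object_isn) % SEQ_MOD


# Plain case: the client has acknowledged 4000 bytes of the object.
print(object_offset(5000, 1000))

# Wraparound case: the object started just before the sequence space wrapped.
print(object_offset(100, SEQ_MOD - 200))
```

The modulo keeps the result correct when the object starts near the top of the sequence space, which a plain subtraction would report as a negative offset.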
17
Persistent Store
• Key-Value Store
+ Corner Cases Handled
+ Unlimited Objects
– Still Efficient (1 save only)
• TCP Timestamp
+ Very Efficient (1 machine only)
– 1 Million Object Limit
– Corner Cases
• Exploit TCP Spec for Persistence!
[Diagram: packet with IP, TCP (carrying TS), and Payload fields, alongside the KV store]
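To make the TCP Timestamp option more concrete, here is a hedged sketch of how state could ride in the 32-bit timestamp value that a client echoes back under the TCP Timestamps option. The bit layout is an assumption for illustration only: the slides do not give the real encoding. Reserving 20 bits for an object ID is merely consistent with the "1 Million Object Limit" above (2^20 = 1,048,576). The remaining 12-bit `tag` field and the names `pack_tsval` and `unpack_tsecr` are hypothetical.

```python
OBJECT_ID_BITS = 20                      # assumption: 2**20 ~ 1 million objects
TAG_BITS = 32 - OBJECT_ID_BITS           # hypothetical use of the other 12 bits
OBJECT_ID_MASK = (1 << OBJECT_ID_BITS) - 1


def pack_tsval(tag: int, object_id: int) -> int:
    """Pack a tag and an object ID into one 32-bit timestamp value."""
    if not 0 <= object_id <= OBJECT_ID_MASK:
        raise ValueError("object_id does not fit in 20 bits")
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in 12 bits")
    return (tag << OBJECT_ID_BITS) | object_id


def unpack_tsecr(tsecr: int) -> tuple[int, int]:
    """Recover (tag, object_id) from the value the client echoed back."""
    return tsecr >> OBJECT_ID_BITS, tsecr & OBJECT_ID_MASK


value = pack_tsval(tag=7, object_id=123456)
print(unpack_tsecr(value))
```

Since the client echoes the value unchanged, any server that later receives a packet from that client can decode the state without contacting another machine, which matches the "1 machine only" point on the slide.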
18
Recover the Connection
• Initiate New Connection
– GET ObjectID …
– Range: bytes=Offset-
• Splice Connections Together
• Works with Unmodified Servers!
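The new connection described above boils down to an ordinary HTTP range request. The following sketch builds one. The function name `build_recovery_request`, the `Host` value, and the `Connection: close` header are assumptions added for illustration. Only the `GET ObjectID` line and the `Range: bytes=Offset-` header come from the slide.

```python
def build_recovery_request(object_id: str, offset: int,
                           host: str = "example.com") -> bytes:
    """Build a GET request for the rest of an object, starting at offset.

    Assumptions: object_id is the request path, and host is a placeholder.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    lines = [
        f"GET {object_id} HTTP/1.1",
        f"Host: {host}",
        f"Range: bytes={offset}-",   # open-ended range: from offset to the end
        "Connection: close",
        "",                          # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_recovery_request("/video.mp4", 4000).decode("ascii"))
```

Because this is a standard request, a server that supports byte ranges can answer it without any changes, which is the point of the last bullet.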
19
TRODS
1) Packet Manipulation
[Diagram: Server stack with TRODS between TCP and IP; a packet (IP, TCP, …) becomes (IP, TCP’, …)]
20
TRODS
1) Packet Manipulation
2) Protocol Inspection
[Diagram: Server stack with TRODS between TCP and IP; labels: Request, Response1, ObjID, ObjISN]
21
TRODS
1) Packet Manipulation
2) Protocol Inspection
3) Blocks Connection
[Diagram: Server stack with TRODS between TCP and IP; labels: Response1, ObjID, ObjISN]
22
TRODS
1) Packet Manipulation
2) Protocol Inspection
3) Blocks Connection
4) State Injection
[Diagram: Server stack with TRODS between TCP and IP; a packet (IP, TCP, …) becomes (IP, TCP with TS, …)]
23
TRODS
1) Packet Manipulation
2) Protocol Inspection
3) Blocks Connection
4) State Injection
5) Recovery Initiation
[Diagram: Server stack with TRODS between TCP and IP; an Ack arrives at TRODS; TCP is marked "?"]
24
Failure Walkthrough
[Diagram: Client and a Service containing a Load Balancer, a Liveness Monitor, three Servers (each with a TCP / TRODS / IP stack), and a KV Store. Packets shown: SYN, SYN/ACK, ACK, Request, Response 1. Stored labels: ID, ISN]
25
Failure Walkthrough
[Diagram: Client and a Service containing a Load Balancer, a Liveness Monitor (marked "!"), Servers with TCP / TRODS / IP stacks (one marked "?"), and a KV Store holding ID and ISN. Packets shown: ACK, Response 2, ACK, Response 3, ACK, Response 4, FIN, ACK]
26
Related Work
• New Transport
– Trickles, SCTP, TCP Migrate, …
• TCP
– FT-TCP, ST-TCP, Backdoors, …
• HTTP
– CoRAL, …
27
Implementation
• Linux Kernel Module
• 3,000 lines of C
• ~CoRAL
– Optimistic subset of CoRAL
28
Experiments
• Additional Latency
– Normal
– Failure
• Throughput
– Lighttpd @ Princeton
– Apache @ Emulab
– Hybrid TS & KV Throughput
– Failure
29
Normal Case Latency
• TRODS-TimeStamp (TS)
– Median: + 0.009 ms
– 99th: + 0.012 ms
• TRODS-Key-Value (KV)
– Median: + 0.137 ms
– 99th: + 0.148 ms
30
Recovery Latency
[Figure: CDF of additional latency during recovery. x-axis: Additional Latency (~0, .2ms, 20ms, 200ms, 3s); y-axis: CDF (0 to 1); annotations: ~50%, ~35%, ~15%, and "Blink of an eye"]
31
Throughput Per Server (TPPS)
• Raw: 120 ops/s total, 30 ops/s per server, TPPS 30 ops/s/server
• Frontend: 120 ops/s total, 30 ops/s per server, TPPS 20 ops/s/server
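The arithmetic behind the metric can be sketched as total throughput divided by every machine the service needs. The machine counts below are assumptions chosen to reproduce the slide's numbers: four servers at 30 ops/s each, plus two extra frontend machines in the second case. The slide text itself does not state these counts, and the function name `tpps` is hypothetical.

```python
def tpps(total_ops_per_sec: float, servers: int, extra_machines: int = 0) -> float:
    """Throughput per server: total throughput over all machines used."""
    return total_ops_per_sec / (servers + extra_machines)


# Raw cluster: assumed 4 servers, no extra machines.
print(tpps(120, servers=4))

# With frontends: assumed 4 servers plus 2 frontend machines.
print(tpps(120, servers=4, extra_machines=2))
```

Counting the extra machines in the denominator is what makes the metric penalize designs that need dedicated frontends or stores, even when raw throughput is unchanged.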
32
Lighttpd
[Figure: Requests / Sec / Server (2500 to 22500) vs. Web Object Size (1KB to 128KB); series: Unmodified, TRODS-TS, TRODS-KV, ~CoRAL; annotations: 9%, 38%, 7%, 66%; KV/Server: 1/8, 1/4, 1/34, 1/2]
33
Apache
[Figure: Normalized TPPS (0 to 1) vs. Web Object Size (1KB to 64KB); series: Unmodified, TRODS-TS, TRODS-KV, FT-TCP(cold), ~CoRAL, FT-TCP(hot)]
34
Summary
• Recover Object Delivery Connections
• Exploit TCP Specification to Coerce Unmodified Clients
– To send recovery-starting packets
– To provide persistent storage
• Evaluation
– Low Latency
– High Throughput Per Server
• Questions?