Primary - SIGOPS

Consistency-Based
Service Level Agreements
for Cloud Storage
Douglas B. Terry, Vijayan Prabhakaran, Ramakrishna Kotla,
Mahesh Balakrishnan, Marcos K. Aguilera, Hussam Abu-Libdeh
Microsoft Research
“A foolish consistency is the
hobgoblin of little minds”
-- Ralph Waldo Emerson (1841)
“… and of large clouds”
-- Douglas Brian Terry (2013)
Today’s Cloud Storage Providers
• Replicate data widely
• Offer choice of strong or eventual
consistency
e.g. Amazon DynamoDB, Yahoo PNUTS, Google App Engine, Oracle NoSQL, Cassandra, Microsoft Windows Azure, …
• Trade off consistency, availability, and performance
Problem
• Developers must choose consistency
• No single choice is best for all clients and situations
Consistency   U.S.         England     India        China
              (secondary)  (primary)   (secondary)  (client only)
strong        147.5        1.2         435.5        307.3
eventual      1.1          1.0         1.1          160.2
(roundtrip times in milliseconds, by client location)
Pileus key features
a cap cloud
• Replicated, partitioned key-value store
• Choice of consistency
• Consistency-based service level
agreements (SLAs)
Pileus System Model
Nodes:
• Primary core: sync replication
• Secondary nodes: lazy replication
API:
• BeginSession (SLA)
• BeginTx (SLA)
• Put (key, value)
• Get (key, SLA) returns value, consistency
• EndTx ()
• EndSession ()
Read Consistency Guarantees
• Strong Consistency: Return value of latest Put.
• Causal Consistency: Return value of latest causally preceding Put. [COPS 2011]
• Bounded Staleness (t): Return value that is stale by at most t seconds. [TACT 2002]
• Read My Writes: Return value of latest Put in client session or a later value. [Bayou 1994]
• Monotonic Reads: Return same or later value as earlier Get in client session.
• Eventual Consistency: Return value of any Put.
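One way to make these definitions concrete is to turn each guarantee into a minimum update timestamp that a replica must have applied before it may answer a Get. The sketch below is illustrative only: the Session class, the guarantee names, and the timestamp scheme are assumptions, not part of the slides. Causal consistency is approximated conservatively as "at least as new as anything this session has read or written", and strong consistency is assumed to require the primary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Session:
    """Hypothetical per-client session state (names are illustrative)."""
    put_ts: Dict[str, float] = field(default_factory=dict)  # key -> timestamp of latest Put in session
    get_ts: Dict[str, float] = field(default_factory=dict)  # key -> timestamp of latest value read in session


def min_read_timestamp(guarantee: str, key: str, session: Session,
                       now: float, bound: float = 0.0) -> Optional[float]:
    """Smallest update timestamp a replica must have applied to satisfy `guarantee`.

    Returns None for strong consistency, which this sketch assumes
    can only be served by the primary.
    """
    if guarantee == "strong":
        return None
    if guarantee == "causal":
        # Conservative assumption: cover every read and write seen in the session.
        seen = list(session.put_ts.values()) + list(session.get_ts.values())
        return max(seen, default=0.0)
    if guarantee == "bounded":
        # Value may be stale by at most `bound` seconds.
        return max(now - bound, 0.0)
    if guarantee == "read_my_writes":
        return session.put_ts.get(key, 0.0)
    if guarantee == "monotonic":
        return session.get_ts.get(key, 0.0)
    if guarantee == "eventual":
        return 0.0  # any Put is acceptable
    raise ValueError(f"unknown guarantee: {guarantee}")
```

Under these assumptions, weaker guarantees yield lower thresholds, so more replicas (including nearby, lagging ones) become eligible to serve the read.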
Read Latencies
Consistency      U.S.         England     India        China
                 (secondary)  (primary)   (secondary)  (client only)
strong           147.5        1.2         435.5        307.3
causal           146.3        1.0         431.6        306.4
bounded(30)      75.1         1.0         234.6        241.9
read-my-writes   13.0         1.1         18.4         166.8
monotonic        1.1          1.0         1.1          160.2
eventual         1.1          1.0         1.1          160.2
(roundtrip times in milliseconds, by client location)
• Consistency affects latency
• Client location affects latency
Consistency-based SLA
• Applications declare desired consistency/latency
Shopping Cart:
Rank  Consistency      Latency   Utility
1.    strong           300 ms    1.0
2.    read my writes   300 ms    0.5
3.    eventual         300 ms    0.1
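A consistency-based SLA like the shopping cart example can be represented as an ordered list of subSLAs. The sketch below is a minimal illustration, not the Pileus implementation: the SubSLA class, the guarantee names, and the rule that a Get meeting no subSLA earns zero utility are all assumptions made here.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class SubSLA:
    consistency: str   # desired guarantee, e.g. "strong"
    latency_ms: float  # latency target in milliseconds
    utility: float     # value to the application if this subSLA is met


# The shopping cart SLA from the slide, highest-ranked subSLA first.
SHOPPING_CART: List[SubSLA] = [
    SubSLA("strong", 300.0, 1.0),
    SubSLA("read_my_writes", 300.0, 0.5),
    SubSLA("eventual", 300.0, 0.1),
]


def delivered_utility(sla: List[SubSLA], satisfied: Set[str],
                      latency_ms: float) -> float:
    """Utility of the highest-ranked subSLA that a completed Get met.

    `satisfied` is the set of guarantees the returned value fulfils.
    Returns 0.0 if no subSLA was met (an assumption of this sketch).
    """
    for sub in sla:
        if sub.consistency in satisfied and latency_ms <= sub.latency_ms:
            return sub.utility
    return 0.0
```

For example, a Get that returns in 20 ms with a value satisfying only read my writes and eventual consistency would score 0.5 under this SLA.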
SLA Enforcement: Client Monitoring
For each tablet, the client records per node:
• Node and Primary?: from configuration service
• RTTs: measured on Gets, Puts, and pings
• High Timestamp: returned from Gets, Puts, and pings
Node   Primary?
A      yes        210
B      no         166
C      no         203
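The per-node records above could be kept in a small structure like the following. This is a sketch under stated assumptions: the NodeRecord class, the 100-sample sliding window, and the estimate of latency probability as the fraction of recent RTTs within the target are illustrative choices, not details given on the slide.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque


@dataclass
class NodeRecord:
    """Hypothetical client-side record for one node of a tablet."""
    name: str
    is_primary: bool          # from the configuration service
    high_timestamp: float = 0.0
    # Sliding window of recent round-trip times (window size is an assumption).
    rtts_ms: Deque[float] = field(default_factory=lambda: deque(maxlen=100))

    def observe(self, rtt_ms: float, high_timestamp: float) -> None:
        """Update the record after a Get, Put, or ping to this node."""
        self.rtts_ms.append(rtt_ms)
        # Keep the newest timestamp seen, even if replies arrive out of order.
        self.high_timestamp = max(self.high_timestamp, high_timestamp)

    def p_latency(self, target_ms: float) -> float:
        """Estimated probability that the node answers within target_ms."""
        if not self.rtts_ms:
            return 0.0  # no samples yet; a pessimistic assumption
        within = sum(1 for rtt in self.rtts_ms if rtt <= target_ms)
        return within / len(self.rtts_ms)
```

A client would call observe() after every operation or ping, so the estimates adapt as network conditions and replication lag change.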
SLA Enforcement: Node Selection
On Get (key, SLA):
1. For each subSLA and node,
a. compute P_latency
b. compute P_consistency
c. compute P_latency x P_consistency x utility
2. Select node with maximum expected utility
3. Send Get operation to node
4. Measure RTT and update records
5. Return data and delivered consistency to caller
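Steps 1 and 2 above can be sketched as a loop over every (subSLA, node) pair. The code below is a minimal illustration under assumptions: the Node and SubSLA classes and the callable probability estimators are hypothetical, and ties are broken in favor of the higher-ranked subSLA and then the earlier node, which the slide does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SubSLA:
    consistency: str
    latency_ms: float
    utility: float


@dataclass
class Node:
    """Hypothetical view of a node as seen by the client monitor."""
    name: str
    p_latency: Callable[[float], float]    # target latency (ms) -> probability
    p_consistency: Callable[[str], float]  # guarantee name -> probability


def select_node(sla: List[SubSLA], nodes: List[Node]) -> Tuple[Node, SubSLA, float]:
    """Return the (node, subSLA, expected utility) triple with maximum expected utility."""
    best = None
    for sub in sla:                      # step 1: each subSLA ...
        for node in nodes:               # ... and each node
            expected = (node.p_latency(sub.latency_ms)
                        * node.p_consistency(sub.consistency)
                        * sub.utility)
            # Strict '>' keeps the first maximum found (tie-break assumption).
            if best is None or expected > best[2]:
                best = (node, sub, expected)
    if best is None:
        raise ValueError("need at least one subSLA and one node")
    return best                          # step 2: maximum expected utility
```

As a worked case, take the shopping cart SLA (strong 1.0, read my writes 0.5, eventual 0.1, all at 300 ms) with a distant primary that answers within 300 ms only 20% of the time and a nearby secondary that always does but is up to date for read my writes with probability 0.9. The primary's best expected utility is 0.2 x 1.0 x 1.0 = 0.2, while the secondary's is 1.0 x 0.9 x 0.5 = 0.45, so the secondary is selected.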
Experimental Setup
System configuration:
Primary: England
Secondaries: U.S., India
Clients: U.S., England, India, China
[Figure: U.S., England, India, and China sites; link labels 161, 149, 436, 308, 287, 181]
Benchmark:
• YCSB with 50/50 Gets/Puts
• 500-op sessions
Node selection schemes:
• Primary = get from primary
• Random = get from random node
• Closest = get from closest node
• Pileus = get from node with highest expected utility
Measurement:
• Average utility for Get operations
Experiment #1: SLA
Simplified shopping cart SLA:
Rank  Consistency      Latency   Utility
1.    read my writes   300 ms    1.0
2.    eventual         300 ms    0.5
Experiment #1: Delivered Utility
[Chart: average utility per Get by client datacenter: U.S. (secondary), England (primary), India (secondary), China (client only)]
Experiment #1: Delivered Utility
Primary selection works well when close to the primary, but poorly when distant.
Experiment #1: Delivered Utility
Random selection rarely works well.
Experiment #1: Delivered Utility
100% of Gets from England; 100% meet top subSLA.
Experiment #1: Delivered Utility
91% from U.S.; 9% from England; 100% meet top subSLA.
14.5 ms avg. latency vs. 148 ms for primary.
Experiment #1: Delivered Utility
99.6% from U.S.; 0.4% from India; 96% meet top subSLA.
Experiment #1: Delivered Utility
Pileus always delivers the most utility!
Experiment #1: Delivered Utility
9% fail to meet read my writes.
Experiment #2: SLA
Password checking SLA:
Rank  Consistency  Latency   Utility
1.    strong       150 ms    1.0
2.    eventual     150 ms    0.5
3.    strong       1000 ms   0.25
Experiment #2: Delivered Utility
[Chart: average utility per Get by client datacenter: U.S. (secondary), England (primary), India (secondary), China (client only)]
Conclusions: Main Contributions
Our Pileus system
• provides a broad choice of consistency guarantees
and range of delivered read latency
• allows declarative specification of desired consistency
and latency through consistency-based SLAs
• selects nodes to maximize expected utility while
adapting to varying conditions