SPANStore: Cost-Effective Geo-Replicated Storage Spanning Multiple Cloud Services
Zhe Wu, Michael Butkiewicz, Dorian Perkins, Ethan Katz-Bassett, Harsha V. Madhyastha
UC Riverside and USC

Geo-distributed Services for Low Latency

Cloud Services Simplify Geo-distribution

Need for Geo-Replication
• Data uploaded by a user may be viewed or edited by users in other locations
  • Social networking (Facebook, Twitter)
  • File sharing (Dropbox, Google Docs)
• Geo-replication of data is therefore necessary
• But each cloud offers an isolated storage service in every data center, so the application has to handle replication itself

Geo-replication on Cloud Services
• Lots of recent work on enabling geo-replication
  • Walter (SOSP'11), COPS (SOSP'11), Spanner (OSDI'12), Gemini (OSDI'12), Eiger (NSDI'13), ...
  • Focused on faster performance or stronger consistency
• Added consideration on cloud services: minimizing cost

Outline
• Problem and motivation
• SPANStore overview
• Techniques for reducing cost
• Evaluation

SPANStore
• Key-value store (GET/PUT interface) spanning cloud storage services
• Main objective: minimize cost
• While satisfying application requirements
  • Latency SLOs
  • Consistency (eventual vs. sequential)
  • Fault tolerance

SPANStore Overview
[Figure: applications in data centers A-D issue requests to a local SPANStore library, which reads and writes data according to the optimal replication policy, performs metadata lookups, and returns data or ACKs]

SPANStore Overview
[Figure: the placement manager takes SPANStore's characterization (inter-DC latencies, pricing policies) and the application's input (latency, consistency, and fault-tolerance requirements; workload) and pushes the resulting replication policy to the SPANStore library in every data center]

Outline: techniques for reducing cost

Questions to be addressed for every object
• Where to store replicas
• How to execute PUTs and GETs

Cloud Storage Service Cost
• Storage cost (amount of data stored) + request cost (number of PUT and GET requests issued) = storage service cost
• Storage service cost + data transfer cost (amount of data transferred out of the data center) = total cost

Low Latency SLO Requires High Replication in a Single-Cloud Deployment
[Figure: for a 100 ms latency bound, the number of data centers within the bound from each AWS region (US East, US West 1, US West 2, EU, Asia Pacific 1, Asia Pacific 2, Asia Pacific 3, South America) is small when only S3 is used]

Technique 1: Harness Multiple Clouds
[Figure: the same plot with S3 + Azure + GCS; every AWS region has more data centers within the 100 ms bound than with S3 alone]

Price Discrepancies across Clouds

Cloud region   Storage ($/GB)   Data transfer out ($/GB)   GETs ($/10,000 requests)   PUTs ($/1,000 requests)
S3 US West     0.095            0.12                       0.004                      0.005
Azure Zone 2   0.095            0.19                       0.001                      0.0001
GCS            0.085            0.12                       0.01                       0.01
...            ...              ...                        ...                        ...

• Leveraging these discrepancies judiciously can reduce cost

Range of Candidate Replication Policies
• Strategy 1: a single replica in the cheapest storage cloud (high latencies)
• Strategy 2: a few replicas placed to reduce latencies (high data transfer cost)
• Strategy 3: replicate everywhere (high storage cost and high cost of PUTs)
• The optimal replication policy depends on:
  1. application requirements
  2. workload properties
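The tradeoff among these strategies can be made concrete with a small back-of-the-envelope model. The sketch below is illustrative only (it is not SPANStore code): it prices a single object under one replication policy using the per-unit prices from the table above, while the identifiers, object size, request counts, and the simplified propagation model (the writer pushes a full copy directly to every replica) are assumptions added here.

```python
# Back-of-the-envelope cost model for one object under one replication policy
# (illustrative sketch only, not SPANStore code). Prices come from the
# price-discrepancy table above; names and the propagation model are assumptions.

PRICES = {
    # region: (storage $/GB, transfer-out $/GB, $ per 10k GETs, $ per 1k PUTs)
    "s3-us-west":  (0.095, 0.12, 0.004, 0.005),
    "azure-zone2": (0.095, 0.19, 0.001, 0.0001),
    "gcs":         (0.085, 0.12, 0.01,  0.01),
}

def monthly_cost(replicas, obj_gb, gets, puts, origin):
    """Storage at every replica, all GETs served by the cheapest replica, and
    every PUT pushed from `origin` directly to each replica (no relaying)."""
    storage = sum(PRICES[r][0] * obj_gb for r in replicas)
    # GETs: request cost plus transferring the object out of the serving replica.
    get_cost = min(PRICES[r][2] * gets / 1e4 + PRICES[r][1] * obj_gb * gets
                   for r in replicas)
    # PUTs: request cost at every replica plus egress from the origin to each
    # replica other than the origin itself.
    remote = [r for r in replicas if r != origin]
    put_cost = sum(PRICES[r][3] * puts / 1e3 for r in replicas) \
             + PRICES[origin][1] * obj_gb * puts * len(remote)
    return storage + get_cost + put_cost

# Strategy 1 (single cheap replica) vs. strategy 3 (replicate everywhere) for a
# 10 MB object written from the S3 US West data center.
print(monthly_cost(["gcs"], 0.01, gets=1000, puts=100, origin="s3-us-west"))
print(monthly_cost(list(PRICES), 0.01, gets=1000, puts=100, origin="s3-us-west"))
```

Even this toy model shows why no single strategy wins: which policy is cheapest flips as the object size and the GET/PUT mix change, which is exactly the dependence on application requirements and workload properties noted above.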
High Variability of Individual Objects
• Analyze the predictability of a Twitter workload: estimate each user's workload for an hour based on the same hour in the previous week
[Figure: CDF over analyzed hours of the relative prediction error for five individual users; 60% of hours have an error higher than 50%, 20% of hours have an error higher than 100%, and the error can be as high as 1000%]

Technique 2: Aggregate Workload Prediction per Access Set
• Observation: the aggregate workload is stable
  • Diurnal and weekly patterns
• Classify objects by access set: the set of data centers from which an object is accessed
• Leverage the application's knowledge of sharing patterns
  • Dropbox/Google Docs know the users that share a file
  • Facebook controls every user's news feed
[Figure: CDF of relative prediction error when estimating based on the same hour in the previous week; the aggregate workload across all users is far more stable and predictable than any individual user's workload]

Optimizing Cost for GETs and PUTs
• Serve requests from data centers that are cheap in terms of request and data transfer prices
[Figure: a GET is served by the replica with the lowest combined request and data transfer cost]

Technique 3: Relay Propagation
• Asynchronous propagation (no latency constraint): instead of the origin sending a copy of a PUT directly to every replica, forward one copy through a relay data center with cheap egress and let it fan out
[Figure: example with per-GB data transfer prices of 0.25$, 0.2$, 0.19$, 0.19$, and 0.12$ at the data centers involved]
• Synchronous propagation (bounded by the latency SLO): relaying along the cheapest path may violate the SLO, so relays must also respect the latency bound
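A minimal sketch of the relay idea follows, under assumed egress prices, inter-DC latencies, and a single-relay fan-out; SPANStore itself folds this choice into the placement manager's ILP (next slide) rather than solving it greedily per PUT.

```python
# Sketch of relay propagation (illustrative only; SPANStore chooses relays inside
# the placement manager's ILP). All egress prices and latencies below are made up.

EGRESS = {"origin": 0.25, "A": 0.20, "B": 0.12, "C": 0.19}   # $/GB transferred out
LATENCY = {("origin", "A"): 40, ("origin", "B"): 90, ("origin", "C"): 120,
           ("A", "B"): 60, ("A", "C"): 50, ("B", "C"): 70}   # one-way, in ms

def lat(u, v):
    return 0 if u == v else LATENCY.get((u, v), LATENCY.get((v, u)))

def propagation_cost(origin, relay, replicas, obj_gb):
    """Send one copy origin -> relay, then let the relay fan out to the rest."""
    fanout = [r for r in replicas if r not in (origin, relay)]
    return EGRESS[origin] * obj_gb + EGRESS[relay] * obj_gb * len(fanout)

def best_relay(origin, replicas, obj_gb, put_slo_ms=None):
    """Cheapest relay; for synchronous propagation the two-hop latency to every
    replica must stay within the PUT SLO (pass put_slo_ms=None for asynchronous)."""
    best = None
    for relay in replicas:
        worst_latency = max(lat(origin, relay) + lat(relay, r) for r in replicas)
        if put_slo_ms is not None and worst_latency > put_slo_ms:
            continue  # this relay would violate the SLO
        cost = propagation_cost(origin, relay, replicas, obj_gb)
        if best is None or cost < best[1]:
            best = (relay, cost)
    return best

print(best_relay("origin", ["A", "B", "C"], 0.001))                  # async: picks B (cheapest egress)
print(best_relay("origin", ["A", "B", "C"], 0.001, put_slo_ms=150))  # sync: B violates the SLO, picks A
```

With these made-up numbers, asynchronous propagation relays through the data center with the cheapest egress, while the synchronous case must settle for a pricier relay to stay within the PUT SLO, which is the distinction the two slides above draw.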
Summary
• Insights to reduce cost
  • Multi-cloud deployment
  • Aggregate workload prediction per access set
  • Relay propagation
• The placement manager combines these insights in an ILP
• Other techniques
  • Metadata management
  • Two-phase locking protocol
  • Asymmetric quorum sets

Outline: evaluation

Evaluation
• Scenario
  • The application is deployed on EC2
  • SPANStore is deployed across S3, Azure, and GCS
• Simulations to evaluate cost savings
• Deployment to verify that application requirements are met
  • Retwis
  • ShareJS

Simulation Settings
• Compare SPANStore against
  • Replicate everywhere
  • Single replica
  • Single-cloud deployment
• Application requirements
  • Sequential consistency
  • PUT SLO: the minimum SLO that replicating everywhere can satisfy
  • GET SLO: the minimum SLO that a single replica can satisfy

SPANStore Enables Cost Savings across Disparate Workloads
[Figure: CDF over access sets of the savings, computed as (cost with replicate everywhere) / (cost with SPANStore), for four workloads: #1 big objects, more GETs (lots of data transfers from replicas; savings from reducing data transfer); #2 big objects, more PUTs (lots of data transfers to replicas; savings from relay propagation); #3 small objects, more GETs (lots of GET requests; savings from GET request price discrepancies); #4 small objects, more PUTs (lots of PUT requests; savings from PUT request price discrepancies)]

Deployment Settings
• Retwis
  • Scaled-down Twitter workload
  • GET: read a timeline
  • PUT: make a post
  • Insert: read a follower's timeline and append the post to it
• Requirements
  • Eventual consistency
  • 90th-percentile PUT/GET SLO = 100 ms

SPANStore Meets SLOs
[Figure: CDF of GET, PUT, and Insert latencies (0-200 ms), with the 90th-percentile PUT/GET SLO and the Insert SLO marked; the measured latencies meet the SLOs]

Conclusions
• SPANStore minimizes cost while satisfying latency, consistency, and fault-tolerance requirements
• It uses multiple cloud providers for greater data center density and to exploit pricing discrepancies
• It judiciously determines the replication policy based on workload properties and application needs