PROGRAMMING SUPPORT AND ADAPTIVE CHECKPOINTING FOR HIGH-THROUGHPUT DATA SERVICES WITH LOG-BASED RECOVERY

PROGRAMMING SUPPORT AND ADAPTIVE CHECKPOINTING FOR HIGH-THROUGHPUT DATA SERVICES WITH LOG-BASED RECOVERY Jingyu Zhou Shanghai Jiao Tong Univ Jiesheng Wu Microsoft DSN 2010 Caijie Zhang Google Inc. Hong Tang Yahoo! Tao Yang UC Santa Barbara Backgrounds  Many large-scale data-mining and offline applications in Google, Yahoo, Microsoft, Ask.com, etc. require  High data parallelism/throughput  Data persistence. But not so stringent availability  E.g., URL property service (UPS) at Ask.com search offline mining platform  Hundreds of app. modules access UPS Examples of high-throughput data services for web mining/search Internet Web documents Data/ info service Crawler Crawler Crawler Document DB Document DB Document DB 10-50 billion URLs Data/info service Data mining Data mining job job … Data mining job e.g. URL property service. 100K-500K/s Existing Approaches for Highperformance and Persistence  Database systems  suffer from high overhead, limits its performance while supporting general features  Need more machine resources  Related work and well-known techniques for high availability  Data replication  Log-based recovery  Checkpointing Challenges and Focus of this work  System design with careful selection and integration of fault-tolerant techniques for high throughput computing.  Trade off in availability, but allow some down time.  Low cost: logging/checkpoint.  Fine-grain for minimum service disruption.  Local data recovery. Periodic remote backup.  Programming support  Lightweight, services. simplifying construction of robust data SLACH: Selective Logging & Adaptive CHeckpointing  Targeted data services  Request-driven thread model.  In-memory objects.  Data independence.  Similar to key-value stores in BigTable/Dynamo, but higher throughput. Architecture of SLACH Main Techniques  Selective operation logging  Only log write operations    (oid, op_type, parameters, timestamp) Write-ahead log, i.e., write then apply operations Object-level checkpoint to avoid service disruptions with adaptive load control Ckpt objects one-by-one. Still allow concurrent access of other objects  Perform checkpointing when load is low to amortize cost of checkpointing   Light weight API while supporting legacy code. Object-level Checkpoints Adaptive Checkpointing Control  Goal is to balance ckpt. cost and recovery speed  Ckpt. less frequently-> larger logs -> lengthy recovery  Ckpt. too often -> higher overhead  Ideally,  High server load -> ckpt. less frequently  Low server load -> ckpt. more frequently  Adjust between a Low Watermark (LW) & High Watermark (HW) of service loads  Loadcurr = α×loadprev+(1-α)×sample Adaptive Checkpointing Frequency  Ckpt. threshold between LB and UB  LB, UB are log size parameters, determined by app. Threshold  LB  F(load)  (UB  LB), where  SLACH Programming Support  Application developers Call SLACH function log() to log an object operation  Define 3 callback functions:  1) what to checkpoint (call SLACH’s ckpt() for each selected object,  2) recover one object from a checkpoint,  3) replay a log operation.   SLACH Provide functions log() and ckpt().  Call user’s checkpoint callback fun during checkpoint.  Call a user’s recover function during checkpoint recover.  Call a user’s replay function when recovering from a log.  SLACH API for Applications class SLACH::API { public: /* register ckpt. policy and parameters */ void register_policy(const Policy& p); /* log one write operation */ void log(int64_t obj_id, int op, ...); /* checkpoint one object */ void ckpt(int64_t obj_id, const void* addr, uint32_t size); }; SLACH Interface class SLACH::Application { … protected: /* application checkpoint callback function */ virtual void ckpt_callback()=0; /* callback of loading one object checkpoint*/ virtual void load_one_callback(int64_t obj_id, const void *addr,uint32_t size)=0; /* callback of replaying one operation log */ virtual void replay_one_callback(int64_t obj_id, int op, const para_vec& args)=0; }; An Example: Application-level code struct Item { double price; int quantity; }; class MyService : public SLACH::Application { private: Item obj[1000]; SLACH::API slach_; /* SLACH API */ static const int OP_PRICE=0;/* an op type */ public: void update_price(int id, double p) { slach_.log(id, OP_PRICE, &p, sizeof(p)); obj[id].price = p; } Application objects being accessed Log selected object update operation An Example: Call-back functions SLACH calls this user function during checkpointing. void ckpt_callback() { for (int i=0; i<1000 ; i++) slach_.ckpt(i, &obj[i], sizeof(obj[i])); } SLACH calls this when recovering an object from a checkpoint . void load_one_callback(int64_t id, const void *p, uint32_t size) { memcpy(&obj[id], p, size); } SLACH calls this when recovering an object by log replaying void replay_one_callback(int64_t id, int op, const para_vec& args) { switch (op) { case OP_PRICE: obj[id].price = *(double*)args[0].second; break; // ... } SLACH Implementation and Applications   Part of Ask.com middleware infrastructure in C++ for data mining and search offline platform Application samples: UPS (URL property service) for recording property of all URLs crawled/collected.  HIS (Host information service) for recording property of all hosts crawled on the web.  20-80% of write traffic. Running on a cluster of hundreds of machines. In production for last 3 years.   Significantly reduced development time (1-2 months vs. few days). Characteristics of UPS/HIS   Perfor. characteristics of UPS/HIS per partition. Data Max. Read Max. Write UPS 1.9GB 110K Req/s 56K Req/s HIS 2.1GB 58K Req/s 16K Req/s Parameters for adaptive ckpt. Control UPS HIS 0.8 α Moving avg. 0.8 LB/UB low/upper b. 1M-8M entries 0.3M-1.8M LW/HW L/H watermark 20%-85% 35%-85% β Scaling 3 6 w Sampling win. 5s 5s Evaluation     Impact of logging overhead System behavior during checkpointing Effectiveness of adaptive checkpoint control Performance comparison of hash table implementation using SLACH and BerkeleyDB  Evaluation Setting  Benchmarks  UPS (URL property service)  HIS (Host-level property service)  Persistent Hash Table (PHT)  Metric: throughput loss percent SuccessfulRe quests LossPercent  (1 ) 100 TotalRe quests  Hardware: a 15 node cluster, gigabit link Selective Logging Overhead of UPS • Base: logging is disabled • Log: selective logging is enabled Negligible impact when server load < 40%. System Performance During Checkpointing (100% server load) During ckpt, 8.9% throughput drop During ckpt, 57.6% increase of response time Effectiveness of Adaptive Threshold Controller – Performance Comparison in UPS • Fixed threshold policy, 8M has lower runtime overhead – less frequent ckpt • Adaptive approach has comparable performance as fixed policy of 8M. Effectiveness of Threshold Controller – Recovery Speed • Fixed threshold -> fixed log size -> same recovery time • Adaptive approach: small log for light load (less recovery time), large log for higher load (more recovery time) SLACH is better for all value sizes, because 1. BDB incurs more per-operation overhead 2. BDB involves more disk I/Os PHT vs. Berkeley DB 30-B value, SLACH is 5.3 times higher SLACH ckpt has less overhead 1. BDB ckpt is not async 2. SLACH fuzzy ckpt still allow access Conclusions  SLACH contributions  A lightweight programming framework for very highthroughput, persistent data services Simplify application construction while meeting reliability demands  Selective logging to enhance performance   System design with careful integration of multiple techniques Dynamic adjust ckpt. frequency to meet throughput demands  Fine-grained ckpt without service disruptions   Evaluation of integrated scheme in production applications. Data and Failure Models  Data independence and object-oriented access model  Key-value store as in Dynamo/BigTable, but with much higher throughput demand per machine  Each object is a continuous memory block  Middleware  infrastructure can handle noncontiguous ones Fail-stop  Focus on local recovery due to app. failures  OS/Hardware failure can be dealt with remote ckpt.  Implemented, but not the scope of this paper

PROGRAMMING SUPPORT AND ADAPTIVE CHECKPOINTING FOR HIGH-THROUGHPUT DATA SERVICES WITH LOG-BASED RECOVERY

Related documents

Products

Support

PROGRAMMING SUPPORT AND ADAPTIVE CHECKPOINTING FOR HIGH-THROUGHPUT DATA SERVICES WITH LOG-BASED RECOVERY

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib