stdchk: A Checkpoint Storage System for Desktop Grid Computing

Samer Al-Kiswany (UBC), Matei Ripeanu (UBC), Sudharshan S. Vazhkudai (ORNL), Abdullah Gharaibeh (UBC)
The University of British Columbia / Oak Ridge National Laboratory

Checkpointing Introduction
Checkpointing is used for fault tolerance, debugging, and migration. Typically, an application running for days on hundreds of nodes (e.g., a desktop grid) saves checkpoint images periodically.
[Figure: application timeline with periodic checkpoints]

Deployment Scenario
[Figure: deployment scenario]

The Challenge
Although checkpointing is necessary:
- It is pure overhead from the performance point of view.
- Most of the time is spent writing to the storage system.
- It generates a high load on the storage system.
Requirement: a high-performance, scalable, and reliable storage system optimized for checkpointing applications.
Challenge: low-cost, transparent support for checkpointing at the file-system level.

Checkpointing Workload Characteristics
- Write-intensive and bursty: e.g., a job running on hundreds of nodes periodically checkpoints hundreds of GB of data.
- Written once, rarely read during application execution.
- Potentially high similarity between consecutive checkpoints.
- Application-specific checkpoint image life span: when is it safe to delete an image?

Why a Checkpointing-Optimized Storage System?
Optimizing for the checkpointing workload can bring valuable benefits:
- High throughput, through specialization.
- Considerable storage space and network effort savings, through transparent support for incremental checkpointing.
- Simplified data management, by exploiting the particularities of checkpoint usage scenarios.
- A reduced load on the shared file system.
- Low cost: the system can be built atop scavenged resources.

stdchk
A checkpointing-optimized storage system built using scavenged resources.

Outline
- stdchk architecture
- stdchk features
- stdchk system evaluation

stdchk Architecture
- Manager: metadata management.
- Benefactors: storage nodes.
- Client: file system interface.

stdchk Features
- High throughput for write operations.
- Support for transparent incremental checkpointing.
- Simplified data management.
- High reliability through replication.
- POSIX file system API: as a result, using stdchk does not require modifications to the application.

Optimized Write Operation Alternatives
Write procedure alternatives (a sketch of the sliding-window variant follows this list):
- Complete local write: the checkpoint is written entirely to the local disk, then transferred to stdchk.
- Incremental write: data is staged on the local disk and transferred to stdchk incrementally, while the checkpoint is still being written.
- Sliding-window write: data passes through an in-memory buffer and is streamed to stdchk without touching the local disk.
[Figure: data paths between the application, the stdchk FS interface, local disk/memory, and stdchk for each alternative]
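To make the sliding-window data path concrete, here is a minimal sketch in Python. It illustrates the idea only and is not stdchk's implementation: the 4 MB window size and the send_window() callback are placeholders for the real striping and transfer logic.

```python
import queue
import threading

WINDOW_SIZE = 4 * 1024 * 1024  # illustrative window size, not stdchk's actual value

class SlidingWindowWriter:
    """Buffers application writes in memory and streams full windows to
    remote storage from a background thread, so the application blocks
    neither on the network nor on the local disk."""

    def __init__(self, send_window):
        # send_window(bytes) stands in for the transfer of one window to
        # a benefactor node; its name and signature are assumptions.
        self._send = send_window
        self._buf = bytearray()
        self._windows = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write(self, data: bytes) -> None:
        # Fast path: append to the in-memory buffer; hand off full windows.
        self._buf.extend(data)
        while len(self._buf) >= WINDOW_SIZE:
            self._windows.put(bytes(self._buf[:WINDOW_SIZE]))
            del self._buf[:WINDOW_SIZE]

    def close(self) -> None:
        if self._buf:
            self._windows.put(bytes(self._buf))  # flush the last partial window
        self._windows.put(None)                  # sentinel: no more data
        self._worker.join()

    def _drain(self) -> None:
        # Runs in the background: overlaps transfer with application compute.
        while True:
            window = self._windows.get()
            if window is None:
                return
            self._send(window)

# Usage sketch: SlidingWindowWriter(send_window=lambda w: sock.sendall(w))
```

A real deployment would additionally need flow control (e.g., a bounded queue so a slow network cannot exhaust memory) and striping/replication across benefactor nodes; the sketch omits both.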
Write Operation Evaluation
Testbed: 28 machines, each with two 3.0 GHz Xeon processors, 1 GB RAM, and two 36.5 GB SCSI disks.

Achieved Storage Bandwidth
The sliding-window write achieves high bandwidth (110 MB/s) and saturates the 1 Gbps link.
[Figure: average achieved storage bandwidth over the 1 Gbps testbed; write throughput (MB/s) vs. stripe width (1, 2, 4, 8) for complete local write, incremental write, sliding-window write, NFS, local I/O, and iperf]

stdchk Features
- High-throughput write operation.
- Transparent incremental checkpointing.
- Checkpointing-optimized data management.
- POSIX file system interface: no modification to the application required.

Transparent Incremental Checkpointing
Incremental checkpointing may bring valuable benefits:
- Lower network effort.
- Less storage space used.
But:
- How much similarity is there between consecutive checkpoints?
- How can we detect similarity between checkpoints?
- Is the detection fast enough?

Similarity Detection Mechanism – Compare-by-Hash
The checkpoint image is divided into blocks, and each block is hashed. At time T0 the checkpoint consists of blocks X, Y, and Z, whose hashes are recorded. At time T1 the next checkpoint differs in a single block (X has been replaced by W); comparing block hashes shows that only W is new, so only W is transferred and stored.

Similarity Detection Mechanism
How should the file be divided into blocks?
- Fixed-size blocks + compare-by-hash (FsCH).
- Content-based blocks + compare-by-hash (CbCH).

FsCH Insertion Problem
An insertion in checkpoint i+1 shifts every subsequent fixed-size block boundary (blocks B1..B5 become B1..B6), so unchanged data no longer aligns with the previous blocks and hashes differently.
Result: a lower similarity detection ratio.

Content-based Compare-by-Hash (CbCH)
CbCH slides a window of m bytes over the checkpoint, hashes the window at each offset, and declares a block boundary wherever the k low-order bits of the hash value are zero. Because boundaries are determined by the content rather than by position, an insertion changes only the block it touches (B1 B2 B3 B4 becomes B1 BX B3 B4).
Result: a higher similarity detection ratio.
But: hashing at every offset is computationally intensive. (A sketch of both chunking schemes follows.)
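The two chunking schemes can be illustrated with a short Python sketch. This is a toy illustration, not stdchk's code: MD5 is an arbitrary stand-in for the hash function, the m = 20 bytes / k = 14 bits defaults mirror the parameters on the evaluation slide below, and a production CbCH would use a rolling hash rather than rehashing every window from scratch; that per-offset hashing cost is precisely what makes CbCH computationally intensive.

```python
import hashlib

def fsch_chunks(data: bytes, block_size: int = 1024 * 1024) -> list:
    """FsCH: fixed-size blocks. An insertion shifts every later boundary,
    so unchanged data may still hash differently."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def cbch_chunks(data: bytes, m: int = 20, k: int = 14) -> list:
    """CbCH: declare a boundary where the hash of the last m bytes has its
    k low-order bits equal to zero; boundaries move with the content, so
    an insertion only affects the block(s) it touches."""
    mask = (1 << k) - 1  # expected block size is roughly 2^k bytes
    chunks, start = [], 0
    for i in range(m, len(data)):
        digest = hashlib.md5(data[i - m:i]).digest()  # naive, non-rolling hash
        if int.from_bytes(digest[-4:], "big") & mask == 0:
            chunks.append(data[start:i])
            start = i
    chunks.append(data[start:])
    return chunks

def new_blocks(prev_hashes: set, chunks: list) -> list:
    """Compare-by-hash: store only the chunks whose hash did not appear in
    the previous checkpoint; the rest are referenced, not re-sent."""
    return [c for c in chunks if hashlib.md5(c).hexdigest() not in prev_hashes]

# Usage sketch, for two consecutive checkpoint images ckpt0 and ckpt1:
#   prev = {hashlib.md5(c).hexdigest() for c in cbch_chunks(ckpt0)}
#   to_store = new_blocks(prev, cbch_chunks(ckpt1))
```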
Evaluating Similarity Between Consecutive Checkpoints
Applications: BMS* and BLAST. Checkpointing intervals: 1, 5, and 15 minutes.

Checkpoint type             | Number of checkpoints | Avg. checkpoint size
Application level           | 100                   | 2.4 MB
System level (BLCR)         | 1200                  | 450 MB
Virtual machine level (Xen) | 400                   | 1 GB

* Checkpoints by Pratul Agarwal (ORNL).

Similarity Ratio and Detection Throughput
The table presents the average rate of detected similarity and, in brackets, the detection throughput in MB/s for each heuristic.

Technique                         | BMS App (1 min) | BMS BLCR (5 min) | BMS Xen (15 min) | BLAST (5 or 15 min)
FsCH (1 MB)                       | 0.0% [108]      | 23.4% [109]      | 6.3% [113]       | 0.0% [110]
CbCH (no overlap, m=20 B, k=14 b) | 0.0% [28.4]     | 82% [26.6]       | 70% [26.4]       | 0.0% [28.4]

But: using the GPU, CbCH achieves over 190 MB/s detection throughput. (StoreGPU: Exploiting Graphics Processing Units to Accelerate Distributed Storage Systems, S. Al-Kiswany, A. Gharaibeh, E. Santos-Neto, G. Yuan, M. Ripeanu, HPDC 2008.)

Compare-by-Hash Results
FsCH slightly degrades the achieved bandwidth, but reduces the storage space used and the network effort by 24%.
[Figure: achieved storage bandwidth; write throughput (MB/s) vs. file system interface write buffer size (64, 128, 256 MB), with FsCH and with no detection]

Outline
- stdchk architecture
- stdchk features
- stdchk overall system evaluation

stdchk Scalability
stdchk sustains high loads.
Workload: 7 clients, each writing 100 files of 100 MB each (70 GB in total), to a stdchk pool of 20 benefactor nodes.
[Figure: aggregate stdchk throughput (MB/s) over time (0-300 s) in three scenarios: steady pool, nodes joining, and nodes leaving]

Experiment with a Real Application
Application: BLAST. Execution time: over 5 days. Checkpointing interval: 30 s. Stripe width: 4 benefactors. Client machine: two 3.0 GHz Xeon processors, SCSI disks.

Metric                   | Local disk | stdchk  | Improvement
Checkpointing time (s)   | 22,733     | 16,497  | 27.0%
Data size (TB)           | 3.55       | 1.14    | 69.0%
Total execution time (s) | 462,141    | 455,894 | 1.3%

Summary
stdchk: a checkpointing-optimized storage system built using scavenged resources.
stdchk features:
- High-throughput write operation.
- Considerable disk space and network effort savings.
- Checkpointing-optimized data management.
- Easy to adopt: implements a POSIX file system interface.
- Inexpensive: built atop scavenged resources.
Consequently, stdchk:
- Offloads the checkpointing workload from the shared file system.
- Speeds up checkpointing (reduces checkpointing overhead).

Thank you
netsyslab.ece.ubc.ca