Using Similarity in Content and Access Patterns to Improve Space Efficiency and Performance in Storage Systems
Xing Lin
PhD Defense, July 22, 2015

Photos Stored in Facebook
• Number of photos uploaded every week: 1 billion (60 terabytes)
• Total number of photos stored by 2010: 260 billion (20 petabytes)
Reference: "Finding a needle in Haystack: Facebook's photo storage", in USENIX OSDI '10

Amazon S3
• 5 objects for each star in the galaxy
• (Assume) average object size is 100 KB; total data size is 200 PB

How Much Data Does This Plane Generate per Flight?
• BOEING 787

Exponential Increase of Digital Data → Efficient Storage Solutions

Data Reduction Techniques
• Compression: find redundant strings and replace them with compact encodings
  – ~2x reduction
  – LZ ([Ziv and Lempel 1977]), lz4, gzip, bzip2, 7z, xz, …
• Deduplication: find duplicate chunks and store only the unique ones
  – ~10x reduction for backups
  – Venti ([Quinlan FAST02]), Data Domain File System ([Zhu FAST08]), iDedup ([Srinivasan FAST12]), …

Limitations
• Compression: searches for redundancy at the string level
  – Does not scale for detecting redundant strings across a large range
• Deduplication: interleaves metadata with data
  – Frequent metadata changes introduce many unnecessary unique chunks

Proposed Solutions
• Migratory Compression: detect similarity at the block level and group similar blocks (Chap. 2, FAST14)
• Deduplication:
  – Separate metadata from data, to store the same type of data together (Chap. 3, HotStorage15)
  – Use deduplication for efficient disk image storage (Chap. 4, TridentCom15)
• Differential IO Scheduling: schedule the same type of IO requests together for predictable and efficient performance (Chap. 5, HotCloud12)

Thesis Statement
Similarity in content and access patterns can be utilized to improve space efficiency, by storing similar data together, and to improve performance predictability and efficiency, by scheduling similar IO requests to the same hard drive.
Outline
• Introduction
• Migratory Compression
• Improve deduplication by separating metadata from data
• Using deduplication for efficient disk image deployment
• Performance predictability and efficiency for cloud storage systems
• Conclusion

Background on Compression
• Compression: finds redundant strings within a window size and encodes them with more compact structures
• Metrics
  – Compression Factor (CF) = original size / compressed size
  – Throughput = original size / (de)compression time

Compressor   MAX window size   Techniques
gzip         64 KB             LZ; Huffman coding
bzip2        900 KB            Run-length encoding; Burrows-Wheeler Transform; Huffman coding
7z           1 GB              Variant of LZ77; Markov chain-based range coder

[Chart: example compression throughput (MB/s) vs. compression factor for gzip, bzip2, and 7z. gzip is the fastest but compresses the least; 7z compresses the most but is the slowest.]
The larger the window, the better the compression, but the slower the compressor.
Fundamental reason: finding redundancy at the string level does not scale to large windows.
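As a rough illustration of these two metrics (my own sketch, not part of the original slides), the following Python snippet compresses a file with gzip, bzip2, and LZMA (the algorithm family behind 7z/xz) and reports CF and compression throughput. The input path is a placeholder.

```python
import bz2, gzip, lzma, time

def measure(name, compress, data):
    """Report Compression Factor and compression throughput for one compressor."""
    start = time.time()
    compressed = compress(data)
    elapsed = max(time.time() - start, 1e-9)
    cf = len(data) / len(compressed)       # CF = original size / compressed size
    tput = len(data) / elapsed / 2**20     # throughput = original size / compression time (MB/s)
    print(f"{name}: CF = {cf:.2f}x, throughput = {tput:.1f} MB/s")

if __name__ == "__main__":
    data = open("input.bin", "rb").read()  # placeholder input file
    measure("gzip",  gzip.compress, data)
    measure("bzip2", bz2.compress,  data)
    measure("lzma",  lzma.compress, data)
```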
Migratory Compression
• Problem: finding redundancy at the string level does not scale to large windows
• Key idea: group by similarity at the block level, enabling standard compressors to find repeated strings within their small windows
• Migratory Compression (MC): coarse-grained reorganization that groups similar blocks to improve compressibility
  – A generic pre-processing stage for standard compressors
  – In many cases, improves both compressibility and throughput
  – Effective for improving compression for archival storage

Compressing a single, large file (mzip)
• Traditional compressors cannot exploit redundancy across a large range of data (e.g., many GB): in a file laid out as A B A' C … A'' B' …, the similar blocks A, A', A'' fall far outside any single compression window.
• mzip first transforms the file so that similar blocks are adjacent (A A' A'' B B' C …), then runs a standard compressor over the reorganized file.

mzip example
• A similarity detector scans the file (blocks 0:A, 1:B, 2:C, 3:A', 4:D, 5:B', …) and groups similar blocks.
• It produces a migrate recipe (0, 3, 1, 5, 2, 4, …): the order in which blocks are read to build the reorganized file (A, A', B, B', C, D, …).
• It also produces a restore recipe (0, 2, 4, 1, 5, 3, …): the position of each original block in the reorganized file, used after decompression to restore the original order.
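The Python sketch below (my own illustration, not code from the dissertation) shows the two permutations the example relies on: the migrate recipe groups similar blocks before compression, and the restore recipe puts decompressed blocks back in their original order. Block similarity is stubbed out with an arbitrary grouping function.

```python
def build_recipes(blocks, group_key):
    """Build migrate/restore recipes by grouping blocks that share a similarity key."""
    migrate = sorted(range(len(blocks)), key=lambda i: (group_key(blocks[i]), i))
    restore = [0] * len(blocks)
    for new_pos, old_pos in enumerate(migrate):
        restore[old_pos] = new_pos            # restore[i] = position of original block i after reorganization
    return migrate, restore

def reorganize(blocks, migrate):
    return [blocks[i] for i in migrate]       # layout fed to the standard compressor

def restore_order(reorg_blocks, restore):
    return [reorg_blocks[p] for p in restore] # undo the reordering after decompression

# Toy example matching the slides: similar blocks share a letter.
blocks = ["A", "B", "C", "A'", "D", "B'"]
migrate, restore = build_recipes(blocks, group_key=lambda b: b[0])
assert migrate == [0, 3, 1, 5, 2, 4] and restore == [0, 2, 4, 1, 5, 3]
assert reorganize(blocks, migrate) == ["A", "A'", "B", "B'", "C", "D"]
assert restore_order(reorganize(blocks, migrate), restore) == blocks
```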
Detect Similar Blocks
• Similarity feature: hash-based resemblance detection ([Broder 1997])
  – Compute hashes over a chunk's content and keep the maximal hash value under each of several hash functions; these maximal values are the chunk's features.
  – Chunks that share features are likely to be similar, even when they are not identical.
• Super-features ([Shilane FAST12]): group several features into a super-feature; a match on any super-feature is good enough to declare two chunks similar.
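Below is a simplified, hypothetical Python sketch of this style of similarity detection (max-hash features grouped into super-features). The window size, feature count, and hashing scheme are illustrative choices, not the parameters used in the dissertation.

```python
import hashlib

def features(chunk: bytes, num_features=12, window=32):
    """Max-hash features: for each of num_features hash variants, keep the
    maximum hash value seen over all sliding windows of the chunk."""
    maxima = [0] * num_features
    for off in range(max(1, len(chunk) - window + 1)):
        win = chunk[off:off + window]
        for i in range(num_features):
            # Salt the hash with the feature index to simulate independent hash functions.
            h = int.from_bytes(hashlib.sha1(bytes([i]) + win).digest()[:8], "big")
            if h > maxima[i]:
                maxima[i] = h
    return maxima

def super_features(feats, group=4):
    """Group features into super-features; a super-feature matches only if every
    feature in the group matches."""
    return [hashlib.sha1(repr(feats[i:i + group]).encode()).hexdigest()
            for i in range(0, len(feats), group)]

def similar(chunk_a: bytes, chunk_b: bytes) -> bool:
    sfa = super_features(features(chunk_a))
    sfb = super_features(features(chunk_b))
    return any(a == b for a, b in zip(sfa, sfb))   # a match on any super-feature is enough
```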
Efficient Data Reorganization
• Problem: reorganization requires lots of random IOs
• Hard drives do not perform well for random IO; SSDs provide good random IO performance
• Multi-pass reorganization helps considerably for hard drives: it converts random IOs into multiple sequential scans of the input file

Evaluation
• How much can compression be improved?
  – Compression Factor (CF): original size / compressed size
• What is the complexity (runtime overhead)?
  – Compression throughput: original size / runtime
• More in the paper
  – How does SSD or HDD affect data reorganization performance? For HDD, does multi-pass help?
  – How does MC perform, compared with delta compression?
  – What are the configuration options for MC?

Migratory Compression – Evaluation
• Dataset: backup of workstation1 (engineering desktop)
[Chart: compression throughput (MB/s) vs. compression factor for gzip, bzip2, 7z, and rzip, each with and without MC. rzip performs intra-file dedupe and compresses the remainder with bzip2.]
• MC improves both CF and compression throughput, by combining deduplication and re-organization.

Compression Factor vs. Compression Throughput
• Dataset: Exchange2
[Chart: the same compressors, with and without MC, on the Exchange2 dataset.]
• MC improves CF but slightly reduces compression throughput.

Migratory Compression – Summary
• More results in the paper
  – Decompression performance
  – Default vs. maximal compression
  – Memory vs. SSD vs. hard drives
  – Application of MC for archival storage
• Migratory Compression
  – Improves existing compressors, in both compressibility and, frequently, runtime
  – Redraws the performance-compression curve!
"Migratory Compression: Coarse-grained Data Reordering to Improve Compressibility"
Xing Lin, Guanlin Lu, Fred Douglis, Philip Shilane, and Grant Wallace
FAST '14: Proceedings of the 12th USENIX Conference on File and Storage Technologies, Feb. 2014

Outline → Improve deduplication by separating metadata from data

Deduplication
• Idea: identify duplicate data blocks and store a single copy
• Example: input v1.0 is chunked into A B C D E F G H; input v2.0 is chunked into A B C D E F' G H. Stored on disk: A B C D E F G H plus only F' for v2.0, eliminating nearly all of v2.0 on disk.

Expected Deduplication
• What if we have many versions?
• Dedup ratio = original_size / post_dedup_size
• With near-identical versions, v2.0 should yield ~2x, v3.0 ~3x, and so on.
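As a rough illustration of chunk-level deduplication and the dedup ratio (a sketch I have added, with fixed-size chunking standing in for whatever chunking a real system uses):

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: duplicate chunks are stored only once."""
    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}            # fingerprint -> chunk bytes
        self.original_bytes = 0

    def write(self, data: bytes):
        recipe = []
        for off in range(0, len(data), self.chunk_size):
            chunk = data[off:off + self.chunk_size]
            fp = hashlib.sha1(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)    # store only if unseen
            recipe.append(fp)
        self.original_bytes += len(data)
        return recipe                            # enough to reconstruct the input

    def dedup_ratio(self):
        stored = sum(len(c) for c in self.chunks.values())
        return self.original_bytes / stored if stored else 0.0

# Two nearly identical "versions" deduplicate to roughly 2x.
store = DedupStore()
v1 = b"".join(bytes([i]) * 4096 for i in range(8))      # 8 distinct chunks
v2 = v1[:4 * 4096] + b"x" * 4096 + v1[5 * 4096:]        # one chunk changed
store.write(v1); store.write(v2)
print(f"dedup ratio = {store.dedup_ratio():.1f}x")
```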
Great (Deduplication) Expectations
• In reality
  – 308 versions of the Linux source code: only 2x
  – Other examples of awful deduplication
• Contributions
  – Identify and categorize barriers to deduplication
  – Solutions
    • EMC Data Domain (industrial experience)
    • GNU tar (academic research)

Metadata Impacts Deduplication
• Case 1: metadata changes
  – The input is an aggregate of many small user files, interleaved with file metadata (File1 … File5, each preceded by its file metadata)
  – When only the metadata changes between v1.0 and v2.0, the chunks that contain it still differ, so they fail to deduplicate
  – Examples: GNU tar, EMC NetWorker, Oracle RMAN backup
  – Videos suffer from a similar problem ([Dewakar HotStorage15])
• Case 2: metadata location changes
  – The input is encoded in (fixed-size) blocks and metadata is inserted for each block (e.g., the tape format)
  – Data insertion/deletion leads to metadata shifts: the locations of the block markers move between v1.0 and v2.0, producing different chunks even where the data is unchanged
Solution: Separate Metadata from Data
• Three approaches:
  – Recommended: design deduplication-friendly formats
    • Case study: EMC NetWorker
  – Transparent: application-level post-processing
    • Case study: GNU tar
  – Last resort: format-aware deduplication
    • Case studies: 1) virtual tape library (VTL)  2) Oracle RMAN backup

Application-level Post-processing
• tar (tape archive)
  – Collects files into one archive file
  – File system archiving, source code distribution, …
• GNU tar format
  – A sequence of entries, one per file
  – For each file: a header block and data blocks
  – Header block: file path, size, modification time

Metadata Changes with GNU tar
• The file content is identical across releases:
  linux-2.6.39.3/README  SHA1 a735c31cef6d19d56de6824131527fdce04ead47
  linux-2.6.39.4/README  SHA1 a735c31cef6d19d56de6824131527fdce04ead47
• But the metadata differs (path and modification time):
  linux-2.6.39.3/README  -rw-rw-r-- 1 root root 17525 Jul 9 2011
  linux-2.6.39.4/README  -rw-rw-r-- 1 root root 17525 Aug 3 2011
• The modified header block changes the chunks that contain it: Chunk1 and Chunk2 in version 2 differ from version 1 even though the file data is unchanged.
• Result: only 2x deduplication for 308 versions of Linux.

Migratory tar (mtar)
• Migrate the header blocks so they are grouped together (recording the number of header blocks), leaving the data blocks contiguous so they deduplicate cleanly.
• Restore reverses the migration to recover a byte-identical tar archive.
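A minimal sketch of the mtar idea (my own, not the dissertation's implementation), modeled at the block level with tar parsing elided: the archive's entries are split into a header region and a data region plus per-entry counts, and the original block order is re-interleaved on restore.

```python
def to_mtar(entries):
    """entries: list of (header_block, [data_blocks]) tuples parsed from a tar stream.
    Group all header blocks together, then all data blocks, plus per-entry counts
    so the original interleaving can be restored."""
    headers = [h for h, _ in entries]
    data = [b for _, blocks in entries for b in blocks]
    counts = [len(blocks) for _, blocks in entries]
    return headers, data, counts

def from_mtar(headers, data, counts):
    """Re-interleave header and data blocks into the original tar block order."""
    out, pos = [], 0
    for h, n in zip(headers, counts):
        out.append(h)
        out.extend(data[pos:pos + n])
        pos += n
    return out

# With two archive versions that differ only in header blocks, the data regions
# produced by to_mtar are identical and deduplicate fully.
```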
mtar – Evaluation
• Datasets: 9 GNU software packages and the Linux kernel source code
• Many versions per dataset: 13 to 308
[Chart: deduplication ratio (X) for tar vs. mtar on each dataset (bash, bc, coreutils, fdisk, gcc, gdb, glibc, …, tar, Linux kernel), with per-dataset improvement factors labeled.]
• Improvements:
  1. Across all datasets
  2. Huge: 1.5x to 5.3x

More in the Paper
• Design deduplication-friendly formats: case study, EMC NetWorker
• Application-level post-processing: case study, GNU tar
• Format-aware deduplication: case studies, 1) virtual tape library 2) Oracle RMAN backup

Summary
• Metadata impacts deduplication: metadata changes more frequently, introducing many unnecessary unique chunks
• Solution: separate metadata from data, giving up to 5x improvements in deduplication
"Metadata Considered Harmful … to Deduplication"
Xing Lin, Fred Douglis, Jim Li, Xudong Li, Robert Ricci, Stephen Smaldone, and Grant Wallace
HotStorage '15: 7th USENIX Workshop on Hot Topics in Storage and File Systems

Outline → Using deduplication for efficient disk image deployment

Introduction
• Cloud providers and network testbeds maintain a large number of operating system images (disk images)
  – Amazon Web Services (AWS): 37,000+ images
  – Emulab: 1,020+
• Disk image: a snapshot of a disk's content
  – Several to hundreds of GBs
  – Similarity across disk images ([Jin, SYSTOR12])
• Requirement: efficient image storage

What is Image Deployment?
The process of distributing and installing a disk image on multiple compute nodes.
• An image server sends the image through a switch to the compute nodes.
• Requirements: efficient image transmission and installation.

Existing Systems: Frisbee and Venti
• Frisbee: an efficient image deployment system ([Hibler ATC03])
  – Stores and transfers compressed data, since the bandwidth between the image server and the switch is limited
  – Installs at full disk bandwidth on the compute nodes
  – But the server stores compressed data, and compression is slow
• Venti: a deduplicating storage system ([Quinlan FAST02])

Integrate Frisbee with Deduplicating Storage
• Frisbee provides efficient image deployment; a deduplicating storage system provides efficient image storage.
• Question: how can we achieve efficient image deployment and efficient image storage simultaneously, by integrating these two systems?
Requirements for the Integration
Requirement                                     Solution
Use compression                                 Compress each chunk
Use the filesystem to skip unallocated data     Aligned Fixed-size Chunking
Image installation at full disk bandwidth       Careful selection of chunk size
Independently installable Frisbee chunks        Pre-computation of Frisbee chunk headers

Use Compression – Store Frisbee Images
• The Frisbee image creation tool compresses the disk; the compressed Frisbee image is partitioned into chunks and stored in Venti; deployment fetches the chunks and distributes them.
• Compressed data does not deduplicate well.
• ✔ efficient image deployment   ✗ efficient image storage

Use Compression – Store Raw Chunks
• The raw disk is partitioned into chunks and stored in Venti; at deployment time, chunks are fetched and the Frisbee image creation tool compresses them before distribution.
• Frisbee compression becomes the bottleneck during deployment.
• ✗ efficient image deployment   ✔ efficient image storage

Use Compression – Store Compressed Chunks
• The disk is partitioned into chunks, each chunk is compressed, and the compressed chunks are stored in Venti; deployment fetches the compressed chunks and distributes them directly.
• No compression is needed during image deployment, and compressing two identical chunks yields the same compressed chunk, so deduplication still works.
• ✔ efficient image deployment   ✔ efficient image storage
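A small Python sketch of the third approach (mine, illustrative only): chunks are compressed individually and deduplicated by the fingerprint of the compressed chunk, so identical plaintext chunks still collapse to one stored copy, and deployment can ship compressed chunks without recompressing. The chunk size and compressor are arbitrary choices here, and the compressor is assumed to be deterministic.

```python
import hashlib, zlib

CHUNK_SIZE = 64 * 1024          # illustrative; the real system picks this carefully

def store_image(disk: bytes, store: dict):
    """Compress each chunk, dedup by fingerprint, return the image recipe."""
    recipe = []
    for off in range(0, len(disk), CHUNK_SIZE):
        comp = zlib.compress(disk[off:off + CHUNK_SIZE])  # identical chunks -> identical output
        fp = hashlib.sha1(comp).hexdigest()
        store.setdefault(fp, comp)                        # store each compressed chunk once
        recipe.append(fp)
    return recipe

def deploy_image(recipe, store: dict) -> bytes:
    """Deployment ships compressed chunks as-is; clients decompress and install."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)
```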
Efficient Image Storage
• Dataset: 430 Linux images; filesystem size: 21 TB (2 TB allocated)
Format                 Size (GB)
Frisbee images         233
Venti, 32 KB chunks    75
• 3x reduction

Efficient Image Deployment
[Chart: end-to-end image deployment time (seconds) for baseline Frisbee vs. Venti-based Frisbee with 1, 8, and 16 clients.]
• Similar performance in image deployment.

More in the Paper
• Aligned Fixed-size Chunking (AFC)
• Image retrieval time as a function of chunk size
  – Smaller chunk size: better deduplication
  – Larger chunk size: better image deployment performance
• Image deployment performance with staging nodes

Summary
• Compress each chunk; Aligned Fixed-size Chunking; careful selection of chunk size; pre-computation of Frisbee chunk headers
• ✔ efficient image deployment   ✔ efficient image storage
"Using Deduplicating Storage for Efficient Disk Image Deployment"
Xing Lin, Mike Hibler, Eric Eide, and Robert Ricci
TridentCom '15: 10th International Conference on Testbeds and Research Infrastructures for the Development of Networks & Communities

Outline → Performance predictability and efficiency for cloud storage systems

Introduction
• Cloud computing is becoming popular
  – Amazon Web Services (AWS), Microsoft Azure, …
• Virtualization enables efficient resource sharing
  – Multiplexing and consolidation; economies of scale
• Sharing introduces interference
  – Random and sequential workloads [Gulati VPACT '09]
  – Read and write workloads [Ahmad WWC '03]
• Target environment: a large number of mixed workloads

Random Workloads' Impact on Disk Bandwidth
• Workloads: all sequential vs. 1 sequential plus an increasing number of random workloads
• High disk bandwidth = high disk utilization
[Chart: aggregate bandwidth (MB/s) vs. number of co-located workloads for the two cases (SR, RR); once random workloads are added, aggregate bandwidth collapses to around 10 MB/s.]
• Random workloads are harmful for disk bandwidth.
Differential I/O Scheduling (DIOS)
• Opportunity: replication is commonly used in cloud storage systems
  – Google File System, Ceph, Sheepdog, etc.
• Idea: dedicate each disk to serving one type of request
• System implications (tradeoff)
  – High bandwidth for disks serving sequential requests
  – Lower performance for random workloads

Architecture
• With two-way replication, File(a) (blocks 1, 2, 3) and File(b) (blocks A, B, C) each have a replica on Disk1 (primary) and on Disk2 (secondary).
• DIOS directs random accesses to one disk and sequential accesses to the other, so the disk serving sequential requests sustains high bandwidth.

Implementation
• Ceph: a distributed storage system that implements replication
  – Uses randomness in selecting replicas
• DIOS: modified Ceph with deterministic replica selection and request-type-aware scheduling
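As a rough sketch of the scheduling idea (my own simplification; the real implementation modifies Ceph's replica selection), requests are classified as sequential or random and each class is pinned to a different replica's disk:

```python
from dataclasses import dataclass

@dataclass
class Request:
    obj: str
    offset: int
    length: int

class DIOSRouter:
    """Route each request class to a fixed replica so one disk stays sequential."""
    def __init__(self, replicas=("disk1", "disk2"), seq_threshold=2):
        self.replicas = replicas
        self.seq_threshold = seq_threshold
        self.last_offset = {}   # obj -> end offset of the previous request
        self.streak = {}        # obj -> consecutive contiguous requests seen

    def classify(self, req: Request) -> str:
        prev_end = self.last_offset.get(req.obj)
        if prev_end is not None and req.offset == prev_end:
            self.streak[req.obj] = self.streak.get(req.obj, 0) + 1
        else:
            self.streak[req.obj] = 0
        self.last_offset[req.obj] = req.offset + req.length
        return "sequential" if self.streak[req.obj] >= self.seq_threshold else "random"

    def route(self, req: Request) -> str:
        # Deterministic mapping: sequential requests always go to one replica,
        # random requests to the other, instead of Ceph-style random selection.
        return self.replicas[0] if self.classify(req) == "sequential" else self.replicas[1]
```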
Evaluation – Sequential Workloads
• Workload: 20 sequential workloads, mixed with different numbers of random workloads (0, 10, 20, 40). Total data disks: 6.
[Chart: aggregate bandwidth (MB/s) for Ceph vs. DIOS with 0, 10, 20, and 40 random workloads; DIOS improves aggregate bandwidth by 2.5x to 6x once random workloads are mixed in.]
• DIOS: consistent performance and higher bandwidth (disk utilization).

Evaluation – Random Workloads
[Chart: aggregate IOPS for Ceph vs. DIOS with 10, 20, and 40 random workloads.]
• Under DIOS, half of the disks serve the random workloads, leaving high disk utilization for the other half.

Summary
• Dedicating a disk to serving a single type of read request is more efficient and leads to more predictable performance.
• Future directions
  – Write workloads
  – Extension to 3-way replication (load balance)
"Towards Fair Sharing of Block Storage in a Multi-tenant Cloud"
Xing Lin, Yun Mao, Feifei Li, and Robert Ricci
HotCloud '12: 4th USENIX Workshop on Hot Topics in Cloud Computing, June 2012

Outline → Conclusion

Conclusion
• Problems
  – Limitations in traditional compression and deduplication
  – Performance interference in cloud storage
• Use similarity in content and access patterns
  – Migratory Compression: find similarity at the block level and reorganize
  – Deduplication: store the same type of data blocks together
  – Differential IO scheduling: schedule the same type of requests to the same disk

Thesis Statement
Similarity in content and access patterns can be utilized to improve space efficiency, by storing similar data together, and to improve performance predictability and efficiency, by scheduling similar IO requests to the same hard drive.

Acknowledgements
• Advisor, for always supporting my work
  – Robert Ricci (HotCloud12, HotStorage15, TridentCom15)
• Committee members, for providing feedback
  – Rajeev Balasubramonian (WDDD11)
  – Fred Douglis (FAST14, HotStorage15)
  – Feifei Li (HotCloud12)
  – Jacobus Van der Merwe (SOCC15)
• Flux members
  – Eric Eide, Mike Hibler, David Johnson, Gary Wong, …
• EMC/Data Domain colleagues
  – Philip Shilane, Stephen Smaldone, Grant Wallace, …
• Family and friends, for their kind support

Thank You!
Backup Slides

VF – Related Work
• Deduplication analysis for virtual machine images [Jin SYSTOR '09]
  – Benefit: 70–80% of blocks are duplicates
  – Only analysis
• LiveDFS: a deduplicating filesystem for storing disk images for OpenStack [Ng Middleware '11]
  – Focuses on spatial locality and metadata prefetching
  – Scales linearly, and only 4 VMs; no compression
• Emustore: an early attempt at this project [Pullakandam, Master's Thesis 2011]
  – Compression during image deployment
  – Used regular fixed-size chunking

Disk Image Deployment Pipeline
• Compression factor is 3.18x for this image; available network bandwidth: 500 Mb/s (60 MB/s)
• Effective throughput for uncompressed data (MB/s) at each stage:
  – Read compressed image from server disk: 480 (150)
  – Transfer compressed image across network: 172.84 (54.26)
  – Decompress compressed image: 149.10
  – Write image to disk: 71.07
  – Compression: 29.68

Chunk Boundary Shift Problem
• Image A contains blocks 1 2 3 4 5 6 7 8 …; in image A', blocks 1 and 2 are replaced by a b c, shifting everything that follows in the data stream.
• With regular fixed-size (4-block) chunking over the stream, the chunk boundaries in A' no longer line up with those in A, so the shared blocks 3–8 end up in different chunks and fail to deduplicate.

Aligned Fixed-size Chunking (AFC)
• Chunk boundaries are aligned to fixed positions in the disk's block-address space (block IDs), rather than to offsets within the stream of allocated data.
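A hypothetical sketch of aligned fixed-size chunking over a sparse disk image, based on my reading of these slides (the real tool also precomputes Frisbee chunk headers and uses filesystem metadata to skip unallocated ranges): chunk boundaries fall at fixed multiples of the chunk size in disk-block-address space, so allocated data that stays at the same addresses always lands in the same chunk.

```python
import hashlib

CHUNK_BLOCKS = 4                  # illustrative chunk size: 4 disk blocks

def afc_chunks(allocated: dict):
    """allocated: {disk_block_id: block bytes} for allocated blocks only.
    Group blocks by their aligned chunk index in disk-address space and
    fingerprint each chunk over (block id, data)."""
    chunks = {}
    for block_id, data in sorted(allocated.items()):
        chunk_idx = block_id // CHUNK_BLOCKS          # boundary aligned to the address space
        chunks.setdefault(chunk_idx, []).append((block_id, data))
    return {idx: hashlib.sha1(b"".join(bid.to_bytes(8, "big") + d
                                       for bid, d in blocks)).hexdigest()
            for idx, blocks in chunks.items()}

# Two images that differ only in a few blocks share all other chunk fingerprints,
# because unchanged blocks keep both their addresses and their chunk boundaries.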
In the example above, with aligned fixed-size (4-block) chunks, the chunks of image A and image A' that cover the unchanged blocks 3–8 are identical, so they deduplicate despite the inserted data.

AFC – Evaluation
[Chart: chunking time (s) vs. deduplication ratio (X) for variable-size chunking (disk and stream), fixed-size chunking (disk and stream), and AFC.]