Cost-effective Hybrid Storages Flash Group, Cao Qingling Motivation • Dry up things with the least money! Motivation • High cost, low density, low reliability. • Replacement as HDD is not recommended, especially when the data volume is very large. • Cache as HDD, make up the gap between RAM and HDD. Hurt the lifetime. • A permanent store at the same level as HDD, store some special data. Introduction • Put forward a Hybrid Hbase with SSD. • Storing system component of Hbase in SSD, which at the same level as HDD. • Perform quantitative assessment, Hybrid Hbase perform 1.5-2 times better. HBase • Column-based key-value store. • Each region server has a write-ahead log(WAL). • First write WAL and then in-memory memstore. • Region is a horizontal division. • A region could split. Data on disks is stored as Log-structured merge(LSM) trees. HBase System Component Zookeeper: • Clients contact it for -ROOT- table. • Master contacts it to know available region servers. • Region servers contact with it in a heartbeat keepalive mechanism. • Zookeeper is I/O intensive. Catalog Tables: • -ROOT- and .META. Tables. • Mostly read intensive and are not updated frequently. HBase System Component Write-ahead-log(WAL): • Any write is first done on the WAL. • The size grows with: i) WAL committed; ii) write rate; iii) the size of key-value pair. Temporary Storage: • Used when a region is split or merged. • Sequentially read or write. Assessment Price: 1:10 1% of the database size. Gain more than 10% performance. Experimental Evaluation • Experiment: Intel processor(4 cores and 4 threads at 3 GHz) with 8 GB RAM, Western Digital 1TB HDD, Kingston 128 GB SSD. • Yahoo! Cloud Serving Benching(YCSB). • Workloads: 100w queries on database with 6000w records. Record size is 1KB. Totally 72 regions. Experimental Evaluation Experimental Evaluation Introduction • Approximate membership query data structure(AMQ). Bloom Filter. • Larger than RAM, performance decays. • Quotient Filter:better data locality, squential operations, available delete, dynamically resized, space-saving. • Buffered Quotient Filter(BQF) and Cascade Filter(CF) designed for flash. Introduction • Approximate membership query data structure(AMQ). Bloom Filter. • Larger than RAM, performance decays. • Quotient Filter:better data locality, squential operations, available delete, dynamically resized, space-saving. • Buffered Quotient Filter(BQF) and Cascade Filter(CF) designed for flash. Quotient Filter Fingerprint: f = fq2r + fr. • fr = f mod 2r • fq = • T[fq] = fr Quotient Filter run Physical Storage • is_occupied: check if fq = i, namely if T[i] has data. • is_shifted: if fr belongs to slot i. • is_continuation: if blongs to the same run with i-1. Quotient Filter • Check if f in the QA: step1: step2: to the beginning of the cluster. step3: to the start of the run. step4: search f. • Insert a f. • Delete a f. Quotient Filters on Flash • Buffered Quotient Filter - BQF: one QF as the buffer, another on SSD. - Optimized for lookup performance. • Cascade Filter - Optimized for insertion. - Offer a trade off between lookup and insertion. Quotient Filter on Flash • Cascade Filter - Based on cache-oblivious lookahead arrary(COLA). Evaluation Conclusions • Bloom Filter has wide use in key-value storage. • Change the way of thinking. • Gain inspiration from traditional algorithms of database. • Design corresponding hybrid system by applications. Bloom Filter • Initial state: • Insert: H(1), H(b). • Can not expend, support no delete, poor data locality.