Memory-efficient Data Management
Policy for Flash-based Key-Value Store
Wang Jiangtao
2013-4-12
Outline
• Introduction
• Related work
• Two works
– BloomStore[MSST2012]
– TBF[ICDE2013]
• Summary
Key-Value Store
• KV store efficiently supports simple operations: Key
lookup & KV pair insertion
– Online Multi-player Gaming
– Data deduplication
– Internet services
Overview of Key-Value Store
• KV store system should provide high access
throughput (> 10,000 key lookups/sec)
• Replaces traditional relational DBs thanks to its
superior scalability & performance
– KV stores are preferred for their simplicity and better
scalability
• Popular management (index + storage) solution
for large volumes of records
– often implemented through an index structure,
mapping Key-> Value
Challenge
• To meet high throughput demand, the
performance of index access and KV pair (data)
access is critical
– index access : search the KV pair associated with a given
“key”
– KV pair access: get/put the actual KV pair
• Available memory space limits the maximum
number of stored KV pairs
• Using in-RAM index structure can only address
index access performance demand
DRAM must be Used Efficiently
• 1 TB of data
• 4 bytes of DRAM per key-value pair
[Figure: index size (GB) versus per key-value pair size (bytes), both axes logarithmic]
– 32 B (data deduplication) => 125 GB!
– 168 B (tweet) => 24 GB
– 1 KB (small image) => 4 GB
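The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 KB = 1000 bytes); the function name `index_size_gb` is only illustrative.

```python
# Values from the slide: 1 TB of data, 4 bytes of DRAM per key-value pair.
# Decimal units are an assumption of this sketch.
TOTAL_DATA_BYTES = 10**12
DRAM_BYTES_PER_PAIR = 4


def index_size_gb(pair_size_bytes: int) -> float:
    """DRAM needed for the in-RAM index, in GB, for a given KV pair size."""
    num_pairs = TOTAL_DATA_BYTES / pair_size_bytes
    return num_pairs * DRAM_BYTES_PER_PAIR / 10**9


for label, size in [("data deduplication", 32), ("tweet", 168), ("small image", 1000)]:
    print(f"{size:>5} B ({label}): {index_size_gb(size):.0f} GB")
```

The smaller the pair, the more pairs fit in 1 TB, so the index grows inversely with the pair size.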
Existing Approach to Speed up Index &
KV pair Accesses
• Maintain the index structure in RAM to map
each key to its KV pair on SSD
– RAM size cannot scale up linearly with flash size
• Keep the minimum index structure in RAM,
while storing the rest of the index structure in
SSD
– On-flash index structure should be designed
carefully
  ▪ Space is precious
  ▪ Random writes are slow and bad for flash life (wear out)
Bloom Filter
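• A Bloom filter uses a bit array to represent a set and to test whether
an element belongs to that set. Initially, every bit of the m-bit array is
set to 0. The Bloom filter uses k mutually independent hash functions,
each of which maps every element of the set into the range {1,…,m}.
For any element x, the position hi(x) given by the i-th hash function
is set to 1 (1≤i≤k). Note that if a position is set to 1 several times, only
the first time has any effect; the later ones change nothing.
• Error rate (false positives)
• Choosing Bloom filter parameters
– Number of hash functions k, bit array size m, number of elements n
– Reducing the error rate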
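A minimal sketch of the structure described above, together with the standard approximation of the false positive rate, (1 - e^(-kn/m))^k, and the k that minimizes it, (m/n) ln 2. Deriving the k hash functions by salting SHA-256 is an assumption of this example, and positions run from 0 to m-1 rather than 1 to m.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m = m              # size of the bit array
        self.k = k              # number of hash functions
        self.bits = [0] * m     # all bits start at 0

    def _positions(self, item: str):
        # Illustrative choice: salt SHA-256 with the hash index i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = 1    # setting an already-set bit changes nothing

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[p] for p in self._positions(item))


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard approximation after inserting n elements."""
    return (1 - math.exp(-k * n / m)) ** k


def optimal_k(m: int, n: int) -> int:
    """Number of hash functions that minimizes the approximation above."""
    return max(1, round(m / n * math.log(2)))
```

For a fixed n, a larger m lowers the error rate, which is why the RAM budget per key directly bounds the achievable accuracy.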
FlashStore[VLDB2010]
• Flash as a cache
• Components
– Write buffer
– Read cache
– Recency bit vector
– Disk-presence bloom filter
– Hash table index
• Cons
– 6 bytes of RAM per
key-value pair
SkimpyStash[SIGMOD2011]
• Components
– Write buffer
– Hash table
  ▪ Bloom filter
  ▪ Using linked list
  ▪ A pointer to the beginning of the
linked list on flash
• Storing the linked lists on flash
– Each pair has a pointer to
earlier keys in the log
• Cons
– Multiple flash page reads for a
key lookup
– High garbage collection cost
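The lookup cost listed under "Cons" can be illustrated with a toy model: each in-RAM bucket holds only a pointer to the newest record of its chain, and every hop along the chain counts as one flash read. This is a simplified sketch, not the SkimpyStash implementation; the class `ChainedLogStore`, the toy hash, and the Python list standing in for the flash log are all assumptions, and the per-bucket Bloom filter is omitted.

```python
class ChainedLogStore:
    def __init__(self, num_buckets: int = 4):
        self.log = []                       # stands in for the on-flash log
        self.heads = [None] * num_buckets   # in RAM: one pointer per bucket
        self.flash_reads = 0                # counts simulated flash reads

    def _bucket(self, key: str) -> int:
        return sum(key.encode()) % len(self.heads)   # toy deterministic hash

    def put(self, key: str, value) -> None:
        b = self._bucket(key)
        # The new record points back to the previous head of its chain.
        self.log.append((key, value, self.heads[b]))
        self.heads[b] = len(self.log) - 1

    def get(self, key: str):
        offset = self.heads[self._bucket(key)]
        while offset is not None:
            self.flash_reads += 1           # one read per chain hop
            k, v, prev = self.log[offset]
            if k == key:
                return v
            offset = prev
        return None


store = ChainedLogStore(num_buckets=1)      # force all keys into one chain
for name in ["a", "b", "c"]:
    store.put(name, name.upper())
value = store.get("a")                      # oldest key: walks the whole chain
```

Here the oldest key costs three simulated reads, one per record in the chain.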
MSST2012
Introduction
• Key lookup throughput is the bottleneck for data
applications
• Keep an in-RAM large-sized hash table
• Move index structure to secondary storage(SSD)
– Expensive random write
– High garbage collection cost
– Bigger storage space
BloomStore
• BloomStore Design
– An extremely low amortized RAM overhead
– Provide high key lookup/insertion throughput
• Components
– KV pair write buffer
– Active bloom filter
  ▪ A flash page for the write
buffer
– Bloom filter chain
  ▪ Many flash pages
– Key-range partition
  ▪ A flash “block”
BloomStore architecture
KV Store Operations
• Key Lookup
– Active Bloom filter
– Bloom filter chain
– Lookup cost
Parallel lookup
• Key Lookup
– Read the entire BF chain
– Bit-wise AND resultant row
– High read throughput
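[Figure: Bloom filters in parallel. The rows selected by the hash values of ei (labeled h1(ei) in the figure) are combined with a bit-wise AND; a set bit in the result means ei is found]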
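One way to read the parallel lookup: store the chain transposed, so that row p holds bit p of every Bloom filter in the chain. A lookup then reads one row per hash function and ANDs them; each set bit in the resultant row names a filter that may contain the key. The sketch below follows that reading only. The class `ParallelBFChain`, the constants `M` and `K`, and the SHA-256 based hashing are assumptions of this example.

```python
import hashlib

M = 64   # bits per Bloom filter (assumed for the example)
K = 3    # number of hash functions (assumed for the example)


def positions(key: str):
    """K bit positions for a key, derived by salting SHA-256."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % M
        for i in range(K)
    ]


class ParallelBFChain:
    """rows[p] is an int whose bit j equals bit p of the j-th Bloom filter."""

    def __init__(self, chain_length: int):
        self.chain_length = chain_length
        self.rows = [0] * M

    def add(self, bf_index: int, key: str) -> None:
        for p in positions(key):
            self.rows[p] |= 1 << bf_index

    def candidates(self, key: str):
        result = (1 << self.chain_length) - 1   # start with all filters
        for p in positions(key):
            result &= self.rows[p]              # bit-wise AND of the rows
        return [j for j in range(self.chain_length) if (result >> j) & 1]


chain = ParallelBFChain(chain_length=4)
chain.add(2, "k1")
hits = chain.candidates("k1")
```

All filters are tested in one pass over K rows instead of probing each filter in the chain separately.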
KV Store Operations
• KV pair Insertion
• KV pair Update
– Append a new key-value pair
• KV pair Deletion
– Insert a null value for the key
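The three operations above all reduce to appends, which avoids in-place writes on flash. A minimal in-memory sketch, assuming a Python list as the log and `None` as the null value; the class `AppendOnlyKV` is hypothetical:

```python
TOMBSTONE = None   # assumed representation of the "null value"


class AppendOnlyKV:
    def __init__(self):
        self.log = []                     # newest records are at the end

    def insert(self, key, value) -> None:
        self.log.append((key, value))

    def update(self, key, value) -> None:
        self.insert(key, value)           # an update is just a newer append

    def delete(self, key) -> None:
        self.log.append((key, TOMBSTONE)) # a delete appends a null value

    def lookup(self, key):
        # Scan from newest to oldest so the latest version wins.
        for k, v in reversed(self.log):
            if k == key:
                return v
        return None


kv = AppendOnlyKV()
kv.insert("user", "v1")
kv.update("user", "v2")
latest = kv.lookup("user")
kv.delete("user")
after_delete = kv.lookup("user")
```

Older versions stay in the log until garbage collection reclaims them.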
Experimental Evaluation
• Experiment setup
– 1 TB SSD (PCIe) / 32 GB (SATA)
• Workload
Experimental Evaluation
• Effectiveness of prefilter
– Per KV pair is 1.2 bytes
• Linux Workload
• Vx Workload
Experimental Evaluation
• Lookup Throughput
– Linux Workload
  ▪ H = 96 (BF chain length)
  ▪ m = 128 (the size of a BF)
– Vx Workload
  ▪ H = 96 (BF chain length)
  ▪ m = 64 (the size of a BF)
  ▪ A prefilter
ICDE2013
Motivation
• Using flash as an extension cache is cost-effective
• The desired size of RAM-cache is too large
– Caching policy is memory-efficient
• Replacement algorithm achieves comparable
performance with existing policies
• Caching policy is agnostic to the organization of
data on SSD
Defects of the existing policy
• Recency-based caching algorithm
– Clock or LRU
– Access data structure and index
System view
• DRAM buffer
– An in-memory data structure to
maintain access information (BF)
– No special index to locate key-value pairs
• Key-value store
– Provides an iterator operation for
traversal
– Write through
Key-Value cache prototype architecture
Bloom Filter with deletion(BFD)
• BFD
– Removing a key from SSD
– A bloom filter with deletion
– Resetting the bits at the positions given by
a subset of the hash functions
[Figure: bit array with X1 present, and after X1 is deleted]
X1:        0 1 0 0 1 0 1 0 1 0 1 0
Delete X1: 0 1 0 0 0 0 1 0 0 0 1 0
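A small sketch of deletion by resetting bits, assuming the bits of the first `reset_count` hash functions are cleared. The class name, the `reset_count` parameter, and the SHA-256 based hashing are hypothetical; the slides do not say which subset is chosen.

```python
import hashlib


class BloomFilterWithDeletion:
    def __init__(self, m: int = 1024, k: int = 3, reset_count: int = 1):
        self.m = m
        self.k = k
        self.reset_count = reset_count   # assumed size of the reset subset
        self.bits = [0] * m

    def _positions(self, key: str):
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p] = 1

    def __contains__(self, key: str) -> bool:
        return all(self.bits[p] for p in self._positions(key))

    def delete(self, key: str) -> None:
        # Clear only a subset of the key's bits. Any other key that shares
        # a cleared bit now tests negative as well (a false negative).
        for p in self._positions(key)[: self.reset_count]:
            self.bits[p] = 0


bfd = BloomFilterWithDeletion()
bfd.add("x1")
present_before = "x1" in bfd
bfd.delete("x1")
present_after = "x1" in bfd
```

Clearing fewer bits lowers the chance of hitting a bit shared with another key, but any cleared shared bit still produces a false negative for that key.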
Bloom Filter with deletion(BFD)
• Flow chart
• Tracking recency information
• Cons
– False positives: pollute the cache
– False negatives: poor hit ratio
Two Bloom sub-Filters(TBF)
• Flow chart
• Dropping many elements in bulk
• Flip the filter periodically
• Cons
– Keeping rarely-accessed objects: pollutes the cache
– Traversal length per eviction
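A sketch of the two sub-filter idea as read from the bullets above: accesses are marked in a current filter, a key counts as recently used if either sub-filter reports it, and a flip discards the older sub-filter in bulk. The class `TwoSubFilters`, its method names, and the hashing scheme are assumptions, and when to flip is left to the caller.

```python
import hashlib


class TwoSubFilters:
    def __init__(self, m: int = 1024, k: int = 3):
        self.m = m
        self.k = k
        self.current = [0] * m
        self.previous = [0] * m

    def _positions(self, key: str):
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def mark(self, key: str) -> None:
        """Record an access in the current sub-filter."""
        for p in self._positions(key):
            self.current[p] = 1

    def is_marked(self, key: str) -> bool:
        """A key is recent if either sub-filter reports it."""
        pos = self._positions(key)
        return all(self.current[p] for p in pos) or all(self.previous[p] for p in pos)

    def flip(self) -> None:
        """Drop the older sub-filter in bulk and start a fresh current one."""
        self.previous = self.current
        self.current = [0] * self.m


tbf = TwoSubFilters()
tbf.mark("obj1")
tbf.flip()
still_marked = tbf.is_marked("obj1")   # survives one flip via `previous`
tbf.flip()
marked_after_two = tbf.is_marked("obj1")
```

A key that is not accessed again is forgotten after two flips, with no per-key deletion and therefore no false negatives caused by bit resets.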
Traversal cost
• Key-Value Store Traversal
– Unmarked on insertion
– Marked on insertion
  ▪ Longer stretches of marked
objects
  ▪ False positives
Evaluation
• Experiment setup
– Two 1 TB 7200 RPM SATA
disks in RAID-0
– 80 GB Fusion ioDrive, PCIe
x4
– A mixture of 95% read
operations and 5% updates
– Key-value pairs: 200
million (256 B)
• Bloom filter
– 4 bits per marked object
– A byte per object in TBF
– Hash functions: 3
Summary
• KV store is particularly suitable for some
special applications
• Flash will improve the performance of KV
store due to its faster access
• Some index structures need to be redesigned to
minimize RAM usage
• Don’t just treat flash as a disk replacement
Thank You!