MonetDB:
A column-oriented DBMS
Ryan Johnson
CSC2531
The memory wall has arrived
• CPU performance: +70%/year
• Memory performance
– latency: -50%/decade
– bandwidth: +20%/year (est.)
• Why?
– DRAM focus on capacity (+70%/year)
– Physical limitations (pin counts, etc.)
– Assumption that caches “solve” the latency problem
DBMS spends 95% of time waiting for memory
The problem: data layouts
• Logical layout: 2-D relation
=> Unrealizable in linear address space!
• N-ary storage layout, aka “slotted pages”
...
– Easy row updates, strided access to columns
=> Low cache locality for read-intensive workloads (see the sketch below)
“NSM layouts considered harmful”
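To make the locality argument concrete, here is a minimal C sketch (the record fields and sizes are illustrative, not from any particular system): scanning one column of an n-ary record drags the whole record width through the cache, while the decomposed layout reads a dense array.

```c
#include <stddef.h>

/* N-ary (row) storage: each record carries every column. Scanning one
 * column strides over the full record width, so most of each fetched
 * cache line is wasted. */
struct nsm_record {
    int    id;
    int    age;        /* the column we actually want   */
    char   name[48];   /* payload dragged in anyway     */
    double balance;
};

long sum_age_nsm(const struct nsm_record *recs, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += recs[i].age;          /* ~1 useful int per 64 B line */
    return sum;
}

/* Decomposed (column) storage: the same scan reads a dense array, so
 * every byte of every cache line is useful. */
long sum_age_dsm(const int *age, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += age[i];               /* 16 useful ints per 64 B line */
    return sum;
}
```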
Coping with The Wall
• Innovation: decompose all data vertically
– Columns stored separately, rejoined at runtime
• Binary Association Table (BAT) replaces Relation
– List of (recordID, columnValue) pairs (sketched below)
– Compression and other tricks => 1 byte/entry
BAT + clever algos => cache locality => Winner!
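A minimal C sketch of the BAT idea, with illustrative names rather than MonetDB's actual internals: each column becomes its own head/tail pair, and a "row" is reconstructed by pairing values that share an OID across several BATs.

```c
#include <stddef.h>

typedef unsigned int oid_t;   /* record identifier (OID) */

/* One Binary Association Table: head holds record IDs, tail holds the
 * values of a single column. A k-column relation becomes k BATs that
 * are re-joined on OID at query time. */
struct bat_int {
    oid_t *head;   /* record IDs, often simply 0..count-1 */
    int   *tail;   /* column values                       */
    size_t count;
};

/* Reconstruct one row of a two-column relation by pairing the values
 * stored at the same position/OID in both BATs. */
void fetch_row(const struct bat_int *age, const struct bat_int *salary,
               size_t i, int *age_out, int *salary_out) {
    *age_out    = age->tail[i];
    *salary_out = salary->tail[i];
}
```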
Exploring deeper
• Performance study (motivation)
• Physical data layouts
• Cache-optimized algorithms
• Evaluating MonetDB performance
• Implications and lingering questions
NSM: access latency over time
[Graph: read one column; record size varies with x]
Latency increases ~10x as accesses/cache line → 1
(slope changes at the L1/L2 line size)
Efficient physical BAT layout
• Idea #1: “virtual OID”
– Optimizes common case
– Dense, monotonic OIDs
– All BATs sorted by OID
=> Joining two BATs on OID has O(n) cost! (see the sketch below)
How to handle gaps?
• Idea #2: compression
– Exploits small domains
– Boosts cache locality, effective mem BW
Out-of-band values? Can’t we compress NSM also?
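A hedged C sketch of both ideas (names and layout are illustrative assumptions, not MonetDB's code): with dense, monotonic virtual OIDs the head column need not be stored at all, so an OID join degenerates into a positional walk over two tails; and small domains can be dictionary-encoded down to roughly a byte per entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Idea #1: virtual OID. If OIDs are dense 0..n-1 and every BAT is
 * sorted by OID, the head column is implicit and joining two BATs on
 * OID is just iterating both tails in lockstep: O(n), no probing. */
long long join_on_virtual_oid(const int *age, const int *salary, size_t n) {
    long long sum = 0;
    for (size_t i = 0; i < n; i++)   /* position i == OID i */
        sum += (long long)age[i] * salary[i];
    return sum;
}

/* Idea #2: compression via a dictionary for small domains. Each tail
 * entry shrinks to one byte; the dictionary decodes it on access. */
struct dict_col {
    uint8_t *codes;      /* 1 byte per entry     */
    int      dict[256];  /* code -> actual value */
    size_t   count;
};

int decode(const struct dict_col *c, size_t oid) {
    return c->dict[c->codes[oid]];
}
```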
Cache-friendly hash join
• Hash partitioning: one pass, but trashes L1/L2
– #clusters > #cache lines
(Recall: CPU is cheap compared to memory access)
• Radix-partitioning: limit active #partitions by making more passes (sketched below)
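A sketch of one radix pass in C, under the assumption of single-word integer keys (illustrative, not MonetDB's implementation): each pass partitions on only a few bits, so the number of active output partitions stays within what the cache and TLB can hold; applying the pass recursively inside each partition yields the full cluster count.

```c
#include <stdlib.h>

/* One radix pass: partition n keys on `bits` bits starting at bit
 * offset `shift`. Keeping `bits` small (e.g. 6-8) means only 2^bits
 * output regions are active at once, so their cache lines and TLB
 * entries stay resident; recurse per partition for more fan-out. */
void radix_pass(const unsigned *src, unsigned *dst, size_t n,
                int shift, int bits) {
    size_t fanout = (size_t)1 << bits;
    unsigned mask = (unsigned)(fanout - 1);
    size_t *count = calloc(fanout, sizeof *count);
    size_t *pos   = calloc(fanout, sizeof *pos);

    for (size_t i = 0; i < n; i++)                    /* histogram   */
        count[(src[i] >> shift) & mask]++;
    for (size_t p = 1; p < fanout; p++)               /* prefix sums */
        pos[p] = pos[p - 1] + count[p - 1];
    for (size_t i = 0; i < n; i++)                    /* scatter     */
        dst[pos[(src[i] >> shift) & mask]++] = src[i];

    free(count);
    free(pos);
}
```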
Great, but how well does it work?
• Three metrics of interest
– L1/L2 misses (= suffer latency of memory access)
– TLB misses (even more expensive than cache miss)
– Query throughput (higher is better)
• Should be able to explain throughput using
other metrics
– The given model makes very good predictions (one possible shape is sketched below)
=> Memory really is (and remains!) the bottleneck
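One plausible reading of the "model" claim, stated here as an assumption about its shape rather than the paper's exact formula: predicted running time ≈ pure CPU work + (#L1 misses × L2 latency) + (#L2 misses × memory latency) + (#TLB misses × TLB miss penalty). If measured miss counts plugged into something of this form track measured throughput closely, memory access is indeed what dominates.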
A few graphs
[Graphs: radix clustering behavior as cardinality varies; radix-clustered hash join vs. other algorithms]
• Big win: stability as cardinalities vary
Implications and discussion points
• Cache-friendly really matters (even w/ I/O)
– Traditional DBMS memory-bound
• Vertically decomposed data: superior density
– Data brought to cache only if actually needed
– Compression gives further density boost
• Questions to consider...
– Queries accessing many columns?
– What about inserts/updates (touch many BATs)?
– What about deletes/inserts (bad for compression)?
– How to make a good query optimizer?
– Performance of transactional workloads?
• Update-intensive, concurrency control, ...