OLTP on the NVM SDV: YMMV - H

OLTP on NVM: YMMV
@andy_pavlo
The Last Six Months
PDL Retreat
October 2013
?
PDL Visit Day
May 2014
Prison Life
• GOOD
– Washing Dishes
– Not Fighting
• EVIL
– Cafeteria Thievery
– Shankings
NVM OLTP
• DRAM
– Lightweight CC
– Logical
• SSD/HDD
– Heavyweight CC
– ARIES
Overview
• Understand the
performance
characteristics of NVM to
develop an optimal DBMS
architecture for OLTP
workloads.
OLTP Workloads
• Transactions have three
defining characteristics:
– Short-lived.
– Small footprint.
– Repetitive.
Intel NVM Emulator
• Instrumented motherboard
that slows down access to
the memory controller.
• Two execution interfaces:
– NUMA (NVM-only)
– PMFS (DRAM+NVM)
NUMA Interface – NVM-Only
• Virtual CPU where all
memory access uses the
NVM portion of DRAM.
• No change to application
code.
PMFS Interface – DRAM+NVM
• Special filesystem
designed for byte-addressable NVM.
• Avoids overhead of
traditional filesystems.
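A minimal sketch of what byte-addressable access through a filesystem can look like, assuming files on the NVM filesystem can be memory-mapped. The helper `open_region` and the file name are hypothetical, and a temporary file stands in for a file on a PMFS mount so the example runs anywhere.

```python
import mmap
import os
import tempfile

def open_region(path, size):
    """Map `size` bytes of the file at `path` into the address space."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        os.ftruncate(fd, size)
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed

# Hypothetical stand-in for a file on a PMFS mount point.
path = os.path.join(tempfile.mkdtemp(), "region.bin")
region = open_region(path, 4096)

region[128:133] = b"hello"   # byte-granular store, no read()/write() calls
region.flush()               # ask the OS to write the update back to the file
stored = bytes(region[128:133])
```

The point of the sketch is the access pattern: the application updates individual bytes in place instead of reading and writing whole blocks through the usual file I/O path.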
DBMS Architectures
• Disk-oriented.
• Main memory-oriented.
Disk-oriented DBMS
• Pessimistic assumption
that the data a txn needs
is not in memory.
• Based on the design
assumptions made in the
1970s.
– Ingres (Berkeley)
[Diagram: Application, Buffer Pool, Primary Storage, ARIES Log]
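The pessimistic assumption can be sketched as a buffer pool that sits between the application and primary storage. This is an illustrative toy, not the design of any evaluated system: the class `BufferPool`, the LRU policy, and the dict used as "disk" are all assumptions, and logging and dirty-page write-back are left out.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool: every page access goes through a lookup first."""

    def __init__(self, storage, capacity):
        self.storage = storage        # page_id -> page, stand-in for disk
        self.capacity = capacity      # number of in-memory frames
        self.frames = OrderedDict()   # page_id -> page, least recently used first
        self.misses = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)   # mark as most recently used
            return self.frames[page_id]
        # Miss: the page is not in memory, so fetch it from storage.
        self.misses += 1
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)    # evict the least recently used page
        page = self.storage[page_id]
        self.frames[page_id] = page
        return page

disk = {i: f"page-{i}" for i in range(4)}
pool = BufferPool(disk, capacity=2)
for pid in (0, 1, 0, 2, 1):
    pool.get_page(pid)
```

Even when a page is already cached, the lookup and bookkeeping still run on every access, which is the overhead a memory-oriented design tries to avoid.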
Memory-oriented DBMS
• Assume that all data fits
in memory. Avoid the
overhead of concurrency
control + recovery.
– SmallBase (AT&T)
– Hekaton (Microsoft)
– H-Store/VoltDB (Me &
[Diagram: Application, Primary Storage, CMD Log, Snapshots]
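A rough sketch of recovery from a snapshot plus a command log, assuming "CMD LOG" means that the DBMS records which procedure ran with which arguments rather than the bytes it changed. The class `MemoryStore` and the procedures `put` and `add` are hypothetical names for illustration.

```python
import copy

def put(db, key, value):
    db[key] = value

def add(db, key, delta):
    db[key] = db.get(key, 0) + delta

# Procedures must be deterministic so that replaying them gives the same state.
PROCEDURES = {"put": put, "add": add}

class MemoryStore:
    def __init__(self):
        self.db = {}          # primary storage lives entirely in memory
        self.log = []         # command log: (procedure name, arguments)
        self.snapshot = {}    # last checkpoint of the whole database

    def execute(self, name, *args):
        self.log.append((name, args))      # log the command, not the changed data
        PROCEDURES[name](self.db, *args)

    def take_snapshot(self):
        self.snapshot = copy.deepcopy(self.db)
        self.log.clear()                   # older commands are covered by the snapshot

    def recover(self):
        db = copy.deepcopy(self.snapshot)
        for name, args in self.log:        # replay commands issued after the snapshot
            PROCEDURES[name](db, *args)
        return db

store = MemoryStore()
store.execute("put", "a", 1)
store.take_snapshot()
store.execute("add", "a", 5)
store.execute("put", "b", 2)
recovered = store.recover()
```

The log entries stay small because they carry only the procedure name and its arguments, at the cost of re-executing work during recovery.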
Experimental Evaluation
• Compare the DBMS
architectures on the two
NVM interfaces.
• Yahoo! Cloud Serving
Benchmark:
– 10 million records (~10GB)
– Database is 8x the size of memory
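To make the skew setting concrete, here is a small sketch of a skewed key generator. It assumes a Zipfian-style distribution, which YCSB offers as a request distribution; the function names, the `theta` values, and the key count are illustrative and are not the parameters used in the evaluation.

```python
import random

def zipfian_keys(num_keys, theta, count, seed=0):
    """Draw `count` keys where rank r is chosen with weight 1 / r**theta."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** theta) for rank in range(1, num_keys + 1)]
    return rng.choices(range(num_keys), weights=weights, k=count)

def hottest_share(keys):
    """Fraction of accesses that hit the most popular key (key 0)."""
    return keys.count(0) / len(keys)

high_skew = zipfian_keys(num_keys=1000, theta=1.5, count=10000)
low_skew = zipfian_keys(num_keys=1000, theta=0.5, count=10000)
```

With a larger `theta`, a few keys absorb most of the accesses, so more of the working set fits in memory; with a smaller `theta`, accesses spread out and the 8x database-to-memory ratio matters more.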
Evaluated Systems
• NVM-Only
– H-Store (v2014)
– MySQL (v5.5)
• NVM+DRAM
– H-Store + Anti-Caching
(v2014)
– MySQL (v5.5)
YCSB // NUMA Interface (NVM-Only)
Read-Only Workload
2x Latency Relative to DRAM
[Plot: TXN/SEC (0 to 200,000) vs. SKEW AMOUNT (HIGH→LOW, 1 to 5); series: H-Store, MySQL]
YCSB // PMFS Interface (NVM+DRAM)
Read-Only Workload
2x Latency Relative to DRAM
[Plot: TXN/SEC (0 to 200,000) vs. SKEW AMOUNT (HIGH→LOW, 1 to 5); series: Anti-Caching, MySQL]
YCSB // NUMA Interface (NVM-Only)
Write-Heavy Workload
2x Latency Relative to DRAM
[Plot: TXN/SEC (0 to 50,000) vs. SKEW AMOUNT (HIGH→LOW, 1 to 5); series: H-Store, MySQL]
YCSB // PMFS Interface (NVM+DRAM)
Write-Heavy Workload
2x Latency Relative to DRAM
[Plot: TXN/SEC (0 to 50,000) vs. SKEW AMOUNT (HIGH→LOW, 1 to 5); series: Anti-Caching, MySQL]
Discussion
• NVM latency did not make
a big difference in
performance.
• Logging is a major
bottleneck in DBMS
performance on NVM.
• MySQL wastes NVM space.
Database Research @ CMU
• N-Store
• H-Store + Anti-Caching
• ThomasDB
• MongoDB
N-Store
• NVM-only Architecture.
• Hybrid OLTP/OLAP DBMS:
– High-performance txn
processing.
– Low-latency analytical
operations.
– Instant recovery.
N-Store – Shadow Paging
[Diagram: DB Root, Master Page Table, and Shadow Page Table, each table referencing pages 1 to 4]
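Shadow paging can be sketched in a few lines. This is a generic textbook-style illustration, not N-Store's implementation: it assumes the master page table holds only committed state, a transaction writes new page copies through a shadow table, and commit is a single swap of the DB root. The class `ShadowPagedDB` and its methods are hypothetical.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        self.store = dict(enumerate(pages))   # physical page id -> contents
        self.next_id = len(pages)
        # DB root points at the master page table: logical page -> physical page.
        self.root = list(range(len(pages)))

    def begin(self):
        # The shadow page table starts as a copy of the master table.
        return list(self.root)

    def write(self, shadow, logical, contents):
        # Copy-on-write: never overwrite a page the master table still references.
        new_id = self.next_id
        self.next_id += 1
        self.store[new_id] = contents
        shadow[logical] = new_id

    def commit(self, shadow):
        # One pointer swap makes the shadow table the new master.
        self.root = shadow

    def read(self, logical, table=None):
        table = self.root if table is None else table
        return self.store[table[logical]]

db = ShadowPagedDB(["a", "b", "c", "d"])
shadow = db.begin()
db.write(shadow, 1, "B")
before_commit = db.read(1)           # committed readers still see the old page
inside_txn = db.read(1, shadow)      # the transaction sees its own write
db.commit(shadow)
after_commit = db.read(1)
```

Because the committed pages are never modified in place, a crash before the swap leaves the master table intact, which is one way to get recovery without replaying a log. Garbage collection of old pages is omitted here.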
H-STORE
H-Store + Anti-Caching
• Allows the DBMS to manage DBs
that are larger than the
amount of DRAM.
• Reducing memory overhead
of evicted data.
• Using ML to reduce disk
I/O.
[Diagram: Main Tuple Table (header, tuple data, LRU chain), Evicted Tuple Table (<BLOCK> <OFFSET> entries), Indexes, Anti-Cache Blocks (tuple data)]
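The anti-caching idea can be sketched as a table that keeps hot tuples in memory, moves the coldest ones into blocks, and remembers only a (block, offset) pair for each evicted tuple. This is a simplified illustration under several assumptions: the class `AntiCacheTable` is hypothetical, a dict stands in for on-disk blocks, and a single tuple is fetched on access instead of merging a whole block back.

```python
from collections import OrderedDict

class AntiCacheTable:
    def __init__(self, max_in_memory, block_size=2):
        assert max_in_memory >= 1
        self.hot = OrderedDict()   # key -> tuple, coldest first (the LRU chain)
        self.evicted = {}          # key -> (block_id, offset)
        self.blocks = {}           # block_id -> list of tuples, stand-in for disk
        self.max_in_memory = max_in_memory
        self.block_size = block_size
        self.next_block = 0

    def put(self, key, value):
        self.evicted.pop(key, None)   # a new value supersedes any evicted copy
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._evict_if_needed()

    def get(self, key):
        if key in self.evicted:
            # Fetch the tuple from its block and bring it back into memory.
            block_id, offset = self.evicted.pop(key)
            self.hot[key] = self.blocks[block_id][offset]
            self._evict_if_needed()
        self.hot.move_to_end(key)     # mark as most recently used
        return self.hot[key]

    def _evict_if_needed(self):
        while len(self.hot) > self.max_in_memory:
            block_id = self.next_block
            self.next_block += 1
            block = []
            # Never evict the newest tuple, so the caller can still read it.
            for offset in range(min(self.block_size, len(self.hot) - 1)):
                key, value = self.hot.popitem(last=False)
                self.evicted[key] = (block_id, offset)
                block.append(value)
            self.blocks[block_id] = block

table = AntiCacheTable(max_in_memory=3, block_size=2)
for k in "abcd":
    table.put(k, k.upper())
fetched = table.get("a")
```

In memory, an evicted tuple costs only its key and a small pointer, which is why the size of the evicted-tuple bookkeeping matters once most of the database is cold.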
H-Store + Anti-Caching
• Bloom filter usage
tracking.
• Multi-faceted indexes.
• Cold storage
approximations.
• Semantic block
clustering.
ThomasDB
• High-performance, low-overhead
incremental computation platform.
• Maintains mapping between
function invocations and
data.
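One way to picture a mapping between function invocations and data is a memo table keyed by the input item: when an item changes, only the invocation that read it is recomputed. This is a generic sketch of incremental computation, not ThomasDB's design; the class `IncrementalMap` and its methods are hypothetical.

```python
class IncrementalMap:
    """Applies `func` to each item and recomputes only what changed."""

    def __init__(self, func):
        self.func = func
        self.data = {}     # key -> input item
        self.cache = {}    # key -> result of func(item) for the current item
        self.calls = 0     # how many times func actually ran

    def update(self, key, value):
        self.data[key] = value
        self.cache.pop(key, None)   # invalidate only the invocation that read this item

    def results(self):
        for key, value in self.data.items():
            if key not in self.cache:
                self.calls += 1
                self.cache[key] = self.func(value)
        return dict(self.cache)

word_count = IncrementalMap(lambda text: len(text.split()))
word_count.update("page1", "non volatile memory")
word_count.update("page2", "shadow paging")
first = word_count.results()
calls_after_first = word_count.calls

word_count.update("page1", "NVM")
second = word_count.results()
```

After the single update, the function runs once more rather than twice, which is the saving an incremental platform is after when the corpus is large and changes are small.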
ThomasDB – Data Marts
[Diagram: Public Database, Private Database, Update!, UDFs, ⨝ (join), Materialized View, Analytics Program]
ThomasDB – Preliminary Results
MAP FUNCTION OVER WIKIPEDIA CORPUS
MongoDB
Distributed Document Database Design
• Automatic tool selects
the near-optimal physical
design:
– Sharding Keys
– Denormalization
– Indexes
Vigilante DBA
• Automated framework for
finding DB applications
and fixing them.
• Build a large catalog of
database applications
from public sources.
Summary
• Lots of database stuff in
the works.
• Always looking for
industry collaborators.
END
@ANDY_PAVLO
db.cs.cmu.edu
This talk is in compliance with the Federal Bureau
of Prisons (FBP) Act (1930) Pub. L. No. 71-218, 46
Stat. 325. All reference herein to any specific
commercial products, process, or service by trade
name, trademark, manufacturer, or otherwise, does
not necessarily constitute or imply its
endorsement, recommendation, or favoring by the
FBP. The views and opinions of authors expressed
herein do not necessarily state or reflect those
of the FBP, and shall not be used for evaluation
by, including but not limited to, the FBP parole
board, its directors, and other corrections
officers.
Regulations are contained in Title 28, Chapter V
of the Code of Federal Regulations (CFR). Contact
with parolees for research questions is regulated
under 28 C.F.R. 540. All complaints about the