Phase Change Memory-Aware Data Management and Application
Jiangtao Wang
Outline
• Introduction
• Integrating PCM into the Memory Hierarchy
− PCM for main memory
− PCM for auxiliary memory
• Conclusion
Phase change memory
• An emerging memory technology
• Like memory (DRAM)
− Byte-addressable, with comparable read/write speeds
− Lower idle power
• Like storage (SSD & HDD)
− Non-volatile
− High capacity (high density)
Phase change memory
                     DRAM         PCM          NAND Flash
Page size            64B          64B          2KB
Page read latency    20-50ns      ~50ns        ~25us
Page write latency   20-50ns      ~1us         ~500us
Endurance            ∞            10^6-10^8    10^4-10^5
Idle power           ~100mW/GB    ~1mW/GB      1-10mW/GB
Density              1x           2-4x         4x
• Cons:
− Asymmetric read/write latency
− Limited write endurance
Phase change memory
[Figure: read and write latencies of DRAM, PCM, FLASH, and HDD on a log scale from 10ns to 10ms. PCM reads are close to DRAM speed (~100ns), while PCM writes (~1us) sit between DRAM and flash.]
Outline
• Introduction
• Integrating PCM into the Memory Hierarchy
− PCM for main memory
− PCM for auxiliary memory
• Conclusion
Integrating PCM into the Memory Hierarchy
• PCM for main memory
– Replacing DRAM with PCM to achieve larger main
memory capacity
• PCM for auxiliary memory
– PCM as a write buffer for HDD/SSD disk
Buffering dirty pages to minimize disk write I/Os
– PCM as secondary storage
Storing log records
PCM for main memory
[Figure: three memory organizations, each with CPU, L1/L2 cache, and a memory controller over phase change memory and an HDD/SSD disk: (a) PCM-only memory; (b) DRAM as a cache memory; (c) DRAM as a write buffer. Proposed in [ISCA'09], [ICCD'11], [DAC'09], [CIDR'11].]
PCM for main memory
Challenges with PCM
• Major disadvantage: writes
Compared with reads, PCM writes incur higher energy consumption, higher latency, and limited endurance.

Read latency     20~50ns
Write latency    ~1us
Read energy      1 J/GB
Write energy     6 J/GB
Endurance        10^6~10^8

Reducing PCM writes is an important goal of data management on PCM!
PCM for main memory
Optimization on PCM write
[ISCAS’07]
[ISCA’09]
[MICRO’09]
• Optimization: data comparison write
• Goal: write only modified bits rather than the entire cache line
• Approach: read-compare-write
[Figure: a cache line is written back to PCM. The controller first reads the current PCM line, compares it bit by bit with the new data, and programs only the bits that differ.]
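As a software analogy of the read-compare-write scheme, the sketch below compares at word granularity (the hardware proposals operate on individual bits); the function name pcm_line_write and its types are illustrative:

    #include <stddef.h>
    #include <stdint.h>

    /* Data comparison write: instead of blindly programming the whole
     * cache line, read the current PCM contents, compare word by word,
     * and program only the words that actually changed. */
    size_t pcm_line_write(uint64_t *pcm_line, const uint64_t *new_data,
                          size_t words)
    {
        size_t written = 0;
        for (size_t i = 0; i < words; i++) {
            if (pcm_line[i] != new_data[i]) {   /* read + compare */
                pcm_line[i] = new_data[i];      /* write only modified words */
                written++;
            }
        }
        return written;  /* number of words actually programmed */
    }

Unchanged words cost only a read, so wear and write energy scale with the number of modified words rather than with the line size.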
PCM for main memory
PCM-friendly algorithms
Rethinking Database Algorithms for Phase Change Memory (CIDR 2011)
• Motivation
Choosing PCM-friendly database algorithms and data structures
to reduce the number of writes
PCM for main memory
PCM-friendly DB algorithms
• Prior design goals for DRAM
− Low computational complexity
− Good CPU cache performance
− Power efficiency (more recently)
• New goals for PCM
− Minimize PCM writes
− Low wear, energy, and latency
− Exploit finer-grained access granularity: bits, words, cache lines
• Two core database techniques
− B+-Tree Index
− Hash Joins
PCM-friendly DB algorithms
B+-Tree Index
• B+-Tree
– Records at leaf nodes
– High fan-out
– Suitable for file systems
• For PCM
– Insertions/deletions into sorted nodes incur many write operations (see the sketch below)
– Example: inserting key 3 into a sorted leaf (num = 5, keys 2 4 7 8 9) shifts keys 4, 7, 8, 9 and their pointers, writes the new key and pointer, and updates num (now 6, keys 2 3 4 7 8 9): 11 writes in total
– With K keys and K pointers per node, an insert writes 2(K/2)+1 = K+1 words on average
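A small illustrative count of PCM word writes for a sorted-leaf insert, assuming the array layout above:

    /* Inserting into a sorted leaf: shift the tails of the key and
     * pointer arrays, write the new entry, and update num. For the
     * slide's example (5 keys, insert position 1): 5 + 5 + 1 = 11. */
    int sorted_insert_writes(int num_keys, int pos)
    {
        int shifted = num_keys - pos;       /* entries moved right */
        int key_writes = shifted + 1;       /* shifted keys + new key */
        int ptr_writes = shifted + 1;       /* shifted pointers + new pointer */
        return key_writes + ptr_writes + 1; /* + num update */
    }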
PCM-friendly DB algorithms
B+-Tree Index
• PCM-friendly B+-Tree
– Unsorted: both non-leaf and leaf nodes unsorted
– Unsorted leaf: sorted non-leaf nodes, unsorted leaf nodes
– Unsorted leaf with bitmap: sorted non-leaf nodes, unsorted leaf nodes with a bitmap of valid slots

[Figure: an unsorted leaf node (num = 5, keys 8 2 9 4 7) and an unsorted leaf node with bitmap 10111010 marking which key/pointer slots are valid.]
PCM-friendly DB algorithms
B+-Tree Index
• Unsorted leaf
– Insert/delete incurs 3 writes
– Example: deleting 2 from a leaf (num = 5, keys 8 2 9 4 7) moves the last key 7 into the hole and decrements num (num = 4, keys 8 7 9 4): one key write, one pointer write, one num write
• Unsorted leaf with bitmap
– Insert incurs 3 writes; delete incurs 1 write
– Example: deleting 2 merely clears its bit in the bitmap (10111010 → 10011010); the keys 8 2 9 4 7 and their pointers are untouched
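A minimal sketch of the unsorted-leaf-with-bitmap operations, assuming an 8-slot leaf (struct layout and capacity are illustrative); insert touches 3 PCM words and delete touches 1, matching the counts above:

    #include <stdbool.h>
    #include <stdint.h>

    #define LEAF_CAP 8  /* illustrative capacity */

    /* Unsorted leaf with a validity bitmap: bit i set means slot i holds
     * a valid key/pointer pair. */
    struct leaf {
        uint8_t  bitmap;
        int      keys[LEAF_CAP];
        void    *ptrs[LEAF_CAP];
    };

    /* Insert: one key write, one pointer write, one bitmap write. */
    bool leaf_insert(struct leaf *l, int key, void *ptr)
    {
        for (int i = 0; i < LEAF_CAP; i++) {
            if (!(l->bitmap & (1u << i))) {
                l->keys[i] = key;           /* write 1: key */
                l->ptrs[i] = ptr;           /* write 2: pointer */
                l->bitmap |= (1u << i);     /* write 3: bitmap */
                return true;
            }
        }
        return false;  /* leaf full: caller splits (not shown) */
    }

    /* Delete: clearing one bitmap bit invalidates the slot (1 write). */
    bool leaf_delete(struct leaf *l, int key)
    {
        for (int i = 0; i < LEAF_CAP; i++) {
            if ((l->bitmap & (1u << i)) && l->keys[i] == key) {
                l->bitmap &= ~(1u << i);    /* the only PCM write */
                return true;
            }
        }
        return false;
    }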
Experimental evaluation
B+-Tree Index
• Simulation platform
– Cycle-accurate x86-64 simulator: PTLSim
– Extended the simulator with PCM support
– Modeled data comparison write
– CPU cache (8MB); B+-Tree with 50 million entries, 75% full, 1GB
• Three workloads:
– Inserting 500K random keys
– Deleting 500K random keys
– Searching 500K random keys
Experimental evaluation
B+-Tree Index
• Node size 8 cache lines; 50 million entries, 75% full
[Figure: total wear (number of bits modified), energy (mJ), and execution time (cycles) for the insert, delete, and search workloads.]
Unsorted schemes achieve the best performance:
• For insert-intensive workloads: unsorted leaf
• For insert- and delete-intensive workloads: unsorted leaf with bitmap
PCM-friendly DB algorithms
Hash Joins
• Two representative hash join algorithms
– Simple Hash Join
– Cache Partitioning
• Simple Hash Join
[Figure: the build phase hashes records of build relation R into a hash table; the probe phase scans probe relation S against it.]
• Problem: too many cache misses
– The hash table that is built and probed exceeds the CPU cache size
– Small record size
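As a baseline, a minimal in-memory simple hash join sketch with a chaining hash table (names and sizes are illustrative; cleanup omitted):

    #include <stdio.h>
    #include <stdlib.h>

    #define NBUCKETS 1024  /* illustrative; real tables size to the build input */

    struct rec  { int key; };                 /* payload omitted */
    struct node { int key, rid; struct node *next; };

    /* Build phase: hash every record of R into the table. */
    void build(struct node **tab, const struct rec *R, int nr)
    {
        for (int i = 0; i < nr; i++) {
            struct node *n = malloc(sizeof *n);
            unsigned b = (unsigned)R[i].key % NBUCKETS;
            n->key = R[i].key; n->rid = i;
            n->next = tab[b]; tab[b] = n;
        }
    }

    /* Probe phase: scan S and look each key up in the table. */
    void probe(struct node **tab, const struct rec *S, int ns)
    {
        for (int j = 0; j < ns; j++) {
            unsigned b = (unsigned)S[j].key % NBUCKETS;
            for (struct node *n = tab[b]; n; n = n->next)
                if (n->key == S[j].key)
                    printf("match: R[%d] S[%d]\n", n->rid, j);
        }
    }

Once the table outgrows the CPU cache, nearly every probe is a random cache miss, which is exactly the problem the slide names.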
PCM-friendly DB algorithms
Hash Joins
• Cache Partitioning
– Partition phase: hash-partition R and S into cache-sized pairs (R1,S1) ... (R4,S4)
– Join phase: join each Ri with Si
• Problem: too many writes! Partitioning copies every record of both relations.
[Figure: R and S are each split into four partitions before the per-partition joins.]
PCM-friendly DB algorithms
Hash Joins
• Virtual Partitioning (PCM-friendly)
– Partition phase: compute virtual partitions R'1..R'4 and S'1..S'4 that store record IDs instead of copying records; R and S stay in place
[Figure: virtual partitioning of R and S; each virtual partition holds only the IDs of its records.]
PCM-friendly DB algorithms
Hash Joins
• Virtual Partitioning (PCM-friendly)
– Join phase: for each pair (R'i, S'i), build a hash table over the R records referenced by R'i, then probe it with the S records referenced by S'i (see the sketch below)
[Figure: build and probe follow the stored record IDs back into R and S.]
• Good CPU cache performance
• Reduced writes
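A sketch of virtual partitioning under the same illustrative setup: partitions hold record indices, so the partition phase writes one integer per record instead of copying the record. The per-partition join is shown as a nested loop for brevity; the paper's join phase builds a small cache-resident hash table per partition instead.

    #include <stdio.h>
    #include <stdlib.h>

    #define NPART 4  /* illustrative number of virtual partitions */

    struct rec { int key; /* payload omitted */ };

    static int part_of(int key) { return (unsigned)key % NPART; }

    /* Partition phase: store record IDs (indices into R and S), not
     * copies, so PCM receives one int write per record. */
    void virtual_partition_join(const struct rec *R, int nr,
                                const struct rec *S, int ns)
    {
        int *rid[NPART], *sid[NPART], rn[NPART] = {0}, sn[NPART] = {0};
        for (int p = 0; p < NPART; p++) {
            rid[p] = malloc(nr * sizeof(int));
            sid[p] = malloc(ns * sizeof(int));
        }
        for (int i = 0; i < nr; i++) { int p = part_of(R[i].key); rid[p][rn[p]++] = i; }
        for (int j = 0; j < ns; j++) { int p = part_of(S[j].key); sid[p][sn[p]++] = j; }

        /* Join phase: each (R'p, S'p) pair is small enough to process
         * in cache; a nested loop stands in for the per-partition
         * hash table here. */
        for (int p = 0; p < NPART; p++) {
            for (int a = 0; a < rn[p]; a++)
                for (int b = 0; b < sn[p]; b++)
                    if (R[rid[p][a]].key == S[sid[p][b]].key)
                        printf("match: R[%d] S[%d]\n", rid[p][a], sid[p][b]);
            free(rid[p]); free(sid[p]);
        }
    }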
Experimental evaluation
Hash Join
• Relations R and S reside in main memory (PCM)
• R (50MB) joins S (100MB), 2 matches per R record
• Record size varied from 20B to 100B
[Figure: total wear, execution time (cycles), and PCM energy (mJ) as record size varies from 20B to 100B.]
PCM for auxiliary memory
[Figure: two organizations. Left: a PCM write buffer between DRAM and the SSD/HDD (PCM as a write buffer for the HDD/SSD disk) [DAC'09] [CIKM'11]. Right: PCM beside the HDD/SSD disk (PCM as secondary storage) [TCDE'10] [VLDB'11].]
PCM for auxiliary memory
• PCM as a write buffer for HDD/SSD disk
PCMLogging: Reducing Transaction Logging Overhead with PCM (CIKM 2011)
• PCM as secondary storage
– Accelerating In-Page Logging with Non-Volatile Memory (TCDE 2010)
– IPL-P: In-Page Logging with PCRAM (VLDB 2011 demo)
PCM for auxiliary memory
• PCM as a write buffer for HDD/SSD disk
PCMLogging: Reducing Transaction Logging Overhead with PCM (CIKM 2011)
• Motivation
Buffer dirty pages and transaction log records in PCM to minimize disk I/Os
PCM for auxiliary memory
• Two schemes
– PCMBasic
– PCMLogging
• PCMBasic
[Figure: DRAM flushes dirty pages to a buffer pool on PCM and writes log records to a log pool on PCM; both are eventually written to disk.]
• Cons:
− Data redundancy
− Space management on PCM
PCMLogging
• PCMLogging
– Eliminates explicit logs (REDO and UNDO)
– Integrates implicit logging into the buffered updates (shadow pages)
[Figure: DRAM holds the working page P and metadata; PCM keeps shadow versions of updated pages (P1, P2, ...); committed pages are written back to disk.]
PCMLogging
• Overview
– DRAM
• Mapping Table (MT): maps logical pages to physical PCM pages
– PCM
• Page format: page content plus metadata (XID, PID)
• FreePageBitmap
• ActiveTxList
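A minimal sketch of these structures in C, with field widths and capacities as illustrative assumptions:

    #include <stdint.h>

    #define PAGE_SIZE   8192   /* illustrative */
    #define PCM_PAGES   4096   /* page slots on PCM */
    #define MAX_TX      64

    /* PCM page: content plus per-page metadata (owning transaction and
     * logical page ID), which serves as the implicit log. */
    struct pcm_page {
        uint8_t  content[PAGE_SIZE];
        uint32_t xid;   /* transaction that wrote this version */
        uint32_t pid;   /* logical page ID */
    };

    /* PCM-resident allocator and transaction state. */
    struct pcm_area {
        struct pcm_page pages[PCM_PAGES];
        uint8_t  free_page_bitmap[PCM_PAGES / 8];  /* FreePageBitmap */
        uint32_t active_tx_list[MAX_TX];           /* ActiveTxList */
    };

    /* DRAM-resident Mapping Table: logical page -> PCM slot, or -1 if
     * the latest version lives on disk. */
    struct mapping_table {
        int32_t slot_of[1 << 20];  /* indexed by logical page ID */
    };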
PCMLogging
• PCMLogging operations
Two additional data structures in main memory support undo:
• Transaction Table (TT)
Records all in-progress transactions and their corresponding dirty pages in DRAM and PCM
• Dirty Page Table (DPT)
Keeps track of the previous version of each PCM page "overwritten" by an in-progress transaction
PCMLogging
• Flushing dirty pages to PCM
– Add the XID to ActiveTxList before writing the dirty page to PCM
– If page P already exists in PCM, do not overwrite it; create an out-of-place copy P'
[Figure: example of transaction T3 updating page P5.]
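A sketch of the flush path over the layout above; every helper (pcm_alloc_slot, mt_lookup, dpt_remember, and so on) is a hypothetical name for the corresponding structure operation:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helpers over the PCM layout sketched earlier. */
    extern int  pcm_alloc_slot(void);                  /* scan FreePageBitmap */
    extern void pcm_write_page(int slot, const void *content,
                               uint32_t xid, uint32_t pid);
    extern void active_tx_add(uint32_t xid);           /* ActiveTxList insert */
    extern int  mt_lookup(uint32_t pid);               /* Mapping Table, -1 if none */
    extern void mt_update(uint32_t pid, int slot);
    extern void dpt_remember(uint32_t xid, uint32_t pid, int old_slot);

    /* Flush one dirty page of transaction xid from DRAM to PCM. */
    bool flush_dirty_page(uint32_t xid, uint32_t pid, const void *content)
    {
        int slot = pcm_alloc_slot();
        if (slot < 0)
            return false;            /* PCM full: evict committed pages first */

        active_tx_add(xid);          /* make xid durable before its data */

        int old = mt_lookup(pid);
        if (old >= 0)
            dpt_remember(xid, pid, old);  /* keep previous version for undo */

        pcm_write_page(slot, content, xid, pid);  /* out-of-place write */
        mt_update(pid, slot);        /* new version becomes current */
        return true;
    }

The out-of-place write is what makes explicit undo logging unnecessary: the previous version stays intact on PCM until the transaction commits.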
PCMLogging
• Commit
– Flush all its dirty pages
– Modify metadata
• Abort
– Discard its dirty pages and restore previous data
– Modify metadata
PCMLogging
• Tuple-based buffering
– In PCM
• Buffer slots are managed in units of tuples
• Free space is managed with a slotted directory instead of a bitmap
– In DRAM
• The Mapping Table still tracks dirty pages, but maintains the mappings for the tuples buffered from each dirty page
– Buffered tuples are merged with the corresponding disk page
• on a read/write request
• when committed tuples are moved from PCM to the external disk
Experimental evaluation
• Simulator based on DiskSim
• TPC-C benchmark
• DRAM 64MB
• Tuple-based buffering
[Figure: results; PL = PCMLogging in the legend.]
PCM for auxiliary memory
• PCM as secondary storage
– Accelerating In-Page Logging with Non-Volatile Memory (TCDE 2010)
– IPL-P: In-Page Logging with PCRAM (VLDB 2011 demo)
• Motivation
The IPL scheme backed by PCRAM can improve the performance of flash memory database systems by storing frequently generated log records in PCRAM
Design of Flash-Based DBMS: An In-Page Logging Approach (SIGMOD 2007)
In-Page Logging
• Introduction
– Updating a single record may invalidate the current flash page
– Sequential logging approaches incur expensive merge operations
– IPL co-locates a data page and its log records in the same physical block
Design of Flash-Based DBMS: An In-Page Logging Approach (SIGMOD 2007)
In-Page Logging
[Figure: an update to an in-memory data page (8KB) in the database buffer appends a log record to an in-memory log sector (512B). On flash, each 128KB physical block holds 15 data pages plus an 8KB log region of 16 sectors (512B each).]
In-Page Logging
[Figure: when a block's 8KB log region fills up, its data pages are merged with the accumulated log records and rewritten into a fresh block.]
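A minimal sketch of the IPL write path, using the sizes from the figures; the flash primitives (flash_append_log, flash_merge_block, flash_log_region_full) are hypothetical names, and each record is assumed to fit in one sector:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define LOG_SECTOR  512          /* in-memory log sector */
    #define LOG_REGION  (16 * 512)   /* 8KB log region per 128KB block */

    /* Hypothetical flash primitives. */
    extern bool flash_log_region_full(uint32_t block);
    extern void flash_append_log(uint32_t block, const void *sector, size_t len);
    extern void flash_merge_block(uint32_t block);  /* rewrite pages + logs */

    struct log_sector {
        uint8_t buf[LOG_SECTOR];
        size_t  used;
    };

    /* On update: buffer the log record in memory; flush the 512B sector
     * when it fills; merge the block when its log region overflows. */
    void ipl_log_update(struct log_sector *ls, uint32_t block,
                        const void *rec, size_t len)
    {
        if (ls->used + len > LOG_SECTOR) {          /* sector full: flush */
            if (flash_log_region_full(block))
                flash_merge_block(block);           /* the expensive merge */
            flash_append_log(block, ls->buf, ls->used);
            ls->used = 0;
        }
        for (size_t i = 0; i < len; i++)            /* buffer the record */
            ls->buf[ls->used++] = ((const uint8_t *)rec)[i];
    }

With PCRAM holding the log region instead of flash, the sector granularity can shrink (the TCDE 2010 study also evaluates 128B log sectors), which reduces flush latency.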
In-Page Logging
• Cons
– The unit of a log write is a sector (512B)
– Only SLC-type NAND flash supports partial programming
– The amount of log records for a page is usually small, so 512B sectors are underutilized
In-Page Logging
• Pros
– Log records can be flushed at a finer granularity
– Low latency when flushing log records
– PCRAM is faster than flash memory for small reads
– Either SLC or MLC flash memory can be used with the IPL policy
Experimental evaluation
Accelerating In-Page Logging with Non-Volatile Memory (TCDE 2010)
• Trace-driven simulation
• Implemented an IPL module in the B+-tree based Berkeley DB
• Workload: inserting/searching a million key-value records
• In-memory log sector of 128B or 512B
Experimental evaluation
IPL-P: In-Page Logging with PCRAM (VLDB 2011 demo)
• Hardware platform
– PCRAM (512MB; access granularity: 128B)
– Intel X25-M SSD (USB interface)
• Workload
– Inserting/searching/updating a million key-value records
– B+-tree based Berkeley DB
– Page size: 8KB
Outline
• Introduction
• Integrating PCM into the Memory Hierarchy
− PCM for main memory
− PCM for auxiliary memory
• Conclusion
Conclusion
• PCM is expected to play an important role in the memory hierarchy
• It is important to consider the read/write asymmetry of PCM when designing PCM-friendly algorithms
• Integrating PCM into a hybrid memory hierarchy may be more practical
• If PCM is used as main memory, system applications (e.g., main memory database systems) must be revised to address PCM-specific challenges
Thank You!