Threshold Voltage Distribution in MLC NAND Flash: Characterization, Analysis, and Modeling
Yu Cai¹, Erich F. Haratsch², Onur Mutlu¹, and Ken Mai¹
¹ DSSC, ECE Department, Carnegie Mellon University
² LSI Corporation
3/20/2013
Evolution of NAND Flash Memory


Aggressive scaling
MLC technology
Increasing capacity
Acceptable low cost
High speed
Low power consumption
Compact physical size
E. Grochowski et al., “Future technology challenges for NAND flash and HDD products”, Flash Memory Summit 2012
Challenges: Reliability and Endurance

P/E cycles (required)
Complete write of drive 10 times per day for 5 years (STEC)
> 50k P/E cycles

P/E cycles (provided)
A few thousand
E. Grochowski et al., “Future technology challenges for NAND flash and HDD products”, Flash Memory Summit 2012
Solutions: Future NAND Flash-based Storage Architecture
[Diagram: noisy input → Memory Signal Processing → raw bit error rate → Error Correction → BER < 10^-15]
Memory Signal Processing
• Read voltage adjusting
• Data scrambler
• Data recovery
• Shadow program
Error Correction
• BCH codes
• Reed-Solomon codes
• LDPC codes
• Other Flash friendly codes
Need to understand NAND Flash Error Patterns/Channel Model
Need to design efficient DSP/ECC and smart error management
NAND Flash Channel Modeling
Write (Tx) → Noisy NAND → Read (Rx)
Simplified NAND Flash channel model based on dominant errors:
Write → Additive White Gaussian Noise → Cell-to-Cell Interference → Time-variant Retention → Read
• Additive White Gaussian Noise: erase operation, program page operation
• Cell-to-Cell Interference: neighbor page program
• Time-variant Retention: retention
Testing Platform
USB Board
PCI-e Board
HAPS-52 Motherboard
Virtex-5 FPGA (NAND Controllers)
Flash Board
Flash Chip
Characterizing Cell Threshold w/ Read Retry
[Figure: #cells vs. Vth. Erased state (11) and programmed states P1 (10), P2 (00), P3 (01), separated by read reference voltages REF1, REF2, REF3, with 0V marked on the Vth axis. Read retry steps the read reference through positions i-2, i-1, i, i+1, i+2.]
• Read-retry feature of new NAND flash
• Tune the read reference voltage and check which Vth region each cell falls in
• Characterize the threshold voltage distribution of flash cells in programmed states through Monte-Carlo emulation
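The read-retry sweep on this slide can be sketched as follows. This is a minimal emulation under stated assumptions, not the authors' test code: `read_page` is a hypothetical stand-in for the chip's read-retry command, and the cell voltages are synthetic values in arbitrary units.

```python
import random

def read_page(cell_vth, vref):
    """Hypothetical stand-in for a read-retry command: a cell reads as 1
    when its threshold voltage is below the read reference."""
    return [1 if v < vref else 0 for v in cell_vth]

def vth_histogram(cell_vth, vrefs):
    """Sweep the read reference upward and count the cells that flip from
    0 to 1 between consecutive steps; those cells lie in that Vth interval."""
    counts = []
    prev_ones = sum(read_page(cell_vth, vrefs[0]))
    for vref in vrefs[1:]:
        ones = sum(read_page(cell_vth, vref))
        counts.append(ones - prev_ones)
        prev_ones = ones
    return counts

# Synthetic cells (assumed values, arbitrary units), not measured data.
rng = random.Random(0)
cells = [rng.gauss(2.0, 0.1) for _ in range(10_000)]
vrefs = [1.5 + 0.05 * i for i in range(21)]  # 21 reference levels, 20 intervals
hist = vth_histogram(cells, vrefs)
```

Each entry of `hist` is the number of cells whose Vth falls between two adjacent reference levels, so a finer reference step gives a finer histogram at the cost of more reads.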
Programmed State Analysis
P3 State
P2 State
P1 State
Parametric Distribution Learning
• Parametric distribution
  Closed-form formula; only a few parameters need to be stored
• Exponential distribution family
  [Equation not preserved; label: distribution parameter vector]
• Maximum likelihood estimation (MLE) to learn parameters
  [Equation not preserved; labels: observed testing data, likelihood function]
Goal of MLE: find the distribution parameters that maximize the likelihood function
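As an illustration of the MLE goal stated on this slide, here is a minimal sketch for the Gaussian case, where the maximizing parameters have a closed form. The function names and sample values are illustrative assumptions; the slides do not show the fitting code.

```python
import math

def gaussian_mle(samples):
    """Closed-form MLE for a Gaussian: sample mean and the 1/N variance."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, math.sqrt(var)

def log_likelihood(samples, mu, sigma):
    """Log of the likelihood function for i.i.d. Gaussian samples."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in samples
    )

observed = [2.9, 3.1, 3.0, 3.2, 2.8]  # illustrative values, not measured data
mu_hat, sigma_hat = gaussian_mle(observed)
```

Evaluating `log_likelihood` at `(mu_hat, sigma_hat)` and at any perturbed pair shows that the estimate gives the larger value, which is the sense in which MLE "learns" the parameters.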
Selected Distributions
Distribution Exploration
[Figure panels: P1 State, P2 State, P3 State]
Distribution    RMSE
Beta            19.5%
Gamma           20.3%
Gaussian        22.1%
Log-normal      24.8%
Weibull         28.6%
The distribution can be approximately modeled as a Gaussian distribution
Noise Analysis
• Signal and additive noise decoupling
• Power spectral density analysis of P/E noise
  Flat in frequency domain
• Auto-correlation analysis of P/E noise
  Spike at 0-lag point in time domain
P/E noise can be approximately modeled as white noise
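The auto-correlation check can be sketched as below. A sequence whose normalized auto-correlation is 1 at lag 0 and near 0 elsewhere has a flat power spectral density, which is the white-noise signature described on this slide. The noise samples here are synthetic; the measured P/E noise is not reproduced.

```python
import random

def autocorrelation(x, max_lag):
    """Normalized sample auto-correlation for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    energy = sum(c * c for c in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # synthetic stand-in for P/E noise
acf = autocorrelation(noise, max_lag=10)
```

For this synthetic input, `acf[0]` is 1 by construction and the remaining lags stay close to 0; correlated noise would instead show a slow decay over several lags.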
Independence Analysis over Space
• Correlations among cells in different locations are low (<5%)
• P/E operation can be modeled as memory-less channel
• Assuming ideal wear-leveling
Independence Analysis over P/E cycles
• High correlation between threshold voltages at the same location across P/E cycles
• Programming to the same location is modeled as a channel with memory
Cycling Noise Analysis
[Figure panels: P1 State, P2 State, P3 State]
As P/E cycles increase ...
Distribution shifts to the right
Distribution becomes wider
Cycling Noise Modeling
Mean value (µ) increases with P/E cycles
Exponential model
Standard deviation value (σ) increases with P/E cycles
Linear model
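A minimal sketch of fitting the two models named here. The slide does not give the functional forms, so this assumes µ(n) = a·exp(b·n) for the mean and σ(n) = c + d·n for the standard deviation; the data points are synthetic values generated from those assumed forms, not measurements.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by a linear fit on log(y); requires y > 0."""
    log_a, b = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(log_a), b

# Synthetic values generated from the assumed forms (not measured data).
cycles = [0, 1000, 2000, 3000, 4000, 5000]
mu = [3.0 * math.exp(2e-5 * n) for n in cycles]
sigma = [0.10 + 1e-5 * n for n in cycles]

a, b = fit_exponential(cycles, mu)   # mean: assumed exponential form
c, d = fit_line(cycles, sigma)       # standard deviation: linear form
```

Only four numbers (`a`, `b`, `c`, `d`) per state then need to be stored to predict the distribution at a given P/E cycle count.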
SNR Analysis
• SNR decreases linearly with P/E cycles
• Degrades at ~ 0.13dB/1000 P/E cycles
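The linear degradation can be written as a one-line model. The rate comes from the slide; the initial SNR of 20 dB used below is a hypothetical value chosen only to show the arithmetic.

```python
DEGRADATION_DB_PER_1000_CYCLES = 0.13  # approximate rate stated on the slide

def snr_after_cycles(snr0_db, pe_cycles):
    """Linear SNR model: initial SNR minus the accumulated degradation."""
    return snr0_db - DEGRADATION_DB_PER_1000_CYCLES * pe_cycles / 1000.0

# 20.0 dB is a hypothetical starting point, not a value from the slides.
snr_at_10k = snr_after_cycles(20.0, 10_000)
```

Under this model, 10,000 P/E cycles cost about 1.3 dB regardless of the starting SNR.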
Conclusion & Future Work
• P/E operations modeled as signal passing through AWGN channel
• Approximately Gaussian with 22% distortion
• P/E noise is white noise
• P/E cycling noise affects threshold voltage distributions
• Distribution shifts to the right and widens around the mean value
• Statistics (mean/variance) can be modeled as exponential correlation with P/E cycles with 95% accuracy
• Future work
• Characterization and models for retention noise
• Characterization and models for program interference noise
Backup Slides
Hard Data Decoding
• Read reference voltage can affect the raw bit error rate
[Figure: two panels, each showing adjacent threshold voltage (Vth) distributions f(x) (lower state, around v0) and g(x) (upper state, around v1); one panel reads at vref, the other at v'ref]
BER_1 = \int_{v_{ref}}^{+\infty} f(x)\,dx + \int_{-\infty}^{v_{ref}} g(x)\,dx
BER_2 = \int_{v'_{ref}}^{+\infty} f(x)\,dx + \int_{-\infty}^{v'_{ref}} g(x)\,dx
• There exists an optimal read reference voltage
• Optimal read reference voltage is predictable
  Distribution sufficient statistics are predictable (e.g. mean, variance)
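A minimal sketch of the two claims on this slide, assuming both states are Gaussian (in line with the Gaussian approximation used in the talk) and weighting the two tails equally as the BER expressions do. The state parameters are hypothetical and the grid search is an illustrative choice, not the authors' method.

```python
import math

def upper_tail(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

def lower_tail(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma^2)."""
    return 0.5 * math.erfc((mu - x) / (sigma * math.sqrt(2)))

def ber(vref, v0, s0, v1, s1):
    """Tail of the lower state above vref plus tail of the upper state below it."""
    return upper_tail(vref, v0, s0) + lower_tail(vref, v1, s1)

def best_vref(v0, s0, v1, s1, steps=10_000):
    """Grid search between v0 and v1 for the reference that minimizes BER."""
    candidates = [v0 + (v1 - v0) * i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda v: ber(v, v0, s0, v1, s1))

# Hypothetical state parameters (arbitrary units), not measured values.
v0, s0, v1, s1 = 2.0, 0.10, 3.0, 0.10
vref_opt = best_vref(v0, s0, v1, s1)
```

With equal standard deviations the minimum lands at the midpoint between the two means. Because the optimum depends only on the means and standard deviations, predicting those statistics is enough to predict the reference voltage.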
Soft Data Decoding
• Estimate soft information for soft decoding (e.g. LDPC codes)
  Log likelihood ratio (LLR):
  LLR(y) = \log\left( \frac{p(x = 1 \mid y)}{p(x = 0 \mid y)} \right)
[Figure: distributions f(x) and g(x) over Vth with v0, vref, and v1 marked; the sensed threshold voltage range is divided into two High Confidence regions and one Low Confidence region]
• Closed-form soft information for AWGN channel
  Assume same variance to show a simple case
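For the same-variance case mentioned on this slide, the LLR reduces to a linear function of the sensed voltage. The sketch below assumes Gaussian states, equal prior probability for the two bit values, and hypothetical parameters; which state stores which bit value is also an assumption made for illustration.

```python
def llr_equal_variance(y, mu0, mu1, sigma):
    """log(p(x=1|y) / p(x=0|y)) for Gaussian states with a shared sigma and
    equal priors. The Gaussian normalizers cancel, leaving a linear
    function of y."""
    return ((y - mu0) ** 2 - (y - mu1) ** 2) / (2.0 * sigma ** 2)

# Hypothetical parameters: bit 0 stored around mu0, bit 1 around mu1.
mu0, mu1, sigma = 2.0, 3.0, 0.1
midpoint_llr = llr_equal_variance(2.5, mu0, mu1, sigma)  # halfway between states
near_mu1_llr = llr_equal_variance(2.9, mu0, mu1, sigma)  # close to the bit-1 state
```

The magnitude of the LLR is the confidence: it is zero at the midpoint between the two means and grows as the sensed voltage moves toward either state, and its sign selects the more likely bit.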
Non-Parametric Distribution Learning
• Non-parametric distribution
  [Equation not preserved; label: kernel function]
• Histogram estimation
  [Equation not preserved; labels: volume of a hypercube of side h in D dimensions; count the number K of points falling within the h region]
• Kernel density estimation
  [Equation not preserved; label: smooth Gaussian kernel function]
• Summary
  Pros: Accurate model with good predictive performance
  Cons: Too complex; too many parameters need to be stored
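Kernel density estimation with a smooth Gaussian kernel can be sketched in one dimension as follows. The sample values and the bandwidth `h` are illustrative assumptions; the slides do not state how the bandwidth was chosen.

```python
import math

def gaussian_kde(samples, h):
    """Return p(x) = (1/N) * sum_i N(x; x_i, h^2), a Gaussian kernel
    density estimate with bandwidth h."""
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)

    return pdf

samples = [2.9, 3.0, 3.0, 3.1, 3.2]  # illustrative values, not measured data
pdf = gaussian_kde(samples, h=0.1)   # bandwidth picked by hand for this sketch
```

The estimate has to keep every sample in order to evaluate `pdf`, which is the storage cost listed under "Cons".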
Probability Density Function (PDF)
[Figure panels: P1 State, P2 State, P3 State]
• Estimation of the probability density function (PDF) of NAND flash memory threshold voltage using the non-parametric kernel density method