Program Interference in MLC NAND Flash Memory: Characterization, Modeling, and Mitigation Yu Cai1 Onur Mutlu1 Erich F. Haratsch2 Ken Mai1 1 Carnegie Mellon University 2 LSI Corporation Flash Challenges: Reliability and Endurance P/E cycles (provided) A few thousand P/E cycles (required) Writing the full capacity of the drive 10 times per day for 5 years (STEC) E. Grochowski et al., “Future technology challenges for NAND flash and HDD products”, Flash Memory Summit 2012 > 50k P/E cycles 2 NAND Flash Memory is Increasingly Noisy Write Noisy NAND Read 3 Future NAND Flash-based Storage Architecture Noisy Memory Signal Processing Raw Bit Error Rate Lower High Error Correction Uncorrectable BER < 10-15 Our Goals: Model NAND Flash as a digital communication channel Design efficient reliability mechanisms based on the model 4 NAND Flash Channel Model Write (Tx Information) Noisy NAND Read (Rx Information) Simplified NAND Flash channel model based on dominant errors Write Additive White Gaussian Noise Erase operation Program page operation Cell-to-Cell Interference Neighbor page program Cai et al., “Threshold voltage distribution in MLC NAND Flash Memory: Characterization, Analysis, and Modeling”, DATE 2013 ? Time Variant Retention Read Retention Cai et al., “Flash Correct-and-Refresh: Retention-aware error management for increased flash memory lifetime”, ICCD 2012 5 Outline Background on Program Interference Characterization of Program Interference Modeling and Predicting Program Interference Mitigation of Program Interference Read Reference Voltage Prediction Conclusions 6 How Current Flash Cells are Programmed Programming 2-bit MLC NAND flash memory in two steps ER (11) 1 Vth 0 ER (11) 1 Temp (0x) 0 0 ER (11) LSB Program P1 (10) Vth 1 P2 (00) P3 (01) MSB Program Vth 7 Basics of Program Interference (n+1,j-1) (n+1,j) (n+1,j+1) WL<2> ∆Vxy ∆Vy ∆Vxy MSB:6 WL<1> ∆Vx MSB:4 Victim Cell ∆Vx (n,j) WL<0> ∆Vxy ∆Vy ∆Vxy (n-1,j-1) LSB:3 LSB:1 MSB:2 LSB:0 (n-1,j) (n-1,j+1) Traditional model of victim cell threshold voltage changes when neighbor cells are programmed Vvictim (2Cx Vx Cy Vy 2CxyVxy ) / Ctotal 8 Previous Work Summary No previous work experimentally characterized and modeled threshold voltage distributions under program interference Previous modeling work Assumes linear correlation between the program interference induced threshold voltage change of the victim cell and the threshold voltage changes of the aggressor cells Coupling capacitance and total capacitance of each flash cell are the key coefficients of the model, which are process and design dependent random variables Their exact capacitance values are difficult to determine Previously proposed model cannot be realistically applied in flash controller 9 Outline Background on Program Interference Characterization of Program Interference Modeling and Predicting Program Interference Mitigation of Program Interference Read Reference Voltage Prediction Conclusions 10 Characterization Hardware Platform Cai et al., “FPGA-Based Solid-State Drive Prototyping Platform”, FCCM 2011 11 Characterization Studies Bitline to bitline program interference Wordline to wordline program interference Program in page order Program out of page order 12 Bitline to Bitline Program Interference Victim Cell P1 State Wordline ( N+1 ) P2 State P3 State Wordline (N) Wordline ( N-1 ) L X R L= { P0, P1, P2, P3 } R= { P0, P1, P2, P3 } Vth distributions of victim cells under 16 ( 4 x 4) different neighbor values {L, R} almost overlap Bitline to bitline program interferences are small 13 WL to WL Interference with In-Page-Order Programming Wordline ( N+1 ) Wordline (N) Wordline ( N-1 ) Victim Word-line Program interference increases the threshold voltage of victim cells and causes threshold voltage distributions shift to the right and become wider Program interference depends on the locations of aggressor cells in a block Direct neighbor wordline program interference is the dominant source of interference Neighbor bitline and far-neighbor wordline interference are orders of magnitude lower 14 WL to WL Interference with Out-of-Page-Order Programming Wordline ( N+1 ) Wordline (N) Wordline ( N-1 ) Victim Word-line The amount of program interference depends on the programming order of pages in a block In-page-order programming likely causes the least amount of interference Out-of-page-order programming causes much more interference 15 Comparison under Various Program Interference Signal-to-noise ratio comparison Out-of-page-order Programming 16 Data Value Dependence of Program Interference 80 MSB Page programmed in aggressor cell Victim <10> Victim <00> Victim <01> Victim Vth Increase 60 40 20 0 aggressor <11> aggressor<10> aggressor<00> aggressor <01> The amount of program interference depends on the values of both the aggressor cells and the victim cells 17 Outline Background on Program Interference Characterization of Program Interference Modeling and Predicting Program Interference Mitigation of Program Interference Read Reference Voltage Prediction Conclusions 18 Linear Regression Model Feature extraction for Vth changes based on characterization Threshold voltage changes on aggressor cell Original state of victim cell Enhanced linear regression model Vvictim(n, j ) jK nM before ( x , y ) V ( x , y ) V neighbor 0 victim (n, j ) y j K x n 1 Y X (vector expression) Maximum likelihood estimation of the model coefficients arg min( X Y 2 2 1) 19 Model Coefficient Analysis Direct above cell dominance Direct diagonal neighbor second Far neighbor interference exists Victim cell’s Vth has negative affect 20 Model Accuracy Evaluation (x,y)=(measured before interference, measured after interference) Ideal if no interference With Systematic Deviation (x,y)=(measured before interference, predicted with model) Ideal if prediction is 100% accurate Without Systematic Deviation 21 Distribution of Program Interference Noise Program interference noise follows multi-modal Gaussian-mixture distribution 22 Program Interference vs P/E Cycles Program interference noise distribution does not change significantly with P/E cycles 23 Outline Background on Program Interference Characterization of Program Interference Modeling and Predicting Program Interference Mitigation of Program Interference Read Reference Voltage Prediction Conclusions 24 Optimum Read Reference for Flash Memory Read reference voltage can affect the raw bit error rate f(x) g(x) f(x) State-A State-B g(x) State-A State-B Vth Vth v0 BER1 vref vref v1 v0 v’ref v1 f ( x)dx vref g ( x)dx BER2 v 'ref f ( x)dx v 'ref g ( x)dx There exists an optimal read reference voltage Predictable if the statistics (i.e. mean, variance) of threshold voltage distributions are characterized and modeled 25 Optimum Read Reference Voltage Prediction Learning function (periodically, every ~1k P/E cycles) Program known data pattern and test Vth Program aggressor neighbor cells and test victim Vth after interference Optimum read reference voltage prediction Default read reference voltage + Program interference noise mean Raw bit error rate Evaluation Results 30% lifetime improvement 32k-bit BCH Code (acceptable BER = 2x10-3) No read reference voltage prediction With read reference voltage prediction Read reference voltage prediction can reduce raw BER and increase the P/E cycle lifetime Outline Background of Program Interference Program Interference Characterization Modeling and Predicting Program Interference Read Reference Voltage Prediction to Mitigate Program Interference Conclusions 28 Key Findings and Contributions Methodology: Extensive experimentation with real 2Y-nm MLC NAND Flash chips Amount of program interference is dependent on Location of cells (programmed and victim) Data values of cells (programmed and victim) Programming order of pages Our new model can predict the amount of program interference with 96.8% prediction accuracy Our new read reference voltage prediction technique can improve flash lifetime by 30% 29 Program Interference in MLC NAND Flash Memory: Characterization, Modeling, and Mitigation Yu Cai1 Onur Mutlu1 Erich F. Haratsch2 Ken Mai1 1 Carnegie Mellon University 2 LSI Corporation