Testing Semiconductor Memories Cheng-Wen Wu 吳誠文 Lab for Reliable Computing Dept. Electrical Engineering National Tsing Hua University Outline Introduction RAM functional fault models and test algorithms RAM fault-coverage analysis Cocktail-March for testing wordoriented memories Testing multi-port RAMs Testing CAMs Testing flash memories mbist1.10 Cheng-Wen Wu, NTHU 2 Introduction Memory testing is a more and more important issue RAMs are key components for electronic systems Memories represent about 30% of the semiconductor market Embedded memories are dominating the chip yield Memory testing is more and more difficult Growing density, capacity, and speed Emerging new architectures and technologies Embedded memories: access, diagnostics & repair, heterogeneity, custom design, power & noise, scheduling, compression, etc. Cost drives the need for more efficient test methodologies IFA, fault modeling and simulation, test algorithm development and evaluation, diagnostics, DFT, BIST, BIRA, BISR, etc. Test automation is required Failure analysis, fault simulation, ATG, and diagnostics BIST/BIRA/BISR generation mbist1.10 Cheng-Wen Wu, NTHU 3 Typical RAM Production Flow Wafer Marking Final Test mbist1.10 Full Probe Test Laser Repair Post-BI Test Burn-In (BI) Visual Inspection QA Sample Test Cheng-Wen Wu, NTHU Packaging Pre-BI Test Shipping 4 Scope of RAM Testing Parametric Test: DC & AC Reliability Screening Long-cycle testing Burn-in: static & dynamic BI Functional Test Device characterization Failure analysis Fault modeling Simple but effective (accurate & realistic?) Test algorithm generation Small number of test patterns (data backgrounds) High fault coverage Short test time mbist1.10 Cheng-Wen Wu, NTHU 5 RAM Models Behavior Level Verilog/VHDL Function Level Verilog/VHDL/Block diagram Normally not synthesizable Circuit Level Spice/Schematic Layout Level GDS-II/Geometry ☞Who should provide the model? mbist1.10 Cheng-Wen Wu, NTHU 6 Memory Function Model Example mbist1.10 Cheng-Wen Wu, NTHU 7 RAM Fault Models (Static) Address-Decoder Fault (AF) No cell accessed by certain address Multiple cells accessed by certain address Certain cell not accessed by any address Certain cell accessed by multiple addresses Stuck-At Fault (SAF) Cell (line) SA0 or SA1 Transition Fault (TF) Cell fails to transit from 0 to 1 or 1 to 0 mbist1.10 Cheng-Wen Wu, NTHU 8 RAM Fault Models (Static) Bridging Fault (BF) Short between cells AND type or OR type Stuck-Open Fault (SOF) Cell not accessible due to broken line Neighborhood Pattern Sensitive Fault (NPSF) Active (Dynamic) NPSF Passive NPSF W Static NPSF mbist1.10 Cheng-Wen Wu, NTHU N BC S E 9 RAM Fault Models (Static) Coupling Fault (CF) State Coupling Fault (CFst) Coupled (victim) cell is forced to 0 or 1 if coupling (aggressor) cell is in given state Inversion Coupling Fault (CFin) Transition in coupling cell complements (inverts) coupled cell Idempotent Coupling Fault (CFid) Coupled cell is forced to 0 or 1 if coupling cell transits from 0 to 1 or 1 to 0 mbist1.10 Cheng-Wen Wu, NTHU 10 RAM Fault Models (Dynamic) Recovery Fault (RF) Sense Amplifier Recovery Fault (SARF) Sense amp saturation after reading/writing long run of 0 or 1 Write Recovery Fault (WRF) Write followed by reading/writing at different location resulting in reading/writing at same location Write-after-write recovery fault Read-after-write recovery fault Results in functional faults---detected at high speed (e.g., GALROW/GALCOL) Disturb Fault (DF) Victim cell forced to 0 or 1 if we read or write aggressor cell (may be the same cell) mbist1.10 Cheng-Wen Wu, NTHU 11 RAM Fault Models (Dynamic) Data Retention Fault (DRF) DRAM Refresh Fault Refresh-Line Stuck-At Fault Leakage Fault Sleeping Sickness---loose data in less than specified hold time (typically tens of ms) SRAM Leakage Fault Static Data Losses---defective pull-up Checkerboard pattern triggers max leakage BIST good for sync with refresh mechanism mbist1.10 Cheng-Wen Wu, NTHU 12 Test Time Complexity (100MHz) S ize N lo g N 1M 0 .0 1 s 0 .1 s 0 .2 s 11s 3h 16M 0 .1 6 s 1 .6 s 3 .9 s 11m 33d 64M 0 .6 6 s 6 .6 s 17s 1 .5 h 1 .4 3 y 256M 2 .6 2 s 26s 1 .2 3 m 12h 23y 1G 1 0 .5 s 1 .8 m 5 .3 m 4d 366y 4G 42s 7m 2 2 .4 m 32d 57c 2 .8 m 28m 1 .6 h 255d 915c mbist1.10 Cheng-Wen Wu, NTHU N 2 10N 16G N 1 .5 N 13 RAM Test Algorithm A test algorithm (or simply test) is a finite sequence of test elements A test element contains a number of memory operations (access commands) Data pattern (background) specified for the Read operation Address (sequence) specified for the Read and Write operations A march test algorithm is a finite sequence of march elements A march element is specified by an address order and a number of Read/Write operations mbist1.10 Cheng-Wen Wu, NTHU 14 Classical Test Algorithms Zero-One Algorithm [Breuer & Friedman 1976] Also known as MSCAN For SAF Solid background (pattern) Complexity is 4N { ( w 0 ); ( r 0 ); ( w1); ( r1)} mbist1.10 Cheng-Wen Wu, NTHU 15 Classical Test Algorithms Checkerboard Algorithm Zero-one algorithm with checkerboard pattern Complexity is 4N For SAF and DRF 1 0 1 mbist1.10 0 1 0 1 0 1 Cheng-Wen Wu, NTHU 16 Classical Test Algorithms Galloping Pattern (GALPAT) Complexity is 4N**2---only for characterization All AFs,TFs, CFs, and SAFs are located 1. Write background 0; 2. For BC = 0 to N-1 { Complement BC; For OC = 0 to N-1, OC != BC; { Read BC; Read OC; } Complement BC; } 3. Write background 1; 4. Repeat Step 2; mbist1.10 Cheng-Wen Wu, NTHU BC 17 Classical Test Algorithms Sliding (Galloping) Row/Column/Diagonal Based on GALPAT, but instead of a bit, a complete row, column, or diagonal is shifted Complexity is 4N**1.5 1 1 1 1 1 mbist1.10 Cheng-Wen Wu, NTHU 18 Classical Test Algorithms Butterfly Algorithm Complexity is 5NlogN 1. Write background 0; 2. For BC = 0 to N-1 { Complement BC; dist = 1; While dist <= mdist /* mdist < 0.5 col/row length */ 6 { Read cell @ dist north from BC; Read cell @ dist east from BC; 1 Read cell @ dist south from BC; 9 4 5 ,1 0 2 Read cell @ dist west from BC; 3 Read BC; dist *= 2; } 8 Complement BC; } 3. Write background 1; repeat Step 2; mbist1.10 Cheng-Wen Wu, NTHU 7 19 Classical Test Algorithms Moving Inversion (MOVI) Algorithm [De Jonge & Smeulders 1976] For functional and AC parametric test Functional (13N): for AF, SAF, TF, and most CF { ( w 0 ); ( r 0 , w 1, r 1 ); ( r 1, w 0 , r 0 ); ( r 0 , w 1, r 1 ); ( r 1, w 0 , r 0 )} Parametric (12NlogN): for Read access time 2 successive Reads @ 2 different addresses with different data for all 2-address sequences differing in 1 bit Repeat T2~T5 for each address bit GALPAT---all 2-address sequences mbist1.10 Cheng-Wen Wu, NTHU 20 Classical Test Algorithms Surround Disturb Algorithm Examine how the cells in a row are affected when complementary data are written into adjacent cells of neighboring rows 1. For each cell[p,q] /* row p and column q */ { Write 0 in cell[p,q-1]; Write 0 in cell[p,q]; Write 0 in cell[p,q+1]; Write 1 in cell[p-1,q]; Read 0 from cell[p,q+1]; Write 1 in cell[p+1,q]; Read 0 from cell[p,q-1]; Read 0 from cell[p,q]; } 2. Repeat Step 1 with complementary data; mbist1.10 Cheng-Wen Wu, NTHU 1 0 0 0 1 21 Classical Test Algorithms Zero-one and checkerboard algorithms do not have sufficient coverage Other algorithms are too time-consuming for large RAM Test time is the key factor of test cost Complexity ranges from N2 to NlogN Need linear-time test algorithms with small constants March test algorithms mbist1.10 Cheng-Wen Wu, NTHU 22 March Tests Zero-One (MSCAN) { ( w 0 ); ( r 0 ); ( w1); ( r1)} Modified Algorithmic Test Sequence (MATS) [Nair, Thatte & Abraham 1979] OR-type address decoder fault { ( w 0 ); ( r 0 , w1); ( r1)} AND-type address decoder fault { ( w1); ( r1, w 0 ); ( r 0 )} MATS+ [Abadir & Reghbati 1983] For both OR- & AND-type AFs and SAF { ( w 0 ); ( r 0 , w 1); ( r1, w 0 )} mbist1.10 Cheng-Wen Wu, NTHU 23 March Tests Marching 1/0 [Breuer & Friedman 1976] For AF, SAF, and TF { ( w 0 ); ( r 0 , w 1, r 1); ( r 1, w 0 , r 0 ); ( w 1); ( r 1, w 0 , r 0 ); ( r 0 , w 1, r 1)} MATS++ [Goor 1991] Also for AF, SAF, and TF Complete and irredundant { ( w 0 ); ( r 0 , w1); ( r1, w 0 , r 0 )} mbist1.10 Cheng-Wen Wu, NTHU 24 March Tests March X For AF, SAF, TF, & CFin { ( w 0 ); ( r 0 , w1); ( r1, w 0 ); ( r 0 )} March C [Marinescu 1982] For AF, SAF, TF, & all CFs---redundant { ( w 0 ); ( r 0 , w 1); ( r1, w 0 ); ( r 0 ); ( r 0 , w 1); ( r 1, w 0 ); ( r 0 )} March C- [Goor 1991] Also for AF, SAF, TF, & all CFs---irredundant { ( w 0 ); ( r 0 , w 1); ( r1, w 0 ); ( r 0 , w 1); ( r1, w 0 ); ( r 0 )} mbist1.10 Cheng-Wen Wu, NTHU 25 March Tests Limitations Sequential faults in address decoders RF NPSF (9N-2) for 2-CF [Marinescu 1982] (2NlogN+11N) for 3-CF [Cockburn 1994] Solutions Address sequence variation Hopping Pseudorandom mbist1.10 Cheng-Wen Wu, NTHU 26 Coverage of March Tests SAF TF AF SOF C F in C F id C F st M AT S + + M arch X M arch Y M arch C 1 1 1 1 1 1 1 1 1 1 1 1 1 .002 1 .002 .75 1 1 1 .375 .5 .5 1 .5 .625 .625 1 ☞ Extended March C- (11N) has a 100% coverage of SOF mbist1.10 Cheng-Wen Wu, NTHU 27 Testing Word-Oriented RAM Background bit is replaced by background word MATS++: { ( wa ); ( ra , wa ' ); ( ra ' , wa , ra )} Conventional method is to use logm+1 different backgrounds for m-bit words m=8: 00000000, 01010101, 00110011, and 00001111 Apply the test algorithm logm+1=4 times, so complexity is 4*6N/8=3N mbist1.10 Cheng-Wen Wu, NTHU 28 Cocktail-March Algorithms Motivation: Repeating the same algorithm for all logm+1 backgrounds has redundancy Different algorithm targets different faults Approach: Use multiple backgrounds in a single algorithm run Merge and forge different algorithms and backgrounds into a single algorithm Good for word-oriented memories mbist1.10 Cheng-Wen Wu, NTHU 29 March-CW Algorithm: March C- for solid background (0000) Then a 5N March for each of other standard backgrounds (0101, 0011): Result: { ( wa , wa ' , ra ' , wa , ra )} Complexity is (10+5logW)N, where W is word length and N is word count Test time is reduced by 39% if W=4, as compared with extended March C Improvement increases as W increases mbist1.10 Cheng-Wen Wu, NTHU 30 Comparison (Full Coverage) mbist1.10 Cheng-Wen Wu, NTHU 31 Testing NPSF NPSF test approaches Tiling Multi-background march Easy BIST implementation 5-cell neighborhood B S E W N B S E W N B E B ase cell: B D eleted neighborhood cells: N , E, W , S N eighborhood cells: B and N , E, W , S mbist1.10 Cheng-Wen Wu, NTHU 32 NPSF Models Static NPSF (SNPSF) BC forced to a certain state due to a certain deleted neighborhood (DN) pattern Passive NPSF (PNPSF) BC frozen due to a certain DN pattern Active NPSF (ANPSF) BC content changes due to a change in DN pattern Change: a transition in one DN cell, with other DN cells & BC containing a certain pattern Assumptions: Single NPSF Address scramble table is available Memory is bit-oriented Word-oriented memory is tested as multiple bit-oriented ones mbist1.10 Cheng-Wen Wu, NTHU 33 Test Strategy Multi-Background March To generate all neighborhood patterns Solid BG (FC < 30%) Another BG 0 0 0 0 0 0 0 0 0 0 0 0 0 X 0 0 0 0 0 0 0 0 X 1 1 1 1 1 1 1 1 X 0 0 0 0 1 1 1 1 X 1 1 1 1 1 0 1 1 0 1 1 0 1 1 0 1 1 X 1 1 0 1 0 1 0 0 X 0 0 1 0 0 1 0 0 X 1 1 0 1 1 0 1 1 X 0 0 1 0 mbist1.10 Cheng-Wen Wu, NTHU 34 Testing PNPSF March 17N: ( w a ); ( w b , ra , w a ); ( rb , w a ); (ra , w b ); ( ra ,w b ); (rb , w a ); M arch E lem ents (rb , w a , ra , w b ); (ra ) NW BES A lg1: (w 0); (w 1, r1, w 0); (r0); 0 0 00 A lg2: (w 1); (w 0, r0, w 1); (r1); 1 1 11 A lg3: (w 0); (w 1); (r1); 11 00 A lg4: (w 1); (w 0); (r0); 11 00 A lg5: (w 0); (w 1); (r1); 00 11 A lg6: (w 1); (w 0); (r0); 00 11 mbist1.10 Cheng-Wen Wu, NTHU 35 Data Background Generation Data backgrounds BG1: all zero BG2: Ar[0], LSB of row address BG3: Ar[1], second bit of row address BG4: Ar[0]Ar[1] 00 01 10 11 00 0 0 0 0 01 0 0 0 0 10 0 0 0 0 11 0 0 0 0 B G .1 mbist1.10 00 01 10 11 00 0 0 0 0 01 1 1 1 1 10 0 0 0 0 11 1 1 1 1 B G .2 00 01 10 11 00 0 0 0 0 01 0 0 0 0 10 1 1 1 1 11 1 1 1 1 B G .3 Cheng-Wen Wu, NTHU 00 01 10 11 00 0 0 0 0 01 1 1 1 1 10 1 1 1 1 11 0 0 0 0 B G .4 36 Testing ANPSF (w0) ; (w1) ; (w0) ; ( r0 ,w1,w0) ; (r0) ( r1 ,w0,w1) ; ( r0 ,w1) ; (r1) ( r1 ,w0) ; 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 0 0 1 (r0) 1 0 0 0 1 1 1 0 March 12N: (w a); (w b, ra, w a); (rb, w a); mbist1.10 (ra, w b); (rb, w a, w b); (ra) Cheng-Wen Wu, NTHU 37 Time Complexity 00 01 10 11 00 01 10 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 01 10 11 00 01 10 11 0 1 0 1 BG.1 0 0 0 0 0 0 0 0 1 1 1 1 BG.5 0 1 0 1 1 0 1 0 00 01 10 11 0 1 0 1 BG.2 1 1 1 1 00 01 10 11 0 1 0 1 0 1 0 1 1 0 1 0 0 1 0 1 0 1 0 1 0 1 0 1 00 01 10 11 00 01 10 11 0 0 0 0 BG.3 00 01 10 11 00 01 10 11 00 01 10 11 1 0 1 0 00 01 10 11 1 0 1 0 BG.6 0 0 0 0 1 1 1 1 1 1 1 1 BG.7 0 0 0 0 1 1 1 1 BG.4 00 01 10 11 00 01 10 11 1 1 1 1 0 0 0 0 00 01 10 11 00 01 10 11 0 1 0 1 1 0 1 0 1 0 1 0 0 1 0 1 BG.8 12 N/BG X 8 BG = 96N Detects all NPSFs mbist1.10 Cheng-Wen Wu, NTHU 38 Multi-Port Memories Popular architectures k-port (k > 1) n-read-1-write FIFO Address A Port A Data A Control A Storage Address B Port B Data B Control B mbist1.10 Cheng-Wen Wu, NTHU 39 2-Port Topology BL A W LA BL A BL A BL A 3 Interport W L short W LA W LB 1 W LB Interport BL short W LA 2 W LA BL B mbist1.10 BL B BL B Cheng-Wen Wu, NTHU W LB W LB BL B 40 Inter-Port Word-Line Short F au lt-F ree Port A Port B Address 1 Address 2 F au lty Cell 1 Address 1 Cell 1 Address 2 Cell 2 Address 3 Cell 3 Cell 2 * Functional test complexity: O(N3) mbist1.10 Cheng-Wen Wu, NTHU 41 Inter-Port Bit-Line Short F au lt-F ree Port A Port B Address Address F au lty Address Cell Address Cell Address Cell Address Cell Cell Cell * Functional test complexity: O(N2) mbist1.10 Cheng-Wen Wu, NTHU 42 Address Scrambling Logical Addr A B C D A - word line select B - I/O select C - bit line select Physical Addr 7 6 5 4 3 2 1 0 8 7 row 6 5 4 3 2 1 0 D - bit position in a word column A d d ress A b it3 b it2 b it1 b it0 D ata w o rd A 3 mbist1.10 0 Cheng-Wen Wu, NTHU 43 Reading Neighboring Cells Read neighboring cells to detect inter-port faults: rN, rS, rE, and rW N W B S mbist1.10 E 0 0 0 1 1 1 0 0/1 1 1 1/0 0 1 1 1 0 0 0 Cheng-Wen Wu, NTHU 44 TAGS-PS March Test Section 1 Port 1 Section 2 Single-port Test Algorithm Multi-port AF Test Section 3 Inter-port Test Port 2 Port m (a) Port 1 Port 2 Single-port Test Algorithm Multi-port AF Test Inter-port MPF Test Port m (b) mbist1.10 Cheng-Wen Wu, NTHU 45 Dual-Port RAM Test Section 1 Section 2 Section 3 Port 1 w0 r0 w 1 r 1 w 0 r0 w 1 r 1 r 1 w 0 r0 r0 r0 w 1 r 1 w 0 - - Port 2 - - - - - - r0 w 1 r 1 w 0 - - - - - - - - - - - - r0 w 1 r 1 w 0 N S r 1r 0 N (a) Section 1 Section 2 Port 1 w0 r0 w 1 r 1 w 0 r0 w 1 r 1 r 1 w 0 r0 Port 2 - r 1r 0 N S N S r 0r 1 - - - - - - r0 r0 w 1 r 1 w 0 - - - - - r0 w 1 r 1 w 0 - - - - Section 3 (b) mbist1.10 Cheng-Wen Wu, NTHU S r 0r 1 46 Compacted Dual-Port RAM Test S e c tio n 1 P o rt 1 w0 P o rt 2 - r0 w 1 N r 1r S 0 r1w0 N r 0r S 1 - - r1 r0 w 1 - - - r0 r0 r1w0 - - S e c tio n 2 (c ) P o rt 1 w0 P o rt 2 - r0 w 1 N r 1r S 0 r1w0 N r 0r S 1 - - - - r0 r0 w 1 r1w0 (d ) * Time complexity: 10N mbist1.10 Cheng-Wen Wu, NTHU 47 Four-Port RAM Test AF Test Port 1 w0 r0 w 1 Port 2 - r 1r Port 3 - r 1r Port 4 - r 1r r 1w 0 - - - - - - - - - - - - r 1w 0 - - - - - - - - r 1w 0 - - - - r0 w 1 r 1w 0 N S r N r S r0 w 1 N S r N r S r 1r N S r N r S r 1r 0 0 0 0 0 0 1 1 1 S N r S r N r0 w 1 S N r S r N r 1r 0 0 0 0 1 1 N S 0 r N 0 r S 1 Inter-port Test * Time complexity: 17N mbist1.10 Cheng-Wen Wu, NTHU 48 Testing 6-Read-1-Write RAM M0 M1 P o rt 1 w0 - w1 P o rt 2 - r0 r P o rt 3 - r P o rt 4 M2 - M3 w0 - w1 r N - r N r1 r N r N - r N - S r0 r N r S r N r N - r N - r S r r N r S r N r N - r N - S r1 r S r0 r N r1 r N r0 - r1 - S r1 r S r N r1 r N r0 - r1 - r1 r S r - r0 r S r N r r r S r N P o rt 5 - P o rt 6 - r0 r P o rt 7 - r r w0 - S 1 - N r N w1 r r0 r 1 - N S N w0 M6 r1 r r1 r 1 - M5 N S N M4 0 0 0 0 0 0 0 0 1 1 1 1 1 1 T e st fo r p o rts o f d ista n c e 1 S 1 S 1 S 1 r 0 0 0 0 0 0 0 0 1 1 1 1 1 1 T e st fo r p o rts o f d ista n c e 2 1 1 1 1 0 0 0 0 T e st fo r p o rts o f d ista n c e 3 * Time complexity: 13N mbist1.10 Cheng-Wen Wu, NTHU 49 Flash Memory Testing Testing nonvolatile memories: Masked ROM---exhaustive; pseudorandom PROM (OTP) & EPROM---dummy row EEPROM & flash memory---dummy row? Testing flash memory core is hard Customized core and I/O Isolation (accessibility) Reliability issues: disturbances, over program/erase, under program/erase, data retention, cell endurance, etc. Long program/erase time mbist1.10 Cheng-Wen Wu, NTHU 50 Flash Memory Overview Flash memory can be programmed and erased electrically Has the advantages of EPROM and EEPROM A stacked gate transistor with both the control gate (CG) and floating gate (FG): Control gate Floating gate Drain Source n+ n+ P-Si mbist1.10 D G S Cheng-Wen Wu, NTHU 51 Flash Memory Program & Erase • Program(1 to 0): channel hot-electron (CHE) injection or FowlerNordheim (FN) electron tunneling • Erase (0 to 1): FN electron tunneling • By the entire chip or large blocks (flash erasure) • Different products have different program/erase mechanisms Program mbist1.10 Erase Cheng-Wen Wu, NTHU 52 Flash Memory Cell Types Stacked-gate Split-gate Select-gate Operations: Read, Program, Erase (Flash Erase) As opposed to Read and Write in RAM mbist1.10 Cheng-Wen Wu, NTHU 53 Programming Scheme Comparison C H E In je c tio n C h a n n e l F N Tu n n e lin g H ig h p o w e r (d u a l e x te rn a l s u p p lie s ) L o w p o w e r (s in g le e x te rn a l s u p p ly ) L o w o x id e fie ld s tre s s H ig h o x id e fie ld s tre s s F a s te r p ro g ra m o p e ra tio n (b y te p ro g ra m lim ite d b y p o w e r) S lo w e r p ro g ra m o p e ra tio n (im p ro v e d b y pa g e p ro g ra m ) mbist1.10 Cheng-Wen Wu, NTHU 54 NOR-Array Structure mbist1.10 Cheng-Wen Wu, NTHU 55 NAND-Array Structure Select (drain) WL 1 WL 2 WL 3 WL 4 WL 16 Select (source) BL i mbist1.10 Cheng-Wen Wu, NTHU 56 Disturbance Example (I) Program Disturbance BL 0 WL SL BL 1 BL 2 SL BL 3 0V 0 0V 6V Drain-Disturb on "Programmed Cell" WL 10V 1 0V 10V 0V 6V Programming WL 10V 0V 0V Gate-Disturb on "Erased Cell" 2 NOR-Type Common Ground – Standard (Stacked Gate) mbist1.10 Cheng-Wen Wu, NTHU 57 Disturbance Example (II) Read Disturbance BL 0 WL 0 WL 1 SL BL 1 BL 2 SL BL 3 5V 0V 1V Soft-Program on "Selected Cell" WL 2 mbist1.10 Cheng-Wen Wu, NTHU 58 Disturbance Example (III) Program Disturbance BL 0 SSL BL 2 BL 1 3.3V 0V 3.3V 3.3V Gate-Disturb on "Erased Cell" 0V WL 10V 0 2.8V WL 1 0V 18V 18V Program '1' Program '0' 2.8V WL 2 GSL mbist1.10 10V 0V 10V 2.8V Gate-Disturb on "Programmed Cell" 0V 0V Cheng-Wen Wu, NTHU 0V 59 Disturbance Example (IV) Read Disturbance BL 0 BL 1 Erase Disturbance BL 2 5V BL 0 0.7V 5V SSL 5V WL 0 5V 5V WL 0 21V 0V 21V 0V 21V WL 1 0V 0V WL 1 21V 0V 21V 0V 21V WL 2 21V 0V 21V 0V 21V V th =+2V WL 2 5V SSL BL 2 BL 1 Floating Floating V th =-3V 5V soft-program GSL 5V mbist1.10 5V GSL Cheng-Wen Wu, NTHU Floating Floating 60 Gate Program Disturb Fault (GPDF) Conditions: G 1.Victim cell initial value is a logic ‘1’ 2.Aggressor “10” (program) Control Gate S Floating Gate D Victim “10” (program) V(L) Source Drain V(H) V(H) Substrate V(L) B V(Gd) mbist1.10 Cheng-Wen Wu, NTHU 61 Gate Erase Disturb Fault (GEDF) Conditions: G 1.Victim cell initial value is a logic ‘0’ 2.Aggressor “10” (program) Control Gate S Floating Gate Source D Victim “01” (erase) V(L) Drain V(H) V(H) Substrate V(L) B V(Gd) mbist1.10 Cheng-Wen Wu, NTHU 62 Drain Program Disturb Fault (DPDF) Conditions: 1.Victim cell initial value is a logic ‘1’ 2.Aggressor “10” (program) Victim “10” (program) V(H) • During programming, erased cells on unselected rows on a bit-line that is being programmed may have a fairly deep depletion region formed under them V(H) • Electrons entering this depletion region can be accelerated by the electric field and injected over the oxide potential barrier to adjacent floating gates V(L) V(Gd) mbist1.10 Cheng-Wen Wu, NTHU 63 Drain Erase Disturb Fault (DEDF) Conditions: 1.Victim cell initial value is a logic ‘0’ G 2.Aggressor “10” (program) Victim “01” (erase) Control Gate S Floating Gate Source D V(L) Drain V(H) V(H) Substrate V(L) B V(Gd) mbist1.10 Cheng-Wen Wu, NTHU 64 Read Disturb Fault (RDF) Conditions: 1. Occurs on the selected cell G 2. Cell initial value is logic ‘1’ Control Gate S Floating Gate Source Soft-Program Drain Substrate B mbist1.10 D • During the read operation, hot carriers can be injected from the channel into the FG even if at low gate voltages Cheng-Wen Wu, NTHU 65 Over Erase Fault (OEF) Flash memory erase mechanism is not self-limiting Threshold voltage can be low enough to turn the cell into a depletion-mode transistor Fault behavior: An unselected cell in the same bit-line has excessive source-drain leakage current Reading that cell leads to incorrect value (like DEDF) Cannot be programmed correctly (like TF) mbist1.10 Cheng-Wen Wu, NTHU 66 Basic RAM Faults for Flash Memory Address-Decoder Fault (AF) Stuck-At Fault (SAF) Transition Fault (TF) Stuck-Open Fault (SOF) Bridging Fault (BF) Coupling faults need not be considered! Replaced by disturb faults mbist1.10 Cheng-Wen Wu, NTHU 67 Reliability Consideration Reliability characteristics of floating-gate ICs depend on Circuit density, circuit design, and process integrity Memory array type and cell structure Reliability stressing and testing must then be oriented toward determining the relevant failure rates for the particular array under consideration mbist1.10 Cheng-Wen Wu, NTHU 68 Data Retention Fault Retention time: the time from data storage to the time at which a verifiable error is detected from any cause Intrinsic retention times exceed millions of years in the operating temperature range Months at 300°C 1 million years at 150 °C 120 million years at 55 °C Data Retention Fault (DRF) Static leakage Built-in data retention test circuit mbist1.10 Cheng-Wen Wu, NTHU 69 Cell Endurance Fault Endurance: a measure of the ability to meet datasheet specifications as a function of accumulated program/erase cycles Endurance limit is a result of damage to the dielectric around the floating gate caused by electric stresses In many flash devices, the end of endurance is generally caused by hot electron trapping in the charge transport oxide Cell Endurance Fault (CEF) Threshold window shift due to increased program/erase cycles Built-in stress test circuit mbist1.10 Cheng-Wen Wu, NTHU 70 Composite Failure Rate Determination 125°C dynamic life stress The 125°C dynamic life stress is the standard MOS memory continuous dynamic read in a burn-in chamber Endurance test The endurance test is the repeated data complementing of floating-gate devices, possibly at temperature extremes Extended data retention stress This test is constituted by a high-temperature bake with a charge polarity that is opposite to the equilibrium state on the floating gate mbist1.10 Cheng-Wen Wu, NTHU 71 Typical Test Modes (Characterization) Stress (row/column) Reverse tunneling stress Punch through stress Tox stress DC stress Mass program Weak erase Leak (thin-oxide, bit-line, etc.) Cell current; cell Vt Margin Etc. mbist1.10 Cheng-Wen Wu, NTHU 72 Test Patterns A RAM test pattern definition includes both the data pattern and the address pattern The time to read a pattern is the same as the time to write a pattern For flash memories, however, the address and data pattern definitions must be segregated It has long write times relative to the read times Typical data patterns: Solid, checkerboard, random, etc. Typical address patterns: Address increment/decrement, address complement, column/diagonal galloping, etc. mbist1.10 Cheng-Wen Wu, NTHU 73 Testing GPDF Flash Program the first column Read all cells except the first column Flash Program any column except the first Read the first column *Assume reading and programming are done column-wise Source: Saluja, et al., Int. Conf. VLSI Design, 2000 mbist1.10 Cheng-Wen Wu, NTHU 74 Testing GEDF Flash Program all cells Read all cells except the last column Program any column except the last Read the last column *Assume reading and programming are done column-wise Source: Saluja, et al., Int. Conf. VLSI Design, 2000 mbist1.10 Cheng-Wen Wu, NTHU 75 Test Coverage: Previous Results Fault DCP DCE DD EF GF SAF 50% 50% 50% 100% 100% TF 12.5% 50% 50% 87.5% 62.5% AF 40% 0% 0% 44.5% 40% SOF 0% 0% 0% 12.5% 6.2% CFst 25% 25% 25% 50% 50% GPDF 33.3% 0% 0% 100% 33.3% GEDF 0% 100% 75% 100% 100% DEDF 0% 75% 100% 100% 100% DPDF 0% 0% 0% 0% 0% Source: Saluja, et al., Int. Conf. VLSI Design, 2000 mbist1.10 Cheng-Wen Wu, NTHU 76 March-Based Flash Test: March-FT {(f); (r1,w0,r0); (r0); (f); (r1,w0,r0); (r0)} This Flash memory is NOR type (Stacked gate). Memory size (N) : 65536 Test length : 2(chip erase time) + 131072(word program time) + 393216(word read time) Test time : 7.207173 sec SAF : 100% (131072 / 131072) TF : 100% (131072 / 131072) Flash Type = NOR SOF : 100% (65536 / 65536) Gate Type = Stack AF : 100% (4294901760 / 4294901760) Row Number = 256 CFst : 100% (17179607040 / 17179607040) Col Number = 256 GPD : 100% (16711680 / 16711680) Word Length = 1 GED : 100% (16711680 / 16711680) Chip erase time = 3 sec DPD : 100% (16711680 / 16711680) Word program time = 9u sec DED : 100% (16711680 / 16711680) Word read time = 70n sec RD : 100% (65536 / 65536) OE : 100% (65536 / 65536) mbist1.10 Cheng-Wen Wu, NTHU P.S. 77 Test Length (Bit-Oriented) Notation: DCP 2(F) + 2r(P) + rc(R) F : Flash time P : Program time DCE (F) + (c+1)r(P) + rc(R) R : Read time DD (F) + (r+1)c(P) + rc(R) r : row number EF 2(F) + (rc+2r+c-2)(P) + (2rc+r+c-3)(R) GF 2(F) + (rc+2r+c-1)(P) + (2rc+c+r-2)(R) FT 2(F) + 2rc(P) + 6rc(R) c : column number mbist1.10 Cheng-Wen Wu, NTHU 78 Test Length (Word-Oriented) Word length = w: 2(F)+2rc(P)+6rc(R)+log(w)[2(F)+rc(P)+rc(R)] solid background standard background testing time testing time Solid: 0000 (1111) Standard: 0101 (1010), 0011 (1100) Ex: word length w = 4 6(F) + 4rc(P) + 8rc(R) mbist1.10 Cheng-Wen Wu, NTHU 79 Test Algorithm Generation by Simulation (TAGS) T(N) Test algorithms 2N (f); (r1) 3N (f); (w0); (r0) 4N (f); (r1,w0); (r0) 5N (f); (r1,w0,r0); (r0) 6N (f); (r1,w0,r0); (r0,w0) 7N (f); (r0); (r1,w0,r0); (r0,w0) 8N (f); (r1,w0); (f); (r1,w0,r0); (r0) 9N (f); (r1,w0); (r0); (f); (r1,w0,r0); (r0) 10N (f); (r1,w0,r0); (r0); (f); (r1,w0,r0); (r0) mbist1.10 Cheng-Wen Wu, NTHU 80 Embedded Memory Testing Memories are one of the most universal cores In Alpha 21264, cache RAMs represent 2/3 transistors and 1/3 area; in StrongArm SA110, the embedded RAMs occupy 90% area [Bhavsar, ITC-99] In average SOC, memory cores will represent more than 90% of the chip area by 2010 [ITRS 2000] Embedded memory testing is increasingly difficult High bandwidth (speed and I/O data width) Heterogeneity and plurality Isolation (accessibility) AC test, diagnostics, and repair BIST is considered the best solution mbist1.10 Cheng-Wen Wu, NTHU 81 Embedded RAM Test Support Te s t ru n P ro b e te s t P re -B I te s t BI Po s t-B I te s t F in a l te s t mbist1.10 Is o la tio n o n ly Is o la tio n & B IS T Te ste r Te ste r/B IS T Te ste r B IS T B I b o a rd B IS T Te ste r B IS T Te ste r Te ste r/B IS T Cheng-Wen Wu, NTHU 82 RAM BIST Approaches Methodology Processor-based BIST Programmable Hardwired BIST Fast Compact Interface Serial (scan, 1149.1) Parallel (embedded controller; hierarchical) Patterns (address sequence) March Pseudorandom mbist1.10 Cheng-Wen Wu, NTHU 83 Typical RAM BIST Architecture Controller Pattern RAM Generator Go/No-Go Comparator BIST Module RAM Controller mbist1.10 Cheng-Wen Wu, NTHU 84 Serial March (SMarch) { ( r x w 0 ) ( r 0 w 0 ) ; ( r 0 w 1) ( r 1 w1) ; ( r 1 w 0 ) ( r 0 w 0 ) ; c c c c c c ( r 0 w1 ) ( r 1 w1 ) ; ( r 1 w 0 ) ( r 0 w 0 ) ; ( r 0 w 0 ) ( r 0 w 0 ) } c c From March C- • Serial interface • One BIST for all (cascaded) • One-bit read/write at a time, but one pattern per cycle Addr • Slow • No diagnostics SI X Decoder • c c c c Memory Cell Array Y Decoder Transparent Serial Data-MUX c D SO c Q Source: Nadeau-Dostie et al., IEEE D&T, Apr. 1990 mbist1.10 Cheng-Wen Wu, NTHU 85 Syntest MBIST Algorithms: March C MOVI March C++ Checkerboard CE FSM OE WEB Shared controller for multiple RAMs Synthesizable RTL code ADR Control A Data Generator D Analyzer Q Pass BistFail Finish Source: Syntest mbist1.10 Cheng-Wen Wu, NTHU 86 NTHU/GUC EDO DRAM BIST mbist1.10 Cheng-Wen Wu, NTHU 87 DRAM Page-Mode Read-Write Cycle mbist1.10 Cheng-Wen Wu, NTHU 88 Programmable Memory BIST (PMBIST) mbist1.10 Cheng-Wen Wu, NTHU 89 PMBIST Architecture mbist1.10 Cheng-Wen Wu, NTHU 90 Controller and Sequencer Controller Microprogram Hardwired Shared CPU core IEEE 1149.1 TAP Sequencer (Pattern Generator) Counter LFSR LUT mbist1.10 Cheng-Wen Wu, NTHU 91 Controller mbist1.10 Cheng-Wen Wu, NTHU 92 Sequencer D Q D Q C o m b in a tio n L o gic # 0 D Q D Q C o m b in a tio n L o gic # 1 B IS T C o n tr o lle r D Q D Q R o w A d d r e ss C o u n te r C o lu m n A d d r e ss C o u n te r C o n tr o l C ou n t e r S ta te D Q eD R A M D Q eD R A M c o n tr o l sign a l F la gs MCK MCK C o m p ar a to r mbist1.10 Cheng-Wen Wu, NTHU 93 PMBIST Test Modes 1. Scan-Test Mode 2. RAM-BIST Mode 1.Functional faults 2.Timing faults (setup/hold times, rise/fall times, etc.) 3.Data retention faults 3. RAM-Diagnosis Mode 4. RAM-BI Mode mbist1.10 Cheng-Wen Wu, NTHU 94 PMBIST Controller Commands Bit 4 A ddressing order Bit 3 D ata type Bit 2, Bit 1, Bit 0 O perations 1: (increasing) 1: d = D B 000: EO T 0: (decreasing) 0: d = ~D B 001: Rd 010: W d 011: RdW ~d 100: 101: 110: 111: mbist1.10 (End of test) (REA D C ycle) (Early W R ITE C ycle) (REA D -W R ITE) C ycle ED O -PA G E-M O D E Wd (Early W R ITE C ycle RdW ~d (REA D -W R ITE) C ycle RdW ~dR~d (REA D Early W R ITE Cycle) Refresh Cheng-Wen Wu, NTHU 95 PMBIST Control Sequence mbist1.10 Cheng-Wen Wu, NTHU 96 BIST Area Overhead 3% Overhead Mem size 0.3% mbist1.10 Cheng-Wen Wu, NTHU 97 Processor-Based RAM BIST Processor mbist1.10 Cheng-Wen Wu, NTHU 98 On-Chip Processor-Based RAM BIST BIST program is stored in boot ROM during design phase, and memory BIST is done by executing BIST program Address bus DATAI bus DATAO bus Control bus BOOT ROM Embedded memory CPU core I/O port mbist1.10 Cheng-Wen Wu, NTHU 99 Testing RAM Core by On-Chip CPU 6502 assembly program that performs March Ctest algorithm data background M0: M1: mbist1.10 .org LDX LDA 0HFF00 #$$00 #$$55 STA INX CPX BNE LDX 0000,X (W0) #$$FF M0 #$$00 (R0W1) LDA CMP BNE LDA STA INX CPX BNE LDX . . . . 0000,X #$$55 ERROR #$$AA 0000,X (R0W1) #$$FF M1 #$$00 . write data background to memory March C- algorithm (R1W0) (R1W0) (R0) read from memory write data background to memory Cheng-Wen Wu, NTHU 100 Test Speed Consideration Processor-BIST speed is lower than dedicated BIST circuit Total clock cycles to implement MARCH C- is O(114N) Table 1. 6502 instruction cycles IM M ABX IM P REL LD A 2 4 - - LD X 2 - - - S TA - 4 - - IN X - - 2 - C PX 2 - - - BNE - - - 2~4 CM P 2 - - - mbist1.10 Cheng-Wen Wu, NTHU 101 NTHU Processor-Programmable BIST ADDR_cpu 0 ADDR A 1 DATAO_cpu 0 clock_cpu BIST core 1 mux_sel ctrl_bist 1 ctrl_cpu 1 DATAI_cpu control on-chip bus embedded CPU ADDR_bist DATAO_bist DATAO DATAI_bist DATAI_sys embedded memory control 0 0 DI DATAI DO BIST circuitry mux_sel = 0 in normal mode mux_sel = 1 in BIST mode mbist1.10 Cheng-Wen Wu, NTHU I/O circuitry 102 Advantages and Disadvantages Advantages Reuse of on-chip CPU core Might need modification Core March elements can be implemented in hardware, allowing different March algorithms to be executed via assembly programming Disadvantages Some address space will be occupied by PPBIST Area overhead mbist1.10 Cheng-Wen Wu, NTHU 103 PPBIST Implementation mbist1.10 Cheng-Wen Wu, NTHU 104 PPBIST Data Registers R e g is te r F u n c tio n R BG Sto re b a c k g ro u n d d a ta R AL Sto re lo w e s t a d d re s s R AH Sto re h ig h e s t a d d re s s R ME Sto re c u rre n t M a rc h e le m e n t R IR In s tru c tio n re g is te r R FLAG Sta tu s re g is te r R ED E rro n e o u s re s p o n s e o f d e fe c tiv e m e m o ry c e ll R EA A d d re s s o f d e fe c tiv e m e m o ry c e ll mbist1.10 Cheng-Wen Wu, NTHU 105 PPBIST Test Procedure CPU write data back ground CPU write start/stop address CPU write MARCH element instruction CPU write START instruction to wrapper BIST core (W0) BIST core (R0W1) BIST core (R1W0) BIST core (R0W1) compare error? BIST core (R1W0) yes no complete? yes write complete flag mbist1.10 Cheng-Wen Wu, NTHU BIST core (R0) write error flag write faulty address write faulty data no CPU take over 106 PPBIST Example Using 6502 6502 assembly program that performs March Ctest algorithm under the proposed BIST scheme START: LDA STA LDA STA LDA STA LDA STA LDA STA M0: LDA STA JSR M1: LDA . . . . . . mbist1.10 #$$55 0HFFE0 #$$00 0HFFE1 #$$00 0HFFE2 #$$FF 0HFFE3 #$$0F 0HFFE4 #$$00 0HFFE5 BIST #$$01 END: LDA STA JMP #$$04 0HFFE6 FINISH BIST: LDA STA LDA CMP BEQ CMP BNE RTS #$$00 0HFFE6 0HFFE7 #$$01 ERROR #$$FF LOOP ERROR: LDA STA JMP #$$03 0HFFEA FINISH LOOP: Cheng-Wen Wu, NTHU 107 PPBIST Example Addresses of the registers in the BIST experiment R e g is te r A d d re s s R e g is te r A d d re s s R BG FFE0 R IR FFE6 R AL FFE1 ~ FFE2 R FLAG FFE7 R AH FFE3 ~ FFE4 R ED FFE8 R ME FFE5 R EA FFE9 ~ FFEA March elements and the corresponding RME mbist1.10 M0 M1 M2 M3 M4 M5 0H 1H 2H 3H 4H 5H Cheng-Wen Wu, NTHU 108 Experimental Results Total test time in terms of clock cycles The sum of all the March elements' test time plus 30 clock cycles 10N clock cycles to perform March C- Test time of each March element : mbist1.10 M0 M1 M2 M3 M4 M5 1N 2N 2N 2N 2N 1N Cheng-Wen Wu, NTHU 109 Comparison of BIST Methodologies B IS T sch e m e Test tim e H /W o ve rh e a d R o u tin g o ve rh e a d In te g ra te d B IS T co re S h o rt Low H ig h O n -ch ip p ro ce sso r Ve ry lo n g Z e ro Z e ro O u rs S h o rt Ve ry lo w ze ro mbist1.10 Cheng-Wen Wu, NTHU 110 RAM BIST Compiler Use of RAM cores is increasing. SRAM, DRAM, flash RAM Multiple cores RAM BIST compiler is the trend. BRAINS (BIST for RAM in Seconds) Proposed BIST Architecture Memory Modeling Command Sequence Generation Configuration of the Proposed BIST mbist1.10 Cheng-Wen Wu, NTHU 111 BRAINS Outputs Synthesizable BIST design At-speed testing Programmable March algorithms Optional diagnosis support BISD Activation sequence Test bench Synthesis script mbist1.10 Cheng-Wen Wu, NTHU 112 BIST Synthesis Flow GUI Memory Library RTL RAM/BIST Description Parser Compile Engine Synthesis BIST Template mbist1.10 Netlist Cell Library Cheng-Wen Wu, NTHU 113 NTHU/GUC PMBIST Architecture Programmable Memory BIST MSO MRD MBO Controller Sequencer TPG Controls Test Collar MCK MBS MBC MBR MSI Comparator Address D Memory Q Normal Access mbist1.10 Cheng-Wen Wu, NTHU 114 PMBIST with Scan T est C o m m and /In fo rm a tio n S to rag e M o d ule S eria l d ata in S eria l d ata ou t M e m o ry C o m m an d C o m m an d B IS T co n tro l sig n als C o n tr oller H an dsha kin g E rro r S eq u e n cer A dd re ss T est P attern G e n era to r To M e m o ry E rro r Source: Cheng, et al., DFT00 mbist1.10 Cheng-Wen Wu, NTHU 115 Sequencer Address Generator Control Module Sequence Generator go Command Generator error info. mbist1.10 address Error Handling Module Cheng-Wen Wu, NTHU command error signature 116 State Diagram of Control Module BIST idle BIST idle BIST active BIST done BIST done BIST apply For SRAM mbist1.10 BIST apply For DRAM Cheng-Wen Wu, NTHU 117 DRAM Page-Mode Operation mbist1.10 Cheng-Wen Wu, NTHU 118 Memory Specification Techniques Memory Specifications I/O Specification Command Specification Task Specification Delay Constraint Specification AC Parameter Specification Support customized memories. mbist1.10 Cheng-Wen Wu, NTHU 119 I/O Specification Four parameters IO_type IO_width IO_latency IO_packet_length IO_type: input, output, or inout IO_width: port width (#bits), can be a constant or specified by user mbist1.10 Cheng-Wen Wu, NTHU 120 I/O Specification IO_latency: port latency mbist1.10 Cheng-Wen Wu, NTHU 121 I/O Modeling IO_packet_length: #bits packed within a clock cycle for the port mbist1.10 Cheng-Wen Wu, NTHU 122 Command Specification Specifies the memory’s instructions mbist1.10 Cheng-Wen Wu, NTHU 123 Task Specification Specifies a complete memory operation A task can be a single command or a sequence of commands. mbist1.10 Cheng-Wen Wu, NTHU 124 Delay Constraint Specification Specifies the minimal time interval between any two tasks mbist1.10 Cheng-Wen Wu, NTHU 125 AC Parameter Specification Specifies input and output delays Specified parameters will be inserted into the synthesis script. mbist1.10 Cheng-Wen Wu, NTHU 126 Memory Specification Example For ZBT SRAM: Method A: @latency D = 1; @task write = {write}; Method B: @latency D = 0; @task write = {pre_write, post_write}; The BIST circuit from method A is faster than the one from method B, but it has higher area overhead mbist1.10 Cheng-Wen Wu, NTHU 127 Sequence Generation For each March element, the compiler generates the command sequence according to the read task, write task, and minimum delay between the two tasks For example: task read = {A} task write = {B, C} minimum delay between read and write = 10ns clock period = 10 ns Then the (rw) element becomes {A, nop, B, C} One can also optimize the command sequence mbist1.10 Cheng-Wen Wu, NTHU 128 Fast Access Mode mbist1.10 Cheng-Wen Wu, NTHU 129 Diagnosis Support The BIST circuit scans out the error information (element, address, signature, and polarity) during the diagnosis mode. Assume address 20h stuck-at 64h: mbist1.10 Cheng-Wen Wu, NTHU 130 Multiple RAM Cores Controller and sequencer can be shared. Test pattern generator Ram Core A Test pattern generator Ram Core B Test pattern generator Ram Core C sequencer controller sequencer mbist1.10 Cheng-Wen Wu, NTHU 131 Experimental Results The Built-In Memory List DRAM EDO DRAM SDRAM DDR SDRAM SRAM Single-Port Synchronous SRAM Single-Port Asynchronous SRAM Two-Port Synchronous Register File Dual-Port Synchronous SRAM Micron ZBT SRAM BRAINS can support new memory architecture easily mbist1.10 Cheng-Wen Wu, NTHU 132 Experimental Results mbist1.10 Cheng-Wen Wu, NTHU 133 Experimental Results Four single-port SRAM BIST circuits share the same controller and sequencer. Size of the SRAM core: 8K x 16 Original BIST area for single-port SRAM: 1438 (gates) Total area = 1438 * 4 = 5752 (gates) mbist1.10 Shared gate count: 3350 Cheng-Wen Wu, NTHU 134 Experimental Results 8K x 16 single-port synchronous SRAM (0.25um) Area: Die size: 1780.74 x 755.07 um2 BIST area: 80.1 x 583.48 um2 Area overhead : 3.4% mbist1.10 Cheng-Wen Wu, NTHU 135 Experimental Results 2K x 32 two-port register file (0.25um) Die size: 1130.74 x 936.34 um2 BIST area: 77.88 x 620 um2 Area overhead: 4.5% mbist1.10 Cheng-Wen Wu, NTHU 136 Why RAM Diagnostics? Memory testing is more and more important Memories are key components Represent about 30% of the semiconductor market Dominate the chip area/yield Memory testing is more and more difficult Growing density, capacity, and speed Emerging new architectures & technologies Growing need for embedded memories Why diagnostics? Yield improvement Repair and/or design/process debugging mbist1.10 Cheng-Wen Wu, NTHU 137 Fault Model Subtypes mbist1.10 Cheng-Wen Wu, NTHU 138 NTHU-FTC BIST Architecture mbist1.10 Cheng-Wen Wu, NTHU 139 Test Mode In Test Mode it runs a fixed algorithm for production test and repair Only a few pins need to be controlled, and BGO reports the result (Go/No-Go) mbist1.10 Cheng-Wen Wu, NTHU 140 Fault Analysis Mode In Fault Analysis Mode, we can apply a longer March algorithm for diagnosis FSI captures the error information of the faulty cells EOP format: mbist1.10 Cheng-Wen Wu, NTHU 141 Error Catch and Analysis Locate the faulty cells Identify the fault types mbist1.10 Cheng-Wen Wu, NTHU 142 How to Identify Fault Type? RAM Circuit/Layout mbist1.10 Tester/BIST Output Cheng-Wen Wu, NTHU 143 March Dictionary March 11N E0 mbist1.10 E1 E2 E3 E4 E5 E6 E7 Cheng-Wen Wu, NTHU E8 E9 E10 144 March Signature and Error Map March Signature (Syndrome) Error Map mbist1.10 Cheng-Wen Wu, NTHU 145 MECA System mbist1.10 Cheng-Wen Wu, NTHU 146 Error Analyzer Tester/BIST data log Data log parser Error maps March Dictionary Fault analysis Fault maps mbist1.10 Cheng-Wen Wu, NTHU 147 Fault Analysis Derive analysis equations from the fault dictionary Convert error maps to fault maps by the equations mbist1.10 Cheng-Wen Wu, NTHU 148 Test Algorithm Generation Start from a base test: generated by TAGS or user-specified Generation options reduced to read-insertions mbist1.10 Cheng-Wen Wu, NTHU 149 Diagnostic Resolution Diagnostic resolution Diagnostic resolution # of distinguis hable faults # of detectable mbist1.10 Cheng-Wen Wu, NTHU faults 150 Experimental Results Proposed diagnosis framework has been applied to commercial embedded SRAMs Results for a 16Kx8 embedded SRAM (FS80A020) are shown Tester log from Credence SC212 is examined Address remapping (logical to physical) is applied mbist1.10 Cheng-Wen Wu, NTHU 151 The Total Error Bitmap mbist1.10 Cheng-Wen Wu, NTHU 152 Fault Bitmaps Idempotent Coupling Fault mbist1.10 Cheng-Wen Wu, NTHU Stuck-at 0 153 Redundancy and Repair Problem: We keep shrinking the feature size and increasing the chip density and size. How do we maintain the yield? Solutions: Fabrication Material, process, equipment, etc. Design Device, circuit, etc. Redundancy and repair On-line EDAC (extended Hamming code; product code) Off-line Spare rows and/or columns mbist1.10 Cheng-Wen Wu, NTHU 154 From BIST to BISR BIST BISD BIRA BISR •BIST: built-in self-test •BIECA: built-in error catch & analysis -BISD: built-in self diagnosis -BIRA: built-in redundancy analysis •BISR: built-in self-repair mbist1.10 Cheng-Wen Wu, NTHU 155 RAM Built-In Self-Repair (BISR) Reconfiguration Mechanism Analyzer RAM Spare Elements Redundancy BIST mbist1.10 Cheng-Wen Wu, NTHU 156 RAM Redundancy 1-D: spare rows (or columns) only SRAM Algorithm: Must-Repair 2-D: spare rows and columns Local and/or global spares NP-complete problem Conventional algorithm: Must-Repair phase Final-Repair phase Repair-Most (greedy) [Tarr et al., 1984] Fault-Driven (exhaustive, slow) [Day, 1985] Fault-Line Covering (b&b) [Huang et al., 1990] mbist1.10 Cheng-Wen Wu, NTHU 157 Redundancy Architectures mbist1.10 Cheng-Wen Wu, NTHU 158 An SRAM with BISR [Kim et al., ITC 98] mbist1.10 Cheng-Wen Wu, NTHU 159 A DRAM Redundancy Example 4 local spare rows per block 2x4=8 global spare columns mbist1.10 Cheng-Wen Wu, NTHU 160 Definitions Faulty line: row or column with at least one faulty cell. A faulty line is covered if all faulty cells in the line are repaired by spare rows and/or columns. A faulty cell not sharing any row or column with any other faulty cell is an orthogonal faulty cell. r: number of (available) spare rows c: number of (available) spare columns F: number of faulty cells in a block F’:number of orthogonal faulty cells in a block mbist1.10 Cheng-Wen Wu, NTHU 161 Example Block with Faulty Cells mbist1.10 Cheng-Wen Wu, NTHU 162 Repair-Most (RM) • Run BIST and construct bitmap. • Construct row and column error counters. • Run Must-Repair algorithm. • Run greedy Final-Repair algorithm. mbist1.10 Cheng-Wen Wu, NTHU 163 Worst-Case Bitmap (After MustRepair) •Max F=2rc. •Max F’=r+c. •Bitmap size: (rc+c)(cr+r). r=2; c=4 mbist1.10 Cheng-Wen Wu, NTHU 164 Local Repair-Most (LRM) RM is not good enough for embedded RAM. Large storage requirement: bitmap and counters Slow LRM improves the performance. Repair-Most based Improved heuristics Early termination rules Concurrent BIST and BIRA No separate Must-Repair phase LRM reduces the storage required. Smaller local bitmap From (rc+c)x(cr+r) to mxn mbist1.10 Cheng-Wen Wu, NTHU 165 LRM Algorithm Activated by BIST whenever a faulty cell is detected. Fault Collection (FC) Collects faulty-cell addresses. Constructs local bitmap. Counts row and column errors. Spare Allocation (SA) Allocate spare rows or columns when bitmap is full. Allocate spare rows or columns at end. mbist1.10 Cheng-Wen Wu, NTHU 166 LRM: FC and SA (1,0), (1,6), (2,4), (3,4), (5,1), (5,2) mbist1.10 Cheng-Wen Wu, NTHU 167 LRM Example (5,2) mbist1.10 (5,4),(5,6),(5,7) Cheng-Wen Wu, NTHU (7,3) 168 Local Optimization (LO) LMR has drawbacks: Selecting line with largest fault count may be slow. Multiple lines may need to be selected for repair. Area overhead is still high. Repair rate depends on bitmap size. LO has a better repair rate based on same hardware overhead, i.e., a higher repair efficiency. Fault Collection (FC) Records faulty cells in bitmap until it is full. Spare Allocation (SA) Exhaustive search performed for repairing all faults. Bitmap cleared; process repeated until done. mbist1.10 Cheng-Wen Wu, NTHU 169 LO: Column*/Row Selection for SA A 1 means that the corresponding col is selected for repair, unless empty. Col selection vector Row selection vector 1. Col 5 selected for repair. 2. Row 5 is selected for repair. * Assume column selection has a lower cost than row selection. mbist1.10 Cheng-Wen Wu, NTHU 170 LO Example mbist1.10 Cheng-Wen Wu, NTHU 171 Essential Spare Pivoting (ESP) Maintain high repair rate without using a bitmap. Small area overhead. Fault Collection (FC) Collect and store faulty-cell address using row- pivot and column-pivot registers. If there is a match for row (col) pivot, the pivot is an essential pivot. If there is no match, store the row/col addresses in the pivot registers. If F > r+c, the RAM is unrepairable. Spare Allocation (SA) Use row and column pivots for spare allocation. Spare rows (cols) for essential row (col) pivots. SA for orthogonal faults. mbist1.10 Cheng-Wen Wu, NTHU 172 ESP Example mbist1.10 Cheng-Wen Wu, NTHU 173 Cell Fault Size Distribution Mixed Poisson-exponential distribution. mbist1.10 Cheng-Wen Wu, NTHU 174 Repair Rate Comparison •1,552 RAM blocks. •1,024x64 bits per block. •r from 6 to 10. •c from 2 to 6. •LRM bitmap: rxc. •LO bitmap: 8x4. mbist1.10 Cheng-Wen Wu, NTHU 175 Normalized Repair Rate mbist1.10 Cheng-Wen Wu, NTHU 176 Repair Rate (r=10) mbist1.10 Cheng-Wen Wu, NTHU 177 Normalized Repair Rate (r=6) mbist1.10 Cheng-Wen Wu, NTHU 178 Area Overhead Overhead is about 5-12% for 16Mb DRAM, r=8, and c=4. mbist1.10 Cheng-Wen Wu, NTHU 179 Computation Time (Simulated) mbist1.10 Cheng-Wen Wu, NTHU 180