SEMICONDUCTOR MEMORIES Digital Integrated Circuits Memory © Prentice Hall 1995 Chapter Overview • Memory Classification • Memory Architectures • The Memory Core • Periphery • Reliability Digital Integrated Circuits Memory © Prentice Hall 1995 Semiconductor Memory Classification RWM Random Access Non-Random Access SRAM FIFO DRAM LIFO NVRWM ROM EPROM Mask-Programmed E2PROM Programmable (PROM) FLASH Shift Register CAM Digital Integrated Circuits Memory © Prentice Hall 1995 Memory Architecture: Decoders Digital Integrated Circuits Memory © Prentice Hall 1995 Array-Structured Memory Architecture Problem: ASPECT RATIO or HEIGHT >> WIDTH AK AK+1 AL-1 Bit Line Storage Cell Row Decoder 2L-K Word Line M.2K Sense Amplifiers / Drivers A0 Column Decoder AK -1 Amplify swing to rail-to-rail amplitude Selects appropriate word Input-Output (M bits) Digital Integrated Circuits Memory © Prentice Hall 1995 Hierarchical Memory Architecture Row Address Column Address Block Address Global Data Bus Control Circuitry Block Selector Global Amplifier/Driver I/O Advantages: 1. Shorter wires within blocks 2. Block address activates only 1 block => power savings Digital Integrated Circuits Memory © Prentice Hall 1995 Memory Timing: Definitions Read Cycle READ Read Access Read Access Write Cycle WRITE Write Access Data Valid DATA Data Written Digital Integrated Circuits Memory © Prentice Hall 1995 Memory Timing: Approaches MSB Address Bus LSB Row Address Column Address Address Bus RAS Address Address transition initiates memory operation CAS RAS-CAS timing DRAM Timing Multiplexed Adressing Digital Integrated Circuits Memory SRAM Timing Self-timed © Prentice Hall 1995 MOS NOR ROM VDD Pull-up devices WL[0] GND WL[1] WL[2] GND WL[3] BL[0] Digital Integrated Circuits BL[1] Memory BL[2] BL[3] © Prentice Hall 1995 MOS NOR ROM Layout Metal1 on top of diffusion WL[0] GND (diffusion) WL[1] Polysilicon Basic cell 10 x 7 Metal1 WL[2] 2 WL[3] Only 1 layer (contact mask) is used to program memory array Programming of the memory can be delayed to one of last process steps Digital Integrated Circuits Memory © Prentice Hall 1995 MOS NOR ROM Layout BL[0] BL[1] BL[2] BL[3] Threshold raising implant WL[0] GND (diffusion) Basic Cell 8.5 x 7 Metal1 over diffusion WL[1] Polysilicon WL[2] WL[3] Threshold raising implants disable transistors Digital Integrated Circuits Memory © Prentice Hall 1995 MOS NAND ROM VDD Pull-up devices BL[0] BL[1] BL[2] BL[3] WL[0] WL[1] WL[2] WL[3] All word lines high by default with exception of selected row Digital Integrated Circuits Memory © Prentice Hall 1995 MOS NAND ROM Layout Diffusion Polysilicon Basic cell 5x 6 Threshold lowering implant No contact to VDD or GND necessary; drastically reduced cell size Loss in performance compared to NOR ROM Digital Integrated Circuits Memory © Prentice Hall 1995 Equivalent Transient Model for MOS NOR ROM VDD Model for NOR ROM BL rword WL Cbit cword Word line parasitics Resistance/cell: (7/2) x 10 /q = 35 Wire capacitance/cell: (7 2) (0.6)2 0.058 + 2 (7 0.6) 0.043 = 0.65 fF Gate Capacitance/cell: (4 2) (0.6)2 1.76 = 5.1 fF. Bit line parasitics: Resistance/cell: (8.5/4) x 0.07 /q = 0.15 (which is negligible) Wire capacitance/cell: (8.5 4) (0.6)2 0.031 + 2 (8.5 0.6) 0.044 = 0.83 fF Drain capacitance/cell: ((3 4) (0.6)2 0.3 + 2 3 0.6 0.8) 0.375 + 4 0.6 0.43 = 2.6 fF Digital Integrated Circuits Memory © Prentice Hall 1995 Equivalent Transient Model for MOS NAND ROM VDD BL CL rbit Model for NAND ROM rword WL cbit cword Word line parasitics: Resistance/cell: (6/2) x 10 /q = 30 Wire capacitance/cell: (6 2) (0.6)2 0.058 + 2 (6 0.6) 0.043 = 0.56 fF Gate Capacitance/cell: (3 2) (0.6)2 1.76 = 3.8 fF. Bit line parasitics: Resistance/cell: 10 k, the average transistor resistance over the range of interest. Wire capacitance/cell: Included in diffusion capacitance Source/Drain capacitance/cell: ((3 3) (0.6)2 0.3 + 2 3 0.6 0.8) 0.375 + ( ) (0.6)2 1.76 = 5.2 fF Digital Integrated Circuits Memory © Prentice Hall 1995 Propagation Delay of NOR ROM Word line delay Consider the 512512 case. The delay of the distributed rc-line containing M cells can be approximated using the expressions derived in Chapter 8. tword = 0.38 (rword cword ) M2 = 0.38 (35 (0.65 + 5.1) fF) 5122 = 20 nsec Bit line delay Assume a (2.4/1.2) pull-down device and a (8/1.2) pull-up transistor. The bit line switches between 5 V and 2.5 V. Cbit = 512 (2.6 + 0.8) fF = 1.7 pF IavHL = 1/2 (2.4/0.9) (19.6 10 -6)((4.25)2/2 + (4.25 3.75 - (3.75)2/2)) 1/2 (8/0.9) (5.3 10 -6) (4.25 1.25 - (1.25)2/2) = 0.36 mA tHL = (1.7 pF 1.25 V) / 0.36 mA = 5.9 nsec The low-to-high response time can be computed using a similar approach. tLH = (1.7 pF 1.25 V) / 0.36 mA = 5.9 nsec Digital Integrated Circuits Memory © Prentice Hall 1995 Decreasing Word Line Delay Driver WL Polysilicon word line Metal word line (a) Driving the word line from both sides Metal bypass WL K cells Polysilicon word line (b) Using a metal bypass (c) Use silicides Digital Integrated Circuits Memory © Prentice Hall 1995 Precharged MOS NOR ROM VDD pre Precharge devices WL[0] GND WL[1] WL[2] GND WL[3] BL[0] BL[1] BL[2] BL[3] PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design. Digital Integrated Circuits Memory © Prentice Hall 1995 Floating-gate transistor (FAMOS) Floating gate Gate D Drain Source tox G tox n p + + S n Substrate (a) Device cross-section Digital Integrated Circuits Memory (b) Schematic symbol © Prentice Hall 1995 Floating-Gate Transistor Programming 20 V 0V 20 V 10 V 5 V S D Avalanche injection. Digital Integrated Circuits 5 V S 5V 0V D Removing programming voltage leaves charge trapped. Memory 2.5 V S 5V D Programming results in higher V T. © Prentice Hall 1995 FLOTOX EEPROM Floating gate I Gate Drain Source VGD 10 V 20-30 nm 10 V n+ n+ Substrate p 10 nm (a) Flotox transistor (b) Fowler-Nordheim I-V characteristic BL WL VDD (c) EEPROM cell during a read operation Digital Integrated Circuits Memory © Prentice Hall 1995 Flash EEPROM Control gate Floating gate Thin tunneling oxide erasure n+ source programming n+ drain p-substrate Digital Integrated Circuits Memory © Prentice Hall 1995 Cross-sections of NVM cells Flash Digital Integrated Circuits Courtesy Intel Memory EPROM © Prentice Hall 1995 Characteristics of State-of-the-art NVM Digital Integrated Circuits Memory © Prentice Hall 1995 Read-Write Memories (RAM) • STATIC (SRAM) Data stored as long as supply is applied Large (6 transistors/cell) Fast Differential • DYNAMIC (DRAM) Periodic refresh required Small (1-3 transistors/cell) Slower Single Ended Digital Integrated Circuits Memory © Prentice Hall 1995 6-transistor CMOS SRAM Cell WL VDD M2 M4 Q M6 Q M5 M1 M3 BL Digital Integrated Circuits BL Memory © Prentice Hall 1995 CMOS SRAM Analysis (Write) WL VDD M4 Q=0 M6 Q=1 M5 M1 VDD BL = 1 BL = 0 2 2 VDD VDD VDD VDD k n M6 VDD – VTn ----------- – ---------- = k p M4 VDD – VTp ----------- – ---------- 2 2 8 8 kn M5 V VDD 2 V DD V 2 DD DD --------- = k V – V ----------------------- ---------- – ----------– V Tn 2 n M1 DD Tn 2 2 2 8 Digital Integrated Circuits Memory (W/L)n,M60.33 (W/L)p,M4 (W/L)n,M5 10 (W/L)n,M1 © Prentice Hall 1995 CMOS SRAM Analysis (Read) WL VDD M4 BL Q= 0 M6 M5 Q=1 M1 V DD BL V DD V DD Cbit C bit kn M5 V VD D 2 VDD V 2 DD- – V ----------D D- = k V --------------- ---------------------- – ----------– V Tn 2 n M1 D D Tn 2 2 2 8 (W/L)n,M510 (W/L)n,M1 Digital Integrated Circuits (supercedes read constraint) Memory © Prentice Hall 1995 6T-SRAM — Layout M2 VDD M4 Q Q M1 M3 GND M5 M6 BL Digital Integrated Circuits Memory WL BL © Prentice Hall 1995 Resistance-load SRAM Cell WL VDD RL RL Q Q M3 BL M4 M1 M2 BL Static power dissipation -- Want RL large Bit lines precharged to VDD to address t p problem Digital Integrated Circuits Memory © Prentice Hall 1995 3-Transistor DRAM Cell BL1 BL2 WWL WWL RWL RWL M3 X M1 X M2 VDD -VT BL1 VDD BL2 VDD -VT CS V No constraints on device ratios Reads are non-destructive Value stored at node X when writing a “1” = VWWL -VTn Digital Integrated Circuits Memory © Prentice Hall 1995 3T-DRAM — Layout BL2 BL1 GND RWL M3 M2 WWL M1 Digital Integrated Circuits Memory © Prentice Hall 1995 1-Transistor DRAM Cell BL WL Write "1" Read "1" WL M1 X CS VDD VT GND VDD BL VDD/2 CBL sensing VDD /2 Write: CS is charged or discharged by asserting WL and BL. Read: Charge redistribution takes places between bit line and storage capacitance CS V = VBL – V PRE = V BIT – V PRE -----------------------C S + CBL Voltage swing is small; typically around 250 mV. Digital Integrated Circuits Memory © Prentice Hall 1995 DRAM Cell Observations 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out. DRAM memory cells are single ended in contrast to SRAM cells. The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation. Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design. When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD . Digital Integrated Circuits Memory © Prentice Hall 1995 1-T DRAM Cell Capacitor Metal word line M1 word line SiO2 poly n+ Field Oxide n+ poly Inversion layer induced by plate bias Diffused bit line Polysilicon Polysilicon plate gate (a) Cross-section (b) Layout Used Polysilicon-Diffusion Capacitance Expensive in Area Digital Integrated Circuits Memory © Prentice Hall 1995 SEM of poly-diffusion capacitor 1T-DRAM Digital Integrated Circuits Memory © Prentice Hall 1995 Advanced 1T DRAM Cells Word line Insulating Layer Cell plate Capacitor dielectric layer Cell Plate Si Capacitor Insulator Transfer gate Refilling Poly Isolation Storage electrode Storage Node Poly Si Substrate 2nd Field Oxide Trench Cell Digital Integrated Circuits Stacked-capacitor Cell Memory © Prentice Hall 1995 Periphery • Decoders • Sense Amplifiers • Input/Output Buffers • Control / Timing Circuitry Digital Integrated Circuits Memory © Prentice Hall 1995 Row Decoders Collection of 2M complex logic gates Organized in regular and dense fashion (N)AND Decoder NOR Decoder Digital Integrated Circuits Memory © Prentice Hall 1995 Dynamic Decoders Precharge devices GND GND VDD WL 3 WL 3 VDD WL 2 VDD WL 1 WL 2 WL 1 VDD WL 0 V DD A0 A0 A1 A1 Dynamic 2-to-4 NOR decoder WL 0 A0 A0 A1 A1 2-to-4 MOS dynamic NAND Decoder Propagation delay is primary concern Digital Integrated Circuits Memory © Prentice Hall 1995 A NAND decoder using 2-input predecoders WL 1 WL 0 A0A1 A0 A1 A0 A1 A0A 1 A 2A3 A2 A3 A2 A3 A2 A3 A1 A 0 A3 A2 A0 A1 A2 A3 Splitting decoder into two or more logic layers produces a faster and cheaper implementation Digital Integrated Circuits Memory © Prentice Hall 1995 4 input pass-transistor based column decoder A0 A1 2 input NOR decoder BL0 BL1 BL2 BL3 S0 S1 S2 S3 D Advantage: speed (t pd does not add to overall memory access time) only 1 extra transistor in signal path Disadvantage: large transistor count Digital Integrated Circuits Memory © Prentice Hall 1995 4-to-1 tree based column decoder BL0 BL1 BL2 BL3 A0 A0 A1 A1 D Number of devices drastically reduced Delay increases quadratically with # of sections; prohibitive for large decoders Solutions: buffers progressive sizing combination of tree and pass transistor approaches Digital Integrated Circuits Memory © Prentice Hall 1995 Decoder for circular shift-register VDD VDD WL0 VDD VDD WL1 VDD VDD WL2 ... R R R VDD Digital Integrated Circuits Memory © Prentice Hall 1995 Sense Amplifiers make V as small as possible V tp = C ---------------Iav large small Idea: Use Sense Amplifer small transition s.a. input Digital Integrated Circuits Memory output © Prentice Hall 1995 Differential Sensing - SRAM VDD VDD PC BL VDD EQ VDD y M3 BL M1 x SE WLi M4 M2 y x x x M5 SE (b) Doubled-ended Current Mirror Amplifier VDD SRAM cell i y Diff. Sense x x Amp y y D D x x SE (a) SRAM sensing scheme. Digital Integrated Circuits y (c) Cross-Coupled Amplifier Memory © Prentice Hall 1995 Latch-Based Sense Amplifier EQ BL BL VDD SE SE Initialized in its meta-stable point with EQ Once adequate voltage gap created, sense amp enabled with SE Positive feedback quickly forces output to a stable operating point. Digital Integrated Circuits Memory © Prentice Hall 1995 Single-to-Differential Conversion WL BL x Diff. x + _ S.A. cell y Vref y How to make good V ref? Digital Integrated Circuits Memory © Prentice Hall 1995 Open bitline architecture EQ L1 R L0 R0 R1 L VDD SE BLL CS ... CS BLR CS SE dummy cell Digital Integrated Circuits CS ... CS CS dummy cell Memory © Prentice Hall 1995 DRAM Read Process with Dummy Cell V (Volt) 6.0 4.0 BL 2.0 BL 5.0 1 2 3 t (nsec) (a) reading a zero 4 5 6.0 4.0 V (Volt) 0.00 SE 3.0 2.0 EQ 1.0 0.00 V (Volt) WL 4.0 1 2 3 4 5 (c) control signals BL 2.0 0.00 BL 1 2 3 t (nsec) 4 5 (b) reading a one Digital Integrated Circuits Memory © Prentice Hall 1995 Single-Ended Cascode Amplifier VDD Vcasc WLC WL Digital Integrated Circuits Memory © Prentice Hall 1995 DRAM Timing Digital Integrated Circuits Memory © Prentice Hall 1995 Address Transition Detection VDD A0 DELAY td A1 DELAY td ATD ATD ... AN-1 Digital Integrated Circuits DELAY td Memory © Prentice Hall 1995 Reliability and Yield Digital Integrated Circuits Memory © Prentice Hall 1995 Open Bit-line Architecture —Cross Coupling EQ WL1 WL0 WLD CWBL WLD CWBL WL0 WL1 BL BL Sense CBL C Digital Integrated Circuits C C Amplifier Memory CBL C C C © Prentice Hall 1995 Folded-Bitline Architecture WL1 BL ... BL WL1 WL0 WL0 CWBL WLD WLD CBL C y x C C C C Sense C EQ Amplifier x CBL y CWBL Digital Integrated Circuits Memory © Prentice Hall 1995 Transposed-Bitline Architecture BL’ Ccross BL SA BL BL" (a) Straightforward bitline routing. BL’ BL Ccross SA BL BL" (b) Transposed bitline architecture. Digital Integrated Circuits Memory © Prentice Hall 1995 Alpha-particles -particle WL VDD BL SiO2 n+ 1 particle ~ 1 million carriers Digital Integrated Circuits Memory © Prentice Hall 1995 Yield Yield curves at different stages of process maturity (from [Veendrick92]) Digital Integrated Circuits Memory © Prentice Hall 1995 Redundancy Row Address Redundant rows : Fuse Bank Memory Array Column Decoder Digital Integrated Circuits Memory Row Decoder Redundant columns Column Address © Prentice Hall 1995 Redundancy and Error Correction Digital Integrated Circuits Memory © Prentice Hall 1995 Programmable Logic Array Product Terms x0x1 x2 AND PLANE OR PLANE f0 x0 Digital Integrated Circuits x1 f1 x2 Memory © Prentice Hall 1995 Pseudo-Static PLA GND GND GND VD D GND GND GND GND VDD x0 x0 x1 x1 x2 AND-PLANE Digital Integrated Circuits x2 f0 f1 OR-PLANE Memory © Prentice Hall 1995 Dynamic PLA A ND VDD GND OR O R AND VDD x0 x0 x1 x1 x2 AND-PLANE Digital Integrated Circuits x2 f0 f1 GND OR-PLANE Memory © Prentice Hall 1995 Clock Signal Generation for self-timed dynamic PLA AND Dummy AND Row AND AND OR Dummy AND Row OR (a) Clock signals Digital Integrated Circuits (b) Timing generation circuitry. Memory © Prentice Hall 1995 PLA Layout And-Plane VDD x0 x0 x1 x1 x2 x2 Pull-up devices Digital Integrated Circuits Memory Or-Plane GND f0 f1 Pull-up devices © Prentice Hall 1995 PLA versus ROM Programmable Logic Array structured approach to random logic “two level logic implementation” NOR-NOR (product of sums) NAND-NAND (sum of products) IDENTICAL TO ROM! Main difference ROM: fully populated PLA: one element per minterm Note: Importance of PLA’s has drastically reduced 1. slow 2. better software techniques (mutli-level logic synthesis) Digital Integrated Circuits Memory © Prentice Hall 1995 Semiconductor Memory Trends Memory Size as a function of time: x 4 every three years Digital Integrated Circuits Memory © Prentice Hall 1995 Semiconductor Memory Trends Increasing die size factor 1.5 per generation Combined with reducing cell size factor 2.6 per generation Digital Integrated Circuits Memory © Prentice Hall 1995 Semiconductor Memory Trends Technology feature size for different SRAM generations Digital Integrated Circuits Memory © Prentice Hall 1995