Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Semiconductor Memories December 20, 2002 © Digital Integrated Circuits2nd Memories Chapter Overview Memory Classification Memory Architectures The Memory Core Periphery Reliability Case Studies © Digital Integrated Circuits2nd Memories Semiconductor Memory Classification Read-Write Memory Random Access Non-Random Access SRAM FIFO DRAM LIFO Non-Volatile Read-Write Memory Read-Only Memory EPROM Mask-Programmed E2PROM Programmable (PROM) FLASH Shift Register CAM © Digital Integrated Circuits2nd Memories Memory Architecture: Decoders M bits S0 S0 Word 0 S1 Word 1 S2 Word 2 SN 2 2 Nwords SN 2 M bits 1 Storage cell Word 0 A0 Word 1 A1 Word 2 A K2 1 Word N 2 2 Word N 2 1 Decoder Word N 2 Storage cell 2 Word N 2 1 K 5 log2N Input-Output (M bits) Intuitive architecture for N x M memory Too many select signals: N words == N select signals © Digital Integrated Circuits2nd Input-Output (M bits) Decoder reduces the number of select signals K = log2N Memories Block Diagram of 4 Mbit SRAM Clock generator Z-address buffer X-address buffer Predecoder and block selector Bit line load 128 K Array Block Subglobal row decoder SubglobalGlobal row decoder row decoder Block31 30 Block Block 1 Transfer gate Column decoder Local row deco Sense amplifier and write driver CS, WE buffer © Digital Integrated Circuits2nd I/O buffer x1/x4 controller Y-address buffer [Hirose90] X -address buffer Memories Read-Only Memory Cells BL BL BL VDD WL WL WL 1 BL WL BL BL WL WL 0 GND Diode ROM © Digital Integrated Circuits2nd MOS ROM 1 MOS ROM 2 Memories MOS OR ROM BL[0] BL[1] BL[2] BL[3] WL[0] V DD WL[1] WL[2] V DD WL[3] V bias Pull-down loads © Digital Integrated Circuits2nd Memories MOS NOR ROM V DD Pull-up devices WL[0] GND WL [1] WL [2] GND WL [3] BL [0] © Digital Integrated Circuits2nd BL [1] BL [2] BL [3] Memories MOS NOR ROM Layout Cell (9.5l x 7l) Programmming using the Active Layer Only Polysilicon Metal1 Diffusion Metal1 on Diffusion © Digital Integrated Circuits2nd Memories MOS NOR ROM Layout Cell (11l x 7l) Programmming using the Contact Layer Only Polysilicon Metal1 Diffusion Metal1 on Diffusion © Digital Integrated Circuits2nd Memories MOS NAND ROM V DD Pull-up devices BL [0] BL [1] BL [2] BL [3] WL [0] WL [1] WL [2] WL [3] All word lines high by default with exception of selected row © Digital Integrated Circuits2nd Memories MOS NAND ROM Layout Cell (8l x 7l) Programmming using the Metal-1 Layer Only No contact to VDD or GND necessary; drastically reduced cell size Loss in performance compared to NOR ROM Polysilicon Diffusion Metal1 on Diffusion © Digital Integrated Circuits2nd Memories NAND ROM Layout Cell (5l x 6l) Programmming using Implants Only Polysilicon Threshold-altering implant Metal1 on Diffusion © Digital Integrated Circuits2nd Memories Equivalent Transient Model for MOS NOR ROM V DD Model for NOR ROM BL rword WL Cbit cword Word line parasitics Wire capacitance and gate capacitance Wire resistance (polysilicon) Bit line parasitics Resistance not dominant (metal) Drain and Gate-Drain capacitance © Digital Integrated Circuits2nd Memories Equivalent Transient Model for MOS NAND ROM V DD Model for NAND ROM BL CL r bit WL r word cbit cword Word line parasitics Similar to NOR ROM Bit line parasitics Resistance of cascaded transistors dominates Drain/Source and complete gate capacitance © Digital Integrated Circuits2nd Memories Decreasing Word Line Delay Driver WL Polysilicon word line Metal word line (a) Driving the word line from both sides Metal bypass WL K cells Polysilicon word line (b) Using a metal bypass (c) Use silicides © Digital Integrated Circuits2nd Memories Precharged MOS NOR ROM f V DD pre Precharge devices WL [0] GND WL [1] WL [2] GND WL [3] BL [0] BL [1] BL [2] BL [3] PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design. © Digital Integrated Circuits2nd Memories Non-Volatile Memories The Floating-gate transistor (FAMOS) Floating gate Gate Source D Drain G tox tox n+ p n+_ S Substrate Device cross-section © Digital Integrated Circuits2nd Schematic symbol Memories Floating-Gate Transistor Programming 20 V 10 V S 5V 0V 20 V D Avalanche injection © Digital Integrated Circuits2nd 2 5V S 5V 0V D Removing programming voltage leaves charge trapped 2 2.5 V S 5V D Programming results in higher V T . Memories FLOTOX EEPROM Gate Floating gate I Drain Source 20–30 nm V GD -10 V 10 V n1 n1 Substrate p 10 nm FLOTOX transistor © Digital Integrated Circuits2nd Fowler-Nordheim I-V characteristic Memories EEPROM Cell BL WL VDD © Digital Integrated Circuits2nd Absolute threshold control is hard Unprogrammed transistor might be depletion 2 transistor cell Memories Flash EEPROM Control gate Floating gate erasure n 1 source Thin tunneling oxide programming n 1 drain p-substrate Many other options … © Digital Integrated Circuits2nd Memories Cross-sections of NVM cells Flash © Digital Integrated Circuits2nd EPROM Courtesy Intel Memories NAND Flash Memory Select transistor Word lines Active area STI Bit line contact © Digital Integrated Circuits2nd Source line contact Courtesy Toshiba Memories Characteristics of State-of-the-art NVM © Digital Integrated Circuits2nd Memories Read-Write Memories (RAM) STATIC (SRAM) Data stored as long as supply is applied Large (6 transistors/cell) Fast Differential DYNAMIC (DRAM) Periodic refresh required Small (1-3 transistors/cell) Slower Single Ended © Digital Integrated Circuits2nd Memories 6-transistor CMOS SRAM Cell WL V DD M2 M5 Q M1 BL © Digital Integrated Circuits2nd M4 Q M6 M3 BL Memories CMOS SRAM Analysis (Read) WL V DD M4 BL Q= 0 M5 V DD Cbit © Digital Integrated Circuits2nd M1 Q= 1 V DD BL M6 V DD Cbit Memories CMOS SRAM Analysis (Read) 1.2 Voltage Rise (V) 1 0.8 0.6 0.4 0.2 Voltage rise [V] 0 0 © Digital Integrated Circuits2nd 0.5 1 1.2 1.5 2 Cell Ratio (CR) 2.5 3 Memories CMOS SRAM Analysis (Write) WL V DD M4 M5 Q= 1 M1 BL = 1 © Digital Integrated Circuits2nd M6 Q= 0 V DD BL = 0 Memories CMOS SRAM Analysis (Write) © Digital Integrated Circuits2nd Memories 6T-SRAM — Layout VDD M2 M4 Q Q M1 M3 GND M5 BL © Digital Integrated Circuits2nd M6 WL BL Memories Resistance-load SRAM Cell WL V DD RL M3 BL RL Q Q M1 M2 M4 BL Static power dissipation -- Want R L large Bit lines precharged to V DD to address t p problem © Digital Integrated Circuits2nd Memories SRAM Characteristics © Digital Integrated Circuits2nd Memories 3-Transistor DRAM Cell BL 1 BL 2 WWL WWL RWL M3 X M1 CS M2 RWL V DD 2 V T X BL 1 BL 2 V DD DV V DD 2 V T No constraints on device ratios Reads are non-destructive Value stored at node X when writing a “1” = V WWL-VTn © Digital Integrated Circuits2nd Memories 3T-DRAM — Layout BL2 BL1 GND RWL M3 M2 WWL M1 © Digital Integrated Circuits2nd Memories Sense Amp Operation V BL V(1) V PRE D V(1) V(0) Sense amp activated Word line activated © Digital Integrated Circuits2nd t Memories 1-T DRAM Cell Capacitor M 1 word line Metal word line SiO2 Poly n+ Field Oxide n+ Poly Inversion layer induced by plate bias Cross-section Diffused bit line Polysilicon gate Polysilicon plate Layout Uses Polysilicon-Diffusion Capacitance Expensive in Area © Digital Integrated Circuits2nd Memories SEM of poly-diffusion capacitor 1T-DRAM © Digital Integrated Circuits2nd Memories Advanced 1T DRAM Cells Word line Insulating Layer Cell plate Capacitor dielectric layer Cell Plate Si Capacitor Insulator Refilling Poly Transfer gate Isolation Storage electrode Storage Node Poly Si Substrate 2nd Field Oxide Trench Cell © Digital Integrated Circuits2nd Stacked-capacitor Cell Memories Static CAM Memory Cell Bit Bit Bit Bit Bit Word CAM Word ••• ••• CAM M4 M8 M9 M6 M7 M5 CAM ••• ••• Bit Word CAM S M3 Match int S M2 M1 Wired-NOR Match Line © Digital Integrated Circuits2nd Memories CAM in Cache Memory CAM SRAM ARRAY ARRAY Hit Logic Address Decoder Input Drivers Address © Digital Integrated Circuits2nd Tag Sense Amps / Input Drivers Hit R/W Data Memories Periphery Decoders Sense Amplifiers Input/Output Buffers Control / Timing Circuitry © Digital Integrated Circuits2nd Memories Row Decoders Collection of 2M complex logic gates Organized in regular and dense fashion (N)AND Decoder NOR Decoder © Digital Integrated Circuits2nd Memories Hierarchical Decoders Multi-stage implementation improves performance ••• WL 1 WL 0 A 0A 1 A 0A 1 A 0A 1 A 0A 1 A 2A 3 A 2A 3 A 2A 3 A 2A 3 ••• NAND decoder using 2-input pre-decoders A1 A0 A0 A1 © Digital Integrated Circuits2nd A3 A2 A2 A3 Memories Dynamic Decoders Precharge devices GND VDD GND WL 3 VDD WL 3 WL 2 WL 2 VDD WL 1 WL 1 V DD WL 0 WL 0 VDD f A0 A0 A1 A1 2-input NOR decoder © Digital Integrated Circuits2nd A0 A0 A1 A1 f 2-input NAND decoder Memories 4-input pass-transistor based column decoder BL BL BL BL 0 A0 1 2 3 S0 S1 S2 A1 S3 2-input NOR decoder D Advantages: speed (tpd does not add to overall memory access time) Only one extra transistor in signal path Disadvantage: Large transistor count © Digital Integrated Circuits2nd Memories 4-to-1 tree based column decoder BL 0 BL 1 BL 2 BL 3 A0 A0 A1 A1 D Number of devices drastically reduced Delay increases quadratically with # of sections; prohibitive for large decoders Solutions: buffers progressive sizing combination of tree and pass transistor approaches © Digital Integrated Circuits2nd Memories Decoder for circular shift-register V DD WL V DD V DD WL 0 R f f f f R V DD V DD WL 1 f f f f R V DD 2 f f f f • • • V DD © Digital Integrated Circuits2nd Memories Sense Amplifiers DV × C tp = ---------------Iav large make D V as small as possible small Idea: Use Sense Amplifer small transition s.a. input © Digital Integrated Circuits2nd output Memories Differential Sense Amplifier V DD M3 M4 y M1 bit SE M2 Out bit M5 Directly applicable to SRAMs © Digital Integrated Circuits2nd Memories Latch-Based Sense Amplifier (DRAM) EQ BL BL VDD SE SE Initialized in its meta-stable point with EQ Once adequate voltage gap created, sense amp enabled with SE Positive feedback quickly forces output to a stable operating point. © Digital Integrated Circuits2nd Memories Charge-Redistribution Amplifier― V EPROM DD SE Load M4 Out V casc M3 Cascode device Cout Ccol WLC Column decoder M2 BL WL © Digital Integrated Circuits2nd M1 CBL EPROM array Memories Open bitline architecture with dummy cells EQ L L1 L0 V DD R0 R1 L SE BLL CS … CS BLR CS Dummy cell © Digital Integrated Circuits2nd … SE CS CS CS Dummy cell Memories DRAM Read Process with Dummy Cell 3 3 2 2 BL V 1 0 0 BL V BL 1 2 1 0 3 BL 0 1 t (ns) 2 3 t (ns) reading 0 reading 1 3 EQ WL 2 V SE 1 0 0 1 2 3 t (ns) control signals © Digital Integrated Circuits2nd Memories Voltage Regulator VDD Mdrive VDL VREF Equivalent Model Vbias VREF + Mdrive VDL © Digital Integrated Circuits2nd Memories DRAM Timing © Digital Integrated Circuits2nd Memories RDRAM Architecture Bus Clocks Data bus k k3 l memory array network mux/demux Column Row © Digital Integrated Circuits2nd demux packet dec. demux packet dec. Memories Address Transition Detection V DD A0 DELAY td A1 DELAY td A N2 1 DELAY td © Digital Integrated Circuits2nd ATD ATD … Memories Reliability and Yield © Digital Integrated Circuits2nd Memories Sensing Parameters in DRAM 1000 C D(1F) V smax (mv) 100 smax V , DD V ,S 10 C ,S Q ,D C C S(1F) Q S(1C) V DD (V) Q S 5 C S V DD / 2 V smax 5 Q S / (C S 1 C D ) 4K 64K 1M 16M 256M 4G Memory Capacity (bits © Digital Integrated Circuits2nd 64G / chip) From [Itoh01] Memories Noise Sources in 1T DRam BL substrate Adjacent BL CWBL a -particles WL leakage CS electrode Ccross © Digital Integrated Circuits2nd Memories Open Bit-line Architecture —Cross Coupling EQ WL 1 WL 0 WL C WBL D C WBL WL D WL 1 WL 0 BL BL C BL C C © Digital Integrated Circuits2nd C Sense Amplifier C BL C C C Memories Alpha-particles (or Neutrons) a -particle WL V DD BL n1 SiO 2 2 1 2 1 1 2 2 1 2 1 1 2 1 Particle ~ 1 Million Carriers © Digital Integrated Circuits2nd Memories Yield Yield curves at different stages of process maturity (from [Veendrick92]) © Digital Integrated Circuits2nd Memories Redundancy Row Address Redundant rows : Fuse Bank Redundant columns Memory Array Row Decoder Column Decoder © Digital Integrated Circuits2nd Column Address Memories Error-Correcting Codes Example: Hamming Codes e.g. B3 Wrong with 1 1 =3 0 © Digital Integrated Circuits2nd Memories Redundancy and Error Correction © Digital Integrated Circuits2nd Memories Sources of Power Dissipation in Memories V DD I DD 5 S C iD V if1S I DCP CHIP nC DE V INT f m selected C PT V INT f mi act I DCP n ROW DEC PERIPHERY m(n 2 1)i hld non-selected ARRAY mC DE V INT f COLUMN DEC V SS © Digital Integrated Circuits2nd From [Itoh00] Memories Data Retention in SRAM 1.30u 1.10u 0.13 m m CMOS Ileakage 900n 700n 500n Factor 7 (A)300n 0.18 m m CMOS 100n 0.00 .600 1.20 1.80 VDD SRAM leakage increases with technology scaling © Digital Integrated Circuits2nd Memories Suppressing Leakage in SRAM V DD V DD low-threshold transistor V DDL sleep V DD,int sleep V DD,int SRAM cell SRAM cell sleep SRAM cell SRAM cell SRAM cell V SS,int Inserting Extra Resistance © Digital Integrated Circuits2nd SRAM cell Reducing the supply voltage Memories Case Studies Programmable Logic Array SRAM Flash Memory © Digital Integrated Circuits2nd Memories PLA versus ROM Programmable Logic Array structured approach to random logic “two level logic implementation” NOR-NOR (product of sums) NAND-NAND (sum of products) IDENTICAL TO ROM! Main difference ROM: fully populated PLA: one element per minterm Note: Importance of PLA’s has drastically reduced 1. slow 2. better software techniques (mutli-level logic synthesis) But … © Digital Integrated Circuits2nd Memories Programmable Logic Array Pseudo-NMOS PLA GND GND GND V DD GND GND GND GND V DD X0 X0 X1 AND-plane © Digital Integrated Circuits2nd X1 X2 X2 f0 f1 OR-plane Memories Dynamic PLA f AND V DD GND f OR f OR f AND V DD X0 X0 X1 X1 AND-plane © Digital Integrated Circuits2nd X2 X2 f0 f 1 GND OR-plane Memories Clock Signal Generation for self-timed dynamic PLA f f f f AND Dummy AND row f OR AND tpre teval f Dummy AND row f AND OR (a) Clock signals © Digital Integrated Circuits2nd (b) Timing generation circuitry Memories PLA Layout And-Plane VDD x0 x0 x1 x1 x2 x2 Pull-up devices © Digital Integrated Circuits2nd Or-Plane f GND f0 f1 Pull-up devices Memories Bit-line Circuitry Block select Bit-line load ATD BEQ Local WL Memory cell B /T B /T CD CD CD I/O I/O line I/O Sense amplifier © Digital Integrated Circuits2nd Memories Sense Amplifier (and Waveforms) Address I /O I /O ATD SEQ Block select ATD BS SA BS BEQ Vdd I/O Lines GND SA SEQ SEQ SEQ SEQ SEQ Vdd DATA Dei SA, SA GND DATA BS Data-cut © Digital Integrated Circuits2nd Memories 1Gbit NAND Flash Memory Charge pump 2kB Page buffer & cache 10.7mm 2 125mm 32 word lines x 1024 blocks 16896 bit lines 11.7mm © Digital Integrated Circuits2nd From [Nakamura02] Memories 125mm2 1Gbit NAND Flash Memory Technology 0.13m p-sub CMOS triple-well 1poly, 1polycide, 1W, 2Al Cell size 0.077m2 Chip size 125.2mm2 Organization 2112 x 8b x 64 page x 1k block Power supply 2.7V-3.6V Cycle time 50ns Read time 25s Program time 200s / page Erase time 2ms / block © Digital Integrated Circuits2nd From [Nakamura02] Memories Semiconductor Memory Trends (up to the 90’s) Memory Size as a function of time: x 4 every three years © Digital Integrated Circuits2nd Memories Semiconductor Memory Trends (updated) © Digital Integrated Circuits2nd From [Itoh01] Memories Trends in Memory Cell Area © Digital Integrated Circuits2nd From [Itoh01] Memories Semiconductor Memory Trends Technology feature size for different SRAM generations © Digital Integrated Circuits2nd Memories