CMOS VLSI Design
Digital Design

Overview
Physical principles
Combinational logic
Sequential logic
Datapath
Memories
Trends

Dopants
Silicon is a semiconductor
Pure silicon has no free carriers and conducts poorly
Adding dopants increases the conductivity
Group V: extra electron (n-type)
Group III: missing electron, called hole (p-type)
[Figure: silicon lattice with an As dopant atom (free electron) and a B dopant atom (hole)]

nMOS Operation
Body is commonly tied to ground (0 V)
When the gate is at a low voltage:
– P-type body is at low voltage
– Source-body and drain-body diodes are OFF
– No current flows, transistor is OFF
[Figure: nMOS cross-section with source, gate, and drain; polysilicon gate over SiO2, n+ source and drain in p-type bulk Si]

Transistors as Switches
We can view MOS transistors as electrically controlled switches
Voltage at gate controls path from source to drain

        g = 0   g = 1
nMOS    OFF     ON
pMOS    ON      OFF

CMOS Inverter
[Figure: CMOS inverter schematic and truth table; input A, output Y, supply rails VDD and GND]

Inverter Cross-section
Typically use p-type substrate for nMOS transistors
Requires n-well for body of pMOS transistors
[Figure: inverter cross-section with the nMOS transistor in the p substrate and the pMOS transistor in the n well; layers SiO2, n+ diffusion, p+ diffusion, polysilicon, metal1; terminals A, Y, GND, VDD]

Inverter Mask Set
Transistors and wires are defined by masks
Cross-section taken along dashed line
[Figure: inverter mask layout with nMOS transistor, pMOS transistor, substrate tap, and well tap; terminals A, Y, GND, VDD]

Fabrication Steps
Start with blank wafer
Build inverter from the bottom up
First step will be to form the n-well
– Cover wafer with protective layer of SiO2 (oxide)
– Remove layer where n-well should be built
– Implant or diffuse n dopants into exposed wafer
– Strip off SiO2
[Figure: bare p substrate]

Oxidation
Grow SiO2 on top of Si wafer
– 900–1200 °C with H2O or O2 in oxidation furnace
[Figure: SiO2 layer on p substrate]

Photoresist
Spin on photoresist
– Photoresist is a light-sensitive organic polymer
– Softens where exposed to light
[Figure: photoresist over SiO2 on p substrate]

Lithography
Expose photoresist through n-well mask
Strip off exposed photoresist
[Figure: patterned photoresist over SiO2 on p substrate]

Etch
Etch oxide with hydrofluoric acid (HF)
– Seeps through skin and eats bone; nasty stuff!!!
Only attacks oxide where resist has been exposed
[Figure: oxide etched through the opening in the photoresist]

Strip Photoresist
Strip off remaining photoresist
– Use mixture of acids called piranha etch
Necessary so resist doesn’t melt in next step
[Figure: patterned SiO2 on p substrate]

n-well
n-well is formed with diffusion or ion implantation
Diffusion
– Place wafer in furnace with arsenic gas
– Heat until As atoms diffuse into exposed Si
Ion Implantation
– Blast wafer with beam of As ions
– Ions blocked by SiO2, only enter exposed Si
[Figure: n well formed under the opening in the SiO2]

Simplified Design Rules
Conservative rules to get you started

Complementary CMOS
Complementary CMOS logic gates
– nMOS pull-down network
– pMOS pull-up network
– a.k.a. static CMOS

                 Pull-up OFF   Pull-up ON
Pull-down OFF    Z (float)     1
Pull-down ON     0             X (crowbar)

[Figure: inputs drive a pMOS pull-up network and an nMOS pull-down network that share one output]

Example: NAND3
Horizontal n-diffusion and p-diffusion strips
Vertical polysilicon gates
Metal1 VDD rail at top
Metal1 GND rail at bottom
32 λ by 40 λ

I-V Characteristics
In the linear region, Ids depends on
– How much charge is in the channel?
– How fast is the charge moving?

Channel Charge
MOS structure looks like parallel plate capacitor while operating in inversion
– Gate – oxide – channel
Qchannel = CV
C = Cg = εoxWL/tox = CoxWL, where Cox = εox / tox
V = Vgc – Vt = (Vgs – Vds/2) – Vt
[Figure: polysilicon gate of width W and length L over SiO2 gate oxide of thickness tox (good insulator, εox = 3.9 ε0); n+ source and drain in p-type body; voltages Vg, Vs, Vd, Vgs, Vgd, Vds and gate capacitance Cg marked]

Carrier velocity
Charge is carried by electrons (e−)
Carrier velocity v proportional to lateral E-field between source and drain
v = μE, where μ is called mobility
E = Vds/L
Time for carrier to cross channel:
– t = L/v

nMOS Linear I-V
Now we know
– How much charge Qchannel is in the channel
– How much time t each carrier takes to cross
$I_{ds} = \frac{Q_{channel}}{t} = \mu C_{ox} \frac{W}{L} \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds} = \beta \left( V_{gs} - V_t - \frac{V_{ds}}{2} \right) V_{ds}$, where $\beta = \mu C_{ox} \frac{W}{L}$

Example
Example: a 0.6 μm process from AMI semiconductor
– tox = 100 Å
– μ = 350 cm2/V*s
– Vt = 0.7 V
Plot Ids vs. Vds
– Vgs = 0, 1, 2, 3, 4, 5
– Use W/L = 4/2 λ
$\beta = \mu C_{ox} \frac{W}{L} = (350) \left( \frac{3.9 \cdot 8.85 \times 10^{-14}}{100 \times 10^{-8}} \right) \left( \frac{W}{L} \right) = 120 \frac{W}{L}\ \mu A/V^2$
[Figure: Ids (mA, 0 to 2.5) vs. Vds (0 to 5 V), one curve each for Vgs = 1, 2, 3, 4, 5]

Capacitance
Any two conductors separated by an insulator have capacitance
Gate to channel capacitor is very important
– Creates channel charge necessary for operation
Source and drain have capacitance to body
– Across reverse-biased diodes
– Called diffusion capacitance because it is associated with source/drain diffusion

Gate Capacitance
Approximate channel as connected to source
Cgs = εoxWL/tox = CoxWL = CpermicronW
Cpermicron is typically about 2 fF/μm
[Figure: polysilicon gate of width W and length L over SiO2 gate oxide of thickness tox (good insulator, εox = 3.9 ε0); n+ source and drain in p-type body]

Diffusion Capacitance
Csb, Cdb
Undesirable, called parasitic capacitance
Capacitance depends on area and perimeter
– Use small diffusion nodes
– Comparable to Cg for contacted diff
– ½ Cg for uncontacted
– Varies with process

RC Delay Model
Use equivalent circuits for MOS transistors
– Ideal switch + capacitance and ON resistance
– Unit nMOS has resistance R, capacitance C
– Unit pMOS has resistance 2R, capacitance C
Capacitance proportional to width
Resistance inversely proportional to width
[Figure: equivalent circuits for transistors of width k: nMOS with resistance R/k, pMOS with resistance 2R/k, each with capacitance kC on gate, source, and drain]

Introduction
Chips are mostly made of wires called interconnect
– In stick diagram, wires set size
– Transistors are little things under the wires
– Many layers of wires
Wires are as important as transistors
– Speed
– Power
– Noise
Alternating layers run orthogonally

Wire Capacitance
Wire has capacitance per unit length
– To neighbors
– To layers above and below
Ctotal = Ctop + Cbot + 2Cadj
[Figure: wire of width w, thickness t, and spacing s on layer n, at height h1 above layer n-1 and h2 below layer n+1; capacitances Ctop, Cbot, Cadj marked]

Lumped Element Models
Wires are a distributed system
– Approximate with lumped element models
[Figure: wire with total resistance R and capacitance C split into N segments of R/N and C/N; single-segment L-model, π-model (C/2 at each end), and T-model (R/2 on each side)]
3-segment π-model is accurate to 3% in simulation
L-model needs 100 segments for same accuracy!
Use single segment π-model for Elmore delay

Crosstalk
A capacitor does not like to change its voltage instantaneously.
A wire has high capacitance to its neighbor.
– When the neighbor switches from 1 -> 0 or 0 -> 1, the wire tends to switch too.
– Called capacitive coupling or crosstalk.
Crosstalk effects
– Noise on nonswitching wires
– Increased delay on switching wires

Coupling Waveforms
Simulated coupling for Cadj = Cvictim
– Victim (undriven): 50%
– Victim (half size driver): 16%
– Victim (equal size driver): 8%
– Victim (double size driver): 4%
[Figure: aggressor and victim voltages (0 to 1.8 V) vs. t (0 to 2000 ps)]

Introduction
What makes a circuit fast?
– I = C dV/dt -> tpd ∝ (C/I) ΔV
– low capacitance
– high current
– small swing
[Figure: two-input gate with inputs A and B; transistor widths 4, 4, 1, 1]
Logical effort is proportional to C/I
pMOS are the enemy!
– High capacitance for a given current
Can we take the pMOS capacitance off the input?
Various circuit families try to do this…

Pseudo-nMOS
In the old days, nMOS processes had no pMOS
– Instead, use pull-up transistor that is always ON
In CMOS, use a pMOS that is always ON
– Ratio issue
– Make pMOS about ¼ effective strength of pulldown network
[Figure: pseudo-nMOS inverter with load pMOS of width P/2 and nMOS of width 16/2; Vout vs. Vin (0 to 1.8 V) for P = 4, 14, 24]

Dynamic Logic
Dynamic gates use a clocked pMOS pullup
Two modes: precharge and evaluate
[Figure: static, pseudo-nMOS, and dynamic versions of a gate with input A and output Y, with transistor widths annotated; timing of Y through precharge, evaluate, precharge]

Pass Transistor Circuits
Use pass transistors like switches to do logic
Inputs drive diffusion terminals as well as gates
CMOS + Transmission Gates:
– 2-input multiplexer
– Gates should be restoring
[Figure: 2-input multiplexer with data inputs A and B, select S, and output Y]

Sequencing
Combinational logic
– output depends on current inputs
Sequential logic
– output depends on current and previous inputs
– Requires separating previous, current, future
– Called state or tokens
– Ex: FSM, pipeline
[Figure: finite state machine and pipeline built from combinational logic (CL) blocks and clocked (clk) registers, with in and out ports]

Sequencing Overhead
Use flip-flops to delay fast tokens so they move through exactly one stage each cycle.
Inevitably adds some delay to the slow tokens
Makes circuit slower than just the logic delay
– Called sequencing overhead
Some people call this clocking overhead
– But it applies to asynchronous circuits too
– Inevitable side effect of maintaining sequence

Sequencing Elements
Latch: Level sensitive
– a.k.a. transparent latch, D latch
Flip-flop: edge triggered
– a.k.a. master-slave flip-flop, D flip-flop, D register
Timing Diagrams
– Transparent
– Opaque
– Edge-trigger
[Figure: D latch and D flop symbols with a timing diagram of clk, D, Q (latch), Q (flop)]

Latch Design
Buffered output
+ No backdriving
Widely used in standard cells
+ Very robust (most important)
- Rather large
- Rather slow (1.5 – 2 FO4 delays)
- High clock loading
[Figure: latch schematic with input D, internal node X, output Q]

Sequencing Methods
Flip-flops
2-Phase Latches
Pulsed Latches
[Figure: timing over one clock period Tc for the three methods: flip-flops clocked by clk; 2-phase transparent latches on clock phases 1 and 2 with half-cycles of Tc/2 separated by tnonoverlap; pulsed latches on pulse clock p of width tpw; combinational logic between sequencing elements]

Summary
Flip-Flops:
– Very easy to use, supported by all tools
2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
Pulsed Latches:
– Fast, some skew tolerance and time borrowing, hold time risk

Full Adder Design I
Brute force implementation from eqns
$S = A \oplus B \oplus C$
$C_{out} = MAJ(A, B, C)$
[Figure: gate-level and transistor-level schematics of the full adder with inputs A, B, C and outputs S, Cout]

Carry-Skip Adder
Carry-ripple is slow through all N stages
Carry-skip allows carry to skip over groups of n bits
– Decision based on n-bit propagate signal
[Figure: 16-bit carry-skip adder in four 4-bit groups (A16:13/B16:13, A12:9/B12:9, A8:5/B8:5, A4:1/B4:1); group propagate signals P16:13, P12:9, P8:5, P4:1 control 2:1 multiplexers on the carries C12, C8, C4; sums S16:13, S12:9, S8:5, S4:1; Cin and Cout]

Tree Adder
If lookahead is good, lookahead across lookahead!
– Recursive lookahead gives O(log N) delay
Many variations on tree adders

Memory Arrays
Memory Arrays
  Random Access Memory
    Read/Write Memory (RAM) (Volatile)
      Static RAM (SRAM)
      Dynamic RAM (DRAM)
    Read Only Memory (ROM) (Nonvolatile)
      Mask ROM
      Programmable ROM (PROM)
      Erasable Programmable ROM (EPROM)
      Electrically Erasable Programmable ROM (EEPROM)
      Flash ROM
  Serial Access Memory
    Shift Registers
      Serial In Parallel Out (SIPO)
      Parallel In Serial Out (PISO)
    Queues
      First In First Out (FIFO)
      Last In First Out (LIFO)
  Content Addressable Memory (CAM)

Array Architecture
2^n words of 2^m bits each
If n >> m, fold by 2^k into fewer rows of more columns
[Figure: row decoder (n-k address bits) driving wordlines; bitline conditioning above the bitlines; memory cells in 2^(n-k) rows x 2^(m+k) columns; column circuitry and column decoder (k address bits) producing 2^m bits]
Good regularity – easy to design
Very high density if good cells are used

6T SRAM Cell
Cell size accounts for most of array size
– Reduce cell size at expense of complexity
6T SRAM Cell
– Used in most commercial chips
– Data stored in cross-coupled inverters
Read:
– Precharge bit, bit_b
– Raise wordline
Write:
– Drive data onto bit, bit_b
– Raise wordline
[Figure: 6T SRAM cell with bitlines bit and bit_b and wordline word]

SRAM Sizing
High bitlines must not overpower inverters during reads
But low bitlines must write new value into cell
[Figure: 6T SRAM cell with internal nodes A and A_b, bitlines bit and bit_b, wordline word; transistors labeled weak, med, and strong]

Decoders
n:2^n decoder consists of 2^n n-input AND gates
– One needed for each row of memory
– Build AND from NAND or NOR gates
[Figure: 2:4 decoders with inputs A1, A0 and outputs word0 to word3, in static CMOS and pseudo-nMOS, with transistor widths annotated]

Decoder Layout
Decoders must be pitch-matched to SRAM cell
– Requires very skinny gates
[Figure: decoder layout with address lines A3 to A0 and their complements, VDD and GND rails, and a NAND gate followed by a buffer inverter driving word]

Sense Amplifiers
Bitlines have many cells attached
– Ex: 32-kbit SRAM has 256 rows x 128 cols
– 128 cells on each bitline
tpd ∝ (C/I) ΔV
– Even with shared diffusion contacts, 64C of diffusion capacitance (big C)
– Discharged slowly through small transistors (small I)
Sense amplifiers are triggered on small voltage swing (reduce ΔV)

Queues
Queues allow data to be read and written at different rates.
Read and write each use their own clock, data
Queue indicates whether it is full or empty
Build with SRAM and read/write counters (pointers)
[Figure: queue block with WriteClk, WriteData, ReadClk, ReadData, FULL, EMPTY]

CAMs
Extension of ordinary memory (e.g. SRAM)
– Read and write memory as usual
– Also match to see which words contain a key
[Figure: CAM block with adr, data/key, read, write, match]

10T CAM Cell
Add four match transistors to 6T SRAM
– 56 x 43 λ unit cell
[Figure: 10T CAM cell with bit, bit_b, word, cell, cell_b, match]

CAM Cell Operation
Read and write like ordinary SRAM
For matching:
– Leave wordline low
– Precharge matchlines
– Place key on bitlines
– Matchlines evaluate
Miss line
– Pseudo-nMOS NOR of match lines
– Goes high if no words match
[Figure: CAM array with row decoder (address), CAM cells, matchlines match0 to match3, weak pullup on miss, clk, and column circuitry (read/write, data)]

ROM Example
4-word x 6-bit ROM
– Represented with dot diagram
– Dots indicate 1’s in ROM
Word 0: 010101
Word 1: 011001
Word 2: 100101
Word 3: 101010
[Figure: 2:4 decoder (A1, A0) driving a ROM array with weak pseudo-nMOS pullups on outputs Y5 to Y0]
Looks like 6 4-input pseudo-nMOS NORs

PLAs
A Programmable Logic Array performs any function in sum-of-products form.
Literals: inputs & complements
Products / Minterms: AND of literals
Outputs: OR of Minterms
Example: Full Adder
$s = a\bar{b}\bar{c} + \bar{a}b\bar{c} + \bar{a}\bar{b}c + abc$
$c_{out} = ab + bc + ac$
[Figure: PLA with inputs a, b, c feeding an AND plane that forms the minterms and an OR plane that produces the outputs s and cout]

PLA Schematic & Layout
[Figure: schematic and layout of the AND plane and OR plane for the full adder; inputs a, b, c; product lines bc, ac, ab and the four three-literal minterms; outputs s and cout]

Ideal nMOS I-V Plot
180 nm TSMC process
Ideal Models
– β = 155(W/L) μA/V2
– Vt = 0.4 V
– VDD = 1.8 V
[Figure: ideal Ids vs. Vds (0 to 1.8 V) for Vgs = 0.6, 0.9, 1.2, 1.5, 1.8]

Simulated nMOS I-V Plot
180 nm TSMC process
BSIM 3v3 SPICE models
What differs?
– Less ON current
– No square law
– Current increases in saturation
[Figure: simulated Ids vs. Vds (0 to 1.5 V) for Vgs = 0.6, 0.9, 1.2, 1.5, 1.8]

Velocity Saturation
We assumed carrier velocity is proportional to E-field
– v = μElat = μVds/L
At high fields, this ceases to be true
– Carriers scatter off atoms
– Velocity reaches vsat
• Electrons: 6-10 x 10^6 cm/s
• Holes: 4-8 x 10^6 cm/s
– Better model
$v = \frac{\mu E_{lat}}{1 + E_{lat}/E_{sat}} \Rightarrow v_{sat} = \mu E_{sat}$
[Figure: v vs. Elat (0 to 3Esat); initial slope μ, v = vsat/2 at Elat = Esat, approaching vsat]

Channel Length Modulation
Reverse-biased p-n junctions form a depletion region
– Region between n and p with no carriers
– Width Ld of depletion region grows with reverse bias
– Leff = L – Ld
Shorter Leff gives more current
– Ids increases with Vds
– Even in saturation
[Figure: nMOS cross-section with source and body at GND, gate and drain at VDD; depletion region of width Ld at the drain; L and Leff marked]

Body Effect
Vt: gate voltage necessary to invert channel
Increases if source voltage increases because source is connected to the channel
Increase in Vt with Vs is called the body effect

OFF Transistor Behavior
What about current in cutoff?
Simulated results
What differs?
– Current doesn’t go to 0 in cutoff
[Figure: simulated Ids (log scale, 10 pA to 1 mA) vs. Vgs (0 to 1.8 V) at Vds = 1.8; subthreshold region and subthreshold slope below Vt, saturation region above]

Leakage Sources
Subthreshold conduction
– Transistors can’t abruptly turn ON or OFF
Junction leakage
– Reverse-biased PN junction diode current
Gate leakage
– Tunneling through ultrathin gate dielectric
Subthreshold leakage is the biggest source in modern transistors

Low Power Design
Reduce dynamic power
– α: clock gating, sleep mode
– C: small transistors (esp. on clock), short wires
– VDD: lowest suitable voltage
– f: lowest suitable frequency
Reduce static power
– Selectively use ratioed circuits
– Selectively use low Vt devices
– Leakage reduction: stacked devices, body bias, low temperature

Chip-to-Package Bonding
Traditionally, chip is surrounded by pad frame
– Metal pads on 100 – 200 μm pitch
– Gold bond wires attach pads to package
– Lead frame distributes signals in package
– Metal heat spreader helps with cooling

Bidirectional Pads
Combine input and output pad
Need tristate driver on output
– Use enable signal to set direction
– Optimized tristate avoids huge series transistors
[Figure: bidirectional pad with PAD, En, Din, Dout; tristate driver built from a NAND and a NOR gate]

Device Scaling

Interconnect Delay
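
Worked sketch: dynamic power
The four knobs on the Low Power Design slide (activity factor, C, VDD, f) are the terms of the standard switching-power expression P = α C VDD^2 f. The sketch below only illustrates how each knob scales the result. The function name and every numeric value are hypothetical, chosen for illustration, and are not taken from the slides.

```python
def dynamic_power(alpha, c_farads, vdd_volts, f_hz):
    """Switching power in watts: P = alpha * C * VDD^2 * f."""
    return alpha * c_farads * vdd_volts ** 2 * f_hz


# Hypothetical operating point (illustrative values only):
# 10% activity factor, 1 nF of switched capacitance, 1.8 V supply, 500 MHz clock.
baseline = dynamic_power(alpha=0.1, c_farads=1e-9, vdd_volts=1.8, f_hz=500e6)

# Same design with only the supply lowered to 1.2 V.
lower_vdd = dynamic_power(alpha=0.1, c_farads=1e-9, vdd_volts=1.2, f_hz=500e6)

# Same design with only the clock frequency halved.
half_freq = dynamic_power(alpha=0.1, c_farads=1e-9, vdd_volts=1.8, f_hz=250e6)

print(f"baseline:  {baseline:.3f} W")
print(f"VDD 1.2 V: {lower_vdd:.3f} W ({lower_vdd / baseline:.2f}x)")
print(f"f / 2:     {half_freq:.3f} W ({half_freq / baseline:.2f}x)")
```

Because VDD enters squared, lowering the supply cuts power faster than the same relative cut in α, C, or f, which each scale it linearly.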