EE141 Review: What Have We Learnt?

The MOSFET
Saturation ($V_{DS} > V_{GS} - V_T$):
  $I_D = \frac{k'_n}{2} \cdot \frac{W}{L} \cdot (V_{GS} - V_T)^2 \cdot (1 + \lambda V_{DS})$
Resistive ($V_{DS} < V_{GS} - V_T$):
  $I_D = k'_n \cdot \frac{W}{L} \cdot \left[ (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right]$
with
  $V_T = V_{T0} + \gamma \left( \sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|} \right)$
[Figure: NMOS symbol with terminals G, D, S and drain current I_D]

Regions of Operation - Simplified
Define $V_{GT} = V_{GS} - V_T$, with $V_{DSAT} \approx L \cdot \xi_c$.
[Figure: I_D (A, x 10^-4) versus V_DS (V), showing the linear, velocity-saturation and saturation regions bounded by V_DS = V_DSAT, V_DSAT = V_GT and V_DS = V_GT]

A Unified Model for Manual Analysis
Define $V_{GT} = V_{GS} - V_T$.
for $V_{GT} \le 0$:  $I_D = 0$
for $V_{GT} \ge 0$:  $I_D = k' \cdot \frac{W}{L} \cdot \left( V_{GT} V_{min} - \frac{V_{min}^2}{2} \right) \cdot (1 + \lambda V_{DS})$
with $V_{min} = \min(V_{GT}, V_{DS}, V_{DSAT})$
[Figure: four-terminal device symbol with G, D, S, B and current I_D]

CMOS Inverter First-Order DC Analysis
$V_{OL} = 0$
$V_{OH} = V_{DD}$
$V_M = f(R_n, R_p)$
[Figure: switch models of the inverter. For V_in = V_DD the output is pulled down through R_n; for V_in = 0 it is pulled up through R_p]

Capacitive Device Model
Gate-Channel Capacitance
  $C_{GC} = C_{ox} \cdot W \cdot L_{eff}$  (Off, Linear)
  $C_{GC} = (2/3) \cdot C_{ox} \cdot W \cdot L_{eff}$  (Saturation)
Gate Overlap Capacitance
  $C_{GSO} = C_{GDO} = C_O \cdot W$  (Always)
Junction/Diffusion Capacitance
  $C_{diff} = C_j \cdot L_S \cdot W + C_{jsw} \cdot (2 L_S + W)$  (Always)

Voltage Transfer Characteristic
[Figure: V_out versus V_in, marking V_OH, V_OL, V_IL and V_IH. V_IL and V_IH are the points where the slope is -1; the input range between them is the undefined region]

Definition of Noise Margins
[Figure: gate output (stage M) driving gate input (stage M+1), showing the "1" and "0" ranges, the undefined region, and the margins NM_H and NM_L]

Impact of Sizing/Process Variations
[Figure: two V_out (V) versus V_in (V) plots. Left: wider PMOS, symmetrical, wider NMOS. Right: fast PMOS / slow NMOS, nominal, fast NMOS / slow PMOS]

The Transistor Req
For $V_{GS} \ge V_T$ the transistor is modelled as a switch with on-resistance $R_{on}$ between S and D.
  $R_{eq} = \frac{1}{2} \cdot (R_0 + R_{mid})$
  $R_{eq} = \frac{1}{2} \cdot \left( \frac{V_{DD}}{I_{DSAT} (1 + \lambda V_{DD})} + \frac{V_{DD}/2}{I_{DSAT} (1 + \lambda V_{DD}/2)} \right)$
  $R_{eq} \approx \frac{3}{4} \cdot \frac{V_{DD}}{I_{DSAT}} \cdot \left( 1 - \frac{5}{6} \lambda V_{DD} \right)$
$R_{eq} = f(V_{DD})$
[Figure: I_D versus V_DS at V_GS = V_DD, with R_0 taken at V_DS = V_DD and R_mid at V_DS = V_DD/2]

Computing the Capacitances
[Figure: inverter M1/M2 driving a fanout inverter M3/M4, annotated with C_gd12 (Miller effect), C_db1 and C_db2 (reverse-biased junctions), wire capacitance C_w, and gate capacitance C_g4]
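The unified model for manual analysis can be evaluated numerically. The following is a minimal sketch, not a calibrated device model: the function name and the default parameter values (k', V_T, V_DSAT, lambda) are illustrative assumptions for a generic short-channel NMOS with source and body tied together, and are not taken from these slides.

```python
def drain_current(vgs, vds, w_over_l, kp=115e-6, vt=0.43, vdsat=0.63, lam=0.06):
    """Unified MOSFET model: I_D = k' (W/L) (VGT*Vmin - Vmin^2/2) (1 + lambda*VDS).

    All default parameter values are assumed, illustrative numbers.
    """
    vgt = vgs - vt
    if vgt <= 0:
        # Cut-off: the model gives zero current (sub-threshold conduction ignored).
        return 0.0
    # Vmin selects the operating region: VDS (linear),
    # VGT (saturation) or VDSAT (velocity saturation).
    vmin = min(vgt, vds, vdsat)
    return kp * w_over_l * (vgt * vmin - vmin ** 2 / 2) * (1 + lam * vds)


if __name__ == "__main__":
    for vds in (0.2, 0.63, 2.5):
        print(f"VDS = {vds:4.2f} V -> ID = {drain_current(2.5, vds, 1.5):.3e} A")
```

Because the region is selected by a single min(), the same expression covers all three regions, which is what makes the model convenient for hand analysis.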
Noise margins
Noise margin low: $NM_L = V_{IL} - V_{OL}$
Noise margin high: $NM_H = V_{OH} - V_{IH}$

Computing the Capacitances (simplified model)
[Figure: the fanout gate M3/M4 with gate capacitances C_g3 and C_g4, and the simplified model in which V_in drives V_out with a single lumped load C_L]
Annotations: the gate-drain capacitance of the driving inverter is subject to the Miller effect; the drain junctions are reverse biased; the fanout gate capacitances see no Miller effect, with M4 going Off -> Sat and M3 in Lin (assuming an HL transition at V_out).

Propagation Delay
  $t_p = 0.69 \cdot C_L \cdot \frac{R_{eqn} + R_{eqp}}{2}$
[Figure: V_out (V) versus t (sec, x 10^-10), marking t_pHL and t_pLH]

Power in CMOS
Switching power: charging/discharging capacitors.
Leakage power: transistors are imperfect switches; junction diodes.
Short-circuit power: both pull-up and pull-down are on during a transition.

Dynamic (Switching) Power
Energy consumed in N cycles, $E_N$:
  $E_N = C_L \cdot V_{DD}^2 \cdot n_{0 \to 1}$
where $n_{0 \to 1}$ is the number of 0 -> 1 transitions in N cycles.
  $P_{avg} = \lim_{N \to \infty} \frac{E_N}{N} \cdot f = \left( \lim_{N \to \infty} \frac{n_{0 \to 1}}{N} \right) \cdot C_L \cdot V_{DD}^2 \cdot f$
  $\alpha_{0 \to 1} = \lim_{N \to \infty} \frac{n_{0 \to 1}}{N}$
  $P_{avg} = \alpha_{0 \to 1} \cdot C_L \cdot V_{DD}^2 \cdot f$

Short Circuit Current
Large load: $I_{sc} \sim 0$. Small load: $I_{sc} = I_{MAX}$.
[Figure: I_sc (A, x 10^-4) versus time (s) for C_L = 20 fF, 100 fF and 500 fF]

Transistor Leakage
Components: drain junction leakage and sub-threshold current.
Sub-threshold current is one of the most compelling issues in low-energy circuit design.
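The propagation-delay and switching-power formulas are simple enough to check by hand, and a short script makes the unit bookkeeping explicit. This is a sketch only: the function names and all numeric values (load capacitance, equivalent resistances, activity factor, supply, frequency) are hypothetical and not taken from the slides.

```python
def propagation_delay(c_load, r_eqn, r_eqp):
    """t_p = 0.69 * C_L * (R_eqn + R_eqp) / 2, in seconds."""
    return 0.69 * c_load * (r_eqn + r_eqp) / 2


def dynamic_power(alpha, c_load, vdd, f_clk):
    """P_avg = alpha_0->1 * C_L * VDD^2 * f, in watts."""
    return alpha * c_load * vdd ** 2 * f_clk


if __name__ == "__main__":
    c_load = 10e-15            # assumed 10 fF load
    r_eqn = r_eqp = 8e3        # assumed 8 kOhm pull-down and pull-up
    print("t_p   =", propagation_delay(c_load, r_eqn, r_eqp), "s")
    # Assumed: 10% of the cycles produce a 0->1 transition, 2.5 V, 100 MHz.
    print("P_avg =", dynamic_power(0.1, c_load, 2.5, 100e6), "W")
```

Note the quadratic dependence on V_DD in the power expression: halving the supply cuts switching power by four, while the delay only changes through R_eq.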
Short circuit current is usually well controlled.

WIRE: Fringing and Parallel Plate Capacitance
[Figure: (a) wire cross-section of width W and height H; (b) model splitting it into a parallel-plate part of width W - H/2 and a fringing part]

Fringing versus Parallel Plate
[Figure: from [Bakoglu89]]

Impact of Interwire Capacitance
[Figure: from [Bakoglu89]]

The Lumped RC-Model
The Elmore Delay

Driving an RC-line
[Figure: source V_in with resistance R_s driving a distributed line (r_w, c_w, L) to V_out]

Logic Circuits

Static Complementary CMOS
[Figure: inputs In1, In2, ..., InN drive a pull-up network (PUN, PMOS only) connected to V_DD and a pull-down network (PDN, NMOS only) connected to ground; the output is F(In1, In2, ..., InN)]
PUN and PDN are dual logic networks.
PUN and PDN functions are complementary.

Threshold Drops
PUN: a PMOS pulls the output 0 -> V_DD; an NMOS only reaches 0 -> V_DD - V_Tn.
PDN: an NMOS pulls the output V_DD -> 0; a PMOS only reaches V_DD -> |V_Tp|.

Constructing a Complex Gate
  $OUT = \overline{D + A \cdot (B + C)}$
(a) pull-down network
(b) deriving the pull-up network hierarchically by identifying sub-nets (SN1 to SN4)
(c) complete gate

Ratioed Logic
(a) resistive load R_L
(b) depletion load NMOS (V_T < 0)
(c) pseudo-NMOS (PMOS load)
Goal: to reduce the number of devices over complementary CMOS.

DCVSL
Differential Cascode Voltage Switch Logic (DCVSL)
[Figure: cross-coupled PMOS loads M1 and M2 on top of two complementary pull-down networks PDN1 and PDN2, with true and complemented inputs (A, B) and outputs]

Pass-Transistor Logic
[Figure: inputs feed a switch network that drives Out]
• N transistors
• No static consumption

NMOS-only Switch
[Figure: NMOS pass transistor M_n with A = 2.5 V and C = 2.5 V driving node B (load C_L), followed by inverter M1/M2]
V_B does not pull up to 2.5 V, but only to 2.5 V - V_TN.
The threshold voltage loss causes static power consumption.
The NMOS has a higher threshold than the PMOS (body effect).

Complementary Pass Transistor Logic
(a) A pass-transistor network and its inverse network, driven by A, B and their complements, produce F and its complement.
(b) Example gates: AND/NAND (F = AB), OR/NOR (F = A + B), EXOR/NEXOR (F = A ⊕ B).

Transmission Gate XOR
[Figure: inverter M1/M2 and transmission gate M3/M4 implementing F = A ⊕ B]
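The statement that the PUN and PDN are dual, complementary networks can be checked exhaustively for the complex gate OUT = NOT(D + A·(B + C)). The sketch below models each network as a Boolean "conducts" function; the function names are hypothetical, and the series/parallel structure is the usual dual construction (series in the PDN becomes parallel in the PUN and vice versa), assumed here rather than read off the lost figure.

```python
from itertools import product


def pdn_conducts(a, b, c, d):
    # NMOS pull-down: D in parallel with (A in series with (B parallel C)).
    return d or (a and (b or c))


def pun_conducts(a, b, c, d):
    # PMOS pull-up (dual network): each PMOS conducts when its input is low.
    # D in series with (A in parallel with (B in series with C)).
    return (not d) and ((not a) or ((not b) and (not c)))


def networks_are_complementary():
    """True if exactly one of PUN/PDN conducts for every input combination."""
    return all(
        pdn_conducts(*bits) != pun_conducts(*bits)
        for bits in product([False, True], repeat=4)
    )


if __name__ == "__main__":
    print("complementary:", networks_are_complementary())
```

When exactly one network conducts for every input, the output is always driven and there is no static path from V_DD to ground.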
Delay Optimization

Delay in Transmission Gate Networks
[Figure: (a) chain of transmission gates (gates at 2.5 V and 0) from In through nodes V_1 ... V_n, each loaded by C; (b) equivalent RC network with one R_eq per switch and C at each node; (c) the same chain with a buffer inserted every m switches]

Sizing for Speed: Buffer Example
[Figure: chain of N inverters from In to Out with input capacitances C_1, C_2, ..., C_N and load C_L = C_{N+1}]
  $Delay = \sum_{i=1}^{N} (1 + f_i)$  (in units of $\tau_{inv}$), with $f_i = C_{i+1}/C_i$
For given N: $C_{i+1}/C_i = C_i/C_{i-1}$
To find N: $C_{i+1}/C_i \sim 4$
How to generalize this to any logic path?

Logical Effort
  $Delay = k \cdot R_{unit} C_{unit} \left( 1 + \frac{C_L}{\gamma C_{in}} \right) = \tau (p + g \cdot f)$
p: intrinsic delay ($3 k R_{unit} C_{unit} \gamma$), a gate parameter that does not depend on W
g: logical effort ($k R_{unit} C_{unit}$), a gate parameter that does not depend on W
f: electrical effort (effective fanout)
Normalize everything to an inverter: $g_{inv} = 1$, $p_{inv} = 1$.
Divide everything by $\tau_{inv}$ (everything is measured in unit delays $\tau_{inv}$).
Assume $\gamma = 1$.

Multistage Networks
  $Delay = \sum_{i=1}^{N} (p_i + g_i \cdot f_i)$
Stage effort: $h_i = g_i f_i$
Path electrical effort: $F = C_{out}/C_{in}$
Path logical effort: $G = g_1 g_2 \cdots g_N$
Branching effort: $B = b_1 b_2 \cdots b_N$
Path effort: $H = GFB$
Path delay: $D = \sum d_i = \sum p_i + \sum h_i$

Optimum Effort per Stage
When each stage bears the same effort:
  $h^N = H$, so $h = \sqrt[N]{H}$
Stage efforts: $g_1 f_1 = g_2 f_2 = \cdots = g_N f_N$
Effective fanout of each stage: $f_i = h / g_i$
Minimum path delay:
  $\hat{D} = \sum (g_i f_i + p_i) = N H^{1/N} + P$

Logical Effort Summary
                     Stage             Path
Logical effort       g                 G = ∏ g_i
Electrical effort    f = Cout/Cin      F = Cout/Cin
Branching effort     n/a               B = ∏ b_i
Effort               h = f·g           H = F·G·B
Effort delay         h                 D_H = Σ h_i
Number of stages     1                 N
Parasitic delay      p                 P = Σ p_i
Delay                d = h + p         D = D_H + P

Dynamic Gate
[Figure: precharge PMOS M_p (gate Clk) on top of a PDN with inputs In1, In2, In3 and an evaluate NMOS M_e (gate Clk), output Out loaded by C_L; example with inputs A, B, C whose output is labelled ((AB)+C)]
Two-phase operation:
  Precharge (Clk = 0): M_p on, M_e off
  Evaluate (Clk = 1): M_p off, M_e on

Solution to Charge Leakage
[Figure: dynamic gate with keeper transistor M_kp on the output]
Same approach as the level restorer for pass-transistor logic.

Charge Sharing
[Figure: dynamic gate with series NMOS M_a (input A, internal node X) below Out and C_L]
Case 1 applies if $\Delta V_{out} < V_{Tn}$.
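The logical-effort relations (H = GFB, h = H^(1/N), f_i = h/g_i, D = N·H^(1/N) + P) translate directly into a few lines of code. The sketch below is illustrative only: the function name and the example path (three inverters with g = p = 1 driving a load 64 times the input capacitance) are assumptions chosen to match the normalization g_inv = p_inv = 1, gamma = 1.

```python
from math import prod


def min_path_delay(g, p, b, c_out, c_in):
    """Return (stage effort h, per-stage fanouts f_i, minimum delay in tau_inv).

    g, p, b are per-stage lists of logical effort, parasitic delay and
    branching effort.
    """
    n = len(g)
    path_effort = prod(g) * (c_out / c_in) * prod(b)   # H = G * F * B
    h = path_effort ** (1.0 / n)                       # equal effort per stage
    fanouts = [h / gi for gi in g]                     # f_i = h / g_i
    delay = n * h + sum(p)                             # D = N * H^(1/N) + P
    return h, fanouts, delay


if __name__ == "__main__":
    h, f, d = min_path_delay(g=[1, 1, 1], p=[1, 1, 1], b=[1, 1, 1],
                             c_out=64, c_in=1)
    print(f"h = {h:.3f}, fanouts = {[round(x, 3) for x in f]}, D = {d:.3f}")
```

For the inverter chain this reproduces the buffer-sizing rule of thumb: a total fanout of 64 over three stages gives a per-stage fanout of about 4.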
Charge sharing (equations)
[Figure: below M_a sits M_b (B = 0) and the evaluate transistor M_e; C_a and C_b are the internal node capacitances]
Case 1, if $\Delta V_{out} < V_{Tn}$:
  $C_L V_{DD} = C_L V_{out}(t) + C_a \left( V_{DD} - V_{Tn}(V_X) \right)$
or
  $\Delta V_{out} = V_{out}(t) - V_{DD} = -\frac{C_a}{C_L} \left( V_{DD} - V_{Tn}(V_X) \right)$
Case 2, if $\Delta V_{out} > V_{Tn}$:
  $\Delta V_{out} = -V_{DD} \cdot \frac{C_a}{C_a + C_L}$

Issues in Dynamic Design 3: Backgate Coupling
[Figure: dynamic NAND (A = 0, B = 0, Out1 = 1, load C_L1) driving a static NAND with second input In (Out2 = 0, load C_L2)]

Backgate Coupling Effect
[Figure: voltage versus time (ns) for Clk, In, Out1 and Out2]

Issues in Dynamic Design 4: Clock Feedthrough
Coupling between Out and the Clk input of the precharge device is due to the gate-to-drain capacitance, so the voltage of Out can rise above V_DD. The fast rising (and falling) edges of the clock couple to Out.

Domino Logic
[Figure: two dynamic stages, each followed by a static inverter; the first stage has inputs In1, In2, In3, the second In4, In5 and a keeper M_kp. The dynamic node makes 1 -> 1 or 1 -> 0 transitions; the inverter output makes 0 -> 0 or 0 -> 1 transitions]

np-CMOS
[Figure: an n-type dynamic stage (PDN, inputs In1, In2, In3, output Out1) directly driving a p-type dynamic stage (PUN, inputs In4, In5, output Out2, which goes to a PDN)]
Only 0 -> 1 transitions are allowed at the inputs of a PDN.
Only 1 -> 0 transitions are allowed at the inputs of a PUN.

Sequential Logic

Latch versus Register
Latch: level-sensitive. Clock low: hold mode. Clock high: transparent.
Register: edge-triggered. Stores data when the clock rises.
[Figure: D/Q/Clk symbols and waveforms for a latch and a register]

Mux-Based Latch

Timing Definitions
[Figure: register with D, Q, CLK and waveforms marking the setup time t_su and hold time t_hold (D must be stable around the clock edge) and the clock-to-output delay t_c->q]

Master-Slave (Edge-Triggered) Register
[Figure: master and slave mux-based latches in series, with internal node Q_M]
Two opposite latches trigger on the edge.
Also called a master-slave latch pair.

More Precise Setup Time
[Figure: (a) D, CLK and Q waveforms; (b) t_C2Q versus the data-to-clock time t_D2C, marking t_Su and t_H at the point where the delay reaches 1.05 t_C2Q]

Other Latches/Registers: C2MOS
[Figure: master stage (M1 to M4, node X, C_L1) and slave stage (M5 to M8, output Q, C_L2), clocked by CLK and its complement]

Other Latches/Registers: TSPC
Positive latch: transparent when CLK = 1.
Keepers should be added to staticize.
Negative latch: transparent when CLK = 0.

Pulse-Triggered Latches
Ways to design an edge-triggered sequential cell:
• Master-slave latches: Data passes through two latches L1 and L2, both controlled by Clk.
• Pulse-triggered latch: Data passes through a single latch L, controlled by a pulse derived from Clk.

Pulsed Latches
[Figure: (a) register built from M1 to M6 and clocked by CLKG; (b) glitch generation circuit (M_P, M_N, node X) producing CLKG from CLK; (c) CLK and the resulting glitch clock CLKG]

Other Sequential Circuits
Schmitt trigger
Multivibrators: monostable, astable

Timing Constraints
[Figure: register R1 (clock t_CLK1) feeding combinational logic feeding register R2 (clock t_CLK2). Register parameters: t_c-q, t_c-q,cd, t_su, t_hold. Logic parameters: t_logic, t_logic,cd]
Cycle time: $T_{Clk} > t_{c-q} + t_{logic} + t_{su}$
Race margin: $t_{hold} < t_{c-q,cd} + t_{logic,cd}$

Timing Constraints (with skew δ between t_CLK1 and t_CLK2)
Minimum cycle time: $T + \delta = t_{c-q} + t_{logic} + t_{su}$
The worst case is when the receiving edge arrives early (negative δ).

Hold time constraint: $t_{c-q,cd} + t_{logic,cd} > t_{hold} + \delta$
The worst case is when the receiving edge arrives late: a race between data and clock.
Negative skew: the hold (race) constraint is relaxed, so the system never fails functionally; the cost is a longer minimum cycle time.
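The two skew-aware constraints can be checked mechanically for a given register pair. The following sketch assumes hypothetical function names and made-up delays in picoseconds; δ is positive when the receiving clock edge arrives late, matching the sign convention of the constraints above.

```python
def min_clock_period(t_cq, t_logic, t_su, skew=0):
    """Smallest T satisfying T + skew >= t_c-q + t_logic + t_su."""
    return t_cq + t_logic + t_su - skew


def hold_ok(t_cq_cd, t_logic_cd, t_hold, skew=0):
    """True if t_c-q,cd + t_logic,cd > t_hold + skew (no race)."""
    return t_cq_cd + t_logic_cd > t_hold + skew


if __name__ == "__main__":
    # All delays are assumed example values, in picoseconds.
    for skew in (-30, 0, 50):
        period = min_clock_period(t_cq=100, t_logic=500, t_su=50, skew=skew)
        safe = hold_ok(t_cq_cd=40, t_logic_cd=20, t_hold=30, skew=skew)
        print(f"skew = {skew:4d} ps: T_min = {period} ps, hold met: {safe}")
```

With these numbers, positive skew shortens the minimum period but breaks the hold check, while negative skew does the opposite, which is the trade-off the two constraints express.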
Longest Logic Path in Edge-Triggered Systems
[Figure: clock waveform marking T_Clk-Q, T_logic and T_SU within one period, with uncertainty T_JS + δ; the latest point of launching is set against the earliest arrival of the next cycle]

Impact of Jitter
[Figure: CLK waveform with period T_CLK and edge uncertainty from -t_jitter to +t_jitter, edges numbered 1 to 6; In feeds registers (REGS) and combinational logic. Register parameters: t_c-q, t_c-q,cd, t_su, t_hold, t_jitter. Logic parameters: t_logic, t_logic,cd. The worst case is marked]

Clock Constraints in Edge-Triggered Systems
If the launching edge is late and the receiving edge is early, the data will not be too late if:
  $T_{c-q} + T_{logic} + T_{SU} < T - T_{JS,1} - T_{JS,2} + \delta$
The minimum cycle time is determined by the maximum delays through the logic:
  $T_{c-q} + T_{logic} + T_{SU} - \delta + 2 T_{JS} < T$
Skew can be either positive or negative.

Shortest Path
[Figure: Clk1 and Clk2 waveforms marking T_clk-q,cd, T_logic,cd and T_H relative to the nominal clock edge; from the earliest point of launching, data must not arrive before the end of the hold window]

Clock Constraints in Edge-Triggered Systems
If the launching edge is early and the receiving edge is late:
  $T_{c-q,cd} + T_{logic,cd} - T_{JS,1} > T_H + T_{JS,2} + \delta$
Minimum logic delay:
  $T_{c-q,cd} + T_{logic,cd} > T_H + 2 T_{JS} + \delta$

Full-Adder
[Figure: full adder block with inputs A, B, C_in and outputs Sum, C_out]

Express Sum and Carry as a function of P, G, D
Define 3 new variables which ONLY depend on A, B:
  Generate (G) = $AB$
  Propagate (P) = $A \oplus B$
  Delete (D) = $\bar{A}\,\bar{B}$
Expressions for S and C_o can also be derived based on D and P.
Note that we will sometimes use an alternate definition: Propagate (P) = $A + B$.

Manchester Carry Chain
[Figure: precharged (φ) chain with pass transistors P_0 ... P_3, pull-downs G_0 ... G_3, input C_i,0 and carry nodes C_0 ... C_3]

Carry-Bypass Adder
Also called carry-skip.
[Figure: four full adders (P_0/G_0 ... P_3/G_3) rippling C_i,0 through C_o,0 ... C_o,3, plus the same chain with a bypass multiplexer controlled by BP = P_0 P_1 P_2 P_3]
Idea: if (P_0 and P_1 and P_2 and P_3 = 1) then C_o,3 = C_i,0, else "kill" or "generate".

Carry-Bypass Adder (cont.)
[Figure: 16-bit adder in four M-bit groups (bits 0-3, 4-7, 8-11, 12-15), each with setup, carry propagation and sum blocks; the critical path is annotated with t_setup, t_bypass and t_sum]
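The slide on P, G and D states that S and C_o can be expressed through these signals without giving the expressions. A common form, assumed here, is S = P ⊕ C_i and C_o = G + P·C_i with P = A ⊕ B. The sketch below (hypothetical function name) checks that form against ordinary addition for all eight input combinations.

```python
from itertools import product


def full_adder_pgd(a, b, c_in):
    """One-bit full adder via generate/propagate/delete signals (bits are 0 or 1)."""
    g = a & b                  # generate: carry out is 1 regardless of c_in
    p = a ^ b                  # propagate: carry out equals c_in
    d = (1 - a) & (1 - b)      # delete: carry out is 0 regardless of c_in
    s = p ^ c_in               # assumed form: S = P xor Ci
    c_out = g | (p & c_in)     # assumed form: Co = G + P*Ci
    return s, c_out, (g, p, d)


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        s, c_out, (g, p, d) = full_adder_pgd(a, b, c)
        print(f"A={a} B={b} Ci={c} -> S={s} Co={c_out}  G={g} P={p} D={d}")
```

For every input pair exactly one of G, P and D is 1, which is why the carry behaviour of a bit position is fully described by these three signals.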
Carry-bypass adder delay (N bits in groups of M bits):
  $t_{adder} = t_{setup} + M t_{carry} + (N/M - 1) t_{bypass} + (M - 1) t_{carry} + t_{sum}$

Linear Carry Select
[Figure: 16-bit adder in four 4-bit groups (bits 0-3, 4-7, 8-11, 12-15). Each group has a setup block, a "0" carry chain and a "1" carry chain computed in parallel, a multiplexer selected by the incoming carry (starting from C_i,0), and sum generation producing S_0-3, S_4-7, S_8-11, S_12-15. Delay annotations run from (1) after setup and (5) after the carry chains, through (6), (7), (8) at the successive multiplexers, to (10) at the final sum]

Square Root Carry Select
[Figure: 20-bit adder with groups of increasing size (bits 0-1, 2-4, 5-8, 9-13, 14-19), the same setup / "0" carry / "1" carry / multiplexer / sum generation structure, producing S_0-1, S_2-4, S_5-8, S_9-13, S_14-19. Carry chains finish at (3), (4), (5), (6), (7) so that each is ready when the previous multiplexer output arrives; the final sum is available at (9)]

Tree Adders
[Figure: inputs (A_0, B_0) ... (A_14, B_14) and outputs S_0 ... S_15 of a parallel-prefix tree]

Carry Lookahead Trees
  $C_{o,0} = G_0 + P_0 C_{i,0}$
  $C_{o,1} = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$
  $C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$
Can continue building the tree hierarchically.
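The carry-bypass delay expression can be explored numerically to see how the group size M trades ripple time inside a group against the number of bypass stages. In this sketch the function name is hypothetical and all component delays are set to one arbitrary unit, which is an assumption made purely for illustration.

```python
def carry_bypass_delay(n_bits, m_bits, t_setup, t_carry, t_bypass, t_sum):
    """t_adder = t_setup + M*t_carry + (N/M - 1)*t_bypass + (M - 1)*t_carry + t_sum."""
    if n_bits % m_bits != 0:
        raise ValueError("n_bits must be a multiple of m_bits")
    groups = n_bits // m_bits
    return (t_setup
            + m_bits * t_carry            # ripple through the first group
            + (groups - 1) * t_bypass     # skip over the remaining groups
            + (m_bits - 1) * t_carry      # ripple inside the last group
            + t_sum)


if __name__ == "__main__":
    # Assumed unit delays for every component.
    for m in (1, 2, 4, 8, 16):
        print(f"M = {m:2d}: t_adder = {carry_bypass_delay(16, m, 1, 1, 1, 1)}")
```

With equal unit delays the 16-bit example reaches its minimum for small-to-medium groups; with realistic component delays the best M will differ.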
Grouping the terms of the carry-lookahead expression for C_o,2:
  $C_{o,2} = (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0 C_{i,0}) = G_{2:1} + P_{2:1} C_{o,0}$

[Figure: 16-bit radix-2 Kogge-Stone tree, inputs up to (A_15, B_15)]

The Array Multiplier
[Figure: 4 x 4 array multiplier with multiplicand bits X_3 ... X_0, multiplier bits Y_0 ... Y_3, rows of half adders (HA) and full adders (FA), and product bits Z_0 ... Z_7]

Carry-Save Multiplier
[Figure: 4 x 4 carry-save array of HA and FA cells followed by a vector merging adder, producing Z_0 ... Z_7]
  $t_{mult} = (N - 1) \cdot t_{carry} + (N - 1) \cdot t_{and} + t_{merge}$

Equivalent Transient Model for MOS NOR ROM
[Figure: model for NOR ROM. Word line WL with distributed r_word, c_word; bit line BL with lumped C_bit, precharged to V_DD]
Word line parasitics:
• Wire capacitance and gate capacitance
• Wire resistance (polysilicon)
Bit line parasitics:
• Resistance not dominant (metal)
• Drain and gate-drain capacitance

Equivalent Transient Model for MOS NAND ROM
[Figure: model for NAND ROM. Word line WL with distributed r_word, c_word; bit line BL with distributed r_bit, c_bit and load C_L, precharged to V_DD]
Word line parasitics:
• Similar to NOR ROM
Bit line parasitics:
• Resistance of cascaded transistors dominates
• Drain/source and complete gate capacitance

NAND Flash Memory
[Figure: unit cell cross-section showing the word line (poly), control gate, ONO layer, floating gate (FG), gate oxide and source line (diffusion layer). Courtesy Toshiba]

6-transistor CMOS SRAM Cell
[Figure: cross-coupled inverters M1 to M4 storing Q and its complement, with access transistors M5 and M6 connecting to BL and its complement under control of WL]

1-Transistor DRAM Cell
Write: $C_S$ is charged or discharged by asserting WL and BL.
Read: charge redistribution takes place between the bit line and the storage capacitance.
  $\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE}) \cdot \frac{C_S}{C_S + C_{BL}}$
The voltage swing is small, typically around 250 mV.
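The DRAM read equation is a plain charge-redistribution ratio, so a small calculation shows why the swing is so small. The sketch below uses a hypothetical function name and assumed values (50 fF storage cell, 200 fF bit line, 1.25 V precharge, 2.5 V or 0 V stored); none of these numbers come from the slides.

```python
def dram_read_swing(v_bit, v_pre, c_s, c_bl):
    """Bit-line swing: dV = (V_BIT - V_PRE) * C_S / (C_S + C_BL)."""
    return (v_bit - v_pre) * c_s / (c_s + c_bl)


if __name__ == "__main__":
    c_s, c_bl = 50e-15, 200e-15   # assumed storage and bit-line capacitances
    v_pre = 1.25                  # assumed precharge voltage
    for label, v_bit in (("stored 1", 2.5), ("stored 0", 0.0)):
        dv = dram_read_swing(v_bit, v_pre, c_s, c_bl)
        print(f"{label}: dV = {dv * 1e3:+.0f} mV")
```

Because C_BL is several times larger than C_S, only a fraction of the stored voltage difference appears on the bit line, which is why a sense amplifier is needed to restore full logic levels.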