Digital IC-Design
Chapter 6: Combinational Logic

CMOS Steady State Behavior
- High noise margin (proper design ⇒ 40% of VDD)
- Low-impedance connection to GND or VDD
- High gain
- “No” static power consumption
- Comparable rise and fall times
- VM close to VDD/2
- VOH at VDD and VOL at GND

Overview
Static CMOS
- Conventional static CMOS logic
- Ratioed logic
- Differential logic
- Pass-transistor logic
Dynamic CMOS logic
- Domino logic
- np-CMOS

Combinational vs. Sequential
Combinational: Output = f(In(t))
Sequential: Output = f(In(t), In(t-1))
[Figure: a combinational logic block, and a combinational logic block with a state element in a feedback loop.]

Static vs. Dynamic CMOS Design
Static: each gate output has a low-resistance path to either VDD or GND.
Dynamic: relies on temporary storage of signal values on the capacitance of high-impedance circuit nodes.
[Figure: a static NAND gate next to a dynamic (clocked, CLK) NAND gate.]

Static CMOS
Pull-up net (PUN): PMOS only
Pull-down net (PDN): NMOS only
PUN and PDN are dual networks.

NMOS not Good for Passing “1”
The NMOS transistor turns off as the output approaches VDD - VT, so the output never rises above VDD - VT.
[Figure: NMOS pass transistors driven from VDD, with outputs at VDD - VT.]

CMOS
NMOS passes a “strong” 0 but a “weak” 1 (VDD - VT).
PMOS passes a “strong” 1 but a “weak” 0 (|VT|).

Static CMOS Logic Style
NMOS in series: Out = In if $A \cdot B$
PMOS in series: Out = In if $\bar{A} \cdot \bar{B}$
PUN is the dual of PDN, as shown by De Morgan's laws:
$\overline{A + B} = \bar{A}\,\bar{B}$
$\overline{A B} = \bar{A} + \bar{B}$
The static gate is inverting: AND = NAND + INV.

Example: NAND
PDN: $f_{NAND} = \overline{A B}$
PUN: $f_{NAND} = \overline{A B} = \bar{A} + \bar{B}$
Note that PMOS have negative logic.

Example: NOR
PDN: $f_{NOR} = \overline{A + B + C}$
PUN: $f_{NOR} = \overline{A + B + C} = \bar{A}\,\bar{B}\,\bar{C}$
Note that PMOS have negative logic.

Example: Complex Gate - AOI
PDN: $f_{AOI} = \overline{AB + C}$
PUN: $f_{AOI} = \overline{AB + C} = (\bar{A} + \bar{B})\,\bar{C}$
Common cell in most cell libraries.

Layout: 2-input AND Gate
[Figure: layout of a NAND followed by an inverter, PMOS row at Vdd and NMOS row at GND, inputs A and B.]

Layout: 4-input NAND Gate
[Figure: stick diagram with inputs A, B, C, D. Legend: crossing = transistor, dot = node.]

Complex Gate - OAI
[Figure: OAI schematic with inputs A, B, C, D.]
Common cell in most cell libraries.

Logic Graph
$X = \overline{(A + B) \cdot C}$
[Figure: schematic and logic graph with nodes X, i, j, VDD and GND; edges labeled A, B, C for the PUN and PDN.]

Euler Paths
Euler path: all transistors are visited exactly once in one single path.
[Figure: Euler paths through the PUN and PDN of $f = \overline{(A + B) C}$.]

Layout Example: Complex Gate
[Figure: layout of $f = \overline{(A + B) C}$ with the gate order given by the Euler path.]

Example: $\overline{AB + CD}$
[Figure: schematic and layout of $f = \overline{AB + CD}$.]

Propagation Delay Analysis
The switch model is attractive for manual calculations on complex gates:
$t_p = 0.69\, R_{eq}\, C_L$
[Figure: switch models of INV, NAND and NOR, with each transistor replaced by Req-n or Req-p and the output loaded by CL.]
A fixed value of Req is a reasonable approximation. Which value? An average over the different operating regions.
[Figure: ID versus VDS for a long-channel device (linear, saturated) and a short-channel device (linear, velocity saturated); Req is taken between VDS = VDD/2 and VDS = VDD.]

Equivalent Resistance Req
Req is the average of the resistance at VOUT = VDD and at VOUT = VDD/2:
$R_{eq\text{-}n} = \frac{R_n(V_{OUT}=V_{DD}) + R_n(V_{OUT}=V_{DD}/2)}{2} = \frac{1}{2}\left(\left[\frac{V_{DS}}{I_D}\right]_{V_{OUT}=V_{DD}} + \left[\frac{V_{DS}}{I_D}\right]_{V_{OUT}=V_{DD}/2}\right)$
See also equation 3.43.

Example 3 V
NMOS (VGS = 3 V): 3 V / 600 μA = 5 kΩ and 1.5 V / 500 μA = 3 kΩ, so Req-n = (3 + 5)/2 = 4 kΩ.
PMOS (VGS = -3 V): 3 V / 200 μA = 15 kΩ and 1.5 V / 180 μA ≈ 8.4 kΩ, so Req-p = (15 + 8.4)/2 = 11.7 kΩ.
[Figure: ID [μA] versus VDS for the NMOS (0 to 3 V, 0 to 700 μA) and the PMOS (0 to -3 V, 0 to -250 μA).]

Analysis of Propagation Delay (NAND)
Three cases:
1. Pull-up through one PMOS, worst case: $t_{pLH} = 0.69\, R_p C_L$
2. Pull-up through two PMOS at the same time: $t_{pLH} = 0.69\, (R_p/2)\, C_L$
3. Pull-down through the NMOS stack: $t_{pHL} = 0.69\, (2 R_n)\, C_L$

Example: Propagation Delay (NAND)
Req-n = 4 kΩ, Req-p = 11.7 kΩ, CL = Cdiff + Cgate = 5 fF
Pull-up through one PMOS: $t_{pLH} = 0.69 \times R_p C_L = 0.69 \times 11.7 \times 10^{3} \times 5 \times 10^{-15} = 40$ ps
Pull-down through the NMOS stack: $t_{pHL} = 0.69 \times 2 \times R_n C_L = 0.69 \times 2 \times 4 \times 10^{3} \times 5 \times 10^{-15} = 28$ ps

Example: Compare with Inverter
Inverter pull-down: $t_{pHL} = 0.69 \times R_n C_L = 0.69 \times 4 \times 10^{3} \times 5 \times 10^{-15} = 14$ ps
NAND pull-down: $t_{pHL} = 0.69 \times 2 \times R_n C_L = 0.69 \times 2 \times 4 \times 10^{3} \times 5 \times 10^{-15} = 28$ ps

Dimensioning for Inverter Performance
Scale R with a factor S so that the NAND matches the inverter delay of 14 ps.
$t_{pLH} = 0.69 \times \frac{R_p}{S_p} C_L = 14$ ps gives $S_p = \frac{0.69 \times 11.7 \times 10^{3} \times 5 \times 10^{-15}}{14 \times 10^{-12}} = 2.9$
$t_{pHL} = 0.69 \times 2 \times \frac{R_n}{S_n} C_L = 14$ ps gives $S_n = \frac{0.69 \times 2 \times 4 \times 10^{3} \times 5 \times 10^{-15}}{14 \times 10^{-12}} = 2.0$ (due to the 2 serial NMOS)

Short Channel Transistors: Physical Model (page 202)
$t_{pHL} = t_{pLH}$ with $t_p = 0.69 \times \frac{3}{4}\,\frac{C_L V_{DD}}{I_{DSAT}} = 0.52\,\frac{C_L V_{DD}}{I_{DSAT}}$:
$0.52\,\frac{C_L V_{DD}}{\frac{W}{L} k'_n V_{DSATn}\left(V_{DD} - V_{Tn} - \frac{V_{DSATn}}{2}\right)} = 0.52\,\frac{C_L V_{DD}}{-\frac{S \times W}{L} k'_p V_{DSATp}\left(-V_{DD} - V_{Tp} - \frac{V_{DSATp}}{2}\right)}$
$S = \frac{k'_n V_{DSATn}\left(V_{DD} - V_{Tn} - \frac{V_{DSATn}}{2}\right)}{-k'_p V_{DSATp}\left(-V_{DD} - V_{Tp} - \frac{V_{DSATp}}{2}\right)} = \frac{115 \times 10^{-6} \times 0.63 \times \left(2.5 - 0.43 - \frac{0.63}{2}\right)}{-\left(-30 \times 10^{-6} \times (-1) \times \left(-2.5 - (-0.4) - \frac{-1}{2}\right)\right)} \approx 2.6$

Short Channel Transistors: Approximative Model
Compare with the physical model. For $V_{DD} \gg V_T + V_{DSAT}/2$ the bracket is approximately $V_{DD}$ and cancels against the numerator:
$t_{pHL} = t_{pLH} \approx 0.52\,\frac{C_L}{\frac{W}{L} k'_n V_{DSATn}} = 0.52\,\frac{C_L}{\frac{S \times W}{L} k'_p V_{DSATp}}$
$S = \frac{k'_n \times V_{DSATn}}{k'_p \times V_{DSATp}} = \frac{115 \times 10^{-6} \times 0.63}{-30 \times 10^{-6} \times (-1)} = 2.4$

Power-Delay Product
Energy per switching event:
$PDP = P \times t_p = \frac{C_L V_{DD}^2}{2}$
If we lower the supply, the PDP is reduced, but so is the performance.
Energy and performance:
$EDP = P \times t_p^2 = \frac{C_L V_{DD}^2}{2} \times t_p$
Some claim that EDP is a better measure since it includes the delay.

Short Channel Transistors: Balancing regarding VM
Example using the same data, with VM = VDD/2 = 1.25 V:
$\frac{W_p}{W_n} = \frac{k'_n\left((V_M - V_{Tn})\,V_{DSATn} - \frac{V_{DSATn}^2}{2}\right)}{-k'_p\left((-V_{DD} + V_M - V_{Tp})\,V_{DSATp} - \frac{V_{DSATp}^2}{2}\right)} = \frac{115 \times \left((1.25 - 0.43) \times 0.63 - \frac{0.63^2}{2}\right)}{-(-30) \times \left((-2.5 + 1.25 - (-0.4)) \times (-1) - \frac{(-1)^2}{2}\right)} = 3.5$
To be balanced, the PMOS should be 3.5 times wider than the NMOS.
For the minimal NMOS with Wn = 0.375 μm, the corresponding PMOS has Wp = 1.3 μm.

Scale Factor for Worst Case
Long channel, assuming Wp ≈ 3Wn.
[Figure: relative transistor widths for three gates.
NAND, factor for stacked transistors only: PMOS 1, NMOS 2.
NAND, for stacked transistors and low hole mobility: PMOS 3, NMOS 2.
NOR, for stacked transistors and low hole mobility: PMOS 6, NMOS 1.]
The mobility factor is 3.5 for short channel.

Transistor Sizing (0.35 μm)
For the same driving capability in all gates: factor 3.5.
For symmetrical delay: factor 2.4.
[Figure: sized gate schematics (NAND with inputs A to D, clocked stage with CLK and CLK´) with transistor widths 5.4, 3.6, 1.8, 1.2 and 0.6.]

Influence of Fan-In and Fan-Out
Fan-in, quadratic term (stacked transistors):
1. Resistance increase
2. Capacitance increase
Fan-in, linear term: the overall C increases by 2N.
Fan-out, linear term: two gate capacitances per fan-out.
$t_p = a_1 FI + a_2 FI^2 + a_3 FO$
Assuming Wp ≈ 3Wn and non-scaled transistors.

tp as a Function of Fan-In
[Figure: simulation of an N-input NAND, tp [ns] (0 to 4) versus fan-in (1 to 9), with curves tpHL, tp and tpLH.]
Performance degrades fast with large fan-in.
Fan-in larger than 3 to 4 is usually avoided. Memories are an exception.

Delay Improvement: Complex Gates
Progressive sizing:
[Figure: NAND pull-down chain below the PUN, transistors m1 (input A, closest to GND) to m4 (input D, closest to the output), internal node capacitances C1 to C3 and load CL.]
m4 discharges CL
m3 discharges CL + C3
m2 discharges CL + C3 + C2
m1 discharges CL + C3 + C2 + C1
I.e. increase the transistor size downwards, towards GND. This can reduce the delay by more than 30%.
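The switch-model estimates from the propagation delay examples in this chapter can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `r_eq` and `t_p` are illustrative and do not come from the slides, and the drain currents are the values read from the 3 V example (600 μA and 500 μA for the NMOS, 200 μA and 180 μA for the PMOS).

```python
# Switch-model delay sketch: t_p = 0.69 * R_eq * C_L.
# Helper names are hypothetical; the numbers are taken from the 3 V example.

def r_eq(v_dd, i_at_vdd, i_at_half_vdd):
    """Average of V_DS/I_D at V_OUT = V_DD and at V_OUT = V_DD/2, in ohms."""
    return 0.5 * (v_dd / i_at_vdd + (v_dd / 2) / i_at_half_vdd)

def t_p(r, c_load):
    """First-order RC propagation delay in seconds."""
    return 0.69 * r * c_load

R_N = r_eq(3.0, 600e-6, 500e-6)   # NMOS, about 4 kOhm
R_P = r_eq(3.0, 200e-6, 180e-6)   # PMOS, about 11.7 kOhm
C_L = 5e-15                       # C_diff + C_gate = 5 fF

t_inv_hl = t_p(R_N, C_L)           # inverter pull-down
t_nand_lh_worst = t_p(R_P, C_L)    # NAND, one PMOS pulling up
t_nand_lh_best = t_p(R_P / 2, C_L) # NAND, two PMOS pulling up in parallel
t_nand_hl = t_p(2 * R_N, C_L)      # NAND, two NMOS in series

# Scale factors that bring the NAND delays down to the inverter delay.
s_p = t_nand_lh_worst / t_inv_hl
s_n = t_nand_hl / t_inv_hl

for name, value in [("inverter tpHL", t_inv_hl),
                    ("NAND tpLH (worst)", t_nand_lh_worst),
                    ("NAND tpLH (best)", t_nand_lh_best),
                    ("NAND tpHL", t_nand_hl)]:
    print(f"{name}: {value * 1e12:.1f} ps")
print(f"S_p = {s_p:.1f}, S_n = {s_n:.1f}")
```

The unrounded results (13.8 ps, 40.3 ps and 27.6 ps) round to the 14 ps, 40 ps and 28 ps quoted in the examples, and the scale factors come out at about 2.9 and 2.0.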
Delay Improvement: Complex Gates
Progressive sizing: m1 > m2 > m3 > m4. Cdiff increases.

Delay Improvement: Complex Gates
Transistor ordering:
[Figure: two orderings of the inputs in the NAND pull-down chain below the PUN. Labels: gates in the critical path; C_internal discharged.]
Improvement: up to 15%.

Delay Improvement: Complex Gates
Improved logic design: reduced fan-in.
[Figure: a high fan-in gate replaced by a network of gates with lower fan-in.]

Delay Improvement: Complex Gates
Buffering: isolate fan-in from fan-out.
If CL (fan-out) is large, all the transistors have to be scaled up.
With a buffer, only the buffer transistors have to be scaled up.

Ratioed Logic
[Figure: three loads on top of a PDN: resistive load, NMOS load, and pseudo-NMOS.]
Static logic: number of transistors = 2N
Ratioed logic: number of transistors = N + 1

Ratioed Logic with Resistive Load
Rpull-up ≫ Req-n
$V_{OH} = V_{DD}$
$V_{OL} = \frac{R_{eq\text{-}n}}{R_{eq\text{-}n} + R_{pull\text{-}up}}\, V_{DD}$
Asymmetrical VTC
Static power consumption
$t_{pLH} = 0.69\, R_{pull\text{-}up}\, C_L$ (large)

Pseudo-NMOS
[Figure: VTC, VOUT versus VIN (0 to 5 V), with regions labeled N off/P lin, N sat/P lin, N sat/P sat, N lin/P sat and N lin/P off.]
VOH = VDD; VOL = ?
Assuming that VTn = -VTp:
$I_D = \frac{k_p}{2}(V_{DD} - V_T)^2 = k_n\left(V_{DD} - V_T - \frac{V_{OL}}{2}\right)V_{OL}$
$V_{OL} = (V_{DD} - V_T)\left(1 - \sqrt{1 - \frac{k_p}{k_n}}\right)$

Pseudo-NMOS: Example NOR
[Figure: pseudo-NMOS NOR gate with inputs A, B, C, D.]
Wn = Wp = 0.6 μm; VDD = 3 V; VT = 0.5 V; k´n = 3k´p = 150 μA/V²
$V_{OL} = (V_{DD} - V_T)\left(1 - \sqrt{1 - \frac{k'_p \times W_p / L_{effp}}{k'_n \times W_n / L_{effn}}}\right)$
Leffp = 0.3 μm (minimal), i.e. kn = 3kp:
$V_{OL} = (3 - 0.5)\left(1 - \sqrt{1 - \frac{50 \times 0.6 / 0.3}{150 \times 0.6 / 0.3}}\right) = 0.5$ V
Leffp = 1.5 μm, i.e. kn = 15kp:
$V_{OL} = (3 - 0.5)\left(1 - \sqrt{1 - \frac{50 \times 0.6 / 1.5}{150 \times 0.6 / 0.3}}\right) = 0.1$ V
Small area but static power.

Pseudo-NMOS NAND Gate
Layout example
[Figure: schematic and layout of a pseudo-NMOS NAND gate with inputs A, B, C, D.]

Improved Design
Two dual NMOS nets (PDN and dual PDN).
Fast, VOH = VDD and VOL = GND.

Differential Cascode Voltage Switch Logic (DCVSL)
- Low input load
- No stacked PMOS
- Fast
- Relatively high dynamic power consumption
- Both inverse and non-inverse signals at both input and output
- No static power consumption

Example DCVSL - XOR
Inputs: $A$ and $\bar{A}$, $B$ and $\bar{B}$.
3 transistors per PDN (4 in static CMOS).
[Figure: DCVSL XOR with outputs $f$ and $\bar{f}$.]

Pass-Transistor Logic
N transistors
“No” static power consumption in pass transistors
[Figure: inputs driving a switching network; AND and NAND examples.]

Pass-Transistor Logic
AND/NAND
[Figure: AND/NAND pass-transistor network with inputs $A$, $\bar{A}$, $B$, $\bar{B}$, annotated with logic values for each input combination.]

NMOS Switch Net
A = VDD, B → VDD: VX = VDD - VT
The PMOS of the following inverter is not completely turned off while the NMOS is conducting: static current and static power consumption.
[Figure: NMOS pass transistor driving an inverter through node X; V (V) versus t.]

NMOS Switch Net: Improvement
Level restorer
Advantage: full swing (VX = VDD)
Disadvantage: more complex, larger capacitance
Careful design: k_mr must be small so that node X can still be pulled low.
[Figure: level-restoring transistor mr on node X, shown for B → 1 and B → 0 with A = VDD.]

Transmission Gate
[Figure: transmission gate schematic and symbol; with A = VDD and B → 1, VX = VDD.]

Resistance Req (W/L = 1) in 0.25 μm
VDD (V)      1     1.5   2     2.5
NMOS (kΩ)    35    19    15    13
PMOS (kΩ)    115   55    38    31
See the table on the back cover.

Resistance in a Transmission Gate
Balanced design
[Figure: R (kΩ) versus Vin (V), with curves Req-p, Req-n and Req-total.]

Pass Transistor Based Multiplexer
[Figure: multiplexer with inputs A and B, select S and $\bar{S}$, output fMUX.]
8 transistors in static CMOS (+ one inverter for S)

Transmission Gate XOR
[Figure: transmission gate XOR with inputs A and B, output fXOR, annotated with logic values.]
B = 0: the transmission gate conducts ($f = A$). B = 1: the circuit acts as an “inverter” ($f = \bar{A}$).
8 transistors in static CMOS (+ two inverters), or 10 transistors in a two-stage implementation.

Complementary Pass Transistor Logic (CPL)
[Figure: inputs and inverse inputs driving a switching network and an inverse switching network, giving $f$ and $\bar{f}$; AND/NAND, OR/NOR and XOR/XNOR examples.]

CPL vs. CMOS
+ Fewer transistors
+ Mostly NMOS
- Need level restorer
- More wires

“CPL: The Low Power Family”
+ Lower internal swing
+ No input inverters
+ Smaller stack height
+ Complex functions with a minimum of transistors
⇒ Lower power consumption, smaller transistors

CPL: Fast Level Restoring
Cross coupling gives fast restoring.
Low short-circuit current in inverters.
[Figure: XOR networks with cross-coupled PMOS level restorers on $F$ and $\bar{F}$.]

Chapter 6: Combinational Logic, cont.
Dynamic Combinational Logic

Dynamic Logic
Precharge (φ = 0): Out = 1
Evaluation (φ = 1): A = 1 gives Out = 0; A = 0 leaves Out = 1 on a high-impedance node.
No glitching.
[Figure: static NAND next to a dynamic NAND with clock CLK; dynamic stage with load CL in the precharge and evaluation phases.]

Dynamic Logic
- N + 2 transistors (2N in static CMOS)
- VOH = VDD and VOL = GND: ratioless (VOH is high impedance)
- No static power
- Low input load
- Requires a clock
- No glitching during evaluation
[Figure: dynamic gate with an NMOS net (PDN) and with a PMOS net (PUN), clock φ, load CL.]

Example
[Figure: dynamic gate implementing $f = \overline{AB + C}$.]

Noise Margins
NML = VIL - VOL is small: VIL ≤ VT (VIL ≪ VT due to leakage).
NMH is high but sensitive to capacitive coupling due to the high impedance.
[Figure: VTC of a dynamic gate, Out versus input, with VOL, VIL and VDD marked.]

Dynamic 4-Input NAND Gate
[Figure: schematic and layout with inputs A, B, C, D and clock φ.]

Reliability Problems: Charge Leakage
Subthreshold current discharges the output node.
[Figure: φ and Out versus t during precharge and evaluation (φ = 1, A = 0, Out = 1).]
Minimum clock frequency: > 1 MHz

Charge Sharing
A → 1 while B = 0: if CA is discharged when A turns on during evaluation, CL has to share its charge with CA, i.e. the output voltage drops.
$Q_{tot} = C_L V_{DD} + C_A V_A$
$C_{tot} = C_L + C_A$
$V_{out} = \frac{Q_{tot}}{C_{tot}} = \frac{C_L V_{DD} + C_A V_A}{C_L + C_A}$
For $V_A \ll V_{DD}$: $V_{out} = \frac{C_L}{C_L + C_A}\, V_{DD}$

Charge Sharing - Example
CL = 25 fF, three internal nodes of 3.1 fF each, VDD = 5 V:
$V_{out} = \frac{C_L}{C_L + C_A}\, V_{DD} = \frac{25\ \text{fF}}{25\ \text{fF} + 3 \times 3.1\ \text{fF}} \times 5\ \text{V} = 3.6$ V
Large leakage in the PMOS transistor of the following inverter.
A complex PDN with parallel transistors is even worse.

Example: Large Internal Node
$f = \overline{(A + BC)\,D}$
[Figure: schematic, Euler path and layout (PMOS and NMOS rows, gate order A, B, C, D, φ) with a large capacitive node.]

Charge Sharing - Solutions
Weak pull-up
[Figure: dynamic NAND gates with additional clocked (φ) and weak pull-up transistors.]

Cascading Dynamic Gates
Erroneous discharging of the second stage.
Only 0 → 1 transitions are allowed during evaluation.
[Figure: two cascaded dynamic gates with outputs f1 and f2; waveforms over precharge and evaluation, with the level VTn2 marked.]

Domino Logic
All inputs are set to 0 during precharge.
[Figure: chain of dynamic PDN stages, each followed by a static inverter, all clocked by φ.]

Domino Logic - Characteristics
Only non-inverting logic, which limits the use.
Very fast:
- Few P-channel transistors
- No delay on zero output (at inverter)
- Fast 0 → 1 transition

np-Logic
N-inputs are 0 and P-inputs are 1 during precharge.
- Inverting stages
- Faster than domino (no inverters)
- P-blocks are slower
- Two clock phases
[Figure: alternating PDN and PUN stages clocked by φ and $\bar{\varphi}$, with outputs f1 and f2.]

Differential Dynamic Logic
Inputs: $A$ and $\bar{A}$, $B$ and $\bar{B}$.
[Figure: clocked PDN and dual PDN producing $f$ and $\bar{f}$.]

Alternative to Full Custom Layout
Gate array: sea-of-gates
- Pre-diffused transistors
- Routing channels

Gate Array - Sea-of-Gates
- Pre-diffused transistors
- Metal add-on
[Figure: 4-input NOR (inputs A, B, C, D, output fNOR) mapped onto the pre-diffused transistors between the VDD and GND rails.]
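The charge-sharing relation from the dynamic logic slides, $V_{out} = (C_L V_{DD} + C_A V_A)/(C_L + C_A)$, is easy to check numerically. The sketch below assumes Python; the function name `charge_sharing_vout` is illustrative and not from the slides, and the capacitances are those of the charge-sharing example (CL = 25 fF, three internal nodes of 3.1 fF, VDD = 5 V, internal nodes initially at 0 V).

```python
# Charge-sharing sketch: charge conservation between the precharged load
# capacitance and the internal node capacitance. Names are hypothetical.

def charge_sharing_vout(c_load, c_internal, v_dd, v_internal=0.0):
    """Output voltage after C_L (at V_DD) is connected to C_A (at V_A)."""
    q_tot = c_load * v_dd + c_internal * v_internal  # total charge is conserved
    c_tot = c_load + c_internal                      # capacitances end up in parallel
    return q_tot / c_tot

C_L = 25e-15        # load capacitance, 25 fF
C_A = 3 * 3.1e-15   # three internal nodes of 3.1 fF each
V_DD = 5.0

v_out = charge_sharing_vout(C_L, C_A, V_DD)
print(f"V_out = {v_out:.2f} V")  # prints "V_out = 3.64 V"
```

If the internal nodes were already at VDD (for example `charge_sharing_vout(C_L, C_A, V_DD, v_internal=V_DD)`), the function returns VDD and there is no drop at the output.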