ELEC 5970-003/6970-003 (Fall 2004)
Advanced Topics in Electrical Engineering
Designing VLSI for Low-Power and Self-Test

Power Consumption in a CMOS Circuit

Vishwani D. Agrawal
James J. Danaher Professor
Department of Electrical and Computer Engineering
Auburn University
http://www.eng.auburn.edu/~vagrawal
vagrawal@eng.auburn.edu
9/23-30/04 ELEC 5970-003/6970-003 1

Motivation
• Low power applications
  – Remote systems (e.g., satellite)
  – Portable systems (e.g., mobile phone)
• Methods of low power design
  – Reduced supply voltage
  – Adiabatic switching
  – Clock suppression
  – Logic design for reduced activity
  – Reduce hazards (40% in arithmetic logic)
  – Software techniques
• Reference: Chandrakasan and Brodersen

Low-Power Design
• Design practices that reduce power consumption by at least one order of magnitude; in practice a 50% reduction is often acceptable.
• General topics
  – High-level and software techniques
  – Gate and circuit-level methods
  – Power estimation techniques
  – Test power

VLSI Chip Power Density
[Figure: power density (W/cm²), log scale from 1 to 10000, versus year, 1970 to 2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® and P6, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle and the Sun's surface. Source: Intel]

Specific Topics on Low-Power
• Power dissipation in CMOS circuits
• Low-power CMOS technologies
• Dynamic reduction techniques
• Leakage power
• Power estimation

Components of Power
• Dynamic
  – Signal transitions
    • Logic activity
    • Glitches
  – Short-circuit
• Static
  – Leakage

Power of a Transition
[Figure: CMOS inverter between VDD and ground with input Vi, output Vo, load capacitance CL, transistor resistances R and short-circuit current isc]
Power = $C_L V_{DD}^2/2 + P_{sc}$

Short Circuit Current, isc(t)
[Figure: input Vi(t) and output Vo(t) in volts, with the levels VTn and VDD - VTp marked, and the short-circuit current isc(t) in amps, which flows between times tB and tE and peaks at Iscmaxr (about 45 μA); time axis in ns]

Peak Short Circuit Current
• Increases with the size (or gain, β) of transistors
• Decreases with load capacitance, CL
• Largest when CL = 0
• Reference: M. A. Ortega and J. Figueras, “Short Circuit Power Modeling in Submicron CMOS,” PATMOS’96, Aug. 1996, pp. 147-166.

Short-Circuit Energy per Transition
• $E_{scr} = \int_{t_B}^{t_E} V_{DD}\, i_{sc}(t)\, dt = (t_E - t_B)\, I_{scmaxr} V_{DD}/2$ (triangular approximation of the current pulse)
• For a linear input ramp with rise time $t_r$, the input stays between $V_{Tn}$ and $V_{DD} + V_{Tp}$ for $t_E - t_B = t_r (V_{DD} + V_{Tp} - V_{Tn})/V_{DD}$, where $V_{Tp} < 0$. Hence:
• $E_{scr} = t_r (V_{DD} + V_{Tp} - V_{Tn})\, I_{scmaxr}/2$
• $E_{scf} = t_f (V_{DD} + V_{Tp} - V_{Tn})\, I_{scmaxf}/2$
• $E_{scf} = 0$ when $V_{DD} = |V_{Tp}| + V_{Tn}$

Short-Circuit Energy
• Increases with rise and fall times of input
• Decreases for larger output load capacitance
• Decreases and eventually becomes zero when VDD is scaled down but the threshold voltages are not scaled down

Short-Circuit Power Calculation
• Assume equal rise and fall times
• Model input-output capacitive coupling (Miller capacitance)
• Use a Spice model for transistors
  – T. Sakurai and A. Newton, “Alpha-power Law MOSFET Model and Its Application to a CMOS Inverter,” IEEE J. Solid State Circuits, vol. 25, April 1990, pp. 584-594.

Psc vs. C
[Figure: Psc/Ptotal (0% to 45%) versus C (35 to 75 fF) in 0.7μ CMOS, for input rise times from 0.5 ns to 3 ns]

Technology Scaling
• Scale down by factors of 2 and 4, i.e., model 0.7, 0.35 and 0.17 micron technologies
• Constant electric field assumed
• Capacitance scaled down by the technology scale down factor

Technology Scaling Results
[Figure: Psc/Ptotal (0% to 70%) versus tr (0.4 to 1.6 ns) for L = 0.17μ, C = 10 fF; L = 0.35μ, C = 20 fF; and L = 0.7μ, C = 40 fF]

Effects of Scaling Down
• 1-16% short-circuit power at 0.7 micron
• 4-37% at 0.35 micron
• 12-60% at 0.17 micron
• Reference: S. R. Vemuru and N. Steinberg, “Short Circuit Power Dissipation Estimation for CMOS Logic Gates,” IEEE Trans. on Circuits and Systems I, vol. 41, Nov. 1994, pp. 762-765.
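
The short-circuit energy expressions above can be tried out numerically. The following is a minimal sketch, not part of the lecture: the function names and all numeric values (supply, thresholds, load) are illustrative assumptions, and only the 45 μA peak is borrowed from the isc(t) figure. It evaluates the triangular-pulse estimate and compares it with the capacitive switching energy CL·VDD²/2.

```python
def short_circuit_energy(t_edge, vdd, vtn, vtp, i_peak):
    """Triangular-pulse estimate E_sc = t_edge * (VDD + VTp - VTn) * I_peak / 2.

    vtp is the (negative) pMOS threshold. When VDD <= |VTp| + VTn the two
    transistors never conduct together, so the energy is clamped to zero.
    """
    overlap = vdd + vtp - vtn
    if overlap <= 0:
        return 0.0
    return t_edge * overlap * i_peak / 2.0


def switching_energy(c_load, vdd):
    """Energy C_L * VDD^2 / 2 dissipated by one output transition."""
    return c_load * vdd ** 2 / 2.0


# Hypothetical operating point (assumed values, for illustration only).
e_sc = short_circuit_energy(t_edge=1e-9, vdd=3.3, vtn=0.7, vtp=-0.7, i_peak=45e-6)
e_sw = switching_energy(c_load=40e-15, vdd=3.3)
print("short-circuit share of transition energy:", e_sc / (e_sc + e_sw))

# Scaling VDD down to |VTp| + VTn or below removes the short-circuit term.
print("E_sc at VDD = 1.0 V:", short_circuit_energy(1e-9, 1.0, 0.7, -0.7, 45e-6))
```

The clamp to zero mirrors the slide's observation that short-circuit energy vanishes once VDD is scaled down to the sum of the threshold magnitudes.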

Summary: Short-Circuit Power
• Short-circuit power is consumed by each transition (increases with input transition time).
• Reduction requires that gate output transition should not be slower than the input transition (faster gates can consume more short-circuit power).
• Scaling down of supply voltage with respect to threshold voltages reduces short-circuit power.

Leakage Power
[Figure: nMOS transistor cross-section (n+ source and drain regions) between VDD and ground, showing the leakage currents IG, Isub, IPT, IGIDL and ID]

Leakage Current Components
• Subthreshold conduction, Isub
• Reverse bias pn junction conduction, ID
• Gate induced drain leakage, IGIDL, due to tunneling at the gate-drain overlap
• Drain source punchthrough, IPT, due to short channel and high drain-source voltage
• Gate tunneling, IG, through thin oxide

Subthreshold Current
$I_{sub} = \mu_0 C_{ox} (W/L)\, V_t^2 \exp\{(V_{GS} - V_{TH})/(n V_t)\}$
• $\mu_0$: carrier surface mobility
• $C_{ox}$: gate oxide capacitance per unit area
• L: channel length
• W: gate width
• $V_t = kT/q$: thermal voltage
• n: a technology parameter

IDS for Short Channel Device
$I_{sub} = \mu_0 C_{ox} (W/L)\, V_t^2 \exp\{(V_{GS} - V_{TH} + \eta V_{DS})/(n V_t)\}$
• $V_{DS}$: drain to source voltage
• $\eta$: a proportionality factor

Increased Subthreshold Leakage
[Figure: log Isub versus gate voltage for the original device (threshold VTH) and the scaled device (threshold VTH’), with the current level Ic marked]

Summary: Leakage Power
• Leakage power as a fraction of the total power increases as clock frequency drops. Turning supply off in unused parts can save power.
• For a gate it is a small fraction of the total power; it can be significant for very large circuits.
• Scaling down features requires lowering the threshold voltage, which increases leakage power; leakage roughly doubles with each shrink.
• Multiple-threshold devices are used to reduce leakage power.

Dynamic Power
• Each transition of a gate consumes $C V^2/2$.
• Methods of power saving:
  – Minimize load capacitances
    • Transistor sizing
    • Library-based gate selection
  – Reduce transitions
    • Logic design
    • Glitch reduction

Glitch Power Reduction
• Design a digital circuit for minimum transient energy consumption by eliminating hazards

Theorem 1
• For correct operation with minimum energy consumption, a Boolean gate must produce no more than one event per transition

Theorem 2
• Given that events occur at the input of a gate (inertial delay = d) at times $t_1 < \dots < t_n$, the number of events at the gate output cannot exceed
  $\min\left(n,\ 1 + \dfrac{t_n - t_1}{d}\right)$
[Figure: time axis with input events at t1, t2, t3, ..., tn spanning the interval tn - t1]

Minimum Transient Design
• Minimum transient energy condition for a Boolean gate:
  $|t_i - t_j| < d$
  where $t_i$ and $t_j$ are arrival times of input events and d is the inertial delay of the gate

Balanced Delay Method
• All input events arrive simultaneously
• Overall circuit delay not increased
• Delay buffers may have to be inserted
[Figure: example circuit annotated with gate and buffer delays]

Hazard Filter Method
• Gate delay is made greater than maximum input path delay difference
• No delay buffers needed (least transient energy)
• Overall circuit delay may increase
[Figure: example circuit annotated with gate delays]

Linear Program
• Variables: gate and buffer delays
• Objective: minimize number of buffers
• Subject to: overall circuit delay
• Subject to: minimum transient condition for multi-input gates
• AMPL, MINOS 5.5 (Fourer, Gay and Kernighan)

Variables: Full Adder add1b
[Figure: add1b circuit annotated with gate and buffer delay values]

Objective Function
• Ideal: minimize the number of non-zero delay buffers
• Actual: sum of buffer delays

Specify Critical Path Delay
[Figure: add1b circuit with a critical path marked]
Sum of delays on critical path ≤ maxdel

Multi-Input Gate Condition
[Figure: gate of delay d whose two inputs are reached through path delays d1 and d2]
$d_1 - d_2 \le d$
$d_2 - d_1 \le d$

AMPL Solution: maxdel = 6
[Figure: add1b circuit with the delays found by AMPL]

AMPL Solution: maxdel = 7
[Figure: add1b circuit with the delays found by AMPL]

AMPL Solution: maxdel ≥ 11
[Figure: add1b circuit with the delays found by AMPL]

Power Estimates for add1b

                       Power* with respect to Ref.
                       Ref: model del.     Ref: unit del.
maxdel   No. of buf.   Peak     Ave.       Peak     Ave.
6        2             0.60     0.89       0.60     0.90
7        1             0.56     0.85       0.56     0.86
≥11      0             0.52     0.80       0.52     0.81

* Hsiao et al., ICCAD-97

Power Calculation in Spice
[Figure: a large capacitor C, charged to VDD, supplies the circuit; the connection to VDD is opened at t = 0 and the capacitor voltage V is observed over time t to obtain the energy E(t)]
$E(t) = \tfrac{1}{2} C V_{DD}^2 - \tfrac{1}{2} C V^2 \approx C V_{DD} (V_{DD} - V)$
Ref.: M. Shoji, CMOS Digital Circuit Technology, Prentice Hall, 1988, p. 172.
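
The approximation in the Spice energy expression above follows from factoring a difference of squares. A short worked step, assuming (as the "large C" label suggests) that the capacitor voltage V stays close to VDD during the measurement:

```latex
\begin{align*}
E(t) &= \tfrac{1}{2} C V_{DD}^2 - \tfrac{1}{2} C V^2 \\
     &= \tfrac{1}{2} C \,(V_{DD} - V)(V_{DD} + V) \\
     &\approx C V_{DD} \,(V_{DD} - V) \qquad \text{since } V_{DD} + V \approx 2 V_{DD} \text{ when } V \approx V_{DD}.
\end{align*}
```

Under that assumption the energy drawn by the circuit is proportional to the voltage drop VDD - V on the capacitor, which is why the drop can be read off the simulated waveform and converted directly to energy.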

Power Dissipation of ALU4
• 1 micron CMOS, 57 gates, 14 PI, 8 PO
• 100 random vectors simulated in Spice
[Figure: energy (nJ, 0 to 7) versus time (0 to 2.0 μs) for the original ALU (delay ~ 3.5 ns) and the minimum energy ALU (delay ~ 10 ns)]

F0 Output of ALU4
[Figure: signal amplitude (0 to 5 V) versus time (0 to 160 ns) for the original ALU, delay = 7 units (~ 3.5 ns), and the minimum energy ALU, delay = 21 units (~ 10 ns)]

References
• E. Jacobs and M. Berkelaar, “Using Gate Sizing to Reduce Glitch Power,” Proc. ProRISC/IEEE Workshop on Circuits, Systems and Signal Processing, Nov. 1996, pp. 183-188; also Int. Workshop on Logic Synthesis, May 1997.
• V. D. Agrawal, “Low-Power Design by Hazard Filtering,” Proc. 10th Int. Conf. VLSI Design, Jan. 1997, pp. 193-197.
• V. D. Agrawal, M. L. Bushnell, G. Parthasarathy, and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and a Linear Programming Method,” Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp. 434-439.
• Last two papers are available at website http://www.eng.auburn.edu/~vagrawal

A Limitation
• Constraints are written by path enumeration.
• Since the number of paths in a circuit can be exponential in circuit size, the formulation is infeasible for large circuits.
• Example: c880 has 6.96M constraints.

Timing Window
• Define two timing window variables per gate output:
  – $t_i$: earliest time of signal transition at gate i.
  – $T_i$: latest time of signal transition at gate i.
[Figure: gate i with input windows (t1, T1) ... (tn, Tn) and output window (ti, Ti)]
Ref: T. Raja, Master’s Thesis, Rutgers Univ., 2002

Linear Program
• Gate variables d4 . . . d12
• Buffer variables d15 . . . d29
• Corresponding window variables t4 . . . t29 and T4 . . . T29.
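
One way to read the timing-window variables ti and Ti defined above is as the earliest and latest signal arrival at each gate output for a given set of delays. The sketch below is an illustration under stated assumptions, not the lecture's LP: it assumes the propagation rule "earliest = smallest fanin earliest + gate delay, latest = largest fanin latest + gate delay", treats a window no wider than the gate's own delay as glitch-free, and uses a hypothetical three-gate netlist (`g1`, `g2`, `g3`) with made-up delays.

```python
def timing_windows(netlist, delays, primary_inputs):
    """Propagate (earliest, latest) transition times through a netlist.

    netlist maps each gate name to its list of fanins and must be listed in
    topological order; primary inputs are assumed to switch at time 0.
    """
    earliest = {pi: 0.0 for pi in primary_inputs}
    latest = {pi: 0.0 for pi in primary_inputs}
    for gate, fanins in netlist.items():
        earliest[gate] = min(earliest[f] for f in fanins) + delays[gate]
        latest[gate] = max(latest[f] for f in fanins) + delays[gate]
    return earliest, latest


def glitch_prone_gates(netlist, delays, earliest, latest):
    """Gates whose output window is wider than their own (inertial) delay."""
    return [g for g in netlist if latest[g] - earliest[g] > delays[g]]


# Hypothetical circuit: g1 = f(a, b), g2 = f(g1, c), g3 = f(g2, a).
netlist = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g2", "a"]}
inputs = ["a", "b", "c"]

unit = {"g1": 1.0, "g2": 1.0, "g3": 1.0}
t, T = timing_windows(netlist, unit, inputs)
print("unit delays, glitch-prone:", glitch_prone_gates(netlist, unit, t, T))

# Hazard filtering: slow g3 down so its delay covers its input spread.
filtered = {"g1": 1.0, "g2": 1.0, "g3": 2.0}
t, T = timing_windows(netlist, filtered, inputs)
print("g3 delay raised, glitch-prone:", glitch_prone_gates(netlist, filtered, t, T))
```

Each gate contributes a constant number of window values regardless of how many paths reach it, which is the property that keeps the window-based formulation small.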

Multiple-Input Gate Constraints
For Gate 7 (inputs from gates 5 and 6):
$T_7 > T_5 + d_7$;  $T_7 > T_6 + d_7$;
$t_7 < t_5 + d_7$;  $t_7 < t_6 + d_7$;
$d_7 > T_7 - t_7$;

Single-Input Gate Constraints
Buffer 19:
$T_{16} + d_{19} = T_{19}$;
$t_{16} + d_{19} = t_{19}$;

Overall Delay Constraints
$T_{11} <$ maxdelay
$T_{12} <$ maxdelay

Advantage of Timing Window
• Path constraints (exponential in n): $2 \times 2 \times \dots \times 2 = 2^n$ paths between I/O pair
• A single variable specifies I/O delay. Total variables, O(n).
• LP constraint set is linear in the size of circuit.

Comparison of Constraints
[Figure: number of constraints versus number of gates in circuit]

Results: 1-Bit Adder
[Figure: 1-bit adder circuit]

Estimation of Power
• Circuit is simulated by an event-driven simulator for both optimized and unoptimized gate delays.
• All transitions at a gate are counted as Events[gate].
• Power consumed by a gate is estimated as Events[gate] × number of fanouts.
• Ref: “Effects of delay model on peak power estimation of VLSI circuits,” Hsiao, et al. (ICCAD’97).

Original 1-Bit Adder
[Figure: original 1-bit adder with color codes for number of transitions]

Optimized 1-Bit Adder
[Figure: optimized 1-bit adder with color codes for number of transitions]

Results: 1-Bit Adder
Simulated over all possible vector transitions
• Average power = optimized/unit delay = 244 / 308 = 0.792
• Peak power = optimized/unit delay = 6 / 10 = 0.60
Power savings: Peak = 40%, Average = 21%

Results: 4-Bit ALU

maxdelay   Buffers inserted
7          5
10         2
12         1
15         0

Power savings: Peak = 33%, Average = 21%

Benchmark Circuits

                                             Normalized Power
Circuit   Maxdel. (gates)   No. of Buffers   Average   Peak
C432      17                95               0.72      0.67
          34                66               0.62      0.60
C880      24                62               0.68      0.54
          48                34               0.68      0.52
C6288     47                294              0.40      0.36
          94                120              0.36      0.34
c7552     43                366              0.38      0.34
          86                111              0.36      0.32

Physical Design
[Figure: gates, each with its own l/w sizing]
• Gate delay modeled as a linear function of gate size, total load capacitance, and fanout gate sizes (Berkelaar and Jacobs, 1996).
• Lay out the circuit with some nominal gate sizes.
• Enter extracted routing delays in the LP as constants and solve for gate delays.
• Change gate sizes as determined from a linear system of equations.
• Iterate if routing delays change.

Power Dissipation of ALU4
[Figure: power dissipation of ALU4]

References
• R. Fourer, D. M. Gay and B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, South San Francisco: The Scientific Press, 1993.
• M. Berkelaar and E. Jacobs, “Using Gate Sizing to Reduce Glitch Power,” Proc. ProRISC Workshop, Mierlo, The Netherlands, Nov. 1996, pp. 183-188.
• V. D. Agrawal, “Low Power Design by Hazard Filtering,” Proc. 10th Int’l Conf. VLSI Design, Jan. 1997, pp. 193-197.
• V. D. Agrawal, M. L. Bushnell, G. Parthasarathy and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and Linear Programming Method,” Proc. 12th Int’l Conf. VLSI Design, Jan. 1999, pp. 434-439.
• M. Hsiao, E. M. Rudnick and J. H. Patel, “Effects of Delay Model in Peak Power Estimation of VLSI Circuits,” Proc. ICCAD, Nov. 1997, pp. 45-51.
• T. Raja, A Reduced Constraint Set Linear Program for Low Power Design of Digital Circuits, Master’s Thesis, Rutgers Univ., New Jersey, 2002.

Conclusion
• Glitch-free design through LP: constraint-set is linear in the size of the circuit.
• LP solution:
  – Eliminates glitches at all gate outputs,
  – Holds I/O delay within specification, and
  – Combines path-balancing and hazard-filtering to minimize the number of delay buffers.
• The linear constraint set LP produces results exactly identical to the LP requiring an exponential constraint-set.
• Results show peak power savings up to 68% and average power savings up to 64%.
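
The normalized power figures and savings quoted in the results rest on the event-count metric (Events[gate] × number of fanouts). The following is a minimal sketch of that bookkeeping; the function names and the two-gate event counts are hypothetical, while the ratios 244/308 and 6/10 and the normalized values 0.32 and 0.36 are taken from the slides.

```python
def activity_power(events, fanouts):
    """Sum Events[gate] * fanout count over all gates (relative units)."""
    return sum(events[gate] * fanouts[gate] for gate in events)


def savings_percent(normalized_power):
    """Convert optimized/reference power into percent savings, rounded."""
    return round((1.0 - normalized_power) * 100)


# Hypothetical event counts and fanouts for a two-gate circuit.
events = {"g1": 3, "g2": 2}
fanouts = {"g1": 2, "g2": 1}
print("relative power:", activity_power(events, fanouts))

# 1-bit adder ratios reported in the slides.
print("average savings (%):", savings_percent(244 / 308))
print("peak savings (%):", savings_percent(6 / 10))

# Best benchmark entries: normalized peak 0.32 and normalized average 0.36.
print("best peak savings (%):", savings_percent(0.32))
print("best average savings (%):", savings_percent(0.36))
```

Because only ratios of the metric are reported, the unknown proportionality constant between event counts and watts cancels out.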