ELEC 5770-001/6770-001 Fall 2010 VLSI Design
Low Power VLSI Design

Vishwani D. Agrawal
James J. Danaher Professor
Dept. of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
vagrawal@eng.auburn.edu
http://www.eng.auburn.edu/~vagrawal/COURSE/E6770_Fall10/VLSID_Fall2010_LowPower.ppt

Fall 2010, Nov 16, ELEC5770-001/6770-001 Guest Lecture

Power Consumption of VLSI Chips
Why is it a concern?

ISSCC, Feb. 2001, Keynote
“Ten years from now, microprocessors will run at 10GHz to 30GHz and be capable of processing 1 trillion operations per second – about the same number of calculations that the world's fastest supercomputer can perform now.
“Unfortunately, if nothing changes these chips will produce as much heat, for their proportional size, as a nuclear reactor. . . .”
Patrick P. Gelsinger, Senior Vice President, General Manager, Digital Enterprise Group, Intel Corp.

VLSI Chip Power Density
[Figure: power density (W/cm2, log scale from 1 to 10000) versus year (1970 to 2010) for Intel processors 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium and P6, with reference levels marked for a hot plate, a nuclear reactor, a rocket nozzle and the Sun's surface. Source: Intel]

Low-Power Design
- Design practices that reduce power consumption by at least one order of magnitude; in practice a 50% reduction is often acceptable.
- Low-power design methods:
  - Algorithms and architectures
  - High-level and software techniques
  - Gate and circuit-level methods
  - Test power

Specific Topics in Low-Power
- Power dissipation in CMOS circuits
- Transistor-level methods
- Circuit and gate level methods
- Logic synthesis
- Dynamic power reduction techniques
- Leakage power reduction
- System level methods
- Low-power CMOS technologies
- Energy recovery methods
- Ultra low power logic (subthreshold VDD)
- Microprocessors
- Arithmetic circuits
- Low power memory technology
- Test power
- Power estimation

CMOS Logic (Inverter)
[Figure: CMOS inverter connected between VDD and GND]
No current flows from power supply! Where is power consumed?
F. M. Wanlass and C. T. Sah, “Nanowatt Logic using Field-Effect Metal-Oxide-Semiconductor Triodes,” IEEE International Solid-State Circuits Conference Digest, vol. IV, February 1963, pp. 32-33.

Components of Power
- Dynamic, when output changes
  - Signal transitions (major component)
    - Logic activity
    - Glitches
  - Short-circuit (small)
- Static, when signal is in steady state
  - Leakage (used to be small)

$$P_{total} = P_{dyn} + P_{stat} = P_{tran} + P_{sc} + P_{stat}$$

Power of a Transition: Ptran
[Figure: RC model of a gate output transition: supply V, on-resistance R = Ron carrying current i(t), a large resistance, load capacitance C to ground with voltage v(t), and gate input vi(t)]
C = total load capacitance for the gate; it includes the transistor capacitances of the driving gate, the routing capacitance and the transistor capacitances of the driven gates, and is obtained by layout analysis.
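The RC model above can also be explored numerically. The following is a minimal sketch, not part of the original slides, that steps the circuit law i = (V - v)/R forward in time with a simple Euler update and accumulates the energy drawn from the supply, the energy dissipated in the resistance and the energy left on the capacitor. The function name and the component values (1 V, 1 kΩ, 1 pF) are illustrative assumptions.

```python
def charging_energies(v_supply=1.0, r_on=1.0e3, c_load=1.0e-12,
                      steps=20_000, horizon_in_tau=20.0):
    """Euler simulation of a capacitor charging from 0 V through a resistor.

    All default values are hypothetical and chosen only for illustration.
    Returns (energy from supply, energy dissipated in R, energy stored in C)
    in joules.
    """
    tau = r_on * c_load                      # RC time constant
    dt = horizon_in_tau * tau / steps        # time step
    v_cap = 0.0                              # capacitor starts discharged
    e_supply = 0.0
    e_resistor = 0.0
    for _ in range(steps):
        i = (v_supply - v_cap) / r_on        # current through R
        e_supply += v_supply * i * dt        # energy delivered by the supply
        e_resistor += r_on * i * i * dt      # energy dissipated in R
        v_cap += i * dt / c_load             # dv = i dt / C
    e_capacitor = 0.5 * c_load * v_cap ** 2  # energy stored at the end
    return e_supply, e_resistor, e_capacitor


if __name__ == "__main__":
    e_sup, e_res, e_cap = charging_energies()
    cv2 = 1.0e-12 * 1.0 ** 2
    print(f"supply    / CV^2 = {e_sup / cv2:.4f}")
    print(f"resistor  / CV^2 = {e_res / cv2:.4f}")
    print(f"capacitor / CV^2 = {e_cap / cv2:.4f}")
```

With a small enough time step, the supply energy should come out close to CV^2, split roughly equally between the resistance and the capacitor, and the split should not depend on the value chosen for R.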
Charging of a Capacitor
[Figure: series RC circuit; a switch closes at t = 0 and supply V charges capacitor C through resistor R with current i(t); v(t) is the capacitor voltage]

Charge on capacitor: $q(t) = C\,v(t)$
Current: $i(t) = \dfrac{dq(t)}{dt} = C\,\dfrac{dv(t)}{dt}$

$$i(t) = C\,\frac{dv(t)}{dt} = \frac{V - v(t)}{R}$$

$$\int \frac{dv(t)}{V - v(t)} = \int \frac{dt}{RC}$$

$$\ln[V - v(t)] = -\frac{t}{RC} + A$$

Initial condition: at $t = 0$, $v(t) = 0$, so $A = \ln V$. Therefore

$$v(t) = V\left[1 - \exp\left(-\frac{t}{RC}\right)\right]$$

$$i(t) = C\,\frac{dv(t)}{dt} = \frac{V}{R}\exp\left(-\frac{t}{RC}\right)$$

Total Energy Per Charging Transition from Power Supply

$$E_{trans} = \int_0^\infty V\,i(t)\,dt = \int_0^\infty \frac{V^2}{R}\exp\left(-\frac{t}{RC}\right)dt = CV^2$$

Energy Dissipated Per Transition in Resistance

$$R\int_0^\infty i^2(t)\,dt = R\,\frac{V^2}{R^2}\int_0^\infty \exp\left(-\frac{2t}{RC}\right)dt = \frac{1}{2}CV^2$$

Energy Stored in Charged Capacitor

$$\int_0^\infty v(t)\,i(t)\,dt = \int_0^\infty V\left[1 - \exp\left(-\frac{t}{RC}\right)\right]\frac{V}{R}\exp\left(-\frac{t}{RC}\right)dt = \frac{1}{2}CV^2$$

Transition Power
- Gate output rising transition
  - Energy dissipated in pMOS transistor = $CV^2/2$
  - Energy stored in capacitor = $CV^2/2$
- Gate output falling transition
  - Energy dissipated in nMOS transistor = $CV^2/2$
- Energy dissipated per transition = $CV^2/2$
- Power dissipation: $P_{tran} = E_{trans}\,\alpha f_{ck} = \alpha f_{ck}\,CV^2/2$, where $\alpha$ = activity factor and $f_{ck}$ = clock frequency. In this formula $E_{trans}$ denotes the $CV^2/2$ dissipated per transition; the supply delivers $CV^2$ on a rising transition and nothing on a falling one.

Components of Power
- Dynamic
  - Signal transitions
    - Logic activity
    - Glitches
  - Short-circuit
- Static
  - Leakage

$$P_{total} = P_{dyn} + P_{stat} = P_{tran} + P_{sc} + P_{stat}$$

Short Circuit Power of a Transition: Psc
[Figure: CMOS inverter between VDD and ground with input vi(t), output vo(t), load capacitance CL and short-circuit current isc(t)]

Short-Circuit Power
- Increases with rise and fall times of input.
- Decreases for larger output load capacitance; a large capacitor takes most of the current.
- Small, about 5-10% of the dynamic power dissipated in charging and discharging of the output capacitance.
- Becomes zero when $V_{DD} \le V_{thn} + |V_{thp}|$.

Components of Power
- Dynamic
  - Signal transitions
    - Logic activity
    - Glitches
  - Short-circuit
- Static
  - Leakage

Static (Leakage) Power
- Leakage power as a fraction of the total power increases as clock frequency drops. Turning the supply off in unused parts can save power.
- For a gate it is a small fraction of the total power; it can be significant for very large circuits.
- Static power increases as feature size is scaled down; controlling leakage is an important aspect of transistor design and semiconductor process technology.

CMOS Gate Power
[Figure: RC gate model (supply V, R = Ron, large resistance, C, ground) with waveforms versus time of the input vi(t), output v(t), charging current i(t), short-circuit current isc(t) and leakage current]

Some Examples

Energy Saving by Voltage Reduction

| Battery size (AHr) | VDD = 0.9V, 500MHz: Efficiency (%) | VDD = 0.9V, 500MHz: Battery lifetime (x10^3 seconds) | VDD = 0.9V, 500MHz: Battery lifetime (x10^11 cycles) | VDD = 0.3V, 5MHz: Efficiency (%) | VDD = 0.3V, 5MHz: Battery lifetime (x10^3 seconds) | VDD = 0.3V, 5MHz: Battery lifetime (x10^11 cycles) |
|---|---|---|---|---|---|---|
| 1.2 | 93 | 1.263 | 7.03 | 100+ | 1234 | 48.60 |
| 3.6 | 103 | 4.198 | 22.80 | 100+ | 3894 | 150.30 |

Slide annotation: "seven-times" (the cycle counts at 0.3V are about seven times those at 0.9V).
70 million gate circuit, 45nm CMOS bulk PTM. Lithium-ion battery.
Ref.: M. Kulkarni and V. D. Agrawal, “A Tutorial on Battery Simulation – Matching Power Source to Electronic System,” Proc. VLSI Design and Test Symp., July 2010.
State Encoding for a Counter
- Two-bit binary counter:
  - State sequence, 00 → 01 → 10 → 11 → 00
  - Six bit transitions in four clock cycles
  - 6/4 = 1.5 transitions per clock
- Two-bit Gray-code counter:
  - State sequence, 00 → 01 → 11 → 10 → 00
  - Four bit transitions in four clock cycles
  - 4/4 = 1.0 transition per clock
- Gray-code counter is more power efficient.
G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Kluwer Academic Publishers (now Springer), 1998.

Binary Counter: Original Encoding

| Present state a b | Next state A B |
|---|---|
| 0 0 | 0 1 |
| 0 1 | 1 0 |
| 1 0 | 1 1 |
| 1 1 | 0 0 |

$A = \bar{a}b + a\bar{b}$
$B = \bar{a}\bar{b} + a\bar{b}$
[Figure: two flip-flops holding a and b, with next-state inputs A and B, clock CK and clear CLR]

Binary Counter: Gray Encoding

| Present state a b | Next state A B |
|---|---|
| 0 0 | 0 1 |
| 0 1 | 1 1 |
| 1 0 | 0 0 |
| 1 1 | 1 0 |

$A = \bar{a}b + ab$
$B = \bar{a}\bar{b} + \bar{a}b$
[Figure: two flip-flops holding a and b, with next-state inputs A and B, clock CK and clear CLR]

Three-Bit Counters

| Binary state | No. of toggles | Gray-code state | No. of toggles |
|---|---|---|---|
| 000 | - | 000 | - |
| 001 | 1 | 001 | 1 |
| 010 | 2 | 011 | 1 |
| 011 | 1 | 010 | 1 |
| 100 | 3 | 110 | 1 |
| 101 | 1 | 111 | 1 |
| 110 | 2 | 101 | 1 |
| 111 | 1 | 100 | 1 |
| 000 | 3 | 000 | 1 |

Av. transitions/clock = 1.75 (binary), 1 (Gray-code)

N-Bit Counter: Toggles in Counting Cycle
- Binary counter: $T(binary) = 2(2^N - 1)$
- Gray-code counter: $T(gray) = 2^N$
- $T(gray)/T(binary) = 2^{N-1}/(2^N - 1) \to 0.5$

| Bits | T(binary) | T(gray) | T(gray)/T(binary) |
|---|---|---|---|
| 1 | 2 | 2 | 1.0 |
| 2 | 6 | 4 | 0.6667 |
| 3 | 14 | 8 | 0.5714 |
| 4 | 30 | 16 | 0.5333 |
| 5 | 62 | 32 | 0.5161 |
| 6 | 126 | 64 | 0.5079 |
| ∞ | - | - | 0.5000 |

FSM State Encoding
[Figure: the same three-state transition graph under two state encodings, (11, 00, 01) and (01, 00, 11), with edges labelled by transition probabilities based on PI statistics]
Expected number of state-bit transitions:
- First encoding: 2(0.3+0.4) + 1(0.1+0.1) = 1.6
- Second encoding: 1(0.3+0.4+0.1) + 2(0.1) = 1.0
State encoding can be selected using a power-based cost function.
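The toggle counts for the counters and the expected state-bit transitions for the two FSM encodings above can be reproduced with a short script. This is a minimal sketch rather than anything from the original lecture: the function names are illustrative, and the state names P, Q, S and the edge directions in `EDGE_PROBS` are assumptions made only so that the slide's arithmetic can be written as data (the Hamming-distance cost does not depend on edge direction).

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two state codes differ."""
    return bin(a ^ b).count("1")


def cycle_toggles(states):
    """Total bit toggles over one full counting cycle, including wrap-around."""
    n = len(states)
    return sum(hamming(states[k], states[(k + 1) % n]) for k in range(n))


def binary_states(n_bits: int):
    """State sequence of an n-bit binary counter."""
    return list(range(2 ** n_bits))


def gray_states(n_bits: int):
    """State sequence of an n-bit reflected Gray-code counter."""
    return [k ^ (k >> 1) for k in range(2 ** n_bits)]


def expected_bit_transitions(edge_probs, encoding):
    """Power-based cost: sum of edge probability times Hamming distance."""
    return sum(p * hamming(encoding[s], encoding[t])
               for (s, t), p in edge_probs.items())


# Hypothetical labelling of the three-state FSM; only non-self-loop edges
# matter, because self-loops toggle no state bits.
EDGE_PROBS = {("P", "Q"): 0.3, ("Q", "P"): 0.4, ("P", "S"): 0.1, ("S", "Q"): 0.1}
ENCODING_1 = {"P": 0b11, "Q": 0b00, "S": 0b01}
ENCODING_2 = {"P": 0b01, "Q": 0b00, "S": 0b11}

if __name__ == "__main__":
    for n_bits in range(1, 7):
        t_bin = cycle_toggles(binary_states(n_bits))
        t_gray = cycle_toggles(gray_states(n_bits))
        print(n_bits, t_bin, t_gray, round(t_gray / t_bin, 4))
    print(expected_bit_transitions(EDGE_PROBS, ENCODING_1))
    print(expected_bit_transitions(EDGE_PROBS, ENCODING_2))
```

The same `expected_bit_transitions` function could serve as the cost in a search over candidate encodings, although the slides do not prescribe a particular search method.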
FSM: Clock-Gating
- Moore machine: Outputs depend only on the state variables.
- If a state has a self-loop in the state transition graph (STG), then the clock can be stopped whenever a self-loop is to be executed.
[Figure: STG fragment with states Si, Sj and Sk and edge labels Xi/Zk, Xj/Zk and Xk/Zk]
Clock can be stopped when the (Xk, Sk) combination occurs.

Clock-Gating in Moore FSM
[Figure: block diagram with primary inputs (PI), combinational logic, flip-flops, primary outputs (PO), clock activation logic, a latch and the clock CK]
L. Benini and G. De Micheli, Dynamic Power Management, Boston: Springer, 1998.

Bus Encoding for Reduced Power
- Example: Four bit bus
  - 0000 → 1110 has three transitions.
  - If bits of second pattern are inverted, then 0000 → 0001 will have only one transition.
- Bit-inversion encoding for N-bit bus:
[Figure: number of bit transitions after inversion encoding versus number of bit transitions, both axes marked 0, N/2 and N]

Bus-Inversion Encoding Logic
[Figure: sent data, polarity decision logic, bus register, polarity bit and received data]
M. Stan and W. Burleson, “Bus-Invert Coding for Low Power I/O,” IEEE Trans. VLSI Systems, vol. 3, no. 1, pp. 49-58, March 1995.

Clock-Gating in Low-Power Flip-Flop
[Figure: flip-flop schematic with labels D, Q and CK]

S5378 with Gated-Clock FF
- 2958 gates, 179 flip-flops
- TSMC025 CMOS
- 1,000 random vectors, clock period 50ns
- Simulation by Powersim*

Power (microwatts)

| Flip-flops used | Combinational logic: Transitions | Combinational logic: Short-circuit | Combinational logic: Static (leakage) | Clock | Flip-flops | Total |
|---|---|---|---|---|---|---|
| Normal | 95.4 | 14.1 | 0.13 | 220.3 | 751.6 | 1,081.5 |
| Gated | 133.5 | 23.1 | 0.13 | 118.9 | 32.5 | 308.0 |

* J. D. Alexander, “Simulation Based Power Estimation for Digital CMOS Technologies,” Master’s Thesis, Auburn University, Dec. 2008.

Books on Low-Power Design (1)
- L. Benini and G.
  De Micheli, Dynamic Power Management: Design Techniques and CAD Tools, Boston: Springer, 1998.
- T. D. Burd and R. A. Brodersen, Energy Efficient Microprocessor Design, Boston: Springer, 2002.
- A. Chandrakasan and R. Brodersen, Low-Power Digital CMOS Design, Boston: Springer, 1995.
- A. Chandrakasan and R. Brodersen, Low-Power CMOS Design, New York: IEEE Press, 1998.
- J.-M. Chang and M. Pedram, Power Optimization and Synthesis at Behavioral and System Levels using Formal Methods, Boston: Springer, 1999.
- M. S. Elrabaa, I. S. Abu-Khater and M. I. Elmasry, Advanced Low-Power Digital Circuit Techniques, Boston: Springer, 1997.
- R. Graybill and R. Melhem, Power Aware Computing, New York: Plenum Publishers, 2002.
- S. Iman and M. Pedram, Logic Synthesis for Low Power VLSI Designs, Boston: Springer, 1998.
- J. B. Kuo and J.-H. Lou, Low-Voltage CMOS VLSI Circuits, New York: Wiley-Interscience, 1999.
- J. Monteiro and S. Devadas, Computer-Aided Design Techniques for Low Power Sequential Logic Circuits, Boston: Springer, 1997.
- S. G. Narendra and A. Chandrakasan, Leakage in Nanometer CMOS Technologies, Boston: Springer, 2005.
- W. Nebel and J. Mermet, Low Power Design in Deep Submicron Electronics, Boston: Springer, 1997.

Books on Low-Power Design (2)
- N. Nicolici and B. M. Al-Hashimi, Power-Constrained Testing of VLSI Circuits, Boston: Springer, 2003.
- V. G. Oklobdzija, V. M. Stojanovic, D. M. Markovic and N. Nedovic, Digital System Clocking: High Performance and Low-Power Aspects, Wiley-IEEE, 2005.
- M. Pedram and J. M. Rabaey, Power Aware Design Methodologies, Boston: Springer, 2002.
- C. Piguet, Low-Power Electronics Design, Boca Raton, Florida: CRC Press, 2005.
- J. M. Rabaey and M. Pedram, Low Power Design Methodologies, Boston: Springer, 1996.
- S. Roundy, P. K. Wright and J. M. Rabaey, Energy Scavenging for Wireless Sensor Networks, Boston: Springer, 2003.
- K. Roy and S. C.
  Prasad, Low-Power CMOS VLSI Circuit Design, New York: Wiley-Interscience, 2000.
- E. Sánchez-Sinencio and A. G. Andreou, Low-Voltage/Low-Power Integrated Circuits and Systems: Low-Voltage Mixed-Signal Circuits, New York: IEEE Press, 1999.
- W. A. Serdijn, Low-Voltage Low-Power Analog Integrated Circuits, Boston: Springer, 1995.
- S. Sheng and R. W. Brodersen, Low-Power Wireless Communications: A Wideband CDMA System Design, Boston: Springer, 1998.
- V. George and J. M. Rabaey, Low-Energy FPGAs, Boston: Springer, 2001.
- G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Springer, 1998.
- K.-S. Yeo and K. Roy, Low-Voltage Low-Power Subsystems, McGraw Hill, 2004.

Books Useful in Low-Power Design
- A. Chandrakasan, W. J. Bowhill and F. Fox, Design of High-Performance Microprocessor Circuits, New York: IEEE Press, 2001.
- R. C. Jaeger and T. N. Blalock, Microelectronic Circuit Design, Third Edition, McGraw-Hill, 2006.
- S. M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits, New York: McGraw-Hill, 1996.
- E. Larsson, Introduction to Advanced System-on-Chip Test Design and Optimization, Springer, 2005.
- J. M. Rabaey, A. Chandrakasan and B. Nikolić, Digital Integrated Circuits, Second Edition, Upper Saddle River, New Jersey: Prentice-Hall, 2003.
- J. Segura and C. F. Hawkins, CMOS Electronics, How It Works, How It Fails, New York: IEEE Press, 2004.
- N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Reading, Massachusetts: Addison-Wesley, 2005.

Problem: Bus Encoding
A 1-hot encoding is to be used for reducing the capacitive power consumption of an n-bit data bus. All n bits are assumed to be independent and random. Derive a formula for the ratio of power consumptions on the encoded and the un-coded buses. Show that n ≥ 4 is essential for the 1-hot encoding to be beneficial.
Reference: A. P. Chandrakasan and R. W.
Brodersen, Low Power Digital CMOS Design, Boston: Kluwer Academic Publishers, 1995, pp. 224-225. [Hint: You should be able to solve this problem without the help of the reference.]

Solution: Bus Encoding
Un-coded bus: Two consecutive bits on a wire can be 00, 01, 10 or 11, each with probability 0.25. Considering only the 0 → 1 transition, which draws energy from the supply, the probability of a data pattern consuming $CV^2$ energy on a wire is 1/4. Therefore, the average per-pattern energy for all n wires of the bus is $CV^2 n/4$.

Encoded bus: The encoded bus contains $2^n$ wires. The 1-hot encoding ensures that whenever there is a change in the data pattern, exactly one wire will have a 0 → 1 transition, charging its capacitance and consuming $CV^2$ energy. There are $2^n$ possible data patterns and exactly one of these will match the previous pattern and consume no energy. Thus, the per-pattern energy consumption of the bus is 0 with probability $2^{-n}$, and $CV^2$ with probability $1 - 2^{-n}$. The average per-pattern energy for the 1-hot encoded bus is $CV^2(1 - 2^{-n})$.

Solution: Bus Encoding (Cont.)

$$\text{Power ratio} = \frac{\text{Encoded bus power}}{\text{Un-coded bus power}} = \frac{4(1 - 2^{-n})}{n} \to \frac{4}{n} \text{ for large } n$$

For the encoding to be beneficial, the above power ratio should be less than 1. That is, $4(1 - 2^{-n})/n \le 1$, or $1 - 2^{-n} \le n/4$, or $n/4 \ge 1$ (approximately), which gives $n \ge 4$.

The following table shows the 1-hot encoded bus power ratio as a function of bus width:

| n | 4(1 – 2^-n)/n |
|---|---|
| 1 | 2.0000 |
| 2 | 1.5000 |
| 3 | 1.1667 |
| 4 | 0.9375 |
| 8 | 0.4980 |
| 16 | 0.2500 ≈ 1/4 |
| 32 | ≈ 1/8 |
| 64 | ≈ 1/16 |
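The power ratio derived above is easy to tabulate exactly. The sketch below is an illustration, not part of the original solution; the function names are assumptions. It evaluates 4(1 - 2^-n)/n with exact rational arithmetic and searches for the smallest bus width at which the ratio drops below 1.

```python
from fractions import Fraction


def one_hot_power_ratio(n: int) -> Fraction:
    """Encoded-to-un-coded bus power ratio, 4(1 - 2^-n)/n, as an exact fraction."""
    return Fraction(4) * (1 - Fraction(1, 2 ** n)) / n


def smallest_beneficial_width(limit: int = 64) -> int:
    """Smallest n (up to limit) for which the 1-hot ratio is below 1."""
    for n in range(1, limit + 1):
        if one_hot_power_ratio(n) < 1:
            return n
    raise ValueError("no beneficial width found up to the given limit")


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 8, 16, 32, 64):
        print(n, f"{float(one_hot_power_ratio(n)):.4f}")
    print("smallest beneficial n:", smallest_beneficial_width())
```

Using `Fraction` avoids rounding questions near the crossover between n = 3 and n = 4; the model still counts only the capacitive switching energy of the bus wires, as in the solution, and ignores the cost of the encoder, the decoder and the 2^n wires themselves.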