Optimizing Power @ Standby
Circuits and Systems
Jan M. Rabaey
Low Power Design Essentials ©2008, Chapter 8

Chapter Outline
Why Sleep Mode Management?
Dynamic power in standby
  – Clock gating
Static power in standby
  – Transistor sizing
  – Power gating
  – Body biasing
  – Supply voltage ramping

Arguments for Sleep Mode Management
Many computational applications operate in burst modes, interchanging active and non-active modes
  – General-purpose computers, cell phones, interfaces, embedded processors, consumer applications, …
Prime concept: Power dissipation in standby should be an absolute minimum, if not zero
Sleep mode management has gained importance with increasing leakage
[Figure labels: Design Time, Run Time, Clock gating, Leakage elimination]

Standby Power - Was Not A Concern In Earlier Days
Pentium-1: 15 Watt (5V - 66MHz)
Pentium-2: 8 Watt (3.3V - 133 MHz)
Processor in idle mode!
Floating Point Unit and Cache powered down when not in use
[Source: Intel]

Dynamic Power - Clock Gating
Turn off clocks to idle modules
  – Ensure that spurious activity is set to zero
Must ensure that data inputs to module are in stable mode
  – Primary inputs are from gated latches or registers
  – Or, disconnected from interconnect network
Can be done at different levels of system hierarchy

Clock Gating
Turning off the clock for non-active components
Disconnecting the inputs
[Figures: Clk, Enable, Register File, Bus, Logic Module]

Clock-gating Efficiently Reduces Power
MPEG4 decoder
  – Without clock gating: 30.6 mW
  – With clock gating: 8.5 mW
[Figure: bar chart of power [mW] per block (VDE, DEU, MIF, DSP/HIF); also labeled: 896Kb SRAM]
90% of F/F’s clock-gated.
70% power reduction by clock gating alone.
© IEEE 2002
[Ref: M.
Ohashi, ISSCC’02]

Clock Gating
Challenges to skew management and clock distribution (load on clock network varies dynamically)
Fortunately, state-of-the-art design tools are starting to do a better job
  – For example, physically aware clock gating inserts gaters in the clock tree based on timing constraints and physical layout
[Figure: two clock trees with clock gaters (CG); labels: "Power savings" and "Simpler skew management, less area"]

Clock Hierarchy and Clock Gating
Example: Clock distribution of dual-core Intel Montecito processor
“Gaters” provided at lower clock tree levels
Automatic skew compensation
© IEEE 2005
[Ref: T. Fischer, ISSCC’05]

Trade-Off between Sleep-Modes and Sleep-Time
Typical operation modes
  – Active mode: normal processing
  – Standby mode: fast resume, high passive power
  – Sleep mode: slower resume, low passive power
Resume time from clock gating determined by the time it takes to turn on the clock distribution network
Standby options:
  – Just gate the clock to the module in question
  – Turn off phase-locked loop(s)
  – Turn off clock completely

Sleep Modes in μProcessors and μControllers
[Ref: S. Gary, Springer’95] [Ref: TI’06]
TI MSP430™
From standby to active in 1 ms
Using dual clock system
  • 0.1-μA power down
  • 0.8-μA standby
  • 250-μA / MIPS @ 3 V

The Standby Design Exploration Space
[Figure: wake-up delay versus standby power for the Sleep, Nap, Doze, and Standby modes]
Trade-off between different operational modes
Should blend smoothly with run-time optimizations

Also the Case for Peripheral Devices
Hard disk
             Psleep (W)   Pactive (W)   Tsleep (sec)   Tactive (sec)
  IBM        0.75         3.48          0.51           6.97
  Fujitsu    0.13         0.95          0.67           1.61
Wireless LAN Card
  Power: TX 1.65W, RX 1.4W, Doze 0.045W, Off 0W
  Transitions: To Off: 62 msec, To Doze: 34 msec
[Ref: T.
Simunic, Kluwer’02]

The Leakage Challenge – Power in Standby
With clock-gating employed in most designs, leakage power has become the dominant standby power source
With no activity in module, leakage power should be minimized as well
  – Remember constant ratio between dynamic and static power …
Challenge – how to disable unit most effectively given that no ideal switches are available

Standby Static Power Reduction Approaches
Transistor stacking
Power gating
Body biasing
Supply voltage ramping

Transistor Stacking
Off-current reduced in complex gates (see leakage power reduction @ design time)
Some input patterns more effective than others in reducing leakage
Effective standby power reduction strategy:
  – Select input pattern that minimizes leakage current of combinational logic module
  – Force inputs of module to correspond to that pattern during standby
Pro’s: Little overhead, fast transition
Con: Limited effectiveness

Transistor Stacking
[Figure: combinational module between two banks of latches; signals: Clk, Standby]
[Ref: S. Narendra, ISLPED’01]

Forced Transistor Stacking
Useful for reducing leakage in non-critical shallow gates (in addition to high VTH)
[Ref: S. Narendra, ISLPED’01]

Power Gating
Disconnect module from supply rail(s) during standby
[Figure: Logic block with sleep transistors; labels: sleep, Logic]
Footer or header transistor, or both
Most effective when high VT transistors are available
Easily introduced in standard design flows
But … Impact on performance
Very often called “MTCMOS” (when using high- and low-threshold devices)
[Ref: T. Sakata, VLSI’93; S.
Mutoh, ASIC’93]

Power Gating: Concept
Leakage current reduces because
  – Increased resistance in leakage path
  – Stacking effect introduces source biasing (similar effect at PMOS side)
[Figure: gate M1 with IN = 0 and output OUT, between VDD and a Sleep transistor modeled as extra resistance RS; the source voltage VS = Ileak x RS produces a VTH shift]

Power Gating Options
[Figure: low-VT logic with sleep transistors in three configurations: footer + header, footer only, header only]
NMOS sleeper transistor more area efficient than PMOS
Leakage reduction more effective (under all input patterns) when both footer and header transistors are present

Other Option: Boosted-Gate MOS (BGMOS)
CMOS logic: low VTH, thin TOX
Leak cut-off Switch (LS): high VTH, thick TOX (eliminates tunneling)
[Figure: CMOS logic between VDD and Virtual GND, with the LS below; LS gate levels for <Standby> and <Active>: VBOOST, VDD, 0V]
[Ref: T. Inukai, CICC’00]

Other Option: Boosted-Sleep MOS (also called Super-Cutoff CMOS or SCCMOS)
CMOS logic: low VTH, thin TOX
Leak cut-off Switch (LS): normal (or high) VTH, normal TOX
Area efficient
[Figure: CMOS logic between VDD and Virtual GND, with the LS below; LS gate levels for <Standby> and <Active>: VDD, 0, -Vboost]
[Ref: T. Inukai, CICC’00]

Virtual Supplies
Active mode: Noise on virtual supplies
Standby mode: Virtual supply collapse
[Figure: sleep transistors ON (active) and OFF (standby) between VDD / Virtual VDD and Virtual GND / GND]
© IEEE 2003
[Ref: J. Tschanz, JSSC’03]

Decoupling Capacitor Placement
[Figure: Logic with decap on supply rails versus decap on virtual rails; annotations: Longer time constant, Oxide leakage, Reduced leakage]
Compared on: Performance, Convergence time, Oxide leakage savings
© IEEE 2003
[Ref: J. Tschanz, JSSC’03]

Leakage Power Savings versus Decap
[Figure: normalized leakage power in idle mode versus idle time, at 1.32V and 75°C; curves: low-leakage 133nF decap on virtual VCC, no decap on virtual VCC; annotated savings: 40% and 90%]
© IEEE 2003
[Ref: J.
Tschanz, JSSC’03]

How to Size the Sleep Transistor?
Sleep transistor is not free – it will degrade the performance in active mode
Circuits in active mode see the sleep transistor as extra power line resistance
  – The wider the sleep transistor, the better
Wide sleep transistors cost area
  – Minimize the size of the sleep transistor for given ripple (e.g. 5%)
  – Need to find the worst-case vector

Sleep Transistor Sizing
High-VTH transistor must be very large for low resistance in linear region.
Low-VTH transistor needs less area for same resistance.
                             MTCMOS    Boosted Sleep    Non-Boosted Sleep
  Sleep-TR size              5.1%      2.3%             3.2%
  Leakage power reduction    1450x     3130x            11.5x
  Virtual supply bounce      60 mV     59 mV            58 mV
[Ref: R. Krishnamurthy, ESSCIRC’02]

Preserving State
Virtual supply collapse in sleep mode causes the loss of state in registers
Keeping the registers at nominal VDD preserves the state
  – These registers leak …
Can lower the VDD in sleep
  – Some impact on robustness, noise and soft-error immunity

Latch Retaining State during Sleep
[Figure: latch with transmission gate; signals: D, Q, Clk, sleep]
Black-shaded devices use low-VTH transistors. All others are high VTH.
[Ref: S. Mutoh, JSSC’95]

MTCMOS Derivatives Preventing State Loss
[Figure: two circuits, Clamping and Retention; labels: VDD, Vretain, virtual VDD, virtual GND, sleep, low VTH logic, High VTH, High VTH (small W), HVT]
Reduce voltage and retain state

Sleep Transistor Placement
[Figure: standard cell row with “strapper” cells, shown with no sleep transistors (VDD, GND) and with headers and footers (VDD, VDD’, GND’, GND on M3 and M4)]

Sleep Transistor Layout
[Figure: ALU layout with sleep transistor cells]
Area overhead: PMOS 6%, NMOS 3%
[Ref: J.
Tschanz, JSSC’03]

Dynamic Body Biasing
Increase thresholds of transistors during sleep using reverse body biasing
  – Can be combined with forward body biasing in active mode
No delay penalty
But
Requires triple-well technology
Limited range of threshold adjustments (<100mV)
  – Not improving with technology scaling
Limited leakage reduction (<10x)
Energy cost of charging/discharging the substrate capacitance

Dynamic Body Biasing
Active mode: Forward Body Bias (FBB); low threshold, high performance
Standby mode: Reverse Body Bias (RBB); high threshold, low leakage
[Figure: PMOS bias and NMOS bias circuits driving the PMOS body and NMOS body; rails: VHIGH, VDD, GND, VLOW]
Can also be used to compensate for threshold variations
© IEEE 2003
[Ref’s: T Kuroda ISSCC’96; J. Tschanz, JSSC’03]

The Dynamics of Dynamic Body Bias
Needs level shifting and voltage-switch circuitry
© IEEE 1995
[Ref: K. Seta, ISSCC’95]

Body Bias Layout
[Figure: ALU layout showing ALU core LBGs and sleep transistor LBGs]
  Number of ALU core LBGs            30
  Number of sleep transistor LBGs    10
  PMOS device width                  13 mm
  Area overhead                      8%
LBG: Local bias generator
[Ref: J. Tschanz, JSSC’03]

DBB for Standby Leakage Reduction - Example
Application-specific processor (SH-mobile)
  – 250 nm technology
  – Core at 1.8 V
  – I/O at 3.3 V
  – 3.3M transistors
[Figure: chip photo; label: VBC (0.13 mm²)]
© Springer 2006
[Ref: M.
Miyazaki, Springer’06]

Effectiveness of Dynamic Body Biasing
[Figure: VTH (V) versus VBS (V) for VBS from -2 V to 2 V, with forward VBS and reversed VBS regions]
Practical VTH tuning range less than 150 mV in 90 nm technology

Supply Voltage Ramping (SVR)
Reduce supply voltage of modules in sleep mode
  – Can go to 0 V if no state retention is necessary
  – Down to state retention voltage otherwise (see Memory in next Chapter), or move state to persistent memory before power-down
Most effective leakage reduction technique
  – Reduces current and voltage
But
Needs controllable voltage regulator
  – Becoming more often present in modern integrated system designs
Longer re-activation time
Simplified version switches between VDD and GND (or VDDL)
[Ref: M. Sheets, VLSI’06]

Supply Ramping
[Figure: two modules; full power down (supply between VDD and 0) and power down with data retention (supply between VDD and DRV)]
Standby power = VDD(standby) x Ileak(standby)
Modules must be isolated from neighbors
Creating “voltage islands”

Supply Ramping: Impact
Leakage power as a function of the supply voltage (90 nm)
[Figure: Pleak (scale x 10^-9) versus VDD from 0 to 1 V for an Inverter and a Nand4; annotated "Factor 8.5"]
Because of DIBL, dropping supply voltage causes dramatic reduction in leakage
  – Can go as low as 300 mV before data retention is lost

Integration in Standard Cell Layout Methodology
[Figure: power switch cell (signals: Awake, Awake_buf; rails: VDDH, VDDL (RV), Vvdd, GND) and its integration into the power grid]
Power switch cell easily incorporated into standard design flow
  – Cell has same pitch as existing components
  – No changes required to cell library from foundry
Switch design can be independent of block size

Standby Leakage Management: A Comparison
Techniques compared, in order: Transistor Stacking; Power Gating; Dynamic Body Biasing; Supply Voltage Ramping
Pro’s (in slide order): Conventional technology; No performance impact; Conventional technology; Conceptually simple; Most effective; Reuse of standard designs; No performance impact; Most effective; Also available in switched version
Con’s (in slide order): Limited impact; Special registers; Performance impact of serial transistor; Changes in design flow; Triple well; Slow activation; Does not fare well with technology scaling; Needs voltage regulator or extra rails; Slow activation
Potential Savings (in slide order): 5 - 10; 2 - 40; 2 - 1000; Huge

Some Long-Term Musings
Ideal power-off switch should have zero leakage current (S = 0 mV/decade)
Hard to accomplish with traditional electronic devices
Maybe possible using MEMS – mechanical switches have a long-standing reputation for good isolation
[Ref: N. Abele, IEDM’05]

Summary and Perspectives
Today’s designs are not leaky enough to be truly power-performance optimal! Yet, when not switching, circuits should not leak!
Clock gating effectively eliminates dynamic power in standby
Effective standby power management techniques are essential in sub-100 nm design
  – Power gating the most popular and effective technique
  – Can be supplemented with body biasing and transistor stacking
  – Voltage ramping probably the most effective technique in the long range (if gate leakage becomes a bigger factor)
Emergence of “voltage or power” domains

References
Books and Book Chapters
V. De et al., “Techniques for Leakage Power Reduction,” in A. Chandrakasan et al., Design of High-Performance Microprocessor Circuits, Ch. 3, IEEE Press, 2001.
K. Roy et al., “Circuit Techniques for Leakage Reduction,” in C. Piguet, Low-Power Electronics Design, Ch. 13, CRC Press, 2005.
S. Narendra and A. Chandrakasan, Leakage in Nanometer CMOS Technologies, Springer, 2006.
Articles
N. Abele, R. Fritschi, K. Boucart, F. Casset, P. Ancey, and A.M. Ionescu, “Suspended-gate MOSFET: bringing new MEMS functionality into solid-state MOS transistor,” IEDM Technical Digest, pp. 479-481, Dec. 2005.
T. Fischer et al., “A 90-nm variable frequency clock system for a power-managed Itanium® architecture processor,” IEEE J. Solid-State Circuits, pp. 217-227, Feb. 2006.
S. Gary, “Low-Power Microprocessor Design,” in Low Power Design Methodologies, Ed. J. Rabaey and M. Pedram, Chapter 9, pp. 255-288, Kluwer Academic, 1995.
T. Inukai et al., “Boosted Gate MOS (BGMOS): Device/Circuit Cooperation Scheme to Achieve Leakage-Free Giga-Scale Integration,” CICC, pp. 409-412, May 2000.
H. Kam et al., “A new nano-electro-mechanical field effect transistor (NEMFET) design for low-power electronics,” IEDM Tech. Digest, pp. 463-466, Dec. 2005.
R. Krishnamurthy et al., “High-performance and low-power challenges for sub-70nm microprocessor circuits,” 2002 IEEE ESSCIRC Conf., pp. 315-321, Sept. 2002.
T. Kuroda et al., “A 0.9V 150MHz 10mW 4mm² 2-D Discrete Cosine Transform Core Processor with Variable-Threshold-Voltage Scheme,” JSSC, vol. 31, no. 11, pp. 1770-1779, Nov. 1996.
M. Miyazaki et al., “Case Study: Leakage Reduction in Hitachi/Renesas Microprocessors,” in S. Narendra and A. Chandrakasan, Leakage in Nanometer CMOS Technologies, Ch. 10, Springer, 2006.
S. Mutoh et al., “1V high-speed digital circuit technology with 0.5 μm multi-threshold CMOS,” Proc. Sixth Annual IEEE ASIC Conference and Exhibit, pp. 186-189, Sept. 1993.
S. Mutoh et al., “1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS,” IEEE Journal of Solid-State Circuits, vol. 30, pp. 847-854, Aug. 1995.
S. Narendra et al., “Scaling of Stack Effect and its Application for Leakage Reduction,” ISLPED, pp. 195-200, Aug. 2001.
M. Ohashi et al., “A 27MHz 11.1mW MPEG-4 Video Decoder LSI for Mobile Application,” ISSCC, pp. 366-367, Feb. 2002.
T. Sakata, M. Horiguchi, and K. Itoh, “Subthreshold-current reduction circuits for multi-gigabit DRAM's,” Symp. VLSI Circuits Dig., pp. 45-46, May 1993.
K. Seta, H. Hara, T. Kuroda, M. Kakumu, and T. Sakurai, “50% active-power saving without speed degradation using standby power reduction (SPR) circuit,” IEEE International Solid-State Circuits Conference, vol. XXXVIII, pp. 318-319, Feb. 1995.
M. Sheets et al., “A Power-Managed Protocol Processor for Wireless Sensor Networks,” Digest of Technical Papers 2006 Symposium on VLSI Circuits, pp. 212-213, June 15-17, 2006.
T. Simunic, “Dynamic Management of Power Consumption,” in Power Aware Computing, edited by R. Graybill, R. Melhem, Kluwer Academic Publishers, 2002.
TI MSP430 Microcontroller family, http://focus.ti.com/lit/ml/slab034m/slab034m.pdf
J. W. Tschanz, S. G. Narendra, Y. Ye, B. A. Bloechel, S. Borkar, and V. De, “Dynamic sleep transistor and body bias for active leakage power control of microprocessors,” IEEE Journal of Solid-State Circuits, vol. 38, pp. 1838-1845, Nov. 2003.
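
Worked example
Two relations quoted in the slides lend themselves to a quick numeric check: the clock-gating saving for the MPEG4 decoder (30.6 mW without clock gating, 8.5 mW with) and the supply-ramping relation Standby power = VDD(standby) x Ileak(standby). The sketch below is only an illustration. The function names are made up for this example, and the 0.3 V / 1 μA operating point is an assumed value, not a measurement from the chapter.

```python
def power_reduction(p_before_mw: float, p_after_mw: float) -> float:
    """Fractional power reduction when going from p_before_mw to p_after_mw."""
    if p_before_mw <= 0:
        raise ValueError("p_before_mw must be positive")
    return (p_before_mw - p_after_mw) / p_before_mw


def standby_power(vdd_standby_v: float, ileak_standby_a: float) -> float:
    """Standby power in watts: VDD(standby) x Ileak(standby)."""
    return vdd_standby_v * ileak_standby_a


# MPEG4 decoder figures quoted on the clock-gating slide (in mW).
saving = power_reduction(30.6, 8.5)
print(f"Clock-gating saving: {saving:.1%}")

# Hypothetical module held at a 0.3 V retention voltage while leaking 1 uA.
# Both numbers are assumptions chosen only to exercise the formula.
p_standby = standby_power(0.3, 1e-6)
print(f"Standby power: {p_standby * 1e6:.2f} uW")
```

The first result comes out at roughly 72%, in line with the "70% power reduction by clock gating alone" stated on the slide. The second shows why supply ramping is attractive: lowering VDD(standby) reduces the voltage term directly, and the slides note that the leakage current term drops as well.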