Basics of Energy & Power Dissipation Lecture notes S. Yalamanchili, S. Mukhopadhyay. A. Chowdhary Outline • Basic Concepts • Dynamic power • Static power • Time, Energy, Power Tradeoffs • Activity model for power estimation Combinational and sequential logic (2) Background Reading • http://en.wikipedia.org/wiki/CPU_power_dissip ation • http://en.wikipedia.org/wiki/CMOS#Power:_sw itching_and_leakage • http://www.xbitlabs.com/articles/cpu/display/c ore-i5-2500t-2390t-i3-2100t-pentiumg620t.html • http://www.cpu-world.com/info/charts.html (3) Where Does the Power Go in CMOS? • Dynamic Power Consumption Caused by switching transitions cost of switching state V Vdd PMOS Vout in • Static Power Consumption NMOS Ground Caused by leakage currents in the absence of any switching activity • Power consumption per transistor changes with each technology generation No longer reducing at the same rate What happens to power density? AMD Trinity APU (4) n-channel MOSFET GATE GATE DRAIN SOURCE tox DRAIN SOURCE BODY L L • Vgs < Vth transistor off - Vth is the threshold voltage • Vgs > Vth transistor on • Impact of threshold voltage Higher Vth, slower switching speed, lower leakage Lower Vth, faster switching speed, higher leakage • Actual physics is more complex but this will do for (5) now! Charge as a State Variable a b c x y For computation we should be able to identify if each of the variable (a,b,c,x,y) is in a ‘1’ or a ‘0’ state. We could have used any physical quantity to do that • Voltage • Current • Electron spin • Orientation of magnetic field • ……… All nodes have some capacitance associated with them a b c x y We choose voltage on the capacitor at each node (a,b,c…) to distinguish between a ‘0’ and a ‘1’. Logic 1: Cap is charged Logic 0: Cap is discharged + (6) Abstracting Energy Behavior • How can we abstract energy consumption for a digital device? • Consider the energy cost of charge transfer Vdd Vin Vout 0 1 1 0 Vin Modeled as an on/off resistance PMOS Vout NMOS Modeled as an output capacitance Ground (7) Switch from one state to another To perform computation, we need to switch from one state to another Connect the cap to GND thorough an ON NMOS Logic 1: Cap is charged Logic 0: Cap is discharged + Connect the cap to VCC thorough an ON PMOS The logic dictates whether a node capacitor will be charged or discharged. (8) Dynamic Power • Dynamic power is used in charging and discharging the capacitances in the CMOS circuit. VDD VDD Voltage iDD VDD CL 0 T Input to CMOS inverter iDD CL Tim e Output Capacitor Charging Output Capacitor Discharging CL = load capacitance (9) Switching Delay Vdd Vout -t RC Vs = Vdd(1- e ) t = RC Vdd Charging to logic 1 (10) Switching Delay Vdd Vout Vc = Ve -t RC Vdd Discharging to logic 0 (11) Switching Energy CL is the load capacitance Energy drawn from supply : EvDD = ¥ òi vDD ¥ (t)VDD dt = VDD ò 0 dt 0 Energy stored in capacitor : EC = ( d C L vout ¥ ¥ 0 0 ò ic (t)vout dt = ò ( d C L vout dt )v ) dt = C V dt = C L out L DD VDD ò dv out = C LV 2 DD 0 VDD ò 0 1 vout dvout = C LV 2 DD 2 Energy dissipated per transition? 1 Energy dissipated in resistor : E R = EvDD - EC = C LV 2 DD 2 ¥ ¥ ¥ d C v VDD - vout 1 L out 2 Also, E R = ò iR (t)R dt = ò iR (t) R dt = ò VDD - vout dt = C LV 2 DD R dt 2 0 0 0 ( => Independent of R Power = CL V2 f ) ( ) Note: An equal amount is dissipated when C discharged for a node switching at a rate f (f may be ½ the clock rate) (12) Power(watts) Power(watts) Power Vs. Energy P2 P1 Same Energy = area under the curve P0 Time P0 Time • Energy is a rate of expenditure of energy One joule/sec = one watt • Both profiles use the same amount of energy at different rates or power (13) Dynamic Power vs. Dynamic Energy • Dynamic power: consider the rate at which switching (energy dissipation) takes place VDD VDD Voltage iDD VDD iDD CL 0 T Input to CMOS inverter CL Tim e Output Capacitor Charging Output Capacitor Discharging activity factor = fraction of total capacitance that switches each cycle æ CL ö Pdynamic = a ç ÷ ×Vdd ×Vdd × F è 2 ø (14) Higher Level Blocks Vdd Vdd A A B Vdd B C C A A B B A B C A C B (15) Implications input Combinational Logic clk cond clk clk • What if I halved the clock rate to reduce dynamic energy? • What if I turned off the clock to a block of logic? • What if I lower the voltage? • How can I reduce the capacitance? (16) Static Power • Technology scaling has caused transistors to become smaller and smaller. As a result, static power has become a substantial portion of the total power. GATE SOURCE DRAIN Gate Leakage Junction Leakage Sub-threshold Leakage Pstatic = Vdd × I static (17) Delay Energy VDD Energy-Delay Product (EDP) Energy or delay Energy-Delay Interaction VDD • Delay decreases with supply voltage but energy/power increases æ CL ö Pdynamic = a ç ÷ ×Vdd ×Vdd × F è 2 ø -t RC Vs = Vdd (1- e ) (18) leakage or delay Static Energy-Delay Interaction leakage delay GATE DRAIN SOURCE tox L Vth • Static energy increases exponentially with decrease in threshold voltage • Delay increases with threshold voltage (19) Optimizing Power vs. Energy Maximize battery life minimize energy Thermal envelopes minimize peak power Example: (20) What About Wires? Lumped RC Model 1 Cline 2 Rline = r ×l Resistance per unit length 1 Cline 2 1 t = rc × l 2 2 Cline = c ×l Capacitance per unit length • We will not directly address delay or energy expended in the interconnect in this class Simple architecture model: lump the energy/power with the source component (21) ALU Energy Consumption Binvert CarryIn a0 b0 Operation CarryIn ALU0 Less CarryOut Binvert Operation CarryIn a Result0 0 1 a1 b1 0 Result CarryIn ALU1 Less CarryOut Result1 b 0 2 1 Less a2 b2 0 CarryIn ALU2 Less CarryOut Result2 CarryOut • Can we count the number of transitions in each 1-bit ALU for an operation? • Can we estimate static power? • Computing per operation energy CarryIn a31 b31 0 CarryIn ALU31 Less 3 Result31 Set Overflow (22) Closer Look: A 4-bit Ripple Adder A3 B3 A2 B2 A1 B1 A0 B0 Carry Cin S3 S2 S1 • Critical Path = DXOR+4*(DAND+DOR) for 4-bit ripple adder levels) S0 (9 gate • For an N-bit ripple adder • Critical Path Delay ~ 2(N-1)+3 = (2N+1) Gate delays • Activity (and therefore power) is a function of the input data values! (23) Modeling Component Energy • Per-use energies can be estimated from • Gate level designs and analyses • Circuit-level designs and analyses • Implementation and measurement • There are various open-source tools for analysis • Mentor, Cadence, Synopsys, etc. Hardware Design Technology Parameters Circuit-level Estimation Tool Estimation Results: Area, Energy, Timing, etc. (24) Example: A Simple Energy Model • We can use a simple model of per-access energy for the architecture components Common Components @16nm Access Energy (10-12 joules) Inst. Cache + TLB Read 19.22 Write 21.6 Data Cache + TLB Read 25.28 Write 27.26 Inst. Decoder Logic Switching 16.78 Inst. Registers Read 2.74 Write 4.38 FP. Registers Read 1.26 Write 1.98 Other Buffers Read 9.74 Write 11.18 ALU + Result Bus (interconnect) Logic Switching 123.2 FPU + Result Bus (interconnect) Logic Switching 241.02 • Each unit can be accessed multiple times depending on instruction type • An x86 instruction consumes 600pJ ~ 4nJ dynamic energy. (25) Summary • Two major classes of energy/power dissipation – static and dynamic • Managing energy is different from managing power leads to different solutions • Technology plays a major role in determining relative costs • Energy of components are often estimated using approximate models of switching activity (26) Study Guide • Explain the difference between energy dissipation and power dissipation • Distinguish between static power dissipation and dynamic power dissipation • What is the impact of threshold voltage on the delay and energy dissipation? • As you increase the supply voltage what is the behavior of the delay of logic elements? Why? • As you increase the supply voltage what is the behavior of static and dynamic energy and static and dynamic power of logic elements? (27) Study Guide (cont.) • Do you expect the 0-1 and 1-0 transitions at the output of a gate to dissipate the same amount of energy? • For a mobile device, would you optimize power or energy? Why? What are the consequences of trying to optimize one or the other? • Why does the energy dissipation of a 32-bit integer adder depend on the input values? • If I double the processor clock frequency and run the same program will it take less or more energy? (28) Glossary • Dynamic Energy • Time constant • Dynamic Power • Threshold voltage • Load capacitance • Switching energy • Static Energy • Static Power (29) Clocked CMOS Logic A Out B Vdd A NAND Vdd Eliminates PMOS array that would take 3X the chip area of an N-mos array. Clocking eliminates "race" conditions. B Clock Out CL Out A CL A B B Standard CMOS Clocked Mixed-MOS J. A. Copeland, R. H. Krambeck, "Functionally Static Type Semiconductor Shift Register with Half Dynamic-Half Static Stages," patent 3,993,916, Nov. 23, 1976. (30)