An Overview of Static Power Dissipation

Jayanth Srinivasan
1 Introduction
Power consumption is an increasingly important issue in general purpose processors, particularly in the mobile computing segment. In present processors, most of the power dissipation is dynamic power dissipation,
which arises due to signal transitions. Various techniques have been studied and implemented to reduce
dynamic power dissipation, including clock gating, cache sub-banking, voltage scaling, and eliminating
needless computation (these techniques are directly relevant to computer architects).
However, as transistors become smaller and faster, static power (also called leakage power) dissipation will become increasingly significant. Technology scaling is increasing both the absolute and relative
contributions of static power dissipation.
Looking at current technology trends, it is evident that static power dissipation is growing at a faster
rate than dynamic power dissipation. In just a few processor generations, the curves will intersect. Using
scaling theory, Borkar predicts that leakage power increases by 5 times every generation, while active power
remains roughly constant. Because leakage current flows from every transistor that is powered on, with
increasing die sizes and integration, static power will become a significant part of the total power.
2 Sources of Static Power Consumption
There are three sources of power dissipation in digital CMOS circuits, which are summarized in the following
equation:
P_{avg} = P_{switching} + P_{short-circuit} + P_{static} = \alpha C_L V_{dd}^2 f + I_{sc} V_{dd} + I_{leakage} V_{dd}
Pswitching refers to the dynamic component of power, where CL is the load capacitance, f is the clock
frequency, and α is the node transition activity factor. This equation also assumes the voltage swing is equal
to the supply voltage, Vdd. Pshort-circuit is due to the direct-path short circuit current Isc, which arises when
both the NMOS and PMOS transistors are simultaneously active, conducting current directly from supply
to ground. Significant short-circuit power dissipation can be avoided if the output rise/fall time of a gate
is much longer than the input rise/fall time.
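As a rough illustration of the equation above, the following Python sketch evaluates the three components for one set of operating conditions. The function name and all numeric values are hypothetical, chosen only to show how the terms combine; they are not measurements of any real processor.

```python
def average_power(alpha, c_load, v_dd, freq, i_sc, i_leakage):
    """Evaluate the three terms of the average power equation (watts).

    alpha     : node transition activity factor (dimensionless)
    c_load    : switched load capacitance (farads)
    v_dd      : supply voltage (volts), also assumed to be the voltage swing
    freq      : clock frequency (hertz)
    i_sc      : average short-circuit current (amperes)
    i_leakage : total leakage current (amperes)
    """
    p_switching = alpha * c_load * v_dd ** 2 * freq
    p_short_circuit = i_sc * v_dd
    p_static = i_leakage * v_dd
    return {
        "switching": p_switching,
        "short_circuit": p_short_circuit,
        "static": p_static,
        "total": p_switching + p_short_circuit + p_static,
    }


# Hypothetical operating point, for illustration only.
result = average_power(alpha=0.1, c_load=1e-9, v_dd=1.2, freq=1e9,
                       i_sc=0.01, i_leakage=0.05)
for name, watts in result.items():
    print(f"{name:>13}: {watts:.3f} W")
```

Only the first term depends on switching activity and frequency; the static term is paid whenever the supply is on.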
Pstatic is due to the leakage current Ileakage. Ileakage has five
components:
1. Reverse biased pn junction current
Diode leakage occurs when a transistor is turned off and another active transistor charges up/down
the drain with respect to the former's bulk potential. For example, consider an inverter with a high
input voltage. The output is low, and the NMOS is on. The PMOS transistor will be turned off, but
its drain to bulk voltage will be -Vdd, since the output voltage is at 0V and the bulk for PMOS is at
Vdd. For the p-well to bulk diode, the leakage current is given by:
I_D = I_S (e^{V/V_T} - 1)
where IS is the reverse saturation current, V is the diode voltage, and VT is the thermal voltage,
equal to kT/q. This current is especially significant for an application which spends much of its
time idle, since this power is always being dissipated even when there is no switching.
2. Sub-threshold leakage
This occurs when the gate-source voltage, Vgs, has exceeded the weak inversion point but is still below
the threshold voltage, Vth. In this region, the MOSFET behaves similarly to a bipolar transistor,
with its exponential characteristics. The current in the sub-threshold region is given by:
I_{SUB} = K (W/L) e^{(V_{gs} - V_{th})/(n V_T)} (1 - e^{-V_{ds}/V_T})
where n and K are technology parameters, W/L is the width-to-length ratio of the transistor, and Vds is the drain-source voltage.
Scaling down the supply voltage in CMOS also requires scaling down the threshold voltage, Vth, in
order to maintain the performance of the scaled down logic. From the equation above, it becomes clear
that reducing the threshold voltage increases the sub-threshold leakage current exponentially.
Sub-threshold leakage current along with reverse biased pn junction current are currently the most
important components of leakage current.
3. Gate induced drain leakage (GIDL)
Gate induced drain leakage (GIDL) current (IGIDL) arises in the high electric field under the gate/drain
overlap region, which causes deep depletion. GIDL occurs at low VG and high VD and generates carriers
into the substrate and drain from surface traps or band-to-band tunneling.
4. Punchthrough
Punchthrough occurs when the drain and source depletion regions approach each other and electrically
"touch" deep in the channel. Punchthrough current (IPT) varies quadratically with drain voltage.
5. Gate tunneling
Gate oxide tunneling current (IG) is present when the electric field at the gate is high enough for carriers to tunnel
through the gate oxide layer. This phenomenon is common in scaled down devices with reduced oxide
thickness.
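The two dominant components listed above, reverse biased junction current and sub-threshold current, can be evaluated directly from their equations. The sketch below is illustrative only: the function names are ours, and the values of K, W/L, n, IS and the bias voltages are hypothetical placeholders rather than parameters of any real process.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin=300.0):
    """Thermal voltage VT = kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def diode_current(v, i_s, temp_kelvin=300.0):
    """Junction diode current I_D = I_S (exp(V/VT) - 1)."""
    return i_s * (math.exp(v / thermal_voltage(temp_kelvin)) - 1.0)


def subthreshold_current(v_gs, v_th, v_ds, k, w_over_l, n, temp_kelvin=300.0):
    """I_SUB = K (W/L) exp((Vgs - Vth)/(n VT)) (1 - exp(-Vds/VT))."""
    v_t = thermal_voltage(temp_kelvin)
    return (k * w_over_l * math.exp((v_gs - v_th) / (n * v_t))
            * (1.0 - math.exp(-v_ds / v_t)))


# Hypothetical device: compare off-state (Vgs = 0) leakage at two thresholds.
params = dict(v_gs=0.0, v_ds=1.2, k=1e-6, w_over_l=10.0, n=1.5)
i_high_vth = subthreshold_current(v_th=0.4, **params)
i_low_vth = subthreshold_current(v_th=0.3, **params)
print(f"leakage ratio for a 100 mV lower Vth: {i_low_vth / i_high_vth:.1f}x")

# A strongly reverse biased junction conducts approximately -I_S.
print(f"reverse biased diode current: {diode_current(-1.2, i_s=1e-12):.3e} A")
```

Because Vth sits inside the exponent, a fixed reduction in threshold voltage multiplies the off-state current by a constant factor, regardless of the values chosen for K and W/L.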
3 Impact of Technology Scaling on Static Current
Butts et al. have explained the impact of technology scaling on static current by using the constant field
scaling methodology. The primary constraint on device scaling is the process technology (e.g., lithography).
In order to keep up with Moore's law, and also to maintain chip reliability, chip designers use the
constant field scaling methodology. Constant field scaling reduces the supply voltage by the same factor as
device dimensions in order to keep the electric fields the same across technology generations.
One of the metrics used is:
t = C_{gate} V_{dd} / I_{Dsat}
where t is a single transistor delay, Cgate is the gate capacitance per unit width, and IDsat is the maximum
saturation drain current that can flow from the transistor. Under constant field scaling, if the supply voltage
is reduced by some factor S, the delay must also be reduced by the same factor S. For this, it is sufficient
to keep Cgate/IDsat constant. Cgate is proportional to the channel length and inversely proportional to the
oxide thickness. Since both these dimensions are reduced by S, Cgate remains constant. Hence, to achieve
the expected performance improvement under scaling, the drive current IDsat must remain constant. IDsat
is a function of many variables, including Vdd - Vth. In order to maintain constant IDsat, Vth has to be
reduced by a factor greater than S. From the equations in the previous section, we can see that this will lead
to exponentially increasing leakage currents.
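The scaling argument above can be traced numerically. The sketch below makes two simplifying assumptions that are ours: Vth is scaled by exactly S per generation (the text argues it must shrink by more than S, so this understates the effect), and off-state leakage is taken as proportional to exp(-Vth/(n VT)), which is the sub-threshold equation of section 2 at Vgs = 0 with Vds much larger than VT. The starting voltages, S and n are hypothetical.

```python
import math

THERMAL_VOLTAGE = 0.0259  # volts, approximately kT/q near 300 K


def gate_delay(c_gate, v_dd, i_dsat):
    """Delay metric t = Cgate * Vdd / IDsat."""
    return c_gate * v_dd / i_dsat


def relative_off_current(v_th, n=1.5, v_t=THERMAL_VOLTAGE):
    """Off-state sub-threshold current up to a constant factor."""
    return math.exp(-v_th / (n * v_t))


def scale_generations(v_dd, v_th, s, generations):
    """Return (generation, Vdd, Vth, relative leakage) for each generation."""
    rows = []
    for gen in range(generations + 1):
        rows.append((gen, v_dd, v_th, relative_off_current(v_th)))
        v_dd /= s   # constant field scaling of the supply
        v_th /= s   # assumption: threshold scaled by the same factor
    return rows


# Hypothetical starting point: Vdd = 1.8 V, Vth = 0.45 V, S = 1.4.
rows = scale_generations(v_dd=1.8, v_th=0.45, s=1.4, generations=3)
baseline = rows[0][3]
for gen, v_dd, v_th, leak in rows:
    print(f"gen {gen}: Vdd={v_dd:.2f} V  Vth={v_th:.3f} V  "
          f"leakage x{leak / baseline:.1f}")
```

With Cgate and IDsat held constant, `gate_delay` falls by S each generation, while the leakage column grows by a larger multiplicative factor each time.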
4 Estimating Leakage Power
Various research groups have developed power models for the estimation of leakage power dissipation.
However, most of these models are at a transistor level, and are not feasible for efficient architecture power
dissipation simulation. Current publicly available power estimation tools for general purpose architectures
either ignore leakage power dissipation or assign it a fixed fraction of the dynamic power dissipation. Although such approximations may be acceptable with current process parameters, better leakage power estimation should be incorporated into power estimation tools.
Butts et al. have proposed a relatively simple static power model for architects. They model the leakage
power as:
P_{leakage} = V_{dd} \cdot N \cdot k_{design} \cdot I_{leak}
where Pleakage is the static power consumption at the architectural level, N is the number of transistors,
kdesign is a design dependent parameter, and Ileak is a technology dependent parameter. This equation
allows us to separate the contributions to reduction in leakage power by architects and circuit designers. Ileak
depends on technology parameters like Vth, while kdesign depends on design parameters like the fraction of
transistors on at any time.
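A minimal sketch of how this model might be applied per block in an architectural power estimator follows. The block names, transistor counts, kdesign values and Ileak are hypothetical placeholders, not values taken from Butts et al.

```python
def leakage_power(v_dd, n_transistors, k_design, i_leak):
    """Static power of one block: Vdd * N * kdesign * Ileak (watts)."""
    return v_dd * n_transistors * k_design * i_leak


def chip_leakage(v_dd, i_leak, blocks):
    """Sum the model over blocks given as {name: (N, kdesign)}."""
    return {name: leakage_power(v_dd, n, k, i_leak)
            for name, (n, k) in blocks.items()}


# Hypothetical block inventory: (transistor count, kdesign).
blocks = {
    "l1_cache": (2_000_000, 1.2),
    "integer_alu": (100_000, 10.0),
    "register_file": (300_000, 5.0),
}
per_block = chip_leakage(v_dd=1.2, i_leak=1e-9, blocks=blocks)
for name, watts in per_block.items():
    print(f"{name:>14}: {watts * 1e3:.2f} mW")
print(f"{'total':>14}: {sum(per_block.values()) * 1e3:.2f} mW")
```

Keeping N and kdesign per block and Ileak global mirrors the separation the model draws between architectural and technology contributions.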
(Note: I’m not too sure how easy it would be to incorporate a leakage power model into Wattch for
RSIM. We would have to estimate transistor counts for the different blocks, and I’m not too sure we can
directly use the kdesign values in the Butts paper.)
5 Reducing Static Power
Many circuit and device level techniques have been developed to reduce static power dissipation. The leakage
power equation developed by Butts et al. also lends itself to some obvious ways to reduce power dissipation.
5.1 Input selection for stand-by mode
Studies have shown that the vectors applied at the inputs of logic gates have a large impact on the leakage current. Chen
et al. have developed a genetic algorithm based technique to estimate the standby leakage power in CMOS
circuits.
5.2 Steeper sub-threshold swing
Sub-threshold swing is the metric used to evaluate sub-threshold leakage current. The equation for sub-threshold
leakage current in section 2 lends itself to various methods to reduce this current. The swing is
proportional to the thermal voltage, and hence to the temperature of operation. Hence, one option is to operate the circuit at liquid nitrogen
temperature. This is expensive and not practical for mobile applications, though. Another option is to use a
Silicon on Insulator (SOI) circuit. It has been found that the leakage current of the SOI device in the standby mode
is much lower than that of the bulk silicon device for the same threshold voltage.
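The temperature dependence can be made concrete. From the exponential term in the sub-threshold equation of section 2, the gate voltage change needed for a tenfold change in current is n VT ln(10), with VT = kT/q. The sketch below evaluates this swing at room temperature and at liquid nitrogen temperature; the value of n is a hypothetical placeholder.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def subthreshold_swing(temp_kelvin, n=1.5):
    """Swing in volts per decade of current: n * (kT/q) * ln(10)."""
    v_t = BOLTZMANN * temp_kelvin / ELECTRON_CHARGE
    return n * v_t * math.log(10)


for label, temp in (("room temperature", 300.0), ("liquid nitrogen", 77.0)):
    swing_mv = subthreshold_swing(temp) * 1e3
    print(f"{label:>16} ({temp:.0f} K): {swing_mv:.1f} mV/decade")
```

A smaller (steeper) swing means the transistor turns off more sharply below threshold, so the same Vth yields less off-state current.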
5.3 Multiple supply voltages
Since power dissipation decreases quadratically with the scaling of supply voltage, while delay only increases linearly, it is possible to use high supply voltage in the critical paths of a design to achieve the
required performance while the off-critical paths of the design use lower supply voltage to achieve low
power dissipation. By partitioning the circuit into several domains operating at different supply voltages,
both static and dynamic savings are possible. However, level shifter circuits are required for inter-domain
communication. Another way to reduce the supply voltage without impacting performance is to emphasize
high IPC designs. However, this should not come at the cost of added circuitry, as the extra leakage current
might offset the benefit of the savings due to lower voltage.
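A sketch of the critical path versus off-critical path assignment follows. It assumes, as a first-order approximation of our own, that path delay grows by the ratio of the high to the low supply voltage and that switching power scales with the square of Vdd; the path names, delays and voltages are hypothetical, and level shifter overhead is ignored.

```python
def assign_supply(path_delays, clock_period, v_high, v_low):
    """Give each path the low supply if it still meets timing there.

    path_delays maps a path name to its delay at v_high, in the same
    unit as clock_period.
    """
    slowdown = v_high / v_low  # assumed first-order delay penalty at v_low
    return {name: (v_low if delay * slowdown <= clock_period else v_high)
            for name, delay in path_delays.items()}


def relative_switching_power(v_dd, v_ref):
    """Switching power at v_dd relative to v_ref (quadratic in Vdd)."""
    return (v_dd / v_ref) ** 2


# Hypothetical paths with delays in nanoseconds at the high supply.
paths = {"critical_alu": 1.0, "decode": 0.7, "status_flags": 0.4}
plan = assign_supply(paths, clock_period=1.0, v_high=1.2, v_low=0.9)
for name, v_dd in plan.items():
    saving = 1.0 - relative_switching_power(v_dd, 1.2)
    print(f"{name:>13}: Vdd={v_dd} V, switching power saved {saving:.0%}")
```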
5.4 Multiple threshold voltages
It is clear that threshold voltage is one of the most important parameters for device and circuit design. For
the active mode, low Vth is preferred because of the higher performance. However, for the standby mode
of operation, high Vth is useful for reduction of leakage power. Hence, if different threshold voltages could
be used during the different modes of operation, large reductions in leakage power are possible without
sacrificing speed. Different threshold voltages can be developed during fabrication. Different transistor
speeds may be used in different ways. One method would be to employ fast devices along critical timing
paths and slower, higher Vth devices in non-critical parts of the circuit. A second technique involves
determining which functional units require the lowest latencies and allocating the budget of fast, leaky
devices to these units only.
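The second technique can be sketched as a simple greedy allocation. This is only one possible policy, and the notion of a per-unit "criticality" score, the unit names and the transistor counts are all hypothetical.

```python
def allocate_fast_devices(units, budget):
    """Pick which units are built from fast, low-Vth (leaky) devices.

    units  : list of (name, criticality, transistor_count); a higher
             criticality means the unit is more latency sensitive
    budget : maximum number of low-Vth transistors allowed
    Returns the names of the chosen units, most critical first.
    """
    chosen = []
    remaining = budget
    for name, _criticality, count in sorted(units, key=lambda u: u[1],
                                            reverse=True):
        if count <= remaining:
            chosen.append(name)
            remaining -= count
    return chosen


# Hypothetical functional units.
units = [
    ("integer_alu", 0.9, 100_000),
    ("fp_unit", 0.4, 300_000),
    ("l1_tags", 0.8, 500_000),
]
print(allocate_fast_devices(units, budget=600_000))
```

Units left out of the returned list would use slower, high-Vth devices and so contribute less leakage.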
5.5 Reducing the number of devices
One obvious technique to reduce static power is to reduce the total number of transistors used in the circuit.
However, it is difficult to find opportunities to reduce the device count enough to impact power. Since a large
number of devices must be removed to have a noticeable impact, units with replication make obvious targets.
Cache size, number of functional units, and issue/retire bandwidth may all be reduced with varying degrees
of difficulty and performance impact. Another beneficial task for architects would be to equalize utilization.
Power gating may be used to achieve the same benefit of reducing the number of devices without actually
removing any devices. It is analogous to clock gating. Sections of the circuit are turned off when not in
use in order to reduce leakage power. However, additional circuitry is required to monitor when shutting off
can be done and to implement the powering down. This leads to extra power dissipation. The other major
problem with power gating is the latency required for units to turn on after they have been powered down.
Due to the huge capacitance on the power supply nodes in a unit, several clock cycles will be needed to
allow the power supply to reach its operating level. Solutions to this problem involve stalling or prediction
of when units are required.
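The trade-off between leakage saved and the overhead of gating can be expressed as a break-even test. The sketch below uses a simplified energy model of our own: the unit saves its per-cycle leakage energy only while fully powered down, the wake-up cycles save nothing, and each gating event costs a fixed overhead energy. All quantities are hypothetical.

```python
def gating_saves_energy(idle_cycles, leak_energy_per_cycle,
                        overhead_energy, wakeup_cycles):
    """Return True if gating over this idle period yields a net saving."""
    gated_cycles = idle_cycles - wakeup_cycles  # cycles actually powered down
    if gated_cycles <= 0:
        return False  # the unit would be needed again before it is off
    return gated_cycles * leak_energy_per_cycle > overhead_energy


# Hypothetical unit: 1 pJ leaked per cycle, 50 pJ gating overhead,
# 10 cycles to restore the supply.
for idle in (5, 20, 100):
    decision = gating_saves_energy(idle, leak_energy_per_cycle=1.0,
                                   overhead_energy=50.0, wakeup_cycles=10)
    print(f"idle for {idle:>3} cycles -> gate: {decision}")
```

A predictor that estimates the length of the coming idle period could feed `idle_cycles`, which is where the stalling or prediction mentioned above enters.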
5.6 Using more efficient circuits
kdesign offers few opportunities for static power reduction directly. Power efficient circuitry can be used if
performance is maintained within required limits.
5.7 Power reduction with speculation
Speculation can be an important tool for architects when designing power-efficient architectures. It provides
an opportunity to use slower devices without proportionally impacting performance. Fast circuitry is used
for the performance critical speculation circuitry while slower circuitry can be used for the relatively simple
verification. Thus, the verification circuitry may use high threshold devices, a lower supply voltage,
a lower frequency, etc., resulting in both static and dynamic power savings. DIVA is a good example of
an architecture in which such devices can be used. Another application of speculation is predicting when
certain circuitry will be needed in order to bring it out of a power gated state. Speculation can be used to
power down parts of the circuit and power them up again.
6 Related Work in the Architecture Community
Many research groups have proposed and developed architectural techniques to reduce dynamic power dissipation. However, very little work has been done on static power dissipation from an architectural perspective.
Recent work by Powell et al. combines circuit and architectural techniques to reduce the power consumption in a processor’s cache. The cache miss rate is used to determine the working set size of the application
relative to that of the cache. Power is then removed from the unused portions of the cache using gated-Vdd
transistors. Recent work by Kaxiras et al. also attacks static power dissipation in the cache. Policies and
implementations for reducing cache leakage by invalidating and turning off cache lines when they hold data
not likely to be reused are discussed. This leads to power savings in the cache.
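The idea of turning off cache lines that are unlikely to be reused can be sketched with a per-line idle counter. This is a simplified illustration under our own assumptions (a fixed decay interval counted in ticks, and a line that loses its contents when turned off); it is not the exact policy or hardware of Kaxiras et al., and the class and method names are hypothetical.

```python
class DecayCache:
    """Toy model of per-line power state with a fixed decay interval."""

    def __init__(self, num_lines, decay_interval):
        self.decay_interval = decay_interval
        self.idle = [0] * num_lines         # ticks since last access
        self.powered = [False] * num_lines  # lines start turned off

    def access(self, line):
        """Touch a line. Returns True if it was still powered (data kept)."""
        was_powered = self.powered[line]
        self.powered[line] = True
        self.idle[line] = 0
        return was_powered

    def tick(self):
        """Advance time; turn off lines idle for the whole decay interval."""
        for i, on in enumerate(self.powered):
            if on:
                self.idle[i] += 1
                if self.idle[i] >= self.decay_interval:
                    self.powered[i] = False  # contents are lost

    def powered_lines(self):
        """Number of lines currently leaking."""
        return sum(self.powered)


cache = DecayCache(num_lines=4, decay_interval=3)
cache.access(0)
cache.access(1)
for _ in range(2):
    cache.tick()
cache.access(1)   # line 1 is reused, so its counter restarts
cache.tick()      # line 0 has now been idle for 3 ticks and is turned off
print("lines still powered:", cache.powered_lines())
```

A shorter decay interval saves more leakage but risks turning off lines that would have been reused, which shows up as extra misses.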
The device community has been looking at the problem of static leakage for a much longer time. Several
device techniques have been developed. Several low static power transistor families (like MTCMOS) have
also been developed.
7 Future Work
Most of the current work on reducing static power dissipation lies in the domain of circuit and device
engineers. They target lower power circuits by tweaking the design at a fabrication level.
Reducing the number of devices in order to save static power is difficult. Most of the devices which can
be removed are redundant and can be removed during fabrication using design algorithms. There is more
scope in attacking power gating. Exploiting speculation is probably the best way for architects to deal with
static power consumption.
Our current adaptive framework for multimedia applications can be modified to take static power consumption into account. This could possibly lead to different results than the case where we only consider
dynamic power. A more aggressive architecture, though efficient from the point of view of dynamic power,
might cause too much leakage when idle. Speculation could be used to predict the need for functional units.
Issue width and instruction window size could possibly impact static power in a different way than dynamic
power.
Based on this, voltage can be scaled appropriately. Different sections of the circuit can be identified and
allotted different Vdd and Vth.
Soft errors in multimedia applications can be exploited to further reduce static power. An architecture
like DIVA can be further optimized for static power consumption when certain soft errors are allowed. That
would allow us to further reduce the speed of the verification circuit (resulting in higher Vth and lower Vdd)
and to also reduce the dynamic power consumption of the entire circuit.
[1, 11, 7, 2, 8, 5, 3, 9, 6, 4, 10]
References
[1] T. M. Austin. DIVA: A reliable substrate for deep submicron microarchitecture design. In Proc. of the 32nd Annual Intl. Symp. on Microarchitecture, 1998.
[2] S. Borkar. Design challenges of technology scaling. In IEEE Micro, 1999.
[3] J. A. Butts and G. S. Sohi. A static power model for architects. In Proc. of the 33rd Annual Intl. Symp. on Microarchitecture, 2000.
[4] A. P. Chandrakasan and R. W. Brodersen. Low power digital CMOS design.
[5] S. Kaxiras, Z. Hu, and M. Martonosi. Cache decay: Exploiting generational behavior to reduce cache leakage power. In Proc. of the 28th Annual Intl. Symp. on Comp. Architecture, 2001.
[6] W. Nebel and J. Mermet. Low power design in deep submicron electronics.
[7] J. P. Halter and F. N. Najm. A gate-level leakage power reduction method for ultra-low-power CMOS circuits. In Proc. IEEE Custom IC Conference, 1997.
[8] M. D. Powell, S.-H. Yang, B. Falsafi, K. Roy, and T. N. Vijaykumar. Gated-Vdd: A circuit technique to reduce leakage in deep-submicron cache memories. In Proc. of the Intl. Symposium on Low Power Electronics and Design, 2000.
[9] K. Roy and S. C. Prasad. Low-power CMOS VLSI circuit design.
[10] A. S. Sedra and K. C. Smith. Microelectronic circuits.
[11] S.-H. Yang, M. D. Powell, B. Falsafi, K. Roy, and T. N. Vijaykumar. An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance i-caches. In Proc. of the 7th Intl. Symp. on High-Perf. Comp. Architecture, 2001.