Lectures 1 and 2: Introduction

CSV881: Low-Power Design
Introduction to Low Power Design
Vishwani D. Agrawal
James J. Danaher Professor
Dept. of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
vagrawal@eng.auburn.edu
http://www.eng.auburn.edu/~vagrawal
Copyright Agrawal, 2011
Lectures 1, 2: Introduction
1
Course Objectives
Low-power is a current need in VLSI design.
Learn basic ideas, concepts, theory and methods.
Gain experience with techniques and tools.

Course Description
This course is designed for the MTech program in VLSI at IIT, Delhi. It is
patterned after a one-semester graduate-level course offered at Auburn
University. A set of 16 lectures, which include classroom exercises, provides
an understanding of theoretical and practical aspects of power and energy in
digital VLSI systems. The course fulfills a basic need of today’s industrial
design environment. Specific topics include power components of digital
CMOS circuits, power analysis, glitch elimination for reducing dynamic power,
dual-threshold design for reduced static power, voltage and frequency
scaling*, power management in memories* and microprocessors*, parallelism
for power saving, battery management*, test power, ultra-low voltage
(subthreshold) logic circuits*, low power technologies (domino CMOS,
pass transistor logic)*, and adiabatic logic*.
________________
* Not included in short course.
Outline

Lecture 1: Introduction (37)
Lecture 2: Examples
Homework 1 (10 points)
Lecture 3: Power dissipation in CMOS Circuits (46)
Lecture 4: Power of a transition, static power
Homework 2 (10 points)
Lecture 5: Gate-level power analysis (56)
Lecture 6: Logic simulation, delay estimation
Lecture 7: Transition density, Probabilistic methods
Homework 3 (10 points)
Lecture 8: Linear Programming – A Mathematical optimization technique (44)
Lecture 9: Examples of LP and ILP optimization
Homework 4 (10 points)
Lecture 10: Gate-level power optimization (59)
Lecture 11: Glitch-free design for reduced dynamic power
Lecture 12: Dual-threshold design for reduced leakage
Lecture 13: Multicore design for low power (23)
Homework 5 (10 points)
Lecture 14: Test Power (52)
Lecture 15: Test Power (continued)
Lecture 16: SoC Test Scheduling
EXAM (50 points)

Numbers in parentheses give the number of slides.
Schedule









Oct 21, 2013 – 3:30-5:00PM Lectures 1 and 2
Oct 22, 2013 – 3:30-5:00PM Lectures 3 and 4
Oct 23, 2013 – 3:30-5:00PM Lectures 5 and 6
Oct 24, 2013 – 3:30-5:00PM Lectures 7 and 8
Oct 25, 2013 – 3:30-5:00PM Lectures 9 and 10
Oct 26, 2013 – 3:30-5:00PM Lectures 11 and 12
Oct 28, 2013 – 4:00-5:30PM Lectures 13 and 14
Oct 29, 2013 – 4:00-5:30PM Lectures 15 and 16
Oct 31, 2013 – EXAM
Power Consumption of VLSI Chips
Why is it a concern?
ISSCC, Feb. 2001, Keynote
“Ten years from now, microprocessors will run at 10GHz to 30GHz and be capable of processing 1 trillion operations per second – about the same number of calculations that the world's fastest supercomputer can perform now.
“Unfortunately, if nothing changes these chips will produce as much heat, for their proportional size, as a nuclear reactor. . . .”
Patrick P. Gelsinger, Senior Vice President, General Manager, Digital Enterprise Group, INTEL CORP.
VLSI Chip Power Density
[Figure: power density (W/cm2, logarithmic scale from 1 to 10000) versus year (1970 to 2010) for the Intel processors 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® and P6, with reference levels marked for a hot plate, a nuclear reactor, a rocket nozzle and the Sun’s surface. Source: Intel]
SIA Roadmap for Processors (1999)

Year                     1999    2002    2005    2008    2011    2014
Feature size (nm)         180     130     100      70      50      35
Logic transistors/cm2    6.2M     18M     39M     84M    180M    390M
Clock (GHz)              1.25     2.1     3.5     6.0    10.0    16.9
Chip size (mm2)           340     430     520     620     750     900
Power supply (V)          1.8     1.5     1.2     0.9     0.6     0.5
High-perf. Power (W)       90     130     160     170     175     183

Untrue predictions.
Source: http://www.semichips.org
Recent Data
Source: http://www.eetimes.com/story/OEG20040123S0041
Low-Power Design
Design practices that reduce power consumption by at least one order of magnitude; in practice 50% reduction is often acceptable.
Low-power design methods:
  Algorithms and architectures
  High-level and software techniques
  Gate and circuit-level methods
  Test power

VLSI Building Blocks










Finite-state machine (FSM)
Bus
Flip-flops and shift registers
Memories
Datapath
Processors
Power grid
Clock distribution
Analog circuits
RF components
Specific Topics in Low-Power


Power dissipation in CMOS circuits
Device technology



Circuit and gate level methods







Logic synthesis
Dynamic power reduction techniques
Leakage power reduction
System level methods


Low-power CMOS technologies
Energy recovery methods
Microprocessors
Arithmetic circuits
Low power memory technology
Test Power
Power estimation
Some Examples
State Encoding for a Counter

Two-bit binary counter:
  State sequence: 00 → 01 → 10 → 11 → 00
  Six bit transitions in four clock cycles
  6/4 = 1.5 transitions per clock

Two-bit Gray-code counter:
  State sequence: 00 → 01 → 11 → 10 → 00
  Four bit transitions in four clock cycles
  4/4 = 1.0 transition per clock

The Gray-code counter is more power efficient.
G. K. Yeap, Practical Low Power Digital VLSI Design, Boston:
Kluwer Academic Publishers (now Springer), 1998.
Binary Counter: Original Encoding

Present state    Next state
  a    b           A    B
  0    0           0    1
  0    1           1    0
  1    0           1    1
  1    1           0    0

A = a’b + ab’ = a xor b
B = a’b’ + ab’ = b’

[Figure: counter circuit with present-state signals a and b, next-state signals A and B, clock CK and clear CLR.]
Binary Counter: Gray Encoding
Present state    Next state
  a    b           A    B
  0    0           0    1
  0    1           1    1
  1    0           0    0
  1    1           1    0

A = a’b + ab = b
B = a’b’ + a’b = a’

[Figure: counter circuit with present-state signals a and b, next-state signals A and B, clock CK and clear CLR.]
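The two sets of next-state equations can be checked by stepping each counter through four clock cycles. The following is a minimal sketch in Python; the function names are illustrative and are not part of the lecture material.

```python
def binary_next(a, b):
    """Next state (A, B) of the two-bit binary counter: A = a xor b, B = b'."""
    return a ^ b, 1 - b


def gray_next(a, b):
    """Next state (A, B) of the two-bit Gray-code counter: A = b, B = a'."""
    return b, 1 - a


def run(next_state, cycles=4, start=(0, 0)):
    """Step a counter and return (visited states, total flip-flop toggles)."""
    state, states, toggles = start, [start], 0
    for _ in range(cycles):
        nxt = next_state(*state)
        # A flip-flop toggles when its bit differs between consecutive states.
        toggles += sum(x != y for x, y in zip(state, nxt))
        states.append(nxt)
        state = nxt
    return states, toggles


binary_states, binary_toggles = run(binary_next)
gray_states, gray_toggles = run(gray_next)
print(binary_states, binary_toggles / 4)  # transitions per clock, binary
print(gray_states, gray_toggles / 4)      # transitions per clock, Gray code
```

Under these equations the binary counter accumulates six toggles in four cycles and the Gray-code counter four, matching the 1.5 and 1.0 transitions per clock quoted earlier.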
Three-Bit Counters
        Binary                       Gray-code
State   No. of toggles       State   No. of toggles
000     -                    000     -
001     1                    001     1
010     2                    011     1
011     1                    010     1
100     3                    110     1
101     1                    111     1
110     2                    101     1
111     1                    100     1
000     3                    000     1

Av. Transitions/clock = 1.75 (binary)
Av. Transitions/clock = 1 (Gray-code)
N-Bit Counter: Toggles in Counting Cycle



Binary counter: $T(\text{binary}) = 2(2^N - 1)$
Gray-code counter: $T(\text{gray}) = 2^N$
$T(\text{gray})/T(\text{binary}) = 2^{N-1}/(2^N - 1) \to 0.5$ as $N \to \infty$

Bits    T(binary)    T(gray)    T(gray)/T(binary)
1       2            2          1.0
2       6            4          0.6667
3       14           8          0.5714
4       30           16         0.5333
5       62           32         0.5161
6       126          64         0.5079
∞       -            -          0.5000
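The closed-form toggle counts can be cross-checked by brute force. The sketch below, in Python, counts bit flips around one full counting cycle; it assumes the reflected binary Gray code (generated as `i ^ (i >> 1)`), which reproduces the three-bit Gray sequence used in this lecture. Function names are illustrative.

```python
def toggles(sequence):
    """Total number of bit flips around a cyclic sequence of integers."""
    wrapped = sequence[1:] + sequence[:1]  # successor of each state, with wrap-around
    return sum(bin(cur ^ nxt).count("1") for cur, nxt in zip(sequence, wrapped))


def counter_toggles(n_bits):
    """Return (T(binary), T(gray)) for one counting cycle of an n-bit counter."""
    binary = list(range(2 ** n_bits))
    gray = [i ^ (i >> 1) for i in binary]  # reflected binary Gray code (assumed)
    return toggles(binary), toggles(gray)


for n in range(1, 7):
    t_bin, t_gray = counter_toggles(n)
    # Compare against the closed forms 2(2^N - 1) and 2^N.
    assert t_bin == 2 * (2 ** n - 1)
    assert t_gray == 2 ** n
    print(n, t_bin, t_gray, round(t_gray / t_bin, 4))
```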
FSM State Encoding
[Figure: two state transition graphs of the same three-state FSM, annotated with transition probabilities based on PI statistics. The first graph uses the state codes 11, 00 and 01; the second uses 01, 00 and 11.]

Expected number of state-bit transitions:
First encoding: 2(0.3+0.4) + 1(0.1+0.1) = 1.6
Second encoding: 1(0.3+0.4+0.1) + 2(0.1) = 1.0

State encoding can be selected using a power-based cost function.
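A power-based cost function of this kind can be sketched as the sum, over all edges of the STG, of the transition probability times the Hamming distance between the two state codes. The Python sketch below is illustrative only: the state names S1 to S3 and the edge list are assumptions chosen to be consistent with the arithmetic above, since the original graph is not reproduced here.

```python
from itertools import permutations

# Hypothetical edge list (source, destination, probability); an assumption
# consistent with the sums 2(0.3+0.4) + 1(0.1+0.1) and 1(0.3+0.4+0.1) + 2(0.1).
EDGES = [
    ("S1", "S2", 0.3), ("S2", "S1", 0.4),
    ("S1", "S3", 0.1), ("S3", "S2", 0.1),
    ("S1", "S1", 0.6), ("S2", "S2", 0.6), ("S3", "S3", 0.9),  # self-loops cost nothing
]

ENCODING_1 = {"S1": "11", "S2": "00", "S3": "01"}
ENCODING_2 = {"S1": "01", "S2": "00", "S3": "11"}


def hamming(code_a, code_b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


def expected_transitions(edges, encoding):
    """Cost function: probability-weighted state-bit transitions."""
    return sum(p * hamming(encoding[s], encoding[d]) for s, d, p in edges)


cost_1 = expected_transitions(EDGES, ENCODING_1)
cost_2 = expected_transitions(EDGES, ENCODING_2)
print(round(cost_1, 2), round(cost_2, 2))

# Exhaustive search over all two-bit assignments for the three states.
codes = ["00", "01", "10", "11"]
best_cost = min(
    expected_transitions(EDGES, dict(zip(("S1", "S2", "S3"), assignment)))
    for assignment in permutations(codes, 3)
)
print(round(best_cost, 2))
```

Exhaustive search is only practical for very small machines; it is used here just to show that the cost function can drive the choice of encoding.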
FSM: Clock-Gating

Moore machine: Outputs depend only on the state variables.

If a state has a self-loop in the state transition graph (STG), then the clock can be stopped whenever a self-loop is to be executed.

[Figure: STG fragment with states Si, Sj and Sk and edges labeled Xi/Zk, Xj/Zk and Xk/Zk, where Xk/Zk is the self-loop on Sk. The clock can be stopped when the (Xk, Sk) combination occurs.]
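The self-loop rule can be illustrated with a small behavioral simulation. The Python sketch below uses a hypothetical three-state machine with a single binary input; the state table and the input sequence are assumptions made for illustration, not the machine from the lecture. The clock activation logic enables the clock only when the next state differs from the present state.

```python
# Hypothetical Moore FSM state table: (present state, input) -> next state.
NEXT_STATE = {
    ("Si", 0): "Si", ("Si", 1): "Sk",
    ("Sj", 0): "Sk", ("Sj", 1): "Sj",
    ("Sk", 0): "Sk", ("Sk", 1): "Sj",
}


def clock_enable(state, x):
    """Clock activation logic: stop the clock when a self-loop would execute."""
    return NEXT_STATE[(state, x)] != state


def simulate(inputs, start="Si", gated=True):
    """Return (final state, number of clock pulses reaching the flip-flops)."""
    state, clock_pulses = start, 0
    for x in inputs:
        if not gated or clock_enable(state, x):
            state = NEXT_STATE[(state, x)]
            clock_pulses += 1
    return state, clock_pulses


INPUTS = [0, 0, 1, 0, 0, 0, 1, 1, 0]  # arbitrary example sequence
print(simulate(INPUTS, gated=False))  # every cycle clocks the flip-flops
print(simulate(INPUTS, gated=True))   # self-loop cycles are suppressed
```

Both runs end in the same state, since a suppressed clock pulse would only have reloaded the present state. In a Moore machine the outputs depend only on the state, so they are unaffected as well.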
Clock-Gating in Moore FSM
[Figure: block diagram of a clock-gated Moore FSM showing the primary inputs (PI), combinational logic, flip-flops, primary outputs (PO), clock activation logic, latch and clock CK.]
L. Benini and G. De Micheli, Dynamic Power Management, Boston: Springer, 1998.
Bus Encoding for Reduced Power

Example: Four-bit bus
  0000 → 1110 has three transitions.
  If bits of the second pattern are inverted, then 0000 → 0001 will have only one transition.

Bit-inversion encoding for N-bit bus:
[Figure: number of bit transitions after inversion encoding (vertical axis, 0 to N) plotted against the number of bit transitions (horizontal axis, 0 to N).]
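The four-bit example can be reproduced with a short sketch of the inversion rule in Python. It assumes the usual decision rule of inverting the data when more than N/2 lines would otherwise switch, and it counts transitions on the data lines only, ignoring the extra polarity line. Function names are illustrative.

```python
def hamming(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def bus_invert(prev_bus, data, n_bits):
    """Return (word driven on the bus, polarity bit) for one transfer.

    Assumed rule: invert the data if more than n_bits/2 data lines would
    switch. Transitions on the polarity line itself are not counted here.
    """
    mask = (1 << n_bits) - 1
    if hamming(prev_bus, data) > n_bits / 2:
        return ~data & mask, 1
    return data, 0


def decode(word, polarity, n_bits):
    """Receiver side: undo the inversion when the polarity bit is set."""
    mask = (1 << n_bits) - 1
    return ~word & mask if polarity else word


word, polarity = bus_invert(0b0000, 0b1110, 4)
print(format(word, "04b"), polarity, hamming(0b0000, word))
print(format(decode(word, polarity, 4), "04b"))
```

With this rule, at most N/2 data lines switch on any transfer, at the cost of one additional bus line for the polarity bit.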
Bus-Inversion Encoding Logic
[Figure: block diagram showing the sent data, polarity decision logic, bus register, polarity bit and received data.]
M. Stan and W. Burleson, “Bus-Invert Coding for Low Power I/O,” IEEE Trans. VLSI Systems, vol. 3, no. 1, pp. 49-58, March 1995.
Clock-Gating in Low-Power Flip-Flop
[Figure: flip-flop circuit with data input D, output Q and clock CK.]
Example: Benchmark S5378




TSMC025 CMOS technology
50ns clock
1,000 random vectors

          Number of      Number of      Power consumption in μW
Clock     comb. gates    flip-flops     Comb. gates    Flip-flops    Total
Ungated   2,958          179            330            752           1,082
Gated     3,316          179            276            32            308

Reference: J. D. Alexander, Simulation Based Power Estimation for Digital CMOS Technologies, Master’s Thesis, Auburn University, December 2008, Section 3.8.
Example: Shift Register
[Figure: shift register drawn as a chain of D flip-flops with data input D, a common clock CK and an Output.]
Reduced-Power Shift Register
[Figure: shift register split into two chains of D flip-flops clocked by CK(f/2), with a multiplexer producing the Output.]
Flip-flops are operated at full voltage and half the clock frequency.
Power Consumption of Shift Register
$P = C' V_{DD}^2 f / n$

16-bit shift register, 2μ CMOS

Deg. of parallelism    Freq (MHz)    Power (μW)
1                      33.0          1535
2                      16.5          887
4                      8.25          738

[Figure: normalized power (0.0 to 1.0) versus degree of parallelism, n (1, 2, 4).]

C. Piguet, “Circuit and Logic Level Design,” pages 103-133 in W. Nebel and J. Mermet (ed.), Low Power Design in Deep Submicron Electronics, Springer, 1997.
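The tabulated powers can be normalized to the n = 1 case and compared with a pure 1/n scaling. The Python sketch below uses only the three data points from the table; treating 1/n as the reference curve is an assumption that holds the capacitance term C' fixed, which the table does not confirm.

```python
# Degree of parallelism n -> (clock frequency in MHz, power in microwatts),
# copied from the table for the 16-bit shift register.
MEASURED = {1: (33.0, 1535.0), 2: (16.5, 887.0), 4: (8.25, 738.0)}


def normalized_power(measured):
    """Power for each n divided by the power of the n = 1 design."""
    base_power = measured[1][1]
    return {n: power / base_power for n, (_, power) in measured.items()}


def ideal_normalized_power(n):
    """Reference 1/n scaling, assuming the capacitance term stays constant."""
    return 1.0 / n


ratios = normalized_power(MEASURED)
for n, ratio in ratios.items():
    print(n, round(ratio, 3), ideal_normalized_power(n))
```

The measured ratios stay above the 1/n reference, so the saving from going to n = 4 is much smaller than the saving from going to n = 2.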
Books on Low-Power Design (1)







L. Benini and G. De Micheli, Dynamic Power Management: Design Techniques and CAD Tools, Boston: Springer, 1998.
T. D. Burd and R. A. Brodersen, Energy Efficient Microprocessor Design, Boston: Springer, 2002.
A. Chandrakasan and R. Brodersen, Low-Power Digital CMOS Design, Boston: Springer, 1995.
A. Chandrakasan and R. Brodersen, Low-Power CMOS Design, New York: IEEE Press, 1998.
J.-M. Chang and M. Pedram, Power Optimization and Synthesis at Behavioral and System Levels using Formal Methods, Boston: Springer, 1999.
D. Chinnery and K. Keutzer, Closing the Power Gap Between ASIC & Custom: Tools and Techniques for Low Power Design, Springer, 2007, ISBN 0387257632, 9780387257631.
M. S. Elrabaa, I. S. Abu-Khater and M. I. Elmasry, Advanced Low-Power Digital Circuit Techniques, Boston: Springer, 1997.
Books on Low-Power Design (2)







P. Girard, N. Nicolici and X. Wen, Power-Aware Testing and Test Strategies for Low Power Devices, Springer, 2010.
R. Graybill and R. Melhem, Power Aware Computing, New York: Plenum Publishers, 2002.
S. Iman and M. Pedram, Logic Synthesis for Low Power VLSI Designs, Boston: Springer, 1998.
M. Keating, D. Flynn, R. Aitken, A. Gibbons and K. Shi, Low Power Methodology Manual For System-on-Chip Design, 1st ed., corr. 2nd printing, Springer, 2007, ISBN 978-0-387-71818-7.
J. B. Kuo and J.-H. Lou, Low-Voltage CMOS VLSI Circuits, New York: Wiley-Interscience, 1999.
J. Monteiro and S. Devadas, Computer-Aided Design Techniques for Low Power Sequential Logic Circuits, Boston: Springer, 1997.
S. G. Narendra and A. Chandrakasan, Leakage in Nanometer CMOS Technologies, Boston: Springer, 2005.
Books on Low-Power Design (3)







W. Nebel and J. Mermet, Low Power Design in Deep Submicron Electronics, Boston: Springer, 1997.
N. Nicolici and B. M. Al-Hashimi, Power-Constrained Testing of VLSI Circuits, Boston: Springer, 2003.
V. G. Oklobdzija, V. M. Stojanovic, D. M. Markovic and N. Nedovic, Digital System Clocking: High Performance and Low-Power Aspects, Wiley-IEEE, 2005.
M. Pedram and J. M. Rabaey, Power Aware Design Methodologies, Boston: Springer, 2002.
C. Piguet, Low-Power Electronics Design, Boca Raton, Florida: CRC Press, 2005.
J. M. Rabaey and M. Pedram, Low Power Design Methodologies, Boston: Springer, 1996.
S. Roundy, P. K. Wright and J. M. Rabaey, Energy Scavenging for Wireless Sensor Networks, Boston: Springer, 2003.
Books on Low-Power Design (4)






K. Roy and S. C. Prasad, Low-Power CMOS VLSI Circuit Design, New York: Wiley-Interscience, 2000.
E. Sánchez-Sinencio and A. G. Andreou, Low-Voltage/Low-Power Integrated Circuits and Systems – Low-Voltage Mixed-Signal Circuits, New York: IEEE Press, 1999.
W. A. Serdijn, Low-Voltage Low-Power Analog Integrated Circuits, Boston: Springer, 1995.
S. Sheng and R. W. Brodersen, Low-Power Wireless Communications: A Wideband CDMA System Design, Boston: Springer, 1998.
G. Verghese and J. M. Rabaey, Low-Energy FPGAs, Boston: Springer, 2001.
G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Springer, 1998.
K.-S. Yeo and K. Roy, Low-Voltage Low-Power Subsystems, McGraw Hill, 2004.
Books Useful in Low-Power Design







A. Chandrakasan, W. J. Bowhill and F. Fox, Design of High-Performance Microprocessor Circuits, New York: IEEE Press, 2001.
R. C. Jaeger and T. N. Blalock, Microelectronic Circuit Design, Third Edition, McGraw-Hill, 2006.
S. M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits, New York: McGraw-Hill, 1996.
E. Larsson, Introduction to Advanced System-on-Chip Test Design and Optimization, Springer, 2005.
J. M. Rabaey, A. Chandrakasan and B. Nikolić, Digital Integrated Circuits, Second Edition, Upper Saddle River, New Jersey: Prentice-Hall, 2003.
J. Segura and C. F. Hawkins, CMOS Electronics, How It Works, How It Fails, New York: IEEE Press, 2004.
N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Reading, Massachusetts: Addison-Wesley, 2005.
Problem: Bus Encoding
A 1-hot encoding is to be used for reducing the capacitive power
consumption of an n-bit data bus. All n bits are assumed to be
independent and random. Derive a formula for the ratio of power
consumptions on the encoded and the un-coded buses. Show that
n ≥ 4 is essential for the 1-hot encoding to be beneficial.
Reference: A. P. Chandrakasan and R. W. Brodersen, Low Power
Digital CMOS Design, New York: Springer, 1995, pp. 224-225. [Hint:
You should be able to solve this problem without the help of the
reference.]
Solution: Bus Encoding
Un-coded bus: Two consecutive bits on a wire can be 00, 01, 10 and
11, each occurring with a probability 0.25. Considering only the 0→1
transition, which draws energy from the supply, the probability of a
data pattern consuming $CV^2$ energy on a wire is 1/4. Therefore, the
average per pattern energy for all n wires of the bus is $CV^2 n/4$.
Encoded bus: The encoded bus contains $2^n$ wires. The 1-hot encoding
ensures that whenever there is a change in the data pattern, exactly
one wire will have a 0→1 transition, charging its capacitance and
consuming $CV^2$ energy. There can be $2^n$ possible data patterns and
exactly one of these will match the previous pattern and consume no
energy. Thus, the per pattern energy consumption of the bus is 0 with
probability $2^{-n}$, and $CV^2$ with probability $1 - 2^{-n}$. The average per
pattern energy for the 1-hot encoded bus is $CV^2(1 - 2^{-n})$.
Solution: Bus Encoding (Cont.)
Power ratio = Encoded bus power / un-coded bus power = $4(1 - 2^{-n})/n \to 4/n$ for large n

For the encoding to be beneficial, the above power ratio should be
less than 1. That is, $4(1 - 2^{-n})/n < 1$, or $1 - 2^{-n} < n/4$. Because
$1 - 2^{-n}$ is slightly less than 1, this requires $n/4 \geq 1$
(approximately), that is, n ≥ 4.
The following table shows 1-hot encoded bus power ratio as a
function of bus width:

n     $4(1 - 2^{-n})/n$        n     $4(1 - 2^{-n})/n$
1     2.0000                   8     0.4980
2     1.5000                   16    0.2500 ≈ 1/4
3     1.1667                   32    ≈ 1/8
4     0.9375                   64    ≈ 1/16
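The ratio $4(1 - 2^{-n})/n$ can be evaluated exactly to confirm where it first drops below 1. The Python sketch below uses exact rational arithmetic from the standard library; the function name is illustrative.

```python
from fractions import Fraction


def one_hot_power_ratio(n):
    """Encoded / un-coded bus power for an n-bit bus: 4(1 - 2^-n)/n."""
    return 4 * (1 - Fraction(1, 2 ** n)) / n


for n in (1, 2, 3, 4, 8, 16, 32, 64):
    print(n, f"{float(one_hot_power_ratio(n)):.4f}")

# Smallest bus width for which the 1-hot encoding consumes less power.
smallest_beneficial = next(n for n in range(1, 65) if one_hot_power_ratio(n) < 1)
print(smallest_beneficial)
```

The comparison covers capacitive switching power only. The 1-hot bus needs $2^n$ wires, so the saving comes at a steep cost in wiring.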