Tezaswi Raja
Transmeta Corp., San Jose, CA, USA
Vishwani D. Agrawal
Dept. of ECE, Auburn University, AL, USA http://www.eng.auburn.edu/~vagrawal
Michael L. Bushnell
Dept. of ECE, Rutgers University, NJ, USA
Research Funded by: National Science Foundation
Jan 2005 Raja et al.: Low Power Design 1
Motivation
Background on Glitch Elimination Techniques
Problem Statement
New Variable Input Delay Logic
Transistor Level Design of Variable Input Delay Gate
Results
Physical Level Implementation
Conclusion and Future Work
[Figure: example of a glitch at a gate whose input paths have unequal delays (labels: Delay = 1, Delay = 2).]
Glitches occur due to differential (unbalanced) path delays.
Glitches are transients that are unnecessary for the correct functioning of the circuit.
Glitches waste power in CMOS circuits.
Delay Balancing for Glitch Elimination:
Balancing delays by adding buffers on select paths.
Ref: Chandrakasan and Brodersen and other books
Hazard Filtering for Glitch Elimination:
Glitch suppression by increasing the inertial delay of gates.
Ref: Agrawal et al., VLSI Design '97, '99, '03, '04.
Gate Sizing for Glitch Elimination:
Every gate is modeled as an equivalent inverter.
Model is non-linear
Ref: Berkelaar et al., IEEE Trans. on Circuits and Systems '96.
Transistor Sizing for Area-Speed Optimization:
Size the width and length of every transistor to get exact delay.
Model is non-linear
Convergence problems due to large search space.
Ref: Fishburn et al., ICCAD ’85.
[Figure: example circuit with unit gate delays; critical path delay = 3.]
Delay unit is the smallest delay possible for a gate in a given technology.
Critical Path is the longest delay path in the circuit and determines the speed of the circuit.
Example (cont.)
[Figure: example circuit with the signal arrival times at the inputs of the first gate shown on a time axis.]
For glitch free operation of first gate:
Differential delay at inputs < inertial delay
OK
Example (cont.)
[Figure: example circuit with the signal arrival times at the inputs of the second gate shown on a time axis.]
For glitch free operation of second gate:
Differential delay at inputs < inertial delay
OK (Assuming equality does not produce a glitch)
Example (cont.)
[Figure: example circuit with the signal arrival times at the inputs of the third gate shown on a time axis.]
For glitch free operation of third gate:
Differential delay at inputs < inertial delay
Not true for gate 3
Example (cont.)
[Figure: example circuit with an added delay buffer, with signal arrival times shown on a time axis.]
For glitch free operation with no IO delay increase:
Must add a delay buffer.
Buffer is necessary for conventional gate design – only gate output delay is controllable.
Controllable Input Delay Gates
[Figure: example circuit with a controllable input delay in place of the buffer, with signal arrival times shown on a time axis.]
Assume gate input delays to be controllable.
Glitches can be suppressed without buffers.
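The example can be checked numerically. The sketch below assumes a chain of three unit-delay gates in which every gate has one input driven directly by a primary input (arriving at time 0) and, for the second and third gates, one input driven by the previous gate. This topology, the function names, and the input delay value of 2 are assumptions made for illustration; they are inferred from the example, not read off the figures.

```python
def arrival_spread(arrivals):
    """Differential delay at a gate: latest minus earliest input arrival."""
    return max(arrivals) - min(arrivals)


def glitch_free(arrivals, inertial_delay):
    """True when the arrival spread does not exceed the inertial delay
    (equality is assumed not to produce a glitch)."""
    return arrival_spread(arrivals) <= inertial_delay


def check_chain(gate_delays, input_delays):
    """Check every gate of the assumed chain.

    Gate k sees one primary input (time 0 plus an optional input delay)
    and either a second primary input (first gate) or the output of
    gate k-1. Returns (list of per-gate booleans, output arrival time).
    """
    results = []
    prev_out = None
    for k, d in enumerate(gate_delays):
        arrivals = [0 + input_delays[k]]
        arrivals.append(0 if prev_out is None else prev_out)
        results.append(glitch_free(arrivals, d))
        prev_out = max(arrivals) + d
    return results, prev_out


# Conventional gates: no input delay anywhere.
conventional, conv_delay = check_chain([1, 1, 1], [0, 0, 0])
# Variable input delay gates: delay the early input of the third gate by 2.
variable_input, var_delay = check_chain([1, 1, 1], [0, 0, 2])

print(conventional, conv_delay)
print(variable_input, var_delay)
```

Under these assumptions only the third gate violates the condition with conventional gates, and the added input delay removes the violation without lengthening the critical path.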
Find a glitch reduction technique such that:
All glitches are eliminated in the circuit.
No delay buffers are inserted in the circuit.
Circuit operates at the highest possible speed permitted by the device technology.
Technique should be scalable for large circuits.
Circuits are realizable at the physical level of design.
Note: The objective is to minimize switching power. Hence, no attempt is made to reduce short-circuit and leakage power, which are an order of magnitude lower for present CMOS technologies; those components of power may be addressed in future research.
I/O path delay through a gate = Input Delay + Output Delay
Output Delay
Propagation delay through a gate from the inputs to the outputs.
Input Delay
Extra delay that can be added on a single I/O path through the gate, which can be controlled independently of the other input delays.
Variable Input Delay Logic
Logic level design of circuits using components with variable input and output delays along different I/O paths through the gate.
[Figure: gate 3 with inputs 1 and 2; the path from input 1 has delay $d_{3,1} + d_3$ and the path from input 2 has delay $d_{3,2} + d_3$.]
Separate the output (inertial) and input delay variables.
$d_3$ - output delay of the gate.
$d_{3,1}$ - input delay of the gate along the path from 1 to 3.
Technology constraint: $0 \le d_{3,1}, d_{3,2} \le u_b$
Input delay difference has an upper bound, which we define as the Gate Input Differential Delay Upper Bound ($u_b$).
$u_b$ is a measure of the maximum difference in delay of any two I/O paths through the gate that can be designed in a given CMOS technology.
Arbitrary input delays cannot be realized in practice due to the technology limitation at the transistor and layout levels.
The bound $u_b$ is the limit of flexibility allowed by the technology to the designer at the transistor and layout levels.
The following feasibility condition must be imposed while determining delays for glitch suppression:
$0 \le d_{i,j} \le u_b$
We propose two new LPs for designing circuits based on the specifications of the design:
MDP LP: the circuit consumes the least power possible and operates at the highest possible speed for that power.
DS LP: the circuit meets a given delay requirement, and does so by adding the smallest number of buffers.
[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6, and 7. Gate 5 is fed by 1 and 2, gate 6 by 2 and 3, and gate 7 by 5, 6, and 4. Each path into a gate is labeled with its delay, e.g. $d_{5,1} + d_5$.]
Gate inertial delay variables $d_5, \ldots, d_7$.
Gate input delay variables $d_{i,j}$ from input $j$, for every path through gate $i$.
Corresponding window variables $t_5, \ldots, t_7$ and $T_5, \ldots, T_7$.
Inertial delay constraint for gate 5:
$d_5 \ge 1$
Input delay (feasibility) constraints for gate 5:
$0 \le d_{5,1} \le u_b$
$0 \le d_{5,2} \le u_b$
Differential delay constraints for gate 5:
$T_5 > T_1 + d_{5,1} + d_5$;
$T_5 > T_2 + d_{5,2} + d_5$;
$t_5 < t_1 + d_{5,1} + d_5$;
$t_5 < t_2 + d_{5,2} + d_5$;
$d_5 > T_5 - t_5$;
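As a rough illustration of what these constraints encode, the sketch below computes the timing window $[t_i, T_i]$ of each gate in the example circuit for given delays and checks that the inertial delay covers the window. It is a checker, not an LP solver. The fan-in table follows the path labels of the example circuit (gate 5 fed by 1 and 2, gate 6 by 2 and 3, gate 7 by 5, 6, and 4); the function names, the assumption that primary inputs switch at time 0, the treatment of equality as glitch free, and the sample delay values are all illustrative assumptions.

```python
# Fan-in of each gate in the example circuit; nodes 1-4 are primary inputs.
FANIN = {5: [1, 2], 6: [2, 3], 7: [5, 6, 4]}
PRIMARY_INPUTS = [1, 2, 3, 4]


def timing_windows(gate_delay, input_delay):
    """Return (t, T): earliest and latest arrival time at every node.

    gate_delay[i]       -- inertial (output) delay d_i of gate i
    input_delay[(i, j)] -- input delay d_{i,j} on the path from j into gate i
    Primary inputs are assumed to switch at time 0.
    """
    t = {pi: 0 for pi in PRIMARY_INPUTS}
    T = {pi: 0 for pi in PRIMARY_INPUTS}
    for i in sorted(FANIN):  # gates are numbered in topological order
        early = [t[j] + input_delay[(i, j)] + gate_delay[i] for j in FANIN[i]]
        late = [T[j] + input_delay[(i, j)] + gate_delay[i] for j in FANIN[i]]
        t[i] = min(early)  # earliest possible output event
        T[i] = max(late)   # latest possible output event
    return t, T


def glitch_free_gates(gate_delay, input_delay):
    """Map each gate to True when d_i covers its window T_i - t_i."""
    t, T = timing_windows(gate_delay, input_delay)
    return {i: gate_delay[i] >= T[i] - t[i] for i in FANIN}


# Sample delays (assumed): gate 5 is slower than gates 6 and 7.
delays = {5: 2, 6: 1, 7: 1}
no_input_delay = {(i, j): 0 for i in FANIN for j in FANIN[i]}
before = glitch_free_gates(delays, no_input_delay)

# Add input delays on the two early paths into gate 7.
balanced = dict(no_input_delay)
balanced[(7, 6)] = 1
balanced[(7, 4)] = 2
after = glitch_free_gates(delays, balanced)
_, T_after = timing_windows(delays, balanced)

print(before)
print(after, T_after[7])
```

In this sample, gate 7 fails the check until the early paths are delayed, and the latest arrival at the output stays the same because only the non-critical paths were slowed.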
IO delay constraint for each PO in the circuit:
$T_7 \le$ maxdelay; maxdelay is the parameter which gives the delay of the critical path.
This determines the speed of operation of the circuit.
Objective Function:
Minimize maxdelay;
This gives the fastest possible, minimum dynamic power consuming circuit, given the feasibility condition for the technology.
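Collecting the constraint families shown for gate 5, the complete MDP LP can be written in a general form. The version below is a sketch under stated assumptions: the index sets $G$ (gates), $F(i)$ (fan-in nodes of gate $i$), and $PO$ (primary outputs) are notation introduced here, and the differential delay constraints are written with non-strict inequalities, as an LP solver requires, whereas the slides write them with $>$ and $<$.

```latex
\begin{align*}
\text{minimize}\quad & \mathit{maxdelay} \\
\text{subject to}\quad
 & d_i \ge 1, && i \in G \\
 & 0 \le d_{i,j} \le u_b, && i \in G,\ j \in F(i) \\
 & T_i \ge T_j + d_{i,j} + d_i, && i \in G,\ j \in F(i) \\
 & t_i \le t_j + d_{i,j} + d_i, && i \in G,\ j \in F(i) \\
 & d_i \ge T_i - t_i, && i \in G \\
 & T_k \le \mathit{maxdelay}, && k \in PO
\end{align*}
```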
[Figure: plot of power versus maxdelay. Labels: previous solutions; new MDP LP solutions; power consumed by buffers; minimum dynamic power; $u_b = 0, 5, 10, 15, \infty$; fastest possible design in any technology.]
The DS LP applies if the design needs to meet a given delay specification and the designer is willing to sacrifice some dynamic power by inserting buffers.
Modifications to MDP LP
Insert buffer variables at every fanout stem and branch and at the PIs (similar to the linear constraint set method by Raja et al.).
maxdelay is a given parameter, which is the maximum delay of the critical path according to the specification.
Components of the LP
Gate constraints – unchanged
Input delay (feasibility) constraints – unchanged for same $u_b$
Differential delay constraints – unchanged
Maxdelay constraints – unchanged but maxdelay is a given parameter.
Objective function:
Minimize $\sum_{j \in \text{buffers}} d_j$
[Figure: plot of power versus maxdelay. Labels: previous solutions; new MDP LP solutions; new DS LP solutions; power consumed by buffers; minimum dynamic power; $u_b = 0, 5, 10, 15, \infty$; fastest possible design in any technology.]
[Figure: RC model of the paths $d_{3,1}$ and $d_{3,2}$ into gate 3; labels: $R_{on}$, $C_r$, $C_{in}$, $C_p$.]
Conventional CMOS gate design:
Delay = $R_{on} (C_{routing} + C_{input})$
Energy = $0.5 (C_r + C_{in}) V^2$
Delay can be changed by changing the resistance or the capacitance.
Resistance does not affect energy per transition.
Possible implementations of the variable input delay gate:
Capacitance manipulation method where the input capacitance offered by the respective transistor pair is varied.
Pass transistor added design where an extra transistor is added to increase the resistance and thereby the input delay.
We propose the addition of:
Single nMOS transistor
CMOS pass transistor
We describe the single nMOS transistor added design in detail here. The other two are documented in the thesis.
[Figure: RC model with a series nMOS transistor of resistance $R_s$ inserted in the path $d_{3,1}$; labels: $R_{on}$, $R_s$, $C_r$, $C_{in}$.]
$d_{3,1} = R_{on} (C_r + C_{in}) + R_s C_{in}$ = Output + Input delay
$d_{3,2} = R_{on} (C_r + C_{in})$
Energy = $0.5 (C_r + C_{in}) V^2$
The input delay can be added by an nMOS transistor in series with the desired path.
The addition of resistance does not increase the energy per transition.
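The first-order expressions above can be evaluated directly to see that a series resistance changes the path delay but not the switching energy. The sketch below is only an illustration: the function names and every component value are hypothetical, and the input delay term is taken to be $R_s C_{in}$.

```python
def gate_path_delay(r_on, c_r, c_in, r_s=0.0):
    """First-order delay of one I/O path: R_on (C_r + C_in) + R_s C_in."""
    return r_on * (c_r + c_in) + r_s * c_in


def energy_per_transition(c_r, c_in, vdd):
    """Switching energy 0.5 (C_r + C_in) V^2; no resistance term appears."""
    return 0.5 * (c_r + c_in) * vdd ** 2


# Hypothetical values chosen only for illustration (ohms, farads, volts).
R_ON, R_S = 10e3, 20e3
C_R, C_IN = 5e-15, 2e-15
VDD = 1.8

d_32 = gate_path_delay(R_ON, C_R, C_IN)           # path without the series nMOS
d_31 = gate_path_delay(R_ON, C_R, C_IN, r_s=R_S)  # path with the series nMOS
input_delay = d_31 - d_32                         # extra delay due to R_s
energy = energy_per_transition(C_R, C_IN, VDD)    # same for both paths

print(f"d_32 = {d_32 * 1e12:.1f} ps, d_31 = {d_31 * 1e12:.1f} ps")
print(f"input delay = {input_delay * 1e12:.1f} ps, energy = {energy * 1e15:.2f} fJ")
```

The energy function takes no resistance argument at all, which mirrors the statement that the added resistance does not change the energy per transition in this model.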
Too large a $u_b$ cannot be realized in practice due to noise issues.
Increased resistance degrades the slope of a signal, and we use the CMOS gate following it to regenerate the slope.
The regenerative capability of a gate is limited, and this determines the practical $u_b$ value.
The slope allowed in a design depends on the noise specifications of the circuit.
Advantages:
Almost completely independent control of input delays.
$u_b$ is very high compared to the capacitance manipulation method.
Much lower overhead than a conventional buffer.
Can be integrated into full-custom as well as standard cell place and route design flows.
Design Issues:
nMOSFET degrades the signal when passing logic 1. Hence, it increases the leakage of the transistors in the fanout stages.
However, this is for certain input combinations only.
Short circuit current is a function of the ratio of input/output slopes. Since inserting resistance degrades the input slope, it might increase short circuit power by a minor amount.
[Figure: RC model with a CMOS pass transistor of resistance $R_s$ inserted in the path $d_{3,1}$; labels: $R_{on}$, $R_s$, $C_r$, $C_{in}$.]
$d_{3,1} = R_{on} (C_r + C_{in}) + R_s C_{in}$ = Output + Input delay
$d_{3,2} = R_{on} (C_r + C_{in})$
Energy = $0.5 (C_r + C_{in}) V^2$
The input delay can be added by the input CMOS pass transistor in series with the desired path.
This does not degrade the signal, as both transistors together conduct both logic values well.
[Flowchart: delay required -> look-up table for sizes -> transistor sizes -> error acceptable? If yes, stop; if no, find the sensitivity of each transistor size to delay, increment that transistor dimension, and check again.]
Determine sizes of transistors in a gate for the given delay and given load capacitance.
First guess is given by the look-up table.
Second stage is sensitivity driven.
Reduces the complexity of transistor search.
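The two-stage search can be sketched as follows. This is a toy illustration, not the sizing tool used in this work: the RC delay model, the look-up table contents, the step size, the tolerance, and all function names are hypothetical, and only the case where the first guess is too slow is handled.

```python
def stack_delay(widths, unit_resistance, c_load):
    """Toy RC model (an assumption): series transistors, each with
    resistance unit_resistance / width, driving c_load."""
    return c_load * sum(unit_resistance / w for w in widths)


def size_gate(target, c_load, lookup, unit_resistance=1.0, step=0.1,
              tol=0.02, max_iter=1000):
    """Look-up table first guess, then sensitivity-driven increments
    until the delay is within the relative tolerance of the target."""
    # Stage 1: take the table entry whose tabulated delay is closest.
    widths = list(min(lookup, key=lambda entry: abs(entry[0] - target))[1])
    delay = stack_delay(widths, unit_resistance, c_load)

    # Stage 2: while too slow, widen the transistor whose increment
    # reduces the delay the most (finite-difference sensitivity).
    for _ in range(max_iter):
        if delay <= target * (1 + tol):
            break
        gains = []
        for k in range(len(widths)):
            trial = widths.copy()
            trial[k] += step
            gains.append(delay - stack_delay(trial, unit_resistance, c_load))
        best = max(range(len(widths)), key=gains.__getitem__)
        widths[best] += step
        delay = stack_delay(widths, unit_resistance, c_load)
    return widths, delay


# Hypothetical table: (tabulated delay, widths of a two-transistor stack).
LOOKUP = [(2.0, (1.0, 1.0)), (1.0, (2.0, 2.0)), (0.5, (4.0, 4.0))]
widths, delay = size_gate(target=0.8, c_load=1.0, lookup=LOOKUP)
print(widths, round(delay, 3))
```

Starting from a tabulated guess keeps the number of sensitivity iterations small, which is the point of the two-stage approach.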
Results for Speed of Circuit Using MDP LP
Maxdelay is normalized to the length of the critical path when all gates are of unit delay.
Each curve is a different benchmark circuit.
As we increase $u_b$, the circuit becomes faster.
Flexibility required for fastest operation of the circuit is proportional to the size of the circuit.
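Table (values are listed column by column in source order; the source does not preserve the row alignment).
Column headers: $u_b$; Circuit; No. of vectors; maxdelay; Norm. delay; Original power (Avg., Peak); Optimized power (Avg., Peak).
Circuit: c432, c499, c880, c1355, c1908, c2670, c3540, c5315, c6288, c7552
First numeric column: 144, 82, 200, 157, 56, 54, 78, 87, 141, 158
Second numeric column: 173, 35, 347, 542, 71, 34, 45, 67, 124, 50
Third numeric column: 4.17, 2.26, 1.50, 2.05, 4.32, 1.09, 7.38, 11.06, 1.87, 1.16
Original power, Avg.: 1.0 for all ten circuits
Original power, Peak: 1.0 for all ten circuits
Optimized power, Avg.: 0.54, 0.68, 0.53, 0.53, 0.65, 0.70, 0.48, 0.47, 0.22, 0.28
Optimized power, Peak: 0.44, 0.56, 0.43, 0.44, 0.55, 0.65, 0.45, 0.36, 0.18, 0.26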
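Table (values are listed column by column in source order, ten values for five circuits; the source does not preserve the row alignment).
Column headers: Circuit; Norm. Maxdelay; $u_b$; Conventional gates (Raja et al., VLSI Design '03): Avg., Peak, Buffers; Variable input delay gates: Avg., Peak, Buffers.
Circuit: c432, c499, c880, c1355, c1908
Norm. Maxdelay / $u_b$ column: 2.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0
Conventional gates, Avg.: 0.72, 0.62, 0.91, 0.70, 0.68, 0.68, 0.58, 0.57, 0.69, 0.59
Conventional gates, Peak: 0.67, 0.60, 0.87, 0.66, 0.54, 0.52, 0.48, 0.48, 0.59, 0.44
Conventional gates, Buffers: 95, 66, 48, 0, 62, 34, 224, 192, 219, 70
Variable input delay gates, Avg.: 0.69, 0.65, 0.86, 0.71, 0.58, 0.56, 0.48, 0.44, 0.56, 0.55
Variable input delay gates, Peak: 0.66, 0.55, 0.84, 0.65, 0.45, 0.45, 0.42, 0.39, 0.46, 0.45
Variable input delay gates, Buffers: 0, 1, 0, 64, 61, 0, 0, 32, 5, 4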
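Table (values are listed column by column in source order, ten values for five circuits; the source does not preserve the row alignment).
Column headers: Circuit; Norm. Maxdelay; $u_b$; Power (conventional gates) (Raja et al., VLSI Design '03): Avg., Peak, Buffers; Power (variable input delay gates): Avg., Peak, Buffers.
Circuit: c2670, c3540, c5315, c6288, c7552
Norm. Maxdelay / $u_b$ column: 2.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0
Conventional gates, Avg.: 0.79, 0.71, 0.64, 0.58, 0.63, 0.60, 0.40, 0.36, 0.38, 0.36
Conventional gates, Peak: 0.65, 0.58, 0.44, 0.46, 0.52, 0.45, 0.36, 0.34, 0.34, 0.32
Conventional gates, Buffers: 157, 35, 239, 140, 280, 171, 294, 120, 366, 111
Variable input delay gates, Avg.: 0.70, 0.69, 0.57, 0.54, 0.57, 0.55, 0.91, 0.21, 0.28, 0.27
Variable input delay gates, Peak: 0.56, 0.57, 0.46, 0.43, 0.48, 0.46, 0.87, 0.16, 0.24, 0.24
Variable input delay gates, Buffers: 1, 26, 4, 584, 2, 0, 3, 0, 1, 0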
[Figure: the example circuit (primary inputs 1 to 4, gates 5 to 7) in three versions, annotated with gate delays d = 1 or d = 2: unoptimized circuit, buffer optimized circuit, and nMOS optimized circuit.]
[Figure: waveforms versus time for the unoptimized circuit, the buffer optimized circuit, and the nMOS optimized circuit.]
[Flowchart: physical design flow. Labels: AMPL; delays; technology mapping; transistor sizes; create cells using Prolific; standard cell library; standard cell place and route; layout; extract routing capacitance; routing load; routing acceptable? (yes / no); analog power simulations; energy consumption; optimized layout.]
c7552 un-optimized
Gate Count = 3827
Transistor Count ≈ 40,000
Critical Delay = 2.15 ns
Area = 710 x 710 um^2
c7552 optimized ($u_b$ = 10)
Gate Count = 3828
Transistor Count ≈ 45,000
Critical Delay = 2.15 ns
Area = 760 x 760 um^2 (1.14)
Patents
V. D. Agrawal, “Low Power Circuits Through Hazard Pulse Suppression,” U.S. Patent 5,983,007, November 1999.
T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic and Its Application to Low Power Design,” to be submitted to USPTO through Rutgers Univ., May 2004.
Dissertations
T. Raja, Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program, MS Thesis, Dept. of ECE, Rutgers University, May 2002.
T. Raja, Minimum Dynamic Power CMOS Design with Variable Input Delay Logic, PhD Thesis, Dept. of ECE, Rutgers University, May 2004.
S. Uppalapati, Low Power Design of Standard Cell Digital VLSI Circuits, MS Thesis, Dept. of ECE, Rutgers University, October 2004.
V. D. Agrawal, “Low-Power Design by Hazard Filtering,” Proc. 10th Int. Conf. VLSI Design, Jan. 1997, pp. 193-197.
V. D. Agrawal, M. L. Bushnell, G. Parthasarathy, and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and a Linear Programming Method,” Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp. 434-439.
T. Raja, V. D. Agrawal, and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th Int. Conf. VLSI Design, Jan. 2003, pp. 527-532.
T. Raja, V. D. Agrawal, and M. L. Bushnell, “CMOS Circuit Design for Minimum Dynamic Power and Highest Speed,” Proc. 17th Int. Conf. VLSI Design, Jan. 2004, pp. 1035-1040.
Main idea: Minimum dynamic power high speed circuits can be designed if gates with variable input delays are used.
The new design suppresses all glitches without any delay buffers.
Decreases power without loss in speed and with very little increase in area.
Developed a linear program solution to demonstrate the idea.
Developed new gate design for transistor level implementation.
Results have been verified by physical layout design of large circuits.
Results show average power savings up to 58%.
The technique is easily scalable to large circuits.
Leakage power remains a concern – ongoing research.