Minimum Dynamic Power Design Using
Variable Input Delay CMOS Logic
Vishwani D. Agrawal
Dept. of ECE, Auburn University, AL, USA
http://www.eng.auburn.edu/~vagrawal
Tezaswi Raja
Transmeta Corp., San Jose, CA, USA
Michael L. Bushnell
Dept. of ECE, Rutgers University, NJ, USA
Research Funded by: National Science Foundation
Jan 2005
Agrawal: Low Power Design
1
Talk Outline








Motivation
Background on Glitch Elimination Techniques
Problem Statement
New Variable Input Delay Logic
Transistor Level Design of Variable Input Delay Gate
Results
Physical Level Implementation
Conclusion and Future Work
What Are Glitches?
[Figure: gate fed by two paths with unbalanced delays (Delay = 1 and Delay = 2).]

Glitches occur due to differential (unbalanced) path delays.
Glitches are transients that are unnecessary for the correct
functioning of the circuit.

Glitches waste power in CMOS circuits.

Prior work

Delay Balancing for Glitch Elimination:
  Balancing delays by adding buffers on select paths.
  Ref: Chandrakasan and Brodersen and other books.

Hazard Filtering for Glitch Elimination:
  Glitch suppression by increasing the inertial delay of gates.
  Ref: Agrawal et al., VLSI Design '97, '99, '03, '04.

Gate Sizing for Glitch Elimination:
  Every gate is modeled as an equivalent inverter.
  Model is non-linear.
  Ref: Berkelaar et al., IEEE Trans. on Circuits and Systems '96.

Transistor Sizing for Area-Speed Optimization:
  Size the width and length of every transistor to get the exact delay.
  Model is non-linear.
  Convergence problems due to large search space.
  Ref: Fishburn et al., ICCAD '85.
Example: Why Were Buffers Necessary?
[Figure: example circuit with unit-delay gates; critical path delay = 3.]


Delay unit is the smallest delay possible for a
gate in a given technology.
Critical Path is the longest delay path in the
circuit and determines the speed of the circuit.
Example (cont.)
[Figure: example circuit with signal waveforms plotted against time.]

For glitch-free operation of the first gate:
Differential delay at inputs < inertial delay: OK
Example (cont.)
[Figure: example circuit with signal waveforms plotted against time.]

For glitch-free operation of the second gate:
Differential delay at inputs < inertial delay: OK (assuming equality does not produce a glitch)
Example (cont.)
[Figure: example circuit with signal waveforms plotted against time.]

For glitch-free operation of the third gate:
Differential delay at inputs < inertial delay: not true for gate 3
Example (cont.)
[Figure: example circuit with signal waveforms plotted against time.]

For glitch-free operation with no I/O delay increase:
A delay buffer must be added.
A buffer is necessary with conventional gate design, where only the gate output delay is controllable.
Controllable Input Delay Gates
[Figure: example circuit with signal waveforms plotted against time.]

Assume gate input delays to be controllable.
Glitches can be suppressed without buffers.
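The glitch condition used in the example slides can be sketched in a few lines of Python. This is only an illustration: the function names and the arrival times (0 and 2) are assumptions chosen for the sketch, and equality is treated as glitch-free, as the example assumes.

```python
def differential_delay(arrival_times):
    """Spread between the latest and earliest signal arrival at a gate's inputs."""
    return max(arrival_times) - min(arrival_times)


def is_glitch_free(arrival_times, inertial_delay):
    """True when the arrival spread does not exceed the gate's inertial delay.

    Equality is treated as glitch-free, as the example slides assume.
    """
    return differential_delay(arrival_times) <= inertial_delay


def balancing_input_delays(arrival_times):
    """Extra delay per input so that every input arrives with the latest one.

    Full balancing is one simple choice; smaller input delays also work as
    long as the remaining spread stays within the inertial delay.
    """
    latest = max(arrival_times)
    return [latest - t for t in arrival_times]


# Assumed numbers: one input arrives at time 0, the other at time 2,
# and the gate has an inertial delay of 1.
arrivals = [0, 2]
print(is_glitch_free(arrivals, inertial_delay=1))   # False

extra = balancing_input_delays(arrivals)
print(extra)                                        # [2, 0]

balanced = [t + e for t, e in zip(arrivals, extra)]
print(is_glitch_free(balanced, inertial_delay=1))   # True
```

With a conventional gate the extra delay of 2 on the early input would have to come from a buffer; with a controllable input delay it is absorbed into the gate itself.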
Problem Statement

Find a glitch reduction technique such that:
All glitches are eliminated in the circuit.
No delay buffers are inserted in the circuit.
Circuit operates at the highest possible speed permitted by the device technology.
Technique should be scalable for large circuits.
Circuits are realizable at the physical level of design.

Note: The objective is to minimize switching power. Hence, no attempt is made to
reduce short-circuit and leakage power, which are an order of magnitude lower for
present CMOS technologies; those components of power may be addressed in
future research.
New Variable Input Delay Logic
I/O path delay through a gate = Input Delay + Output Delay

Output Delay
  Propagation delay through a gate from the inputs to the outputs.

Input Delay
  Extra delay that can be added on a single I/O path through the gate, which can be controlled independently of the other input delays.

Variable Input Delay Logic
  Logic level design of circuits using components with variable input and output delays along different I/O paths through the gate.
Delay Model for a New Gate
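[Figure: gate 3 with inputs 1 and 2; the path from input 1 has delay d3,1 + d3 and the path from input 2 has delay d3,2 + d3.]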

Separate the output (inertial) and input delay variables.
d3 - output delay of the gate.
d3,1 - input delay of the gate along path from 1 to 3.

Technology constraint:
  0 ≤ d3,1, d3,2 ≤ ub
Input delay difference has an upper bound, which we define as the Gate Input Differential Delay Upper Bound (ub).
Gate Input Differential Delay Upper Bound (ub)

It is a measure of the maximum difference in delay of any
two I/O paths through the gate, that can be designed in a
given CMOS technology.

Arbitrary input delays cannot be realized in practice due to the
technology limitation at the transistor and layout levels.
The bound ub is the limit of flexibility allowed by the
technology to the designer at the transistor and layout levels.
The following feasibility condition must be imposed while
determining delays for glitch suppression:
  0 ≤ di,j ≤ ub
New Linear Programs

We propose two new LPs for designing circuits
based on the specifications of the design.

Minimum dynamic power (MDP) LP
  Where the circuit consumes the least power possible and operates at the highest possible speed for that power.

Delay specification (DS) LP
  Where the circuit meets a given delay requirement but does it by adding the smallest number of buffers.
New MDP LP Example
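[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6, 7. Gate 5 has inputs 1 and 2, gate 6 has inputs 2 and 3, and gate 7 has inputs 5, 6 and 4. Each path through gate i from input j is labeled with its delay di,j + di.]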



Gate inertial delay variables d5 ... d7.
Gate input delay variables di,j for every path through gate i from input j.
Corresponding window variables t5 ... t7 and T5 ... T7.
New MDP LP Example (cont.)
[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6, 7; each path through gate i from input j is labeled di,j + di.]

Inertial delay constraint for gate 5:
  d5 ≥ 1
Input delay (feasibility) constraints for gate 5:
  0 ≤ d5,1 ≤ ub
  0 ≤ d5,2 ≤ ub
New MDP LP Example (cont.)
[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6, 7; each path through gate i from input j is labeled di,j + di.]

Differential delay constraints for gate 5:
  T5 > T1 + d5,1 + d5;    t5 < t1 + d5,1 + d5;
  T5 > T2 + d5,2 + d5;    t5 < t2 + d5,2 + d5;
  d5 > T5 - t5;
New MDP LP Example (cont.)
[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6, 7; each path through gate i from input j is labeled di,j + di.]

I/O delay constraint for each PO in the circuit:
  T7 ≤ maxdelay;
maxdelay is the parameter that gives the delay of the critical path.
This determines the speed of operation of the circuit.
New MDP LP Example (cont.)
[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6, 7; each path through gate i from input j is labeled di,j + di.]

Objective function:
  minimize maxdelay;
This gives the fastest possible circuit with minimum dynamic power, given the feasibility condition for the technology.
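The constraints of the MDP LP can be exercised on the example circuit (gates 5, 6, 7 fed by primary inputs 1 to 4) with a small checker. The sketch below does not solve the LP; it only evaluates a candidate delay assignment against the inertial, feasibility, differential and maxdelay constraints. The dictionary encoding, the function names and the delay values are assumptions made for illustration, the timing windows are taken as tight (min and max over fanin paths), and equality is allowed in the differential constraint.

```python
# Hypothetical encoding of the example circuit: gate -> fanin nodes.
FANIN = {5: [1, 2], 6: [2, 3], 7: [5, 6, 4]}
PRIMARY_INPUTS = [1, 2, 3, 4]
PRIMARY_OUTPUT = 7


def timing_windows(d, d_in):
    """Return (t, T): earliest and latest signal arrival time at every node.

    d[i] is the inertial (output) delay of gate i and d_in[(i, j)] is the
    input delay on the path through gate i from input j.
    """
    t = {pi: 0 for pi in PRIMARY_INPUTS}
    T = {pi: 0 for pi in PRIMARY_INPUTS}
    for gate in sorted(FANIN):  # gate numbers are already in topological order
        t[gate] = min(t[j] + d_in[(gate, j)] + d[gate] for j in FANIN[gate])
        T[gate] = max(T[j] + d_in[(gate, j)] + d[gate] for j in FANIN[gate])
    return t, T


def check_design(d, d_in, ub):
    """Return (feasible, glitch_free, maxdelay) for a candidate assignment."""
    t, T = timing_windows(d, d_in)
    feasible = (all(d[g] >= 1 for g in FANIN)
                and all(0 <= x <= ub for x in d_in.values()))
    glitch_free = all(d[g] >= T[g] - t[g] for g in FANIN)
    return feasible, glitch_free, T[PRIMARY_OUTPUT]


no_input_delay = {(g, j): 0 for g in FANIN for j in FANIN[g]}
d = {5: 2, 6: 1, 7: 1}  # assumed inertial delays

# Without input delays, gate 7 sees arrivals spread over 2 time units.
print(check_design(d, no_input_delay, ub=10))   # (True, False, 3)

# Input delays on gate 7 align its three inputs without any buffer.
tuned = dict(no_input_delay)
tuned[(7, 6)] = 1
tuned[(7, 4)] = 2
print(check_design(d, tuned, ub=10))            # (True, True, 3)
```

In this sketch the tuned assignment keeps maxdelay at 3, but it needs an input delay of 2, so it stops being feasible once ub drops below 2.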
Solution Curves
[Figure: power versus maxdelay. Curves: previous solutions, and new MDP LP solutions for ub = 0, 5, 10, 15 and ∞. Annotations: power consumed by buffers; minimum dynamic power; fastest possible design in any technology.]
Delay Specification LP


Use this LP when the design needs to meet a given delay
specification and the designer is willing to sacrifice
some dynamic power by inserting buffers.

Modifications to MDP LP:
  Insert buffer variables at every fanout stem and branch and at PIs (similar to the linear constraint set method by Raja et al.).
  maxdelay is a given parameter, which is the maximum delay of the critical path according to the specification.
Delay Specification LP

Components of the LP:
  Gate constraints: unchanged.
  Input delay (feasibility) constraints: unchanged for the same ub.
  Differential delay constraints: unchanged.
  Maxdelay constraints: unchanged, but maxdelay is a given parameter.
  Objective function: minimize Σ dj, where j ∈ buffers.
Solution Curves
[Figure: power versus maxdelay. Curves: previous solutions, new MDP LP solutions and new DS LP solutions for ub = 0, 5, 10, 15 and ∞. Annotations: power consumed by buffers; minimum dynamic power; fastest possible design in any technology.]
Transistor Level Implementation
[Figure: RC model of a gate with input paths d3,1 and d3,2, showing the driver on-resistance Ron, routing capacitance Cr, input capacitance Cin and Cp.]

Conventional CMOS gate design:
  Delay = Ron (Cr + Cin)
  Energy = 0.5 (Cr + Cin) V²
Delay can be changed by changing the resistance or the capacitance.
Resistance does not affect energy per transition.
Transistor Level Implementation

Possible implementations of the variable input delay gate:
  Capacitance manipulation method, where the input capacitance offered by the respective transistor pair is varied.
  Pass transistor added design, where an extra transistor is added to increase the resistance and thereby the input delay.
  We propose the addition of:
    Single nMOS transistor
    CMOS pass transistor

We describe the single nMOS transistor added design in detail
here. The other two are documented in the thesis.
Single nMOSFET Added Design
[Figure: RC model of the gate with a series resistance Rs added on input path d3,1; input path d3,2 is unchanged.]

d3,1 = Ron (Cr + Cin) + Rs Cin
d3,2 = Ron (Cr + Cin)
Energy = 0.5 (Cr + Cin) V²

d3,1 = output delay + input delay
The input delay can be added by an nMOS transistor in series with
the desired path.
The addition of resistance does not increase the energy per
transition.
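A short numerical sketch of the RC expressions above shows how a series resistance lengthens one input path while leaving the energy per transition unchanged. The component values (10 kΩ, 5 fF, 20 kΩ, 2.5 V) and the function names are assumptions picked for round numbers, not figures from the talk.

```python
def path_delay(r_on, c_r, c_in, r_s=0.0):
    """First-order path delay: Ron (Cr + Cin) + Rs Cin, with Rs = 0 by default."""
    return r_on * (c_r + c_in) + r_s * c_in


def energy_per_transition(c_r, c_in, vdd):
    """Switching energy 0.5 (Cr + Cin) V^2; note that no resistance appears."""
    return 0.5 * (c_r + c_in) * vdd ** 2


# Assumed values, for illustration only.
R_ON = 10e3    # driver on-resistance, ohms
R_S = 20e3     # added series nMOS resistance, ohms
C_R = 5e-15    # routing capacitance, farads
C_IN = 5e-15   # gate input capacitance, farads
VDD = 2.5      # supply voltage, volts

d32 = path_delay(R_ON, C_R, C_IN)          # path without the added transistor
d31 = path_delay(R_ON, C_R, C_IN, R_S)     # path with the added transistor
energy = energy_per_transition(C_R, C_IN, VDD)

print(f"d3,2 = {d32 * 1e12:.0f} ps")       # d3,2 = 100 ps
print(f"d3,1 = {d31 * 1e12:.0f} ps")       # d3,1 = 200 ps
print(f"input delay = {(d31 - d32) * 1e12:.0f} ps")
print(f"energy = {energy * 1e15:.2f} fJ for both paths")
```

Under these assumed values the added transistor doubles the delay of one path, and the energy expression is identical for both paths because it depends only on the capacitances and the supply voltage.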
Effect of Input Slope
[Figure: signal passing through the series resistance Rs.]

A very large ub cannot be realized in practice due to noise issues.
Increased resistance degrades the slope of a signal, and we use the
CMOS gate following it to regenerate the slope.
The regenerative capability of a gate is limited, and this determines
the practical ub value.
The slope allowed in a design depends on the noise specifications
of the circuit.
Single nMOSFET Added Design

Advantages:
  Almost completely independent control of input delays.
  ub is very high compared to the capacitance manipulation method.
  Much lower overhead than a conventional buffer.
  Can be integrated into full-custom as well as standard cell place and route design flows.

Design Issues:
  The nMOSFET degrades the signal when passing logic 1. Hence, it increases the leakage of the transistors in the fanout stages. However, this happens for certain input combinations only.
  Short-circuit current is a function of the ratio of input/output slopes. Since inserting resistance changes the input slope, short-circuit power may increase by a minor amount.
CMOS Pass Transistor Added Design
[Figure: RC model of the gate with a CMOS pass transistor of resistance Rs added on input path d3,1; input path d3,2 is unchanged.]

d3,1 = Ron (Cr + Cin) + Rs Cin
d3,2 = Ron (Cr + Cin)
Energy = 0.5 (Cr + Cin) V²

d3,1 = output delay + input delay
The input delay can be added by a CMOS pass transistor
in series with the desired path.
This does not degrade the signal, as both transistors together
conduct both logic values well.
Technology Mapping
[Figure: flowchart. Delay required → look-up table for sizes → transistor sizes → error acceptable? If yes, done; if no, evaluate the sensitivity of each transistor size to delay, increment that transistor dimension, and check again.]

Determine sizes of transistors in a gate for the given delay and
given load capacitance.
First guess is given by the look-up table.
Second stage is sensitivity driven.
Reduces the complexity of transistor search.
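The two-stage search described above can be sketched as follows. Everything in this block is hypothetical: `toy_delay` is a stand-in delay model (delay proportional to length over width) rather than the model used in the talk, the first guess plays the role of the look-up table entry, and the step and tolerance values are arbitrary.

```python
def toy_delay(sizes, load=1.0):
    """Hypothetical delay model: each transistor contributes load * L / W."""
    return sum(load * length / width for width, length in sizes)


def size_for_delay(target, first_guess, step=0.1, tol=0.05, max_iter=1000):
    """Refine a look-up-table guess until the delay error is acceptable.

    Each iteration measures how much the delay changes when one dimension
    (width or length) of one transistor grows by `step`, then increments the
    dimension that moves the delay most strongly toward the target.
    """
    sizes = [list(s) for s in first_guess]   # [width, length] per transistor
    for _ in range(max_iter):
        current = toy_delay(sizes)
        error = target - current
        if abs(error) <= tol:
            break
        best = None                          # (delay change, transistor, dimension)
        for i in range(len(sizes)):
            for k in (0, 1):                 # 0 = width, 1 = length
                trial = [s[:] for s in sizes]
                trial[i][k] += step
                change = toy_delay(trial) - current
                helps = change * error > 0   # moves the delay toward the target
                if helps and (best is None or abs(change) > abs(best[0])):
                    best = (change, i, k)
        if best is None:
            break                            # no increment helps; give up
        sizes[best[1]][best[2]] += step
    return [tuple(s) for s in sizes]


# First guess as a look-up table might supply it: two unit-size transistors.
guess = [(1.0, 1.0), (1.0, 1.0)]
result = size_for_delay(target=3.0, first_guess=guess)
print(result, round(toy_delay(result), 3))
```

Starting from a reasonable first guess keeps the number of sensitivity evaluations small, which is the reduction in search complexity that the slide refers to.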
Results for Speed of Circuit Using MDP LP

Maxdelay is normalized to the length of the critical path when all gates are of unit delay.

Each curve is a different benchmark circuit.
As we increase ub the circuit becomes faster.
Flexibility required for fastest operation of circuit is proportional to the size of the circuit.


Power Opt. Using MDP LP (for ub=10)
Circuit   No. of    maxdelay   Norm.    Original power    Optimized power
          vectors              delay    Avg.     Peak     Avg.     Peak
c432      56        71         4.17     1.0      1.0      0.65     0.55
c499      54        34         2.26     1.0      1.0      0.70     0.65
c880      78        45         1.50     1.0      1.0      0.48     0.45
c1355     87        67         2.05     1.0      1.0      0.47     0.36
c1908     144       173        4.32     1.0      1.0      0.54     0.44
c2670     82        35         1.09     1.0      1.0      0.68     0.56
c3540     200       347        7.38     1.0      1.0      0.53     0.43
c5315     157       542        11.06    1.0      1.0      0.53     0.44
c6288     141       124        1.87     1.0      1.0      0.22     0.18
c7552     158       50         1.16     1.0      1.0      0.28     0.26
Power Opt. Using DS LP (for ub=10)
Conventional gates: Raja et al., VLSI Design '03.

Circuit   Norm.       Conventional gates          Variable input delay gates
          Maxdelay    Avg.    Peak    Buffers     Avg.    Peak    Buffers
c432      1.0         0.72    0.67    95          0.69    0.66    61
          2.0         0.62    0.60    66          0.65    0.55    0
c499      1.0         0.91    0.87    48          0.86    0.84    0
          2.0         0.70    0.66    0           0.71    0.65    0
c880      1.0         0.68    0.54    62          0.58    0.45    1
          2.0         0.68    0.52    34          0.56    0.45    0
c1355     1.0         0.58    0.48    224         0.48    0.42    64
          2.0         0.57    0.48    192         0.44    0.39    32
c1908     1.0         0.69    0.59    219         0.56    0.46    5
          2.0         0.59    0.44    70          0.55    0.45    4
Power Opt. Using DS LP (for ub=10)
Conventional gates: Raja et al., VLSI Design '03.

Circuit   Norm.       Power (conventional gates)  Power (variable input delay gates)
          Maxdelay    Avg.    Peak    Buffers     Avg.    Peak    Buffers
c2670     1.0         0.79    0.65    157         0.70    0.56    2
          2.0         0.71    0.58    35          0.69    0.57    0
c3540     1.0         0.64    0.44    239         0.57    0.46    3
          2.0         0.58    0.46    140         0.54    0.43    1
c5315     1.0         0.63    0.52    280         0.57    0.48    26
          2.0         0.60    0.45    171         0.55    0.46    4
c6288     1.0         0.40    0.36    294         0.91    0.87    584
          2.0         0.36    0.34    120         0.21    0.16    0
c7552     1.0         0.38    0.34    366         0.28    0.24    1
          2.0         0.36    0.32    111         0.27    0.24    0
Example Circuit
[Figure: the example circuit (primary inputs 1 to 4, gates 5, 6, 7) in three versions: unoptimized circuit, buffer optimized circuit, and nMOS optimized circuit, with gate delays labeled d = 1 or d = 2.]
Example Circuit – Spectre Results
[Figure: Spectre waveforms versus time for the unoptimized circuit, the buffer optimized circuit, and the nMOS optimized circuit.]
Physical Level Verification
[Figure: flowchart. AMPL → delays → technology mapping → transistor sizes → create cells using Prolific → standard cell library → standard cell place and route → layout → extract routing capacitance → routing load → routing acceptable? (No: iterate; Yes: optimized layout) → analog power simulations → energy consumption.]
Layouts of c7552 (0.25 µm CMOS)

                    c7552 un-optimized    c7552 optimized (ub = 10)
Gate count          3827                  3828
Transistor count    ≈ 40,000              ≈ 45,000
Critical delay      2.15 ns               2.15 ns
Area                710 x 710 µm²         760 x 760 µm² (1.14)
Instantaneous Power Savings
Peak Power Savings = 68%
Patents and Dissertations
Patents

V. D. Agrawal, “Low Power Circuits Through Hazard Pulse
Suppression,” U.S. Patent 5,983,007, November 1999.
T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay
CMOS Logic and Its Application to Low Power Design,” to be
submitted to USPTO through Rutgers Univ., May 2004.


Dissertations

T. Raja, Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program, MS Thesis, Dept. of ECE, Rutgers University, May 2002.
T. Raja, Minimum Dynamic Power CMOS Design with Variable Input Delay Logic, PhD Thesis, Dept. of ECE, Rutgers University, May 2004.
S. Uppalapati, Low Power Design of Standard Cell Digital VLSI Circuits, MS Thesis, Dept. of ECE, Rutgers University, October 2004.
Papers





V. D. Agrawal, “Low-Power Design by Hazard Filtering,” Proc. 10th Int. Conf.
VLSI Design, Jan. 1997, pp. 193-197.
V. D. Agrawal, M. L. Bushnell, G. Parthasarathy, and R. Ramadoss, “Digital
Circuit Design for Minimum Transient Energy and a Linear Programming
Method,” Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp. 434-439.
T. Raja, V. D. Agrawal, and M. L. Bushnell, “Minimum Dynamic Power CMOS
Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th Int. Conf.
VLSI Design, Jan. 2003, pp. 527-532.
T. Raja, V. D. Agrawal, and M. L. Bushnell, “CMOS Circuit Design for
Minimum Dynamic Power and Highest Speed,” Proc. 17th Int. Conf. VLSI Design,
Jan. 2004, pp. 1035-1040.
T. Raja, V. D. Agrawal, and M. L. Bushnell, “Variable Input Delay CMOS Logic
for Low Power Design,” Proc. 18th Int. Conf. VLSI Design, Jan. 2005, pp. 368-374.
Conclusion

Main idea: Minimum dynamic power high speed circuits can be
designed if gates with variable input delays are used.

The new design suppresses all glitches without any delay buffers.

Decreases power without loss of speed and with very little increase in area.

Developed a linear program solution to demonstrate the idea.

Developed new gate design for transistor level implementation.

Results have been verified by physical layout design of large circuits.

Results show average power savings up to 58%.

Technique easily scalable for large circuits.

Leakage power remains a concern – ongoing research.
ILP Optimization of Leakage by
Dual-Threshold Devices
70 nm CMOS, 90 °C, SPICE evaluation.