Design of Variable Input Delay Gates
for Low Dynamic Power Circuits
Tezaswi Raja, Transmeta Corp., Santa Clara, CA
Vishwani D. Agrawal, Dept. of ECE, Auburn University
Michael L. Bushnell, Dept. of ECE, Rutgers University
Research Funded by:
National Science Foundation
Talk Outline





Motivation
Transistor Level Design of Variable Input Delay Gate
Results
References
Conclusion and Future Work
Sep 23, 2005
Tezaswi Raja: PATMOS Conf. Leuven.
2
Motivation: Variable Input Delay Gates
[Figure: three versions of an example circuit with numeric annotations]
Unoptimized: produces glitches and wastes power.
Buffer Optimized: glitches removed, but active power is consumed in the buffer and leakage paths are added through the buffer.
Variable Input Delay Gate: glitches removed; no extra leakage paths added.
Issues:
Can we design such a gate?
How much can the delays through IO paths differ?
Problem Statement

Design a gate at the transistor level such that:
The gate has different delays along different IO paths.
The maximum achievable difference in delay between any two paths through the gate (ub) can be quantified.

Transistor Level Implementation



We propose three new implementations of the variable input delay gate:
Capacitance manipulation method, where the input capacitance offered by the respective transistor pair is varied.
Pass-transistor-added design, where an extra transistor is added to increase the resistance and thereby the input delay. We propose adding either of the following:
Single nMOS transistor
CMOS pass transistor
We describe the pass-transistor-added designs in detail here. The capacitance manipulation method is documented in the paper.
Concept of Increasing Resistance
[Figure: RC model of a gate, labeled Ron, Cr, and Cin]
Delay = Ron (Cp + Cr + Cin)
Energy = 0.5 (Cr + Cin) V^2

Need a CMOS gate with different delays along different IO
paths.

Note that the resistance of the path influences only the delay and not
the energy consumed.
Hence, adding more resistance can be the best way to add delay
without wasting more energy.

Solution: Add another transistor in series to the path.
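The claim that resistance changes the delay but not the energy can be checked numerically. The following is a minimal sketch of the two first-order formulas on this slide; the function names and all component values are hypothetical and chosen only for illustration.

```python
import math


def rc_delay(r_on, c_p, c_r, c_in):
    """First-order path delay from the slide: Delay = Ron (Cp + Cr + Cin)."""
    return r_on * (c_p + c_r + c_in)


def switching_energy(c_r, c_in, vdd):
    """Energy per transition from the slide: 0.5 (Cr + Cin) V^2.

    No resistance term appears, so added series resistance leaves it unchanged.
    """
    return 0.5 * (c_r + c_in) * vdd ** 2


# Hypothetical values (not from the talk): farads and volts.
C_P, C_R, C_IN, VDD = 1e-15, 2e-15, 2e-15, 2.5

for r_on in (5e3, 10e3, 20e3):  # ohms
    delay = rc_delay(r_on, C_P, C_R, C_IN)
    energy = switching_energy(C_R, C_IN, VDD)
    print(f"Ron = {r_on:7.0f} ohm  delay = {delay:.2e} s  energy = {energy:.2e} J")
```

In this model the delay scales linearly with the resistance while the energy column stays constant.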
Single nMOSFET Added Design
[Figure: gate with an nMOS pass transistor of resistance Rs in series with one input; labels Ron, Cr, Cin, Rs; IO paths d3,1 and d3,2]
d3,1 = Ron (Cr + Cin) + Rs Cin (output delay + input delay)
d3,2 = Ron (Cr + Cin)
Energy = 0.5 (Cr + Cin) V^2
Input delay is added by placing an nMOS transistor in series with the desired input path.
The added resistance does not increase the energy per transition.
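The two path-delay expressions on this slide differ only by the term Rs Cin, which is the input delay contributed by the series transistor. A small sketch under that first-order model follows; the function name and the numeric values are assumptions for illustration, not data from the talk.

```python
import math


def path_delays(r_on, r_s, c_r, c_in):
    """Return (d3,1, d3,2) for a gate whose first input has series resistance Rs.

    d3,1 = Ron (Cr + Cin) + Rs Cin   (output delay + input delay)
    d3,2 = Ron (Cr + Cin)            (output delay only)
    """
    output_delay = r_on * (c_r + c_in)
    input_delay = r_s * c_in
    return output_delay + input_delay, output_delay


# Hypothetical values: ohms and farads.
d_31, d_32 = path_delays(r_on=10e3, r_s=40e3, c_r=2e-15, c_in=2e-15)
print(f"d3,1 = {d_31:.2e} s, d3,2 = {d_32:.2e} s, difference = {d_31 - d_32:.2e} s")
```

Under these assumptions the differential delay between the two inputs is set entirely by Rs and Cin.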
Theoretical Calculation of ub
[Figure: nMOS pass transistor of resistance Rs passing logic 1 (received level 1 - λ - IdsRs) and logic 0 (received level 0 + IdsRs); operating regions marked Linear and Cutoff]
Logic 1 transmission, for pMOS cutoff (pMOS threshold): 1 - λ - IdsRs > Vdd - Vtp
Logic 0 transmission, for nMOS cutoff (nMOS threshold): IdsRs < Vtn
Constraints give upper bound on Rs and λ.
Upper bound on Rs determines upper bound on ub.
Can be made specific to any technology.
Note: nMOS conducts logic '0' well, but '1' is degraded (shown by λ).
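As a rough illustration of how the two cutoff constraints bound Rs, the sketch below solves each inequality for Rs and takes the tighter one. It rests on assumptions that are not stated on the slide: logic 1 is taken to equal Vdd, Vtp is treated as a magnitude, Ids is treated as a known constant, and all numeric values are hypothetical. How the resulting bound maps onto ub depends on how ub is normalized, which is not covered here.

```python
import math


def rs_upper_bound(i_ds, v_tn, v_tp, lam=0.0):
    """Exclusive upper bound on Rs implied by the two cutoff constraints.

    Logic 0: Ids*Rs < Vtn                   ->  Rs < Vtn / Ids
    Logic 1: Vdd - lam - Ids*Rs > Vdd - Vtp ->  Rs < (Vtp - lam) / Ids
    (assumes logic 1 == Vdd and Vtp given as a positive magnitude)
    """
    if i_ds <= 0:
        raise ValueError("i_ds must be positive")
    logic1_margin = v_tp - lam
    if logic1_margin <= 0:
        raise ValueError("degradation lam already exceeds the pMOS threshold")
    return min(v_tn, logic1_margin) / i_ds


def max_added_input_delay(rs_max, c_in):
    """Largest input delay Rs*Cin the series element can add under the bound."""
    return rs_max * c_in


# Hypothetical values: amps, volts, farads.
rs_max = rs_upper_bound(i_ds=1e-4, v_tn=0.5, v_tp=0.6, lam=0.3)
print(f"Rs must stay below {rs_max:.0f} ohm")
print(f"added input delay stays below {max_added_input_delay(rs_max, 2e-15):.2e} s")
```

Setting `lam=0.0` gives the corresponding bound for a pass element that does not degrade logic 1.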
Effect of Input Slope




Theoretical ub cannot be realized in practice due to noise issues.
Increased resistance degrades the slope of a signal and we use the
CMOS gate following it to regenerate the slope.
The regenerative capability of a gate is limited, and this limit governs the practical value of ub.
The slope allowed in a design depends on the noise specifications
of the circuit.
Single nMOSFET Added Design

Advantages:





Completely independent control of input delays.
ub is much higher than with the capacitance manipulation method.
Much lower overhead than a conventional buffer.
Can be integrated into full-custom as well as standard-cell place-and-route design flows.
Design Issues:


The nMOSFET degrades the signal when passing logic 1, which increases the leakage of the transistors in the fanout stages. However, this occurs for certain input combinations only.
Short-circuit current is a function of the ratio of input to output slopes. Because the inserted resistance slows the input transition, it might increase short-circuit power by a minor amount.
CMOS Pass Transistor Added Design
[Figure: gate with a CMOS pass transistor of resistance Rs in series with one input; labels Ron, Cr, Cin, Rs; IO paths d3,1 and d3,2]
d3,1 = Ron (Cr + Cin) + Rs Cin (output delay + input delay)
d3,2 = Ron (Cr + Cin)
Energy = 0.5 (Cr + Cin) V^2
Input delay is added by placing a CMOS pass transistor in series with the desired input path.
This does not degrade the signal, because the two transistors together conduct both logic values well.
Theoretical Calculation of ub
[Figure: CMOS pass transistor of resistance Rs passing logic 1 (received level 1 - IdsRs) and logic 0 (received level 0 + IdsRs); operating regions marked Linear and Cutoff]
Logic 1 transmission, for pMOS cutoff (pMOS threshold): 1 - IdsRs > Vdd - Vtp
Logic 0 transmission, for nMOS cutoff (nMOS threshold): IdsRs < Vtn
Constraints give upper bound on Rs and λ
Upper bound on Rs determines upper bound on ub
Can be made specific to any technology.
Note that Rs is the parallel combination of the resistances of the two transistors.
CMOS Pass Transistor Added Design

Advantages:
No signal degradation for either logic value.
No increase in leakage current in the fanout stage.
All other advantages of the nMOSFET-added design.


Design Issues:
Two transistors are added instead of one.
Effective resistance per unit length is lower because the two transistor resistances combine in parallel.

Technology Mapping
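[Flowchart: the required delay indexes a look-up table for sizes, which yields the transistor sizes. If the error is acceptable, the sizes are final. If not, the sensitivity of each transistor size to delay is computed, that transistor dimension is incremented, and the error is checked again.]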
Determine sizes of transistors in a gate for the given delay and
given load capacitance.
First guess is given by the look-up table.
Second stage is sensitivity driven.
Reduces the complexity of transistor search.
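The two-stage search described above can be sketched as a small loop: a table supplies the first guess, and a finite-difference sensitivity check then picks which dimension to increment. This is only an illustrative sketch, not the authors' implementation. The look-up table, the delay model, the step size, and the tolerance are all hypothetical stand-ins; in practice the delay would come from circuit simulation.

```python
def size_gate(target_delay, load_cap, lookup, delay_model,
              step=0.05, tol=0.02, max_iter=1000):
    """Return transistor dimensions whose modeled delay is within tol of target."""
    sizes = list(lookup(target_delay, load_cap))  # stage 1: look-up table guess
    for _ in range(max_iter):
        error = abs(delay_model(sizes, load_cap) - target_delay)
        if error <= tol * target_delay:
            return sizes  # error acceptable
        # Stage 2: estimate how incrementing each dimension changes the error.
        trial_errors = []
        for i in range(len(sizes)):
            bumped = list(sizes)
            bumped[i] += step
            trial_errors.append(abs(delay_model(bumped, load_cap) - target_delay))
        best = min(range(len(sizes)), key=trial_errors.__getitem__)
        if trial_errors[best] >= error:
            raise RuntimeError("no single increment reduces the delay error")
        sizes[best] += step  # increment the most helpful dimension
    raise RuntimeError("did not converge within max_iter")


def toy_lookup(target_delay, load_cap):
    """Hypothetical table entry: [gate width, pass-transistor length]."""
    return [2.0, 0.5]


def toy_delay_model(sizes, load_cap):
    """Hypothetical model: wider gate is faster, longer pass transistor is slower."""
    width, length = sizes
    return load_cap * (1.0 / width + 0.5 * length)


sizes = size_gate(1.0, 1.0, toy_lookup, toy_delay_model)
print("sizes:", sizes, "delay:", toy_delay_model(sizes, 1.0))
```

Starting from the table guess keeps the number of sensitivity evaluations small, which is the complexity reduction the slide refers to.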
Physical Level Verification
                    c7552 un-optimized    c7552 optimized (ub = 10)
Gate Count          3827                  3828
Transistor Count    ≈ 40,000              ≈ 45,000
Critical Delay      2.15 ns               2.15 ns
Area                710 x 710 um^2        760 x 760 um^2 (1.14x)
Instantaneous Power Savings
Peak Power Savings = 68%
Average Energy Savings
Average Energy Savings = 58%
Related Publications

Theses
1. “Minimum Dynamic Power Design with Variable Input Delay Logic,” PhD Thesis, Dept. of Elec. and Comp. Eng., Rutgers University, May 2004.
2. “Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program,” MS Thesis, Dept. of Elec. and Comp. Eng., Rutgers University, May 2002.

Journal Papers
1. T. Raja, V. D. Agrawal and M. L. Bushnell, “Low Power CMOS Design for Minimum Power and Highest Speed using a New Gate Design,” submitted to IEEE Transactions on VLSI (IEEETVLSI), April 2005.

Conference Papers
1. T. Raja, V. D. Agrawal and M. L. Bushnell, “Design of Variable Input Delay Logic for Low Dynamic Power Circuits,” Proc. of PATMOS Conf., Sep 2005.
2. T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay Logic and its Application to Low Power Design,” Proc. 18th Int’l. Conference on VLSI Design, Jan 2005.
3. T. Raja, V. D. Agrawal and M. L. Bushnell, “CMOS Design of Circuits for Minimum Power and Highest Speed,” Proc. 17th Int’l. Conference on VLSI Design, Jan 2004.
4. T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program,” Proc. 16th Int’l. Conf. on VLSI Design, pp. 527-532, Jan 2003.
Conclusion

Pass transistors (nMOS and CMOS) can be used as delay elements instead of a buffer.

The size of the transmission gate is limited by:

Input slope degradation

Signal degradation when passing a high signal through nMOS.

Transmission gate can be used for delay as long as the delay does not exceed ub.

Described the technique to calculate ub for a given technology.

Described the algorithm for sizing the three variable input delay gates for given delay requirements.

Presented results on power savings using these new gates.

FUTURE WORK:

Include Leakage power in the analysis.

Analyze results for more recent technologies.
Thank you
Design Issues and FAQ

Is this not similar to input re-ordering techniques?
Input re-ordering can change only the rise delay or the fall delay, but not both.
The capacitance manipulation method also lacks completely independent control over both rise and fall delays, whereas input re-ordering offers none.
The ub obtained by input re-ordering is much smaller than what can be obtained by capacitance manipulation.

Design Issues and FAQ

Does this increase leakage power?
Observed no increase for 0.25 um technology.
Need to investigate for present technologies.
Can be complemented with known leakage reduction techniques.

How big should the standard cell library be?
For c7552 with 3827 gates, we needed 155 different standard cells generated by Prolific.
Area can be further reduced if these cells are custom designed.

Transistor Overhead



1,4 – nMOS added design (for maxdelay = 1 and 2)
2,5 – CMOS added design (for maxdelay = 1 and 2)
3,6 – Buffer added design (for maxdelay = 1 and 2)