Variable Input Delay CMOS Logic for Low Power Design

Tezaswi Raja

Transmeta Corp., San Jose, CA, USA

Vishwani D. Agrawal

Dept. of ECE, Auburn University, AL, USA http://www.eng.auburn.edu/~vagrawal

Michael L. Bushnell

Dept. of ECE, Rutgers University, NJ, USA

Research Funded by: National Science Foundation

Jan 2005 Raja et al.: Low Power Design 1

Talk Outline

Motivation

Background on Glitch Elimination Techniques

Problem Statement

New Variable Input Delay Logic

Transistor Level Design of Variable Input Delay Gate

Results

Physical Level Implementation

Conclusion and Future Work


What Are Glitches?

[Figure: example circuit with unequal path delays (Delay = 1 and Delay = 2).]

Glitches occur due to differential (unbalanced) path delays.

Glitches are transients that are unnecessary for the correct functioning of the circuit.

Glitches waste power in CMOS circuits.
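As a rough illustration of how unbalanced path delays create a glitch, the sketch below evaluates a hypothetical circuit that is not taken from the slides: an ideal zero-delay AND gate fed by a signal through a path of delay 1 and by its complement through a path of delay 2. The steady-state output is 0 before and after the input changes, so the 1 that appears in between is a glitch.

```python
def delayed(signal, delay):
    """Return a function of time giving `signal` shifted later by `delay`."""
    return lambda t: signal(t - delay)

# Hypothetical primary input: a step from 0 to 1 at time 0.
a = lambda t: 1 if t >= 0 else 0
# Its complement, as produced by an ideal inverter.
not_a = lambda t: 1 - a(t)

path1 = delayed(a, 1)      # path with delay 1
path2 = delayed(not_a, 2)  # path with delay 2

# Output of an ideal (zero-delay) AND gate sampled at times -1, 0, 1, 2, 3.
waveform = [path1(t) & path2(t) for t in range(-1, 4)]
print(waveform)  # [0, 0, 1, 0, 0]
```

If both paths had the same delay, the two inputs would change together and the transient 1 would not appear.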


Prior work

Delay Balancing for Glitch Elimination:

Balancing delays by adding buffers on select paths.

Ref: Chandrakasan and Brodersen and other books

Hazard Filtering for Glitch Elimination:

Glitch suppression by increasing the inertial delay of gates.

Ref: Agrawal et al., VLSI Design ’97, ’99, ’03, ’04.

Gate Sizing for Glitch Elimination:

Every gate is modeled as an equivalent inverter.

Model is non-linear.

Ref: Berkelaar et al., IEEE Trans. on Circuits and Systems ’96.

Transistor Sizing for Area-Speed Optimization:

Size the width and length of every transistor to get exact delay.

Model is non-linear.

Convergence problems due to large search space.

Ref: Fishburn et al., ICCAD ’85.


Example: Why Were Buffers Necessary?

[Figure: example circuit with unit gate delays. Critical path delay = 3.]

Delay unit is the smallest delay possible for a gate in a given technology.

Critical Path is the longest delay path in the circuit and determines the speed of the circuit.

Example (cont.)

[Figure: example circuit annotated with gate delays and signal arrival times on a time axis; first gate highlighted.]

For glitch free operation of first gate:

Differential delay at inputs < inertial delay

OK

Example (cont.)

[Figure: example circuit annotated with gate delays and signal arrival times on a time axis; second gate highlighted.]

 For glitch free operation of second gate:

Differential delay at inputs < inertial delay

OK (Assuming equality does not produce a glitch)

Example (cont.)

[Figure: example circuit annotated with gate delays and signal arrival times on a time axis; third gate highlighted.]

 For glitch free operation of third gate:

Differential delay at inputs < inertial delay

Not true for gate 3

Example (cont.)

[Figure: example circuit annotated with gate delays and signal arrival times on a time axis, with a delay buffer added.]

 For glitch free operation with no IO delay increase:

Must add a delay buffer.

 Buffer is necessary for conventional gate design – only gate output delay is controllable.

Controllable Input Delay Gates

[Figure: example circuit redrawn with controllable gate input delays.]

 Assume gate input delays to be controllable


 Glitches can be suppressed without buffers

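The glitch-free condition used throughout the example can be sketched in a few lines. This is a minimal illustration, not the authors' tool: the function names and the arrival times are hypothetical, and equality between the differential delay and the inertial delay is treated as acceptable, following the assumption stated for the second gate.

```python
def is_glitch_free(arrivals, inertial_delay):
    """Differential delay at the inputs must not exceed the inertial delay.

    Equality is assumed not to produce a glitch.
    """
    return max(arrivals) - min(arrivals) <= inertial_delay

def aligning_input_delays(arrivals):
    """Extra delay per input that makes every input arrive with the latest one."""
    latest = max(arrivals)
    return [latest - a for a in arrivals]

arrivals = [2, 0]   # hypothetical arrival times at a two-input gate
inertial = 1        # hypothetical inertial delay of that gate

print(is_glitch_free(arrivals, inertial))            # False
extra = aligning_input_delays(arrivals)
print(extra)                                         # [0, 2]
aligned = [a + e for a, e in zip(arrivals, extra)]
print(is_glitch_free(aligned, inertial))             # True
```

With conventional gates the extra delay has to come from a buffer on the early input; with controllable input delays it is absorbed by the gate itself.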

Problem Statement

Find a glitch reduction technique such that:

All glitches are eliminated in the circuit.

No delay buffers are inserted in the circuit.

Circuit operates at the highest possible speed permitted by the device technology.

Technique should be scalable for large circuits.

Circuits are realizable at the physical level of design.

Note: The objective is to minimize switching power. Hence, no attempt is made to reduce short-circuit and leakage power, which are an order of magnitude lower for present CMOS technologies; those components of power may be addressed in future research.


New Variable Input Delay Logic

I/O path delay through a gate = Input Delay + Output Delay

Output Delay

Propagation delay through a gate from the inputs to the outputs.

Input Delay

Extra delay that can be added on a single I/O path through the gate, which can be controlled independently of the other input delays.

Variable Input Delay Logic

Logic level design of circuits using components with variable input and output delays along different I/O paths through the gate.


Delay Model for a New Gate

[Figure: gate 3 with inputs 1 and 2. The path from input 1 has delay $d_{3,1} + d_3$; the path from input 2 has delay $d_{3,2} + d_3$.]

Separate the output (inertial) and input delay variables.

$d_3$: output delay of the gate.

$d_{3,1}$: input delay of the gate along the path from 1 to 3.

Technology constraint:

$0 \le d_{3,1}, d_{3,2} \le u_b$

Input delay difference has an upper bound, which we define as the Gate Input Differential Delay Upper Bound ($u_b$).

Gate Input Differential Delay Upper Bound ($u_b$)

It is a measure of the maximum difference in delay of any two I/O paths through the gate, that can be designed in a given CMOS technology.

Arbitrary input delays cannot be realized in practice due to the technology limitation at the transistor and layout levels.

The bound $u_b$ is the limit of flexibility allowed by the technology to the designer at the transistor and layout levels.

The following feasibility condition must be imposed while determining delays for glitch suppression:

$0 \le d_{i,j} \le u_b$

New Linear Programs

We propose two new LPs for designing circuits based on the specifications of the design.

Minimum dynamic power (MDP) LP

The circuit consumes the least dynamic power possible and operates at the highest speed attainable at that power.

Delay specification (DS) LP

The circuit meets a given delay requirement while adding the smallest number of buffers.

New MDP LP Example

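[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6 and 7. Gate 5 has inputs 1 and 2, gate 6 has inputs 2 and 3, and gate 7 has inputs 5, 6 and 4. Path delays are labeled $d_{5,1} + d_5$, $d_{5,2} + d_5$, $d_{6,2} + d_6$, $d_{6,3} + d_6$, $d_{7,5} + d_7$, $d_{7,6} + d_7$ and $d_{7,4} + d_7$.]

Gate inertial delay variables $d_5 \ldots d_7$.

Gate input delay variables $d_{i,j}$ from input $j$, for every path through gate $i$.

Corresponding window variables $t_5 \ldots t_7$ and $T_5 \ldots T_7$.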

New MDP LP Example (cont.)

[Figure: the example circuit with its path delay labels.]

Inertial delay constraint for gate 5:

$d_5 \ge 1$

Input delay (feasibility) constraints for gate 5:

$0 \le d_{5,1} \le u_b$

$0 \le d_{5,2} \le u_b$

New MDP LP Example (cont.)

[Figure: the example circuit with its path delay labels.]

Differential delay constraints for gate 5:

$T_5 > T_1 + d_{5,1} + d_5$;

$T_5 > T_2 + d_{5,2} + d_5$;

$t_5 < t_1 + d_{5,1} + d_5$;

$t_5 < t_2 + d_{5,2} + d_5$;

$d_5 > T_5 - t_5$;

New MDP LP Example (cont.)

[Figure: the example circuit with its path delay labels.]

IO delay constraint for each PO in the circuit:

$T_7 \le \text{maxdelay}$; maxdelay is the parameter which gives the delay of the critical path.

This determines the speed of operation of the circuit.

New MDP LP Example (cont.)

[Figure: the example circuit with its path delay labels.]

Objective function:

minimize maxdelay;

This gives the fastest possible, minimum dynamic power consuming circuit, given the feasibility condition for the technology.
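To make the constraint set of the example concrete, the sketch below checks a hand-built candidate solution against the MDP LP constraints for the seven-node circuit. It is only a checker, not an LP solver and not the authors' AMPL model. The function and variable names are hypothetical, primary inputs are assumed to arrive at time 0, and the strict inequalities of the slides are relaxed to non-strict ones, in line with the earlier assumption that equality does not produce a glitch.

```python
# Example circuit: gate -> fan-in nodes (nodes 1..4 are primary inputs).
FANIN = {5: [1, 2], 6: [2, 3], 7: [5, 6, 4]}
PRIMARY_OUTPUTS = [7]

def violated_constraints(d, d_in, t, T, ub, maxdelay):
    """Return labels of the MDP LP constraints that a candidate solution breaks.

    d[i]        inertial (output) delay of gate i
    d_in[i, j]  input delay of gate i on the path from node j
    t[i], T[i]  earliest and latest arrival times at node i
    """
    bad = []
    for i, fanin in FANIN.items():
        if d[i] < 1:
            bad.append(f"d{i} >= 1")
        for j in fanin:
            if not 0 <= d_in[i, j] <= ub:
                bad.append(f"0 <= d{i},{j} <= ub")
            if T[i] < T[j] + d_in[i, j] + d[i]:
                bad.append(f"T{i} >= T{j} + d{i},{j} + d{i}")
            if t[i] > t[j] + d_in[i, j] + d[i]:
                bad.append(f"t{i} <= t{j} + d{i},{j} + d{i}")
        if d[i] < T[i] - t[i]:
            bad.append(f"d{i} >= T{i} - t{i}")
    for po in PRIMARY_OUTPUTS:
        if T[po] > maxdelay:
            bad.append(f"T{po} <= maxdelay")
    return bad

# Candidate: unit gate delays and one unit of input delay on the path 4 -> 7.
d = {5: 1, 6: 1, 7: 1}
d_in = {(i, j): 0 for i, fanin in FANIN.items() for j in fanin}
d_in[7, 4] = 1
t = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 2}  # assumed arrival windows
T = dict(t)

print(violated_constraints(d, d_in, t, T, ub=1, maxdelay=2))  # []
print(violated_constraints(d, d_in, t, T, ub=0, maxdelay=2))  # ['0 <= d7,4 <= ub']
```

The second call shows the role of the feasibility bound: the same delays are rejected once the technology allows no input delay at all.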

Solution Curves

[Figure: power versus maxdelay. Curves: previous solutions; new MDP LP solutions. Labels: power consumed by buffers; minimum dynamic power; $u_b = \infty$, $u_b = 15$, $u_b = 10$, $u_b = 5$, $u_b = 0$; fastest possible design in any technology.]

Delay Specification LP

Used when the design must meet a given delay specification and the designer is willing to sacrifice some dynamic power by inserting buffers.

Modifications to MDP LP

Insert buffer variables at every fanout stem and branches and at PIs (similar to Linear constraint set method by Raja et al.)

 maxdelay is a given parameter, which is the maximum delay of the critical path according to specification.


Delay Specification LP

Components of the LP

Gate constraints – unchanged

Input delay (feasibility) constraints – unchanged for same $u_b$

Differential delay constraints – unchanged

Maxdelay constraints – unchanged but maxdelay is a given parameter.

Objective function:

Minimize $\sum_j d_j$, where $j \in$ buffers

Solution Curves

[Figure: power versus maxdelay. Curves: previous solutions; new MDP LP solutions; new DS LP solutions. Labels: power consumed by buffers; minimum dynamic power; $u_b = \infty$, $u_b = 15$, $u_b = 10$, $u_b = 5$, $u_b = 0$; fastest possible design in any technology.]

Transistor Level Implementation

[Figure: RC model of the gate paths with delays $d_{3,1}$ and $d_{3,2}$; components labeled $R_{on}$, $C_r$, $C_{in}$ and $C_p$.]

Conventional CMOS gate design:

Delay = $R_{on}(C_{routing} + C_{input})$

Energy = $0.5\,(C_r + C_{in})\,V^2$

Delay can be changed by changing the resistance or the capacitance.

Resistance does not affect energy per transition.
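A small numeric sketch of the two expressions above shows why resistance is the preferred knob. The component values are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
def rc_delay(r_on, c_r, c_in):
    """Delay = R_on * (C_routing + C_input)."""
    return r_on * (c_r + c_in)

def switching_energy(c_r, c_in, vdd):
    """Energy per transition = 0.5 * (C_r + C_in) * V^2; no resistance term."""
    return 0.5 * (c_r + c_in) * vdd ** 2

# Hypothetical values: 2 fF routing, 3 fF input capacitance, 2.5 V supply.
c_r, c_in, vdd = 2e-15, 3e-15, 2.5

for r_on in (1e3, 2e3):  # doubling the resistance doubles the delay
    print(r_on, rc_delay(r_on, c_r, c_in), switching_energy(c_r, c_in, vdd))
```

Doubling the resistance doubles the delay while the energy column stays the same, whereas raising either capacitance would increase both.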


Transistor Level Implementation

Possible implementations of the variable input delay gate:

Capacitance manipulation method where the input capacitance offered by the respective transistor pair is varied.

Pass transistor added design where an extra transistor is added to increase the resistance and thereby the input delay.

We propose the addition of:

Single nMOS transistor

CMOS pass transistor

We describe the single nMOS transistor added design in detail here. The other two are documented in the thesis.


Single nMOSFET Added Design

[Figure: RC model of the gate paths with a series resistance $R_s$ added on the path with delay $d_{3,1}$; components labeled $R_{on}$, $R_s$, $C_r$ and $C_{in}$.]

$d_{3,1} = R_{on}(C_r + C_{in}) + R_s C_{in}$ = Output + Input delay

$d_{3,2} = R_{on}(C_r + C_{in})$

Energy = $0.5\,(C_r + C_{in})\,V^2$

The input delay can be added by an nMOS transistor in series to the path desired.

The addition of resistance does not increase the energy per transition.
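As a rough sketch of the delay expressions for the single nMOS added design, the series resistance needed for a requested input delay follows from $R_s C_{in}$ = input delay. The function names and the numeric values below are hypothetical, and the first-order RC model is the only thing assumed.

```python
def series_resistance(input_delay, c_in):
    """R_s such that R_s * C_in equals the requested input delay."""
    return input_delay / c_in

def path_delays(r_on, r_s, c_r, c_in):
    """Return (d_31, d_32) for the first-order RC model."""
    d_32 = r_on * (c_r + c_in)   # path without the added transistor
    d_31 = d_32 + r_s * c_in     # path through the added nMOS transistor
    return d_31, d_32

# Hypothetical values: 1 kOhm driver, 2 fF routing, 3 fF input capacitance.
r_on, c_r, c_in = 1e3, 2e-15, 3e-15
target_input_delay = 6e-12       # hypothetical 6 ps of extra input delay

r_s = series_resistance(target_input_delay, c_in)
d_31, d_32 = path_delays(r_on, r_s, c_r, c_in)
print(r_s, d_31 - d_32)
```

Only the path through the added transistor is slowed; the other path keeps its original delay, which is what makes the input delays independently controllable.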

Effect of Input Slope

Too large a $u_b$ cannot be realized in practice due to noise issues.

Increased resistance degrades the slope of a signal, and we use the CMOS gate following it to regenerate the slope.

The regenerative capability of a gate is limited, and this determines the practical $u_b$ value.

The slope allowed in a design depends on the noise specifications of the circuit.

Single nMOSFET Added Design

Advantages:

Almost completely independent control of input delays.

$u_b$ is very high compared to the capacitance manipulation method.

Much lower overhead than a conventional buffer.

Can be integrated into full-custom as well as standard cell place and route design flows.

Design Issues:

The nMOSFET degrades the signal when passing logic 1. Hence, it increases the leakage of the transistors in the fanout stages.

However, this is for certain input combinations only.

Short circuit current is a function of the ratio of input/output slopes. Since inserting resistance degrades the input slope, it might increase short circuit power by a minor amount.

CMOS Pass Transistor Added Design

[Figure: RC model of the gate paths with a series resistance $R_s$ added on the path with delay $d_{3,1}$; components labeled $R_{on}$, $R_s$, $C_r$ and $C_{in}$.]

$d_{3,1} = R_{on}(C_r + C_{in}) + R_s C_{in}$ = Output + Input delay

$d_{3,2} = R_{on}(C_r + C_{in})$

Energy = $0.5\,(C_r + C_{in})\,V^2$

The input delay can be added by the input CMOS pass transistor in series to the path desired.

This does not degrade the signal as both transistors together conduct both logic values well.

Technology Mapping

[Flowchart: Delay required; Look Up Table for sizes; Transistor Sizes; Error acceptable? (yes / no); Sensitivity of each transistor size to delay; Increment that transistor dimension.]

Determine sizes of transistors in a gate for the given delay and given load capacitance.

First guess is given by the look-up table.

Second stage is sensitivity driven.

Reduces the complexity of transistor search.
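The two-stage search can be sketched as follows. This is an illustrative loop under stated assumptions, not the authors' implementation: the delay model, the look-up table entry, the step size and all names are hypothetical, and sensitivities are estimated by finite differences on a caller-supplied delay function.

```python
def size_for_delay(delay_fn, initial_sizes, target, tol, step=0.1, max_iters=100):
    """Adjust one dimension at a time until delay_fn(sizes) is within tol of target."""
    sizes = list(initial_sizes)
    for _ in range(max_iters):
        delay = delay_fn(sizes)
        error = delay - target
        if abs(error) <= tol:
            return sizes
        # Finite-difference sensitivity of the delay to each dimension.
        sens = []
        for k in range(len(sizes)):
            bumped = sizes[:k] + [sizes[k] + step] + sizes[k + 1:]
            sens.append((delay_fn(bumped) - delay) / step)
        # Pick the most sensitive dimension and move it so the error shrinks.
        k = max(range(len(sizes)), key=lambda i: abs(sens[i]))
        direction = -1 if (error > 0) == (sens[k] > 0) else 1
        sizes[k] += direction * step
    raise RuntimeError("no sizing found within tolerance")

# Hypothetical look-up table: first guess of two transistor dimensions.
LOOKUP = {"example_gate": [1.0, 1.0]}

# Hypothetical delay model: delay falls as either dimension grows.
def toy_delay(sizes):
    return 10.0 / sizes[0] + 5.0 / sizes[1]

sizes = size_for_delay(toy_delay, LOOKUP["example_gate"], target=10.0, tol=0.5)
print(sizes, toy_delay(sizes))
```

Starting from the table entry keeps the search local, so only a handful of single-dimension adjustments are needed instead of a search over all dimensions at once.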


Results for Speed of Circuit Using MDP LP

Maxdelay is normalized to the length of the critical path when all gates are of unit delay.

Each curve is a different benchmark circuit.

As we increase $u_b$ the circuit becomes faster.

Flexibility required for fastest operation of circuit is proportional to the size of the circuit.

Power Opt. Using MDP LP (for $u_b$ = 10)

[Table: values listed column by column, in the order they appear on the slide.]

Circuit: c432, c499, c880, c1355, c1908, c2670, c3540, c5315, c6288, c7552

No. of vectors: 144, 82, 200, 157, 56, 54, 78, 87, 141, 158

maxdelay: 173, 35, 347, 542, 71, 34, 45, 67, 124, 50

Norm. delay: 4.17, 2.26, 1.50, 2.05, 4.32, 1.09, 7.38, 11.06, 1.87, 1.16

Original power, Avg.: 1.0 for every circuit

Original power, Peak: 1.0 for every circuit

Optimized power, Avg.: 0.54, 0.68, 0.53, 0.53, 0.65, 0.70, 0.48, 0.47, 0.22, 0.28

Optimized power, Peak: 0.44, 0.56, 0.43, 0.44, 0.55, 0.65, 0.45, 0.36, 0.18, 0.26

Power Opt. Using DS LP (for $u_b$ = 10)

[Table: two Norm. Maxdelay settings per circuit; values listed column by column, in the order they appear on the slide.]

Circuit: c432, c499, c880, c1355, c1908

Norm. Maxdelay: 2.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0

Conventional gates (Raja et al., VLSI Design ’03), Avg.: 0.72, 0.62, 0.91, 0.70, 0.68, 0.68, 0.58, 0.57, 0.69, 0.59

Conventional gates, Peak: 0.67, 0.60, 0.87, 0.66, 0.54, 0.52, 0.48, 0.48, 0.59, 0.44

Conventional gates, Buffers: 95, 66, 48, 0, 62, 34, 224, 192, 219, 70

Variable input delay gates, Avg.: 0.69, 0.65, 0.86, 0.71, 0.58, 0.56, 0.48, 0.44, 0.56, 0.55

Variable input delay gates, Peak: 0.66, 0.55, 0.84, 0.65, 0.45, 0.45, 0.42, 0.39, 0.46, 0.45

Variable input delay gates, Buffers: 0, 1, 0, 64, 61, 0, 0, 32, 5, 4

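Power Opt. Using DS LP (for $u_b$ = 10)

[Table: two Norm. Maxdelay settings per circuit; values listed column by column, in the order they appear on the slide.]

Circuit: c2670, c3540, c5315, c6288, c7552

Norm. Maxdelay: 2.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0

Power (conventional gates) (Raja et al., VLSI Design ’03), Avg.: 0.79, 0.71, 0.64, 0.58, 0.63, 0.60, 0.40, 0.36, 0.38, 0.36

Power (conventional gates), Peak: 0.65, 0.58, 0.44, 0.46, 0.52, 0.45, 0.36, 0.34, 0.34, 0.32

Power (conventional gates), Buffers: 157, 35, 239, 140, 280, 171, 294, 120, 366, 111

Power (variable input delay gates), Avg.: 0.70, 0.69, 0.57, 0.54, 0.57, 0.55, 0.91, 0.21, 0.28, 0.27

Power (variable input delay gates), Peak: 0.56, 0.57, 0.46, 0.43, 0.48, 0.46, 0.87, 0.16, 0.24, 0.24

Power (variable input delay gates), Buffers: 1, 26, 4, 584, 2, 0, 3, 0, 1, 0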

Example Circuit

[Figure: three versions of the example circuit (inputs 1 to 4, gates 5, 6 and 7), each annotated with delays d: unoptimized circuit, buffer optimized circuit and nMOS optimized circuit.]

Example Circuit – Spectre Results

[Figure: Spectre waveforms versus time for the unoptimized circuit, the buffer optimized circuit and the nMOS optimized circuit.]

Physical Level Verification

[Flowchart, boxes in slide order: AMPL; Delays; Technology Mapping; Transistor Sizes; Create Cells using Prolific; Standard Cell Library; Standard Cell Place and Route; Layout; Extract Routing Capacitance; Routing load; Analog Power simulations; Energy Consumption; Optimized Layout. Decision: Routing acceptable? (Yes / No).]

Layouts of C7552 (0.25 µm CMOS)

c7552 un-optimized

Gate Count = 3827

Transistor Count ≈ 40,000

Critical Delay = 2.15 ns

Area = 710 x 710 µm²

c7552 optimized ($u_b$ = 10)

Gate Count = 3828

Transistor Count ≈ 45,000

Critical Delay = 2.15 ns

Area = 760 x 760 µm² (1.14)

Instantaneous Power Savings


Peak Power Savings = 68%


Patents and Dissertations

Patents

V. D. Agrawal, “Low Power Circuits Through Hazard Pulse Suppression,” U.S. Patent 5,983,007, November 1999.

T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic and Its Application to Low Power Design,” to be submitted to USPTO through Rutgers Univ., May 2004.

Dissertations

T. Raja, Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program, MS Thesis, Dept. of ECE, Rutgers University, May 2002.

T. Raja, Minimum Dynamic Power CMOS Design with Variable Input Delay Logic, PhD Thesis, Dept. of ECE, Rutgers University, May 2004.

S. Uppalapati, Low Power Design of Standard Cell Digital VLSI Circuits, MS Thesis, Dept. of ECE, Rutgers University, October 2004.

Papers

V. D. Agrawal, “Low-Power Design by Hazard Filtering,” Proc. 10th Int. Conf. VLSI Design, Jan. 1997, pp. 193-197.

V. D. Agrawal, M. L. Bushnell, G. Parthasarathy, and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and a Linear Programming Method,” Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp. 434-439.

T. Raja, V. D. Agrawal, and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th Int. Conf. VLSI Design, Jan. 2003, pp. 527-532.

T. Raja, V. D. Agrawal, and M. L. Bushnell, “CMOS Circuit Design for Minimum Dynamic Power and Highest Speed,” Proc. 17th Int. Conf. VLSI Design, Jan. 2004, pp. 1035-1040.

Conclusion

Main idea: Minimum dynamic power high speed circuits can be designed if gates with variable input delays are used.

The new design suppresses all glitches without any delay buffers.

Decreases power without loss in speed and with very little increase in area.

Developed a linear program solution to demonstrate the idea.

Developed new gate design for transistor level implementation.

Results have been verified by physical layout design of large circuits.

Results show average power savings up to 58%.

Technique scales easily to large circuits.

Leakage power remains a concern – ongoing research.
