Lecture 10 Inverter Dynamic

advertisement
CMPEN 411
VLSI Digital Circuits
Spring 2012
Lecture 10: The Inverter, A Dynamic View
[Adapted from Rabaey’s Digital Integrated Circuits, Second Edition, ©2003
J. Rabaey, A. Chandrakasan, B. Nikolic]
Sp12 CMPEN 411 L10 S.1
Heads up

This lecture

Inverter dynamic view
- Reading assignment – Rabaey, et al, 5.4.1-5.4.3

Next lecture

Designing fast logic
- Reading assignment – Rabaey, et al, 6.2.1, 9.2.2, 9.2.3
Sp12 CMPEN 411 L10 S.2
Inverter Propagation Delay

Propagation delay is proportional to the time-constant of
the network formed by the pull-down (or pull-up) resistor
and the load capacitance
VDD
tpHL = f(Rn, CL)
Vout = 0
Rn
Vin = V DD

CL
tpHL = ln(2) Reqn CL = 0.69 Reqn CL
tpLH = ln(2) Reqp CL = 0.69 Reqp CL
tp = (tpHL + tpLH)/2 = 0.69 CL(Reqn + Reqp)/2
To equalize rise and fall times make the on-resistance of
the NMOS and PMOS approximately equal.
Sp12 CMPEN 411 L10 S.3
Inverter Transient Response
VDD=2.5V
0.25m
W/Ln = 1.5
W/Lp = 4.5
Reqn= 13 k ( 1.5)
Reqp= 31 k ( 4.5)
3
Vin
2.5
2
1.5
1
0.5
0
-0.5
0
0.5
1
1.5
t (sec)
Sp12 CMPEN 411 L10 S.4
2
2.5
x 10-10
Inverter Transient Response
VDD=2.5V
0.25m
W/Ln = 1.5
W/Lp = 4.5
Reqn= 13 k ( 1.5)
Reqp= 31 k ( 4.5)
3
Vin
2.5
2
1.5
tf
tpHL
1
tpLH
tr
tpHL = 36 psec
tpLH = 29 psec
0.5
so
0
tp = 32.5 psec
-0.5
0
0.5
1
1.5
t (sec)
Sp12 CMPEN 411 L10 S.5
2
2.5
x 10-10
Design for Performance

Increase W/L ratio of the transistor



Reduce CL




the most powerful and effective performance optimization tool in
the hands of the designer
watch out for self-loading! – when the intrinsic capacitance
dominates the extrinsic capacitance
keep drain diffusions small
limit interconnect capacitance
limit fan-out
Increase VDD



trade-off energy for performance
increasing VDD above a certain level
yields minimal improvements
reliability concerns enforce a firm
upper bound on VDD
Sp12 CMPEN 411 L10 S.7
5.5
5
4.5
4
3.5
3
2.5
2
1.5
1
0.8
1
1.2
1.4
1.6
1.8
VDD (V)
2
2.2
2.4
Impacts of NMOS/PMOS Ratio

So far have sized the PMOS and NMOS so that the Req’s
match (ratio ~ 3)



symmetrical VTC
equal high-to-low and low-to-high propagation delays
If speed is the only concern, reduce the width of the
PMOS device!

widening the PMOS degrades the tpHL due to larger intrinsic
capacitance
 = (W/Lp)/(W/Ln)
r = Reqp/Reqn (resistance ratio of identically-sized PMOS and NMOS)
opt = r when wiring capacitance is negligible
Sp12 CMPEN 411 L10 S.8
PMOS/NMOS Ratio Effects
5 x 10
-11
 = (W/Lp)/(W/Ln)
tpLH
4.5
tpHL
 of 2.4 (= 31 k/13 k)
gives symmetrical
response
tp
4
 of 1.6 to 1.9 gives
optimal performance
3.5
3
1
2
3
 = (W/Lp)/(W/Ln)
Sp12 CMPEN 411 L10 S.9
4
5
Device Sizing for Performance
 Divide capacitive load, CL, into


Cint : intrinsic - diffusion and Miller effect (Cg)
Cext : extrinsic - wiring and fanout
tp = 0.69 Req Cint (1 + Cext/Cint) = tp0 (1 + Cext/Cint)

where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the
gate
Sp12 CMPEN 411 L10 S.10
Device Sizing for Performance
 Divide capacitive load, CL, into


Cint : intrinsic - diffusion and Miller effect (Cg)
Cext : extrinsic - wiring and fanout
tp = 0.69 Req Cint (1 + Cext/Cint) = tp0 (1 + Cext/Cint)


where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the
gate
Widening both PMOS and NMOS by a factor S

Req1 = ________, Cint1= _________
tp = __________________________________

tp0 is independent of the sizing of the gate; with no load the drive
of the gate is totally offset by the increased capacitance
any S sufficiently larger than (Cext/Cint) yields the best
performance gains with least area impact

Sp12 CMPEN 411 L10 S.11
Sizing Impacts on Delay
The majority of the
improvement is already
obtained for S = 5. Sizing
factors larger than 10
barely yield any extra gain
(and cost significantly
more area).
x 10-11
3.8
for a fixed load
3.6
3.4
3.2
3
2.8
2.6
2.4
2.2
2
1
3
5
7
9
S
Sp12 CMPEN 411 L10 S.12
11
13
15
self-loading effect
(intrinsic capacitance
dominates)
Impact of Fanout on Delay
tp = tp0 (1 + Cext/Cint )


Extrinsic capacitance, Cext, is a function of the fanout of
the gate - the larger the fanout, the larger the external
load.
First determine the input loading effect of the inverter.
Both Cg and Cint are proportional to the gate sizing, so
Cint = Cg ,  is independent of gate sizing and
tp = tp0 (1 + Cext/ Cg) = tp0 (1 + f/)

The delay of an inverter is a function of the ratio between
its external load capacitance and its input gate
capacitance, or the gate’s effective fan-out f
f = Cext/Cg
Sp12 CMPEN 411 L10 S.13
Inverter Chain

Real goal is to minimize the delay through an inverter chain
In
Out
Cg,1
1
2
N
CL
the delay of the j-th inverter stage is
tp,j = tp0 (1 + Cg,j+1/(Cg,j)) = tp0(1 + fj/ )
and
tp = tp1 + tp2 + . . . + tpN
so
tp = tp,j = tp0  (1 + Cg,j+1/(Cg,j))

If CL is given


How should the inverters be sized?
How many stages are needed to minimize the delay?
Sp12 CMPEN 411 L10 S.14
Sizing the Inverters in the Chain of N inverters

The optimum size of each inverter is the geometric mean
of its neighbors – meaning that if each inverter is sized up
by the same factor f wrt the preceding gate, it will have the
same effective fan-out and the same delay
N
N
f = CL/Cg,1 = F
where the overall effective fan-out of the circuit is
F = CL/Cg,1
and the minimum delay through the inverter chain is
N
tp = N tp0 (1 + ( F ) / )

The relationship between tp and F is linear for one inverter,
square root for two, etc.
Sp12 CMPEN 411 L10 S.15
Example of Inverter Chain Sizing
In
Out
Cg,1

1
CL = 8 Cg,1
CL/Cg,1 has to be evenly distributed over N = 3 inverters
F = CL/Cg,1 = 8/1
f =
Sp12 CMPEN 411 L10 S.16
Example of Inverter Chain Sizing
In
Out
Cg,1

1
f=2
f2 = 4
CL = 8 Cg,1
CL/Cg,1 has to be evenly distributed over N = 3 inverters
F = CL/Cg,1 = 8/1
3
f = 8 = 2
Sp12 CMPEN 411 L10 S.17
Determining N: Optimal Number of Inverters

What is the optimal value for N given F (= fN) ?



if the number of stages is too large, the intrinsic delay of the
stages becomes dominate
if the number of stages is too small, the effective fan-out of each
stage becomes dominate
The optimum N is found by differentiating the minimum
delay expression divided by the number of stages and
setting the result to 0, giving
N
N
 + F - ( F ln(F))/N = 0 and
f = e(1 + /f)

For  = 0 (ignoring self-loading)
N = ln(F)
and the effective-fan out (tapering factor) is f = e = 2.718

For  = 1 (the typical case)
N = ln(F) - 1
and the effective fan-out (tapering factor) is f = 3.6
Sp12 CMPEN 411 L10 S.18
Optimum Effective Fan-Out
5
7
6
4.5
5
4
4
3.5
3
2
3
1
2.5
0
0

0.5
1
1.5

2
2.5
3
1
1.5
2
2.5
3
3.5
4
4.5
f
Choosing f larger than optimum has little effect on delay
and reduces the number of stages (and area).


So it is common practice to use f = 4 (for  = 1) and reduce N
Too many stages has a substantial negative impact on delay
Sp12 CMPEN 411 L10 S.19
5
Example of Inverter (Buffer) Staging
F = 64
N = ln(F) – 1 = 3.16
1
Cg,1 = 1
Cg,1 = 1
16
Cg,1 = 1
1
tp
1
64
65
2
8
18
3
4
15
4
2.8
15.3
CL = 64 Cg,1
4
1
f
CL = 64 Cg,1
8
1
N
2.8
Cg,1 = 1
Sp12 CMPEN 411 L10 S.20
CL = 64 Cg,1
8
22.6
CL = 64 Cg,1
Impact of Buffer Staging for Large CL
F
( = 1)
10

Unbuffered Two Stage
Chain
11
8.3
Opt. Inverter
Chain
8.3
100
101
22
16.5
1,000
1001
65
24.8
10,000
10,001
202
33.1
Impressive speed-ups with optimized cascaded
inverter chain for very large capacitive loads.
Sp12 CMPEN 411 L10 S.21
Input Signal Rise/Fall Time

In reality, the input signal
changes gradually (and both
PMOS and NMOS conduct for
a brief time). This affects the
current available for
charging/discharging CL and
impacts propagation delay.
x 10-11
5.4
5.2
5
4.8
4.6
4.4
4.2
 tp
increases linearly with
increasing input slope, ts,
once ts > tp
4
3.8
3.6
0
 ts
is due to the limited driving
capability of the preceding gate
Sp12 CMPEN 411 L10 S.22
2
4
6
8 x 10-11
ts(sec)
for a minimum-size inverter
with a fan-out of a single gate
Design Challenge

A gate is never designed in isolation: its performance is
affected by both the fan-out and the driving strength of the
gate(s) feeding its inputs.
tip = tistep +  ti-1step

Keep signal rise times smaller than or equal to the gate
propagation delays.



(  0.25)
good for performance
good for power consumption
Keeping rise and fall times of the signals small and of
approximately equal values is one of the major challenges
in high-performance designs - slope engineering.
Sp12 CMPEN 411 L10 S.23
Delay with Long Interconnects

When gates are farther apart, wire capacitance and
resistance can no longer be ignored.
(rw, cw, L)
Vin
cint
Vout
cfan
tp = 0.69RdrCint + (0.69Rdr+0.38Rw)Cw + 0.69(Rdr+Rw)Cfan
where Rdr = (Reqn + Reqp)/2
= 0.69Rdr(Cint+Cfan) + 0.69(Rdrcw+rwCfan)L + 0.38rwcwL2

Wire delay rapidly becomes the dominate factor (due to
the quadratic term) in the delay budget for longer wires.
Sp12 CMPEN 411 L10 S.24
Next Lecture and Reminders

Next lecture

Designing fast logic
- Reading assignment – Rabaey, et al, 6.2.1,9.2.2,9.2.3
Sp12 CMPEN 411 L10 S.25
Download