Logical Effort - VLSI Signal Processing Lab

advertisement
Digital Integrated Circuits
Lecture 5: Logical Effort
Chih-Wei Liu
VLSI Signal Processing LAB
National Chiao Tung University
cwliu@twins.ee.nctu.edu.tw
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
1
Outline
„
RC Delay Estimation
„
Logical Effort
„
Delay in a Logic Gate
„
Multistage Logic Networks
„
Choosing the Best Number of Stages
„
Example
„
Summary
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
2
RC Delay Model
„
Use equivalent circuits for MOS transistors
‰
‰
‰
„
„
Ideal switch + capacitance and ON resistance
Unit nMOS has resistance R, capacitance C
Unit pMOS has resistance 2R (mobility較小), capacitance C
Capacitance proportional to width
Resistance inversely proportional to width
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
3
RC Values
„
„
„
Capacitance 假設nMOS/pMOS寄生電容與閘極電容相同
‰
C = Cg = Cs = Cd = 2 fF/μm of gate width
‰
Values similar across many processes
Resistance
‰
R ≈ 6 KΩ*μm in 0.6um process
‰
Improves with shorter channel lengths
Unit transistors
‰
May refer to minimum contacted device (4/2 λ)
‰
Or maybe 1 μm wide device
‰
Doesn’t matter as long as you are consistent
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
4
Inverter Delay Estimate
„
Estimate the delay of a fanout-of-1 inverter by using the basic RC
delay model of nMOS and pMOS
RC model
A
2 Y
2
1
1
s
kC
2R/k
kC
g
kC
d
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
5
Inverter Delay Estimate
„
Estimate the delay of a fanout-of-1 inverter
2C
R
A
2 Y
2
1
1
2C
2C
Y
R
C
C
假設A電位提昇
C
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
6
Inverter Delay Estimate
„
Estimate the delay of a fanout-of-1 inverter
2C
R
A
2 Y
2
1
1
2C
2C
2C
Y
R
A=1
2C
C
R
C
C
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
C
C
output
7
Inverter Delay Estimate
„
Estimate the delay of a fanout-of-1 inverter
2C
R
A
2 Y
2
1
1
2C
2C
2C
2C
Y
R
C
R
C
C
C
C
tpd = 6RC
若沒有寄生電容的ideal Inverter則為tpd = 3RC
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
8
Example: 3-input NAND
„
Sketch a 3-input NAND with transistor widths chosen to
achieve effective rise and fall resistances equal to a unit
inverter (R).
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
9
Example: 3-input NAND
„
Sketch a 3-input NAND with transistor widths chosen to
achieve effective rise and fall resistances equal to a unit
inverter (R).
Step 1
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
10
3-input NAND Caps
We still assume C=Cg=Cdiff
„
Annotate the 3-input NAND gate with gate and diffusion
capacitance.
2
2
2
3
3
3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
11
3-input NAND Caps
We assume C=Cg=Cdiff
„
Annotate the 3-input NAND gate with gate and diffusion
capacitance.
2C
2
2C
2C
Step 2
2C
2
2C
2C
2
2C
3C
3C
3C
2C
2C
3
3
3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
3C
3C
3C
3C
12
3-input NAND Caps
„
Annotate the 3-input NAND gate with gate and diffusion
capacitance.
2
Final result
2
2
3
5C
5C
5C
3
3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
9C
3C
3C
13
Elmore Delay Model
„
„
„
ON transistors look like resistors
Pullup or pulldown network modeled as RC ladder
Elmore delay of RC ladder
t pd ≈
∑
Ri −to− sourceCi
nodes i
= R1C1 + ( R1 + R2 ) C2 + ... + ( R1 + R2 + ... + RN ) C N
R1
R2
R3
RN
RC ladder
C1
C2
C3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
CN
14
Example: 2-input NAND
„
Estimate rising and falling propagation delays of a 2input NAND driving h identical gates.
2
2
A
2
B
2x
Y
h copies
2-input NAND
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
15
Example: 2-input NAND
„
Estimate rising and falling propagation delays of a 2input NAND driving h identical gates.
2
2
A
2
B
2x
6C
Y
4hC
h copies
2C
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
16
Example: 2-input NAND
„
Estimate worst-case rising and falling propagation delays
of a 2-input NAND driving h identical gates.
2
2
A
2
B
2x
6C
Y
4hC
h copies
2C
Worst case rising: 只有一個pMOS開啟
R
Y
(6+4h)C
t pdr = ( 6 + 4h ) RC
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
17
Example: 2-input NAND
„
Estimate worst-case rising and falling propagation delays of a 2input NAND driving h identical gates.
2
2
A
2
B
2x
Y
4hC
6C
h copies
2C
Worst case falling:A已經為”1”;B的電位開始上升
x
R/2
R/2
2C
Y
(6+4h)C
t pdf
R
R R
= ( )2C + ( + )[(6 + 4h)C ]
2
2 2
= (7 + 4h) RC
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
18
Delay Components
„
Delay has two parts
‰
Parasitic delay
„
„
‰
6 or 7 RC
Independent of load
Effort delay
„
„
4h RC
Proportional to load capacitance
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
19
Contamination Delay
„
„
Best-case (contamination) delay can be substantially less than
worst-case propagation delay.
Ex: If both inputs fall simultaneously
2
2
A
2
B
2x
R R
Y
(6+4h)C
6C
Y
4hC
2C
tcdr
R
= ( )[(6 + 4h)C ]
2
= (3 + 2h) RC
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
20
Contamination Delay
„
Best-case (contamination) falling delay can be substantially less
than worst-case propagation delay.
„
Ex: B已經為”1”;A的電位開始上升
2
2
A
2
B
2x
6C
Y
4hC
2C
tcdf
R R
= ( + )[(6 + 4h)C ]
2 2
= (6 + 4h) RC
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
21
Delay Components
„
Again, Delay has two parts
‰
Parasitic delay
„
„
‰
3 or 6 RC
Independent of load
Effort delay
„
„
4h or 2h RC
Proportional to load capacitance
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
22
Diffusion Capacitance
2
3
5C
5C
„
„
2
3
5C
„
2
3
9C
3C
3C
We assumed contacted diffusion on every s / d.
Recall 3-input NAND
Good layout minimizes diffusion area
Ex: NAND3 layout shares one diffusion contact
‰
‰
Reduces output capacitance by 2C
Merged un-contacted diffusion might help too
2C
Shared
Contacted
Diffusion
2C
Gate capacitance: 由電晶體的寬度決定
Diffusion capacitance: 與佈局有關
Isolated
Contacted
Diffusion
Merged
Uncontacted
Diffusion
2
2
2
3
3
3C 3C 3C
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
3
7C
3C
3C
23
Layout Comparison
„
Which layout is better?
VDD
A
VDD
B
Y
GND
A
B
Y
GND
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
24
Introduction
„
„
Chip designers face a bewildering array of choices
‰
What is the best circuit topology for a function?
‰
How many stages of logic give least delay?
‰
How wide should the transistors be?
???
Logical effort is a method to make these decisions
‰
Uses a simple model of delay
‰
Allows back-of-the-envelope calculations
‰
Helps make rapid comparisons between alternatives
‰
Emphasizes remarkable symmetries
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
25
Example
„
Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded
automotive processor. Help Ben design the decoder for a register file.
A[3:0] A[3:0]
32 bits
„
16 word register file
‰
Each word is 32 bits wide
‰
Each bit presents load of 3 unit-sized transistors
‰
True and complementary address inputs A[3:0]
‰
Each input may drive 10 unit-sized transistors
16
Register File
16 words
‰
4:16 Decoder
„
Decoder specifications:
Ben needs to decide:
‰
How many stages to use?
‰
How large should each gate be?
‰
How fast can decoder operate?
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
26
Delay in a Logic Gate
„
Express delays in process-independent unit
d=
d abs
τ
τ = 3RC
沒有寄生電容的理想反向器
≈ 12 ps in 180 nm process
Normalized delay representation
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
40 ps in 0.6 μm process
27
Delay in a Logic Gate
„
Express delays in process-independent unit
d=
„
d abs
τ
無關於製程型態的方式來表示Delay,可使電路能
依照佈局來做Delay的比較,而不必考慮不同製程
對電路速度的影響
Delay has two components
Linear delay model
d= f +p
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
28
Delay in a Logic Gate
„
Express delays in process-independent unit
d=
„
d abs
τ
Delay has two components
d= f +p
„
Effort delay: f = gh (a.k.a. stage effort)
‰
Again has two components
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
29
Delay in a Logic Gate
„
Express delays in process-independent unit
d=
„
d abs
τ
Delay has two components
d= f +p
„
Effort delay f = gh (a.k.a. stage effort)
‰
„
Again has two components
g: logical effort
‰
‰
Gate Complexity ↑, g ↑
i.e. 會耗費較多的時間去驅
動相同fanout數的電晶體
Measures relative ability of gate to deliver current
g ≡ 1 for inverter
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
30
Delay in a Logic Gate
„
Express delays in process-independent unit
d=
„
d abs
τ
Delay has two components
d= f +p
„
Effort delay f = gh (a.k.a. stage effort)
‰
„
Again has two components
h: electrical effort = Cout / Cin
‰
‰
Fan-out ↑, h ↑
Ratio of output to input capacitance
Sometimes called fanout
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
31
Delay in a Logic Gate
„
Express delays in process-independent unit
d=
„
d abs
τ
Delay has two components
d= f +p
„
Parasitic delay p
‰
‰
Represents delay of gate driving no load
Set by internal parasitic capacitance
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
32
Delay in NAND2
„
„
Effort delay: 4hC (與外部電容
有關)
Delay in 2-input NAND
d= f+p
2
2
A
2
B
2x
6C
Y
4hC
2C
Normalized delay representation
4hC
=
+2
3C
4
= ( ) h + 2 = gh + p
3
註:Ideal Inverter (無寄生電容、
Single Fanout) τ = 3RC;g = 1
How to calculate the logical effort g ?
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
33
Delay Plots
=f+p
= gh + p
2-input
NAND
6
Normalized Delay: d
d
Inverter
g=
p=
d=
5
g=
p=
d=
4
3
2
1
0
0
1
2
3
4
5
Electrical Effort:
h = Cout / Cin
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
34
Delay Plots
„
=f+p
= gh + p
What about
NOR2?
2-input
NAND
6
Normalized Delay: d
d
Inverter
5
g=1
p=1
d=h+1
4
3
g = 4/3
p=2
d = (4/3)h + 2
Effort Delay: f
2
1
Parasitic Delay: p
0
0
1
2
3
4
5
Electrical Effort:
h = Cout / Cin
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
35
Computing Logical Effort
„
„
„
DEF: Logical effort is the ratio of the input capacitance of
a gate to the input capacitance of an inverter delivering
the same output current.
Measure from delay vs. fanout plots
Or estimate by counting transistor widths
2
Y
2
A
2
Y
1
Cin = 3
g = 3/3
A
2
B
2
Cin = 4
g = 4/3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
A
4
B
4
Y
1
1
Cin = 5
g = 5/3
36
Catalog of Gates
„
Logical effort of common gates
Gate type
Number of inputs
1
2
3
4
n
NAND
4/3
5/3
6/3
(n+2)/3
NOR
5/3
7/3
9/3
(2n+1)/3
2
2
2
2
4, 4
6, 12, 6
8, 16, 16, 8
Inverter
Tristate / mux
XOR, XNOR
1
2
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
37
Recall: 4:1 Multiplexer
„
4:1 mux chooses one of 4 inputs using two selects
S1S0 S1S0 S1S0 S1S0
D0
Logical Effort =2;與輸入的
數目無關
D1
D2
大型多工器的速度與小型多工器
的速度相當?
D3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
38
Computing Parasitic Delay
„
DEF:
‰
‰
„
„
Represents delay of gate driving no load
Set by internal parasitic capacitance
Measure from delay vs. fanout plots
Or estimate by counting total diffusion capacitances at
output
2
Y
2
A
2
Y
1
C=3
p = 3/3 = 1
A
2
B
2
C=6
p = 6/3 = 2
A
4
B
4
Y
1
1
C=6
p = 6/3 =2
增加電晶體大小會降低電阻,但卻會增加寄生電容,但較
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
寬的電晶體可以摺疊,使內部的寄生電容可能變小
39
Catalog of Gates
„
Parasitic delay of common gates
‰
In multiples of pinv (≈1)
Gate type
Number of inputs
1
2
3
4
n
NAND
2
3
4
n
NOR
2
3
4
n
4
6
8
2n
4
6
8
Inverter
Tristate / mux
XOR, XNOR
1
2
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
40
Example: FO4 Inverter
„
Estimate the delay of a fanout-of-4 (FO4) inverter
d
Logical Effort:
Electrical Effort:
Parasitic Delay:
Stage Delay:
g=
h=
p=
d=
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
41
Example1: FO4 Inverter
„
Estimate the delay of a fanout-of-4 (FO4) inverter
d
Logical Effort:
Electrical Effort:
Parasitic Delay:
Stage Delay:
g=1
h=4
p=1
d = gh + p = 5
The FO4 delay is about
200 ps in 0.6 μm process
60 ps in a 180 nm process
f/3 ns in an f μm process
經驗法則
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
42
Example2: Ring Oscillator
„
Estimate the frequency of an N-stage ring oscillator
Odd-number stages
Logical Effort:
Electrical Effort:
Parasitic Delay:
Stage Delay:
Frequency:
g=1
31 stage ring oscillator in
h=1
0.6 μm process has
p=1
frequency of ~ 200 MHz
d=2
fosc = 1/(2*N*d) = 1/4N
如果g ≠ 1,那如何計算多級邏輯網路的延遲呢?
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
43
Multistage Logic Networks
„
„
„
„
Logical effort generalizes to multistage networks
Path Logical Effort
G=
gi
與電晶體大小無關
∏
Cout-path
Path Electrical Effort
H=
Path Effort
F = ∏ f i = ∏ gi hi
10
x
g1 = 1
h1 = x/10
g2 = 5/3
h2 = y/x
Cin-path
y
g3 = 4/3
h3 = z/y
與電晶體大小有關
z
g4 = 1
h4 = 20/z
20
F=GH=(5/3)(4/3)(x/10)(y/x)(z/y)(20/z)=G(20/10)
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
44
Multistage Logic Networks
„
„
Logical effort generalizes to multistage networks
G=
gi
Path Logical Effort
∏
Cout-path
Path Electrical Effort
H=
„
Path Effort
F = ∏ f i = ∏ gi hi
„
Can we always write F = GH?
„
Cin-path
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
45
Branch Paths
„
No! Consider paths that branch:
15
G
H
GH
h1
h2
F
=
=
=
=
=
= GH?
90
5
15
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
90
46
Paths that Branch
„
No! Consider paths that branch:
15
G
H
GH
h1
h2
F
=1
= 90 / 5 = 18
= 18
= (15 +15) / 5 = 6
= 90 / 15 = 6
= g1g2h1h2 = 36 = 2GH
90
5
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
15
90
47
Branching Effort
„
Branching effort
‰
Accounts for branching between stages in path
b=
Con path + Coff path
Con path
B = ∏ bi
„
Note:
Path branching effort
∏h
i
= BH
Now we compute the path effort
‰ F = GBH
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
48
Multistage Delays
„
Path Effort Delay
DF = ∑ f i
„
Path Parasitic Delay
P = ∑ pi
„
Path Delay
D = ∑ d i = DF + P
Recall that F = GBH
各級path effort的乘積為F;path delay則為各級path effort delay的總和
Î最小值發生在各級的path effort delay相同,亦即 fi = F1/N
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
49
Designing Fast Circuits
D = ∑ d i = DF + P
„
Delay is smallest when each stage bears same effort
fˆ = gi hi = F
„
1
N
Thus minimum delay of N stage path is
1
N
D = NF + P
„
This is a key result of logical effort
‰
‰
Find fastest possible delay
這種方式優於模擬
Doesn’t require calculating gate sizes
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
50
Choose Gate Sizes to Meet the Least
Delay Design
„
How wide should the gates be for least delay?
ˆf = gh = g Cout
Cin
gi Couti
⇒ Cini =
fˆ
„
„
Working backward, apply capacitance transformation to
find input capacitance of each gate given load it drives.
Check work by verifying input cap spec is met.
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
51
Example: 3-stage path
„
Select gate sizes x and y for least delay from A to B
x
x
A
8
x
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
y
45
y
B
45
52
Example: 3-stage path
x
x
A
8
x
Logical Effort
Electrical Effort
Branching Effort
Path Effort
Best Stage Effort
Parasitic Delay
Delay
y
45
y
B
45
G=
H=
B=
F=
fˆ =
P=
D=
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
53
Example: 3-stage path
x
x
A
8
x
Logical Effort
Electrical Effort
Branching Effort
Path Effort
Best Stage Effort
Parasitic Delay
Delay
y
45
y
B
45
G = (4/3)*(5/3)*(5/3) = 100/27
H = 45/8
B=3*2=6
F = GBH = 125
fˆ = 3 F = 5
P=2+3+2=7
D = 3*5 + 7 = 22 = 4.4 FO4
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
54
Example: 3-stage path
„
Work backward for sizes
y=
x=
x
x
A
8
x
y
45
y
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
B
45
55
Example: 3-stage path
„
fˆ = gh = g CCoutin
gi Couti
⇒ Cini =
fˆ
Work backward for sizes
y = 45 * (5/3) / 5 = 15
x = (15+15) * (5/3) / 5 = 10
45
A P: 4
N: 4
P: 4
N: 6
P: 12
N: 3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
B
45
56
Example: 3-stage path
„
Check the result:
1. the size of the NAND gate: (10+10+10)*(4/3) / 5 = 8
2. NAND gate delay: d1=g1h1+p1= (4/3)*(10+10+10) / 8 + 2 = 7
3. NAND3 gate delay: d2=g2h2+p2= (5/3)*(15+15) / 10 + 3 = 8
4. NOR gate delay: d3=g3h3+p3= (5/3)*(45/15) + 2 = 7
Î7+8+7 = 22
45
A P: 4
N: 4
P: 4
N: 6
P: 12
N: 3
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
B
45
57
Best Number of Stages
„
How many stages should a path use?
‰
„
Minimizing number of stages is not always fastest
Example: drive 64-bit datapath with unit inverter
1
InitialDriver
D
1
1
1
=
DatapathLoad
N:
f:
D:
64
1
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
64
2
64
3
64
4
58
Best Number of Stages
„
Example: drive 64-bit datapath with unit inverter
InitialDriver
D
1
= NF1/N + P
= N(64)1/N + N
1
1
1
8
4
2.8
16
8
23
DatapathLoad
N:
f:
D:
64
1
64
65
64
2
8
18
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
64
3
4
15
64
4
2.8
15.3
Fastest
59
Derivation for Best Number of Stages
„
Consider adding inverters to end of path
‰
How many give least delay?
n1
D = NF + ∑ pi + ( N − n1 ) pinv
1
N
Logic Block:
n1Stages
Path Effort F
N - n1 ExtraInverters
i =1
1
1
1
∂D
N
N
N
= − F ln F + F + pinv = 0
∂N
„
Define best stage effort
ρ=F
1
N
pinv + ρ (1 − ln ρ ) = 0
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
60
Best Number of Stages
„
„
„
pinv + ρ (1 − ln ρ ) = 0 has no closed-form solution
Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
For pinv = 1, solve numerically for ρ = 3.59
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
61
Sensitivity Analysis
How sensitive is delay to using exactly the best number
1.6
1.51
of stages?
D(N) /D(N)
„
1.4
1.26
1.2
1.15
If pinv = 1
1.0
Nˆ = log ρ F
(ρ =2.4)
(ρ=6)
0.0
0.5
0.7
1.0
1.4
2.0
N/ N
„
2.4 < ρ < 6 gives delay within 15% of optimal
‰
‰
We can be sloppy!
I like ρ = 4
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
62
Example, Revisited
„
Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded
automotive processor. Help Ben design the decoder for a register file.
A[3:0] A[3:0]
32 bits
„
16 word register file
‰
Each word is 32 bits wide
‰
Each bit presents load of 3 unit-sized transistors
‰
True and complementary address inputs A[3:0]
‰
Each input may drive 10 unit-sized transistors
16
Register File
16 words
‰
4:16 Decoder
„
Decoder specifications:
Ben needs to decide:
‰
How many stages to use?
‰
How large should each gate be?
‰
How fast can decoder operate?
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
63
Possible 4:16 Decoder
„
„
„
Each word is 32 bits wide
Each bit presents load of 3 unit-sized transistors
Each input may drive 10 unit-sized transistors
A[3] A[3]
10
10
A[2] A[2]
10
10
A[1] A[1]
10
10
A[0] A[0]
10
10
y
z
word[0]
96 units of wordline capacitance
y
z
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
word[15]
64
Number of Stages
„
Decoder effort is mainly electrical and branching
Electrical Effort:
H = (32*3) / 10 = 9.6
Branching Effort:
B=8
„
If we neglect logical effort (assume G = 1)
Path Effort:
F = GBH = 76.8
Number of Stages:
„
N = log4F = 3.1
Try a 3-stage design
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
65
Gate Sizes & Delay
Logical Effort:
G = 1 * 6/3 * 1 = 2
Path Effort:
F = GBH = 154
落在2.4與6之間
fˆ = F 1/ 3 = 5.36
Stage Effort:
Path Delay:
D = 3 fˆ + 1 + 4 + 1 = 22.1
Gate sizes: z = 96*1/5.36 = 18 y = 18*2/5.36 = 6.7
A[3] A[3]
10
10
A[2] A[2]
10
10
A[1] A[1]
10
10
A[0] A[0]
10
10
y
z
word[0]
96 units of wordline capacitance
y
z
word[15]
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
66
Comparison
„
Compare many alternatives with a spreadsheet
Design
N
G
P
D
NAND4-INV
2
2
5
29.8
NAND2-NOR2
2
20/9
4
30.1
INV-NAND4-INV
3
2
6
22.1
NAND4-INV-INV-INV
4
2
7
21.1
NAND2-NOR2-INV-INV
4
20/9
6
20.5
NAND2-INV-NAND2-INV
4
16/9
6
19.7
INV-NAND2-INV-NAND2-INV
5
16/9
7
20.4
NAND2-INV-NAND2-INV-INV-INV
6
16/9
8
21.6
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
67
Review of Definitions
Term
Stage
Path
number of stages
1
N
logical effort
g
G = ∏ gi
electrical effort
h=
Cout
Cin
H=
branching effort
b=
Con-path + Coff-path
Con-path
B = ∏ bi
effort
f = gh
F = GBH
effort delay
f
DF = ∑ f i
parasitic delay
p
P = ∑ pi
delay
d= f +p
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
Cout-path
Cin-path
D = ∑ d i = DF + P
68
Method of Logical Effort
1)
2)
3)
4)
5)
6)
Compute path effort
Estimate best number of stages
Sketch path with N stages
Estimate least delay
Determine best stage effort
F = GBH
N = log 4 F
1
N
D = NF + P
ˆf = F N1
gi Couti
Cini =
fˆ
Find gate sizes
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
69
Limits of Logical Effort
„
„
Chicken and egg problem
‰
Need path to compute G
‰
But don’t know number of stages without G
Simplistic delay model
‰
„
Interconnect
‰
„
Neglects input rise time effects
Iteration required in designs with wire
Maximum speed only
‰
Not minimum area/power for constrained delay
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
70
Summary
„
Logical effort is useful for thinking of delay in circuits
‰
‰
‰
‰
‰
‰
‰
„
Numeric logical effort characterizes gates
NANDs are faster than NORs in CMOS
Paths are fastest when effort delays are ~4
Path delay is weakly sensitive to stages, sizes
But using fewer stages doesn’t mean faster paths
Delay of path is about log4F FO4 inverter delays
Inverters and NAND2 best for driving large caps
Provides language for discussing fast circuits
‰
But requires practice to master
DIC-Lec5 cwliu@twins.ee.nctu.edu.tw
71
Download