Delay

advertisement
EE 201A
(Starting 2005, called EE 201B)
Modeling and Optimization
for VLSI Layout
Instructor:
Lei He
Email:
LHE@ee.ucla.edu
Chapter 7 Interconnect Delay
7.1 Elmore Delay
7.2 High-order model and moment matching
7.3 Stage delay calculation
Basic Circuit Analysis Techniques
Network structures & state
•
Output response
Input waveform & zero-states
Natural response vN(t)
(zero-input response)
Forced response vF(t)
(zero-state response)
For linear circuits: v(t )  vN (t )  vF (t )
• Basic waveforms
– Step input
– Pulse input
– Impulse Input
• Use simple input waveforms to understand the impact of
network design
Basic Input Waveforms
1/
T
1
0
-T/2 T/2
unit step function
u(t)=
0
t 0
1
t 0
PT (t ) 
1
T
T
T 

u
(
t

)

u
(
t

)

2
2 
By definition
u (t ) 
or δ(t) 
unit impulse function
pulse function of width T

t

 ( x ) dx
du (t )
dt
 (t ) : PT (t )
when T  0
 (t )  0
for t  0
singular for t  0
s.t. for any  0

   (t )dt  1

Step Response vs. Impulse Response
• Definitions:
– (unit) step input u(t)
– (unit) impulse input (t)
(Input Waveform)
• Relationship
 (t ) 
(unit) step response g(t)
(unit) impulse response h(t)
(Output Waveform)
du (t )
dt
t
u (t )    ( x)dx



• Elmore delay


0
0
TD   g ' (t )t  dt   h(t )t  dt
dg (t )
dt
h(t ) 
t
g (t )   h( x)dx
0
Analysis of Simple RC Circuit
R  i (t )  v (t )  vT (t )
d (Cv(t ))
dv (t )
i (t ) 
C
dt
dt
dv (t )
 RC
 v (t )  vT (t )
dt
state
variable
Input
waveform
i(t)
R
vT(t)
±
C
v(t)
first-order linear differential
equation with
constant coefficients
Analysis of Simple RC Circuit
zero-input response:
(natural response)
dv(t )
 v(t )  0
dt
1 dv(t)
1
t

 v N (t)  Ke RC
v(t) dt
RC
RC
dv(t )
RC
 v(t )  v0u (t )
step-input response:
dt
t
vF (t )  v0u(t )  v(t )  Ke RC  v0u(t )
match initial state:
output response
for step-input:
v(0)  0
 K  v0u (t )  0
v(t )  v0 (1  e
t
v0
RC
)u (t )
v0u(t)
v0(1-eRC/T)u(t)
Delays of Simple RC Circuit
• v(t) = v0(1 - e-t/RC) under step input v0u(t)
• v(t)=0.9v0  t = 2.3RC
v(t)=0.5v0  t = 0.7RC
• Commonly used metric
TD = RC (Elmore delay to be defined later)
Lumped Capacitance Delay Model
• R = driver resistance
• C = total interconnect capacitance + loading capacitance
• Sink Delay: td = R·C
N3
Rd
N2
driver
N
• 50% delay under step input = 0.7RC
• Valid when driver resistance >> interconnect resistance
• All sinks have equal delay
Cload
Lumped RC Delay Model
t D  R d  Cload  R d  Cint  Cg 
 R d  C0  L  Cg 
• Minimize delay  minimize wire length
N3
Rd
N2
driver
N
Cload
Delay of Distributed RC Lines
Vout(t)
Laplace
Transform
Vout(s)
VIN
R
VOUT
VIN
R
VOUT
C
C
VOUT
1
s cosh sRC
e x  e x
cosh( x) 
2
x2 x4
 1    ......
2! 4!
Vout ( s) 
1.0
0.5
DISTRIBUTED
LUMPED
1.0RC
2.0RC
time
Step response of distributed and lumped RC networks.
A potential step is applied at VIN, and the resulting VOUT
is plotted. The time delays between commonly used
reference points in the output potential is also tabulated.
Delay of Distributed RC Lines (cont’d)
Output potential range
Time elapsed
(Distributed RC
Network)
Time elapsed
(Lumped RC
Network)
0 to 90%
1.0 RC
2.3 RC
10% to 90% (rise time)
0.9 RC
2.2 RC
0 to 63%
0.5 RC
1.0 RC
0 to 50% (delay)
0.4 RC
0.7 RC
0 to 10%
0.1 RC
0.1 RC
Distributed Interconnect Models
• Distributed RC circuit model
– L,T or P circuits
• Distributed RCL circuit model
• Tree of transmission lines
Distributed RC Circuit Models
Distributed RLC Circuit Model
(without mutual inductance)
Z0
Z0
Delays of Complex Circuits under Unit Step Input
• Circuits with monotonic response
1
0.5
T50%
v(t)
t
TR
• Easy to define delay & rise/fall time
• Commonly used definitions
– Delay T50% = time to reach half-value, v(T50%) = 0.5Vdd
– Rise/fall time TR = 1/v’(T50%) where v’(t): rate of change of v(t)
w.r.t. t
– Or rise time = time from 10% to 90% of final value
• Problem: lack of general analytical formula for T50% &
TR!
Delays of Complex Circuits under Unit Step
Input (cont’d)
• Circuits with non-monotonic response
t
• Much more difficult to define delay & rise/fall time
Elmore Delay for Monotonic Responses
v(t)
1
• Assumptions:
– Unit step input
– Monotone output response
0.5
v’(t)
t
T50%
• Basic idea: use of mean of v’(t)
to approximate median of v’(t)
v(t ) : output response (monotone)
v' (t ) : rate of change of v(t )
median
of v’(t)
(T50%)

TD   tv' (t )dt
0
mean of v'(t)
t
Elmore Delay for Monotonic Responses
• T50%: median of v’(t), since

T5 0%
0
v' (t )dt  

T5 0%
v' (t )dt
 half of final value of v(t ) (by def.)
• Elmore delay TD = mean of v’(t)
TD 


0
v' (t )tdt
Why Elmore Delay?
• Elmore delay is easier to compute analytically in most cases
– Elmore’s insight [Elmore, J. App. Phy 1948]
– Verified later on by many other researchers, e.g.
• Elmore delay for RC trees [Penfield-Rubinstein, DAC’81]
• Elmore delay for RC networks with ramp input [Chan, TCAS’86]
• .....
• For RC trees: [Krauter-Tatuianu-Willis-Pileggi, DAC’95]
T50%  TD
• Note: Elmore delay is not 50% value delay in general!
Elmore Delay for RC Trees
h(t) = impulse response
• Definition
– h(t) = impulse response
– TD = mean of h(t)
=
H(t) = step response

 h(t)  t dt
0
• Interpretation

– H(t) = output response (step process) median
TD   tv' (t )dt
0
of v’(t)
mean of v'(t)
– h(t) = rate of change of H(t)
(T50%)
– T50%= median of h(t)
– Elmore delay approximates the median of h(t) by
the mean of h(t)
Elmore Delay of a RC Tree
[Rubinstein-Penfield-Horowitz, T-CAD’83]
Lemma: when a step input is applied to a RC tree
vi (t ) is monotonic in t for every node i in tree
Proof:
 v'i (t )  0 at every node i (v' i (t )  hi (t ))
 impulse response hi (t )  0 at every node i
Apply impulse func. at t=0:
Let hmin (t ) be the min. voltage of any node at t
hmin (0)  0
Assume that hmin (t0 )  0
Then, t1  t0
imin
s.t. h'min (t1 )  0
Let node imin achieve hmin (t1 ) at t1
Then, the current from any node i to imin is  0 at t1
Since hi (t1 )  hmin (t1 ) & i connects imin via resistors
current iimin
i
Since all currents i  imin charge the capacitor at imin
h'min (t1 )  0  contradict ion!
Elmore Delay in a RC Tree (cont’d)
Pi : path from input to node i ; si :subtree rooted at i
R jk : resistance of common path
Si
path resistance Rii
Pj  Pk from input to j & k
TDi   Rki Ck
Rjk
k
k
dvi (t )
dt
1  vi (t )  The voltage drop on Pi   Rk  (current to all cap' s in Si )
The current to cap. of node i  Ci
kPi
  (current t o cap k )  (common path res. between Pi and Pk )
k
  Ck
k
dvk (t )
dv (t )
 Rki   Rki Ck k
dt
dt
k


0
0
TDi   v'i (t )t  dt  vi (t )  t |0   vi (t )dt

T
 lim [vi (T )  T   vi (t )dt ]  lim (vi (T )  1)  T   (1  vi (t )) dt
T 
0
i
input
Theorem : Elmore delay to node i
Proof :
j
T 
0
Elmore Delay in a RC Tree (cont’d)
•
•
(1  vi (T ))  T  0
We shall show later on that Tlim

i.e. 1-vi(T) goes to 0 at a much faster rate than 1/T when T
Let
t
f i (t )   [1  vi ( x)]dx
0
dv ( x)
f i (t )    Rki Ck k
dx
0
dx
k
t
vi(t)
area
t
f i (t )   [1  vi ( x)]dx
0
1
  Rki Ck v k (t )
k
  Rki Ck   Rki Ck [1  vk (t )]
k
f i ()   Rki Ck
k
(1) 0
k

 TDi  lim (1  vi (T ))T   [1  vi (t )]dt
T 
0
 f i ()   Rki Ck
k
t
Some Definitions For Signal Bound Computation
Let Tp   Rkk Ck
k
TRi  ( Rki2 Ck ) Rii
Recall TDi   Rki Ck
k
k
Then,
TRi  TDi  Tp
(since Rkk  Rki & Rii  Rki )
Signal Bounds in RC Trees
•
Theorem
Lower bounds
0
vi (t )  1 
1
t0
TDi
t  0 (non - trivial when t  TDi  TRi )
t  TRi
TDi
Tp
(T p TRi )
e
Tp
e
t
Tp
Upper bounds
TDi  t
1
Tp
t  T p  TRi
t0
v i (t ) 
1
TRi
Tp
(TDi TRi )
e
TRi
e
t
TRi
t  TDi  TRi
Delay Bounds in RC Trees
Lower bounds :
t  TDi  Tp 1  vi (t )
t  TDi  TRi  TRi ln
TRi
Tp 1  vi (t )
when vi (t )  1 
TRi
when vi (t )  1 
TDi
Tp
Upper bounds :
TDi
t
 TRi
1  vi (t )
TDi
t  Tp  TRi  Tp ln
Tp 1  vi (t )
Tp
Computation of Elmore Delay & Delay Bounds
in RC Trees
• Let C(Tk) be total capacitance of subtree rooted at k
• Elmore delay
TDi 
R
k  pi
k
 C (Tk )
upper bound :
Tp   Rk  C (Tk )
k
lower bound :
TRi   Rki2 
k
Ck
Rii
* all three formula can be computed in linear tim e recursivel y in a bottom - up
fashion
Comments on Elmore Delay Model
• Advantages
– Simple closed-form expression
• Useful for interconnect optimization
– Upper bound of 50% delay [Gupta et al., DAC’95, TCAD’97]
• Actual delay asymptotically approaches Elmore delay as input
signal rise time increases
– High fidelity [Boese et al., ICCD’93],[Cong-He, TODAES’96]
• Good solutions under Elmore delay are good solutions under
actual (SPICE) delay
Comments on Elmore Delay Model
• Disadvantages
– Low accuracy, especially poor for slope computation
– Inherently cannot handle inductance effect
• Elmore delay is first moment of impulse response
• Need higher order moments
Chapter 7.2
Higher-order Delay Model
Time Moments of Impulse Response h(t)
•
Definition of moments
L
h(t ) 

H ( s)


0
0
H ( s)   h(t )e  st dt  
  1
i
h(t )  ( st ) dt
 i 0 i!



1
i
  ( s)  h(t ) t i dt
0
i  0 i!

1
i
i-th moment
mi  (1)  h(t ) t i dt
0
i!
•
Note that m1 = Elmore delay when h(t) is monotone voltage response of
impulse input
Pade Approximation
•
H(s) can be modeled by Pade approximation of type (p/q):
Hˆ p ,q ( s ) 
–
•
where q < p << N
bp s p    b1s  b0
aq s q    a1s  1
Or modeled by q-th Pade approximation (q << N):
q
kj
ˆ
ˆ
H q ( s )  H q1,q ( s )  
j 1 s  p j
•
Formulate 2q constraints by matching 2q moments to compute ki’s &
pi’s
General Moment Matching Technique
•
Basic idea: match the moments m-(2q-r), …, m-1, m0, m1, …, mr-1
Hˆ ( s ) 
kq
k1
k2


s  p1 s  p2
s  pq
 mˆ 0  mˆ 1s    mˆ r 1s r 1  ( s r )
•
When r = 2q-1:
(i) initial condition matches, i.e.
hˆ(0 )  h(0 ), or lim sHˆ ( s )  lim sH ( s )
s 
s 
or ( mˆ 1  m1 )
(ii)
ˆ k  mk for k  0,1,,2q  2
m
moments of Hˆ ( s)
Compute Residues & Poles


ki
ki pi
ki


s  pi
1  s pi
pi

s j
( )

j 0 pi
k1  k 2    k q  h(0)  m1
kq
k1 k 2
( 
   )  m0
p1 p2
pq
EQ1
kq
k1 k 2
 ( 2  2    2 )  m1
p1 p2
pq

kq
k1
k2
 ( 2 q 1  2 q 1    2 q 1 )  m2 q  2
p1
p2
pq
( lim sHˆ ( s ))
s 
initial condition
match first 2q-1 moments
Basic Steps for Moment Matching
Step 1: Compute 2q moments m-1, m0, m1, …, m(2q-2) of H(s)
Step 2: Solve 2q non-linear equations of EQ1 to get
p1 , p2 ,, pq : poles
k1 , k2 ,, kq : residues
Step 3: Get approximate waveform
pq t
p1t
p2 t
ˆ
h(t )  k1e  k2e    kq e
Step 4: Increase q and repeat 1-4, if necessary, for better accuracy
Components of Moment Matching Model
• Moment computation
– Iterative DC analysis on transformed equivalent DC circuit
– Recursive computation based on tree traversal
– Incremental moment computation
• Moment matching methods
– Asymptotic Waveform Evaluation (AWE) [Pillage-Rohrer, TCAD’90]
– 2-pole method [Horowitz, 1984] [Gao-Zhou, ISCAS’93]...
• Moment calculation will be provided as an OPTIONAL reading
Chapter 7 Interconnect Delay
7.1 Elmore Delay
7.2 High-order model and moment matching
7.3 Stage delay calculation
Stage Delay
Source
A
Interconnect
B
Load
C
TAC  TAB ( I , L)  TBC ( L)
TAB (0, L) : Gate delay with out interconne ct
TAB ( I , L) : Gate delay with interconne ct
TBC ( L) : Propagatio n delay  Transition delay
TInterconnect  TAB ( I , L)  TAB (0, L)  TBC ( L)
Modeling of Capacitive Load
• First-order approximation: the driver sees the
total capacitance of wires and sinks
• Problem: Ignore shielding effect of resistance 
pessimistic approximation as driving point
admittance
• Transform interconnect circuit into a p-model
[O’Brian-Savarino, ICCAD’89]
– Problem: cannot be easily used with most device
models
• Compute effective capacitance from p-model
[Qian-Pullela-Pileggi, TCAD’94]
P-Model
[O’Brian-Savarino, ICCAD’89]
• Moment matching again!
• Consider the first three moments of driving point
admittance (moments of response current caused by
an applied unit impulse)
• Current in the downstream of node k
I k ( s) 
 C sV (s)
jTk
j
j
Yk ( s )  I k ( s ) / Vin ( s )
Yk ( s ) 

 C sH
jTk
j
j
( s)

( i ) i 1
C
s

(
1

m
s

m
s


)

C
m
 j
 j j s
jTk
(1)
j
( 2) 2
j
i 1 jTk
Driving-Point Admittance Approximations
• Driving-point admittance = Sum of voltage momentweighted subtree capacitance
yk( i ) 
( i 1)
C
m
 j j
jTk

Yk ( s )   yk( i ) s i
i 1
• Approximation of the driving point admittance at the
driver
Y (s )
General RC Tree:
lumped RC elements,
distributed RC lines
Driving-Point Admittance Approximations
• First order approximation: y(1) = sum of subtree
capacitance
(1)
Y ( s)
C  y (1)
• Second order approximation: yk(2) = sum of subtree
capacitance weighted by Elmore delay
Y ( 2) ( s)
R   y ( 2) /( y (1) ) 2
C  y (1)
Third Order Approximation: P Model
Y ( 3) ( s )
R
C2
y
(1)
 C1  C2
y ( 2)   RC12
y (3)  R 2C13
C1
( y ( 2) ) 2
C1  (3)
y
C2  y (1)  C1
y ( 2)
R 2
C1
Current Moment Computation
• Similar to the voltage moment computation
• Iterative tree traversal:
– O(n) run-time, O(n) storage
• Bottom-up tree traversal:
– O(n) run-time
– Can achieve O(k) storage if we impose order of traversal,
k = max degree of a node
– O’Brian and Savarino used bottom-up tree traversal
Bottom-Up Moment Computation
•
Maintain transfer function Hv~w(s) for sink w in subtree Tv, and
moment-weighted capacitance of subtree:
m for j  0 p, C for j  0 p  1
j
w
•
j
Tv
m
As we merge subtrees, compute new transfer function
Hu~v(s) and weighted capacitance recursively:
j 1
CTvj 1  mvj 1Cv   mvj 1 q CTqv ,
q 0
mvj  ( Rv CTvj 1  Lv CTvj  2 ) for j  1 p
•
u
p
u
Rv , Lv , Cv
mvp
v
New transfer function for node w
CTpu 1
mvp
CTpv 1
CTpv 1
H u ~ w ( s)  H u ~ v ( s)  H v ~ w ( s)
j
m   mvj q mwq for j  0 p
j
w
•
q 0
New moment-weighted capacitance of Tu:
CTuj 
j
C
 Tv for j  0 p  1
vchild ( u )
w
mwp
mwp
Current Moment Computation Rule #1
yU(1)  y D(1)  C
YU (s)
YD (s)
C
yU( 2 )  y D( 2 )
yU(3)  y D(3)
Current Moment Computation Rule #2
YU (s)
R
YD (s)
yU(1)  y D(1)
yU( 2)  y D( 2)  R( y D(1) ) 2
yU(3)  y D(3)  2 R( y D(1) )( y D( 2) )  R 2 ( y D(1) )3
Current Moment Computation Rule #3
YU (s)
R
YD (s)
C
yU(1)  y D(1)  C
( 2)
U
y
y
( 2)
D
1 2
 (1) 2
(1)
 R ( y D )  C ( y D )  C 
3 



yU(3)  y D(3)  R 2( y D(1) )( y D( 2) )  C ( y D( 2) ) 
2 2 (1)
2 3
 (1) 3 4
(1) 2
R ( y D )  C ( y D )  C ( y D )  C 
3
3
15 

2
Current Moment Computation Rule #4
(Merging of Sub-trees)
B
YD1 ( s)
YU (s)
(1)
yU(1)   y Di
i 1

B
B Branches
( 2)
yU( 2 )   y Di
i 1
YDB (s)
B
( 3)
yU( 3)   y Di
i 1
Example: Uniform Distributed RC Segment
Purely capacitive
TAB/TAB(0)
Wide metal (distributive)
Wide metal (p)
Wide metal (lumped RC)
Narrow metal (p)
Narrow metal (distributive)
Narrow metal (lumped RC)
Cload/Cmax
Why Effective Capacitance Model?
• The p-model is incompatible with existing empirical device
models
– Mapping of 4D empirical data is not practical from a storage
or run-time point of view
• Convert from a p-model to an effective capacitance
model for compatibility
• Equate the average current in the p-load and the Ceff
load
R
 in
C2
C1
 in
C eff
Equating Average Currents
Ip
 in
IC
R
 in
C1
C2
1
tD

tD
0
1
Ip (t )dt 
tD

tD
0
I C (t )dt
Ip ( s )  Yp ( s )Vout ( s )
I C ( s )  sCeff Vout ( s)
tD = time taken to reach 50% point,
not 50% point of input to 50% point of output
C eff
Waveform Approximation for Vout(t)
• Quadratic from initial voltage (Vi = VDD for falling waveform) to
20% point, linear to the 50% point
Vi  ct 2
Vout (t )  
a  b(t  t x )
0  t  tx
tx  t  tD
• Voltage waveform and first derivative are continuous at tx
a  Vi  ct x2
b  2ct x
Average Currents in Capacitors
tD
1  tx

I C (t ) 
C

(

2
ct
)
dt

C

(

2
ct
)
dt
eff
eff
x
tx

t D  0

 2Ceff  c  t x
tD
[t D 
tx
]
2
 2C2  c  t x
tx
I C2 (t ) 
[t D  ]
tD
2
• Average current of C1 is not quite as simple:
– Current due to quadratic current in C2
– Current due to linear current in C2
Average Currents in C1
• Average current for (0,tx) in C2
t x
2



 2C1  c t x
RC1 
2
  RC1t x  ( RC1 ) 1  e

I C1 (t ) 


tx
 2


• Average current for (tx,tD) in C2
t x
 ( t D t x )




 C1  c 
RC1  
RC1
2

2 RC1t x  2( RC1 ) 1  e
 1 e
I C1 (t ) 



tD  tx 




 ( t D t x )


RC1 
RC1

 2ct x C1 1 
1 e

 t D  t x 

• Average current for (0,tD) in C2
 ( t D t x )
t D


 2C1  c  t x2
RC1
RC1 
2
  t x t D  t x  RC1   ( RC1 ) e

I C1 (t ) 
e


tD  2



Computation of Effective Capacitance
• Equating average currents
Ceff
t D
2   ( t D t x )


RC1
( RC1 )  RC1
RC1 

 C2  C1 1 

e
e
tx
tx 

 t D  2 t x (t D  2 ) 

• Problem: tD and tx are not known a priori
• Solution: iterative computation
– Set the load capacitance equal to total capacitance
– Use table-lookup or K-factor equations to obtain tD and tx
t D  t d   in / 2
tx  tD   / 2
– Equate average currents and calculate effective capacitance
– Set load capacitance equal to effective capacitance and
iterate
Stage Delay Computation
• Calculate output waveform at gate
– Using Ceff model to model interconnect
• Use the output waveform at gate as the input
waveform for interconnect tree load
• Apply interconnect reduced-order modeling technique
to obtain output waveform at receiver pins
Download