Timing Closure

Timing Closure and
the constant delay paradigm
Problem: (timing closure problem)
• It has been difficult to get a circuit that meets
delay requirements because of inaccuracies in
delay models and wire load estimates.
• Iteration between logic synthesis and layout does
not converge
Solution: (sizing)
• Get a good circuit by logic synthesis.
• Lay it out and extract accurate wire loads.
• Size each gate to
1. Meet timing constraints
2. Use as little area as possible
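Single buffer delay model:
[Figure: buffer of size s with driver resistance R = r_0/s, input capacitance C_in = s c_0, parasitic capacitance C_p = s c_p, and load capacitance C_L]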
Assumptions:
1. Transistor modelled as an effective resistance
inversely proportional to device width
R = r0/s
2. Discharge network modelled as a linear
capacitance composed of constant part CL and
device dependent part, Cp = scp
3. Gate delay approximated by summing RiCi over all
nodes where Ri represents the total resistance
between node i and the output
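Buffer delay
With $R = r_0/s$, $C_{IN} = s c_0$ and $C_p = s c_p$:
$$
\begin{aligned}
\tau &= bR(C_L + C_p) = bRC_L + bRC_p \\
     &= bRC_{IN}\,\frac{C_L}{C_{IN}} + bRC_p \\
     &= b\,\frac{r_0}{s}\,s c_0\,\frac{C_L}{C_{IN}} + b\,\frac{r_0}{s}\,s c_p \\
     &= b r_0 c_0\,\frac{C_L}{C_{IN}} + b r_0 c_p \\
     &= g\,\frac{1}{f} + p
\end{aligned}
$$
$$\tau_{buffer} = \frac{g}{f} + p$$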
Sutherland delay equation
$$\tau = b r_0 c_0\,\frac{C_L}{C_{IN}} + b r_0 c_p = g\,\frac{1}{f} + p$$
g = br0c0 - computing effort
– size independent
– depends on:
• function
• topology
• relative transistor dimensioning in the gate type
p = br0cp - inherent (parasitic) delay
– size independent
1/f = CL/CIN - restoring effort
g/f = effort delay
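As a quick numerical check of the delay equation, the minimal sketch below evaluates the buffer delay both in its RC form and in the g/f + p form. The function names and every parameter value (including b = 0.69) are illustrative assumptions, not values from the slides.

```python
import math

def rc_delay(s, c_load, r0=1.0, c0=1.0, cp=0.5, b=0.69):
    """Delay in RC form: b * R * (C_L + C_p), with R = r0/s and C_p = s*cp."""
    resistance = r0 / s          # effective resistance shrinks with size s
    c_parasitic = s * cp         # parasitic capacitance grows with size s
    return b * resistance * (c_load + c_parasitic)

def effort_delay(s, c_load, r0=1.0, c0=1.0, cp=0.5, b=0.69):
    """Same delay as g/f + p, with g = b*r0*c0, p = b*r0*cp, 1/f = C_L/C_IN."""
    g = b * r0 * c0              # computing effort, independent of s
    p = b * r0 * cp              # parasitic delay, independent of s
    c_in = s * c0                # input capacitance of the sized buffer
    inv_f = c_load / c_in        # restoring effort 1/f
    return g * inv_f + p

# Both forms agree for any size and load (all values are assumed).
print(rc_delay(2.0, 8.0), effort_delay(2.0, 8.0))
# Scaling size and load together keeps 1/f, and hence the delay, constant.
print(math.isclose(rc_delay(1.0, 4.0), rc_delay(3.0, 12.0)))
```

The second print illustrates why the paradigm is called constant delay: if a gate is resized in proportion to its load, its delay does not change.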
Capacitance and Area
All input capacitances scale linearly with the load ($C_{in} = f_j C_j$).
For input i, the input capacitance, $g_i$, is proportional to $f_j C_j$:
$$g_i = \alpha_{ij} f_j C_j$$
Assume that the size of a gate is proportional to the sum of its input capacitances:
$$A_j = \sum_{i \in FI(j)} \alpha_{ij} f_j C_j = a_j f_j C_j, \qquad a_j = \sum_{i \in FI(j)} \alpha_{ij}$$
$\alpha_{ij}$ depends on the gate type of j.
[Figure: general gate j with inputs g_1, g_2, ..., g_k driving load C_j]
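Capacitances in networks
$$C_i = q_i + \sum_{p \in FO(i)} \alpha_{ip} f_p C_p$$
q_i - imposed capacitances (e.g. wire load)
Output capacitances are given.
[Figure: gate i, with imposed capacitance q_i, driving fanout gates j and k with loads C_j and C_k]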
Equations
$$C_i = q_i + \sum_{k \in FO(i)} \alpha_{ik} f_k C_k$$
$$A_j = \sum_{i \in FI(j)} \alpha_{ij} f_j C_j = a_j f_j C_j$$
$$\tau_j = \frac{g_j}{f_j} + p_j$$
where $\frac{1}{f_j} = \frac{C_j}{C_{IN}}$ is the restoring effort of gate j
Problem
Find {f_j} (gate sizes) to minimize the total area:
$$\sum_j a_j f_j C_j$$
while meeting delay requirements on all PI -> PO paths:
$$\mathrm{required}_{out}(path) - \mathrm{arrival}_{in}(path) - \sum_{j \in path} \left( \frac{g_j}{f_j} + p_j \right) \ge 0$$
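The objective and the path constraint can be evaluated directly for a candidate sizing. The sketch below is only an illustration of the two formulas: the function names, the list-based data layout, and the numbers are assumptions, and it checks a single path rather than enumerating all PI -> PO paths.

```python
def path_delay(gates, f):
    """Sum of g_j/f_j + p_j along one path; gates is a list of (g_j, p_j)."""
    return sum(g / fj + p for (g, p), fj in zip(gates, f))

def path_slack(required_out, arrival_in, gates, f):
    """Left-hand side of the delay constraint; must be >= 0 to be feasible."""
    return required_out - arrival_in - path_delay(gates, f)

def total_area(a, f, c):
    """Objective: sum over gates of a_j * f_j * C_j."""
    return sum(aj * fj * cj for aj, fj, cj in zip(a, f, c))

# Hypothetical two-gate path with assumed efforts and loads.
gates = [(1.0, 0.5), (2.0, 1.0)]   # (g_j, p_j) per gate
f = [0.5, 1.0]                     # candidate f_j values
slack = path_slack(6.0, 0.0, gates, f)
area = total_area([1.0, 2.0], f, [4.0, 3.0])
print(slack, area)
```

A nonnegative slack on every path means the sizing is feasible; the optimizer's job is then to reduce the area while keeping it so.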
Heuristic for distributing
restoring efforts
Sutherland's hypothesis of uniform restoring effort (1/f):
Given:
1. a network with an equal number of gates on every path
from PI to PO,
2. a capacitance at every PO, and
3. a driving capability at each PI,
the network is smallest (and meets delay
constraints) when every stage on each PI -> PO
path has the same restoring effort.
Heuristic solution
Assign delays to gates so that:
1. slack on each gate’s output is 0
2. restoring efforts are uniformly distributed to all gates
as much as possible
Iteratively,
1. find longest paths (in # gates).
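2. assign 0 slack and uniform restoring effort to path:
$$\mathrm{required}_{out} - \mathrm{arrival}_{in} = \sum_{j \in path} \left( \frac{g_j}{f_j} + p_j \right)$$
$$f_j = f_{path} = \frac{\sum_{j \in path} g_j}{\mathrm{required}_{out} - \mathrm{arrival}_{in} - \sum_{j \in path} p_j}$$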
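The zero-slack, uniform-effort assignment for one path reduces to a single division. The sketch below handles one path in isolation, which is an assumption made for brevity: the iterative heuristic would also have to keep gates already fixed by earlier (longer) paths. The function name and sample numbers are hypothetical.

```python
def uniform_path_effort(required_out, arrival_in, gates):
    """Return f_path such that the path has zero slack with equal f on every gate.

    gates is a list of (g_j, p_j) pairs for the gates on the path.
    """
    sum_g = sum(g for g, _ in gates)
    sum_p = sum(p for _, p in gates)
    budget = required_out - arrival_in - sum_p   # time left for effort delay
    if budget <= 0:
        # The requirement does not exceed the total parasitic delay.
        raise ValueError("delay requirement cannot be met by sizing")
    return sum_g / budget

# Hypothetical three-gate path.
gates = [(1.0, 0.5), (2.0, 1.0), (1.0, 0.5)]
f_path = uniform_path_effort(10.0, 0.0, gates)
# Check: with f_j = f_path the path delay equals the available time.
delay = sum(g / f_path + p for g, p in gates)
print(f_path, delay)
```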
Solving for the Ci and Ai
$$C_i = q_i + \sum_{k \in FO(i)} \alpha_{ik} f_k C_k$$
Given:
• { qi }
• { fk } (just solved for)
• Ci for each of the primary outputs
Find: Ci for all i
Ci can be computed in reverse topological order
Areas are
$$A_j = a_j f_j C_j$$
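The reverse-topological sweep can be sketched with the standard-library `graphlib` module (Python 3.9+). The data layout is an assumption: `alpha[(i, k)]` is a name chosen here for the per-input proportionality coefficient of the pin of gate k driven by gate i, and gates with no gate fanout are treated as driving primary outputs with a given capacitance.

```python
from graphlib import TopologicalSorter

def solve_capacitances(fanout, alpha, f, q, c_out):
    """Compute C_i = q_i + sum_k alpha[(i,k)] * f[k] * C_k, fanouts first."""
    # TopologicalSorter emits a node only after the nodes listed for it,
    # so passing the fanout map yields a reverse topological order.
    order = TopologicalSorter(fanout).static_order()
    cap = {}
    for i in order:
        fo = fanout.get(i, [])
        if not fo:
            cap[i] = c_out[i]    # primary output: capacitance is given
        else:
            cap[i] = q[i] + sum(alpha[(i, k)] * f[k] * cap[k] for k in fo)
    return cap

def gate_areas(a, f, cap):
    """A_j = a_j * f_j * C_j for every gate."""
    return {j: a[j] * f[j] * cap[j] for j in a}

# Hypothetical network: g1 drives g2 and g3, which drive primary outputs.
fanout = {"g1": ["g2", "g3"]}
alpha = {("g1", "g2"): 1.0, ("g1", "g3"): 2.0}
f = {"g1": 0.5, "g2": 0.5, "g3": 0.5}
q = {"g1": 1.0}                       # imposed (wire) load on g1's output
c_out = {"g2": 4.0, "g3": 6.0}
cap = solve_capacitances(fanout, alpha, f, q, c_out)
areas = gate_areas({"g1": 1.0, "g2": 1.0, "g3": 2.0}, f, cap)
print(cap["g1"], areas)
```

Because each C_i depends only on the capacitances of its fanouts, a single pass in this order is enough; no iteration is needed once the f_k are fixed.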
Constant delay synthesis
1. Logic synthesis
– structures network for speed (technology independent)
– does load independent technology mapping
– inserts buffers (heuristically)
2. Layout
– wire loads are extracted
3. Delays on each gate are assigned to a constant by zero slack and uniform restoring effort heuristic
4. Gate areas change, but this does not perturb layout significantly
Yields better circuit properties
– smaller area
– lower power consumption
– timing closure (no need to iterate logic synthesis)
Library (continuous sizes?)
– method requires sizing to meet delays
Constant delay synthesis
• Delay requirements are always met as long as they
are not less than the parasitic delay
• Area will depend on delay requirements
– Area-delay tradeoff curve
• Heuristic may not yield minimum area.
– Could solve the nonlinear program for minimizing area
• What technology independent and dependent logic
synthesis techniques lead to smaller areas?
– is the final area very sensitive to these?
• The problem of timing closure is alleviated.
– fix delay first and then find area
– the other way (classical approach) is to:
• guess at loads and synthesize to meet delays
• update loads and resynthesize, etc.
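To make the area-delay tradeoff concrete, the sketch below sizes a simple chain of identical single-input gates for several delay requirements and reports the resulting total area. Everything here is an assumed toy setup (identical gates, no wire load, unit pin coefficient, arbitrary numbers); it is meant only to show the shape of the curve, not to reproduce any result from the slides.

```python
def chain_area(required, n_gates, g=1.0, p=0.5, a=1.0, c_out=8.0):
    """Total area of an n-gate chain sized for zero slack with uniform f."""
    budget = required - n_gates * p      # time left after parasitic delay
    if budget <= 0:
        raise ValueError("requirement does not exceed total parasitic delay")
    f = n_gates * g / budget             # uniform f for zero slack
    area, cap = 0.0, c_out
    for _ in range(n_gates):             # walk from the output back to the input
        area += a * f * cap              # A_j = a_j * f_j * C_j
        cap = f * cap                    # this gate's input loads the previous one
    return area

# Looser delay requirements give smaller total area.
for required in (4.5, 6.0, 7.5):
    print(required, chain_area(required, 3))
```

In this toy chain the delay is fixed first and the area follows from it, which is the order of operations the constant delay approach advocates.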