Timing Closure and the constant delay paradigm

Problem: (timing closure problem)
• It has been difficult to get a circuit that meets delay requirements because of inaccuracies in delay models and wire load estimates.
• Iteration between logic synthesis and layout does not converge.

Solution: (sizing)
• Get a good circuit by logic synthesis.
• Lay it out and get good numbers on wire loads.
• Size each gate to
  1. Meet timing constraints
  2. Use as little area as possible

Single buffer delay model

[Figure: buffer of size $s$ with drive resistance $R = r_0/s$, input capacitance $C_{in} = s c_0$, parasitic capacitance $C_p = s c_p$, and load $C_L$]

Assumptions:
1. Transistor modelled as an effective resistance inversely proportional to device width: $R = r_0/s$
2. Discharge network modelled as a linear capacitance composed of a constant part $C_L$ and a device-dependent part $C_p = s c_p$
3. Gate delay approximated by summing $R_i C_i$ over all nodes, where $R_i$ represents the total resistance between node $i$ and the output

Buffer delay

$$
\begin{aligned}
\tau &= bR\,(C_L + C_P) \\
     &= bRC_L + bRC_P \\
     &= bRC_{IN}\,\frac{C_L}{C_{IN}} + bRC_P \\
     &= b\,\frac{r_0}{s}\,s c_0\,\frac{C_L}{C_{IN}} + b\,\frac{r_0}{s}\,s c_p \\
     &= b r_0 c_0\,\frac{C_L}{C_{IN}} + b r_0 c_p \\
     &= g\,\frac{1}{f} + p
\end{aligned}
$$

[Figure: the same buffer, with $R = r_0/s$, $C_p = s c_p$ and load $C_L$, abstracted as a block labelled with $g$, $p$, $f$]

Sutherland delay equation

$$
\tau = b r_0 c_0\,\frac{C_L}{C_{IN}} + b r_0 c_p = g\,\frac{1}{f} + p
$$

$g = b r_0 c_0$: computing effort
– size independent
– depends on:
  • function
  • topology
  • relative transistor dimensioning in the gate type

$p = b r_0 c_p$: inherent (parasitic) delay
– size independent

$1/f = C_L/C_{IN}$: restoring effort

$g/f$: effort delay

Capacitance and Area

All input capacitances scale linearly with the load ($C_{in} = f_j C_j$).

For input $i$ of gate $j$, the input capacitance $g_i$ is proportional to $f_j C_j$:

$$
g_i = \gamma_{ij}\, f_j C_j
$$

Assume that the size of a gate is proportional to the sum of its input capacitances:

$$
A_j \propto \sum_{i \in FI(j)} \gamma_{ij}\, f_j C_j, \qquad A_j = a_j f_j C_j
$$

$\gamma_{ij}$ depends on the gate type of $j$.

[Figure: general gate $j$ with inputs $g_1, g_2, \ldots, g_k$ and output load $C_j$]

Capacitances in networks

$$
C_i = q_i + \sum_{p \in FO(i)} \gamma_{ip}\, f_p C_p
$$

[Figure: gate $i$ driving fanout gates $p$; output capacitances $C_j$, $C_k$ given; imposed capacitance $q_i$ on the output of $i$]

$q_i$: imposed capacitances (e.g.
wire load)

Equations

$$
C_i = q_i + \sum_{k \in FO(i)} \gamma_{ik}\, f_k C_k
$$

$$
A_j = \sum_{i \in FI(j)} \gamma_{ij}\, f_j C_j = a_j f_j C_j
$$

$$
\tau_j = \frac{g_j}{f_j} + p_j
$$

where $\tau_j$ is the delay of gate $j$, $\gamma_{ik}$ is the input-capacitance coefficient of input $i$ of gate $k$ (it depends on the gate type), and $1/f_j = C_j / C_{IN}$ is the restoring effort of gate $j$.

Problem

Find $\{ f_j \}$ (gate sizes) to minimize the total area

$$
\sum_j a_j f_j C_j
$$

while meeting the delay requirements

$$
\text{required}_{out}(path) - \text{arrival}_{in}(path) - \sum_{j \in path} \left( \frac{g_j}{f_j} + p_j \right) \ge 0
$$

on all PI -> PO paths.

Heuristic for distributing restoring efforts

Sutherland's hypothesis of uniform restoring effort ($1/f$):

Given:
1. a network with an equal number of gates on every path from PI to PO,
2. a capacitance at every PO, and
3. a driving capability at each PI,

the network is smallest (and meets delay constraints) when every stage on each PI -> PO path has the same restoring effort.

Heuristic solution

Assign delays to gates so that:
1. slack on each gate's output is 0
2. restoring efforts are distributed as uniformly as possible over all gates

Iteratively,
1. find the longest paths (in number of gates).
2. assign 0 slack and uniform restoring effort to the path:

$$
\text{required}_{out} - \text{arrival}_{in} = \sum_{j \in path} \left( \frac{g_j}{f_j} + p_j \right)
$$

With $f_j = f_{path}$ for every gate on the path, this gives

$$
f_{path} = \frac{\sum_{j \in path} g_j}{\text{required}_{out} - \text{arrival}_{in} - \sum_{j \in path} p_j}
$$

Solving for the $C_i$ and $A_i$

$$
C_i = q_i + \sum_{k \in FO(i)} \gamma_{ik}\, f_k C_k
$$

Given:
• $\{ q_i \}$
• $\{ f_k \}$ (just solved for)
• $C_i$ for each of the primary outputs

Find: $C_i$ for all $i$

$C_i$ can be computed in reverse topological order.

Areas are $A_j = a_j f_j C_j$.

Constant delay synthesis

1. Logic synthesis
  – structures network for speed (technology independent)
  – does load-independent technology mapping
  – inserts buffers (heuristically)
2. Layout
  – wire loads are extracted
3. Delays on each gate are assigned to a constant by the zero-slack and uniform restoring effort heuristic
4. Gate areas change, but this does not perturb layout significantly

Yields better circuit properties
  – smaller area
  – lower power consumption
  – timing closure (no need to iterate logic synthesis)

Library (continuous sizes?)
  – method requires sizing to meet delays

Constant delay synthesis

• Delay requirements are always met as long as they are not less than the parasitic delay.
• Area will depend on the delay requirements.
  – Area-delay tradeoff curve
• Heuristic may not yield minimum area.
  – Could solve the nonlinear program for minimizing area
• What technology-independent and technology-dependent logic synthesis techniques lead to smaller areas?
  – Is the final area very sensitive to these?
• The problem of timing closure is alleviated.
  – Fix delay first and then find area
  – The other way (classical approach) is to
    • guess at loads and synthesize to meet delays
    • update loads and resynthesize, etc.
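
The sizing flow described in the earlier slides (uniform restoring effort on a path, then capacitances in reverse topological order, then areas) can be illustrated with a small sketch. This is not the authors' implementation. The function names, the dictionary-based netlist, the coefficient table `gamma`, and the three-inverter example values are all assumptions made for illustration. The sketch also assumes that the load of each gate driving a primary output is given directly.

```python
from typing import Dict, List, Tuple


def uniform_path_effort(g: List[float], p: List[float],
                        required_out: float, arrival_in: float) -> float:
    """Zero-slack, uniform-effort f for one path (hypothetical helper).

    Solves required_out - arrival_in = sum(g_j / f + p_j) for f.
    """
    budget = required_out - arrival_in - sum(p)
    if budget <= 0:
        # The requirement must exceed the total parasitic delay of the path.
        raise ValueError("delay requirement does not exceed parasitic delay")
    return sum(g) / budget


def solve_capacitances(topo_order: List[str],
                       fanout: Dict[str, List[str]],
                       gamma: Dict[Tuple[str, str], float],
                       f: Dict[str, float],
                       q: Dict[str, float],
                       c_po: Dict[str, float]) -> Dict[str, float]:
    """C_i = q_i + sum over k in FO(i) of gamma_ik * f_k * C_k.

    Gates are visited in reverse topological order, so every fanout load
    C_k is already known when C_i is computed. c_po holds the given loads
    of the gates that drive primary outputs (an assumption of this sketch).
    """
    C = dict(c_po)
    for i in reversed(topo_order):
        if i in C:
            continue
        C[i] = q.get(i, 0.0) + sum(
            gamma[(i, k)] * f[k] * C[k] for k in fanout.get(i, []))
    return C


def gate_areas(a: Dict[str, float], f: Dict[str, float],
               C: Dict[str, float]) -> Dict[str, float]:
    """A_j = a_j * f_j * C_j for every gate j."""
    return {j: a[j] * f[j] * C[j] for j in a}


# Illustrative chain g1 -> g2 -> g3 -> PO with made-up unit values.
gates = ["g1", "g2", "g3"]
f_path = uniform_path_effort(g=[1.0, 1.0, 1.0], p=[1.0, 1.0, 1.0],
                             required_out=9.0, arrival_in=0.0)
f = {j: f_path for j in gates}
C = solve_capacitances(
    topo_order=gates,
    fanout={"g1": ["g2"], "g2": ["g3"]},
    gamma={("g1", "g2"): 1.0, ("g2", "g3"): 1.0},
    f=f,
    q={},                  # no wire load in this toy example
    c_po={"g3": 4.0},      # given load at the primary output
)
areas = gate_areas({j: 1.0 for j in gates}, f, C)
total_area = sum(areas.values())
```

In this toy chain each stage gets delay $g/f + p = 2 + 1 = 3$, so the three stages use exactly the 9 units of available time. Tightening `required_out` raises `f_path` and therefore every upstream capacitance and area, which is the area-delay tradeoff mentioned above.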