Chapter 2: Static Timing Analysis Massoud Pedram Dept. of EE University of Southern California Outline Background Gate Delay Analysis Wire-Load Delay Analysis M. Pedram K-factor Approximation Effective Capacitance Approach Interconnect Modeling Transmission Line Equations Elmore Delay S2P Approach Appendices Motivation M. Pedram Does the design meet a given timing requirement?!! How fast can I run the design?!!! The Problem and Its “Solution” What’s the problem? Delays on signals due to wires no longer negligible Modern designs must meet tight timing specifications Layout tools must guarantee these timing specifications How can we address the problem during physical design? By ignoring it, mostly Implicitly, qualitatively M. Pedram We try to make layout area small and wires short We rely on cell libraries with many cell drivers strengths We employ accurate, yet efficient, timing analysis tools We develop timing-aware design optimization methodologies, flows and tools Background 15% delay Mid 80 Scenario 85% delay 50% delay Mid 90 Scenario 50% delay 80% delay M. Pedram Half of input to output delay of the logic is due to wire delay Today’s Scenario 20% delay Most of the input to output delay of the logic is due to gate delay Most of input to output delay of the logic is due to wire delay Guidelines to Meet Timing With higher chip speeds and densities on the horizon, getting quick and accurate feedback on signal delays during design has become a critical issue. With everincreasing time-to-market pressures, designers must be able to: M. Pedram Analyze timing early and often throughout the design process by using high-fidelity and efficient timing analysis tools Decrease the number of independent optimization steps in the design cycle (unification-based approach) Eliminate the time spent debugging erroneous timing results Have the capability to make design changes that can be retimed quickly without having to entirely re-run static timing analysis from scratch in a separate environment Ensure that all intermediate timing analysis results correlate well with the final timing results Verify the delay and timing of the finished product Input-to-Output Propagation Delay A B Inv 1 C Inv 2 Delay AC Delay AB Delay BC The circuit delay in VLSI circuits consists of two components: M. Pedram The 50% propagation delay of the driving gates (known as the gate delay) The delay of electrical signals through the wires (known as the interconnect delay) Gate Delay and Output Transition Time T in Gate/Cell Cload Gate Delay f (Tin , Cload ) Output M. Pedram Transition Time f (Tin , Cload ) The gate delay and the output transition time are functions of both input slew and the output load General Model of a Cell M. Pedram Definitions Output Transition Time Vout Wp Vin Vout Vin 90% CM Cdiff tin Cload Wn tout 10% Time Cout Gate Delay M. Pedram Output Response for Different Loads M. Pedram Output Transition Time 11 x 10 9 8 7 6 5 2 2.4 2.8 3.2 3.6 4 -10 10-10 Input Transition Time (s) x 10 x 10 10 Output Transition time (s) Output Transition time (s) 11 Size=69 Cout=23fF Size=48 Cout=15fF Size=90 Cout=18fF 10 M. Pedram -11 -11 Size=60 Tin=300pS Size=81 Tin=350pS Size=45 Tin=200pS 9 8 7 6 1 1.4 1.8 2.2 CLoad (F) 2.6 3 -14 10-14 x 10 Output transition time as a function of input transition time and output load ASIC Cell Delay Model Three approaches for gate propagation delay computation are based on: M. Pedram Delay look-up tables K-factor approximation Use of a Thevenin equivalent circuit composed of a voltage source and a resistance in series with the gate load. Although the first approach is currently in wide use especially in the ASIC design flow, the third approach promises to be more accurate when the load is not purely capacitive. This is because it directly captures the interaction between the load and the gate/cell structure. The resistance value in the Thevenin model is strongly dependent on the input slew and output load and requires output voltage fitting. Table Look-Up Method Cload (fF ) 0 5 10 500 505 510 50 70 T in (pS) 90 115pS 110 310 330 M. Pedram What is the delay when Cload is 505f F and Tin is 90pS? K-factor Approximation x 10 Output Transition time (s) 10 -11 11 Size=69 Cout=23fF Size=48 Cout=15fF Size=90 Cout=18fF Output Transition time (s) 11 9 8 7 6 5 2 2.4 2.8 3.2 3.6 4 -10 Input Transition Time (s) x 1010-10 x 10 10 -11 Size=60 Tin=300pS Size=81 Tin=350pS Size=45 Tin=200pS 9 8 7 6 1 1.4 1.8 2.2 CLoad (F) 2.6 3 -14 x 10 10 -14 According to above, we can write the output transition time as a function of input transition time and output load as a polynomial functions with curve fitting. As an example, consider: Toutput k1 k2Cload Tin (k3 k4Cload ) k5Cload 2 M. Pedram A similar equation (with different coefficients, of course, gives the gate delay One Dimensional Table tr (C1) tr1 tr (C2 ) tr 2 tr (C ) a1C a2 t tr1 a1 r 2 C2 C1 tr1C2 tr 2C1 a2 C2 C1 t tr1 t C tr 2C1 tr (C L ) r 2 C L r1 2 C2 C1 C2 C1 M. Pedram Two Dimensional Table D2 D1 D3 D4 D(C , tin ) k1 k2CL k3tin k4CLtin k1 ( D4C1t1 D3C2t1 D2C1t2 D1C2t2 ) / W k 2 ( D3t1 D4t1 D1t2 D2t2 ) / W k 3 ( D2C1 D4C1 D1C2 D3C2 ) / W k 4 ( D1 D2 D3 D4 ) / W M. Pedram W (C1 C2 )(t1 t2 ) Second-order RC- Model Using Taylor Expansion around s = 0 T in R Gate /Cell T in Gate /Cell C1 Yin (s) A1s A2s A3s ..... 2 M. Pedram 3 A22 C1 A1 A3 C2 Y in (s) (C1 C2 ) s R C22s 2 R 2C23s3 ..... R A32 A23 A22 C2 A3 Second-order RC- Model (Cont’d) T in R Gate /Cell C1 C2 Gate Delay f (Tin , C1, R , C2 ) This equation requires creation of a fourdimensional table to achieve high accuracy M. Pedram This is however costly in terms of memory space and computational requirements Effective Capacitance Approach T in R Gate /Cell C2 C1 T in Gate /Cell C eff M. Pedram The “Effective Capacitance” approach attempts to find a single capacitance value that can be replaced instead of the RC- load such that both circuits behave similarly during transition Output Response for Effective Capacitance M. Pedram Effective Capacitance (Cont’d) M. Pedram Effective Capacitance (Cont’d) T in T in R Gate /Cell Gate /Cell C1 C2 Ceff C1 kC2 M. Pedram C eff 0<k<1 Because of the shielding effect of the interconnect resistance , the driver will only “see” a portion of the far-end capacitance C2 R 0 k=1 R ∞ k=0 Effective Capacitance for Different Resistive Shielding M. Pedram Macy’s Approach Tin1 Tin2 R GATE 1 GATE 2 C1 M. Pedram R C2 C1 C2 Assumption: If two circuits have the same loads and output transition times, then their effective capacitances are the same. In other words, the effective capacitance is only a function of the output transition time and the load Macy’s Approach (Cont’d) Normalized Effective Capacitance Function C1 C1 C2 T out R C2 M. Pedram Ceff C1 C2 0 1 Macy’s Iterative Solution 1. Compute from C1 and C2 Tin1 R GATE 1 C1 2. Choose an initial value for Ceff 3. Compute Tout for the given Ceff and Tin C1 C1 C2 Ceff 4. Compute 5. Compute from and 6. Find new Ceff 7. Go to step 3 until Ceff converges M. Pedram C1 C2 C2 Tout R C2 USC’s Approach T in R Gate /Cell C2 C1 Rd TR R M C1 Vdd t t B Ae Cosh(t ) T VM (t ) R Vdd T A'e t Cosh(t ' ) TR R M. Pedram C2 0 t TR TR t USC’s Approach (Cont’d) T in Gate /Cell C eff TR Rd N Ceff M. Pedram t Vdd Rd Ceff t R C R C e ) d eff d eff T R VN (t ) TR t Vdd T R C (1 e Rd Ceff )e Rd Ceff d eff TR R 0 t TR TR t Eff_Cap Equation 1 e t Ceff (C1 C2 ) Cosh(t ) Cosh( ) (1 e M. Pedram t Rd Ceff ) See expressions for , , and in the paper by Abbaspour/Pedram This is a non-Linear algebraic equation, which must be solved by iteration A good initial value for Ceff can speed up the procedure to find the answer Initial Guess 35 35 30 30 Ceff (fF) Ceff (fF) (C1=15fF, C2=20fF) 25 20 15 0 20 20 40 60 80 15 0 20 40 60 80 Rp (K) Rp (K) (a) driver size=500l,TR=100pS (b) driver size=100l,TR= 200pS Ceff M. Pedram 25 Rd C1 C2 Rd R USC’s Iterative Solution 1. 2. 3. 4. Start with the initial guess for Ceff Obtain t0-50% based on values of Ceff and TR Obtain Rd based on values of Ceff and TR Compute a new value of Ceff from the Eff_Cap equation 5. Record the previous value of t0-50% . Find current t050% based on the new Ceff and given TR 6. Compare the previous and current values of t0-50% from step 5 7. If not within acceptable tolerance, then return to step 3 until t0-50% converges 8. Report t50% propagation delay and t0-80% from the table M. Pedram Interconnect Analysis M. Pedram So far, we have only discussed the gate delay How do we calculate the interconnect delay? Precise delay calculation needs transmission line analysis Transmission Line Equations Drop across R and L is : V I RI L x t Current through C and G is : I V GV C x t Infinitesimal Model of a Transmission Line The resulting equation is as follows: 2V V 2V RGV ( RC LG) LC t x2 t 2 M. Pedram Lumped Model of a Transmission Line M. Pedram Impedance of an Infinite Line An infinite length of RLCG transmission line has an impedance: R Ls Z0 G Cs Driving a line terminated in Z0 is the same as driving Z0 In general, Z0 is complex and frequency dependent. For LC lines, Z0 is real and independent of frequency and is given by: L Z0 M. Pedram C Simple Transmission Line Models Generally speaking, G=0 Ideal Lumped Wires (purely C, R or L) RC Transmission Lines Long on-chip wires; L=0; Diffusion Equation C Most off-chip wires; R=0; Wave equation Reflections and the Telegrapher’s equation L Lossy RLC Transmission Lines M. Pedram R Lossless LC Transmission Lines Pure C section: most short signal lines and short sections of low-impedance transmission lines Pure R section: on-chip power supply wires Pure L section: off-chip power supply wires and short sections of high-impedance transmission lines Wave attenuation and DC attenuation Combined traveling wave and diffusive response The skin effect C R L C Low-Frequency RC Line, 24AWG Twisted Pair M. Pedram R=0.08 /m, C=40 pF/m and L=400 nH/m 1 f0=R/(2L)=33Khz 0.08 400 109 2 fj 2 Below f0, Line is RC with Z0 12 2 fj 40 10 Above f0, Line is LC with Z0 100 Lossy RC Transmission Lines Most real lines dissipates power. The loss is due to resistance of the conductor and conductance of the insulators. RC lines are an extreme case: R >> L Propagation is governed by the diffusion equation: Typical of on-chip wires: 2V V M. Pedram R=150k/m L=600nH/m f=40Ghz x2 RC t Step Response of a Lossy RC Line Signal is dispersed as it propagates down the line: Delay: td 0.4d 2 RC Rise Time: tr d 2 RC M. Pedram R increases with length, d C increases with length, d Delay and rise time are proportional to RC and then increases with d2 In many cases the degradation of the rise time caused by the diffusive nature of RC lines as much a problem as the delay Lossless LC lines If R and G are negligible, then line is lossless (i.e., no heat generation and the line is governed by the wave equation) Waves propagate in both directions without any loss Line is described by its impedance and velocity 2V 2V LC 2 x t 2 x V f ( x, t ) V 0, t v 1 v ( LC ) 2 M. Pedram x x Vr ( x, t ) V xmax , t max v 1 L 2 Z0 C Reflections and the Telegrapher’s Equation in a Lossless LC Line When a traveling wave reaches the end of the line with impedance Z0 terminated in an impedance ZT, then it is reflected Telegrapher’s equation relates the magnitude and phase of the incident wave to those of the reflected wave as follows: Kr = Ir/Ii = Vr/Vi = (ZT-Z0)/(ZT+Z0) Some common terminations M. Pedram Open-circuit Short-circuit Matched Source termination and multiple reflections Lossy RLC Transmission Lines LC lines with resistance in conductors and conductance in dielectrics: Combined traveling wave and diffusive response The amplitude of the traveling wave is reduced exponentially with distance along the line For a line with matched termination, the steady-state response is attenuated by a DC amount proportional to the inverse of the length of the line Disperses the signal M. Pedram Fast rise due to traveling wave behavior Slow tail due to diffusive relaxation Resistance (due to the skin effect) and conductance (due to dielectric absorption) are in fact dependent on the frequency. Both effects result in increased attenuation at higher frequencies d J exp( ) ( f )1/ 2 Skin effect: High frequency current density falls off exponentially within depth into conductor Step Response of 1 meter of 8mil Stripguide M. Pedram Lumped RC Model for On-chip Wires vs(t) R v1(t) C M. Pedram Impulse response and Step Response of a lumped RC circuit RC Model (cont’d) Transfer Function of RC Networks 1 v ( s) 1/ sC 1 H (s) 1 sRC vs ( s ) R 1/ sC 1 sRC 1 1 sRC for for M. Pedram s0 H ( s ) 1 ( RC ) s ( RC ) 2 s 2 ... m0 m1s m2 s 2 ... s 1 H ( s) sRC 1 1 1 2 1 s s ... 2 ( RC ) ( RC ) RC Model: Important Notes Impulse response: h(t )e st dt 0 M. Pedram h(t ) dt 0 0 1 e RC t RC 1 1 1 (1 t t 2 ....) RC RC 2!( RC ) 2 0 h(t ) h(t )tdt h(t )( s 0 H (s) d st e )dt ds s 0 s 0 0 dk k k h(t )t dt ( 1) H (s) k ds s 0 mk 1 d H (s) ds s 0 RC Elmore Delay 1v t V (t ) 1 e RC V(t) R C Time Constant=RC What is the time constant for more complex circuits? A B Inv 1 M. Pedram C Inv 2 Elmore Delay (Cont’d) Resistance-oriented Formula: Tdelay Ri Cdownstream,i i on path Tdelay,4=R1(C1+C2+C3+C4+C5)+R2(C2+C4+C5)+R4C4 M. Pedram Elmore Delay (Cont’d) The Elmore delay is negative of the first moment of the impulse response, -m1 m(1) 0 M. Pedram th(t )dt and tmedian h(t )dt 0.5 0 If the impulse response is symmetric, then –m1 = tmedian However, in RC network, tmedian < -m1 Thus, the Elmore delay gives an upper bound of the delay Elmore Delay Approximation M. Pedram 3-pole Reduced Order Approximation M. Pedram Stable 2-Pole RC delay calculation (S2P) M. Pedram The Elmore delay is the metric of choice for performance-driven design applications due to its simple, explicit form and ease with which sensitivity information can be calculated However, for deep submicron technologies (DSM), the accuracy of the Elmore delay is insufficient Driving Point Admittance Let Y(s) be an driving point admittance function of a general RC circuit. Consider its representation in terms poles and residues: q Y (s) n 1 kn k0 s pn Moments of Y(s) can be written as: q mi M. Pedram where q is the exact order of the circuit kn pni 1 n 1 i0 Recursive Calculation of Moments of Y(s) i Ri Li j Yi Ci Yj Yi ( s) sm1,i s 2 m2,i s3m3,i .... m1,i m1, j Ci mk ,i mk , j Ri k 1 l 1 M. Pedram k 2 ml ,i mk l , j Li l 1 ml ,i mk l 1, j Ri Ci mk 1,i Li Ci mk 2,i k2 Moments of Impulse Response M. Pedram The first moment of the impulse response, H(s), is known as Elmore delay which is shown before The computation of the additional moments beyond the first moment comes with very little incremental cost. This process can be implemented using a vectorized path tracing algorithm like the one described in the paper “RICE:Rapid Interconnect circuit evaluation using AWE” Moments of H(s) Moments of H(s) are coefficients of the Taylor’s Expansion of H(s) about s=0 d H ( s) H 0 s H ( s) ds s0 1 d2 2 s H ( s) 2 2! ds m(0) sm(1) s2m(2) s3m(3) ... 1 dj ( j ) m H (s) j j ! ds M. Pedram s0 s0 1 d3 3 s H ( s) 3 3! ds s0 ... S2P Algorithm S2P Algorithm 1. 2. 3. Compute m1, m2, m3 and m4 for Y(s) Find the two poles at the driving point admittance as follows: To match the voltage moments at the response nodes, solve the Vandermonde equations to get the following result: Notice that m0* and m1* are the moments of H(s). m0* is the Elmore delay. 4. M. Pedram The S2P approximation is then expressed as: Appendix I: Characteristics of Moments H ( s) 0 h(t )est dt 1 1 h(t ) 1 st s 2 t 2 s3 t 3 ... dt 2! 3! 0 m(0) 0 m(1) h(t )dt th(t )dt 0 1 (2) m t 2 h(t )dt 2! 0 1 (3) m t 3h(t )dt 3! 0 ......... (1) j ( j ) m t j h(t )dt j! 0 M. Pedram j 0 (1) j s t j h(t )dt j! 0 Appendix II: Pade Approximation Pade approximation of the transfer function H(s) is a rational function as shown below: Pp ( s ) a0 sa1 s 2 a2 ...s p a p H p, q ( s ) H ( s) O( s p q 1) Qq ( s ) 1 sb s 2b ... s q b 1 2 q Pp ( s ) s p q 1r ( s) (0) (1) 2 (2) p q ( p q ) m sm s m ..... s m Qq ( s ) Qq ( s ) a0 sa1 s 2 a2 ... s p a p (1 sb1 s 2b2 ... s q bq ) m(0) sm(1) s 2m(2) ..... s p q m( p q) s p q 1r ( s ) M. Pedram Pade Approximation (Cont’d) m(0) m(1) ... (q 2) m m(q 1) m(1) m(2) ... m(q 1) m( q ) m(2) m(3) ... m( q ) m(q 1) ... m(q 1) bq m(q) ... m(q) bq 1 m(q 1) ... ... ... ... m(2q 3) b1 m(2q 2) b ... m(2q 2) 0 m(2q 1) a0 m(0) a1 m(1) m(0)b1 a2 m(2) m(1)b1 m(0)b2 ... min( p, q) ap m( p) m( p j ) b i 1 M. Pedram j Appendix III: RLC Model vs(t) v1(t) R L C Overdamped v1( s ) , or 1 v1(t ) vdd 1 2 2 R , 2L 1 LC 2 vdd 2 2 s( s 2 s ) L R 2C / 4 2 2 t e 1 2 2 2 2 1 Critically damped Step Response , or 2 2 t e 2 2 1 L R 2C / 4 v1(t ) vdd 1 (t 1)et Underdamped , or L R 2C / 4 jd ( jd )t jd ( jd )t v1(t ) vdd 1 e e 2 jd 2 jd M. Pedram d 2 2 Voltage across the Capacitance R L vC(t) vR(t) vL(t) C L=0.0nH L=1.0nH L=2.0nH Time(s) M. Pedram Inductance changes the behavior of the voltage from underdamped to overdamped Voltage across the Resistance R L vC(t) vR(t) vL(t) C L=0.0nH L=1.0nH L=2.0nH Time(s) M. Pedram Voltage across the Inductance R L vC(t) vR(t) vL(t) C L=0.0nH L=1.0nH L=2.0nH Time(s) M. Pedram