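Fast Static Timing Analysis
鍾逸亭

Outline
1. Background
   – Timing delay issue
   – Floating mode sensitization
2. A closer look at the Circuit Delay Satisfiability Model of [1]
3. How to compute and solve timing problems quickly
   – Compute the true delay of a circuit
   – Decide whether a path is a false path
   – Find all true paths with delay >= D
4. Drawbacks of the above methods and a compromise
   – Three Value Simulation Model

Timing Delay Issue
• Topological delay
  [Figure: circuit from PI to PO annotated with arrival time, required time, slack, and max delay]
• False paths may cause inaccurate delay estimation

Floating mode sensitization
• Floating mode sensitization
  – The on-input is the earliest controlling value: the on-input is (c, t) and every side-input is (nc, -) or (c, ≥ t).
  – The on-input is the latest nc and the side-inputs are nc: the on-input is (nc, t) and every side-input is (nc, ≤ t).
• Build satisfiability models
  – c = controlling value of f
  – nc = non-controlling value of f
  – d = delay of f

Introduction to $\chi^{f,t}(c)$
• $\chi^{f,t}(c)$ = under input pattern c, f is sensitized after or at time t (under input c, signal f settles to its value only at time >= t).
• Used well, it lets us:
  1. Compute the true delay of a circuit
  2. Decide whether a path is a false path
  3. Find all true paths with delay >= D
  4. Handle other timing issues…

1. Compute the true delay of a circuit
• If some input pattern c makes some PO settle only at time >= t, the circuit delay must be >= t. The procedure:
  1. Start with t = maxDelay (the topological delay).
  2. Check whether $\sum_{g \in PO} \chi^{g,t}(c)$ is SAT.
     – SAT: circuit delay >= t, and the delay is already known not to exceed t ⇒ delay = t
     – UNSAT: circuit delay < t
  3. Set t = t-1 and repeat the steps above until SAT.

1. Compute the true delay of a circuit
• How to build the $\chi^{f,t}(c)$ model (written X(f,t) in the following slides)
• Take the gate below as an example
  [Figure: gate f with inputs A and B]
• X(f,t) = 1 if
    A sensitizes f and X(A,t-d) = 1, or
    B sensitizes f and X(B,t-d) = 1
  (If f is determined by A and A settles only at time >= t-d, or f is determined by B and B settles only at time >= t-d, then f settles only at time >= t.)

1. Compute the true delay of a circuit
• Let c = controlling value of f, nc = non-controlling value
• X(f,t) = [f is determined by A]*X(A,t-d) + [f is determined by B]*X(B,t-d)
• f is determined by A
  = A is the earliest-arriving c or the latest-arriving nc
  = (A is c and the other inputs arrive later than or at the same time as A) + (A and the other inputs are all nc)
  = [ A' ( X(A,t-d)+A ) ( X(B,t-d)+B ) + AB ]   (here c = 0 and nc = 1, as for an AND gate)
• So the floating-mode formula of paper [1] is

$$\chi^{f,t} = \sum_{g \in FI(f)} \chi^{g,t-d} \cdot \Big\{ (g = c) \cdot \prod_{h \in FI(f)} \big( \chi^{h,t-d} + (h = nc(f)) \big) + \prod_{h \in FI(f)} (h = nc) \Big\}$$

A closer look at the formula
• X(f,t) = f is sensitized after or at time t
         = there is a fanin g that can sensitize f, and g is sensitized after or at time t-d

$$\chi^{f,t} = \sum_{g \in FI(f)} \chi^{g,t-d} \cdot \Big\{ (g = c) \cdot \prod_{h \in FI(f)} \big( \chi^{h,t-d} + (h = nc(f)) \big) + \prod_{h \in FI(f)} (h = nc) \Big\}$$

  The two terms inside the braces correspond to two cases:
  1. g has the controlling value, and every other fanin is either "sensitized after or at time t-d" or at a non-controlling value (g is the earliest c).
  2. g and the other fanins all have non-controlling values (all fanins are nc and one of them is ready after or at time t-d, so f must be sensitized after or at time t).

Example
We build X(g,3) to check whether the delay is >= the topological delay.
[Figure: example circuit with inputs a, b, internal gates d, e, f, and output g]

Initial conditions:
$$\chi^{a,0}(c) = 1,\quad \chi^{a,1}(c) = 0,\quad \chi^{b,0}(c) = 1,\quad \chi^{b,1}(c) = 0$$

$$\chi^{g,3}(c) = \chi^{d,2}(c)\,[\,d'(\chi^{d,2}(c)+d)(\chi^{f,2}(c)+f) + df\,] + \chi^{f,2}(c)\,[\,f'(\chi^{d,2}(c)+d)(\chi^{f,2}(c)+f) + df\,]$$
(A two-input AND gate model: build 2 INV + 5 OR + 5 AND.)

$$\chi^{d,2}(c) = \chi^{b,1}(c)\,[\,b'(\chi^{b,1}(c)+b)(\chi^{a,1}(c)+a) + ab\,] + \chi^{a,1}(c)\,[\,a'(\chi^{b,1}(c)+b)(\chi^{a,1}(c)+a) + ab\,]$$

$$\chi^{e,1}(c) = \chi^{b,0}(c)\,[\,b'(\chi^{b,0}(c)+b)(\chi^{a,0}(c)+a) + ab\,] + \chi^{a,0}(c)\,[\,a'(\chi^{b,0}(c)+b)(\chi^{a,0}(c)+a) + ab\,]$$

$$\chi^{f,2}(c) = \chi^{b,1}(c)\,[\,b(\chi^{b,1}(c)+b')(\chi^{e,1}(c)+e') + b'e'\,] + \chi^{e,1}(c)\,[\,e(\chi^{b,1}(c)+b')(\chi^{e,1}(c)+e') + b'e'\,]$$

$$\sum_{g \in PO} \chi^{g,3} = 0$$

There is no input vector that can make the delay ≥ 3! Try to construct X(g,2) recursively.

1. Compute the true delay of a circuit
• Experimental data from paper [2]:
  [Figure: chart comparing [1], [2], and "our" on c1355, c1908, c2670, c3540, c5315, c6288, c7552; vertical axis from 0 to 30]
• Why not run large circuits?
  – The X model of a single 2-input AND already has 12 gates.
  – For large circuits the model is hard to build and the SAT problem is hard to solve.

1. Compute the true delay of a circuit
• Some observations that allow simplification:
  1. If t <= 0, X(f,t) = 1 (every signal settles at time >= 0).
  2. If t > 0 and f is a PI, X(f,t) = 0 (a PI is already settled at time 0 < t).
  3. If the fanin number is 1, X(f,t) = X(fanin,t-d).
  4. If B is known to lie on a shorter path, so that going through B cannot make X(f,t) = 1, then we can set X(B, any time) = 0. Important!
  [Figure: gate f with inputs A and B]

1. Compute the true delay of a circuit
• Slack simplification:
  When we compute the circuit delay, we only care whether X(PO,t) can be 1.
  So for every gate and net on a shorter path that cannot make the circuit delay >= t, the X model can be set to constant 0.
  (Even if the PO can be sensitized through the shorter path, the resulting delay is < t, so we may as well not let it sensitize.)
  [Figure: circuit divided into a Slack<=S region and a Slack>S region]
  – A gate/net that can give delay >= t must have slack <= S (S = maxDelay - t).
  – Every X model with slack > S is treated as 0 (the circuit is not allowed to sensitize through that short path).
  – Many X models no longer need to be built.

Flowchart
1. ∆ = largest topological delay, S = 0.
2. Generate models for the gates with slack <= S.
3. Is $\sum_{g \in PO} \chi^{g,\Delta}(c)$ SAT?
   – yes: circuit delay = ∆; the witness (an input pattern) can activate some paths with delay = ∆.
   – no: decrease ∆, S++, and go back to step 2.

Experimental result (slack simplification)
[Figure: Total time (small case); time (s) vs. Gate Num * (slack+1); series my_about, fast_float, float]

Experimental result (slack simplification)
[Figure: Total time (med case); time (s) vs. Gate Num * (slack+1); series my_about, fast_float, float; benchmarks b05, b21_1, b20, b21, b17_1, b22_1, b22, c6288, s38417, s35932]

Experimental result (slack simplification)
[Figure: Total time (large case); time (s) vs. Gate Num * (slack+1); series my_about, fast_float, float; benchmarks b17_0, b18, b19]

2. Decide whether a path is a false path
• This time we only need to know whether a given path is sensitizable.
  1. Compute the topological delay D of the path.
  2. Check whether X(out,D) is SAT (out can arrive at time >= D).
  3. This time we must ensure that out is sensitized by this path, so the X models on the path need special treatment (the X models not on the path are built as usual).
  [Figure: circuit with a path from in to out]

2. Decide whether a path is a false path
• f must be sensitized by A, the input on the path
  [Figure: gate f with inputs A and B]
• X(f,t) = [ A' ( X(A,t-d)+A ) ( X(B,t-d)+B ) + AB ] * X(A,t-d)
         = [ A' X(B,t-d) + A'B + AB ] * X(A,t-d)
         = [ A' X(B,t-d) + B ] * X(A,t-d)
• The slack simplification applies here as well.

2. Decide whether a path is a false path
• Example
  [Figure: example circuit with inputs A, B, C, nets a1, a2, a3, a6, n1, n2, n3, n4, n5, n7, n8, and output out]
  1. Topological delay of the path: D = 9
  2. X = 0 for slack > 0 (slack simplification; all side inputs of the path are forced to nc)
  3. Construct X(out,9)
     X(out,9) = [n8' + n7·X(n8,8)]·X(n7,8) = n8'·c·n4'·n1
     X(n7,8) = [c + n6'·X(c,7)]·X(n6,7) = c·n4'·n1
     X(n6,7) = X(a3,6) = X(n5,5) = [n4' + n3·X(n4,4)]·X(n3,4) = n4'·n1
     X(n3,4) = [n1 + n2'·X(n1,3)]·X(n2,3) = n1
     X(n2,3) = X(a2,2) = X(a1,1) = X(A,0) = 1
  4. X(out,9) is UNSAT
  This is a false path!

2. Decide whether a path is a false path
• Example
  [Figure: the same example circuit]
  1. Topological delay of the path: D = 7
  2. X = 0 for slack > 2
  3. Construct X(out,7)
     X(out,7) = [n7' + n8·X(n7,6)]·X(n8,6)
     X(n8,6) = [n1 + n5'·X(n1,5)]·X(n5,5)
     …
     These terms cannot be simplified; the normal X model has to be built. The complexity is still lower than for "compute circuit delay >= 7", because fewer models are built (only the fanin cone of the path output is needed, and some of those X can still be simplified).
  4. X(out,7) is SAT
  This is a true path!

3. Find the true paths with delay >= D
• 1. Build X(PO,D)
• 2. The slack simplification applies as before
• 3. Do constant propagation
• 4. Solve SAT on the remaining X and record the result
• 5. Recover the paths from the satisfied X

3. Find the true paths with delay >= D
Although every node and edge in the figure has slack <= 1, this does not mean that every path has delay >= maxDelay - slack = 5 - 1 = 4.
[Figure: circuit with nodes A, b, c, d, e, f]
• 1. Build X(PO,D)
  – Suppose D = 4: we want the true paths with delay >= 4
  – X = 0 for slack > S = 5-4 = 1 (none in this example)
  – Build X(f,4):
    X(f,4)
      X(d,3)
        X(b,2)
          X(A,1)
        X(c,2)
          X(b,1)
            X(A,0)
      X(e,3)
        X(d,2)
          X(b,1)
            X(A,0)
          X(c,1)
            X(b,0)

3. Find the true paths with delay >= D
• 1. Build X(PO,D)
• 3. Do constant propagation
    X(f,4)
      X(d,3)
        X(b,2)=0
          X(A,1)=0
        X(c,2)=1
          X(b,1)=1
            X(A,0)=1
      X(e,3)=1
        X(d,2)=1
          X(b,1)=1
            X(A,0)=1
          X(c,1)=1
            X(b,0)=1
• 4. Solve SAT on the remaining X, here X(d,3) and X(f,4), and record the result

3. Find the true paths with delay >= D
• 4. Solve SAT on the remaining X and record the result
• Worked by hand:
    X(d,3) = 0 + 1·[c'(0+b)(1+c) + bc] = b
    X(f,4) = b·[d'(b+d)·1 + de] + 1·[e'(b+d)·1 + de] = b + d
  – b = 1: d has to wait for c
  – d = 1: f has to wait for e
  Either d must be determined through c, or f must be determined through e. This avoids the path Abdf (d determined through b and f determined through d), whose delay is 3!

3. Find the true paths with delay >= D
• 5. Recover the paths from the satisfied X
  [Figure: the X tree from X(f,4) down to X(A,0) and X(b,0), mapped back onto circuit nodes f, e, d, c, b, A; every path is automatically completed down to the PI]
• There are 3 paths (Abcdef, Abdef, Abcdf) with delay >= 4

Drawback
• Drawback of the fast float model
  – Float mode sensitization needs to recursively build X from PO to PI by exhausting all paths in the critical region.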
  – May build many X for a gate f: X(f,t1), X(f,t2), …
  – Exponential building complexity
[Figure: example circuit with inputs in1, in2, nets n1, n2, n3, n4, and output out]
If we want to check delay >= 3, build X(out,3):
    X(out,3)
    X(n3,2), X(n4,2)
    X(n2,1), X(n1,1)
    X(n1,0)=1, X(in1,0)=1, X(in2,0)=1

TVS model
• Quickly estimates an upper bound on the delay
  – Only one model is built per gate/net
  – Linear building complexity
• Three Value Simulation Model
  – Encoding: 0 → 00, 1 → 11, Unknown → 01
  – AND:  c0 = a0b0       c1 = a1b1
  – NAND: c0 = a1'+b1'    c1 = a0'+b0'
  – OR:   c0 = a0+b0      c1 = a1+b1
  – NOR:  c0 = a1'b1'     c1 = a0'b0'
  – BUF:  c0 = a0         c1 = a1
  – NOT:  c0 = a1'        c1 = a0'
  Truth table for c = AND(a, b):
    a0a1  b0b1  c0c1
    00    --    00
    --    00    00
    01    01    01
    01    11    01
    11    01    01
    11    11    11

TVS model
• Check whether circuit delay < maxDelay - S
• Copy the critical region (slack <= S) into the TVS model
[Figure: the critical region of the circuit copied into a separate TVS model]

TVS model
• Apply the Unknown value (01) to the PIs of the TVS model
• Apply normal signal values to the normal region
[Figure: TVS model with Unknown on its PIs; its non-critical fanin nets take 0/1 values from the normal circuit]

TVS model
• Build miter = the TVS model can propagate U to a PO
  – At least one PO of the TVS model == U
• If the miter is UNSAT, circuit delay < maxDelay - S
[Figure: PIs drive the TVS model (with 01 inputs) and the normal circuit; the TVS outputs feed the miter]

TVS model: Property 1
• Pick the gates with slack <= S into the TVS region ⇒ all paths with delay >= maxDelay-S are in the TVS region
Proof:
  – Suppose a path p has delay d >= maxDelay-S but is not in the TVS region
  – Then p has a gate g that is not in the critical region (slack > S)
  – So the delay of p <= maxDelay - slack(g) < maxDelay-S
  – Contradiction!
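The gate encodings listed on the Three Value Simulation Model slide can be tried out directly. The following is a minimal Python sketch, not the author's implementation: the names `ZERO`, `ONE`, `UNKNOWN`, and the `tvs_*` functions are hypothetical, and each signal is assumed to be a pair (x0, x1) restricted to the three valid codes 00, 01, 11.

```python
from itertools import product

# Dual-rail encoding from the slides: 0 -> 00, 1 -> 11, Unknown -> 01.
ZERO, ONE, UNKNOWN = (0, 0), (1, 1), (0, 1)
NAMES = {ZERO: "0", ONE: "1", UNKNOWN: "U"}

def tvs_and(a, b):
    # AND: c0 = a0*b0, c1 = a1*b1
    return (a[0] & b[0], a[1] & b[1])

def tvs_or(a, b):
    # OR: c0 = a0+b0, c1 = a1+b1
    return (a[0] | b[0], a[1] | b[1])

def tvs_not(a):
    # NOT: c0 = a1', c1 = a0'
    return (1 - a[1], 1 - a[0])

def tvs_nand(a, b):
    # NAND: c0 = a1'+b1', c1 = a0'+b0'
    return ((1 - a[1]) | (1 - b[1]), (1 - a[0]) | (1 - b[0]))

def tvs_nor(a, b):
    # NOR: c0 = a1'*b1', c1 = a0'*b0'
    return ((1 - a[1]) & (1 - b[1]), (1 - a[0]) & (1 - b[0]))

def tvs_buf(a):
    # BUF: c0 = a0, c1 = a1
    return a

if __name__ == "__main__":
    # Print the three-valued truth table of AND.
    for a, b in product((ZERO, UNKNOWN, ONE), repeat=2):
        print(NAMES[a], NAMES[b], "->", NAMES[tvs_and(a, b)])
```

In this sketch a controlling side input blocks the Unknown (`tvs_and(UNKNOWN, ZERO)` gives `ZERO`), while a non-controlling one lets it through (`tvs_and(UNKNOWN, ONE)` gives `UNKNOWN`), which is the behavior the miter on the TVS slides checks for at the POs.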
TVS model: Property 2
• In the TVS region, if path p cannot propagate unknown to a PO ⇒ p is a false path
Proof: (a ⇒ b is shown by proving that a·b' is impossible)
  – Under an input pattern a, path p cannot propagate unknown to a PO; this means there exists a side input i2 (of a gate f on p) that has the controlling value of f
  – If p is a true path, then the on-input i1 must have the controlling value and be ready no later than i2
  – Arrival(i2) >= Arrival(i1), Required(i2) = Required(i1)
  – Slack(i2) <= Slack(i1)
  – Because i1 is in the TVS region ⇒ i2 is also in the TVS region
  – i2's path p' cannot propagate unknown to a PO… (replace p by p', repeat)

TVS model: Property 2
• In the TVS region, if path p cannot propagate unknown to a PO ⇒ p is a false path
[Figure: gate f with on-input i1 and side input i2 = c(f), repeated along the chain of side inputs; annotations: "earlier than i2", U / Not U, D0 > 0, D1 >= D0-1, D2 >= D1-1]
  – Terminal case: Dn = 0 ⇒ there is a PI in the TVS region with value = c != U. Contradiction.
  – Non-terminal case: Dn != 0 forever ⇒ infinitely long paths, which is impossible.

TVS model: Property 2
• In the TVS region, if path p cannot propagate unknown to a PO ⇒ p is a false path
  – We have proved: under an input pattern a, if a path p cannot propagate U to a PO, then p is not a true path under a
  – Hence, if a path p cannot propagate U to a PO under any input pattern, p is a false path.

TVS model: Theorem 1
• In the TVS region, if no path can propagate unknown to a PO ⇒ circuit delay < maxDelay - S
  – All paths in the TVS region are false paths (Property 2)
  – All paths with delay >= maxDelay-S are in the TVS region (Property 1)
  – ⇒ All paths with delay >= maxDelay-S are false paths

TVS model: Property 3
• When the TVS region can propagate U to a PO ⇏ circuit delay >= maxDelay-S
  – All gates with slack <= S are in the TVS region
  – But the delay is not necessarily >= maxDelay-S
  – For example, with maxDelay = 5 and S = 1 the region can contain a path with delay = 3!
[Figure: example circuit whose gates are labeled with slack S=0 or S=1]

Theorem 2
• When the TVS region can propagate U to a PO ⇏ circuit delay >= maxDelay-S
  – All gates with slack <= S are in the TVS region
[Figure: example circuit whose gates are labeled with slack S=0 or S=1]

Experimental result

circuit  delay     gateNum  S   g*S      float_pre  float_SAT  float  my_pre  my_SAT  TVS      combi_pre  combi_SAT  fast_float
c17      3\3       13       0   13       0          0          0      0       0       0        0          0          0
b02      5\5       32       0   32       0          0          0      0       0       0        0          0          0
b01      6\5       53       1   106      0          0          0      0       0       0        0          0          0
b06      5\5       64       0   64       0          0          0      0       0       0        0          0          0
b03      10\10     190      0   190      0          0          0      0       0       0        0          0          0
b09      9\9       198      0   198      0          0          0      0       0       0        0          0          0
b08      16\16     204      0   204      0          0          0      0       0       0        0          0          0
b10      12\12     223      0   223      0          0.01       0.01   0       0       0        0          0          0
b13      20\14     415      6   2905     0.01       0          0.01   0       0       0        0          0          0
c880     24\24     469      0   469      0          0          0      0       0       0        0          0          0
b07      31\31     490      0   490      0          0.01       0.01   0       0       0        0          0          0
c1355    24\24     619      0   619      0.01       0.01       0.02   0       0       0        0.01       0.01       0.02
s1494    17\17     686      0   686      0          0.01       0.01   0       0       0        0          0          0
s1488    17\17     692      0   692      0          0          0      0       0       0        0          0          0
b11      34\32     801      2   2403     0.04       0.02       0.06   0       0       0        0          0          0
b04      28\27     803      1   1606     0.02       0.02       0.04   0       0       0        0          0          0
s1423    59\59     827      0   827      0.01       0.01       0.02   0       0       0        0          0          0
c1908    40\37     938      3   3752     0.12       0.08       0.2    0       0       0        0.01       0          0.01
b05      54\42     1022     12  13286    0.72       0.49       1.21   0.01    0       0.01(6)  0.27       0.2        0.47
b12      19\19     1195     0   1195     0.01       0          0.01   0       0       0        0          0          0
c2670    32\30     1566     2   4698     0.12       0.08       0.2    0       0       0(1)     0.01       0.01       0.02
c3540    47\46     1741     1   3482     0.22       0.19       0.41   0.01    0       0.01     0.01       0          0.01
c5315    49\47     2608     2   7824     0.25       0.19       0.44   0.01    0       0.01(1)  0.03       0.01       0.04
c6288    124\123   2480     1   4960     1.66       2.1        3.76   0.01    0.45    0.46     0.07       0.61       0.68
c5378    25\25     3221     0   3221     0.04       0.03       0.07   0.01    0       0.01     0.01       0          0.01
c7552    43\42     3827     1   7654     0.25       0.2        0.45   0.02    0       0.02     0.02       0.01       0.03
s9234    58\58     6094     0   6094     0.09       0.07       0.16   0.02    0       0.02     0.03       0.01       0.04
b14_1    51\51     7145     0   7145     0.31       0.31       0.62   0.04    0       0.04     0.04       0          0.04
s13207   59\59     9441     0   9441     0.16       0.14       0.3    0.04    0       0.04     0.04       0          0.04
b14      60\60     10343    0   10343    0.46       0.46       0.92   0.06    0.01    0.07     0.06       0          0.06
s15850   82\81     11067    1   22134    0.44       0.33       0.77   0.07    0.01    0.08     0.09       0.02       0.11
b15_1    51\51     13547    0   13547    0.39       0.33       0.72   0.09    0.01    0.1      0.09       0          0.09
b21_1    65\65     14932    0   14932    0.84       0.8        1.64   0.1     0.01    0.11     0.11       0          0.11
s35932   29\26     19876    3   79504    0.85       3.85       4.7    0.27    0.91    1.18     0.54       2.77       3.31
b20      67\67     20716    0   20716    1.15       1.11       2.26   0.17    0.01    0.18     0.16       0.01       0.17
b21      68\68     21061    0   21061    1.18       1.12       2.3    0.17    0       0.17     0.16       0          0.16
s38417   47\40     25585    7   204680   2.44       1.48       3.92   0.48    0.03    0.51     0.49       0.05       0.54
s38584   56\56     22447    0   22447    0.34       0.24       0.58   0.15    0.01    0.16     0.15       0.01       0.16
b22_1    67\67     22507    0   22507    1.31       1.26       2.57   0.18    0.01    0.19     0.18       0.01       0.19
b22      68\68     30686    0   30686    1.82       1.74       3.56   0.25    0.01    0.26     0.26       0          0.26
b17_0    92\83     33741    9   337410   17.18      10.96      28.14  0.8     0.05    0.85(2)  1.42       0.48       1.9
b17_1    51\51     41080    0   41080    1.31       1.03       2.34   0.35    0.01    0.36     0.36       0.01       0.37
b18      164\159   117941   5   707646   52.5       45.18      97.68  1.44    1.97    3.41(4)  2.96       5.87       8.83
b19      168\158   237959   10  2617549  0          0          -      3.04    0.15    3.19(9)  10.64      54.02      64.66

Experimental result
[Figure: Pre time (small case) and SAT time (small case); time (s) vs. Gate Num * (slack+1); series my_about, fast_float, float]

Experimental result
[Figure: Pre time (med case) and SAT time (med case); time (s) vs. Gate Num * (slack+1); series my_about, fast_float, float]

Experimental result
[Figure: Pre time (large case) and SAT time (large case); time (s) vs. Gate Num * (slack+1); series my_about, fast_float, float; benchmarks b17_0, b18, b19]

Future Work
• Other timing issue problems
• Other delay models
• …

Reference
[1] Satisfiability Models and Algorithms for Circuit Delay Computation
[2] Efficient Boolean Characteristic Function for Timed Automatic Test Pattern Generation

Thank you