Digital Integrated Circuits Lecture 5: Logical Effort Chih-Wei Liu VLSI Signal Processing LAB National Chiao Tung University cwliu@twins.ee.nctu.edu.tw DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 1 Outline RC Delay Estimation Logical Effort Delay in a Logic Gate Multistage Logic Networks Choosing the Best Number of Stages Example Summary DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 2 RC Delay Model Use equivalent circuits for MOS transistors Ideal switch + capacitance and ON resistance Unit nMOS has resistance R, capacitance C Unit pMOS has resistance 2R (mobility較小), capacitance C Capacitance proportional to width Resistance inversely proportional to width DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 3 RC Values Capacitance 假設nMOS/pMOS寄生電容與閘極電容相同 C = Cg = Cs = Cd = 2 fF/μm of gate width Values similar across many processes Resistance R ≈ 6 KΩ*μm in 0.6um process Improves with shorter channel lengths Unit transistors May refer to minimum contacted device (4/2 λ) Or maybe 1 μm wide device Doesn’t matter as long as you are consistent DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 4 Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter by using the basic RC delay model of nMOS and pMOS RC model A 2 Y 2 1 1 s kC 2R/k kC g kC d DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 5 Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter 2C R A 2 Y 2 1 1 2C 2C Y R C C 假設A電位提昇 C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 6 Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter 2C R A 2 Y 2 1 1 2C 2C 2C Y R A=1 2C C R C C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw C C output 7 Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter 2C R A 2 Y 2 1 1 2C 2C 2C 2C Y R C R C C C C tpd = 6RC 若沒有寄生電容的ideal Inverter則為tpd = 3RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 8 Example: 3-input NAND Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R). DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 9 Example: 3-input NAND Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R). Step 1 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 10 3-input NAND Caps We still assume C=Cg=Cdiff Annotate the 3-input NAND gate with gate and diffusion capacitance. 2 2 2 3 3 3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 11 3-input NAND Caps We assume C=Cg=Cdiff Annotate the 3-input NAND gate with gate and diffusion capacitance. 2C 2 2C 2C Step 2 2C 2 2C 2C 2 2C 3C 3C 3C 2C 2C 3 3 3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 3C 3C 3C 3C 12 3-input NAND Caps Annotate the 3-input NAND gate with gate and diffusion capacitance. 2 Final result 2 2 3 5C 5C 5C 3 3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 9C 3C 3C 13 Elmore Delay Model ON transistors look like resistors Pullup or pulldown network modeled as RC ladder Elmore delay of RC ladder t pd ≈ ∑ Ri −to− sourceCi nodes i = R1C1 + ( R1 + R2 ) C2 + ... + ( R1 + R2 + ... + RN ) C N R1 R2 R3 RN RC ladder C1 C2 C3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw CN 14 Example: 2-input NAND Estimate rising and falling propagation delays of a 2input NAND driving h identical gates. 2 2 A 2 B 2x Y h copies 2-input NAND DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 15 Example: 2-input NAND Estimate rising and falling propagation delays of a 2input NAND driving h identical gates. 2 2 A 2 B 2x 6C Y 4hC h copies 2C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 16 Example: 2-input NAND Estimate worst-case rising and falling propagation delays of a 2-input NAND driving h identical gates. 2 2 A 2 B 2x 6C Y 4hC h copies 2C Worst case rising: 只有一個pMOS開啟 R Y (6+4h)C t pdr = ( 6 + 4h ) RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 17 Example: 2-input NAND Estimate worst-case rising and falling propagation delays of a 2input NAND driving h identical gates. 2 2 A 2 B 2x Y 4hC 6C h copies 2C Worst case falling:A已經為”1”;B的電位開始上升 x R/2 R/2 2C Y (6+4h)C t pdf R R R = ( )2C + ( + )[(6 + 4h)C ] 2 2 2 = (7 + 4h) RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 18 Delay Components Delay has two parts Parasitic delay 6 or 7 RC Independent of load Effort delay 4h RC Proportional to load capacitance DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 19 Contamination Delay Best-case (contamination) delay can be substantially less than worst-case propagation delay. Ex: If both inputs fall simultaneously 2 2 A 2 B 2x R R Y (6+4h)C 6C Y 4hC 2C tcdr R = ( )[(6 + 4h)C ] 2 = (3 + 2h) RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 20 Contamination Delay Best-case (contamination) falling delay can be substantially less than worst-case propagation delay. Ex: B已經為”1”;A的電位開始上升 2 2 A 2 B 2x 6C Y 4hC 2C tcdf R R = ( + )[(6 + 4h)C ] 2 2 = (6 + 4h) RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 21 Delay Components Again, Delay has two parts Parasitic delay 3 or 6 RC Independent of load Effort delay 4h or 2h RC Proportional to load capacitance DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 22 Diffusion Capacitance 2 3 5C 5C 2 3 5C 2 3 9C 3C 3C We assumed contacted diffusion on every s / d. Recall 3-input NAND Good layout minimizes diffusion area Ex: NAND3 layout shares one diffusion contact Reduces output capacitance by 2C Merged un-contacted diffusion might help too 2C Shared Contacted Diffusion 2C Gate capacitance: 由電晶體的寬度決定 Diffusion capacitance: 與佈局有關 Isolated Contacted Diffusion Merged Uncontacted Diffusion 2 2 2 3 3 3C 3C 3C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 3 7C 3C 3C 23 Layout Comparison Which layout is better? VDD A VDD B Y GND A B Y GND DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 24 Introduction Chip designers face a bewildering array of choices What is the best circuit topology for a function? How many stages of logic give least delay? How wide should the transistors be? ??? Logical effort is a method to make these decisions Uses a simple model of delay Allows back-of-the-envelope calculations Helps make rapid comparisons between alternatives Emphasizes remarkable symmetries DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 25 Example Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file. A[3:0] A[3:0] 32 bits 16 word register file Each word is 32 bits wide Each bit presents load of 3 unit-sized transistors True and complementary address inputs A[3:0] Each input may drive 10 unit-sized transistors 16 Register File 16 words 4:16 Decoder Decoder specifications: Ben needs to decide: How many stages to use? How large should each gate be? How fast can decoder operate? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 26 Delay in a Logic Gate Express delays in process-independent unit d= d abs τ τ = 3RC 沒有寄生電容的理想反向器 ≈ 12 ps in 180 nm process Normalized delay representation DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 40 ps in 0.6 μm process 27 Delay in a Logic Gate Express delays in process-independent unit d= d abs τ 無關於製程型態的方式來表示Delay,可使電路能 依照佈局來做Delay的比較,而不必考慮不同製程 對電路速度的影響 Delay has two components Linear delay model d= f +p DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 28 Delay in a Logic Gate Express delays in process-independent unit d= d abs τ Delay has two components d= f +p Effort delay: f = gh (a.k.a. stage effort) Again has two components DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 29 Delay in a Logic Gate Express delays in process-independent unit d= d abs τ Delay has two components d= f +p Effort delay f = gh (a.k.a. stage effort) Again has two components g: logical effort Gate Complexity ↑, g ↑ i.e. 會耗費較多的時間去驅 動相同fanout數的電晶體 Measures relative ability of gate to deliver current g ≡ 1 for inverter DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 30 Delay in a Logic Gate Express delays in process-independent unit d= d abs τ Delay has two components d= f +p Effort delay f = gh (a.k.a. stage effort) Again has two components h: electrical effort = Cout / Cin Fan-out ↑, h ↑ Ratio of output to input capacitance Sometimes called fanout DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 31 Delay in a Logic Gate Express delays in process-independent unit d= d abs τ Delay has two components d= f +p Parasitic delay p Represents delay of gate driving no load Set by internal parasitic capacitance DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 32 Delay in NAND2 Effort delay: 4hC (與外部電容 有關) Delay in 2-input NAND d= f+p 2 2 A 2 B 2x 6C Y 4hC 2C Normalized delay representation 4hC = +2 3C 4 = ( ) h + 2 = gh + p 3 註:Ideal Inverter (無寄生電容、 Single Fanout) τ = 3RC;g = 1 How to calculate the logical effort g ? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 33 Delay Plots =f+p = gh + p 2-input NAND 6 Normalized Delay: d d Inverter g= p= d= 5 g= p= d= 4 3 2 1 0 0 1 2 3 4 5 Electrical Effort: h = Cout / Cin DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 34 Delay Plots =f+p = gh + p What about NOR2? 2-input NAND 6 Normalized Delay: d d Inverter 5 g=1 p=1 d=h+1 4 3 g = 4/3 p=2 d = (4/3)h + 2 Effort Delay: f 2 1 Parasitic Delay: p 0 0 1 2 3 4 5 Electrical Effort: h = Cout / Cin DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 35 Computing Logical Effort DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current. Measure from delay vs. fanout plots Or estimate by counting transistor widths 2 Y 2 A 2 Y 1 Cin = 3 g = 3/3 A 2 B 2 Cin = 4 g = 4/3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw A 4 B 4 Y 1 1 Cin = 5 g = 5/3 36 Catalog of Gates Logical effort of common gates Gate type Number of inputs 1 2 3 4 n NAND 4/3 5/3 6/3 (n+2)/3 NOR 5/3 7/3 9/3 (2n+1)/3 2 2 2 2 4, 4 6, 12, 6 8, 16, 16, 8 Inverter Tristate / mux XOR, XNOR 1 2 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 37 Recall: 4:1 Multiplexer 4:1 mux chooses one of 4 inputs using two selects S1S0 S1S0 S1S0 S1S0 D0 Logical Effort =2;與輸入的 數目無關 D1 D2 大型多工器的速度與小型多工器 的速度相當? D3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 38 Computing Parasitic Delay DEF: Represents delay of gate driving no load Set by internal parasitic capacitance Measure from delay vs. fanout plots Or estimate by counting total diffusion capacitances at output 2 Y 2 A 2 Y 1 C=3 p = 3/3 = 1 A 2 B 2 C=6 p = 6/3 = 2 A 4 B 4 Y 1 1 C=6 p = 6/3 =2 增加電晶體大小會降低電阻,但卻會增加寄生電容,但較 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 寬的電晶體可以摺疊,使內部的寄生電容可能變小 39 Catalog of Gates Parasitic delay of common gates In multiples of pinv (≈1) Gate type Number of inputs 1 2 3 4 n NAND 2 3 4 n NOR 2 3 4 n 4 6 8 2n 4 6 8 Inverter Tristate / mux XOR, XNOR 1 2 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 40 Example: FO4 Inverter Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort: Electrical Effort: Parasitic Delay: Stage Delay: g= h= p= d= DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 41 Example1: FO4 Inverter Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort: Electrical Effort: Parasitic Delay: Stage Delay: g=1 h=4 p=1 d = gh + p = 5 The FO4 delay is about 200 ps in 0.6 μm process 60 ps in a 180 nm process f/3 ns in an f μm process 經驗法則 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 42 Example2: Ring Oscillator Estimate the frequency of an N-stage ring oscillator Odd-number stages Logical Effort: Electrical Effort: Parasitic Delay: Stage Delay: Frequency: g=1 31 stage ring oscillator in h=1 0.6 μm process has p=1 frequency of ~ 200 MHz d=2 fosc = 1/(2*N*d) = 1/4N 如果g ≠ 1,那如何計算多級邏輯網路的延遲呢? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 43 Multistage Logic Networks Logical effort generalizes to multistage networks Path Logical Effort G= gi 與電晶體大小無關 ∏ Cout-path Path Electrical Effort H= Path Effort F = ∏ f i = ∏ gi hi 10 x g1 = 1 h1 = x/10 g2 = 5/3 h2 = y/x Cin-path y g3 = 4/3 h3 = z/y 與電晶體大小有關 z g4 = 1 h4 = 20/z 20 F=GH=(5/3)(4/3)(x/10)(y/x)(z/y)(20/z)=G(20/10) DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 44 Multistage Logic Networks Logical effort generalizes to multistage networks G= gi Path Logical Effort ∏ Cout-path Path Electrical Effort H= Path Effort F = ∏ f i = ∏ gi hi Can we always write F = GH? Cin-path DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 45 Branch Paths No! Consider paths that branch: 15 G H GH h1 h2 F = = = = = = GH? 90 5 15 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 90 46 Paths that Branch No! Consider paths that branch: 15 G H GH h1 h2 F =1 = 90 / 5 = 18 = 18 = (15 +15) / 5 = 6 = 90 / 15 = 6 = g1g2h1h2 = 36 = 2GH 90 5 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 15 90 47 Branching Effort Branching effort Accounts for branching between stages in path b= Con path + Coff path Con path B = ∏ bi Note: Path branching effort ∏h i = BH Now we compute the path effort F = GBH DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 48 Multistage Delays Path Effort Delay DF = ∑ f i Path Parasitic Delay P = ∑ pi Path Delay D = ∑ d i = DF + P Recall that F = GBH 各級path effort的乘積為F;path delay則為各級path effort delay的總和 Î最小值發生在各級的path effort delay相同,亦即 fi = F1/N DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 49 Designing Fast Circuits D = ∑ d i = DF + P Delay is smallest when each stage bears same effort fˆ = gi hi = F 1 N Thus minimum delay of N stage path is 1 N D = NF + P This is a key result of logical effort Find fastest possible delay 這種方式優於模擬 Doesn’t require calculating gate sizes DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 50 Choose Gate Sizes to Meet the Least Delay Design How wide should the gates be for least delay? ˆf = gh = g Cout Cin gi Couti ⇒ Cini = fˆ Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives. Check work by verifying input cap spec is met. DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 51 Example: 3-stage path Select gate sizes x and y for least delay from A to B x x A 8 x DIC-Lec5 cwliu@twins.ee.nctu.edu.tw y 45 y B 45 52 Example: 3-stage path x x A 8 x Logical Effort Electrical Effort Branching Effort Path Effort Best Stage Effort Parasitic Delay Delay y 45 y B 45 G= H= B= F= fˆ = P= D= DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 53 Example: 3-stage path x x A 8 x Logical Effort Electrical Effort Branching Effort Path Effort Best Stage Effort Parasitic Delay Delay y 45 y B 45 G = (4/3)*(5/3)*(5/3) = 100/27 H = 45/8 B=3*2=6 F = GBH = 125 fˆ = 3 F = 5 P=2+3+2=7 D = 3*5 + 7 = 22 = 4.4 FO4 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 54 Example: 3-stage path Work backward for sizes y= x= x x A 8 x y 45 y DIC-Lec5 cwliu@twins.ee.nctu.edu.tw B 45 55 Example: 3-stage path fˆ = gh = g CCoutin gi Couti ⇒ Cini = fˆ Work backward for sizes y = 45 * (5/3) / 5 = 15 x = (15+15) * (5/3) / 5 = 10 45 A P: 4 N: 4 P: 4 N: 6 P: 12 N: 3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw B 45 56 Example: 3-stage path Check the result: 1. the size of the NAND gate: (10+10+10)*(4/3) / 5 = 8 2. NAND gate delay: d1=g1h1+p1= (4/3)*(10+10+10) / 8 + 2 = 7 3. NAND3 gate delay: d2=g2h2+p2= (5/3)*(15+15) / 10 + 3 = 8 4. NOR gate delay: d3=g3h3+p3= (5/3)*(45/15) + 2 = 7 Î7+8+7 = 22 45 A P: 4 N: 4 P: 4 N: 6 P: 12 N: 3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw B 45 57 Best Number of Stages How many stages should a path use? Minimizing number of stages is not always fastest Example: drive 64-bit datapath with unit inverter 1 InitialDriver D 1 1 1 = DatapathLoad N: f: D: 64 1 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 64 2 64 3 64 4 58 Best Number of Stages Example: drive 64-bit datapath with unit inverter InitialDriver D 1 = NF1/N + P = N(64)1/N + N 1 1 1 8 4 2.8 16 8 23 DatapathLoad N: f: D: 64 1 64 65 64 2 8 18 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 64 3 4 15 64 4 2.8 15.3 Fastest 59 Derivation for Best Number of Stages Consider adding inverters to end of path How many give least delay? n1 D = NF + ∑ pi + ( N − n1 ) pinv 1 N Logic Block: n1Stages Path Effort F N - n1 ExtraInverters i =1 1 1 1 ∂D N N N = − F ln F + F + pinv = 0 ∂N Define best stage effort ρ=F 1 N pinv + ρ (1 − ln ρ ) = 0 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 60 Best Number of Stages pinv + ρ (1 − ln ρ ) = 0 has no closed-form solution Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e) For pinv = 1, solve numerically for ρ = 3.59 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 61 Sensitivity Analysis How sensitive is delay to using exactly the best number 1.6 1.51 of stages? D(N) /D(N) 1.4 1.26 1.2 1.15 If pinv = 1 1.0 Nˆ = log ρ F (ρ =2.4) (ρ=6) 0.0 0.5 0.7 1.0 1.4 2.0 N/ N 2.4 < ρ < 6 gives delay within 15% of optimal We can be sloppy! I like ρ = 4 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 62 Example, Revisited Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file. A[3:0] A[3:0] 32 bits 16 word register file Each word is 32 bits wide Each bit presents load of 3 unit-sized transistors True and complementary address inputs A[3:0] Each input may drive 10 unit-sized transistors 16 Register File 16 words 4:16 Decoder Decoder specifications: Ben needs to decide: How many stages to use? How large should each gate be? How fast can decoder operate? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 63 Possible 4:16 Decoder Each word is 32 bits wide Each bit presents load of 3 unit-sized transistors Each input may drive 10 unit-sized transistors A[3] A[3] 10 10 A[2] A[2] 10 10 A[1] A[1] 10 10 A[0] A[0] 10 10 y z word[0] 96 units of wordline capacitance y z DIC-Lec5 cwliu@twins.ee.nctu.edu.tw word[15] 64 Number of Stages Decoder effort is mainly electrical and branching Electrical Effort: H = (32*3) / 10 = 9.6 Branching Effort: B=8 If we neglect logical effort (assume G = 1) Path Effort: F = GBH = 76.8 Number of Stages: N = log4F = 3.1 Try a 3-stage design DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 65 Gate Sizes & Delay Logical Effort: G = 1 * 6/3 * 1 = 2 Path Effort: F = GBH = 154 落在2.4與6之間 fˆ = F 1/ 3 = 5.36 Stage Effort: Path Delay: D = 3 fˆ + 1 + 4 + 1 = 22.1 Gate sizes: z = 96*1/5.36 = 18 y = 18*2/5.36 = 6.7 A[3] A[3] 10 10 A[2] A[2] 10 10 A[1] A[1] 10 10 A[0] A[0] 10 10 y z word[0] 96 units of wordline capacitance y z word[15] DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 66 Comparison Compare many alternatives with a spreadsheet Design N G P D NAND4-INV 2 2 5 29.8 NAND2-NOR2 2 20/9 4 30.1 INV-NAND4-INV 3 2 6 22.1 NAND4-INV-INV-INV 4 2 7 21.1 NAND2-NOR2-INV-INV 4 20/9 6 20.5 NAND2-INV-NAND2-INV 4 16/9 6 19.7 INV-NAND2-INV-NAND2-INV 5 16/9 7 20.4 NAND2-INV-NAND2-INV-INV-INV 6 16/9 8 21.6 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 67 Review of Definitions Term Stage Path number of stages 1 N logical effort g G = ∏ gi electrical effort h= Cout Cin H= branching effort b= Con-path + Coff-path Con-path B = ∏ bi effort f = gh F = GBH effort delay f DF = ∑ f i parasitic delay p P = ∑ pi delay d= f +p DIC-Lec5 cwliu@twins.ee.nctu.edu.tw Cout-path Cin-path D = ∑ d i = DF + P 68 Method of Logical Effort 1) 2) 3) 4) 5) 6) Compute path effort Estimate best number of stages Sketch path with N stages Estimate least delay Determine best stage effort F = GBH N = log 4 F 1 N D = NF + P ˆf = F N1 gi Couti Cini = fˆ Find gate sizes DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 69 Limits of Logical Effort Chicken and egg problem Need path to compute G But don’t know number of stages without G Simplistic delay model Interconnect Neglects input rise time effects Iteration required in designs with wire Maximum speed only Not minimum area/power for constrained delay DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 70 Summary Logical effort is useful for thinking of delay in circuits Numeric logical effort characterizes gates NANDs are faster than NORs in CMOS Paths are fastest when effort delays are ~4 Path delay is weakly sensitive to stages, sizes But using fewer stages doesn’t mean faster paths Delay of path is about log4F FO4 inverter delays Inverters and NAND2 best for driving large caps Provides language for discussing fast circuits But requires practice to master DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 71