CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 10: The Inverter, A Dynamic View [Adapted from Rabaey’s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp12 CMPEN 411 L10 S.1 Heads up This lecture Inverter dynamic view - Reading assignment – Rabaey, et al, 5.4.1-5.4.3 Next lecture Designing fast logic - Reading assignment – Rabaey, et al, 6.2.1, 9.2.2, 9.2.3 Sp12 CMPEN 411 L10 S.2 Inverter Propagation Delay Propagation delay is proportional to the time-constant of the network formed by the pull-down (or pull-up) resistor and the load capacitance VDD tpHL = f(Rn, CL) Vout = 0 Rn Vin = V DD CL tpHL = ln(2) Reqn CL = 0.69 Reqn CL tpLH = ln(2) Reqp CL = 0.69 Reqp CL tp = (tpHL + tpLH)/2 = 0.69 CL(Reqn + Reqp)/2 To equalize rise and fall times make the on-resistance of the NMOS and PMOS approximately equal. Sp12 CMPEN 411 L10 S.3 Inverter Transient Response VDD=2.5V 0.25m W/Ln = 1.5 W/Lp = 4.5 Reqn= 13 k ( 1.5) Reqp= 31 k ( 4.5) 3 Vin 2.5 2 1.5 1 0.5 0 -0.5 0 0.5 1 1.5 t (sec) Sp12 CMPEN 411 L10 S.4 2 2.5 x 10-10 Inverter Transient Response VDD=2.5V 0.25m W/Ln = 1.5 W/Lp = 4.5 Reqn= 13 k ( 1.5) Reqp= 31 k ( 4.5) 3 Vin 2.5 2 1.5 tf tpHL 1 tpLH tr tpHL = 36 psec tpLH = 29 psec 0.5 so 0 tp = 32.5 psec -0.5 0 0.5 1 1.5 t (sec) Sp12 CMPEN 411 L10 S.5 2 2.5 x 10-10 Design for Performance Increase W/L ratio of the transistor Reduce CL the most powerful and effective performance optimization tool in the hands of the designer watch out for self-loading! – when the intrinsic capacitance dominates the extrinsic capacitance keep drain diffusions small limit interconnect capacitance limit fan-out Increase VDD trade-off energy for performance increasing VDD above a certain level yields minimal improvements reliability concerns enforce a firm upper bound on VDD Sp12 CMPEN 411 L10 S.7 5.5 5 4.5 4 3.5 3 2.5 2 1.5 1 0.8 1 1.2 1.4 1.6 1.8 VDD (V) 2 2.2 2.4 Impacts of NMOS/PMOS Ratio So far have sized the PMOS and NMOS so that the Req’s match (ratio ~ 3) symmetrical VTC equal high-to-low and low-to-high propagation delays If speed is the only concern, reduce the width of the PMOS device! widening the PMOS degrades the tpHL due to larger intrinsic capacitance = (W/Lp)/(W/Ln) r = Reqp/Reqn (resistance ratio of identically-sized PMOS and NMOS) opt = r when wiring capacitance is negligible Sp12 CMPEN 411 L10 S.8 PMOS/NMOS Ratio Effects 5 x 10 -11 = (W/Lp)/(W/Ln) tpLH 4.5 tpHL of 2.4 (= 31 k/13 k) gives symmetrical response tp 4 of 1.6 to 1.9 gives optimal performance 3.5 3 1 2 3 = (W/Lp)/(W/Ln) Sp12 CMPEN 411 L10 S.9 4 5 Device Sizing for Performance Divide capacitive load, CL, into Cint : intrinsic - diffusion and Miller effect (Cg) Cext : extrinsic - wiring and fanout tp = 0.69 Req Cint (1 + Cext/Cint) = tp0 (1 + Cext/Cint) where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the gate Sp12 CMPEN 411 L10 S.10 Device Sizing for Performance Divide capacitive load, CL, into Cint : intrinsic - diffusion and Miller effect (Cg) Cext : extrinsic - wiring and fanout tp = 0.69 Req Cint (1 + Cext/Cint) = tp0 (1 + Cext/Cint) where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the gate Widening both PMOS and NMOS by a factor S Req1 = ________, Cint1= _________ tp = __________________________________ tp0 is independent of the sizing of the gate; with no load the drive of the gate is totally offset by the increased capacitance any S sufficiently larger than (Cext/Cint) yields the best performance gains with least area impact Sp12 CMPEN 411 L10 S.11 Sizing Impacts on Delay The majority of the improvement is already obtained for S = 5. Sizing factors larger than 10 barely yield any extra gain (and cost significantly more area). x 10-11 3.8 for a fixed load 3.6 3.4 3.2 3 2.8 2.6 2.4 2.2 2 1 3 5 7 9 S Sp12 CMPEN 411 L10 S.12 11 13 15 self-loading effect (intrinsic capacitance dominates) Impact of Fanout on Delay tp = tp0 (1 + Cext/Cint ) Extrinsic capacitance, Cext, is a function of the fanout of the gate - the larger the fanout, the larger the external load. First determine the input loading effect of the inverter. Both Cg and Cint are proportional to the gate sizing, so Cint = Cg , is independent of gate sizing and tp = tp0 (1 + Cext/ Cg) = tp0 (1 + f/) The delay of an inverter is a function of the ratio between its external load capacitance and its input gate capacitance, or the gate’s effective fan-out f f = Cext/Cg Sp12 CMPEN 411 L10 S.13 Inverter Chain Real goal is to minimize the delay through an inverter chain In Out Cg,1 1 2 N CL the delay of the j-th inverter stage is tp,j = tp0 (1 + Cg,j+1/(Cg,j)) = tp0(1 + fj/ ) and tp = tp1 + tp2 + . . . + tpN so tp = tp,j = tp0 (1 + Cg,j+1/(Cg,j)) If CL is given How should the inverters be sized? How many stages are needed to minimize the delay? Sp12 CMPEN 411 L10 S.14 Sizing the Inverters in the Chain of N inverters The optimum size of each inverter is the geometric mean of its neighbors – meaning that if each inverter is sized up by the same factor f wrt the preceding gate, it will have the same effective fan-out and the same delay N N f = CL/Cg,1 = F where the overall effective fan-out of the circuit is F = CL/Cg,1 and the minimum delay through the inverter chain is N tp = N tp0 (1 + ( F ) / ) The relationship between tp and F is linear for one inverter, square root for two, etc. Sp12 CMPEN 411 L10 S.15 Example of Inverter Chain Sizing In Out Cg,1 1 CL = 8 Cg,1 CL/Cg,1 has to be evenly distributed over N = 3 inverters F = CL/Cg,1 = 8/1 f = Sp12 CMPEN 411 L10 S.16 Example of Inverter Chain Sizing In Out Cg,1 1 f=2 f2 = 4 CL = 8 Cg,1 CL/Cg,1 has to be evenly distributed over N = 3 inverters F = CL/Cg,1 = 8/1 3 f = 8 = 2 Sp12 CMPEN 411 L10 S.17 Determining N: Optimal Number of Inverters What is the optimal value for N given F (= fN) ? if the number of stages is too large, the intrinsic delay of the stages becomes dominate if the number of stages is too small, the effective fan-out of each stage becomes dominate The optimum N is found by differentiating the minimum delay expression divided by the number of stages and setting the result to 0, giving N N + F - ( F ln(F))/N = 0 and f = e(1 + /f) For = 0 (ignoring self-loading) N = ln(F) and the effective-fan out (tapering factor) is f = e = 2.718 For = 1 (the typical case) N = ln(F) - 1 and the effective fan-out (tapering factor) is f = 3.6 Sp12 CMPEN 411 L10 S.18 Optimum Effective Fan-Out 5 7 6 4.5 5 4 4 3.5 3 2 3 1 2.5 0 0 0.5 1 1.5 2 2.5 3 1 1.5 2 2.5 3 3.5 4 4.5 f Choosing f larger than optimum has little effect on delay and reduces the number of stages (and area). So it is common practice to use f = 4 (for = 1) and reduce N Too many stages has a substantial negative impact on delay Sp12 CMPEN 411 L10 S.19 5 Example of Inverter (Buffer) Staging F = 64 N = ln(F) – 1 = 3.16 1 Cg,1 = 1 Cg,1 = 1 16 Cg,1 = 1 1 tp 1 64 65 2 8 18 3 4 15 4 2.8 15.3 CL = 64 Cg,1 4 1 f CL = 64 Cg,1 8 1 N 2.8 Cg,1 = 1 Sp12 CMPEN 411 L10 S.20 CL = 64 Cg,1 8 22.6 CL = 64 Cg,1 Impact of Buffer Staging for Large CL F ( = 1) 10 Unbuffered Two Stage Chain 11 8.3 Opt. Inverter Chain 8.3 100 101 22 16.5 1,000 1001 65 24.8 10,000 10,001 202 33.1 Impressive speed-ups with optimized cascaded inverter chain for very large capacitive loads. Sp12 CMPEN 411 L10 S.21 Input Signal Rise/Fall Time In reality, the input signal changes gradually (and both PMOS and NMOS conduct for a brief time). This affects the current available for charging/discharging CL and impacts propagation delay. x 10-11 5.4 5.2 5 4.8 4.6 4.4 4.2 tp increases linearly with increasing input slope, ts, once ts > tp 4 3.8 3.6 0 ts is due to the limited driving capability of the preceding gate Sp12 CMPEN 411 L10 S.22 2 4 6 8 x 10-11 ts(sec) for a minimum-size inverter with a fan-out of a single gate Design Challenge A gate is never designed in isolation: its performance is affected by both the fan-out and the driving strength of the gate(s) feeding its inputs. tip = tistep + ti-1step Keep signal rise times smaller than or equal to the gate propagation delays. ( 0.25) good for performance good for power consumption Keeping rise and fall times of the signals small and of approximately equal values is one of the major challenges in high-performance designs - slope engineering. Sp12 CMPEN 411 L10 S.23 Delay with Long Interconnects When gates are farther apart, wire capacitance and resistance can no longer be ignored. (rw, cw, L) Vin cint Vout cfan tp = 0.69RdrCint + (0.69Rdr+0.38Rw)Cw + 0.69(Rdr+Rw)Cfan where Rdr = (Reqn + Reqp)/2 = 0.69Rdr(Cint+Cfan) + 0.69(Rdrcw+rwCfan)L + 0.38rwcwL2 Wire delay rapidly becomes the dominate factor (due to the quadratic term) in the delay budget for longer wires. Sp12 CMPEN 411 L10 S.24 Next Lecture and Reminders Next lecture Designing fast logic - Reading assignment – Rabaey, et al, 6.2.1,9.2.2,9.2.3 Sp12 CMPEN 411 L10 S.25