COMP 103 Lecture 10
Inverter Dynamics: The Quest for Performance
Sections 5.4.2, 5.4.3
[All lecture notes are adapted from Mary Jane Irwin, Penn State, which were adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]
COMP103-L10.1

What is this lecture about? PERFORMANCE
1. Factors affecting performance: CL, W/L, VDD.
2. The ratio of the PMOS to the NMOS could be optimized for symmetrical tpHL and tpLH and a symmetric VTC, but here we learn how to set the ratio to optimize tp.
3. While sizing up an inverter reduces its delay, it also increases its input capacitance, which impacts the delay of the driving gate (self-loading). What is the best sizing?
4. Now we can size a chain of inverters. If CL is given, how should the inverters be sized? How many stages are needed to minimize the delay?
5. What about the impact of the input slope (instead of a step)?
6. What about inverters with long wire delays in between?
[Figure: inverter chain from In (input capacitance Cg,1) to Out, driving CL = 8 Cg,1]
COMP103-L10.2

Inverter Propagation Delay, Revisited
Propagation delay is proportional to the time constant of the network formed by the pull-down resistor and the load capacitance.
[Figure: with Vin = VDD, the NMOS on-resistance Rn discharges CL toward Vout = 0; tpHL = f(Rn, CL)]
tpHL = ln(2) Reqn CL = 0.69 Reqn CL
tpLH = ln(2) Reqp CL = 0.69 Reqp CL
tp = (tpHL + tpLH)/2 = 0.69 CL (Reqn + Reqp)/2
To equalize rise and fall times, make the on-resistances of the NMOS and PMOS approximately equal.
COMP103-L10.3

Inverter Propagation Delay, Revisited
To see how a designer can optimize the delay of a gate, we have to expand Req in the delay equation (for VGS = VDD, VDS = VDD → VDD/2):
tpHL = 0.69 Reqn CL
     = 0.69 (3/4 (VDD/IDSATn)) CL
     ≈ 0.52 CL / ((W/L)n k’n VDSATn)   (when VDD is much larger than VTn + VDSATn/2)
[Figure: normalized tp versus VDD (V), for VDD from 0.8 V to 2.4 V]
COMP103-L10.4

Derivation of Reqn
COMP103-L10.5

Yes, that should have looked familiar.
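For reference, the Reqn derivation that the notes leave to the lecture can be sketched as follows. This follows the standard textbook treatment and assumes a velocity-saturated transistor; the channel-length modulation factor λ and the threshold voltage VT do not appear in the notes and are introduced here only for illustration.

```latex
\begin{align*}
R_{eq} &= \frac{1}{V_{DD}/2}\int_{V_{DD}/2}^{V_{DD}}
          \frac{V}{I_{DSAT}\,(1+\lambda V)}\,dV
        \;\approx\; \frac{3}{4}\,\frac{V_{DD}}{I_{DSAT}}
          \left(1-\frac{7}{9}\,\lambda V_{DD}\right), \\
I_{DSAT} &= k'\,\frac{W}{L}\left((V_{DD}-V_T)\,V_{DSAT}-\frac{V_{DSAT}^{2}}{2}\right), \\
t_{pHL} &= 0.69\,R_{eqn}\,C_L
        \;\approx\; 0.52\,\frac{C_L\,V_{DD}}
          {(W/L)_n\,k'_n\,V_{DSATn}\,\bigl(V_{DD}-V_{Tn}-V_{DSATn}/2\bigr)}
          \qquad (\lambda \text{ neglected}).
\end{align*}
```

When VDD is much larger than VTn + VDSATn/2, the VDD terms cancel and the last expression reduces to 0.52 CL / ((W/L)n k’n VDSATn), which is why raising VDD beyond a certain point buys little additional speed.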
Same results with the simulated equivalent resistance of a minimum-size NMOS transistor (for VGS = VDD, VDS = VDD → VDD/2):
[Figure: simulated Req (Ohm, ×10^5 scale) versus VDD (V), for VDD from 0.5 V to 2.5 V]
VDD (V)   NMOS (kΩ)   PMOS (kΩ)
1         35          115
1.5       19          55
2         15          38
2.5       13          31
COMP103-L10.6

Design for Performance
Reduce CL
- internal diffusion capacitance of the gate itself: keep the drain diffusion as small as possible
- interconnect capacitance
- fanout
Increase the W/L ratio of the transistor
- the most powerful and effective performance optimization tool in the hands of the designer
- watch out for self-loading, which occurs when the intrinsic capacitance dominates the extrinsic load
Increase VDD
- can trade off energy for performance
- increasing VDD above a certain level yields only minimal improvements
- reliability concerns enforce a firm upper bound on VDD
COMP103-L10.7

NMOS/PMOS Ratio
Concerns:
- symmetrical VTC
- equal high-to-low and low-to-high propagation delays
- speed, tp
So far we have sized the PMOS and NMOS so that their Req values match (a ratio of 2 to 3) ⇒ symmetric VTC, equal tpHL and tpLH.
If speed is the only concern, reduce the width of the PMOS device!
- What happens?
There must be a ratio β = (W/Lp)/(W/Ln) that optimizes tp.
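A sketch of why such an optimum exists, following the standard textbook argument. It assumes an inverter driving an identical inverter; the symbols Cdn1 (NMOS drain diffusion capacitance of the driver), Cgn2 (NMOS gate capacitance of the load), CW (wiring capacitance), and r = Reqp/Reqn (for identically sized devices) are introduced here for illustration. Widening the PMOS by β scales its capacitances by roughly β and divides its resistance by β:

```latex
\begin{align*}
C_L &= (1+\beta)\,(C_{dn1}+C_{gn2}) + C_W, \\
t_p &= \frac{0.69}{2}\,\bigl[(1+\beta)(C_{dn1}+C_{gn2}) + C_W\bigr]\,
       R_{eqn}\left(1+\frac{r}{\beta}\right), \\
\frac{\partial t_p}{\partial \beta} = 0
  \;&\Rightarrow\;
  (C_{dn1}+C_{gn2}) - \frac{r\,(C_{dn1}+C_{gn2}+C_W)}{\beta^{2}} = 0
  \;\Rightarrow\;
  \beta_{opt} = \sqrt{\,r\left(1+\frac{C_W}{C_{dn1}+C_{gn2}}\right)}.
\end{align*}
```

A wider PMOS lowers tpLH but adds capacitance that slows tpHL, so the average delay has a minimum at a β smaller than the value that gives a symmetric response.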
r = Reqp/Reqn (resistance ratio of identically sized PMOS and NMOS)
βopt = √r when wiring capacitance is negligible
COMP103-L10.8

Derivation of βopt
COMP103-L10.9

PMOS/NMOS Ratio Effects: Simulation
[Figure: simulated tpLH, tpHL, and tp (sec, ×10^-11 scale) versus β = (W/Lp)/(W/Ln), for β from 1 to 5]
β of 2.4 (= 31 kΩ/13 kΩ) gives a symmetrical response.
β of 1.6 to 1.9 gives optimal performance.
COMP103-L10.10

Device Sizing for Performance
Divide the capacitive load, CL, into
- Cint: intrinsic (diffusion and Miller effect)
- Cext: extrinsic (wiring and fanout)
tp = tp0 (1 + Cext/Cint)
- where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the gate
Widening both PMOS and NMOS by a factor S reduces Req by an identical factor (Req = Rref/S), but raises the intrinsic capacitance by the same factor (Cint = S Ciref):
tp = 0.69 Rref Ciref (1 + Cext/(S Ciref)) = tp0 (1 + Cext/(S Ciref))
- tp0 is independent of the sizing of the gate; with no load, there is no gain, because the increased drive of the gate is totally offset by the increased capacitance
- any S sufficiently larger than (Cext/Cint) yields the best performance gains with the least area impact
COMP103-L10.11

Example of Finding S
Given a time budget of 4 ps, tp0 = 2 ps, Cext = 9 fF, and Cint = 3 fF, determine the smallest S that allows tp to meet the timing budget.
COMP103-L10.12

Sizing Impacts on Delay
The majority of the improvement is already obtained for S = 5. Sizing factors larger than 10 barely yield any extra gain (and cost significantly more area).
[Figure: tp (sec, ×10^-11 scale) versus sizing factor S from 1 to 15, for a fixed load; at large S the self-loading effect appears (intrinsic capacitance dominates)]
COMP103-L10.13

Can’t Study Delay in Isolation
The simplest case is studied in an inverter chain, but the basics apply (creatively) to other cases.
[Figure: chain of N inverters (1, 2, ..., N) from In (input capacitance Cg,1) to Out, driving CL]
COMP103-L10.14

Impact of Fanout on Delay
Extrinsic capacitance, Cext, is a function of the fanout of the gate: the larger the fanout, the larger the external load.
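Returning to the sizing example above (time budget 4 ps, tp0 = 2 ps, Cext = 9 fF, Cint = 3 fF), here is a minimal sketch of how the smallest S could be computed. It assumes the given Cint is the intrinsic capacitance of the unsized reference gate (Ciref); the function names are hypothetical and chosen only for illustration.

```python
def delay(s, tp0, c_ext, c_iref):
    """Loaded delay tp = tp0 * (1 + Cext / (S * Ciref)) for sizing factor S."""
    return tp0 * (1.0 + c_ext / (s * c_iref))


def min_sizing_factor(t_budget, tp0, c_ext, c_iref):
    """Smallest S for which delay(S) meets t_budget.

    Solving tp0 * (1 + Cext / (S * Ciref)) = t_budget for S gives
    S = Cext / (Ciref * (t_budget / tp0 - 1)).
    """
    if t_budget <= tp0:
        # tp0 is a floor: sizing alone cannot beat the intrinsic delay.
        raise ValueError("time budget must exceed the intrinsic delay tp0")
    return c_ext / (c_iref * (t_budget / tp0 - 1.0))


# Values from the example, in ps and fF. Only ratios of like quantities
# enter the formula, so no unit conversion is needed.
# Assumption: Cint = 3 fF is the reference-gate value Ciref.
s_min = min_sizing_factor(t_budget=4.0, tp0=2.0, c_ext=9.0, c_iref=3.0)
print(f"S_min = {s_min:g}, tp(S_min) = {delay(s_min, 2.0, 9.0, 3.0):g} ps")
```

With these numbers the budget is met at S = 3, since 2 ps × (1 + 9/(3·3)) = 4 ps; any larger S also meets it, at the cost of more area.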
Two steps:
First, determine the relationship between the input loading Cg and the output loading Cint; both are proportional to the gate sizing. Define: γ = Cint/Cg
Second, determine the relationship between Cext and Cg: f = Cext/Cg (f is the effective fan-out)
tp = tp0 (1 + Cext/Cint) = tp0 (1 + f/γ)
Because γ is a technology constant that is close to 1 in most processes, the delay of an inverter depends only on the ratio between its external load capacitance and its input gate capacitance: the effective fan-out f.
COMP103-L10.15

Inverter Chain
Our goal is to minimize the delay through an inverter chain.
[Figure: chain of N inverters (1, 2, ..., N) from In (input capacitance Cg,1) to Out, driving CL]
The delay of the j-th inverter stage is tp,j = tp0 (1 + Cg,j+1/(γ Cg,j)), so the total delay through the N stages is
tp,total = ∑ tp,j = tp0 ∑ (1 + Cg,j+1/(γ Cg,j)), summed over j = 1, ..., N, with Cg,N+1 = CL
COMP103-L10.16

Sizing Inverter Chains: The Questions
If Cg,1 and CL are given:
- Given a fixed number of inverter stages, how should the inverters be sized?
- How many stages are needed to minimize the delay? And what sizes should they be?
COMP103-L10.17

Sizing the Inverters in the Chain
How many unknowns are there? (Check with the tp,total equation.)
Take N-1 partial derivatives and equate them to 0. The result is a set of constraints:
- Cg,j+1/Cg,j = Cg,j/Cg,j-1, with j = 2, ..., N
The optimum size of each inverter is the geometric mean of its neighbors, so each gate has the same effective fan-out and the same delay.
If each inverter is sized up by the same factor f with respect to the preceding gate, then
f = (CL/Cg,1)^(1/N) = F^(1/N)
where F represents the overall effective fan-out of the circuit (F = CL/Cg,1), and the minimum delay through the inverter chain is
tp = N tp0 (1 + F^(1/N)/γ)
COMP103-L10.18

Example of Inverter Chain Sizing
[Figure: three-inverter chain from In (input capacitance Cg,1) to Out, driving CL = 8 Cg,1]
CL/Cg,1 has to be evenly distributed over N = 3 inverters.
CL/Cg,1 = 8/1
f = 8^(1/3) = 2
COMP103-L10.19
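The chain-sizing rule above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names are hypothetical, delay is reported in units of tp0, and γ = 1 is assumed (the notes only say γ is close to 1).

```python
def size_chain(c_g1, c_load, n_stages):
    """Return (f, gate_caps) for an N-stage chain with equal effective fan-out.

    f = (CL / Cg,1) ** (1/N); stage j (counting from 0) has input
    capacitance Cg,1 * f**j.
    """
    overall_fanout = c_load / c_g1            # F = CL / Cg,1
    f = overall_fanout ** (1.0 / n_stages)    # per-stage effective fan-out
    gate_caps = [c_g1 * f**j for j in range(n_stages)]
    return f, gate_caps


def chain_delay(gate_caps, c_load, tp0=1.0, gamma=1.0):
    """Sum tp0 * (1 + Cg,j+1 / (gamma * Cg,j)) over all stages.

    The last stage drives c_load. tp0 = 1 and gamma = 1 are assumptions
    used to normalize the result.
    """
    loads = gate_caps[1:] + [c_load]
    return sum(tp0 * (1.0 + load / (gamma * cg))
               for cg, load in zip(gate_caps, loads))


# Example from the notes: CL = 8 Cg,1 spread over N = 3 inverters.
f, caps = size_chain(c_g1=1.0, c_load=8.0, n_stages=3)
print(f"f = {f:.3f}, gate caps = {[round(c, 3) for c in caps]}")
print(f"sized chain delay   = {chain_delay(caps, 8.0):.3f} tp0")
print(f"unsized chain delay = {chain_delay([1.0, 1.0, 1.0], 8.0):.3f} tp0")
```

For the example this gives f = 2, input capacitances in the ratio 1 : 2 : 4, and a total delay of 3 tp0 (1 + 2) = 9 tp0, compared with 13 tp0 for three unit-sized inverters driving the same load under the same assumptions.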