EE141 - Spring 2003, Lecture 14: CMOS Logic

Static Complementary CMOS
[Figure: inputs In1, In2, …, InN drive a pull-up network (PUN, PMOS only) tied to VDD and a pull-down network (PDN, NMOS only); the shared output is F(In1, In2, …, InN).]
PUN and PDN are dual logic networks

Standard Cells
Cell height: 12 metal tracks
Metal track is approx. 3λ + 3λ
Pitch = repetitive distance between objects
Cell height is "12 pitch"
Rails ~10λ
[Figure: standard-cell layout showing the N well, VDD and GND rails, In, Out, the cell boundary, and a 2λ dimension.]

Standard Cells
[Figure: 2-input NAND gate with inputs A and B, output Out, and VDD and GND rails.]

Multi-Fingered Transistors
One finger
Two fingers (folded)
Less capacitance, less resistance

Stick Diagrams
Contains no dimensions
Represents relative positions of transistors
[Figure: stick diagrams of an inverter (In, Out) and a NAND2 (A, B, Out) between VDD and GND.]

Stick Diagrams: Logic Graph
X = C • (A + B)
[Figure: schematic and logic graph with PUN and PDN edges labeled A, B, C and nodes X, i, j, VDD, GND.]

Two Versions of C • (A + B)
[Figure: two stick diagrams with different input orderings of A, B, C between VDD and GND, output X.]

Consistent Euler Path
[Figure: logic graph with nodes X, i, j, VDD, GND and the input ordering A B C.]

OAI22 Logic Graph
X = (A+B)•(C+D)
[Figure: schematic and logic graph with PUN and PDN edges labeled A, B, C, D.]

Example: x = ab+cd
(a) Logic graphs for (ab+cd)
(b) Euler paths {a b c d}
(c) Stick diagram for ordering {a b c d}

CMOS Properties
Full rail-to-rail swing; high noise margins
Logic levels not dependent upon the relative device sizes; ratioless
Always a path to Vdd or Gnd in steady state; low output impedance
Extremely high input resistance; nearly zero steady-state input current
No direct path between power and ground in steady state; no static power dissipation
Propagation delay is a function of load capacitance and resistance of transistors

VTC of Complementary CMOS Gates
[Figure: two-input gate schematic (M1 to M4, inputs A and B, output F, internal node int) and its voltage-transfer curves, Vout (V) versus Vin (V) from 0.0 to 3.0, for three cases: A = B = 0→1; A = 1, B = 0→1; B = 1, A = 0→1.]

Body Effect
[Figure: content not recoverable.]

Switch Delay Model
[Figure: gates with inputs A and B modeled as switches with equivalent resistance Req: pull-up resistances Rp, pull-down resistances Rn, internal capacitance Cint, and load capacitance CL.]

Input Pattern Effects on Delay
[Figure: switch model of a two-input gate with Rp (A), Rp (B), Rn (A), Rn (B), Cint, and CL.]
Delay is dependent on the pattern of inputs
Low to high transition
  » both inputs go low: delay is $0.69\,(R_p/2)\,C_L$
  » one input goes low: delay is $0.69\,R_p\,C_L$
High to low transition
  » both inputs go high: delay is $0.69\,(2R_n)\,C_L$

Delay Dependence on Input Patterns
Input Data Pattern    Delay (psec)
A = B = 0→1           69
A = 1, B = 0→1        62
A = 0→1, B = 1        50
A = B = 1→0           35
A = 1, B = 1→0        76
A = 1→0, B = 1        57
NMOS = 0.5 µm/0.25 µm, PMOS = 0.75 µm/0.25 µm, CL = 100 fF
[Figure: two-input gate schematic (M1 to M4, output F, internal node int) and output voltage [V] versus time [ps], 0 to 400 ps, for A = B = 1→0; A = 1, B = 1→0; and A = 1→0, B = 1.]

Transistor Sizing
[Figure: switch models with pull-up resistances Rp, pull-down resistances Rn, internal capacitance Cint, and load CL; inputs A and B.]

Transistor Sizing a Complex CMOS Gate
OUT = D + A • (B + C)
[Figure: transistor-level schematic with inputs A, B, C, D and device size annotations (8, 12, 4, 6; 2, 1).]

Fan-In Considerations
[Figure: four-input gate with inputs A, B, C, D, internal node capacitances C1, C2, C3, and load CL.]
Distributed RC model (Elmore delay)
$t_{pHL} = 0.69\,R_{eqn}\,(C_1 + 2C_2 + 3C_3 + 4C_L)$
Propagation delay deteriorates rapidly as a function of fan-in: quadratically in the worst case.

tp as a Function of Fan-In
Gates with a fan-in greater than 4 should be avoided.
[Figure: tp (psec), 0 to 1250, versus fan-in, 2 to 16; tpHL grows quadratically, tpLH grows linearly.]

tp as a Function of Fan-Out
All gates have the same drive current.
Slope is a function of "driving strength".
[Figure: tp (psec) versus eff. fan-out, 2 to 16, with curves tpNOR2, tpNAND2, and tpINV.]

tp as a Function of Fan-In and Fan-Out
Fan-in: quadratic due to increasing resistance and capacitance
Fan-out: each additional fan-out gate adds two gate capacitances to CL
$t_p = a_1 FI + a_2 FI^2 + a_3 FO$

Fast Complex Gates: Design Technique 1
Transistor sizing
  » as long as fan-out capacitance dominates
Progressive sizing
[Figure: series stack M1 (In1) through MN (InN) with internal capacitances C1, C2, C3 and load CL at MN, forming a distributed RC line.]
M1 > M2 > M3 > … > MN (the FET closest to the output is the smallest)
Can reduce delay by more than 20%; decreasing gains as technology shrinks

Fast Complex Gates: Design Technique 2
Transistor ordering
Critical-path input In1 (0→1) drives M1, farthest from the output; In2 = 1 drives M2 and In3 = 1 drives M3:
  » CL, C2, and C1 are all charged
  » delay determined by time to discharge CL, C1 and C2
Critical-path input In1 (0→1) drives M3, next to the output; In2 = 1 drives M2 and In3 = 1 drives M1:
  » CL is charged; C2 and C1 are discharged
  » delay determined by time to discharge CL

Fast Complex Gates: Design Technique 3
Alternative logic structures
F = ABCDEFGH

Fast Complex Gates: Design Technique 4
Isolating fan-in from fan-out using buffer insertion
[Figure: gate driving CL directly, and the same gate driving CL through a buffer.]

Fast Complex Gates: Design Technique 5
Reducing the voltage swing
$t_{pHL} = 0.69 \left( \frac{3}{4} \frac{C_L V_{DD}}{I_{DSATn}} \right) = 0.69 \left( \frac{3}{4} \frac{C_L V_{swing}}{I_{DSATn}} \right)$
  » linear reduction in delay
  » also reduces power consumption
But the following gate is much slower!
Or requires use of "sense amplifiers" on the receiving end to restore the signal level (memory design)

Sizing Logic Paths for Speed
Frequently, input capacitance of a logic path is constrained
Logic has to drive some capacitance
Example: ALU load in an Intel microprocessor is 0.5 pF
How do we size the ALU datapath to achieve maximum speed?
We have already solved this for the inverter chain. Can we generalize it for any type of logic?

Logical Effort for Inverter Chain
[Figure: chain of inverters, stages 1, 2, …, N, from In to Out, driving load CL.]
$D = D_1 + D_2 + \dots + D_N$
$D_i \sim \tau_0 \left( 1 + \frac{C_{i+1}}{\gamma C_i} \right)$
$\text{Delay} = \sum_{i=1}^{N} D_i \sim \tau_0 \sum_{i=1}^{N} \left( 1 + \frac{C_{i+1}}{\gamma C_i} \right)$
How do we extend this to any logic network?

Logical Effort
$\text{Delay} = k \cdot \tau_0 \left( p + \frac{g f}{\gamma} \right)$
p: parasitic delay, a gate parameter that is not a function of transistor width W
g: logical effort, a gate parameter that is not a function of transistor width W
f: electrical effort (effective fanout)
Normalize everything to an inverter: $g_{inv} = 1$, $p_{inv} = 1$
Everything is measured in unit delays $\tau_0$

Delay in a Logic Gate
Gate delay: d = h + p (effort delay + intrinsic delay)
Effort delay: h = g f (logical effort × effective fanout, where f = Cout/Cin)
Logical effort is a function of topology, independent of sizing
Effective fanout (electrical effort) is a function of load/gate size

Buffer Example
[Figure: chain of N buffers, stages 1, 2, …, N, from In to Out, driving load CL.]
$\text{Delay} = \sum_{i=1}^{N} \left( p_i + \frac{g_i f_i}{\gamma} \right)$ (in units of $\tau_0$)
$p_i$, $g_i$ are constant (and equal to 1)
Variables are $f_i$
Minimum delay is when the $f_i$ are equal (each stage bears the same effort)

Logical Effort
Inverter has the smallest logical effort and intrinsic delay of all static CMOS gates
Logical effort of a gate represents the ratio of its input capacitance to the inverter capacitance when sized to deliver the same current
Logical effort increases with the gate complexity

Calculating Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter with the same output current
[Figure: three sized gates with g = 1 (inverter), g = 4/3 (2-input NAND), and g = 5/3 (2-input NOR).]

Logical Effort of Gates
[Figure: normalized delay (d) versus fan-out (f), 1 to 7, with two lines:
  tpINV:  g = 1,   p = 1, d = f + 1
  tpNAND: g = 4/3, p = 2, d = (4/3) f + 2
The vertical offset between the lines is labeled F(Fan-in).]

Logical Effort
[Figure: content not recoverable.]

Add Branching Effort
Branching effort: $b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$

Multistage Networks
$\text{Delay} = \sum_{i=1}^{N} \left( p_i + \frac{g_i f_i}{\gamma} \right)$
Stage effort: $h_i = g_i f_i$
Path electrical effort: $F = C_{out}/C_{in}$
Path logical effort: $G = g_1 g_2 \dots g_N$
Branching effort: $B = b_1 b_2 \dots b_N$
Path effort: $H = GFB$
Path delay: $D = \sum d_i = \sum p_i + \sum f_i g_i / \gamma$

Optimum Effort per Stage
When each stage bears the same effort: $\hat{h}^N = H$, so $\hat{h} = \sqrt[N]{H}$
Effective fanout of each stage: $f_i = \hat{h} / g_i$
Minimum path delay: $\hat{D} = \sum_{i=1}^{N} \left( \frac{g_i f_i}{\gamma} + p_i \right) = \frac{N H^{1/N}}{\gamma} + P$

Example: Optimize Path
[Figure: four-stage path whose first gate has input capacitance 10; the inputs of the following three gates are sized x, y, and z.]
F = 2
G = 20/9
H = 40/9
h = 1.45
x = 10 f1 = 10 h/g1 = 14.5
y = 14.5 (1.45 × 3/5) = 12.6
z = 12.6 (1.45 × 3/4) = 13.7
Assume that the size factors relate to a gate with the same input cap as the inverter
From David Harris

Multi-level logic: What is best?
Option 1: g = 10/3, g = 1; G = 10/3
Option 2: g = 2, g = 5/3; G = 10/3
Option 3: g = 4/3, g = 5/3, g = 4/3, g = 1; G = 80/27

Handling Wires & Fixed Loads
[Figure: stage driving a wire capacitance Cw and a fixed load CL.]
$\text{Delay} = \sum_{i=1}^{N} \left( p_i + \frac{g_i (f_i + w_i)}{\gamma} \right)$

Summary
                    Stage            Path
Logical Effort      g                G = ∏ g_i
Electrical Effort   f = Cout/Cin     F = Cout/Cin
Branching Effort    n/a              B = ∏ b_i
Effort              h = fg           H = FGB
Effort Delay        h                D_H = ∑ h_i
Number of Stages    1                N
Parasitic Delay     p                P = ∑ p_i
Delay               d = h + p        D = D_H + P

Method of Logical Effort
Compute the path effort: H = GBF
Find the best number of stages: N ~ log4 H
Compute the stage effort: h = H^(1/N)
Sketch the path with this number of stages
Working from either end, find sizes: Cin = Cout × g/h
Reference: Sutherland, Sproull, Harris, "Logical Effort," Morgan-Kaufmann 1999.
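
The method of logical effort can be sketched in Python on the "Optimize Path" example. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `size_path` and its parameters are illustrative, branching is assumed absent (B = 1), the stage count is taken as given rather than chosen from N ~ log4 H, and the gate efforts 1, 5/3, 4/3, 1 are inferred from the example's own arithmetic (G = 20/9 and the factors 3/5 and 3/4).

```python
from math import prod


def size_path(g, c_in, c_load):
    """Size a path of gates for minimum delay (assumes no branching, B = 1).

    g      -- logical effort of each stage, ordered from input to output
    c_in   -- input capacitance of the first stage
    c_load -- load capacitance driven by the last stage
    Returns (H, h, caps), where caps[i] is the input capacitance of stage i.
    """
    n = len(g)
    G = prod(g)              # path logical effort
    F = c_load / c_in        # path electrical effort
    H = G * F                # path effort, with B = 1
    h = H ** (1.0 / n)       # equal effort borne by every stage

    # Work backward from the load: Cin = Cout * g / h for each stage.
    caps = [0.0] * n
    c_out = c_load
    for i in reversed(range(n)):
        caps[i] = c_out * g[i] / h
        c_out = caps[i]
    return H, h, caps


# Example values: input capacitance 10 and F = 2, so the load is 20.
H, h, caps = size_path([1, 5 / 3, 4 / 3, 1], c_in=10.0, c_load=20.0)
print(f"H = {H:.3f}, h = {h:.3f}")
print("input capacitances:", [round(c, 2) for c in caps])
```

Working backward from the load recovers the specified input capacitance of 10 for the first stage, which is a useful consistency check. The remaining three values agree with x = 14.5, y = 12.6, and z = 13.7 to within the rounding of h to 1.45.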