EE141 EE141-Spring 2007 Digital Integrated Circuits Lecture 22 I/O, Power Distribution Adders 1 EE141 EECS141 Announcements Homework 9 has been posted Due Tu. Apr. 24, 5pm Project Phase 4 (Final) Report due Mo. Apr. 30, noon Poster presentations Tu. May 1st, 3-6pm Final Exam Mo. May 14, 5-8pm, 145 McCone EE141 EECS141 2 1 EE141 Class Material Last lecture Timing Clock distribution Today’s lecture I/O Power distribution Intro to adders Reading Chapter 11 3 EE141 EECS141 Impact of Interconnect Design Issues EE141 EECS141 4 2 EE141 Impact of Capacitance Capacitive Crosstalk-Dynamic Node ∆VY = V DD CLK In 1 In 2 In 3 C XY C XY ∆VX CY + C XY Y CY X PDN 2. 5 V 0V CLK 3 x 1 µm overlap: 0.19 V disturbance 5 EE141 EECS141 Capacitive Cross Talk Driven Node 0.5 0.45 0.4 X VX RY CXY 0.3 Y CY tr↑ 0.35 0.25 V τXY = RY(CXY+CY) ( Vo l t ) 0.2 0.15 0.1 0.05 0 0 0.2 0.4 0.6 0.8 1 t (nsec) Keep time-constant smaller than rise time EE141 EECS141 6 3 EE141 How to Battle Capacitive Crosstalk Shielding wire GND VDD Shielding layer GND Substrate (GND) Avoid large crosstalk cap’s Avoid floating nodes Isolate sensitive nodes Control rise/fall times -> large Do not run wires together on long distances Shield! Differential signaling 7 EE141 EECS141 Delay Degradation Cc - Impact of neighboring signal activity on switching delay - When neighboring lines switch in opposite direction of victim line, delay increases Miller Effect - Both terminals of capacitor are switched in opposite directions (0 → Vdd, Vdd → 0) - Effective voltage is doubled and additional charge is needed (from Q=CV) - Wire length = 100 µm in 0.25 µm results in worst-case 80% tp degradation!8 EE141 EECS141 4 EE141 Driving Large Capacitances V DD V in V out CL • Transistor Sizing • Cascaded Buffers 9 EE141 EECS141 Using Cascaded Buffers In Out 1 2 0.25 µm process Cin = 2.5 fF tp0 = 30 ps N CL = 20 pF F = CL/Cin = 8000 fopt = 3.6 N = 7 tp = 0.76 ns (See Chapter 5) EE141 EECS141 10 5 EE141 Output Driver Design Trade off Performance for Area and Energy Given tpmax find N and f Area f −1 F −1 A = (1 + f + f + ... + f )A = A = A f −1 f −1 min Energy N N −1 2 driver ( Edriver = 1 + f + f 2 + ... + f N −1 min )C V i 2 DD = min F −1 C 2 2 CiVDD ≈ L VDD f −1 f −1 11 EE141 EECS141 Delay as a Function of F and N 10,000 F = 10,000 0 1000 p tp/tp0 t / p t 100 F = 1000 10 1 3 5 7 F = 100 9 11 Number of buffer stages N EE141 EECS141 12 6 EE141 I/O Design 13 EE141 EECS141 Bonding Pad Design Bonding Pad GND 100 µm Out VDD EE141 EECS141 In GND Out 14 7 EE141 ESD Protection When a chip is connected to a board, there is unknown (potentially large) static voltage difference Equalizing potentials requires (large) charge flow through the pads Diodes sink this charge into the substrate – need guard rings to pick it up. 15 EE141 EECS141 Pads + ESD Protection V DD PAD R D1 X D2 C Diode EE141 EECS141 16 8 EE141 Chip Packaging Bonding wire •Bond wires (~25µm) are used to connect the package to the chip Chip L Mounting cavity Lead frame L´ • Pads are arranged in a frame around the chip • Pads are relatively large (~100µm in 0.25µm technology), with large pitch (100µm) Pin •Many chips areas are ‘pad limited’ 17 EE141 EECS141 Pad Frame Layout EE141 EECS141 Die Photo 18 9 EE141 Chip Packaging An alternative is ‘flipchip’: Pads are distributed around the chip The soldering balls are placed on pads The chip is ‘flipped’ onto the package Can have many more pads 19 EE141 EECS141 Power Distribution EE141 EECS141 20 10 EE141 Impact of Resistance We have already learned how to drive RC interconnect Impact of resistance is commonly seen in power supply distribution: IR drop Voltage variations Power supply is distributed to minimize the IR drop and the change in current due to switching of gates 21 EE141 EECS141 RI Introduced Noise V DD f pre I R V DD - ∆ V X M1 I ∆V ∆V R EE141 EECS141 22 11 EE141 Resistance and the Power Distribution Problem After Before • Requires fast and accurate peak current prediction • Heavily influenced by packaging technology EE141 EECS141 23 Source: Cadence Power Distribution Low-level distribution is in Metal 1 Power has to be ‘strapped’ in higher layers of metal. The spacing is set by IR drop, electromigration, inductive effects Always use multiple contacts on straps EE141 EECS141 24 12 EE141 Power and Ground Distribution GND VDD Logic Logic VDD VDD GND GND (a) Finger-shaped network (b) Network with multiple supply pins 25 EE141 EECS141 3 Metal Layer Approach (EV4) 3rd “coarse and thick” metal layer added to the technology for EV4 design Power supplied from two sides of the die via 3rd metal layer 2nd metal layer used to form power grid 90% of 3rd metal layer used for power/clock routing Metal 3 Metal 2 Metal 1 EE141 EECS141 Courtesy Compaq 26 13 EE141 4 Metal Layers Approach (EV5) 4th “coarse and thick” metal layer added to the technology for EV5 design Power supplied from four sides of the die Grid strapping done all in coarse metal 90% of 3rd and 4th metals used for power/clock routing Metal 4 Metal 3 Metal 2 Metal 1 EE141 EECS141 Courtesy Compaq 27 6 Metal Layer Approach – EV6 2 reference plane metal layers added to the technology for EV6 design Solid planes dedicated to Vdd/Vss Significantly lowers resistance of grid Lowers on-chip inductance RP2/Vdd Metal 4 Metal 3 RP1/Vss Metal 2 Metal 1 EE141 EECS141 Courtesy Compaq 28 14 EE141 Electromigration (1) Limits dc-current to 1 mA/µm EE141 EECS141 29 Electromigration (2) EE141 EECS141 30 15 EE141 Decoupling Capacitors 1 Board wiring Bonding wire Cd SUPPLY CHIP 2 Decoupling capacitor Decoupling capacitors are added: On the board (right under the supply pins) On the chip (under the supply straps, near large buffers) 31 EE141 EECS141 Decoupling Capacitors Under EE141 EECS141 the die 32 16 EE141 33 EE141 EECS141 Adders EE141 EECS141 34 17 EE141 a CARRYGEN g64 SUMSEL node1 2-1 Mux 9-1 Mux ck1 SUMGEN + LU b REG 5-1 Mux 9-1 Mux An Intel Microprocessor sum sumb to Cache s0 s1 LU : Logical Unit 1000um Itanium has 6 64-bit integer execution units like this one 35 EE141 EECS141 Bit-Sliced Design Control Bit 2 Bit 1 Data-Out Multiplexer Shifter Adder Register Data-In Bit 3 Bit 0 Tile identical processing elements EE141 EECS141 36 18 EE141 Bit-Sliced Datapath From register files / Cache / Bypass Multiplexers Shifter Adder stage 1 Wiring Loopback Bus Loopback Bus Loopback Bus Wiring Adder stage 2 Bit slice 0 Bit slice 1 Bit slice 2 Bit slice 63 Adder stage 3 Sum Select To register files / Cache EE141 EECS141 37 Itanium Integer Datapath EE141 EECS141 Fetzer, Orton, ISSCC’02 38 19 EE141 Full-Adder A Cin B Full adder Cout Sum 39 EE141 EECS141 The Binary Adder A Cin B Full adder Cout Sum S = A ⊕ B ⊕ Ci = ABC i + ABC i + ABCi + ABCi C o = AB + BCi + ACi EE141 EECS141 40 20 EE141 Express Sum and Carry as a function of P, G, D Define 3 new variable which ONLY depend on A, B Generate (G) = AB Propagate (P) = A ⊕ B Delete = A B Can also derive expressions for S and C o based on D and P Note that we will be sometimes using an alternate definition for Propagate (P) = A + B 41 EE141 EECS141 The Ripple-Carry Adder B0 A0 Ci,0 FA S0 A1 Co,0 (= Ci,1) B1 FA A2 Co,1 S1 B2 FA A3 Co,2 S2 B3 FA Co,3 S3 Worst case delay linear with the number of bits td = O(N) tadder = (N-1)tcarry + tsum Goal: Make the fastest possible carry path circuit EE141 EECS141 42 21 EE141 Complementary Static CMOS Full Adder VDD VDD Ci A A B B A B B Ci A X Ci VDD Ci S A Ci A B B VDD A B Ci A Co B 28 Transistors 43 EE141 EECS141 Inversion Property A Ci FA S EE141 EECS141 A B Co Ci B FA Co S 44 22 EE141 Minimize Critical Path by Reducing Inverting Stages Even cell A0 Ci,0 B0 A1 Co,0 FA B1 FA S0 A2 Co,1 Odd cell B2 Co,2 FA S1 A3 S2 B3 Co,3 FA S3 Exploit Inversion Property 45 EE141 EECS141 A Better Structure: The Mirror Adder VDD VDD A B VDD A B B Kill "0"-Propagate A Ci S Ci A "1"-Propagate B Ci Co Ci A Generate A B B A B Ci A B 24 transistors EE141 EECS141 46 23 EE141 The Mirror Adder •The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carrygeneration circuitry. •When laying out the cell, the most critical issue is the minimization of the capacitance at node Co. The reduction of the diffusion capacitances is particularly important. •The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell . •The transistors connected to Ci are placed closest to the output. •Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size. 47 EE141 EECS141 Next Lecture Adders, EE141 EECS141 Multipliers 48 24