COMP 103 Lecture 13
Adder Design
Reading: Chapter 11, up to page 577
[All lecture notes are adapted from Mary Jane Irwin, Penn State, which were adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]
COMP103- L13 Adder Design.1

Review: Basic Building Blocks
Datapath
- Execution units
  - Adder, multiplier, divider, shifter, etc.
- Register file and pipeline registers
- Multiplexers, decoders
Control
- Finite state machines (PLA, ROM, random logic)
Interconnect
- Switches, arbiters, buses
Memory
- Caches (SRAMs), TLBs, DRAMs, buffers

The 1-bit Binary Adder
[Figure: 1-bit Full Adder (FA) with inputs A, B, Cin and outputs S, Cout]

G = A&B
P = A⊕B
K = !A & !B

A B Cin | Cout S | carry status
0 0 0   | 0    0 | kill
0 0 1   | 0    1 | kill
0 1 0   | 0    1 | propagate
0 1 1   | 1    0 | propagate
1 0 0   | 0    1 | propagate
1 0 1   | 1    0 | propagate
1 1 0   | 1    0 | generate
1 1 1   | 1    1 | generate

S = A ⊕ B ⊕ Cin = P ⊕ Cin
Cout = A&B | A&Cin | B&Cin (majority function) = G | P&Cin

- How can we use it to build a 64-bit adder?
- How can we modify it easily to build an adder/subtractor?
- How can we make it better (faster, lower power, smaller)?
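The truth table and the G/P/K signals above can be checked with a short behavioral sketch. This is only a model of the logic equations, not of any circuit; the function names `full_adder` and `carry_status` are made up for this illustration.

```python
from itertools import product


def full_adder(a, b, cin):
    """Return (s, cout) for one bit, computed from the generate/propagate signals."""
    g = a & b             # generate: both inputs are 1
    p = a ^ b             # propagate: exactly one input is 1
    s = p ^ cin           # S = P xor Cin
    cout = g | (p & cin)  # Cout = G | P&Cin
    return s, cout


def carry_status(a, b):
    """Classify a bit position as generate, propagate, or kill."""
    if a & b:
        return "generate"
    if a ^ b:
        return "propagate"
    return "kill"


# Walk the eight rows of the truth table and cross-check both Cout forms.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    majority = (a & b) | (a & cin) | (b & cin)
    assert cout == majority            # G | P&Cin equals the majority function
    assert 2 * cout + s == a + b + cin  # the pair (Cout, S) encodes the arithmetic sum
    print(a, b, cin, cout, s, carry_status(a, b))
```

The carry status depends only on A and B, which is why G, P, and K can be formed before the carry-in arrives.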
FA Gate Level Implementations
The way you learned to design digital logic
[Figure: gate-level full adder with inputs A, B, Cin and outputs Cout, S]

Review: Pass Transistor Logic
[Figure: pass-transistor OR/NOR, AND/NAND, and XOR/XNOR gates driven by A, B and their complements, producing F = A+B, F = AB, F = A⊕B and the complementary outputs]

PT FA
[Figure: pass-transistor full adder producing S, !S, Cout, !Cout from A, B, Cin and their complements]
20+8 transistors, dual rail; beware of threshold drops

Review: XOR FA
[Figure: XOR-based full adder with inputs A, B, Cin and outputs S, Cout]
16 transistors

Complementary Static CMOS Full Adder
Cout = A&B | B&Cin | A&Cin
SUM = A&B&Cin | !Cout&(A | B | Cin)
[Figure: transistor schematic with inputs A, B, Ci, internal node X, and outputs S and Co]
28 transistors

Mirror Adder
24+4 transistors
[Figure: mirror adder schematic with outputs !Cout and !S; the carry-stage paths are labeled kill, 0-propagate, 1-propagate, and generate]
Cout = A&B | B&Cin | A&Cin
SUM = A&B&Cin | !Cout&(A | B | Cin)

Inversion Property
Inverting all inputs to a FA results in inverted values for all outputs
[Figure: an FA with inputs A, B, Cin and outputs S, Cout is equivalent to an FA with inverted inputs and inverted outputs]
!S(A, B, Cin) = S(!A, !B, !Cin)
!Cout(A, B, Cin) = Cout(!A, !B, !Cin)

Exploiting the Inversion Property
[Figure: 4-bit ripple adder built from FA’ cells with inputs A3..A0, B3..B0, C0 = Cin and outputs S3..S0, Cout = C4; inverted cells alternate with regular cells]
- Minimizes the critical path (the carry chain) by eliminating inverters between the FAs (will need to increase the transistor sizing on the carry chain portion of the mirror adder).
- Now need two “flavors” of FAs

Mirror Adder
24+4 transistors
[Figure: mirror adder schematic annotated with relative transistor sizes (values from 2 to 8)]
Cout = A&B | B&Cin | A&Cin
SUM = A&B&Cin | !Cout&(A | B | Cin)
Sizing: Since !Cout drives 2 internal and 2 inverter transistor gates (to form Cin for the next-most-significant-bit adder), the carry circuit should be oversized. PMOS/NMOS ratio of 2.
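The inversion property, and the way it removes inverters from the carry chain, can be checked with a small behavioral sketch. It assumes that FA’ denotes the mirror-adder core without its output inverters (so it produces !S and !Cout), and that even bit positions use regular cells while odd positions use inverted cells; the slide does not say which positions are which, so that assignment is an assumption. The function names are hypothetical.

```python
from itertools import product


def fa_prime(a, b, cin):
    """Inverter-less mirror-adder cell FA': returns (!S, !Cout)."""
    ncout = 1 - ((a & b) | (b & cin) | (a & cin))
    ns = 1 - ((a & b & cin) | (ncout & (a | b | cin)))
    return ns, ncout


def ripple_add_alternating(a_bits, b_bits, cin):
    """Ripple adder (bit lists are LSB first) with no inverters on the carry chain.

    Assumption: even positions are regular cells, odd positions are inverted cells.
    """
    carry = cin  # true polarity going into bit 0
    sums = []
    for i, (a, b) in enumerate(zip(a_bits, b_bits)):
        if i % 2 == 0:
            # Regular cell: true inputs in, (!S, !Cout) out.
            ns, carry = fa_prime(a, b, carry)
            sums.append(1 - ns)  # the sum inverter sits off the carry chain
        else:
            # Inverted cell: (!A, !B, !Cin) in, so the outputs are the true S and Cout.
            s, carry = fa_prime(1 - a, 1 - b, carry)
            sums.append(s)
    # After an odd number of cells the last carry is still inverted.
    cout = carry if len(sums) % 2 == 0 else 1 - carry
    return sums, cout


# Inversion property for a single cell: inverting all inputs inverts all outputs.
for a, b, cin in product((0, 1), repeat=3):
    ns, ncout = fa_prime(a, b, cin)
    s_inv, cout_inv = (1 - x for x in fa_prime(1 - a, 1 - b, 1 - cin))
    assert (ns, ncout) == (s_inv, cout_inv)
```

The carry alternates polarity from cell to cell, and only the A, B inputs and sum outputs (which are off the critical path) need extra inverters.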
Mirror Adder Features
- The NMOS and PMOS chains are completely symmetrical, with a maximum of two series transistors in the carry circuitry, guaranteeing identical rise and fall transitions if the NMOS and PMOS devices are properly sized.
- When laying out the cell, the most critical issue is the minimization of the capacitances at node !Cout (four diffusion capacitances, two internal gate capacitances, and two inverter gate capacitances). Shared diffusions can reduce the stack node capacitances.
- The transistors connected to Cin are placed closest to the output.
- Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimal size.

A 64-bit Adder/Subtractor
- Ripple Carry Adder (RCA) built out of 64 FAs
- Subtraction: complement all subtrahend bits (XOR gates) and set the low-order carry-in
- RCA
  - advantage: simple logic, so small (low cost)
  - disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
[Figure: chain of 64 1-bit FAs; bit i takes Ai, Bi and carry Ci and produces Si and C(i+1); an add/subt control line is shown, with C0 = Cin and C64 = Cout]
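The adder/subtractor described above can be sketched bit by bit: an XOR gate per bit conditionally complements the subtrahend, and the same control signal is fed in as the low-order carry. This is a behavioral model only; the name `ripple_add_sub` and its parameters are made up for this illustration, and the width defaults to 64 to match the slide.

```python
def ripple_add_sub(a, b, subtract, n=64):
    """n-bit ripple-carry add (subtract=0) or subtract (subtract=1).

    Returns (result, carry_out), with result reduced to n bits.
    """
    carry = subtract  # low-order carry-in is set to 1 for subtraction
    result = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ subtract  # XOR gate complements each subtrahend bit
        p = ai ^ bi
        result |= (p ^ carry) << i        # S = P xor Cin
        carry = (ai & bi) | (p & carry)   # Cout = G | P&Cin ripples to the next bit
    return result, carry


print(ripple_add_sub(5, 3, 0))       # addition: 5 + 3
print(ripple_add_sub(5, 3, 1))       # subtraction: 5 - 3
print(ripple_add_sub(3, 5, 1, n=8))  # 3 - 5 wraps around in two's complement
```

The loop makes the O(N) behavior visible: each iteration needs the carry produced by the previous one.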
Ripple Carry Adder (RCA)
[Figure: 4-bit ripple carry adder: four FAs with inputs A3..A0, B3..B0, C0 = Cin and outputs S3..S0, Cout = C4]
T_adder ≈ T_FA(A,B→Cout) + (N-2) T_FA(Cin→Cout) + T_FA(Cin→S)
T = O(N) worst-case delay
Real goal: make the fastest possible carry path

Fast Carry Chain Design
The key to fast addition is a low-latency carry network
- What matters is whether in a given position a carry is
  - generated: G_i = A_i & B_i = A_i B_i
  - propagated: P_i = A_i ⊕ B_i (sometimes use A_i | B_i)
  - annihilated (killed): K_i = !A_i & !B_i
- Giving a carry recurrence of C_{i+1} = G_i | P_i&C_i
  C1 = G0 | P0&C0
  C2 = G1 | P1&G0 | P1&P0&C0
  C3 = G2 | P2&G1 | P2&P1&G0 | P2&P1&P0&C0
  C4 = G3 | P3&G2 | P3&P2&G1 | P3&P2&P1&G0 | P3&P2&P1&P0&C0

Manchester Carry Chain
Switches controlled by G_i and P_i
[Figure: one Manchester carry chain stage between nodes !C_i and !C_{i+1}, with switches controlled by G_i, P_i, and clk]
Total delay of
- time to form the switch control signals G_i and P_i
- setup time for the switches
- signal propagation delay through N switches in the worst case

4-bit Sliced MCC Adder
[Figure: four bit slices with inputs A3..A0, B3..B0; each slice forms G (AND) and P (XOR); a clocked carry chain runs from !C0 to !C4; XOR gates produce S3..S0]

Domino Manchester Carry Chain Circuit
[Figure: domino Manchester carry chain with pass transistors P0..P3, pull-downs G0..G3, carry-in Ci,0, carry-out Ci,4, clocked by clk]
Node values along the chain:
!(G0 | P0&Ci,0)
!(G1 | P1&G0 | P1&P0&Ci,0)
!(G2 | P2&G1 | P2&P1&G0 | P2&P1&P0&Ci,0)
!(G3 | P3&G2 | P3&P2&G1 | P3&P2&P1&G0 | P3&P2&P1&P0&Ci,0)

Domino Manchester Carry Chain Circuit
Progressive Sizing
[Figure: the same circuit (P0..P3, G0..G3, Ci,0, Ci,4, clk) annotated with transistor sizes ranging from 1 to 6]

Carry-Bypass (Carry-Skip) Adder
[Figure: four FAs with inputs A3..A0, B3..B0, carry-in Ci,0, outputs S3..S0, and carry-out Co,3, with a bypass path selected by BP]
BP = P0 P1 P2 P3 “Block Propagate”
If (P0 & P1 & P2 & P3 = 1) then Co,3 = Ci,0; otherwise the block itself kills or generates the carry internally

Carry Skip Chain Implementation
[Figure: carry chain with P0..P3, G0..G3, Cin, and !Cout; BP selects between the block carry-in and the block carry-out to form the carry-out]

4-bit Block Carry-Skip Adder
[Figure: 16-bit adder in four 4-bit blocks (bits 0 to 3, 4 to 7, 8 to 11, 12 to 15); each block has Setup, Carry Propagation, and Sum stages; carry-in Ci,0]
Worst-case delay → carry from bit 0 to bit 15 = carry generated in bit 0, ripples through bits 1, 2, and 3, skips the middle two groups (B is the group size in bits), ripples in the last group from bit 12 to bit 15
T_add = t_setup + B t_carry + ((N/B) - 1) t_skip + B t_carry + t_sum

Carry-Skip Adder Extensions
- Variable block sizes
  - A carry that is generated in, or absorbed by, one of the inner blocks travels a shorter distance through the skip blocks, so the inner blocks can be made bigger without increasing the overall delay
  [Figure: carry-skip adder with variable block sizes between Cin and Cout]
- Multiple levels of skip logic
  [Figure: carry-skip adder between Cin and Cout with skip level 1 and skip level 2; the second level is the AND of the first-level skip signals (BPs)]

Carry-Skip Adder Comparisons
[Figure: chart comparing RCA, CSkA, and VSkA for 8, 16, 32, 48, and 64 bits; vertical axis from 0 to 70]
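The worst-case delay expression T_add for the fixed-block carry-skip adder can be evaluated numerically. The sketch below implements that expression exactly as written in these notes; the unit delays are hypothetical values chosen only for illustration, not measurements, and the function name `carry_skip_delay` is made up.

```python
def carry_skip_delay(n, b, t_setup, t_carry, t_skip, t_sum):
    """Worst-case delay of an n-bit carry-skip adder with fixed b-bit blocks.

    T_add = t_setup + B*t_carry + (N/B - 1)*t_skip + B*t_carry + t_sum
    """
    if n % b != 0:
        raise ValueError("block size must divide the adder width")
    return t_setup + b * t_carry + (n // b - 1) * t_skip + b * t_carry + t_sum


# Hypothetical unit delays (assumption): every component delay equals 1.
UNIT = dict(t_setup=1, t_carry=1, t_skip=1, t_sum=1)

# Delay grows with the number of skipped blocks rather than with every bit.
for n in (16, 32, 64):
    print(n, carry_skip_delay(n, 4, **UNIT))

# Effect of the block size for a 64-bit adder under the same assumed delays.
for b in (2, 4, 8, 16):
    print(b, carry_skip_delay(64, b, **UNIT))
```

With a fixed block size the two ripple terms stay constant as N grows, and only the skip term scales with the width, which is the motivation for the variable-block and multi-level extensions.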