Lecture 17: Adders Outline Single-bit Addition Carry-Ripple Adder Carry-Skip Adder Carry-Lookahead Adder Carry-Select Adder Carry-Increment Adder Tree Adder 17: Adders CMOS VLSI Design 4th Ed. 2 Single-Bit Addition A Half Adder S A B B Full Adder S A B C Cout Cout A B S A B Cout Cout MAJ ( A, B, C ) C S A B Cout S A B C Cout S 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 0 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 1 1 17: Adders CMOS VLSI Design 4th Ed. 3 PGK For a full adder, define what happens to carries (in terms of A and B) – Generate: Cout = 1 independent of C • G=A•B – Propagate: Cout = C • P=AB – Kill: Cout = 0 independent of C • K = ~A • ~B 17: Adders CMOS VLSI Design 4th Ed. 4 Full Adder Design I Brute force implementation from eqns S A B C Cout MAJ ( A, B, C ) A A A S C A B B C C MAJ C B C B B B C A C A A A 17: Adders C B S Cout B A B A B C A B C B B Cout B A CMOS VLSI Design 4th Ed. 5 Full Adder Design II Factor S in terms of Cout S = ABC + (A + B + C)(~Cout) Critical path is usually C to Cout in ripple adder MINORITY A B C Cout S S Cout 17: Adders CMOS VLSI Design 4th Ed. 6 Layout Clever layout circumvents usual line of diffusion – Use wide transistors on critical path – Eliminate output inverters 17: Adders CMOS VLSI Design 4th Ed. 7 Full Adder Design III Complementary Pass Transistor Logic (CPL) – Slightly faster, but more area B B B C B C B C B C A S B C B C B C B C Cout A A S A Cout B B 17: Adders CMOS VLSI Design 4th Ed. 8 Full Adder Design IV Dual-rail domino – Very fast, but large and power hungry – Used in very fast multipliers C_h A_h B_h A_h C_l B_h A_l C_h B_h A_h C_l B_l Cout _l A_l B_l B_l S_l 17: Adders Cout _h S_h C_h B_h A_l CMOS VLSI Design 4th Ed. 9 Carry Propagate Adders N-bit adder called CPA – Each sum bit depends on all previous carries – How do we compute all these carries quickly? AN...1 BN...1 Cout Cout + SN...1 17: Adders Cin Cin 00000 1111 +0000 1111 CMOS VLSI Design 4th Ed. Cout 11111 1111 +0000 0000 Cin carries A4...1 B4...1 S4...1 10 Carry-Ripple Adder Simplest design: cascade full adders – Critical path goes from Cin to Cout – Design full adder to have fast carry delay A4 B4 Cout B3 C3 S4 17: Adders A3 A2 B2 C2 S3 A1 B1 Cin C1 S2 CMOS VLSI Design 4th Ed. S1 11 Inversions Critical path passes through majority gate – Built from minority + inverter – Eliminate inverter and use inverting full adder A4 B4 Cout B3 C3 S4 17: Adders A3 A2 B2 C2 S3 A1 B1 Cin C1 S2 S1 CMOS VLSI Design 4th Ed. 12 Generate / Propagate Equations often factored into G and P Generate and propagate for groups spanning i:j Gi: j Gi:k Pi:k Pi: j Pi:k Gk 1: j Pk 1: j Base case Gi:i Gi Ai Bi Pi:i Pi Ai Bi 0 GCP 0:00:0 in G0:0 G0 Cin P0:0 P0 0 Sum: Si Pi Gi 1:0 17: Adders CMOS VLSI Design 4th Ed. 13 PG Logic A4 B4 A3 B3 A2 B2 A1 B1 Cin 1: Bitwise PG logic G4 P4 G3 P3 G2 P2 G1 P1 G0 P0 2: Group PG logic G3:0 G2:0 G1:0 G0:0 C3 C2 C1 C0 3: Sum logic C4 Cout 17: Adders S4 S3 S2 S1 CMOS VLSI Design 4th Ed. 14 Carry-Ripple Revisited Gi:0 Gi Pi Gi 1:0 A4 B4 G4 P4 A3 B3 G3 P3 A2 B2 G2 P2 A1 B1 G1 P1 Cin G0 G3:0 G2:0 G1:0 G0:0 C3 C2 C1 C0 P0 C4 Cout 17: Adders S4 S3 S2 S1 CMOS VLSI Design 4th Ed. 15 Carry-Ripple PG Diagram Bit Position 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 tripple t pg ( N 1)tAO txor Delay 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 16 PG Diagram Notation Black cell i:k Gray cell k-1:j i:k i:j Gi:k Pi:k Gk-1:j Pk-1:j 17: Adders Buffer k-1:j i:j i:j i:j Gi:j Gi:k Pi:k Gk-1:j Pi:j CMOS VLSI Design 4th Ed. Gi:j Gi:j Gi:j Pi:j Pi:j 17 Carry-Skip Adder Carry-ripple is slow through all N stages Carry-skip allows carry to skip over groups of n bits – Decision based on n-bit propagate signal Cout A16:13 B16:13 A12:9 B12:9 A8:5 B8:5 A4:1 P16:13 P12:9 P8:5 P4:1 1 0 C12 + S16:13 17: Adders 1 0 C8 + S12:9 1 0 C4 + S8:5 CMOS VLSI Design 4th Ed. B4:1 1 0 + Cin S4:1 18 Carry-Skip PG Diagram 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 16:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 For k n-bit groups (N = nk) tskip t pg 2 n 1 ( k 1) t AO txor 17: Adders CMOS VLSI Design 4th Ed. 19 Variable Group Size 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 16:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 Delay grows as O(sqrt(N)) 17: Adders CMOS VLSI Design 4th Ed. 20 Carry-Lookahead Adder Carry-lookahead adder computes Gi:0 for many bits in parallel. Uses higher-valency cells with more than two inputs. A16:13 B16:13 Cout G16:13 P16:13 + S16:13 17: Adders C12 A12:9 B12:9 G12:9 P12:9 + S12:9 A8:5 B8:5 C8 A4:1 C4 G8:5 P8:5 B4:1 G4:1 P4:1 + + S8:5 S4:1 CMOS VLSI Design 4th Ed. Cin 21 CLA PG Diagram 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 16:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 22 Higher-Valency Cells i:k k-1:l l-1:m m-1:j i:j Gi:k Pi:k Gk-1:l Pk-1:l Gl-1:m Pl-1:m Gm-1:j Gi:j Pi:j Pm-1:j 17: Adders CMOS VLSI Design 4th Ed. 23 Carry-Select Adder Trick for critical paths dependent on late input X – Precompute two possible outputs for X = 0, 1 – Select proper output when X arrives Carry-select adder precomputes n-bit sums – For both possible carries into n-bit group A16:13 B16:13 0 + Cout 1 + B8:5 C8 1 + CMOS VLSI Design 4th Ed. B4:1 C4 + Cin 0 S12:9 A4:1 0 + 1 0 1 0 1 S16:13 A8:5 0 + C12 1 + 17: Adders A12:9 B12:9 S8:5 S4:1 24 Carry-Increment Adder Factor initial PG and final XOR out of carry-select 15 14 13 12 11 10 13:12 8 7 6 9:8 14:12 15:12 9 4 3 2 1 0 5:4 10:8 11:8 5 6:4 7:4 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 tincrement t pg n 1 (k 1) t AO txor 17: Adders CMOS VLSI Design 4th Ed. 25 Variable Group Size Also buffer noncritical signals 15 14 13 12 11 10 9 12:11 8 7 6 8:7 13:11 4 5:4 9:7 14:11 5 3 2 1 0 3:2 6:4 10:7 15:11 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 15 14 13 12 11 10 9 12:11 7 6 8:7 13:11 14:11 8 9:7 10:7 5 5:4 6:4 4 3 3:2 2 1 0 1:0 3:0 6:0 15:11 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 26 Tree Adder If lookahead is good, lookahead across lookahead! – Recursive lookahead gives O(log N) delay Many variations on tree adders 17: Adders CMOS VLSI Design 4th Ed. 27 Brent-Kung 15 14 13 12 11 10 15:14 13:12 15:12 11:10 9 9:8 11:8 8 7 7:6 6 5 5:4 7:4 15:8 4 3 3:2 2 1 0 1:0 3:0 7:0 11:0 13:0 9:0 5:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 28 Sklansky 15 14 13 12 11 10 15:14 13:12 11:10 15:12 14:12 15:8 14:8 11:8 10:8 13:8 9 9:8 8 7 6 7:6 7:4 5 5:4 6:4 4 3 2 3:2 3:0 1 0 1:0 2:0 12:8 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 29 Kogge-Stone 15:14 14:13 13:12 12:11 11:10 10:9 9:8 8:7 7:6 6:5 5:4 4:3 3:2 2:1 3:0 2:0 15:12 14:11 13:10 12:9 11:8 10:7 9:6 8:5 7:4 6:3 5:2 4:1 13:6 12:5 11:4 10:3 9:2 8:1 7:0 6:0 5:0 4:0 15:8 14:7 1 2 3 4 5 6 7 8 9 15 14 13 12 11 10 0 1:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 30 Tree Adder Taxonomy Ideal N-bit tree adder would have – L = log N logic levels – Fanout never exceeding 2 – No more than one wiring track between levels Describe adder with 3-D taxonomy (l, f, t) – Logic levels: L+l – Fanout: 2f + 1 – Wiring tracks: 2t Known tree adders sit on plane defined by l + f + t = L-1 17: Adders CMOS VLSI Design 4th Ed. 31 Tree Adder Taxonomy l (Logic Levels) 3 (7) Brent-Kung f (Fanout) 2 (6) Sklansky 3 (9) 1 (5) 2 (5) 1 (3) 0 (2) 0 (4) 0 (1) 1 (2) 2 (4) Kogge-Stone 3 (8) t (Wire Tracks) 17: Adders CMOS VLSI Design 4th Ed. 32 Han-Carlson 15 14 13 12 11 10 9 8 7 6 5 4 3 15:14 13:12 11:10 9:8 7:6 5:4 3:2 15:12 13:10 11:8 9:6 7:4 5:2 3:0 15:8 13:6 11:4 9:2 7:0 5:0 2 1 0 1:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 33 Knowles [2, 1, 1, 1] 15 14 13 12 11 10 9 8 7 6 5 4 3 2 15:14 14:13 13:12 12:11 11:10 10:9 9:8 8:7 7:6 6:5 5:4 4:3 3:2 2:1 15:12 14:11 13:10 3:0 2:0 15:8 14:7 13:6 12:9 11:8 10:7 9:6 8:5 7:4 6:3 5:2 4:1 12:5 11:4 10:3 9:2 8:1 7:0 6:0 5:0 4:0 1 0 1:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 34 Ladner-Fischer 15 14 13 12 11 10 15:14 13:12 15:12 11:10 9 9:8 11:8 15:8 13:8 15:8 13:0 8 7 7:6 5 5:4 7:4 7:0 11:0 6 4 3 3:2 2 1 0 1:0 3:0 5:0 9:0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders CMOS VLSI Design 4th Ed. 35 Taxonomy Revisited (f)Ladner-Fischer (b) Sklansky 15 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 14 15:14 15:14 13:12 11:10 9:8 7:6 5:4 3:2 13 12 15:8 14:8 11:8 10:8 13:8 7:4 6:4 3:0 BrentKung LadnerFischer 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 LadnerFischer f (Fanout) 13:12 11:10 4 3 2 15:14 14:13 13:12 12:11 11:10 10:9 9:8 8:7 7:6 6:5 5:4 4:3 3:2 2:1 15:12 14:11 13:10 9:6 8:5 7:4 6:3 5:2 4:1 3:0 2:0 12:9 11:8 10:7 1 15:8 13:8 15:8 13:0 15:14 0 (2) 0 1:0 0 (4) 0 (1) 13:12 11:10 15:12 14:7 13:6 12:5 11:4 10:3 9:2 8:1 7:0 6:0 5:0 4 3 2 7:6 5:4 3:2 7:0 Knowles [4,2,1,1] 9:0 8:0 7:0 6:0 5:0 4:0 7 6 5 4 3 2 3:0 2:0 9 8 1 0 HanCarlson 9:8 7:6 5:4 3:2 7:4 15:8 1:0 3:0 7:0 11:0 9:0 5:0 2 (4) (d) Han-Carlson 15 14 13 (c) Kogge-Stone 15 14 13 12 11 10 13:6 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 HanCarlson Knowles [2,1,1,1] 14:7 9 8 7 6 5 4 3 2 9:8 8:7 7:6 6:5 5:4 4:3 3:2 2:1 12:9 11:8 10:7 9:6 8:5 7:4 6:3 5:2 4:1 3:0 2:0 12:5 11:4 10:3 9:2 8:1 7:0 6:0 5:0 4:0 1 12 11 10 9 8 7 6 5 4 3 2 1 0 0 1:0 Kogge3 (8) Stone 15:14 13:12 11:10 9:8 7:6 5:4 3:2 15:12 13:10 11:8 9:6 7:4 5:2 3:0 15:8 13:6 11:4 9:2 7:0 5:0 1:0 t (Wire Tracks) 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 17: Adders 0:0 5:0 New (1,1,1) 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 15:8 1:0 1:0 1 (2) 15:12 14:11 13:10 0 3:0 4:0 15:14 14:13 13:12 12:11 11:10 10:9 1 9:0 11:8 13:0 15:8 5 7:4 11:0 15 14 13 12 11 10 1 (3) 5 9:8 11:8 15:0 14:0 13:0 12:0 11:0 10:0 2 (5) 6 6 1 (5) (e) Knowles [2,1,1,1] 7 7 (a) Brent-Kung 2 (6) 3 (9) 8 8 3 (7) Sklansky 9 9 l (Logic Levels) 2:0 12:8 15 14 13 12 11 10 10 1:0 15:12 15:12 14:12 11 0 15:0 14:0 13:0 12:0 11:0 10:0 9:0 8:0 7:0 6:0 5:0 4:0 3:0 2:0 1:0 0:0 CMOS VLSI Design 4th Ed. 36 Summary Adder architectures offer area / power / delay tradeoffs. Choose the best one for your application. Architecture Classification Logic Levels Max Tracks Fanout Cells Carry-Ripple N-1 1 1 N Carry-Skip n=4 N/4 + 5 2 1 1.25N Carry-Inc. n=4 N/4 + 2 4 1 2N Brent-Kung (L-1, 0, 0) 2log2N – 1 2 1 2N Sklansky (0, L-1, 0) log2N N/2 + 1 1 0.5 Nlog2N Kogge-Stone (0, 0, L-1) log2N 2 N/2 Nlog2N 17: Adders CMOS VLSI Design 4th Ed. 37