Introduction to CMOS VLSI Design
Lecture 11: Adders
David Harris
Harvey Mudd College
Spring 2004

Outline
❑ Single-bit Addition
❑ Carry-Ripple Adder
❑ Carry-Skip Adder
❑ Carry-Lookahead Adder
❑ Carry-Select Adder
❑ Carry-Increment Adder
❑ Prefix Adder

11: Adders    CMOS VLSI Design    Slide 2

Single-Bit Addition
Half Adder
    $S = A \oplus B$
    $C_{out} = A \cdot B$

    A  B  |  Cout  S
    0  0  |   0    0
    0  1  |   0    1
    1  0  |   0    1
    1  1  |   1    0

Full Adder
    $S = A \oplus B \oplus C$
    $C_{out} = \mathrm{MAJ}(A, B, C)$

    A  B  C  |  Cout  S
    0  0  0  |   0    0
    0  0  1  |   0    1
    0  1  0  |   0    1
    0  1  1  |   1    0
    1  0  0  |   0    1
    1  0  1  |   1    0
    1  1  0  |   1    0
    1  1  1  |   1    1

PGK
❑ For a full adder, define what happens to carries
  – Generate: Cout = 1 independent of C
    • $G = A \cdot B$
  – Propagate: Cout = C
    • $P = A \oplus B$
  – Kill: Cout = 0 independent of C
    • $K = \overline{A} \cdot \overline{B}$

Full Adder Design I
❑ Brute force implementation from eqns
    $S = A \oplus B \oplus C$
    $C_{out} = \mathrm{MAJ}(A, B, C)$
[Figure: transistor-level schematics of the sum gate and the majority (MAJ) gate; inputs A, B, C; outputs S and Cout]

Full Adder Design II
❑ Factor S in terms of Cout
    $S = ABC + (A + B + C)\,\overline{C_{out}}$
❑ Critical path is usually C to Cout in ripple adder
[Figure: full adder schematic built around a MINORITY gate; inputs A, B, C; outputs S and Cout]

Layout
❑ Clever layout circumvents usual line of diffusion
  – Use wide transistors on critical path
  – Eliminate output inverters

Full Adder Design III
❑ Complementary Pass Transistor Logic (CPL)
  – Slightly faster, but more area
[Figure: CPL full adder schematic; inputs A, B, C and their complements; outputs S and Cout and their complements]

Full Adder Design IV
❑ Dual-rail domino
  – Very fast, but large and power hungry
  – Used in very fast multipliers
[Figure: dual-rail domino full adder; clock φ; inputs A_h/A_l, B_h/B_l, C_h/C_l; outputs S_h/S_l and Cout_h/Cout_l]

Carry Propagate Adders
❑ N-bit adder called CPA
  – Each sum bit depends on all previous carries
  – How do we compute all these carries quickly?
[Figure: CPA block symbol with inputs A_{N...1}, B_{N...1}, Cin and outputs S_{N...1}, Cout]

    Cin = 0:                Cin = 1:
      00000   carries         11111   carries
       1111   A4...1           1111   A4...1
      +0000   B4...1          +0000   B4...1
       1111   S4...1           0000   S4...1

Carry-Ripple Adder
❑ Simplest design: cascade full adders
  – Critical path goes from Cin to Cout
  – Design full adder to have fast carry delay
[Figure: 4-bit carry-ripple adder; four full adders with inputs A4..A1, B4..B1, Cin; internal carries C3, C2, C1; outputs S4..S1, Cout]

Inversions
❑ Critical path passes through majority gate
  – Built from minority + inverter
  – Eliminate inverter and use inverting full adder
[Figure: 4-bit carry-ripple adder built from inverting full adders; same signals as the previous figure]

Generate / Propagate
❑ Equations often factored into G and P
❑ Generate and propagate for groups spanning i:j
    $G_{i:j} = G_{i:k} + P_{i:k} \cdot G_{k-1:j}$
    $P_{i:j} = P_{i:k} \cdot P_{k-1:j}$
❑ Base case
    $G_{i:i} \equiv G_i = A_i \cdot B_i$
    $P_{i:i} \equiv P_i = A_i \oplus B_i$
    $G_{0:0} \equiv G_0 = C_{in}$
    $P_{0:0} \equiv P_0 = 0$
❑ Sum: $S_i = P_i \oplus G_{i-1:0}$

PG Logic
[Figure: 4-bit adder in three stages]
  1: Bitwise PG logic (inputs A4..A1, B4..B1, Cin; outputs G4..G0, P4..P0)
  2: Group PG logic (outputs G3:0, G2:0, G1:0, G0:0, that is, carries C3, C2, C1, C0, plus C4 = Cout)
  3: Sum logic (outputs S4..S1)

Carry-Ripple Revisited
    $G_{i:0} = G_i + P_i \cdot G_{i-1:0}$
[Figure: 4-bit carry-ripple adder redrawn with PG logic; bitwise G4..G0, P4..P0; group signals G3:0, G2:0, G1:0, G0:0 (C3..C0); outputs S4..S1, C4 = Cout]

Carry-Ripple PG Diagram
[Figure: carry-ripple PG diagram. Horizontal axis: bit position 15 to 0; vertical axis: delay; outputs 15:0, 14:0, ..., 1:0, 0:0]
    $t_{ripple} = t_{pg} + (N - 1)\,t_{AO} + t_{xor}$

PG Diagram Notation
  Black cell: inputs i:k and k-1:j, output i:j
    $G_{i:j} = G_{i:k} + P_{i:k} \cdot G_{k-1:j}$
    $P_{i:j} = P_{i:k} \cdot P_{k-1:j}$
  Gray cell: inputs i:k and k-1:j, output i:j (generate only)
    $G_{i:j} = G_{i:k} + P_{i:k} \cdot G_{k-1:j}$
  Buffer: input i:j, output i:j
    passes $G_{i:j}$ and $P_{i:j}$ through unchanged

Carry-Skip Adder
❑ Carry-ripple is slow through all N stages
❑ Carry-skip allows carry to skip over groups of n bits
  – Decision based on n-bit propagate signal
[Figure: 16-bit carry-skip adder in four 4-bit groups (16:13, 12:9, 8:5, 4:1); each group's propagate signal (P16:13, P12:9, P8:5, P4:1) controls a multiplexer on the group's carry out (Cout, C12, C8, C4)]

Carry-Skip PG Diagram
[Figure: carry-skip PG diagram, bit positions 16 to 0, outputs 16:0 through 0:0]
For k n-bit groups (N = nk)
    $t_{skip} = t_{pg} + [2(n - 1) + (k - 1)]\,t_{AO} + t_{xor}$

Variable Group Size
[Figure: carry-skip PG diagram with variable group sizes, bit positions 16 to 0, outputs 16:0 through 0:0]
Delay grows as $O(\sqrt{N})$

Carry-Lookahead Adder
❑ Carry-lookahead adder computes $G_{i:0}$ for many bits in parallel.
❑ Uses higher-valency cells with more than two inputs.
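The group generate/propagate recurrences and the sum equation from the Generate / Propagate slide can be checked with a small behavioral model. The following is a minimal Python sketch, not a gate-level design: the function names (`bitwise_pg`, `combine`, `cla_add`) and the default 4-bit group size are assumptions made for illustration. It forms each group's G and P independently of the incoming carry, as a lookahead group does, and then applies the group carry-in.

```python
def bitwise_pg(a, b, cin, n):
    """Bitwise PG logic. Index 0 holds the carry-in (G0 = Cin, P0 = 0);
    indices 1..n hold G_i = A_i & B_i and P_i = A_i ^ B_i."""
    g, p = [cin & 1], [0]
    for i in range(1, n + 1):
        ai = (a >> (i - 1)) & 1
        bi = (b >> (i - 1)) & 1
        g.append(ai & bi)
        p.append(ai ^ bi)
    return g, p


def combine(hi, lo):
    """Black cell: merge group i:k (hi) with group k-1:j (lo) into i:j."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def cla_add(a, b, cin=0, n=16, group=4):
    """Add two n-bit numbers; returns (sum, carry_out)."""
    g, p = bitwise_pg(a, b, cin, n)
    carry = [g[0]] + [0] * n              # carry[i] holds G_{i:0}
    for base in range(1, n + 1, group):   # groups 4:1, 8:5, 12:9, ...
        c_in = carry[base - 1]            # carry into this group
        acc = None                        # (G, P) of group i:base
        for i in range(base, min(base + group, n + 1)):
            acc = (g[i], p[i]) if acc is None else combine((g[i], p[i]), acc)
            # G_{i:0} = G_{i:base} + P_{i:base} * G_{base-1:0}
            carry[i] = acc[0] | (acc[1] & c_in)
    s = 0
    for i in range(1, n + 1):             # sum logic: S_i = P_i ^ G_{i-1:0}
        s |= (p[i] ^ carry[i - 1]) << (i - 1)
    return s, carry[n]
```

For example, `cla_add(0xFFFF, 1)` returns sum 0 with carry out 1, since the carry propagates through all sixteen bits.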
[Figure: 16-bit carry-lookahead adder in four 4-bit groups (16:13, 12:9, 8:5, 4:1); each group produces group generate and propagate signals (G16:13/P16:13, G12:9/P12:9, G8:5/P8:5, G4:1/P4:1); carries C4, C8, C12 and Cout feed the group adders that produce S4:1 through S16:13]

CLA PG Diagram
[Figure: carry-lookahead PG diagram, bit positions 16 to 0, outputs 16:0 through 0:0]

Higher-Valency Cells
[Figure: valency-4 cell combining groups i:k, k-1:l, l-1:m and m-1:j into i:j; inputs G and P of each of the four groups; outputs $G_{i:j}$ and $P_{i:j}$]

Carry-Select Adder
❑ Trick for critical paths dependent on late input X
  – Precompute two possible outputs for X = 0, 1
  – Select proper output when X arrives
❑ Carry-select adder precomputes n-bit sums
  – For both possible carries into n-bit group
[Figure: 16-bit carry-select adder built from 4-bit groups; adders precompute sums for carry-in 0 and carry-in 1, and multiplexers driven by C4, C8, C12 select S8:5, S12:9, S16:13 and Cout]

Carry-Increment Adder
❑ Factor initial PG and final XOR out of carry-select
[Figure: carry-increment PG diagram, bit positions 15 to 0, with 4-bit groups (intermediate signals 5:4, 6:4, 7:4; 9:8, 10:8, 11:8; 13:12, 14:12, 15:12) and outputs 15:0 through 0:0]
    $t_{increment} = t_{pg} + [(n - 1) + (k - 1)]\,t_{AO} + t_{xor}$

Variable Group Size
❑ Also buffer noncritical signals
[Figure: two carry-increment PG diagrams with variable group sizes, bit positions 15 to 0, outputs 15:0 through 0:0]

Tree Adder
❑ If lookahead is good, lookahead across lookahead!
  – Recursive lookahead gives O(log N) delay
❑ Many variations on tree adders

Brent-Kung
[Figure: Brent-Kung PG diagram, bit positions 15 to 0, outputs 15:0 through 0:0]

Sklansky
[Figure: Sklansky PG diagram, bit positions 15 to 0, outputs 15:0 through 0:0]

Kogge-Stone
[Figure: Kogge-Stone PG diagram, bit positions 15 to 0, outputs 15:0 through 0:0]

Tree Adder Taxonomy
❑ Ideal N-bit tree adder would have
  – $L = \log_2 N$ logic levels
  – Fanout never exceeding 2
  – No more than one wiring track between levels
❑ Describe adder with 3-D taxonomy (l, f, t)
  – Logic levels: $L + l$
  – Fanout: $2^f + 1$
  – Wiring tracks: $2^t$
❑ Known tree adders sit on plane defined by $l + f + t = L - 1$

Tree Adder Taxonomy
[Figure: taxonomy plane plotted on three axes for N = 16]
  l (Logic Levels): 0 (4), 1 (5), 2 (6), 3 (7); Brent-Kung at l = 3
  f (Fanout): 0 (2), 1 (3), 2 (5), 3 (9); Sklansky at f = 3
  t (Wire Tracks): 0 (1), 1 (2), 2 (4), 3 (8); Kogge-Stone at t = 3

Han-Carlson
[Figure: Han-Carlson PG diagram, bit positions 15 to 0, outputs 15:0 through 0:0]

Knowles [2, 1, 1, 1]
[Figure: Knowles [2, 1, 1, 1] PG diagram, bit positions 15 to 0, outputs 15:0 through 0:0]
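The tree adders in this part of the lecture all compute the same prefixes $G_{i:0}$ and differ in how the cells are arranged. As a rough behavioral sketch of the Kogge-Stone arrangement, the Python below doubles the span combined at each level. The function name and the returned level count are assumptions for illustration, and the model ignores fanout and wiring. Column 0 carries Cin, so an n-bit addition uses n + 1 columns; the 16-column diagrams in the slides correspond to n = 15 here.

```python
def kogge_stone_add(a, b, cin=0, n=16):
    """Add two n-bit numbers with a Kogge-Stone style prefix network.
    Returns (sum, carry_out, prefix_levels)."""
    # Column 0 is the carry-in (G0 = Cin, P0 = 0); columns 1..n are operand bits.
    g = [cin & 1] + [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [0] + [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    bit_p = p[:]                  # bitwise propagate, needed for the sum XORs

    span, levels = 1, 0
    while span <= n:              # continue until every column reaches column 0
        new_g, new_p = g[:], p[:]
        for i in range(span, n + 1):
            # Black cell: combine this column's group with the group
            # ending `span` columns to the right.
            new_g[i] = g[i] | (p[i] & g[i - span])
            new_p[i] = p[i] & p[i - span]
        g, p = new_g, new_p
        span *= 2
        levels += 1

    # Now g[i] == G_{i:0}. Sum logic: S_i = P_i ^ G_{i-1:0}.
    s = 0
    for i in range(1, n + 1):
        s |= (bit_p[i] ^ g[i - 1]) << (i - 1)
    return s, g[n], levels
```

With n = 15 the loop runs for spans 1, 2, 4 and 8, giving four prefix levels, which matches $\log_2 16$ for the 16-column diagrams.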
Ladner-Fischer
[Figure: Ladner-Fischer PG diagram, bit positions 15 to 0, outputs 15:0 through 0:0]

Taxonomy Revisited
[Figure: taxonomy plane with axes l (Logic Levels), f (Fanout), t (Wire Tracks), surrounded by six PG diagrams: (a) Brent-Kung, (b) Sklansky, (c) Kogge-Stone, (d) Han-Carlson, (e) Knowles [2,1,1,1], (f) Ladner-Fischer. Points labeled on the plane: Brent-Kung, Sklansky, Kogge-Stone, Ladner-Fischer, Han-Carlson, Knowles [2,1,1,1], Knowles [4,2,1,1], New (1,1,1)]

Summary
Adder architectures offer area / power / delay tradeoffs.
Choose the best one for your application.

Architecture      Classification  Logic Levels    Max Fanout  Tracks  Cells
Carry-Ripple                      N - 1           1           1       N
Carry-Skip n=4                    N/4 + 5         2           1       1.25N
Carry-Inc. n=4                    N/4 + 2         4           1       2N
Brent-Kung        (L-1, 0, 0)     2 log2 N - 1    2           1       2N
Sklansky          (0, L-1, 0)     log2 N          N/2 + 1     1       0.5 N log2 N
Kogge-Stone       (0, 0, L-1)     log2 N          2           N/2     N log2 N
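To get a feel for the tradeoffs in the summary table, its expressions can be evaluated for a concrete width. The sketch below simply transcribes the table and assumes N is a power of two and at least 4; the function name `adder_costs` and the dictionary keys are illustrative choices, not part of the lecture.

```python
def adder_costs(n):
    """Evaluate the summary-table expressions for an n-bit adder.
    Assumes n is a power of two and n >= 4."""
    log2n = n.bit_length() - 1        # exact log2 for powers of two
    return {
        "Carry-Ripple":     {"levels": n - 1,         "fanout": 1,          "tracks": 1,      "cells": n},
        "Carry-Skip n=4":   {"levels": n // 4 + 5,    "fanout": 2,          "tracks": 1,      "cells": 1.25 * n},
        "Carry-Inc. n=4":   {"levels": n // 4 + 2,    "fanout": 4,          "tracks": 1,      "cells": 2 * n},
        "Brent-Kung":       {"levels": 2 * log2n - 1, "fanout": 2,          "tracks": 1,      "cells": 2 * n},
        "Sklansky":         {"levels": log2n,         "fanout": n // 2 + 1, "tracks": 1,      "cells": 0.5 * n * log2n},
        "Kogge-Stone":      {"levels": log2n,         "fanout": 2,          "tracks": n // 2, "cells": n * log2n},
    }


if __name__ == "__main__":
    for name, cost in adder_costs(16).items():
        print(f"{name:16s} {cost}")
```

For N = 16 this gives 7 logic levels for Brent-Kung, a maximum fanout of 9 for Sklansky and 8 wiring tracks for Kogge-Stone, which agree with the values in parentheses at the corners of the taxonomy figure.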