Carry-Lookahead Addition
CS 352 : Computer Organization and Design
University of Wisconsin-Eau Claire
Dan Ernst
• Ripple-Carry Adder
– Current design uses a “ripple-carry” adder technique
• Cout of each full adder propagates into the Cin of the next adder
– What is the associated electrical delay for this scheme?
• Assume each gate (AND/OR only) has a delay of T units
– Two-level logic implementation of a single full adder (FA):
• Delay of 2T to compute Cout
[Figure: two-level AND/OR circuit computing Cout from A, B, and Cin]
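The ripple-carry scheme can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the course: the names `full_adder`, `ripple_add`, `to_bits`, and `from_bits`, the LSB-first bit lists, and the two operands are all assumptions made for the example. Cout is written in the usual two-level AND/OR form, A·B + A·Cin + B·Cin.

```python
def full_adder(a, b, cin):
    """One full adder on single bits (each 0 or 1)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)  # two-level AND/OR logic
    return s, cout

def ripple_add(a_bits, b_bits, cin=0):
    """Add two equal-length LSB-first bit lists; returns (sum_bits, cout)."""
    carry = cin
    out = []
    for a, b in zip(a_bits, b_bits):
        # Each stage must wait for the carry of the stage before it.
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

def to_bits(x, n):
    """Integer -> list of n bits, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """LSB-first bit list -> integer."""
    return sum(b << i for i, b in enumerate(bits))

sum_bits, cout = ripple_add(to_bits(0x63D5, 16), to_bits(0xED83, 16))
print(hex(from_bits(sum_bits)), cout)  # 0x5158 1
```

The loop makes the serial dependency visible: stage i cannot finish until stage i-1 has produced its carry.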
• Carry-Lookahead Adder
– A 16-bit ripple-carry adder needs 15 × 2T + T = 31T total delay to compute the sum: 15 carries at 2T each, then 1T for the last sum bit
• Grows linearly with the size of the adder
– Is there a faster way to add? Yes.
– Faster design uses a “carry-lookahead” adder technique
– Real ALUs use this style
– Idea is to compute the needed carry-in to a bit position with only a very small delay (smaller than in the ripple-carry case)
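The linear growth can be made concrete with a small helper. This is a sketch under the slide's delay model (2T per carry, 1T for the final sum bit); the function name `ripple_delay` is hypothetical, and applying the same formula to widths other than 16 is an extrapolation, not something the slides state.

```python
def ripple_delay(n_bits, carry_delay=2, sum_delay=1):
    """Worst-case delay, in units of T, of the last sum bit of an
    n-bit ripple-carry adder: (n - 1) carries, then one sum."""
    return (n_bits - 1) * carry_delay + sum_delay

for n in (4, 16, 32, 64):
    print(n, ripple_delay(n))
# 4 7
# 16 31
# 32 63
# 64 127
```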
• Generating a Carry
– Bit position i will “generate” a carry-out from the sum of the bits a_i and b_i if a_i • b_i = 1 (i.e. a_i and b_i are both 1)
Define g_i = a_i • b_i (generate)
– Hence: cout_i = cin_{i+1} = 1 if g_i = 1
– Let c_i = “carry-in to position i”
– Note c_{i+1} = carry-in to position i+1 = carry-out from position i
– Delay to compute each g = 1T
• Propagating a Carry
– Bit position i will “propagate” a carry-in (c_i) to its carry-out if c_i = 1 and a_i + b_i = 1 (i.e. the carry-in is 1 and at least one of a_i or b_i is 1; here + is logical OR)
Define p_i = a_i + b_i (propagate)
– Hence: cout_i = c_{i+1} = g_i + p_i • c_i
– A carry-out occurs from position i if either
• a carry is generated by position i, or
• a carry-in is propagated by position i
– Delay to compute each p = 1T
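The generate and propagate definitions, and the carry rule built from them, can be sketched as follows. The function names `generate_propagate` and `carries` and the LSB-first list layout are assumptions made for this illustration; `&` stands for the slides' • (AND) and `|` for + (OR).

```python
def generate_propagate(a_bits, b_bits):
    """Per-bit generate (a AND b) and propagate (a OR b) signals."""
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a | b for a, b in zip(a_bits, b_bits)]
    return g, p

def carries(g, p, c0=0):
    """Apply c[i+1] = g[i] OR (p[i] AND c[i]) position by position.
    Returns [c0, c1, ..., cn]."""
    c = [c0]
    for gi, pi in zip(g, p):
        c.append(gi | (pi & c[-1]))
    return c

# 0101 + 0011, written least significant bit first
a = [1, 0, 1, 0]
b = [1, 1, 0, 0]
g, p = generate_propagate(a, b)
print(g)              # [1, 0, 0, 0]
print(p)              # [1, 1, 1, 0]
print(carries(g, p))  # [0, 1, 1, 1, 0]
```

Written this way the carries are still computed one after another; the rule itself is what the lookahead equations later expand.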
• Propagate / Generate
– Ex: using 4 bits
c0 = initial carry-in
c1 = g0 + p0 c0
c2 = g1 + p1 c1 = g1 + p1 (g0 + p0 c0)
   = g1 + p1 g0 + p1 p0 c0
c3 = g2 + p2 c2 = g2 + p2 (g1 + p1 g0 + p1 p0 c0)
   = g2 + p2 g1 + p2 p1 g0 + p2 p1 p0 c0
– Delay to compute each c = 2T → fixed delay!
• This assumes that g and p are pre-computed
• They take a total of 1T to pre-compute
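As a sanity check on the expansion, the flattened equations can be compared against the step-by-step rule for every input combination. This is only an illustrative sketch; the function names `lookahead_carries` and `recurrence_carries` are hypothetical.

```python
from itertools import product

def lookahead_carries(g, p, c0):
    """c1..c3 from the fully expanded two-level equations."""
    g0, g1, g2 = g
    p0, p1, p2 = p
    c1 = g0 | (p0 & c0)
    c2 = g1 | (p1 & g0) | (p1 & p0 & c0)
    c3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & c0)
    return c1, c2, c3

def recurrence_carries(g, p, c0):
    """c1..c3 from c[i+1] = g[i] OR (p[i] AND c[i]), one step at a time."""
    out = []
    c = c0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        out.append(c)
    return tuple(out)

# Exhaustive comparison over all values of g0..g2, p0..p2, and c0.
agree = all(
    lookahead_carries(v[0:3], v[3:6], v[6]) == recurrence_carries(v[0:3], v[3:6], v[6])
    for v in product((0, 1), repeat=7)
)
print(agree)  # True
```

Both functions give the same carries; the difference is that every line of `lookahead_carries` depends only on g, p, and c0, so in hardware all three can be evaluated at once.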
• An Abstraction of Propagate / Generate
– These equations require large gate “fan-in” to implement in 2T delay → therefore stop expansion at 4 bits as above
– Delay to compute each c = 2T → fixed delay!
• pre-computed: 1T for each p and g (in parallel!)
• 1T for the AND to create the subgroups (product terms)
• 1T for the OR of all subgroups
– Each sum bit S_i can now be computed in 3T delay:
[Figure: A_i, B_i, and C_i feed the logic that produces S_i; C_i is annotated “2T delay to compute”]
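To see why the expansion is stopped at 4 bits, it helps to count the gate inputs the fully expanded equation for c_i would need. The sketch below builds the product terms symbolically; the helper names `expanded_carry_terms` and `fan_in` are assumptions made for this example.

```python
def expanded_carry_terms(i):
    """Product terms of the fully expanded equation for c_i (i >= 1),
    each term given as a list of signal names."""
    terms = []
    for j in range(i - 1, -1, -1):
        props = [f"p{k}" for k in range(i - 1, j, -1)]
        terms.append(props + [f"g{j}"])
    terms.append([f"p{k}" for k in range(i - 1, -1, -1)] + ["c0"])
    return terms

def fan_in(i):
    """(inputs to the final OR, inputs to the widest AND) for c_i."""
    terms = expanded_carry_terms(i)
    return len(terms), max(len(t) for t in terms)

print(" + ".join(" ".join(t) for t in expanded_carry_terms(3)))
# g2 + p2 g1 + p2 p1 g0 + p2 p1 p0 c0

for i in (1, 2, 3, 4, 16):
    print(i, fan_in(i))
# 1 (2, 2)
# 2 (3, 3)
# 3 (4, 4)
# 4 (5, 5)
# 16 (17, 17)
```

Both counts grow as i + 1, so a flat two-level carry for bit 16 would need 17-input gates.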
• 4-Bit Carry-Lookahead Adder
– Combine these ideas to design a 4-bit adder with 3T delay for the entire 4-bit Sum (assuming p & g are pre-computed)
[Figure: 4-Bit C.L. Adder block with inputs Cin, A0-A3, B0-B3 and outputs S0-S3, P0 (super-propagate), G0 (super-generate)]
– What are P0 and G0?
• P0 = p3 p2 p1 p0
• G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0
Note: the device has no “carry-out”, only P0 and G0
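A software model of this 4-bit block might look like the sketch below. The function name `cla4` and the LSB-first list interface are assumptions for illustration; the carry, P, and G expressions are the ones given on the slides.

```python
def cla4(a, b, cin):
    """4-bit carry-lookahead block.
    a, b: LSB-first lists of four bits. Returns (sum_bits, P, G);
    like the slide's device, it has no carry-out."""
    g = [x & y for x, y in zip(a, b)]
    p = [x | y for x, y in zip(a, b)]
    c0 = cin
    # All carries are written directly in terms of g, p, and c0.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    s = [x ^ y ^ c for x, y, c in zip(a, b, (c0, c1, c2, c3))]
    P = p[3] & p[2] & p[1] & p[0]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return s, P, G

# 0101 + 0011 with no carry-in, bits listed LSB first
print(cla4([1, 0, 1, 0], [1, 1, 0, 0], 0))  # ([0, 0, 0, 1], 0, 0)
```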
• Super-Generate / Super-Propagate
P0 = p3 p2 p1 p0 (super-propagate)
G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0 (super-generate)
– P0 represents the propagate for the entire 4-bit unit
• P0 takes 1T delay units to compute
– G0 represents the generate for the entire 4-bit unit
• G0 takes 2T delay units to compute
– P0 and G0 represent a higher level of hardware abstraction of propagation and generation
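One way to convince yourself that P0 and G0 really stand in for the whole 4-bit unit is to check that G0 + P0 c0 equals the true carry out of the unit for every input. The sketch below does that exhaustively; the names `group_pg` and `check_group_carry` are hypothetical.

```python
def group_pg(g, p):
    """Super-propagate and super-generate for one 4-bit unit.
    g, p: LSB-first lists of the four bit-level signals."""
    P = p[3] & p[2] & p[1] & p[0]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return P, G

def check_group_carry():
    """True if G + P*c0 matches the real carry-out for all 4-bit inputs."""
    for x in range(16):
        for y in range(16):
            for c0 in (0, 1):
                g = [((x & y) >> i) & 1 for i in range(4)]
                p = [((x | y) >> i) & 1 for i in range(4)]
                P, G = group_pg(g, p)
                if (G | (P & c0)) != (x + y + c0) >> 4:
                    return False
    return True

print(check_group_carry())  # True
```

The group-level rule has the same shape as the bit-level rule c_{i+1} = g_i + p_i c_i, which is what lets the lookahead idea be reused one level up.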
• 16-bit C.L. Adder
[Figure: four 4-Bit C.L. Adder blocks, each with inputs A0-A3, B0-B3, and Cin. They produce S0-S3, S4-S7, S8-S11, and S12-S15, and send (P0, G0), (P1, G1), (P2, G2), and (P3, G3) to a shared Carry-Lookahead Logic block, which drives each block's Cin.]
Carry-Lookahead Logic implements:
– Cin(0) = c0 (initial carry-in)
– Cin(1) = G0 + P0 c0
– Cin(2) = G1 + P1 G0 + P1 P0 c0
– Cin(3) = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 c0
Delay for Cin is (2T + 2T) = 4T
Delay for Sum = 4T + 3T (per unit) + 1T (for pre-computation of p & g) = 8T
Compare to 31T for R.C. adder
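The two-level organization of the 16-bit adder can be modeled as below. This is a behavioral sketch, not a gate-level design: the function names `nibble_signals`, `nibble_sum`, and `cla16` and the use of Python integers for the 4-bit slices are assumptions made for the example.

```python
def nibble_signals(a, b):
    """Bit-level g, p (LSB first) and group P, G for one 4-bit block."""
    g = [((a & b) >> i) & 1 for i in range(4)]
    p = [((a | b) >> i) & 1 for i in range(4)]
    P = p[3] & p[2] & p[1] & p[0]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return g, p, P, G

def nibble_sum(a, b, g, p, cin):
    """4-bit sum of one block, using lookahead carries inside the block."""
    c = [cin,
         g[0] | (p[0] & cin),
         g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin),
         g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & cin)]
    return sum((((a >> i) ^ (b >> i) ^ c[i]) & 1) << i for i in range(4))

def cla16(x, y, c0=0):
    """16-bit sum (no carry-out) from four 4-bit blocks plus lookahead logic."""
    a = [(x >> (4 * k)) & 0xF for k in range(4)]
    b = [(y >> (4 * k)) & 0xF for k in range(4)]
    sig = [nibble_signals(a[k], b[k]) for k in range(4)]
    P = [s[2] for s in sig]
    G = [s[3] for s in sig]
    # Block carry-ins, each written directly in terms of P, G, and c0.
    cin = [c0,
           G[0] | (P[0] & c0),
           G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0),
           G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)]
    return sum(nibble_sum(a[k], b[k], sig[k][0], sig[k][1], cin[k]) << (4 * k)
               for k in range(4))

print(hex(cla16(0x63D5, 0xED83)))  # 0x5158
```

No block waits on another block's sum: every Cin(k) is a function of the group signals and c0 only.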
• 16-bit C.L. Adder Example (1)
A: 0110 0011 1101 0101
B: 1110 1101 1000 0011
g: 0110 0001 1000 0001
p: 1110 1111 1101 0111
→ 1T

P0: 0·1·1·1 = 0
P1: 1·1·0·1 = 0
P2: 1·1·1·1 = 1
P3: 1·1·1·0 = 0
→ 1T (2T total)

G0: 0 + 0·0 + 0·1·0 + 0·1·1·1 = 0
G1: 1 + 1·0 + 1·1·0 + 1·1·0·0 = 1
G2: 0 + 1·0 + 1·1·0 + 1·1·1·1 = 1
G3: 0 + 1·1 + 1·1·1 + 1·1·1·0 = 1
→ 2T (3T total)
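The g, p, P, and G values of Example (1) can be reproduced with a short script. This is only a checking aid; the helper names `nibbles`, `bit`, and `group_pg` are assumptions, and A and B are the operands from the example.

```python
A = 0b0110_0011_1101_0101
B = 0b1110_1101_1000_0011

def nibbles(x):
    """Format a 16-bit value as four 4-bit groups, most significant first."""
    return " ".join(f"{(x >> s) & 0xF:04b}" for s in (12, 8, 4, 0))

def bit(x, i):
    """Bit i of x (bit 0 is the least significant)."""
    return (x >> i) & 1

g = A & B  # bitwise generate
p = A | B  # bitwise propagate

def group_pg(k):
    """Super-propagate and super-generate of 4-bit group k (0 = lowest)."""
    g0, g1, g2, g3 = (bit(g, 4 * k + i) for i in range(4))
    p0, p1, p2, p3 = (bit(p, 4 * k + i) for i in range(4))
    P = p3 & p2 & p1 & p0
    G = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0)
    return P, G

P = [group_pg(k)[0] for k in range(4)]
G = [group_pg(k)[1] for k in range(4)]

print("g:", nibbles(g))                                   # g: 0110 0001 1000 0001
print("p:", nibbles(p))                                   # p: 1110 1111 1101 0111
print("P3..P0:", "".join(str(P[k]) for k in (3, 2, 1, 0)))  # P3..P0: 0100
print("G3..G0:", "".join(str(G[k]) for k in (3, 2, 1, 0)))  # G3..G0: 1110
```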
• 16-bit C.L. Adder Example (2)
– Computing the actual sum (red bits only):
A: 0110 0011 1101 0101
B: 1110 1101 1000 0011
g: 0110 0001 1000 0001
p: 1110 1111 1101 0111
P: 0100
G: 1110
S6 = a6 ⊕ b6 ⊕ c6
   = 1 ⊕ 0 ⊕ (g5 + p5 g4 + p5 p4 Cin(1))
   = 1 ⊕ 0 ⊕ (0 + 0·0 + 0·1·(G0 + P0 c0))
   = 1 ⊕ 0 ⊕ (0 + 0·0 + 0·1·(0 + 0·0)) = 1
Delay to compute S6 = 1T + (2T + 2T) + (2T + 1T) = 8T
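The S6 computation can be followed step by step in code and compared with ordinary integer addition. This is a hedged sketch: the helper `bit` and the variable names are assumptions, and c0 is taken to be 0 as in the worked example.

```python
A = 0b0110_0011_1101_0101
B = 0b1110_1101_1000_0011
c0 = 0

def bit(x, i):
    """Bit i of x (bit 0 is the least significant)."""
    return (x >> i) & 1

g = A & B
p = A | B

# Group 0 (bits 3..0): super-propagate and super-generate.
P0 = bit(p, 3) & bit(p, 2) & bit(p, 1) & bit(p, 0)
G0 = (bit(g, 3) | (bit(p, 3) & bit(g, 2)) | (bit(p, 3) & bit(p, 2) & bit(g, 1))
      | (bit(p, 3) & bit(p, 2) & bit(p, 1) & bit(g, 0)))

# Carry into group 1 (bits 7..4), from the carry-lookahead logic.
cin1 = G0 | (P0 & c0)

# Carry into bit 6, from the lookahead equations inside group 1.
c6 = bit(g, 5) | (bit(p, 5) & bit(g, 4)) | (bit(p, 5) & bit(p, 4) & cin1)

s6 = bit(A, 6) ^ bit(B, 6) ^ c6
print(s6)                      # 1
print(bit(A + B + c0, 6))      # 1, the same bit from plain integer addition

# Delay in units of T: p/g pre-computation, Cin(1), then c6 and the sum.
print(1 + (2 + 2) + (2 + 1))   # 8
```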
• Test Yourself
– Compute sum bit S10 (red bits only):
A: 0110 1001 1001 0101
B: 1011 0101 1000 1011
g:
p:
P:
G:
a10 ⊕ b10 ⊕ c10 =