Number System Review

Shift Operations
Source: David Harris
Aug 2007
Shifter Implementation
Regular layout, can be compact; use transmission gates to avoid threshold drop.
Not amenable to synthesis; high capacitive loading for large arrays.
Source: David Harris
Shifter Implementation
Each level shifts by a power of two.
Amenable to synthesis, fast.
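A minimal Python sketch of a level-based (logarithmic) shifter, assuming level k is a row of 2:1 muxes that either passes its input through or shifts it by 2^k, controlled by bit k of the shift amount. The function name `log_shift_left` and the 8-bit default width are illustrative choices, not taken from the slides.

```python
def log_shift_left(value, shamt, width=8):
    """Logical left shift built from log2(width) mux levels (illustrative)."""
    mask = (1 << width) - 1
    value &= mask
    level = 0
    while (1 << level) < width:
        # Level k: bit k of shamt selects "shift by 2**k" or "pass through".
        if (shamt >> level) & 1:
            value = (value << (1 << level)) & mask
        level += 1
    return value


shifted = log_shift_left(0b00010011, 3)  # uses the 1-bit and 2-bit levels
```

An 8-bit shifter needs only three levels (shift by 1, 2, 4), which is why this structure maps well onto synthesized mux logic.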
Multiplication
Source: David Harris
Array Multiplier with CPAs
Array adder with carry-propagate adders (CPAs); multiple near-critical paths.
Source: Jan Rabaey
Array Multiplier with CSAs
Only one critical path.
Source: Jan Rabaey
How do CSAs work?
CSA: Carry Save Adder
Want to add these four numbers together (same problem as adding partial products in a multiplier).
Source: David Harris
How do CSAs work? (cont)
Can use a full adder network to add three numbers together if we view the carry-in inputs as a bus that contains the third number.
The output produces a sum vector and a carry vector, and these have to be added to produce the final result.
Source: David Harris
How do CSAs work? (cont)
The carry vector has to be shifted left by 1 before being added to the sum vector, because the COUT bit has a weight of 2x that of the sum bit.
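The carry-save idea can be sketched in Python by treating each integer as a bus and applying the full-adder equations to every bit position at once. The name `carry_save_add` and the sample operands are illustrative assumptions, not part of the slides.

```python
def carry_save_add(a, b, c):
    """3:2 compression: one full adder per bit position, no carry ripple."""
    sum_vec = a ^ b ^ c                       # per-bit SUM outputs
    carry_vec = (a & b) | (a & c) | (b & c)   # per-bit COUT outputs
    return sum_vec, carry_vec


s, cv = carry_save_add(0b1011, 0b0110, 0b1101)   # 11 + 6 + 13
# COUT has twice the weight of SUM, so shift the carry vector left by 1
# before the final carry-propagate addition.
total = s + (cv << 1)
```

Only the last line involves a carry-propagate addition; the compression step itself has the delay of a single full adder regardless of word width.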
Source: David Harris
CSA Multiplier
Carry is shifted to the left before being added.
This final addition is always N/2 in size if the product has N bits. For large multipliers, need to use a fast adder structure to do this addition.
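A behavioral sketch of a CSA-based unsigned multiplier, assuming each row ANDs the multiplicand with one multiplier bit and feeds a carry-save stage, with a single carry-propagate addition at the end. The function names are illustrative, and the model works on whole integers rather than individual cells, so it does not reproduce the array's exact wiring.

```python
def carry_save_add(a, b, c):
    """Per-bit full adders: returns (sum vector, unshifted carry vector)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def csa_multiply(x, y, n):
    """Unsigned n x n multiply: CSA rows, then one final CPA (illustrative)."""
    mask = (1 << n) - 1
    x &= mask
    y &= mask
    sum_vec, carry = 0, 0          # carry is kept already shifted left by 1
    for i in range(n):
        pp = (y << i) if (x >> i) & 1 else 0   # AND-gate row for bit i of x
        sum_vec, c = carry_save_add(sum_vec, carry, pp)
        carry = c << 1                          # COUT has weight 2
    return sum_vec + carry                      # final carry-propagate add


product = csa_multiply(13, 11, 4)
```

In hardware the low-order product bits fall out of the array directly, which is why the final adder only needs to cover the upper half of the product.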
Source: Jan Rabaey
Multiplier Layout
Layout can be made to be rectangular.
Source: David Harris
2’s Complement Multiply Definition
MSb has negative weight.
4-bit 2’s complement example:
-5 = 0xB = 1011 = -1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = -8 + 0 + 2 + 1 = -5
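The negative-weight definition can be checked with a short Python sketch that weights the MSb by -2^(n-1) and every other bit by +2^i. The helper name `twos_complement_value` is an illustrative assumption.

```python
def twos_complement_value(bits, n):
    """Interpret the low n bits of `bits` as an n-bit 2's complement number."""
    msb = (bits >> (n - 1)) & 1
    value = -msb * (1 << (n - 1))          # MSb has negative weight
    for i in range(n - 1):
        value += ((bits >> i) & 1) * (1 << i)
    return value


example = twos_complement_value(0xB, 4)    # 1011
```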
Source: David Harris
2’s Complement Multiplication
Source: David Harris
Modified Baugh-Wooley Multiplier (2’s complement)
Source: David Harris
Pre-compute sums of constant ‘1’, push some terms upwards.
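A behavioral sketch of the pre-computed-constant idea, assuming the common textbook form of the modified Baugh-Wooley array: partial-product bits that involve exactly one sign bit are inverted, and constant 1s are added in columns n and 2n-1. Whether this matches the slide's figure cell for cell is an assumption; the function name is illustrative.

```python
def baugh_wooley_multiply(a, b, n):
    """Signed n x n multiply using only non-negative partial-product bits."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    total = 0
    for i in range(n):
        for j in range(n):
            pp = ((a >> i) & 1) & ((b >> j) & 1)
            # Terms with exactly one sign bit carry negative weight;
            # replace -p with (NOT p) and fold the correction into constants.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            total += pp << (i + j)
    total += (1 << n) + (1 << (2 * n - 1))   # pre-computed constant 1s
    total &= (1 << (2 * n)) - 1              # keep a 2n-bit product
    if total >> (2 * n - 1):                 # reinterpret as 2's complement
        total -= 1 << (2 * n)
    return total


result = baugh_wooley_multiply(-5, 3, 4)
```

The constants come from rewriting each negative term -p*2^k as (1-p)*2^k - 2^k and summing the leftover -2^k terms modulo 2^(2n).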
Multiplier Layout For Two’s Complement
Shaded cells are modified cells for Baugh-Wooley.
Source: David Harris
Booth Encoding
Previous multipliers use radix-2: one bit of the multiplier is observed at a time.
In general, radix-2^r multipliers produce N/r partial products (assuming an NxN multiplier).
Fewer partial products lead to smaller/faster CSA arrays.
A radix-4 = radix-2^2 multiplier produces N/2 partial products.
Each partial product is Y times a two-bit group X1X0 of the multiplier X, so it is one of Y*0, Y*1, Y*2, Y*3.
Y*0, Y*1, Y*2 are easy/fast (Y*2 is a shift).
Y*3 is hard: it has to be done as Y*3 = Y*(2+1) = 2Y + Y, which involves a carry propagate.
Radix-4 Partial Products
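Y * X(N-1)X(N-2)...X3X2X1X0 is formed as the sum of N/2 rows, each shifted left by two bits relative to the row above:
Y * X1X0
+ Y * X3X2
+ ...
+ Y * X(N-1)X(N-2)
Number of partial products is reduced.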
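A small Python sketch of plain (non-Booth) radix-4 partial-product generation for an unsigned multiplier: each row is Y times one two-bit group of X, weighted by 4^i. The function name and the sample operands are illustrative assumptions.

```python
def radix4_partial_products(y, x, n):
    """Return the N/2 rows Y * (X[2i+1] X[2i]), already weighted by 4**i."""
    assert n % 2 == 0, "sketch assumes an even operand width"
    rows = []
    for i in range(n // 2):
        digit = (x >> (2 * i)) & 0b11      # two multiplier bits: 0, 1, 2 or 3
        rows.append((y * digit) << (2 * i))
    return rows


rows = radix4_partial_products(0xA7, 0x5C, 8)
```

Here `y * digit` hides the hard case: a digit of 3 needs 2Y + Y, which is exactly the carry-propagate addition that Booth encoding is designed to avoid.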
Source: David Harris
Booth Encoding (cont.)
Observe that 2Y = 4Y - 2Y and 3Y = 4Y - Y.
4Y is simply Y in the next row of the partial products, so just add Y to the next row. In both cases (pair value 2 or 3), an extra Y has to be added to the partial product of the next pair.
Booth encoding looks at the current 2 bits and the MSB of the previous 2 bits, and modifies the partial product.
If the MSB of the previous pair is ‘1’, add ‘Y’ to the current value.
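The recoding rule can be sketched in Python as a digit per bit pair: -2 times the pair's MSB, plus the pair's LSB, plus the MSB of the previous pair. The function name is illustrative, and the sketch assumes an unsigned multiplier with an even width, so one extra digit is produced to absorb a Y owed by the last pair.

```python
def booth_radix4_digits(x, n):
    """Recode an n-bit unsigned multiplier into digits in {-2,-1,0,1,2}."""
    x &= (1 << n) - 1
    digits = []
    prev_msb = 0
    for i in range(0, n + 2, 2):           # n/2 pairs plus one extra digit
        pair = (x >> i) & 0b11
        cur_msb, cur_lsb = pair >> 1, pair & 1
        digits.append(-2 * cur_msb + cur_lsb + prev_msb)
        prev_msb = cur_msb
    return digits


digits = booth_radix4_digits(0b11100110, 8)
value = sum(d * 4 ** i for i, d in enumerate(digits))
```

Every digit is 0, plus or minus 1, or plus or minus 2, so each partial product is a shift and an optional negation of Y, never a 3Y.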
Booth Encoding (cont)
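Current pair  MSB of previous pair  Partial product
00            0                     PP = 0*Y = 0
00            1                     PP = 0*Y + Y = Y
01            0                     PP = Y + 0 = Y
01            1                     PP = Y + Y = 2Y
10            0                     PP = -2Y + 0 = -2Y
10            1                     PP = -2Y + Y = -Y
11            0                     PP = -Y + 0 = -Y
11            1                     PP = -Y + Y = 0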
Negative operations are done at bit level as complements, with +1 added to the PP to complete the 2’s complement.
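A minimal sketch of how a negative Booth row might be formed, assuming the row bits are the bitwise complement of the selected magnitude (Y or 2Y) and a separate +1 is injected elsewhere in the array. The function name `booth_row` and the 8-bit row width are illustrative assumptions.

```python
def booth_row(y, digit, width):
    """Return (row_bits, plus_one) for a Booth digit in {-2,-1,0,1,2}."""
    mask = (1 << width) - 1
    magnitude = (y * abs(digit)) & mask    # 0, Y, or 2Y (2Y is a shift)
    if digit < 0:
        # Complement at bit level; the +1 is deferred to complete
        # the 2's complement.
        return ~magnitude & mask, 1
    return magnitude, 0


row, plus_one = booth_row(5, -2, 8)
completed = (row + plus_one) & 0xFF
```

Deferring the +1 keeps the selection logic free of carry propagation; the extra bit is simply one more input to the CSA array.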
Booth encoder outputs: 1Y select, 2Y select, sign bit select.
Source: David Harris
Booth Selection Logic
Replaces AND gates in CSA array.
When -Y is chosen, there is a problem in that a ‘1’ has to be added to complete the two’s complement.
Source: David Harris
Unsigned R-4 Booth Array (16 x 16)
Sign extension: either all 1’s or all 0’s for -Y terms.
Extra PP in case the last PP needed a ‘Y’ added in here (last two X bits were either 2 or 3).
‘1’ or ‘0’ needed to complete 2’s complement.
Source: David Harris
Optimized R-4 Booth Array (unsigned)
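SSSS = 1111 + NOT(S), with the carry out of the top bit discarded: a run of sign-extension bits equals a run of constant 1’s plus the inverted sign bit added at the least significant position.
Additional reduction produces this.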
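The sign-extension trick can be checked numerically. This sketch assumes the bit added to the run of constant 1s is the complement of the sign bit S, with the carry out of the top discarded; the function name is illustrative.

```python
def sign_extension_forms(s, k):
    """Compare k replicated sign bits with k constant 1s plus NOT s."""
    mask = (1 << k) - 1
    replicated = mask if s else 0              # SSSS...S
    constant_form = (mask + (1 - s)) & mask    # 1111...1 + NOT(S), mod 2**k
    return replicated, constant_form


negative = sign_extension_forms(1, 4)   # sign bit set
positive = sign_extension_forms(0, 4)   # sign bit clear
```

Because the 1s are constants, they can be summed ahead of time across rows, which is what removes most of the sign-extension cells from the array.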
Source: David Harris
Signed R-4 Booth Array (16 x 16)
e_i = M_i xor y_15
The last partial product, PP8, is not needed for signed multiply.
Source: David Harris
Booth Speedup
• Radix-4 arrays are 20 to 50% smaller than CSA arrays and up to 20% faster.
• Higher-radix multipliers are possible, but not worth it except for larger multipliers (at least 64 bits).
Wallace Trees
A CSA array just adds the PPs together one at a time:
A 3,2 counter is another name for a full adder.
Source: David Harris
Wallace Trees (cont).
A Wallace tree adds the partial products in parallel!
Number of levels is:
Layout is not regular; long wires can cause delay.
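A behavioral sketch of Wallace-style reduction, assuming each level groups the remaining rows in threes, compresses every group with a 3,2 counter layer, and passes leftover rows through unchanged until two rows remain. The function name and the grouping order are illustrative assumptions; a real tree is wired per bit column.

```python
def wallace_reduce(rows):
    """Reduce integer rows to at most two; return (rows, levels used)."""
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        groups = len(rows) // 3
        next_rows = []
        for g in range(groups):
            a, b, c = rows[3 * g: 3 * g + 3]
            next_rows.append(a ^ b ^ c)                            # sum vector
            next_rows.append(((a & b) | (a & c) | (b & c)) << 1)   # carries
        next_rows.extend(rows[3 * groups:])    # leftovers skip this level
        rows = next_rows
        levels += 1
    return rows, levels


x, y = 0xBEEF, 0x1234
pps = [(y << i) if (x >> i) & 1 else 0 for i in range(16)]
final_rows, levels = wallace_reduce(pps)
product = sum(final_rows)       # final carry-propagate addition
```

With this grouping, 16 partial products shrink as 16, 11, 8, 6, 4, 3, 2, so the sketch uses 6 counter levels where a linear CSA array would use 14.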
Source: David Harris
4-2 Compressor
Used to reduce the number of levels in a Wallace Tree
Number of levels is:
Layout is more regular.
Source: David Harris
Logic more complex than full adder.
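A minimal sketch of a 4-2 compressor, assuming one common construction from two chained full adders; optimized compressors use different internal logic, and the names here are illustrative. The property to check is that the five inputs equal the sum output plus twice the two carry outputs.

```python
def full_adder(a, b, c):
    """Return (sum, carry) for three one-bit inputs."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compressor_4_2(x1, x2, x3, x4, cin):
    """Compress four inputs plus a carry-in into sum, carry and cout."""
    s1, cout = full_adder(x1, x2, x3)    # cout does not depend on cin
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout


s, carry, cout = compressor_4_2(1, 1, 0, 1, 1)
```

Since `cout` is independent of `cin`, a row of compressors passes carries sideways by only one position, so the horizontal path does not ripple.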
Multiplier Summary
• CSAs: simple, but many partial products
• Booth Encoding: reduces number of required PPs, achieves speedup over CSAs
• Wallace Trees: add PPs in parallel