Lecture 18:
Datapath
Functional
Units
Comparators
Shifters
Multi-input Adders
Multipliers
Outline
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
2
Comparators
0’s detector:
1’s detector:
A = 00…000
A = 11…111
Equality comparator: A = B
Magnitude comparator: A < B
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
3
1’s & 0’s Detectors
1’s detector: N-input AND gate
0’s detector: NOTs + 1’s detector (N-input NOR)
A
7
A
6
A
5
A
4
A
3
A
2
A
1
A
0
A
3
A
2
A
1
A
0
A
7
A
6
A
5
A
4 allones
A
A
2
3
A
A
0
1 allones allzeros
CMOS VLSI Design 4th Ed.
18: Datapath Functional Units 4
Equality Comparator
Check if each bit is equal (XNOR, aka equality gate)
1’s detect on bitwise equality
B[3]
A[3]
B[2]
A[2]
B[1]
A[1]
B[0]
A[0]
A = B
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
5
Magnitude Comparator
Compute B – A and look at sign
B – A = B + ~A + 1
For unsigned numbers, carry out is sign bit
A
B
C
N
A
B B
3
A
3
B
2
A
2
B
1
A
1
B
0
A
0
Z
A = B
CMOS VLSI Design 4th Ed.
18: Datapath Functional Units 6
Signed vs. Unsigned
For signed numbers, comparison is harder
– C: carry out
– Z: zero (all bits of A –
B are 0)
– N: negative (MSB of result)
– V: overflow (inputs had different signs, output sign
B)
– S: N xor V (sign of result)
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
7
Shifters
Logical Shift:
– Shifts number left or right and fills with 0’s
• 1011 LSR 1 = 0101 1011 LSL1 = 0110
Arithmetic Shift:
– Shifts number left or right. Rt shift sign extends
• 1011 ASR1 = 1101
Rotate:
1011 ASL1 = 0110
– Shifts number left or right and fills with lost bits
• 1011 ROR1 = 1101 1011 ROL1 = 0111
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
8
Funnel Shifter
A funnel shifter can do all six types of shifts
Selects N-bit field Y from 2N –1-bit input
– Shift by k bits (0 k < N)
– Logically involves N N:1 multiplexers
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
9
Funnel Source Generator
Rotate Right
Logical Right
Arithmetic Right
Rotate Left
Logical/Arithmetic Left
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
10
Array Funnel Shifter
N N-input multiplexers
– Use 1-of-N hot select signals for shift amount
– nMOS pass transistor design (V t k[1:0] drops!) left Inverters & Decoder s
3 s
2 s
1 s
0
Z
6
Z
5
Z
4
Z
3
Z
2
Z
1
Z
0
18: Datapath Functional Units
Y
3
Y
2
Y
1
Y
0
CMOS VLSI Design 4th Ed.
11
Logarithmic Funnel Shifter
Log N stages of 2-input muxes
– No select decoding needed
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
12
32-bit Logarithmic Funnel
Wider multiplexers reduce delay and power
Operands > 32 bits introduce datapath irregularity
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
13
Barrel Shifter
Barrel shifters perform right rotations using wraparound wires.
Left rotations are right rotations by N – k = k + 1 bits.
Shifts are rotations with the end bits masked off.
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
14
Logarithmic Barrel Shifter
Right shift only
Right/Left shift
18: Datapath Functional Units
Right/Left Shift & Rotate
CMOS VLSI Design 4th Ed.
15
32-bit Logarithmic Barrel
Datapath never wider than 32 bits
First stage preshifts by 1 to handle left shifts
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
16
Multi-input Adders
Suppose we want to add k N-bit words
– Ex: 0001 + 0111 + 1101 + 0010 = 10111
Straightforward solution: k-1 N-input CPAs
– Large and slow
0001 0111 1101 0010
1000
+
10101
+
+
10111
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
17
Carry Save Addition
A full adder sums 3 inputs and produces 2 outputs
– Carry output has twice weight of sum output
N full adders in parallel are called carry save adder
– Produce N sums and N carry outs
X
4
Y
4
Z
4
X
3
Y
3
Z
3
X
2
Y
2
Z
2
X
1
Y
1
Z
1
C
4
S
4
C
3
S
3
C
2
S
X
N...1
Y
N...1
Z
N...1
2 n-bit CSA
C
N...1
S
N...1
18: Datapath Functional Units
C
1
S
1
CMOS VLSI Design 4th Ed.
18
CSA Application
Use k-2 stages of CSAs
– Keep result in carry-save redundant form
Final CPA computes actual result
0001 0111 1101 0010
0001
0111
+1101
1011
0101_
0101_
4-bit CSA
1011
5-bit CSA
01010_ 00011
0101_
1011
+0010
00011
01010_
+
10111
01010_
+ 00011
10111
X
Y
Z
S
C
X
Y
Z
S
C
A
B
S
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
19
Multiplication
Example:
1100 : 12
10
0101 : 5
10
1100
0000
1100
0000
00111100 : 60
10 multiplicand multiplier partial products product
M x N-bit multiplication
– Produce N M-bit partial products
– Sum these to produce M+N-bit product
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
20
General Form
Multiplicand: Y = (y
M-1
, y
M-2
, …, y
1
, y
0
)
Multiplier: X = (x
N-1
, x
N-2
, …, x
1
, x
0
)
Product: P
M j
1
0 y j
2 j
N i
1
0 x i
2 i
N
1 M
1 i
0 j
0 x y i j
2 multiplicand multiplier p
11 x
5 y
5 p
10 x
4 y
5 x
5 y
4 p
9 x
3 y
5 x
4 y
4 x
5 y
3 p
8 x
2 y
5 x
3 y
4 x
4 y
3 x
5 y
2 p
7 x
1 y
5 x
2 y
4 x
3 y
3 x
4 y
2 x
5 y
1 p
6 y
5 x
5 x
0 y
5 x
1 y
4 x
2 y
3 x
3 y
2 x
4 y
1 x
5 y
0 p
5 y
4 x
4 x
0 y
4 x
1 y
3 x
2 y
2 x
3 y
1 x
4 y
0 y
3 x
3 x
0 y
3 x
1 y
2 x
2 y
1 x
3 y
0 y
2 x
2 x
0 y
2 x
1 y
1 x
2 y
0 p
4 p
3 p
2 y
1 x
1 x
0 y
1 x
1 y
0 y
0 x
0 x
0 y
0 p
1 p
0 partial products product
CMOS VLSI Design 4th Ed.
18: Datapath Functional Units 21
Dot Diagram
Each dot represents a bit partial products x
0
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
x
15
22
x
2 x
3 x
0 x
1
Array Multiplier y
3 y
2 y
1 y
0
CSA
Array
CPA p
7 p
6 p
5
A B p
4 p
3
B
Sin A Cin
Cout Sout
=
Cout
Sin
Cin
Sout critical path
Cout p
2 p
1 p
0
A
Sout
B
Cin = Cout
A
Sout
B
Cin
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
23
Rectangular Array
Squash array to fit rectangular floorplan y
3 y
2 y
1 y
0 x
0 p
0 x
1 p
1 x
2 p
2 x
3 p
3
18: Datapath Functional Units p
7 p
6 p
5 p
4
CMOS VLSI Design 4th Ed.
24
Fewer Partial Products
Array multiplier requires N partial products
If we looked at groups of r bits, we could form N/r partial products.
– Faster and smaller?
– Called radix-2 r encoding
Ex: r = 2: look at pairs of bits
– Form partial products of 0, Y, 2Y, 3Y
– First three are easy, but 3Y requires adder
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
25
Booth Encoding
Instead of 3Y, try –Y, then increment next partial product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
26
Booth Hardware
Booth encoder generates control lines for each PP
– Booth selectors choose PP bits
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
27
Sign Extension
Partial products can be negative
– Require sign extension, which is cumbersome
– High fanout on most significant bit
0 x
-1 x
0 s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s s PP
0
PP
1
PP
2
PP
3 s
PP
4
PP
5
PP
6
PP
7
PP
8
0
0 x
15 x
16 x
17
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
28
Simplified Sign Ext.
Sign bits are either all 0’s or all 1’s
– Note that all 0’s is all 1’s + 1 in proper column
– Use this to reduce loading on MSB
1 1 1 1 1 1 1 1 1 1 1 1
1
1 s
1
1 s
1
1 1
1 1
1 1
1 1
1 1
1 s
1
1
1 1
1
1
1
1
1
1
1
1
1
1
1
1
1 s
1
1
1
1
1 s
1
1 1
1
1
1 s
1
1
1
1 s
1
1 s
1
1 s s s s s s s s
PP
4
PP
5
PP
6
PP
7
PP
8
PP
0
PP
1
PP
2
PP
3
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
29
Even Simpler Sign Ext.
No need to add all the 1’s in hardware
– Precompute the answer!
s
1 s
1 s
1 s
1 s
1 s
1 s s s s s s s s s s s s
PP
0
PP
1
PP
2
PP
3
PP
4
PP
5
PP
6
PP
7
PP
8
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
30
Advanced Multiplication
Signed vs. unsigned inputs
Higher radix Booth encoding
Array vs. tree CSA networks
18: Datapath Functional Units CMOS VLSI Design 4th Ed.
31