ppt - CMOS VLSI Design

advertisement
Introduction to
CMOS VLSI
Design
Lecture 12:
Datapath Functional Units
David Harris
Harvey Mudd College
Spring 2004
Outline




Comparators
Shifters
Multi-input Adders
Multipliers
12: Datapath Functional Units
CMOS VLSI Design
Slide 2
Comparators




0’s detector:
A = 00…000
1’s detector:
A = 11…111
Equality comparator: A = B
Magnitude comparator: A < B
12: Datapath Functional Units
CMOS VLSI Design
Slide 3
1’s & 0’s Detectors
 1’s detector: N-input AND gate
 0’s detector: NOTs + 1’s detector (N-input NOR)
A7
A6
A5
A4
A3
A2
A3
A2
allones
allzeros
A1
A0
A1
A0
A7
A6
A5
A4
A3
A2
allones
A1
A0
12: Datapath Functional Units
CMOS VLSI Design
Slide 4
Equality Comparator
 Check if each bit is equal (XNOR, aka equality gate)
 1’s detect on bitwise equality
B[3]
A[3]
B[2]
A[2]
A=B
B[1]
A[1]
B[0]
A[0]
12: Datapath Functional Units
CMOS VLSI Design
Slide 5
Magnitude Comparator
 Compute B-A and look at sign
 B-A = B + ~A + 1
 For unsigned numbers, carry out is sign bit
A B
C
B3
A B
N
A3
B2
A2
B1
Z
A=B
A1
B0
A0
12: Datapath Functional Units
CMOS VLSI Design
Slide 6
Signed vs. Unsigned
 For signed numbers, comparison is harder
–
–
–
–
C: carry out
Z: zero (all bits of A-B are 0)
N: negative (MSB of result)
V: overflow (inputs had different signs, output sign  B)
12: Datapath Functional Units
CMOS VLSI Design
Slide 7
Shifters
 Logical Shift:
– Shifts number left or right and fills with 0’s
• 1011 LSR 1 = ____
1011 LSL1 = ____
 Arithmetic Shift:
– Shifts number left or right. Rt shift sign extends
• 1011 ASR1 = ____
1011 ASL1 = ____
 Rotate:
– Shifts number left or right and fills with lost bits
• 1011 ROR1 = ____
1011 ROL1 = ____
12: Datapath Functional Units
CMOS VLSI Design
Slide 8
Shifters
 Logical Shift:
– Shifts number left or right and fills with 0’s
• 1011 LSR 1 = 0101
1011 LSL1 = 0110
 Arithmetic Shift:
– Shifts number left or right. Rt shift sign extends
• 1011 ASR1 = 1101
1011 ASL1 = 0110
 Rotate:
– Shifts number left or right and fills with lost bits
• 1011 ROR1 = 1101
1011 ROL1 = 0111
12: Datapath Functional Units
CMOS VLSI Design
Slide 9
Funnel Shifter
 A funnel shifter can do all six types of shifts
 Selects N-bit field Y from 2N-bit input
– Shift by k bits (0  k < N)
2N-1
N-1
B
0
C
offset + N-1
offset
Y
12: Datapath Functional Units
CMOS VLSI Design
Slide 10
Funnel Shifter Operation
 Computing N-k requires an adder
12: Datapath Functional Units
CMOS VLSI Design
Slide 11
Funnel Shifter Operation
 Computing N-k requires an adder
12: Datapath Functional Units
CMOS VLSI Design
Slide 12
Funnel Shifter Operation
 Computing N-k requires an adder
12: Datapath Functional Units
CMOS VLSI Design
Slide 13
Funnel Shifter Operation
 Computing N-k requires an adder
12: Datapath Functional Units
CMOS VLSI Design
Slide 14
Funnel Shifter Operation
 Computing N-k requires an adder
12: Datapath Functional Units
CMOS VLSI Design
Slide 15
Simplified Funnel Shifter
 Optimize down to 2N-1 bit input
12: Datapath Functional Units
CMOS VLSI Design
Slide 16
Simplified Funnel Shifter
 Optimize down to 2N-1 bit input
12: Datapath Functional Units
CMOS VLSI Design
Slide 17
Simplified Funnel Shifter
 Optimize down to 2N-1 bit input
12: Datapath Functional Units
CMOS VLSI Design
Slide 18
Simplified Funnel Shifter
 Optimize down to 2N-1 bit input
12: Datapath Functional Units
CMOS VLSI Design
Slide 19
Simplified Funnel Shifter
 Optimize down to 2N-1 bit input
12: Datapath Functional Units
CMOS VLSI Design
Slide 20
Funnel Shifter Design 1
 N N-input multiplexers
– Use 1-of-N hot select signals for shift amount
– nMOS pass transistor design (Vt drops!)
k[1:0]
left
Inverters & Decoder
s3
s2
s1
s0
Y3
Y2
Z6
Y1
Z5
Y0
Z4
Z3
Z2
Z1
12: Datapath Functional Units
Z0
CMOS VLSI Design
Slide 21
Funnel Shifter Design 2
 Log N stages of 2-input muxes
– No select decoding needed
k1
k0
left
Z0
Z1
Z2
Z3
Y0
Y1
Y2
Y3
Z4
Z5
Z6
12: Datapath Functional Units
CMOS VLSI Design
Slide 22
Multi-input Adders
 Suppose we want to add k N-bit words
– Ex: 0001 + 0111 + 1101 + 0010 = _____
12: Datapath Functional Units
CMOS VLSI Design
Slide 23
Multi-input Adders
 Suppose we want to add k N-bit words
– Ex: 0001 + 0111 + 1101 + 0010 = 10111
12: Datapath Functional Units
CMOS VLSI Design
Slide 24
Multi-input Adders
 Suppose we want to add k N-bit words
– Ex: 0001 + 0111 + 1101 + 0010 = 10111
 Straightforward solution: k-1 N-input CPAs
– Large and slow
0001 0111 1101 0010
+
+
10101
+
10111
12: Datapath Functional Units
CMOS VLSI Design
Slide 25
Carry Save Addition
 A full adder sums 3 inputs and produces 2 outputs
– Carry output has twice weight of sum output
 N full adders in parallel are called carry save adder
– Produce N sums and N carry outs
X4
C4 S4
Y4 Z4 X3
Y3 Z3 X2
C3 S3
C2 S2
Y2 Z2 X1
Y 1 Z1
C1 S1
XN...1 YN...1 ZN...1
n-bit CSA
CN...1
12: Datapath Functional Units
SN...1
CMOS VLSI Design
Slide 26
CSA Application
 Use k-2 stages of CSAs
– Keep result in carry-save redundant form
 Final CPA computes actual result
0001
0001 0111 1101 0010 0111
+1101
1011
4-bit CSA
0101_
0101_ 1011
0101_
1011
5-bit CSA
+0010
X
Y
Z
S
C
X
Y
Z
S
C
+
A
B
S
12: Datapath Functional Units
CMOS VLSI Design
Slide 27
CSA Application
 Use k-2 stages of CSAs
– Keep result in carry-save redundant form
 Final CPA computes actual result
0001
0001 0111 1101 0010 0111
+1101
1011
4-bit CSA
0101_
0101_ 1011
0101_
1011
5-bit CSA
+0010
01010_
00011 00011
01010_
+
01010_
+ 00011
12: Datapath Functional Units
X
Y
Z
S
C
X
Y
Z
S
C
A
B
S
CMOS VLSI Design
Slide 28
CSA Application
 Use k-2 stages of CSAs
– Keep result in carry-save redundant form
 Final CPA computes actual result
0001
0001 0111 1101 0010 0111
+1101
1011
4-bit CSA
0101_
0101_ 1011
0101_
1011
5-bit CSA
+0010
01010_
00011 00011
01010_
+
01010_
10111
+ 00011
10111
12: Datapath Functional Units
X
Y
Z
S
C
X
Y
Z
S
C
A
B
S
CMOS VLSI Design
Slide 29
Multiplication
 Example:
12: Datapath Functional Units
1100 : 1210
0101 : 510
CMOS VLSI Design
Slide 30
Multiplication
 Example:
12: Datapath Functional Units
1100 : 1210
0101 : 510
1100
CMOS VLSI Design
Slide 31
Multiplication
 Example:
12: Datapath Functional Units
1100 : 1210
0101 : 510
1100
0000
CMOS VLSI Design
Slide 32
Multiplication
 Example:
1100 : 1210
0101 : 510
1100
0000
1100
12: Datapath Functional Units
CMOS VLSI Design
Slide 33
Multiplication
 Example:
1100 : 1210
0101 : 510
1100
0000
1100
0000
12: Datapath Functional Units
CMOS VLSI Design
Slide 34
Multiplication
 Example:
1100 : 1210
0101 : 510
1100
0000
1100
0000
00111100 : 6010
12: Datapath Functional Units
CMOS VLSI Design
Slide 35
Multiplication
 Example:
1100 : 1210
0101 : 510
1100
0000
1100
0000
00111100 : 6010
multiplicand
multiplier
partial
products
product
 M x N-bit multiplication
– Produce N M-bit partial products
– Sum these to produce M+N-bit product
12: Datapath Functional Units
CMOS VLSI Design
Slide 36
General Form
Y = (yM-1, yM-2, …, y1, y0)
X = (xN-1, xN-2, …, x1, x0)
 Multiplicand:
 Multiplier:
N 1
 M 1


j
i 
P    y j 2    xi 2  

 j0
  i0
 Product:
p11
y5
y4
y3
y2
y1
y0
x5
x4
x3
x2
x1
x0
x0y5
x0y4
x0y3
x0y2
x0y1
x0y0
x1y5
x1y4
x1y3
x1y2
x1y1
x1y0
x2y5
x2y4
x2y3
x2y2
x2y1
x2y0
x3y5
x3y4
x3y3
x3y2
x3y1
x3y0
x4y5
x4y4
x4y3
x4y2
x4y1
x4y0
x5y5
x5y4
x5y3
x5y2
x5y1
x5y0
p10
p9
p8
p7
p6
p5
12: Datapath Functional Units
p4
p3
p2
N 1 M 1

i0
xi y j 2
i j
j0
multiplicand
multiplier
partial
products
p1
CMOS VLSI Design
p0
product
Slide 37
Dot Diagram
 Each dot represents a bit
x0
partial products
multiplier x
x15
12: Datapath Functional Units
CMOS VLSI Design
Slide 38
Array Multiplier
y3
y2
y1
y0
x0
x1
CSA
Array
x2
x3
CPA
p7
p6
p5
p4
p3
p2
p1
p0
A B
critical path
Sin A Cin
B
Sin
B
Cout
=
Cout
Cout
A
Sout
12: Datapath Functional Units
Cin
Cin
Sout
A
=
Cout
B
Cin
Sout
Sout
CMOS VLSI Design
Slide 39
Rectangular Array
 Squash array to fit rectangular floorplan
y3
y2
y1
y0
x0
p0
x1
p1
x2
p2
x3
p3
p7
12: Datapath Functional Units
p6
p5
p4
CMOS VLSI Design
Slide 40
Fewer Partial Products
 Array multiplier requires N partial products
 If we looked at groups of r bits, we could form N/r
partial products.
– Faster and smaller?
– Called radix-2r encoding
 Ex: r = 2: look at pairs of bits
– Form partial products of 0, Y, 2Y, 3Y
– First three are easy, but 3Y requires adder 
12: Datapath Functional Units
CMOS VLSI Design
Slide 41
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 42
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 43
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 44
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 45
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 46
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 47
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 48
Booth Encoding
 Instead of 3Y, try –Y, then increment next partial
product to add 4Y
 Similarly, for 2Y, try –2Y + 4Y in next partial product
12: Datapath Functional Units
CMOS VLSI Design
Slide 49
Booth Hardware
 Booth encoder generates control lines for each PP
– Booth selectors choose PP bits
yj
yj-1
Xi
x2i-1
x2i
2Xi
x2i+1
Mi
Booth
Encoder
Booth
Selector
PPij
12: Datapath Functional Units
CMOS VLSI Design
Slide 50
Sign Extension
 Partial products can be negative
– Require sign extension, which is cumbersome
– High fanout on most significant bit
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s
s s s s s s s
s s s s s
s s s
s
s
s
s
s
s
PP0
PP1
PP2
s
s
s
0 x-1
x0
PP3
PP4
multiplier x
s
s
s
s
s
s
s
s
PP5
PP6
PP7
PP8
12: Datapath Functional Units
CMOS VLSI Design
x15
0 x16
0 x17
Slide 51
Simplified Sign Ext.
 Sign bits are either all 0’s or all 1’s
– Note that all 0’s is all 1’s + 1 in proper column
– Use this to reduce loading on MSB
s
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
s
1 1 1 1 1 1 1 1 1 1 1 1 1
s
1 1 1 1 1 1 1 1 1 1 1
s
1 1 1 1 1 1 1 1 1
s
1 1 1 1 1 1 1
s
1 1 1 1 1
s
1 1 1
s
1
s
s
s
s
s
s
s
s
PP0
PP1
PP2
PP3
PP4
PP5
PP6
PP7
PP8
12: Datapath Functional Units
CMOS VLSI Design
Slide 52
Even Simpler Sign Ext.
 No need to add all the 1’s in hardware
– Precompute the answer!
s s s
1 s
s
PP0
PP1
PP2
PP3
PP4
PP5
PP6
PP7
PP8
CMOS VLSI Design
Slide 53
s
1 s
s
1 s
s
1 s
s
1 s
s
1 s
s
s
12: Datapath Functional Units
s
Advanced Multiplication
 Signed vs. unsigned inputs
 Higher radix Booth encoding
 Array vs. tree CSA networks
12: Datapath Functional Units
CMOS VLSI Design
Slide 54
Download