# Lecture 3

```ELECT 90X
Programmable Logic Circuits: Multipliers
Dr. Eng. Amr T. Abdel-Hamid

Slides based on slides prepared by:
• B. Parhami, Computer Arithmetic: Algorithms and Hardware Design, Oxford University Press, 2000.
• I. Koren, Computer Arithmetic Algorithms, 2nd Edition, A.K. Peters, Natick, MA, 2002.

Fall 2009

Notation for our discussion of multiplication algorithms:
a   Multiplicand      a_(k-1) a_(k-2) . . . a_1 a_0
x   Multiplier        x_(k-1) x_(k-2) . . . x_1 x_0
p   Product (a × x)   p_(2k-1) p_(2k-2) . . . p_3 p_2 p_1 p_0
Initially, we assume unsigned operands

[Figure: Multiplication of two 4-bit unsigned binary numbers in dot notation. The multiplicand a and the multiplier x form the partial-products bit-matrix x_0 a 2^0, x_1 a 2^1, x_2 a 2^2, x_3 a 2^3, whose sum is the product p.]

Multiplication Recurrence

[Figure: The dot-notation diagram of 4-bit multiplication, with partial products x_0 a 2^0 through x_3 a 2^3 and product p.]

Multiplication with right shifts: top-to-bottom accumulation (preferred)
  p^(j+1) = (p^(j) + x_j a 2^k) 2^(-1)        [add, then shift right]
  with p^(0) = 0 and p^(k) = p = a x + p^(0) 2^(-k)

Multiplication with left shifts: bottom-to-top accumulation
  p^(j+1) = 2 p^(j) + x_(k-j-1) a             [shift left, then add]
  with p^(0) = 0 and p^(k) = p = a x + p^(0) 2^k

Examples of Basic Multiplication

Examples of sequential multiplication with right and left shifts.

Right-shift algorithm
==============================
a             1 0 1 0
x             1 0 1 1
==============================
p(0)          0 0 0 0
+x0 a         1 0 1 0
------------------------------
2p(1)       0 1 0 1 0
p(1)          0 1 0 1 0
+x1 a         1 0 1 0
------------------------------
2p(2)       0 1 1 1 1 0
p(2)          0 1 1 1 1 0
+x2 a         0 0 0 0
------------------------------
2p(3)       0 0 1 1 1 1 0
p(3)          0 0 1 1 1 1 0
+x3 a         1 0 1 0
------------------------------
2p(4)       0 1 1 0 1 1 1 0
p(4)          0 1 1 0 1 1 1 0
==============================

Left-shift algorithm
==============================
a                   1 0 1 0
x                   1 0 1 1
==============================
p(0)                0 0 0 0
2p(0)             0 0 0 0 0
+x3 a               1 0 1 0
------------------------------
p(1)              0 1 0 1 0
2p(1)           0 1 0 1 0 0
+x2 a               0 0 0 0
------------------------------
p(2)            0 1 0 1 0 0
2p(2)         0 1 0 1 0 0 0
+x1 a               1 0 1 0
------------------------------
p(3)          0 1 1 0 0 1 0
2p(3)       0 1 1 0 0 1 0 0
+x0 a               1 0 1 0
------------------------------
p(4)        0 1 1 0 1 1 1 0
==============================

Basic Hardware Multipliers

[Figure: Hardware realization of the sequential multiplication algorithm. The multiplier x and the double-width partial product p^(j) are held in shifting registers; bit x_j drives a mux that selects 0 or the multiplicand a, and a k-bit adder adds x_j a to the partial product, with c_out fed back into the shift.]

Multiplication of Signed Numbers

Sequential multiplication of 2’s-complement numbers with right shifts (positive multiplier).

Negative multiplicand, positive multiplier:
No change, other than looking out for proper sign extension.

==================================
a             1 0 1 1 0
x             0 1 0 1 1
==================================
p(0)          0 0 0 0 0
+x0 a         1 0 1 1 0
----------------------------------
2p(1)       1 1 0 1 1 0
p(1)          1 1 0 1 1 0
+x1 a         1 0 1 1 0
----------------------------------
2p(2)       1 1 0 0 0 1 0
p(2)          1 1 0 0 0 1 0
+x2 a         0 0 0 0 0
----------------------------------
2p(3)       1 1 1 0 0 0 1 0
p(3)          1 1 1 0 0 0 1 0
+x3 a         1 0 1 1 0
----------------------------------
2p(4)       1 1 0 0 1 0 0 1 0
p(4)          1 1 0 0 1 0 0 1 0
+x4 a         0 0 0 0 0
----------------------------------
2p(5)       1 1 1 0 0 1 0 0 1 0
p(5)          1 1 1 0 0 1 0 0 1 0
==================================

The Case of a Negative Multiplier

Sequential multiplication of 2’s-complement numbers with right shifts (negative multiplier).

Negative multiplicand, negative multiplier:
In the last step (the sign bit), subtract rather than add.

==================================
a             1 0 1 1 0
x             1 0 1 0 1
==================================
p(0)          0 0 0 0 0
+x0 a         1 0 1 1 0
----------------------------------
2p(1)       1 1 0 1 1 0
p(1)          1 1 0 1 1 0
+x1 a         0 0 0 0 0
----------------------------------
2p(2)       1 1 1 0 1 1 0
p(2)          1 1 1 0 1 1 0
+x2 a         1 0 1 1 0
----------------------------------
2p(3)       1 1 0 0 1 1 1 0
p(3)          1 1 0 0 1 1 1 0
+x3 a         0 0 0 0 0
----------------------------------
2p(4)       1 1 1 0 0 1 1 1 0
p(4)          1 1 1 0 0 1 1 1 0
+(-x4 a)      0 1 0 1 0
----------------------------------
2p(5)       0 0 0 1 1 0 1 1 1 0
p(5)          0 0 0 1 1 0 1 1 1 0
==================================

Booth’s Encoding
• When multiplying by 9:
  - Multiply by 10 (easy, just shift digits left)
  - Subtract once
• E.g. 123454 × 9 = 123454 × (10 - 1) = 1234540 - 123454
  - Converts addition of six partial products to one shift and one subtraction
• Booth’s algorithm applies the same principle
  - Except there is no ‘9’ in binary, just ‘1’ and ‘0’
  - So, it’s actually easier!

Booth’s Encoding
• Search for a run of ‘1’ bits in the multiplier
  - E.g. ‘0110’ has a run of two ‘1’ bits in the middle
  - Multiplying by ‘0110’ (6 in decimal) is equivalent to multiplying by 8 and subtracting twice the multiplicand, since 6 × m = (8 - 2) × m = 8m - 2m
• Hence, iterate right to left and:
  - Subtract the multiplicand from the product at the first ‘1’ of a run
  - Add the multiplicand to the product after the last ‘1’ of a run
  - Don’t do either for ‘1’ bits in the middle

Booth’s Algorithm
Current bit | Bit to right | Explanation            | Example     | Operation
------------+--------------+------------------------+-------------+----------
     1      |      0       | Begins run of ‘1’      | 00001111000 | Subtract
     1      |      1       | Middle of run of ‘1’   | 00001111000 | Nothing
     0      |      1       | End of a run of ‘1’    | 00001111000 | Add
     0      |      0       | Middle of a run of ‘0’ | 00001111000 | Nothing

Booth’s Encoding
• Really just a new way to encode numbers
  - Normally positionally weighted as 2^n
  - With Booth, each position has a sign bit
  - Can be extended to multiple bits

   0    1    1    0      Binary
  +1    0   -1    0      1-bit Booth
     +2        -2        2-bit Booth

Booth’s Recoding
-----------------------------------------------------
x_i  x_(i-1)   y_i   Explanation
-----------------------------------------------------
 0     0        0    No string of 1s in sight
 0     1        1    End of string of 1s in x
 1     0       -1    Beginning of string of 1s in x
 1     1        0    Continuation of string of 1s in x
-----------------------------------------------------

Example
     1  0  0  1    1  1  0  1    1  0  1  0    1  1  1  0    Operand x
(1) -1  0  1  0    0 -1  1  0   -1  1 -1  1    0  0 -1  0    Recoded version y

Justification
2^j + 2^(j-1) + . . . + 2^(i+1) + 2^i = 2^(j+1) - 2^i

Example Multiplication with Booth’s Recoding

Sequential multiplication of 2’s-complement numbers with right shifts by means of Booth’s recoding.

----------------------
x_i  x_(i-1)   y_i
----------------------
 0     0        0
 0     1        1
 1     0       -1
 1     1        0
----------------------

==================================================
a                1  0  1  1  0
x                1  0  1  0  1    Multiplier
y               -1  1 -1  1 -1    Booth-recoded
==================================================
p(0)             0  0  0  0  0
+y0 a            0  1  0  1  0
--------------------------------------------------
2p(1)         0  0  1  0  1  0
p(1)             0  0  1  0  1  0
+y1 a            1  0  1  1  0
--------------------------------------------------
2p(2)         1  1  1  0  1  1  0
p(2)             1  1  1  0  1  1  0
+y2 a            0  1  0  1  0
--------------------------------------------------
2p(3)         0  0  0  1  1  1  1  0
p(3)             0  0  0  1  1  1  1  0
+y3 a            1  0  1  1  0
--------------------------------------------------
2p(4)         1  1  1  0  0  1  1  1  0
p(4)             1  1  1  0  0  1  1  1  0
+y4 a            0  1  0  1  0
--------------------------------------------------
2p(5)         0  0  0  1  1  0  1  1  1  0
p(5)             0  0  0  1  1  0  1  1  1  0
==================================================

[Figure: Radix-4 multiplication in dot notation. The four radix-2 partial products x_0 a 2^0, x_1 a 2^1, x_2 a 2^2, x_3 a 2^3 are replaced by the two radix-4 partial products (x_1 x_0)_two a 4^0 and (x_3 x_2)_two a 4^1, which sum to the product p.]

Number of cycles is halved, but now the “difficult” multiple 3a must be dealt with.

A Possible Design for a Radix-4 Multiplier
[Figure: The multiple generation part of a radix-4 multiplier with precomputation of 3a. The multiplier is shifted 2 bits at a time; bits x_(i+1) x_i drive a mux that selects 0, a, 2a, or 3a for the codes 00, 01, 10, 11. The multiple 3a is precomputed via 3a = 2a + a.]

k/2 + 1 cycles, rather than k.
One extra cycle is not too bad, but we would like to avoid it if possible.
Solving this problem for radix 4 may also help when dealing with even higher radices.

Example of radix-4 multiplication using the 3a multiple.

======================================
a                   0 1 1 0
3a              0 1 0 0 1 0
x                   1 1 1 0
======================================
p(0)                0 0 0 0
+(x1x0)_two a   0 0 1 1 0 0
--------------------------------------
4p(1)           0 0 1 1 0 0
p(1)                0 0 1 1 0 0
+(x3x2)_two a   0 1 0 0 1 0
--------------------------------------
4p(2)           0 1 0 1 0 1 0 0
p(2)                0 1 0 1 0 1 0 0
======================================

Modified Booth’s Recoding
Radix-4 Booth’s recoding yielding (z_(k/2) . . . z_1 z_0)_four

-----------------------------------------------------------------------------
x_(i+1)  x_i  x_(i-1)   y_(i+1)  y_i   z_(i/2)   Explanation
-----------------------------------------------------------------------------
   0      0      0          0      0       0      No string of 1s in sight
   0      0      1          0      1       1      End of string of 1s
   0      1      0          0      1       1      Isolated 1
   0      1      1          1      0       2      End of string of 1s
   1      0      0         -1      0      -2      Beginning of string of 1s
   1      0      1         -1      1      -1      End a string, begin new one
   1      1      0          0     -1      -1      Beginning of string of 1s
   1      1      1          0      0       0      Continuation of string of 1s
-----------------------------------------------------------------------------
x_(i-1) is the context bit; y_(i+1) y_i are the recoded radix-2 digits.

Example
     1  0  0  1    1  1  0  1    1  0  1  0    1  1  1  0    Operand x
(1) -1  0  1  0    0 -1  1  0   -1  1 -1  1    0  0 -1  0    Recoded version y
(1)    -2     2      -1     2      -1    -1       0    -2    Radix-4 version z

Example Multiplication via Modified Booth’s Recoding
Example of radix-4 multiplication with modified Booth’s recoding of the 2’s-complement multiplier.

==================================
a               0 1 1 0
x               1 0 1 0
z                -1  -2    Recoded multiplier
==================================
p(0)        0 0 0 0 0 0
+z0 a       1 1 0 1 0 0
----------------------------------
4p(1)       1 1 0 1 0 0
p(1)        1 1 1 1 0 1 0 0
+z1 a       1 1 1 0 1 0
----------------------------------
4p(2)       1 1 0 1 1 1 0 0
p(2)            1 1 0 1 1 1 0 0
==================================

Multiple Generation with Radix-4 Booth’s Recoding
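[Figure: The multiple generation part of a radix-4 multiplier based on Booth’s recoding. The multiplier register shifts 2 bits at a time (x_(i-1) is initialized to 0) and supplies x_(i+1), x_i, x_(i-1) to the recoding logic, which produces the signals neg, two, and non0. A mux selects a or 2a from the k-bit multiplicand and an enable gates the result, giving the (k+1)-bit value 0, a, or 2a (with sign extension, not 0). The control output accompanies the multiple z_(i/2) a.]
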
Count = 4 vs. 8: speed improvement
Count = 7 vs. 9: no speed improvement
Count = 16 vs. 8: speed worsened
On average, no improvement in speed

Yet Another Design for Radix-4 Multiplication
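[Figure: Radix-4 multiplier with carry-save adders. From the multiplier, bit x_(i+1) controls a mux that selects 0 or 2a and bit x_i controls a mux that selects 0 or a. Two CSAs combine the selected multiples with the old cumulative partial product to form the new cumulative partial product, and a 2-bit unit with a flip-flop (FF) passes bits to the lower half of the partial product.]
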
[Figure: Radix-16 multiplier with the upper half of the cumulative partial product in carry-save form. The multiplier is shifted right 4 bits per cycle; bits x_(i+3), x_(i+2), x_(i+1), x_i control four muxes that select 0 or 8a, 0 or 4a, 0 or 2a, and 0 or a. A chain of CSAs combines the selected multiples with the sum and carry words of the partial product (upper half), and a 4-bit unit with a flip-flop (FF) passes bits to the lower half of the partial product.]

A Spectrum of Multiplier Design Choices
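[Figure: A spectrum of multiplier design choices. Basic binary: the next multiple is added to the partial product by an adder. High-radix or partial tree: several multiples pass through a small CSA tree into the partial product. Full tree: all multiples pass through a full CSA tree. Moving toward the full tree speeds up the design; moving toward basic binary economizes.]
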
Multibeat Multipliers
[Figure: (a) Sequential machine with FFs: the inputs and the present state feed the next-state logic, whose next-state excitation is stored in state flip-flops clocked by CLK. (b) Sequential machine with latches and 2-phase clock: two blocks of next-state logic alternate with two sets of state latches clocked by PH1 and PH2.]
Conceptual view of a twin-beat multiplier.

[Figure: Timing over one cycle, marking the point where FF contents begin changing and the point where the change becomes visible at the FF output.]
Observation: Half of the clock cycle goes to waste.

Twin-Beat and Three-Beat Multipliers
[Figure: Twin-beat multiplier with Booth’s recoding. Twin multiplier registers feed two pipelined Booth recoder and selector blocks, each of which chooses among multiples formed from a and 3a. Two CSAs produce sum and carry words, and a 6-bit unit with a flip-flop (FF) passes bits to the lower half of the partial product.]

Full-Tree Multipliers
[Figure: General structure of a full-tree multiplier. The multiplier and the multiplicand a enter the multiple-forming circuits; the multiples go to the partial-products reduction tree (multi-operand addition), whose redundant result is turned into the higher-order product bits by a redundant-to-binary converter. Some lower-order product bits are generated directly.]

Full-Tree versus Partial-Tree Multiplier
[Figure: Schematic diagrams for full-tree and partial-tree multipliers. Full tree: all partial products enter a large tree of carry-save adders (log depth), followed by an adder (log depth) that delivers the product. Partial tree: several partial products at a time enter a small tree of carry-save adders, followed by an adder that delivers the product.]

Variations in Full-Tree Multiplier Design
Designs are distinguished by variations in three elements:
1. Multiple-forming circuits
2. Partial products reduction tree
3. Redundant-to-binary converter

[Figure: The general structure of a full-tree multiplier, repeated with the three elements marked.]

Example of Variations in CSA Tree Design
Two different binary 4 × 4 tree multipliers. Each row gives the number of dots per column, most significant column on the left.

Wallace Tree (5 FAs + 3 HAs + 4-Bit Adder)
     1  2  3  4  3  2  1
          FA FA FA HA
   ---------------------
     1  3  2  3  2  1  1
       FA HA FA HA
   ---------------------
     2  2  2  2  1  1  1
    4-Bit Adder
   ---------------------
  1  1  1  1  1  1  1  1

Dadda Tree (4 FAs + 2 HAs + 6-Bit Adder)
     1  2  3  4  3  2  1
          FA FA
   ---------------------
     1  3  2  2  3  2  1
       FA HA HA FA
   ---------------------
     2  2  2  2  1  2  1
    6-Bit Adder
   ---------------------
  1  1  1  1  1  1  1  1

Binary Tree of 4-to-2 Reduction Modules
[Figure: Tree multiplier with a more regular structure based on 4-to-2 reduction modules. Each 4-to-2 reduction module is implemented with two levels of (3; 2)-counters (two CSAs), and seven such modules are arranged as a binary tree.]

Due to its recursive structure, a binary tree is more regular than a 3-to-2 reduction tree when laid out in VLSI.

Array Multipliers
[Figure: A basic array multiplier. The multiples x_0 a, x_1 a, x_2 a, x_3 a, x_4 a enter a chain of CSAs, followed by a ripple-carry adder that produces ax.]
A basic array multiplier uses a one-sided CSA tree and a ripple-carry adder.

[Figure: Details of a 5 × 5 array multiplier using FA blocks. The bit products a_i x_j (i, j = 0 . . . 4) feed rows of full adders that produce the product bits p_0 through p_9.]

Array Multiplier Built of Modified Full-Adder Cells
[Figure: Design of a 5 × 5 array multiplier with two additive inputs and full-adder blocks that include AND gates. Multiplicand bits a_4 . . . a_0 and multiplier bits x_0 . . . x_4 feed the array of FA cells, which produces the product bits p_9 . . . p_0.]

Pipelined Array Multipliers
With latches after every FA level, the maximum throughput is achieved.
Latches may be inserted after every h FA levels for an intermediate design.
Example: 3-stage pipeline

[Figure: Pipelined 5 × 5 array multiplier using latched FA blocks, with inputs a_4 . . . a_0 and x_0 . . . x_4 and outputs p_9 . . . p_0. The legend distinguishes a latched FA with AND gate, a plain FA, and a latch.]

Bit-Serial Multipliers
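[Figure: Bit-serial adder (LSB first). Operand bits . . . x_2 x_1 x_0 and . . . y_2 y_1 y_0 enter a full adder (FA) paired with a flip-flop (FF); the sum bits . . . s_2 s_1 s_0 emerge serially.]

[Figure: Bit-serial multiplier. Operand bits . . . a_2 a_1 a_0 and . . . x_2 x_1 x_0 enter a box marked “?”, which outputs the product bits . . . p_2 p_1 p_0. (Must follow the k-bit inputs with k 0s; alternatively, view the product as being only k bits wide.)]

What goes inside the box to make a bit-serial multiplier?
Can the circuit be designed to support a high clock rate?
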
Semisystolic Serial-Parallel Multiplier
[Figure: Semi-systolic circuit for 4 × 4 multiplication in 8 clock cycles. The multiplicand bits a_3 a_2 a_1 a_0 are applied in parallel; the multiplier bits x_0 x_1 x_2 x_3 enter serially, LSB first; a row of four FA cells with sum and carry connections delivers the product serially.]

This is called “semisystolic” because it has a large signal fan-out of k (k-way broadcasting) and a long wire spanning all k positions.

Systolic Retiming as a Design Tool
A semisystolic circuit can be converted to a systolic circuit via retiming, which involves advancing and retarding signals by means of delay removal and delay insertion in such a way that the relative timings of various parts are unaffected.

[Figure: Example of retiming by delaying the inputs to C_L and advancing the outputs from C_L by d units. A cut separates C_L from C_R; the original delays e and f become e+d and f+d, and the original delays g and h become g-d and h-d.]

A First Attempt at Retiming

[Figure: A retimed version of our semisystolic multiplier. The original circuit (multiplicand a_3 . . . a_0 in parallel; multiplier x_0 x_1 x_2 x_3 serial in, LSB first; four FA cells with sum and carry; product serial out) is shown above the version retimed along Cut 1, Cut 2, and Cut 3.]

Deriving a Fully Systolic Multiplier

[Figure: A retimed version of our semisystolic multiplier. The original circuit (multiplicand a_3 . . . a_0 in parallel; multiplier x_0 x_1 x_2 x_3 serial in, LSB first; four FA cells with sum and carry; product serial out) is shown above its retimed counterpart.]
```
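
The right-shift and left-shift recurrences from the "Multiplication Recurrence" slide, together with the sign-bit subtraction used for a negative 2's-complement multiplier, can be sketched in Python as follows. This is a minimal software model under stated assumptions, not the lecture's hardware: the function names are hypothetical, and Python integers stand in for the double-width partial-product register.

```python
def multiply_right_shift(a: int, x: int, k: int) -> int:
    """Unsigned k-bit multiply: p(j+1) = (p(j) + x_j * a * 2^k) * 2^-1."""
    p = 0
    for j in range(k):
        x_j = (x >> j) & 1                # multiplier bits, LSB first
        p = (p + x_j * (a << k)) >> 1     # add to the upper half, then shift right
    return p


def multiply_left_shift(a: int, x: int, k: int) -> int:
    """Unsigned k-bit multiply: p(j+1) = 2 * p(j) + x_(k-j-1) * a."""
    p = 0
    for j in range(k):
        x_bit = (x >> (k - j - 1)) & 1    # multiplier bits, MSB first
        p = 2 * p + x_bit * a             # shift left, then add
    return p


def multiply_twos_complement(a: int, x: int, k: int) -> int:
    """Signed multiply for a, x in [-2^(k-1), 2^(k-1)) using right shifts.

    The term for the sign bit x_(k-1) is subtracted instead of added.
    """
    p = 0
    for j in range(k):
        x_j = (x >> j) & 1                # >> on a negative int yields 2's-complement bits
        term = -a if j == k - 1 else a    # the sign bit carries negative weight
        p = (p + x_j * term * 2**k) >> 1  # Python's >> is arithmetic, so the sign is kept
    return p


print(multiply_right_shift(0b1010, 0b1011, 4))   # a = 1010, x = 1011
print(multiply_left_shift(0b1010, 0b1011, 4))    # same operands, left shifts
print(multiply_twos_complement(-10, 11, 5))      # a = 10110, x = 01011
print(multiply_twos_complement(-10, -11, 5))     # a = 10110, x = 10101
```

The four calls use the operands of the worked examples in the slides, so the printed values can be compared against the final rows of those tables.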

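The recoding tables on the "Booth's Recoding" and "Modified Booth's Recoding" slides reduce to two small formulas: y_i = x_(i-1) - x_i for radix 2, and z_(i/2) = -2 x_(i+1) + x_i + x_(i-1) for radix 4, with x_(-1) = 0. The sketch below illustrates them; the function names are hypothetical, digits are returned least significant first, and k is assumed even in the radix-4 case.

```python
def booth_recode(x: int, k: int) -> list[int]:
    """Radix-2 Booth digits y_0 .. y_(k-1) of the k-bit pattern x."""
    digits = []
    prev = 0                              # x_(-1) = 0
    for i in range(k):
        bit = (x >> i) & 1
        digits.append(prev - bit)         # y_i = x_(i-1) - x_i
        prev = bit
    return digits


def booth_recode_radix4(x: int, k: int) -> list[int]:
    """Radix-4 Booth digits z_0 .. z_(k/2 - 1) of the k-bit pattern x (k even)."""
    digits = []
    prev = 0                              # x_(-1) = 0
    for i in range(0, k, 2):
        lo = (x >> i) & 1
        hi = (x >> (i + 1)) & 1
        digits.append(-2 * hi + lo + prev)  # z_(i/2) = -2 x_(i+1) + x_i + x_(i-1)
        prev = hi
    return digits


def digits_value(digits: list[int], radix: int) -> int:
    """Value of a signed-digit number given least significant digit first."""
    return sum(d * radix**i for i, d in enumerate(digits))


y = booth_recode(0b10101, 5)              # multiplier of the Booth example
z = booth_recode_radix4(0b1010, 4)        # multiplier of the modified Booth example
print(y, digits_value(y, 2))
print(z, digits_value(z, 4))
```

Because the recoding treats the top bit as a sign bit, the recoded digits evaluate to the 2's-complement value of the bit pattern, which is why no separate correction step is needed for a negative multiplier.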

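The full-tree multipliers above rest on carry-save addition: a CSA turns three operands into a sum word and a carry word without propagating carries, and only the last two words go through a carry-propagate adder. A minimal sketch of that idea follows; it is a software illustration under assumptions, not the Wallace or Dadda column-level schedules from the slides. The function names are hypothetical and operands are reduced three at a time as whole words.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3-to-2 reduction: returns (sum word, carry word) with s + cy == a + b + c."""
    s = a ^ b ^ c                                 # per-bit sum of a full adder
    cy = ((a & b) | (a & c) | (b & c)) << 1       # per-bit carry, moved one position left
    return s, cy


def tree_multiply(a: int, x: int, k: int) -> int:
    """Unsigned multiply: form all multiples, reduce with CSAs, finish with one adder."""
    # Multiple-forming step: one shifted copy of a per 1 bit of the multiplier.
    operands = [a << j for j in range(k) if (x >> j) & 1]
    # Reduction tree: each level replaces every group of three operands with two.
    while len(operands) > 2:
        groups = len(operands) // 3
        reduced = []
        for g in range(groups):
            s, cy = carry_save_add(*operands[3 * g:3 * g + 3])
            reduced += [s, cy]
        reduced += operands[3 * groups:]          # leftover operands pass through
        operands = reduced
    # Redundant-to-binary conversion: a single carry-propagate addition.
    return sum(operands)


print(tree_multiply(0b1010, 0b1011, 4))           # 4-bit operands from the earlier examples
```

The three stages in `tree_multiply` mirror the three elements named on the "Variations in Full-Tree Multiplier Design" slide: multiple-forming circuits, the reduction tree, and the redundant-to-binary converter.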