EI209-chapter3

EI 209
Computer Organization
Fall 2015
Chapter 3: Arithmetic for
Computers
Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/ )
[Adapted from Computer Organization and Design, 4th Edition,
Patterson & Hennessy, © 2012, MK]
EI 209 Chapter 3.1
CSE, 2015
Review: MIPS (RISC) Design Principles

Simplicity favors regularity
- fixed size instructions
- small number of instruction formats
- opcode always the first 6 bits

Smaller is faster
- limited instruction set
- limited number of registers in register file
- limited number of addressing modes

Make the common case fast
- arithmetic operands from the register file (load-store machine)
- allow instructions to contain immediate operands

Good design demands good compromises
- three instruction formats
Specifying Branch Destinations

Use a register (like in lw and sw) added to the 16-bit offset

which register? Instruction Address Register (the PC)
- its use is automatically implied by the instruction
- PC gets updated (PC+4) during the fetch cycle so that it holds the
address of the next instruction

limits the branch distance to -2^15 to +2^15-1 (word) instructions from
the (instruction after the) branch instruction, but most branches are
local anyway
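[Figure: branch destination address datapath. The 16-bit offset from the low order 16 bits of the branch instruction is sign-extended to 32 bits, shifted left by two bits (00 appended), and added to the updated PC (PC+4) to form the 32-bit branch destination address.]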
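The branch address calculation described above can be sketched in a few lines. This is only an illustration: the function names sign_extend16 and branch_target are hypothetical, and the PC is modelled as a plain 32-bit integer.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """Branch destination = (PC + 4) + (sign-extended offset << 2), kept to 32 bits.

    pc is the address of the branch instruction itself (an assumption of this
    sketch); the hardware adds the offset to the already updated PC.
    """
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF
```

With an offset of 3 the target lies three instructions past the instruction after the branch; an offset of 0xFFFF (that is, -1) branches back to the branch instruction itself.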
Other Control Flow Instructions

MIPS also has an unconditional branch instruction or
jump instruction:
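j label    #go to label

Instruction Format (J Format):
0x02 | 26-bit address

[Figure: jump destination address. The 26-bit address from the low order 26 bits of the jump instruction is shifted left by two bits (00 appended) and concatenated with the upper 4 bits of the PC to form the 32-bit jump address.]

Why shift left by two bits?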
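As a rough sketch of the jump address formation, assuming the upper 4 bits are taken from the already updated PC (PC+4) and using the hypothetical name jump_target:

```python
def jump_target(pc, address26):
    """Pseudo-direct address: upper 4 bits of (PC + 4) || 26-bit address || 00."""
    upper4 = (pc + 4) & 0xF0000000          # top 4 bits of the updated PC
    low28 = (address26 & 0x03FFFFFF) << 2   # word address -> byte address
    return upper4 | low28
```

Shifting left by two bits turns the word address stored in the instruction into a byte address, so 26 bits in the instruction cover a 28-bit byte range.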
Review: MIPS Addressing Modes Illustrated
1. Register addressing
   fields: op | rs | rt | rd | funct
   operand: word operand in a register
2. Base (displacement) addressing
   fields: op | rs | rt | offset
   operand: word or byte operand in memory, at base register + offset
3. Immediate addressing
   fields: op | rs | rt | operand
4. PC-relative addressing
   fields: op | rs | rt | offset
   operand: branch destination instruction in memory, at Program Counter (PC) + offset
5. Pseudo-direct addressing
   fields: op | jump address
   operand: jump destination instruction in memory, at Program Counter (PC) || jump address
Number Representations

32-bit signed numbers (2’s complement):
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = + 1ten
...
0111 1111 1111 1111 1111 1111 1111 1110two = + 2,147,483,646ten
0111 1111 1111 1111 1111 1111 1111 1111two = + 2,147,483,647ten (maxint)
1000 0000 0000 0000 0000 0000 0000 0000two = – 2,147,483,648ten (minint)
1000 0000 0000 0000 0000 0000 0000 0001two = – 2,147,483,647ten
...
1111 1111 1111 1111 1111 1111 1111 1110two = – 2ten
1111 1111 1111 1111 1111 1111 1111 1111two = – 1ten

The leftmost bit is the MSB (the sign bit); the rightmost bit is the LSB.

Converting <32-bit values into 32-bit values

copy the most significant bit (the sign bit) into the “empty” bits
0010 -> 0000 0010
1010 -> 1111 1010

sign extend versus zero extend (lb vs. lbu)
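The two kinds of extension can be sketched as follows. The helper names sign_extend and zero_extend are made up for this illustration, and values are modelled as non-negative Python integers holding the raw bit pattern.

```python
def sign_extend(value, from_bits, to_bits=32):
    """Copy the sign bit of a from_bits-wide value into the new upper bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):                        # sign bit set?
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def zero_extend(value, from_bits):
    """Fill the new upper bits with zeros (as lbu does for a byte)."""
    return value & ((1 << from_bits) - 1)
```

For the slide's examples, sign_extend(0b0010, 4, 8) gives 0000 0010 and sign_extend(0b1010, 4, 8) gives 1111 1010.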
MIPS Arithmetic Logic Unit (ALU)

Must support the Arithmetic/Logic operations of the ISA
- add, addi, addiu, addu
- sub, subu
- mult, multu, div, divu
- sqrt
- and, andi, nor, or, ori, xor, xori
- beq, bne, slt, slti, sltiu, sltu

[Figure: ALU block with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result output, and 1-bit zero and ovf outputs.]

With special handling for
- sign extend – addi, addiu, slti, sltiu
- zero extend – andi, ori, xori
- overflow detection – add, addi, sub
Dealing with Overflow

Overflow occurs when the result of an operation cannot
be represented in 32-bits, i.e., when the sign bit contains
a value bit of the result and not the proper sign bit


When adding operands with different signs or when subtracting
operands with the same sign, overflow can never occur
Operation | Operand A | Operand B | Result indicating overflow
A+B       | ≥0        | ≥0        | <0
A+B       | <0        | <0        | ≥0
A-B       | ≥0        | <0        | <0
A-B       | <0        | ≥0        | ≥0

MIPS signals overflow with an exception (aka interrupt) –
an unscheduled procedure call where the EPC contains
the address of the instruction that caused the exception
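The sign rules in the overflow table can be checked with a small sketch. It models only the detection, not the MIPS exception mechanism, and the names to_signed32, add_overflows and sub_overflows are hypothetical.

```python
def to_signed32(x):
    """Wrap an integer to 32 bits and read it as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def add_overflows(a, b):
    """A+B overflows when A and B have the same sign and the result's sign differs."""
    result = to_signed32(a + b)
    return (a >= 0) == (b >= 0) and (result >= 0) != (a >= 0)


def sub_overflows(a, b):
    """A-B overflows when A and B have different signs and the result's sign differs from A's."""
    result = to_signed32(a - b)
    return (a >= 0) != (b >= 0) and (result >= 0) != (a >= 0)
```

Both functions expect operands that are already valid signed 32-bit values.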
Addition & Subtraction

Just like in grade school (carry/borrow 1s)
   0111      0111      0110
 + 0110    - 0110    - 0101
   1101      0001      0001

Two's complement operations are easy
do subtraction by negating and then adding
   0111        0111
 - 0110      + 1010
   0001      1 0001

Overflow (result too large for finite computer word)
e.g., adding two n-bit numbers does not yield an n-bit number
   0111
 + 0001
   1000
Building a 1-bit Binary Adder
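[Figure: 1-bit full adder with inputs A, B and carry_in, and outputs S and carry_out.]

A | B | carry_in | carry_out | S
0 | 0 | 0        | 0         | 0
0 | 0 | 1        | 0         | 1
0 | 1 | 0        | 0         | 1
0 | 1 | 1        | 1         | 0
1 | 0 | 0        | 0         | 1
1 | 0 | 1        | 1         | 0
1 | 1 | 0        | 1         | 0
1 | 1 | 1        | 1         | 1
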
S = A xor B xor carry_in
carry_out = A&B | A&carry_in | B&carry_in
(majority function)

How can we use it to build a 32-bit adder?

How can we modify it easily to build an adder/subtractor?
Building 32-bit Adder
Just connect the carry-out of the least significant bit FA to the
carry-in of the next least significant bit and connect . . .

Ripple Carry Adder (RCA)
- advantage: simple logic, so small (low cost)
- disadvantage: slow and lots of glitching (so lots of energy consumption)

[Figure: chain of 32 1-bit full adders (FA). Each FA takes Ai, Bi and the carry ci and produces Si and the next carry; c0 = carry_in and c32 = carry_out.]
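A minimal software model of the full adder equations and of the ripple carry chain, assuming unsigned bit patterns held in Python integers (full_adder and ripple_carry_add are names chosen for this sketch):

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: S = A xor B xor carry_in, carry_out = majority(A, B, carry_in)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def ripple_carry_add(a, b, carry_in=0, width=32):
    """Chain `width` full adders; each carry_out feeds the next carry_in."""
    result = 0
    carry = carry_in
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry
```

The loop makes the weakness of the design visible: bit i cannot be computed until the carry from bit i-1 is known, so the delay grows with the width.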
A 32-bit Ripple Carry Adder/Subtractor
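Remember 2's complement is just
- complement all the bits
- add a 1 in the least significant bit

control (0=add, 1=sub) selects the B input of each full adder:
B0 if control = 0
!B0 if control = 1
The same control signal drives c0 = carry_in, which adds the 1 in the
least significant bit.

   A   0111        0111
   B - 0110      + 1001
                 +    1
       0001      1 0001

[Figure: 32-bit ripple carry adder/subtractor. The add/sub control conditionally inverts B0..B31 before they enter the chain of 1-bit full adders and also feeds c0 = carry_in; the outputs are S0..S31 and c32 = carry_out.]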
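The adder/subtractor idea can be sketched without modelling individual gates. Here add_sub is a hypothetical name, and the conditional inversion of B is done with one XOR over the whole word.

```python
def add_sub(a, b, control, width=32):
    """control = 0 computes A + B; control = 1 computes A + !B + 1 = A - B."""
    mask = (1 << width) - 1
    b_in = (b ^ (mask if control else 0)) & mask   # B or !B, bit by bit
    total = (a & mask) + b_in + control            # control also feeds c0
    return total & mask, total >> width            # (result, carry_out)
```

For the 4-bit example, add_sub(0b0111, 0b0110, 1, width=4) computes 0111 + 1001 + 1 and returns the result 0001 with a carry out of 1.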
Overflow Detection Logic

Carry into MSB != Carry out of MSB
For an N-bit ALU: Overflow = CarryIn[N-1] XOR CarryOut[N-1]

X | Y | X XOR Y
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 0

why?

[Figure: 4-bit ALU built from four 1-bit ALUs with inputs A0..A3 and B0..B3 and outputs Result0..Result3. Each CarryOut feeds the next CarryIn, and Overflow is the XOR of CarryIn3 and CarryOut3.]
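One way to convince yourself of the XOR rule is to compute both carries directly and compare with the signed range. The sketch below does this for addition only; add_with_overflow is a hypothetical helper and the default width of 4 is chosen to match the figure.

```python
def add_with_overflow(a, b, width=4):
    """Add two width-bit patterns; overflow = carry into MSB XOR carry out of MSB."""
    mask = (1 << width) - 1
    low_mask = mask >> 1                                    # all bits below the MSB
    carry_in_msb = ((a & low_mask) + (b & low_mask)) >> (width - 1)
    total = (a & mask) + (b & mask)
    carry_out_msb = total >> width
    return total & mask, carry_in_msb ^ carry_out_msb
```

For 0111 + 0001 the carry into the MSB is 1 and the carry out is 0, so overflow is signalled; for 1111 + 0001 both carries are 1 and there is no overflow.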
Multiply

Binary multiplication is just a bunch of right shifts and
adds
[Figure: an n-bit multiplicand times an n-bit multiplier forms a partial product array of n rows, which is summed into a double precision (2n-bit) product.]

The partial products can be formed in parallel and added in parallel for
faster multiplication
Multiplication

More complicated than addition

Can be accomplished via shifting and adding
        0010   (multiplicand)
     x  1011   (multiplier)
        0010
       0010    (partial product
      0000      array)
     0010
    00010110   (product)

In every step
• multiplicand is shifted
• next bit of multiplier is examined (also a shifting step)
• if this bit is 1, shifted multiplicand is added to the product
Multiplication Algorithm 1
In every step
• multiplicand is shifted
• next bit of multiplier is examined (also a shifting step)
• if this bit is 1, shifted multiplicand is added to the product
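The three steps above can be sketched as a loop. This assumes unsigned operands and that the multiplicand is shifted left while the multiplier is shifted right, as in the first textbook version of the hardware; multiply_v1 is a name made up for this sketch.

```python
def multiply_v1(multiplicand, multiplier, width=32):
    """Shift-and-add multiplication of two unsigned width-bit numbers."""
    product = 0
    for _ in range(width):
        if multiplier & 1:           # examine the next bit of the multiplier
            product += multiplicand  # add the shifted multiplicand
        multiplicand <<= 1           # shift the multiplicand left
        multiplier >>= 1             # shift the multiplier right
    return product & ((1 << (2 * width)) - 1)
```

Calling multiply_v1(0b0010, 0b1011, width=4) reproduces the worked example and yields 00010110.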
Comments on Multiplication Algorithm 1

Performance
- Three basic steps for each bit
- If each step took a clock cycle, it requires almost 100 clock cycles
  (3 steps x 32 bits) to multiply two 32-bit numbers
How to improve it?
- Motivation: perform the operations in parallel
- Put the multiplier and the product together
- Shift them together
Refined Multiplication Algorithm 2
[Figure: the multiplicand register feeds a 32-bit ALU (add). The ALU result is written into the left half of a combined product/multiplier register, which shifts right under Control.]
• the ALU is only 32 bits wide and the multiplicand is untouched (never shifted)
• the sum keeps shifting right
• at every step, number of bits in product + multiplier = 64,
hence, they share a single 64-bit register
Add and Right Shift Multiplier Hardware
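[Figure: the multiplicand register feeds a 32-bit ALU (add); the combined product/multiplier register shifts right under Control.]

Example: multiplicand 0110 (=6), multiplier 0101 (=5)

step                          | product | multiplier
initial                       | 0000    | 0101
add (multiplier bit = 1)      | 0110    | 0101
shift right                   | 0011    | 0010
add (multiplier bit = 0)      | 0011    | 0010
shift right                   | 0001    | 1001
add (multiplier bit = 1)      | 0111    | 1001
shift right                   | 0011    | 1100
add (multiplier bit = 0)      | 0011    | 1100
shift right                   | 0001    | 1110

Result: 0001 1110 = 30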
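The shared-register scheme can be modelled with a single integer whose right half starts as the multiplier. This is a sketch for unsigned operands; multiply_v2 is a hypothetical name, and the carry out of the upper-half addition is simply kept in the Python integer until the shift brings it back into range.

```python
def multiply_v2(multiplicand, multiplier, width=4):
    """Add-and-shift-right multiplication using one combined 2*width-bit register."""
    mask = (1 << width) - 1
    reg = multiplier & mask          # left half: product (0), right half: multiplier
    for _ in range(width):
        if reg & 1:                  # current multiplier bit
            reg += (multiplicand & mask) << width   # add into the left half
        reg >>= 1                    # shift product and multiplier right together
    return reg
```

Tracing multiply_v2(0b0110, 0b0101) by hand gives the register contents 0110 0101, 0011 0010, 0001 1001, 0111 1001, 0011 1100 and finally 0001 1110.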
Exercise

Using 4-bit numbers to save space, multiply 2ten*3ten, or
0010two * 0011two
Division

Division is just a bunch of quotient digit guesses and left
shifts and subtracts
dividend = quotient x divisor + remainder
[Figure: an n-bit divisor is repeatedly subtracted from the dividend, producing a partial remainder array; the outputs are an n-bit quotient and an n-bit remainder.]
Division
                     1001ten     Quotient
Divisor 1000ten | 1001010ten     Dividend
                 -1000
                     10
                     101
                     1010
                    -1000
                       10ten     Remainder
At every step,
• shift divisor right and compare it with current dividend
• if divisor is larger, shift 0 as the next bit of the quotient
• if divisor is smaller or equal, subtract to get new dividend and shift 1
as the next bit of the quotient
First Version of Hardware for Division
A comparison requires a subtract; the sign of the result is
examined; if the result is negative, the divisor must be added back
Divide Algorithm
Start
1. Subtract the Divisor register from the Remainder register, and place
the result in the Remainder register.
Test Remainder:
2a. Remainder >= 0: Shift the Quotient register to the left, setting the
new rightmost bit to 1.
2b. Remainder < 0: Restore the original value by adding the Divisor
register to the Remainder register and place the sum in the Remainder
register. Also shift the Quotient register to the left, setting the new
LSB to 0.
3. Shift the Divisor register right 1 bit.
33rd repetition?
No (< 33 repetitions): go back to step 1
Yes (33 repetitions): Done
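The flowchart can be written as a short loop. This is a sketch for unsigned operands with a non-zero divisor; divide_restoring is a hypothetical name, and the width parameter stands in for the 32 of the real hardware so that small examples can be traced.

```python
def divide_restoring(dividend, divisor, width=32):
    """Restoring division following steps 1, 2a/2b and 3, repeated width + 1 times."""
    remainder = dividend          # 2*width-bit Remainder register
    div = divisor << width        # Divisor starts in the left half of its register
    quotient = 0
    for _ in range(width + 1):
        remainder -= div                      # step 1
        if remainder >= 0:
            quotient = (quotient << 1) | 1    # step 2a
        else:
            remainder += div                  # step 2b: restore
            quotient = quotient << 1
        div >>= 1                             # step 3
    return quotient & ((1 << width) - 1), remainder
```

With width=4, divide_restoring(7, 2, width=4) runs 5 repetitions and returns quotient 3 and remainder 1.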
Divide Example
• Divide 7ten (0000 0111two) by 2ten (0010two)
Iter | Step                            | Quot | Divisor   | Remainder
0    | Initial values                  | 0000 | 0010 0000 | 0000 0111
1    | Rem = Rem – Div                 | 0000 | 0010 0000 | 1110 0111
     | Rem < 0 => +Div, shift 0 into Q | 0000 | 0010 0000 | 0000 0111
     | Shift Div right                 | 0000 | 0001 0000 | 0000 0111
2    | Same steps as 1                 | 0000 | 0001 0000 | 1111 0111
     |                                 | 0000 | 0001 0000 | 0000 0111
     |                                 | 0000 | 0000 1000 | 0000 0111
3    | Same steps as 1                 | 0000 | 0000 0100 | 0000 0111
4    | Rem = Rem – Div                 | 0000 | 0000 0100 | 0000 0011
     | Rem >= 0 => shift 1 into Q      | 0001 | 0000 0100 | 0000 0011
     | Shift Div right                 | 0001 | 0000 0010 | 0000 0011
5    | Same steps as 4                 | 0011 | 0000 0001 | 0000 0001
Efficient Division
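[Figure: first version of the division hardware: a 64-bit Divisor register (shift right), a 64-bit ALU, a 64-bit Remainder register (write) and a 32-bit Quotient register (shift left), all under Control.]

[Figure: efficient version: the divisor feeds a 32-bit ALU (subtract); the dividend/remainder and the quotient share one register that shifts left under Control.]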
Left Shift and Subtract Division Hardware
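[Figure: the divisor feeds a 32-bit ALU (subtract); the dividend/remainder and the quotient share one register that shifts left under Control.]

Example: dividend 0110 (=6), divisor 0010 (=2)

step                              | left half (remainder) | right half (dividend, then quotient)
initial                           | 0000                  | 0110
shift left                        | 0000                  | 1100
sub: rem neg, so quotient bit = 0 | 1110                  | 1100
restore remainder                 | 0000                  | 1100
shift left                        | 0001                  | 1000
sub: rem neg, so quotient bit = 0 | 1111                  | 1000
restore remainder                 | 0001                  | 1000
shift left                        | 0011                  | 0000
sub: rem pos, so quotient bit = 1 | 0001                  | 0001
shift left                        | 0010                  | 0010
sub: rem pos, so quotient bit = 1 | 0000                  | 0011

Result: 0011 = 3 with 0 remainder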
Restoring Unsigned Integer Division
s(0) = z
for j = 1 to k
    if 2 s(j-1) - 2^k d >= 0
        q_(k-j) = 1
        s(j) = 2 s(j-1) - 2^k d
    else
        q_(k-j) = 0
        s(j) = 2 s(j-1)

Notes:
- 2 s(j-1) is the remainder shifted left by 1 bit
- k = 32; 2^k d is the divisor placed in the left 32 bits of the register
- in the case of R - D >= 0, there is no need to restore the remainder
- in the case of R - D < 0, restore the remainder
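The recurrence translates almost line for line into code. In this sketch z is the dividend, d the (non-zero) divisor and k the word width, all unsigned; restoring_divide is a hypothetical name, and the final remainder is read from the left half of s(k).

```python
def restoring_divide(z, d, k=32):
    """Restoring division using s(j) = 2 s(j-1) - q_(k-j) * 2^k * d."""
    s = z
    q = 0
    for j in range(1, k + 1):
        trial = 2 * s - (d << k)     # shifted remainder minus aligned divisor
        if trial >= 0:
            q |= 1 << (k - j)        # q_(k-j) = 1
            s = trial
        else:
            s = 2 * s                # q_(k-j) = 0, keep the shifted remainder
    return q, s >> k                 # remainder sits in the left half of s(k)
```

For z = 7, d = 2 and k = 4 the loop sets q_1 and q_0, giving quotient 3 and remainder 1.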
Non-Restoring Unsigned Integer Division
s(1) = 2 z - 2^k d
for j = 2 to k
    if s(j-1) >= 0
        q_(k-(j-1)) = 1
        s(j) = 2 s(j-1) - 2^k d
    else
        q_(k-(j-1)) = 0
        s(j) = 2 s(j-1) + 2^k d
end for
if s(k) >= 0
    q_0 = 1
else
    q_0 = 0
    Correction step: s(k) = s(k) + 2^k d

Notes:
- If in the last step remainder – divisor >= 0, perform subtraction
- If in the last step remainder – divisor < 0, perform addition (why?)
Why do restoring and non-restoring unsigned integer division give equal results?

The key identity is 2x - y = 2(x - y) + y.

Consider two consecutive steps j-1 and j, in particular the case
2 s(j-2) - 2^k d < 0

Restoring algorithm:
In step j-1 it computes
    q_(k-(j-1)) = 0
    s(j-1) = 2 s(j-2)
In the subsequent step j it computes
    2 s(j-1) - 2^k d
    = 2*2 s(j-2) - 2^k d

Non-restoring algorithm:
In step j-1 it computes
    s(j-1) = 2 s(j-2) - 2^k d
In the subsequent step j it computes
    2 s(j-1) + 2^k d
    = 2*2 s(j-2) - 2*2^k d + 2^k d
    = 2*2 s(j-2) - 2^k d

Both algorithms therefore work with the same value in step j.
Non-restoring algorithm
set subtract_bit true
1: If subtract_bit is true:
     Subtract the Divisor register from the Remainder and place the result
     in the Remainder register
   else
     Add the Divisor register to the Remainder and place the result in the
     Remainder register
2: If Remainder >= 0
     Shift the Quotient register to the left, setting rightmost bit to 1
     Set subtract_bit to true
   else
     Shift the Quotient register to the left, setting rightmost bit to 0
     Set subtract_bit to false
3: If < 33rd rep
     Shift the Divisor register right 1 bit
     goto 1
   else
     If Remainder < 0, add the Divisor register to the Remainder and place
     the result in the Remainder register (correction step)
     exit
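A sketch of the non-restoring procedure for unsigned operands and a non-zero divisor follows. The name divide_nonrestoring is hypothetical. Two details are assumptions of this sketch rather than something the slide spells out: the divisor is not shifted after the last repetition, and the final correction is applied only when the remainder ends up negative.

```python
def divide_nonrestoring(dividend, divisor, width=4):
    """Non-restoring division: add or subtract depending on the previous sign."""
    remainder = dividend
    div = divisor << width            # divisor starts in the left half
    quotient = 0
    subtract = True
    for rep in range(width + 1):      # n + 1 iterations for n bits
        remainder = remainder - div if subtract else remainder + div
        if remainder >= 0:
            quotient = (quotient << 1) | 1
            subtract = True
        else:
            quotient = quotient << 1
            subtract = False
        if rep < width:
            div >>= 1                 # no shift after the final repetition
    if remainder < 0:
        remainder += div              # correction step
    return quotient & ((1 << width) - 1), remainder
```

For 11 divided by 3 the five iterations perform subtract, add, add, add, subtract and end with quotient 0011 and remainder 0010.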
Example:
Perform n + 1 iterations for n bits
Remainder 0000 1011
Divisor 0011 0000
----------------------------------
Iteration 1: (subtract)
Rem 1101 1011
Quotient 0
Divisor 0001 1000
----------------------------------
Iteration 2: (add)
Rem 1111 0011
Q 00
Divisor 0000 1100
----------------------------------
Iteration 3: (add)
Rem 1111 1111
Q 000
Divisor 0000 0110
----------------------------------
Iteration 4: (add)
Rem 0000 0101
Q 0001
Divisor 0000 0011
----------------------------------
Iteration 5: (subtract)
Rem 0000 0010
Q 00011
Divisor 0000 0001
Since remainder is positive, done.
Q = 0011 and Rem = 0010
Exercise
Calculate A divided by B using restoring and non-restoring
division. A=26, B=5
MIPS Divide Instruction

Divide (div and divu) generates the remainder in hi
and the quotient in lo
div $s0, $s1    # lo = $s0 / $s1
                # hi = $s0 mod $s1
Instruction encoding: op = 0 | rs = 16 | rt = 17 | rd = 0 | shamt = 0 | funct = 0x1A

Instructions mfhi rd and mflo rd are provided to move
the remainder and quotient to (user accessible) registers in the
register file
As with multiply, divide ignores overflow so software
must determine if the quotient is too large. Software
must also check the divisor to avoid division by 0.