Computer Architecture
Lecture 9 - Floating Point
2011
Reading: 3.6-3.9
Homework: 3.30, 3.35, 3.37, 3.38, 3.40, 3.44
Why did the Ariane 5 Explode?
(image source: java.sun.com)
Outline - Floating Point
 Motivation and Key Ideas
 IEEE 754 Floating Point Format
 Floating Point Arithmetic
 MIPS Floating Point Instructions
 Summary
Floating Point - Motivation
 Review: n-bit integer representations
     Unsigned: 0 to 2^n - 1
     Signed Two’s Complement: -2^(n-1) to 2^(n-1) - 1
     Biased (excess-b): -b to 2^n - 1 - b
 Problem: how do we represent:
     Very large numbers: 9,345,524,282,135,672 or 2^354
     Very small numbers: 0.00000000000000005216 or 2^-100
     Rational numbers: 2/3
     Irrational numbers: sqrt(2)
     Transcendental numbers: e, π
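The three integer ranges reviewed above can be checked with a minimal Python sketch. The helper name `integer_ranges` is hypothetical. Note that the largest biased value is 2^n - 1 - b, because the largest stored bit pattern is 2^n - 1 and the bias b is subtracted from it.

```python
def integer_ranges(n: int, b: int) -> dict:
    """Return (min, max) for three n-bit integer encodings."""
    return {
        "unsigned": (0, 2**n - 1),
        "twos_complement": (-(2 ** (n - 1)), 2 ** (n - 1) - 1),
        # Stored patterns 0 .. 2^n - 1 each have the bias b subtracted.
        "biased": (-b, 2**n - 1 - b),
    }


ranges = integer_ranges(8, 127)
for name, (low, high) in ranges.items():
    print(f"{name}: {low} to {high}")
```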
Fixed Point Representation
 Idea: fixed-point numbers with fractions
     Decimal point (binary point) marks start of fraction
     Decimal: 1.2503 = 1 × 10^0 + 2 × 10^-1 + 5 × 10^-2 + 3 × 10^-4
     Binary: 1.0100001 = 1 × 2^0 + 1 × 2^-2 + 1 × 2^-7
 Problems
     Limited locations for the “decimal point” (“binary point”)
     Won’t work for very small or very large numbers
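A short Python sketch of the positional expansion used in the decimal and binary examples above. The helper name `fixed_point_value` is an assumption for illustration, not part of the lecture.

```python
def fixed_point_value(digits: str, base: int = 2) -> float:
    """Evaluate a fixed-point literal such as '1.0100001' digit by digit."""
    whole, _, frac = digits.partition(".")
    value = 0.0
    # Digits left of the point carry weights base^0, base^1, ...
    for power, digit in enumerate(reversed(whole)):
        value += int(digit, base) * base**power
    # Digits right of the point carry weights base^-1, base^-2, ...
    for position, digit in enumerate(frac, start=1):
        value += int(digit, base) * base**-position
    return value


print(fixed_point_value("1.0100001"))        # 1 + 2^-2 + 2^-7
print(fixed_point_value("1.2503", base=10))  # approximately 1.2503
```

The binary result is exact because every term is a power of two. The decimal result is only approximate, since Python stores it as a binary floating-point number.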
Another Approach: Scientific Notation
 Represent a number as a combination of
     Mantissa (significand): normalized number
     Exponent (base 10)
 Example: 6.02 × 10^23
     Significand (mantissa): 6.02
     Radix (base): 10
     Exponent: 23
Floating Point
 Key idea: adapt scientific notation to binary
     Fixed-width binary number for significand
     Fixed-width binary number for exponent (base 2)
 Idea: represent a number as 1.xxxxxxx_two × 2^yyyy
     Significand (mantissa): 1.xxxxxxx, with the leading ‘1’ implicit
     Radix: 2
     Exponent: yyyy
Important Points:
This is a tradeoff between precision and range
Arithmetic is approximate - error is inevitable!
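Both points can be observed directly. The sketch below assumes Python's `float` is an IEEE 754 double, which holds on typical platforms. The variable names are illustrative only.

```python
import sys

# Precision: 0.1, 0.2 and 0.3 have no finite binary expansion, so each is
# rounded when stored, and the rounded sum differs from the rounded 0.3.
sum_is_exact = (0.1 + 0.2 == 0.3)

# Range versus precision: above 2^53 a double can no longer represent
# every integer, so adding 1 is lost to rounding.
big = 2.0**53
absorbed = (big + 1.0 == big)

print(sum_is_exact)        # False
print(absorbed)            # True
print(sys.float_info.max)  # largest finite double
```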
IEEE 754 Floating Point
 Single precision (C/C++/Java float type)
    Field:   S        E Exponent    F Significand
    Width:   1 bit    8 bits        23 bits
    Value N = (-1)^S × 1.F × 2^(E-127)     (bias = 127)
 Double precision (C/C++/Java double type)
    Field:   S        E Exponent    F Significand
    Width:   1 bit    11 bits       52 bits (20 bits in the first 32-bit word, 32 bits in the second)
    Value N = (-1)^S × 1.F × 2^(E-1023)    (bias = 1023)
 Bias
    The exponent represents both positive and negative values. To do this, a bias is added to the
    actual exponent to get the stored exponent. For IEEE single precision the bias is 127, so a stored
    value of 127 represents an exponent of 0. Thus a stored value of 200 indicates an exponent of
    (200 - 127) = 73, an exponent of 3 is stored as 130, and an exponent of -2 is stored as
    (-2 + 127) = 125. For double precision the bias is 1023.
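A minimal Python sketch of the single-precision layout, using the standard `struct` module to reach the raw bits. The helper names `float32_fields` and `float32_value` are hypothetical, and the value formula is applied to normalized numbers only (special cases such as zero, infinity and NaN are ignored).

```python
import struct


def float32_fields(x: float) -> tuple:
    """Split x, rounded to single precision, into (S, E, F)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31
    exponent = (word >> 23) & 0xFF  # stored (biased) exponent, 8 bits
    fraction = word & 0x7FFFFF      # 23 fraction bits, leading 1 not stored
    return sign, exponent, fraction


def float32_value(sign: int, exponent: int, fraction: int) -> float:
    """Apply N = (-1)^S x 1.F x 2^(E-127) to a normalized number."""
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


s, e, f = float32_fields(1.0)
print(s, e, f)  # 1.0 = 1.0 x 2^0, so the stored exponent is 127
print(float32_value(s, e, f))
```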
Floating Point Examples
 8.75_ten = 1 × 2^3 + 1 × 2^-1 + 1 × 2^-2 = 1000.11_two = 1.00011_two × 2^3
 Single Precision:
    • Significand: 1.00011000… (note leading 1 is implied)
    • Exponent: 3 + 127 = 130 = 10000010_two
    S    E Exponent    F Significand
    0    10000010      0001100 00000000 00000000
 Double Precision:
    • Significand: 1.00011000…
    • Exponent: 3 + 1023 = 1026 = 10000000010_two
    S    E Exponent     F Significand (52 bits total)
    0    10000000010    0001 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
Floating Point Examples
 -0.375_ten = -(1 × 2^-2 + 1 × 2^-3) = -0.011_two = -1.1_two × 2^-2
 Single Precision:
    • Significand: 1.1000…
    • Exponent: -2 + 127 = 125 = 01111101_two
    S    E Exponent    F Significand
    1    01111101      1000000 00000000 00000000
 Double Precision:
    • Significand: 1.1000…
    • Exponent: -2 + 1023 = 1021 = 01111111101_two
    S    E Exponent     F Significand (52 bits total)
    1    01111111101    1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
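The two encoding examples (8.75 and -0.375) can be reproduced by assembling the fields by hand and comparing against what `struct` produces. This is a sketch: the helper `encode` and its parameter names are assumptions, and the bias is computed as 2^(w-1) - 1 for a w-bit exponent, which gives 127 and 1023.

```python
import struct


def encode(sign: int, true_exponent: int, fraction_bits: str,
           exp_width: int, frac_width: int) -> int:
    """Assemble S | E | F from the sign, unbiased exponent, and leading fraction bits."""
    bias = 2 ** (exp_width - 1) - 1                  # 127 or 1023
    stored = true_exponent + bias                    # stored (biased) exponent
    fraction = int(fraction_bits.ljust(frac_width, "0"), 2)  # pad with zeros
    return (sign << (exp_width + frac_width)) | (stored << frac_width) | fraction


single_a = encode(0, 3, "00011", 8, 23)    # 8.75 = 1.00011_two x 2^3
double_a = encode(0, 3, "00011", 11, 52)
single_b = encode(1, -2, "1", 8, 23)       # -0.375 = -1.1_two x 2^-2
double_b = encode(1, -2, "1", 11, 52)

print(f"{single_a:032b}")
print(f"{single_b:032b}")
# Cross-check: reinterpret the assembled words as floating-point numbers.
print(struct.unpack(">f", single_a.to_bytes(4, "big"))[0])
print(struct.unpack(">d", double_b.to_bytes(8, "big"))[0])
```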
Floating Point Examples
 Q: What is the value of the following single-precision word?
    S    E Exponent    F Significand
    0    00001000      1001000 10101000 00000000
 Significand = 1 + 2^-1 + 2^-4 + 2^-8 + 2^-10 + 2^-12
 Exponent = 8 - 127 = -119
 Final Result = (1 + 2^-1 + 2^-4 + 2^-8 + 2^-10 + 2^-12) × 2^-119
                = 2.36 × 10^-36
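The decoding steps above can be sketched in Python. The word is written as a binary literal with underscores separating S, E and F, and the result is cross-checked against `struct`. Variable names are illustrative.

```python
import struct

word = 0b0_00001000_10010001010100000000000  # S | E | F from the example

sign = word >> 31
stored_exponent = (word >> 23) & 0xFF
fraction = word & 0x7FFFFF

significand = 1 + fraction / 2**23            # restore the implicit leading 1
value = (-1) ** sign * significand * 2.0 ** (stored_exponent - 127)

# Cross-check: let the machine interpret the same 32 bits as a float.
(check,) = struct.unpack(">f", word.to_bytes(4, "big"))

print(stored_exponent - 127)  # true exponent
print(significand)
print(value, check)
```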
Outline - Floating Point
 Motivation and Key Ideas
 IEEE 754 Floating Point Format
 Floating Point Arithmetic
 MIPS Floating Point Instructions
 Rounding & Errors
 Summary
Floating Point Addition
Add: 9.999 × 10^1 to 1.610 × 10^-1 (assume only 4 digits can be represented)
Step 1: Align the smaller number's exponent with the larger one:
    1.610 × 10^-1 = 0.01610 × 10^1
    With only 4 digits this becomes 0.016 × 10^1
Step 2: Add the significands:
      9.999
    + 0.016
    -------
     10.015   (not in normalized scientific notation)
Step 3: Normalize: 10.015 × 10^1 = 1.0015 × 10^2
Step 4: Round to 4 digits: 1.002 × 10^2
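The decimal example can be checked with Python's standard `decimal` module, limited to 4 significant digits. This is only a sketch of the idea: `decimal` computes the exact sum and rounds once at the end, whereas the slide drops digits from the smaller operand before adding. For these operands both routes arrive at 1.002 × 10^2.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

four_digits = Context(prec=4, rounding=ROUND_HALF_EVEN)

a = Decimal("9.999E+1")
b = Decimal("1.610E-1")

exact = a + b                    # default context (28 digits): no rounding needed
rounded = four_digits.add(a, b)  # same sum, kept to 4 significant digits

print(exact)
print(rounded)
```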
Floating Point Addition (Fig. 3.16)
1. Align binary point to number with larger exponent
2. Add significands
3. Normalize result and adjust exponent
4. If overflow/underflow, throw exception
5. Round result (go to 3 if normalization needed again)
Example: A + B
                       Aligned          Decimal
      A    1.11 × 2^0      1.11 × 2^0      1.75
    + B    1.00 × 2^-2   + 0.01 × 2^0    + 0.25
                         ------------    ------
                          10.00 × 2^0      2.00
           (Normalize)     1.00 × 2^1      2.00
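Steps 1 to 3 of the addition algorithm can be sketched with integer significands. This is a simplified model under stated assumptions, not the hardware datapath: operands are positive and normalized, bits shifted out are truncated rather than rounded, and the overflow/underflow check is omitted. The names `fp_add` and `to_float` are hypothetical. A significand such as 1.11_two with 2 fraction bits is held as the integer 0b111.

```python
def fp_add(sig_a: int, exp_a: int, sig_b: int, exp_b: int,
           frac_bits: int = 2) -> tuple:
    """Add two positive values given as (integer significand, exponent)."""
    # Step 1: align the operand with the smaller exponent to the larger one.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_b >>= exp_a - exp_b  # shifted-out bits are simply dropped
    # Step 2: add the significands.
    total = sig_a + sig_b
    exponent = exp_a
    # Step 3: normalize so the significand is back in the range [1, 2).
    while total >= (2 << frac_bits):
        total >>= 1
        exponent += 1
    return total, exponent


def to_float(sig: int, exp: int, frac_bits: int = 2) -> float:
    """Convert (integer significand, exponent) to an ordinary number."""
    return sig / (1 << frac_bits) * 2.0**exp


sig, exp = fp_add(0b111, 0, 0b100, -2)  # 1.11 x 2^0 + 1.00 x 2^-2
print(f"{sig:03b}", exp, to_float(sig, exp))
```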
Floating Point Multiplication (Fig. 3.18)
1. Add the 2 biased exponents together to get the new exponent (the bias is now counted twice, so subtract 127 to get the proper biased value)
2. Multiply significands
3. Normalize result if necessary (shift right) & adjust exponent
4. If overflow/underflow, throw exception
5. Round result (go to 3 if normalization needed again)
6. Set sign of result using the signs of X, Y (negative if the signs differ)
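The multiplication steps can be sketched on single-precision fields. This is a simplified model, not the hardware algorithm in full: the inputs are assumed normalized, low product bits are truncated instead of rounded, and overflow/underflow is not checked. The names `fp_mul` and `fields_to_float` are hypothetical. Significands are integers with the leading 1 included and 23 fraction bits.

```python
BIAS = 127
FRAC_BITS = 23


def fp_mul(sign_x: int, exp_x: int, sig_x: int,
           sign_y: int, exp_y: int, sig_y: int) -> tuple:
    """Multiply two numbers given as (sign, biased exponent, significand)."""
    exponent = exp_x + exp_y - BIAS  # step 1: remove the doubled bias
    product = sig_x * sig_y          # step 2: result has 46 fraction bits
    product >>= FRAC_BITS            # keep 23 fraction bits (truncation)
    if product >= (2 << FRAC_BITS):  # step 3: product lies in [1, 4)
        product >>= 1
        exponent += 1
    sign = sign_x ^ sign_y           # step 6: negative if the signs differ
    return sign, exponent, product


def fields_to_float(sign: int, exponent: int, sig: int) -> float:
    return (-1) ** sign * sig / 2**FRAC_BITS * 2.0 ** (exponent - BIAS)


# 8.75 = 1.00011_two x 2^3 and -0.375 = -1.1_two x 2^-2
x = (0, 130, 0x8C0000)
y = (1, 125, 0xC00000)
result = fp_mul(*x, *y)
print(result, fields_to_float(*result))
```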
MIPS Floating Point Instructions
 Organized as a coprocessor
     Separate registers $f0-$f31
     Separate operations
     Separate data transfer (to same memory)
 Basic operations
     add.s - single     add.d - double
     sub.s - single     sub.d - double
     mul.s - single     mul.d - double
     div.s - single     div.d - double
MIPS Floating Point Instructions (cont’d)
 Data transfer
     lwc1, swc1 (l.s, s.s) - load/store float to fp reg
     l.d, s.d - load/store double to fp reg pair
 Testing / branching
     c.lt.s, c.lt.d, c.eq.s, c.eq.d, … - compare and set condition bit if true
     bc1t - branch if condition true
     bc1f - branch if condition false
Addendum Why Did the Ariane 5 Explode?
 In 1996 Ariane 5 Flight 501 exploded after launch.
 Estimated cost of accident: $500 million
 The cause was traced to the Inertial Reference System (SRI).
 Both the main and backup SRI failed.
 Both units failed due to an out-of-range conversion
     Input: double precision floating point
     Output: 16-bit integer for “horizontal bias” (BH)
 Careful analysis during software design had indicated that BH would “fit” in 16 bits
 So, why didn’t it fit?
 BUT, all analysis had been done for the Ariane 4, the predecessor of Ariane 5 - software was reused
 Since Ariane 5 was a larger rocket, the values for BH were higher than anticipated
 AND, there was no handler to deal with the exception!
 For more information:
 http://www.ima.umn.edu/~arnold/disasters/ariane.html
 Or, Google “Ariane 5 Flight 501”
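The failure mode, a floating-point value too large for a 16-bit signed integer with nothing in place to handle it, can be illustrated with a small Python sketch. This is not the flight software: the function names and the sample values are invented for illustration. It contrasts a conversion that raises an error the caller must handle with one that saturates at the 16-bit limits.

```python
INT16_MIN, INT16_MAX = -(2**15), 2**15 - 1


def to_int16_checked(x: float) -> int:
    """Convert to a 16-bit signed integer, raising if the value does not fit."""
    n = int(x)  # truncates toward zero
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in 16 bits")
    return n


def to_int16_saturating(x: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values."""
    return max(INT16_MIN, min(INT16_MAX, int(x)))


print(to_int16_checked(1234.9))      # in range: converts normally
print(to_int16_saturating(40000.0))  # hypothetical out-of-range value, clamped
try:
    to_int16_checked(40000.0)
except OverflowError as err:
    print("handled:", err)           # the handler that was missing
```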
Summary - Chapter 3
 Important Topics
 Signed & Unsigned Numbers (3.2)
 Addition and Subtraction (3.3)
 Constructing an ALU (B.5)
 Multiplication and Division (3.4, 4.5)
 Floating Point (3.6)
 Coming Up:
 Performance (Chapter 4)