Floating Point Representation Major: All Engineering Majors Authors: Autar Kaw, Matthew Emmons

advertisement
Floating Point Representation
Major: All Engineering Majors
Authors: Autar Kaw, Matthew Emmons
http://numericalmethods.eng.usf.edu
Numerical Methods for STEM undergraduates
7/12/2016
http://numericalmethods.eng.usf.edu
1
Floating Decimal Point : Scientific Form
256.78 is written as  2.5678 10
2
0.003678 is written as  3.678 10
3
 256.78 is written as  2.5678 10
2
2
http://numericalmethods.eng.usf.edu
Example
The form is
or
sign  mantissa 10exponent
  m 10e
Example: For
 2.5678 10 2
  1
m  2.5678
e2
3
http://numericalmethods.eng.usf.edu
Floating Point Format for Binary
Numbers
y    m 2
  sign of number 0 for  ve, 1 for - ve 
e
m  mantissa 12  m  102 
1 is not stored as it is always given to be 1.
e  integer exponent
4
http://numericalmethods.eng.usf.edu
Example
9 bit-hypothetical word
the
the
the
the
first bit is used for the sign of the number,
second bit for the sign of the exponent,
next four bits for the mantissa, and
next three bits for the exponent
54.7510  110110.112  1.10110112  25
 1.10112  1012
We have the representation as
0
Sign of the
number
5
0
1
Sign of the
exponent
0
1
mantissa
1
1
0
1
exponent
http://numericalmethods.eng.usf.edu
Machine Epsilon
Defined as the measure of accuracy and found
by difference between 1 and the next number
that can be represented
6
http://numericalmethods.eng.usf.edu
Example
Ten bit word
Sign of number
Sign of exponent
Next four bits for exponent
Next four bits for mantissa
Next
number
0
0
0
0
0
0
0
0
0
0
 110
0
0
0
0
0
0
0
0
0
1
 1.00012  1.062510
mach  1.0625  1  2 4
7
http://numericalmethods.eng.usf.edu
Relative Error and Machine
Epsilon
The absolute relative true error in representing
a number will be less then the machine epsilon
Example
0.0283210  1.11002  25
 1.11002  20110
2
10 bit word (sign, sign of exponent, 4 for exponent, 4 for mantissa)
0
Sign of the
number
1
0
Sign of the
exponent
1
1
exponent
1.11002  20110
2
0
1
1
0
0
mantissa
 0.0274375
0.02832  0.0274375
a 
0.02832
8
 0.034472  2 4  0.0625
http://numericalmethods.eng.usf.edu
IEEE 754 Standards for Single
Precision Representation
http://numericalmethods.eng.usf.edu
IEEE-754 Floating Point
Standard
• Standardizes representation of
floating point numbers on
different computers in single and
double precision.
• Standardizes representation of
floating point operations on
different computers.
One Great Reference
What every computer scientist (and even if
you are not) should know about floating point
arithmetic!
http://www.validlab.com/goldberg/paper.pdf
IEEE-754 Format Single
Precision
32 bits for single precision
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Sign
(s)
Biased
Exponent (e’)
Mantissa (m)
.
Value  ( 1)  1 m 2  2
s
12
e ' 127
Example#1
1 1 0 1 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Sign
(s)
Biased
Exponent (e’)
Mantissa (m)
Value   1  1. m2  2
s
e ' 127
 1  1.101000002  2
  1  1.625  2162127
  1 1.625 235  5.5834 1010
1
13
(10100010) 2 127
Example#2
Represent -5.5834x1010 as a single
precision floating point number.
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
Sign
(s)
Biased
Exponent (e’)
Mantissa (m)
 5.5834  10   1  1. ?   2
10
14
1
?
Exponent for 32 Bit IEEE-754
8 bits would represent
0  e  255
Bias is 127; so subtract 127 from
representation
127  e  128
15
Exponent for Special Cases
Actual range of e
1  e  254
e  0 and e  255 are reserved for special numbers
Actual range of
e
 126  e  127
Special Exponents and Numbers
s
0
1
0
1
0 or 1
e  0
e  255
e
all zeros
all zeros
all ones
all ones
all ones
all zeros
all ones
m
Represents
all zeros
0
all zeros
-0
all zeros

all zeros
non-zero
NaN

IEEE-754 Format
The largest number by magnitude
1.1........12  2
127
 3.40 10
38
The smallest number by magnitude
1.00......02  2
126
 2.18 10
38
Machine epsilon
 mach  2
18
23
 1.19  10
7
THE END
http://numericalmethods.eng.usf.edu
Download