Floating Point Representation

Major: All Engineering Majors
Authors: Autar Kaw, Matthew Emmons
http://numericalmethods.eng.usf.edu
Transforming Numerical Methods Education for STEM Undergraduates
4/13/2015

Floating Decimal Point: Scientific Form

256.78 is written as $+2.5678 \times 10^{2}$
0.003678 is written as $+3.678 \times 10^{-3}$
-256.78 is written as $-2.5678 \times 10^{2}$

Example

The form is
$\text{sign} \times \text{mantissa} \times 10^{\text{exponent}}$
or
$\sigma \times m \times 10^{e}$

Example: For $-2.5678 \times 10^{2}$,
$\sigma = -1$, $m = 2.5678$, $e = 2$

Floating Point Format for Binary Numbers

$y = \sigma \times m \times 2^{e}$

$\sigma$ = sign of number (0 for +ve, 1 for -ve)
$m$ = mantissa, $(1)_2 \le m < (10)_2$; the leading 1 is not stored, as it is always given to be 1
$e$ = integer exponent

Example

9 bit-hypothetical word:
• the first bit is used for the sign of the number,
• the second bit for the sign of the exponent,
• the next four bits for the mantissa, and
• the next three bits for the exponent

$(54.75)_{10} = (110110.11)_2 = (1.1011011)_2 \times 2^{5} \cong (1.1011)_2 \times 2^{(101)_2}$

We have the representation as

0          sign of the number
0          sign of the exponent
1 0 1 1    mantissa
1 0 1      exponent

Machine Epsilon

Defined as the measure of accuracy, and found as the difference between 1 and the next number that can be represented.

Example

Ten bit word:
• Sign of number
• Sign of exponent
• Next four bits for exponent
• Next four bits for mantissa

0 0 0 0 0 0 0 0 0 0 $= (1)_{10}$
Next number: 0 0 0 0 0 0 0 0 0 1 $= (1.0001)_2 = (1.0625)_{10}$

$\epsilon_{mach} = 1.0625 - 1 = 2^{-4}$

Relative Error and Machine Epsilon

The absolute relative true error in representing a number will be less than the machine epsilon.

Example

$(0.02832)_{10} \cong (1.1100)_2 \times 2^{-6} = (1.1100)_2 \times 2^{-(0110)_2}$

10 bit word (sign, sign of exponent, 4 for exponent, 4 for mantissa):

0          sign of the number
1          sign of the exponent
0 1 1 0    exponent
1 1 0 0    mantissa

$(1.1100)_2 \times 2^{-(0110)_2} = 0.02734375$

$|\epsilon_a| = \left| \dfrac{0.02832 - 0.02734375}{0.02832} \right|$
The absolute relative true error evaluates to $|\epsilon_a| = 0.034472 < 2^{-4} = 0.0625$, which is less than the machine epsilon.

IEEE 754 Standards for Single Precision Representation

IEEE-754 Floating Point Standard

• Standardizes representation of floating point numbers on different computers in single and double precision.
• Standardizes representation of floating point operations on different computers.

One Great Reference

What every computer scientist (and even if you are not) should know about floating point arithmetic!
http://www.validlab.com/goldberg/paper.pdf

IEEE-754 Format Single Precision

32 bits for single precision:

0          Sign (s), 1 bit
0 0 0 0 0 0 0 0          Biased Exponent (e'), 8 bits
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0          Mantissa (m), 23 bits

Value $= (-1)^{s} \times (1.m)_2 \times 2^{e' - 127}$

Example#1

1          Sign (s)
1 0 1 0 0 0 1 0          Biased Exponent (e')
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0          Mantissa (m)

Value $= (-1)^{s} \times (1.m)_2 \times 2^{e' - 127}$
$= (-1)^{1} \times (1.10100000)_2 \times 2^{(10100010)_2 - 127}$
$= (-1) \times 1.625 \times 2^{162 - 127}$
$= (-1) \times 1.625 \times 2^{35} = -5.5834 \times 10^{10}$

Example#2

Represent $-5.5834 \times 10^{10}$ as a single precision floating point number.

?          Sign (s)
? ? ? ? ? ? ? ?          Biased Exponent (e')
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?          Mantissa (m)

$-5.5834 \times 10^{10} = (-1)^{1} \times (1.?)_2 \times 2^{?}$
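The single-precision value formula can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the helper names `decode_single` and `encode_single` are hypothetical, and the decoder assumes a normalized number (biased exponent between 1 and 254). It decodes the Example#1 bit pattern field by field and uses the standard `struct` module to obtain the bit pattern that the machine stores for a given number.

```python
import struct

def decode_single(bits: str) -> float:
    """Decode a 32-character string of 0s and 1s using
    value = (-1)^s * (1.m)_2 * 2^(e' - 127). Normalized numbers only."""
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    s = int(bits[0], 2)                 # sign bit
    e_biased = int(bits[1:9], 2)        # 8-bit biased exponent e'
    m = int(bits[9:], 2) / 2**23        # 23-bit fraction after the binary point
    return (-1) ** s * (1 + m) * 2.0 ** (e_biased - 127)

def encode_single(x: float) -> str:
    """Return the 32-bit pattern stored for x, rounded to single precision."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{word:032b}"

# Bit pattern from Example#1: sign, biased exponent, mantissa
example1_bits = "1" + "10100010" + "101" + "0" * 20
example1_value = decode_single(example1_bits)
print(example1_value)

# Fields stored for the number asked about in Example#2
example2_bits = encode_single(-5.5834e10)
print(example2_bits[0], example2_bits[1:9], example2_bits[9:])
```

Because $-5.5834 \times 10^{10}$ is a rounded decimal value, its stored mantissa need not match the Example#1 mantissa bit for bit, although the sign and biased exponent fields agree.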
Exponent for 32 Bit IEEE-754

8 bits would represent $0 \le e' \le 255$
Bias is 127; so subtract 127 from the representation: $-127 \le e \le 128$

Exponent for Special Cases

Actual range of $e'$: $1 \le e' \le 254$
$e' = 0$ and $e' = 255$ are reserved for special numbers
Actual range of $e$: $-126 \le e \le 127$

Special Exponents and Numbers

$e' = 0$: all zeros
$e' = 255$: all ones

s        e'          m           Represents
0        all zeros   all zeros   0
1        all zeros   all zeros   -0
0        all ones    all zeros   $\infty$
1        all ones    all zeros   $-\infty$
0 or 1   all ones    non-zero    NaN

IEEE-754 Format

The largest number by magnitude: $(1.1\ldots1)_2 \times 2^{127} = 3.40 \times 10^{38}$
The smallest number by magnitude: $(1.00\ldots0)_2 \times 2^{-126} = 1.18 \times 10^{-38}$
Machine epsilon: $\epsilon_{mach} = 2^{-23} = 1.19 \times 10^{-7}$

Additional Resources

For all resources on this topic, such as digital audiovisual lectures, primers, textbook chapters, multiple-choice tests, worksheets in MATLAB, MATHEMATICA, MathCad and MAPLE, blogs, and related physical problems, please visit
http://numericalmethods.eng.usf.edu/topics/floatingpoint_representation.html
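The special exponent patterns and the single-precision limits can be reproduced from the bit fields. This is a minimal Python sketch under stated assumptions, not part of the original slides: the helper `single_from_fields` and the constant names are hypothetical, and the standard `struct` module is used to reinterpret a 32-bit word as a single-precision number.

```python
import math
import struct

def single_from_fields(s: int, e_biased: int, m: int) -> float:
    """Assemble sign (1 bit), biased exponent (8 bits), and mantissa (23 bits)
    into one 32-bit word and reinterpret it as a single-precision number."""
    word = (s << 31) | (e_biased << 23) | m
    return struct.unpack(">f", struct.pack(">I", word))[0]

ALL_ONES_E = 0xFF               # e' = 255
ALL_ONES_M = (1 << 23) - 1      # all 23 mantissa bits set

# Special numbers: e' all zeros or all ones
pos_zero = single_from_fields(0, 0, 0)
neg_zero = single_from_fields(1, 0, 0)
pos_inf = single_from_fields(0, ALL_ONES_E, 0)
neg_inf = single_from_fields(1, ALL_ONES_E, 0)
not_a_number = single_from_fields(0, ALL_ONES_E, 1 << 22)  # any non-zero mantissa

# Limits of the normalized range: e' = 254 and e' = 1
largest = single_from_fields(0, 254, ALL_ONES_M)
smallest = single_from_fields(0, 1, 0)

# Machine epsilon: next number after 1 (e' = 127, last mantissa bit set) minus 1
eps_mach = single_from_fields(0, 127, 1) - 1.0

print(pos_zero, neg_zero, pos_inf, neg_inf, not_a_number)
print(f"{largest:.4e} {smallest:.4e} {eps_mach:.4e}")
```

The smallest value here is the smallest normalized number, which matches the exponent range $-126 \le e \le 127$ used above.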