Floating Point Numbers

Floating Point Numbers  In the decimal system, a decimal point (radix point) separates the whole numbers from the fractional part  Examples: 37.25 ( whole=37, fraction = 25) 123.567 10.12345678 Floating Point Numbers For example, 37.25 can be analyzed as: 101 100 10-1 10-2 Tens Units Tenths Hundredths 3 7 2 5 37.25 = 3 x 10 + 7 x 1 + 2 x 1/10 + 5 x 1/100 Binary Equivalent The binary equivalent of a floating point number can be computed by computing the binary representation for each part separately.   whole part: subtraction or division Fractional part: subtraction or multiplication Binary Equivalent In the binary representation of a floating point number the column values will be as follows: … 26 25 24 23 22 21 20 . 2-1 2-2 2-3 2-4 … … 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 … … 64 32 16 8 4 2 1 . .5 .25 .125 .0625 … Finding Binary Equivalent of fraction part Converting .25 using Multiplication method. Step 1 : multiply fraction by 2 until fraction becomes 0 .25 x2 0.5 x2 1.0 Step 2 Collect the whole parts and place them after the radix point 64 32 16 8 4 2 1 . .5 .25 .125 .0625 . 0 1 Finding Binary Equivalent of fraction part Converting .25 using subtraction method. Step 1: write positional powers of two and column values for the fractional part . 2-1 2-2 2-3 2-4 2 -5 . ½ ¼ 1/8 1/16 1/32 . .5 .25 .125 .0625 0.03125 Finding Binary Equivalent of fraction part Converting .25 using subtraction method. Step 2: start subtracting the column values from left to right, place a 0 if the value cannot be subtracted or 1 if it can until the fraction becomes .0 . .25 2 1 . .5 .25 .125 .0625 - .25 . 0 1 .0 Binary Equivalent of FP number Given 37.25, convert 37 and .25 using subtraction method. 64 32 16 8 4 2 1 . .5 .25 .125 .0625 26 25 24 23 22 21 20 . 2-1 2-2 2-3 2-4 1 0 0 1 0 1 . 0 1 37 .25 - 32 - .25 5 .0 -4 1 -1 0 37.2510 = 100101.012 So what is the Problem? 
Given the following binary representations:

  37.25_10  = 100101.01_2
  7.625_10  = 111.101_2
  0.3125_10 = 0.0101_2

How can we represent both the whole part and the fractional part of the binary representation in 4 bytes?

Solution is Normalization

Every binary number, except the one corresponding to the number zero, can be normalized by choosing the exponent so that the radix point falls to the right of the leftmost 1 bit.

  37.25_10  = 100101.01_2 = 1.0010101 x 2^5
  7.625_10  = 111.101_2   = 1.11101 x 2^2
  0.3125_10 = 0.0101_2    = 1.01 x 2^-2

So what Happened?

After normalizing, every number has the same form, 1.xxxx x 2^e. The numbers differ only in their mantissas and exponents, so those are the parts that need to be stored.

IEEE Floating Point Representation

Floating point numbers can be represented by binary codes by dividing them into three parts: the sign, the exponent, and the mantissa.

  Bit 1    Bits 2-9    Bits 10-32
  Sign     Exponent    Mantissa

The first, or leftmost, field of the floating point representation is the sign bit:
  0 for a positive number,
  1 for a negative number.

The second field of the floating point number is the exponent. Since we must be able to represent both positive and negative exponents, the exponent is stored with a bias of 127 added to it.

An exponent of 5 is therefore stored as 127 + 5 = 132; an exponent of -5 is stored as 127 + (-5) = 122. The biased exponent, the value actually stored, ranges from 0 through 255. This is the range of values that can be represented by 8-bit unsigned binary numbers. In the IEEE 754 standard the stored values 0 and 255 are reserved for special cases, so normalized numbers use 1 through 254.

The mantissa is the set of 0's and 1's to the right of the radix point of the normalized binary number (normalized means the digit to the left of the radix point is 1).

  ex: in 1.00101 x 2^3 the mantissa is 00101

The leading 1 is not stored. The mantissa is stored in a 23-bit field, padded on the right with zeros.

Converting decimal floating point values to stored IEEE standard values.
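The conversion and normalization steps described so far can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names whole_to_binary, fraction_to_binary, and normalize are made up for this sketch, whole parts use the division method, fractional parts use the multiplication method, and the inputs are assumed to be non-negative with fractions that terminate in binary.

```python
def whole_to_binary(n):
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # each remainder is the next bit, lowest first
        n //= 2
    return "".join(reversed(digits))


def fraction_to_binary(frac, max_bits=23):
    """Convert a fraction in [0, 1) to binary by repeated multiplication by 2."""
    digits = []
    while frac > 0 and len(digits) < max_bits:
        frac *= 2
        whole = int(frac)          # the whole part (0 or 1) is the next bit
        digits.append(str(whole))
        frac -= whole
    return "".join(digits)


def normalize(whole_bits, frac_bits):
    """Return (bits after the leading 1, exponent) for a nonzero binary number."""
    whole_bits = whole_bits.lstrip("0")
    if whole_bits:
        # Radix point moves left past every whole bit except the leading 1.
        return whole_bits[1:] + frac_bits, len(whole_bits) - 1
    first_one = frac_bits.index("1")
    # Radix point moves right, past the leading zeros and the first 1.
    return frac_bits[first_one + 1:], -(first_one + 1)


print(whole_to_binary(37) + "." + fraction_to_binary(0.25))   # 100101.01
print(normalize("100101", "01"))                              # ('0010101', 5)
print(normalize("0", "0101"))                                 # ('01', -2)
```

The two normalize calls reproduce 1.0010101 x 2^5 for 37.25 and 1.01 x 2^-2 for 0.3125.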
Example: Find the IEEE FP representation of 40.15625.

Step 1. Compute the binary equivalent of the whole part and the fractional part (convert 40 and .15625 to their binary equivalents).

    40          .15625
  - 32        - .12500
  ----        --------
     8          .03125
  -  8        - .03125
  ----        --------
     0          .0

  Result: 101000      Result: .00101

  So: 40.15625_10 = 101000.00101_2

Step 2. Normalize the number by moving the radix point to the right of the leftmost one.

  101000.00101 = 1.0100000101 x 2^5

Step 3. Convert the exponent to a biased exponent.

  127 + 5 = 132  ==>  132_10 = 10000100_2

Step 4. Store the results from above.

  Sign    Exponent (from step 3)    Mantissa (from step 2)
  0       10000100                  01000001010...0

Convert 40.15625 to IEEE 32-bit format

The same conversion, broken into smaller steps:

Step 1a. Find the binary value of the whole number: 40 - 32 - 8 = 0.

  32   16   8   4   2   1
  1    0    1   0   0   0

Step 1b. Find the binary value of the fraction: .15625 - .125 - .03125 = 0.

  .5   .25   .125   .0625   .03125
  0    0     1      0       1

Step 2a. Move the radix point to the right of the leftmost 1 to find the exponent. In 101000.00101 the point moves 5 places to the left, so the exponent is 5.

Step 2b. Write the normalized number as a value multiplied by 2 raised to that exponent.

  101000.00101 = 1.0100000101 x 2^5

Step 3a. Convert the exponent to a biased exponent by adding the bias of 127.

  127 + 5 = 132_10

Step 3b. Convert the biased exponent to its binary value: 132 - 128 - 4 = 0.

  128   64   32   16   8   4   2   1
  1     0    0    0    0   1   0   0

Step 4a. Piece together the values.

  Sign    Exponent    Mantissa
  0       10000100    0100000101

Step 4b. Move the normalized value into the mantissa field without its leading 1.

Step 4c. Left pad the exponent with zeros to 8 bits if needed and right pad the mantissa with zeros to 23 bits.

  Sign    Exponent    Mantissa
  0       10000100    01000001010...0
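The hand-computed fields for 40.15625 can be checked against the bits that Python itself stores. This is only a sketch: the helper names assemble and machine_bits are made up for this illustration, and struct.pack with the ">f" format is used to obtain the 32-bit single-precision encoding in big-endian byte order.

```python
import struct

BIAS = 127  # exponent bias for the 32-bit format


def assemble(sign, exponent, mantissa_bits):
    """Build 'sign exponent mantissa' from hand-computed parts (step 4)."""
    biased = format(exponent + BIAS, "08b")    # step 3: add the bias, 8 bits
    padded = mantissa_bits.ljust(23, "0")      # right pad the mantissa to 23 bits
    return f"{sign} {biased} {padded}"


def machine_bits(value):
    """Return the stored 32-bit pattern of value, split into the three fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(raw, "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


by_hand = assemble(0, 5, "0100000101")     # 1.0100000101 x 2^5, positive
print(by_hand)                             # 0 10000100 01000001010000000000000
print(by_hand == machine_bits(40.15625))   # True
```

The comparison only works this neatly because 40.15625 is exactly representable in 23 mantissa bits; values that need rounding would require an extra step that this sketch does not cover.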
Ex: Find the IEEE FP representation of -24.75.

Step 1. Compute the binary equivalent of the whole part and the fractional part.

    24          .75
  - 16        - .50
  ----        -----
     8          .25
  -  8        - .25
  ----        -----
     0          .0

  Result: 11000      Result: .11

  So: -24.75_10 = -11000.11_2

Step 2. Normalize the number by moving the radix point to the right of the leftmost one.

  -11000.11 = -1.100011 x 2^4

Step 3. Convert the exponent to a biased exponent.

  127 + 4 = 131  ==>  131_10 = 10000011_2

Step 4. Store the results from above.

  Sign    Exponent    Mantissa
  1       10000011    1000110...0

Converting from IEEE format to the decimal floating point values.

  Do the steps in reverse order.
  In reversing the normalization step, move the radix point the number of digits equal to the exponent. If the exponent is positive, move it to the right; if negative, move it to the left.

Ex: Convert the following 32-bit binary numbers to their decimal floating point equivalents.

a.
  Sign    Exponent    Mantissa
  1       01111101    010...0

Step 1. Extract the exponent (unbias the exponent).

  biased exponent = 01111101 = 125
  exponent: 125 - 127 = -2

Step 2. Write the normalized number in the form 1.mantissa x 2^exponent, with the sign.

  -1.01 x 2^-2

Step 3. Write the binary number (denormalize the value from step 2).

  -0.0101_2

Step 4. Convert the binary number to its FP equivalent (add the column values).

  -0.0101_2 = -(0.25 + 0.0625) = -0.3125

b.
  Sign    Exponent    Mantissa
  0       10000011    1101010...0
Step 1. Extract the exponent (unbias the exponent).

  biased exponent = 10000011 = 131
  exponent: 131 - 127 = 4

Step 2. Write the normalized number in the form 1.mantissa x 2^exponent.

  1.110101 x 2^4

Step 3. Write the binary number (denormalize the value from step 2).

  11101.01_2

Step 4. Convert the binary number to its FP equivalent (add the column values).

  11101.01_2 = 16 + 8 + 4 + 1 + 0.25 = 29.25_10

Check your work

Convert 0 10000100 01000001010...0 back to decimal.

  Sign    Exponent    Mantissa
  0       10000100    01000001010...0

Step 1a. Determine the decimal value of the binary exponent.

  128   64   32   16   8   4   2   1
  1     0    0    0    0   1   0   0

  128 + 4 = 132

Step 1b. Unbias the number by subtracting 127 to determine the exponent.

  132 - 127 = 5

Step 2. Set up the format 1.mantissa x 2^exponent, using the exponent from step 1b.

  1.0100000101 x 2^5

Step 3a. Denormalize by moving the radix point the number of places given by the exponent (5 places to the right).

  101000.00101_2

Step 3b. Convert the binary number to its FP equivalent.

Step 4a. Find the whole number.

  32   16   8   4   2   1
  1    0    1   0   0   0

  32 + 8 = 40

Step 4b. Find the fractional part.

  .5   .25   .125   .0625   .03125
  0    0     1      0       1

  .125 + .03125 = .15625

Step 4c. Add the values together. Make sure to include the sign if it is a negative value.

  40.15625
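The reverse conversion can also be sketched in Python. The function name decode is hypothetical, and the sketch assumes a normalized number (an exponent field that is neither all zeros nor all ones); trailing zeros of the mantissa may be omitted because they add nothing to the value.

```python
def decode(sign_bit, exponent_bits, mantissa_bits):
    """Convert IEEE 32-bit fields (as bit strings) to a decimal value."""
    exponent = int(exponent_bits, 2) - 127     # step 1: unbias the exponent
    significand = 1.0                          # step 2: restore the leading 1
    for position, bit in enumerate(mantissa_bits, start=1):
        if bit == "1":
            significand += 2.0 ** -position    # add the column value 2^-position
    value = significand * 2.0 ** exponent      # step 3: denormalize
    return -value if sign_bit == "1" else value


print(decode("1", "01111101", "01"))           # -0.3125
print(decode("0", "10000011", "110101"))       # 29.25
print(decode("0", "10000100", "0100000101"))   # 40.15625
```

Multiplying by 2.0 ** exponent has the same effect as moving the radix point by that many places, so steps 3 and 4 of the manual method collapse into a single line here.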
