Fixed and Floating Point Number Representation

Real Number Representation in Computer Systems

This guide covers:

- Two’s complement, a review
- Fixed-point number representation
- Floating-point number representation

Two’s Complement Review

In 8-bit two’s complement integer representation, the significance of the bits is like this:

-2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0

To convert from a positive number to its negative representation:

1. Flip the bits
2. Add the lowest bit value (with an integer this is 1)

The minimum value is 10000000.
The maximum is 01111111.

Example: To find -3 in two’s complement:

1. Find plus 3: 00000011
2. Flip the bits: 11111100
3. Add 1: 11111101

Fixed-Point Numbers

This is a fixed-point two’s complement binary number with 4 integer bits and 4 fractional bits:

-2^3   2^2   2^1   2^0 . 2^-1   2^-2   2^-3   2^-4

Just like integer two’s complement, its maximum is 0111.1111 and its minimum is 1000.0000.

Max: 4 + 2 + 1 + 1/2 + 1/4 + 1/8 + 1/16 = 7.9375
Min: -8

(With i integer bits and f fractional bits you can work out the formula for the maximum and minimum possible values. Min is -2^(i-1) and Max is 2^(i-1) - 2^(-f).)

Note: To convert from a positive real number to its negative equivalent you do the same as you would for an integer (flip the bits and add…) except this time you add the lowest bit value, rather than 1. In the case above that value is 2^-4.

Example: What is -5.5 using a fixed-point two’s complement binary number with 4 integer bits and 4 fractional bits?

1. First find plus 5.5: 5.5 = 4 + 1 + 0.5 = 0101.1000
2. Flip the bits: 1010.0111
3. Add the lowest bit value (here 0.0001): 1010.1000

Check: 1010.1000 = -2^3 + 2^1 + 2^-1 = -8 + 2 + 0.5 = -5.5

Basic Floating Point Representation

In decimal you can express numbers in Standard Form, e.g. 28,353.25 can be expressed as 2.835325 x 10^4.

The 2.835325 part is called the mantissa, and the 4 part is called the exponent. The exponent tells you how many places to shift the decimal point to the right. It can be negative, and shifting -3 places to the right means shifting 3 places to the left.
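As an aside before moving on to binary floating point, the fixed-point conversions worked through earlier can be checked with a short Python sketch. The function names to_fixed_point and from_fixed_point are made up for this illustration, and the defaults assume the guide's 4 integer bits and 4 fractional bits. Values that are not exact multiples of the lowest bit value are rounded, which is an assumption of this sketch rather than something the guide specifies.

```python
def to_fixed_point(value, int_bits=4, frac_bits=4):
    """Return the two's complement fixed-point bit pattern of value as a string."""
    total_bits = int_bits + frac_bits
    # Express the value as a whole number of "lowest bit values" (2^-frac_bits).
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    if not lowest <= scaled <= highest:
        raise OverflowError(f"{value} does not fit in {int_bits}.{frac_bits} bits")
    # Masking a negative integer gives the same pattern as "flip the bits and add".
    pattern = scaled & ((1 << total_bits) - 1)
    bits = format(pattern, f"0{total_bits}b")
    return bits[:int_bits] + "." + bits[int_bits:]


def from_fixed_point(bits):
    """Return the value of a two's complement fixed-point string such as '1010.1000'."""
    int_part, frac_part = bits.split(".")
    total_bits = len(int_part) + len(frac_part)
    raw = int(int_part + frac_part, 2)
    if raw >= 1 << (total_bits - 1):  # top bit set, so the number is negative
        raw -= 1 << total_bits
    return raw / (1 << len(frac_part))


print(to_fixed_point(5.5))            # 0101.1000
print(to_fixed_point(-5.5))           # 1010.1000
print(from_fixed_point("0111.1111"))  # 7.9375
print(from_fixed_point("1000.0000"))  # -8.0
```

The printed results match the worked example for -5.5 and the maximum and minimum values given above.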
You can have the same sort of representation in binary:

0 1 0 1 0 0 0 0 0 0 | 0 0 0 0 1 0

This number has a 10-bit mantissa and a 6-bit exponent and can be written 0.101000000 x 2^000010. (Don’t be afraid: it’s not as complicated as it looks.) Let’s see if we can find out what this number is.

Step 1: First we’ll shift the point. (I can’t use the term decimal point any more. Technically it’s a bicimal point in binary, but you more often hear the general term radix point.) The exponent is in two’s complement format, but its high bit (the first one from the left) is 0, so it is a positive number, and its value is 2. So we shift the point 2 places to the right, giving: 010.1000000

Step 2: Having moved the point, we can interpret this as a binary number with the value: 2^1 + 2^-1 = 2.5 in decimal

Now let’s look at three simple examples:

0 0 1 0 1 0 0 0 0 0 | 0 0 0 0 1 1
0 0 0 1 0 1 0 0 0 0 | 0 0 0 1 0 0
0 0 0 0 0 0 0 1 0 1 | 0 0 1 0 0 0

If you look at these carefully you will see they are all equal to 2.5! Computer scientists hate waste and inefficiency, and it is wasteful and inefficient to have lots of different ways of storing the same number. We use a process called normalization to prevent this.

Decimal Standard Form

Before we look at normalization in binary, let’s look at something you have studied in mathematics. Here are lots of ways of expressing the same number:

- 0.003
- 0.03 x 10^-1
- 0.3 x 10^-2
- 3.0 x 10^-3

All of these numbers are equal to 0.003, but only 3.0 x 10^-3 is in “Standard Form”. The rule for standard form is that the mantissa (here the 3.0 part) is greater than or equal to 1 and less than 10. If it’s not, then we move the decimal point and change the exponent to compensate.

Exactly the same technique is used to represent real (fractional) numbers in binary, except in binary the rule (for positive numbers) is that the mantissa begins with 0.1…
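The two decoding steps used for the 2.5 example (read the exponent, then shift the point) can be sketched in Python as follows. The names twos_complement and decode_float are invented for this illustration, and the defaults assume the guide's 10-bit mantissa and 6-bit exponent, both in two's complement, with the radix point after the first mantissa bit.

```python
def twos_complement(bits):
    """Return the integer value of a two's complement bit string."""
    value = int(bits, 2)
    if bits[0] == "1":  # high bit set, so the number is negative
        value -= 1 << len(bits)
    return value


def decode_float(word, mantissa_bits=10, exponent_bits=6):
    """Return the value of a mantissa-then-exponent bit string."""
    word = word.replace(" ", "").replace("|", "")
    if len(word) != mantissa_bits + exponent_bits:
        raise ValueError("word has the wrong number of bits")
    # The mantissa has one bit before the point and the rest after it,
    # so divide its integer value by 2^(mantissa_bits - 1).
    mantissa = twos_complement(word[:mantissa_bits]) / (1 << (mantissa_bits - 1))
    exponent = twos_complement(word[mantissa_bits:])
    # Shifting the point by 'exponent' places is multiplying by 2^exponent.
    return mantissa * 2.0 ** exponent


print(decode_float("0101000000 000010"))  # 2.5
print(decode_float("0010100000 000011"))  # 2.5
print(decode_float("0001010000 000100"))  # 2.5
```

Different bit patterns decode to the same value, which is the redundancy that normalization removes.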
Normalization

Using our method of a 10-bit two’s complement mantissa and a 6-bit two’s complement exponent, the number 2.5 is represented like this:

0 1 0 1 0 0 0 0 0 0 | 0 0 0 0 1 0

It is not represented in any other way. The reasoning is exactly the same as for decimal standard form. Any leading zeroes are removed from the mantissa and the exponent is changed accordingly.

Now look at this number:

0 0 0 0 0 1 1 1 0 1 | 0 0 0 1 1 1

If you look carefully at it you will see it is equal to 111.01, or 7.25 in decimal. But at the moment it is not normalized and so it would not actually be represented like this. When it is normalized it becomes:

0 1 1 1 0 1 0 0 0 0 | 0 0 0 0 1 1

Two things have happened.

1. The four leading 0’s after the point have been removed (i.e. the point has been shifted to the right four places).
2. The exponent has been decreased by four.

So instead of moving the point seven to the right in this number: 0.000011101
We move it three to the right in this number: 0.111010000

Either way we end up with 111.01, or 7.25 in decimal.

So with positive mantissas, the left-most bit is 0 and we remove all leading 0’s from the fractional part of the mantissa. This gives you a handy check, because any positive mantissa expressed in two’s complement floating-point format must begin with 0.1… If it doesn’t, you’ve done it wrong.

Negative Numbers

Firstly, bear in mind that a floating-point number is only negative when the mantissa is negative. A negative exponent doesn’t mean a negative number, it just means a small number (e.g. compare 1 x 10^-3 with -1 x 10^3).

With negative mantissas, instead of removing all the leading 0’s, we remove all leading 1’s after the point. A normalized negative mantissa therefore begins with 1.0…

Using our 10-bit mantissa and 6-bit exponent, let’s look at the representation of -0.125 in decimal.
First we need the 10-bit binary representation of 0.125, which is: 0.001000000

Now we negate it by finding the two’s complement, which will give us the 10-bit binary representation of -0.125:

Flip the bits: 1.110111111
Add the lowest bit value (0.000000001): 1.111000000

So the unnormalized mantissa is 1.111000000 (and the exponent is 000000). We now remove the three leading 1’s after the point and subtract 3 from the exponent, giving:

1 0 0 0 0 0 0 0 0 0 | 1 1 1 1 0 1

This gives us 1.000000000|111101 as the normalized floating-point representation. (Notice that the exponent is -3 in two’s complement format!)

To get back to the number from the normalized floating-point format you need to bear in mind that leading 1’s are irrelevant in a two’s complement negative number, just as leading 0’s are irrelevant in a positive number.

Example: 000000000101 is the same as 101, no problem there. But in two’s complement, 111111111101 is the same as 101! Why? Well, complement them and see:

101 = -011 = -3 in decimal
111111111101 = -000000000011 = -011 = -3 in decimal

Now, to get back to the value from our normalized floating-point representation of -0.125, which was 1.000000000|111101, we have to do the following.

1. Work out what the exponent is: 111101 = -000011 = -3 in decimal.
2. This means we have to move the point to the left three places.
3. Remember now that in a negative two’s complement number the bits to the left of the MSB (most significant bit) are 1’s and not 0’s! So it’s like this: etc←1111111111111.000000000 and not like this: etc←0000000000001.000000000
4. Move the point, so 1.000000000 becomes 1.111000000.
5. Complement this to check its value: 1.111000000 → -0.001000000 = -0.125 in decimal

Test your understanding

This exercise will focus on the following skills.
- Converting decimal numbers to fixed-point binary and back
- Minimum and maximum possible values given certain integer and fractional parts
- Normalizing unnormalized floating-point numbers
- Converting decimal numbers to normalized floating-point format
- Calculating maximum and minimum possible values given certain mantissa and exponent sizes
- Defining/explaining terms

1. Using fixed-point two’s complement, with 4 integer bits + 4 fractional bits, calculate the maximum and minimum values that can be stored.

2. Using 8 integer bits and 4 fractional bits, represent the following decimal numbers in fixed-point two’s complement:
   a. 125.5
   b. -63.25
   c. -17.625
   d. -113.1875

3. Calculate the value of the following two’s complement fixed-point binary numbers:
   a. 0101.1010
   b. 1101.1100

4. What is the maximum value that can be stored using fixed-point two’s complement binary with 6 integer bits and 3 fractional bits?

5. The following unnormalized two’s complement floating-point numbers have an 8-bit mantissa and a 4-bit exponent. Write them in normalized form and state their decimal values.
   a. 000010100100
   b. 000100101101
   c. 111001010011
   d. 111111011101

6. The following normalized two’s complement floating-point numbers have an 8-bit mantissa and a 4-bit exponent. Calculate their decimal values.
   a. 011000110101
   b. 101000001110

7. Using an 8-bit mantissa and a 4-bit exponent, calculate the two’s complement floating-point representation of the following decimal numbers.
   a. 128.75
   b. -58.625

8. Using an 8-bit mantissa and a 4-bit exponent, calculate:
   a. The highest number that can be represented
   b. The lowest number that can be represented

9. Define fixed and floating-point number representation.

10. Compare and contrast fixed and floating-point number representations.

11. Explain the need for normalization in floating-point number representations.

12. Define underflow and overflow and give an example of each.
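If you want to check your working on the normalization questions, the procedure from the Normalization section can be sketched in Python. The function name normalize is invented for this illustration. It assumes the mantissa comes first, that both fields are two's complement, and that a mantissa is normalized when its first two bits differ (0.1… for positive, 1.0… for negative). Leaving a zero mantissa untouched and stopping at the smallest exponent are choices made for this sketch, not rules stated in the guide.

```python
def normalize(word, mantissa_bits=10, exponent_bits=6):
    """Return the normalized form of a mantissa-then-exponent bit string."""
    word = word.replace(" ", "").replace("|", "")
    if len(word) != mantissa_bits + exponent_bits:
        raise ValueError("word has the wrong number of bits")
    mantissa = word[:mantissa_bits]
    exponent = int(word[mantissa_bits:], 2)
    if word[mantissa_bits] == "1":  # negative exponent in two's complement
        exponent -= 1 << exponent_bits
    if "1" not in mantissa:
        return word  # zero cannot be made to start with 0.1 or 1.0
    smallest_exponent = -(1 << (exponent_bits - 1))
    # While the first two bits match, the bit after the point is a redundant
    # leading 0 (positive) or leading 1 (negative): shift it out and
    # decrease the exponent by one to compensate.
    while mantissa[0] == mantissa[1] and exponent > smallest_exponent:
        mantissa = mantissa[1:] + "0"
        exponent -= 1
    exponent_field = format(exponent & ((1 << exponent_bits) - 1), f"0{exponent_bits}b")
    return mantissa + exponent_field


print(normalize("0000011101 000111"))  # 0111010000000011  (7.25)
print(normalize("1111000000 000000"))  # 1000000000111101  (-0.125)
```

For the 8-bit mantissa and 4-bit exponent used in the questions, pass mantissa_bits=8 and exponent_bits=4.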
