Fixed and Floating Point Number Representation

Real Number Representation in Computer Systems
This guide covers:
- Two’s complement, a review
- Fixed-point number representation
- Floating-point number representation
Two’s Complement Review
In 8-bit two’s complement integer representation, the bits have the following place values (most
significant bit first):
-2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
To convert from a positive number to its negative representation:
1. Flip the bits
2. Add the lowest bit value (with an integer this is 1)


The minimum value is 10000000
The maximum is 01111111
Example: To find -3 in two’s complement:
1. Find plus 3: 00000011
2. Flip the bits: 11111100
3. Add 1: 11111101
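The flip-and-add rule can be sketched in a few lines of Python. This is only an illustration: the helper names negate and twos_complement are made up for this guide, and bit patterns are handled as strings so they can be compared with the worked example above.

```python
def twos_complement(value, bits=8):
    """Return the two's complement bit pattern of an integer as a string."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking with 2**bits - 1 maps negative values onto their bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def negate(pattern):
    """Flip every bit, then add 1, discarding any carry out of the top bit."""
    bits = len(pattern)
    mask = (1 << bits) - 1
    flipped = int(pattern, 2) ^ mask
    return format((flipped + 1) & mask, f"0{bits}b")


print(negate("00000011"))      # 11111101
print(twos_complement(-128))   # 10000000 (the minimum)
print(twos_complement(127))    # 01111111 (the maximum)
```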
Fixed-Point Numbers
The place values below belong to a fixed-point two’s complement binary number with 4 integer bits and
4 fractional bits:
-2^3  2^2  2^1  2^0 . 2^-1  2^-2  2^-3  2^-4
Just like integer two’s complement, its maximum is 0111.1111 and its minimum is 1000.0000.
Max: 4 + 2 + 1 + ½ + ¼ + 1/8 + 1/16 = 7.9375
Min: -8
(With i integer bits and f fractional bits you can work out a formula for the maximum and minimum
possible values: Min is -2^(i-1) and Max is 2^(i-1) - 2^(-f).)
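As a quick check of the formula, here is a small Python sketch. The function name fixed_point_range is an assumption made for this guide; Fraction is used so the results are exact rather than rounded.

```python
from fractions import Fraction


def fixed_point_range(i, f):
    """Minimum and maximum of a two's complement fixed-point format
    with i integer bits and f fractional bits."""
    lowest_bit = Fraction(1, 2 ** f)              # value of the last bit, 2^(-f)
    minimum = -Fraction(2 ** (i - 1))             # only the sign bit set
    maximum = Fraction(2 ** (i - 1)) - lowest_bit  # every bit set except the sign bit
    return minimum, maximum


lo, hi = fixed_point_range(4, 4)
print(float(lo), float(hi))   # -8.0 7.9375
```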
Note: To convert from a positive real number to its negative equivalent you do the same as you would
for an integer (flip the bits and add…) except this time you add the lowest bit value, rather than 1. In the
case above that value is 2^-4.
Example: What is -5.5 as a fixed-point two’s complement binary number with 4 integer bits and 4
fractional bits?
First find plus 5.5.
5.5 = 4 + 1 + 0.5 = 0101.1000
Flip the bits:
1010.0111
Add the lowest bit value (here 0.0001):
1010.1000
Check:
1010.1000
= -2^3 + 2^1 + 2^-1
= -8 + 2 + 0.5
= -5.5
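The same conversion can be sketched in Python. The names to_fixed and from_fixed are hypothetical helpers, not part of any library. The idea is to count the value in units of the lowest bit (here 1/16) and then treat that count as an ordinary two’s complement integer.

```python
def to_fixed(value, i=4, f=4):
    """Encode value as a two's complement fixed-point string such as '0101.1000'."""
    total = i + f
    scaled = round(value * (1 << f))          # value counted in lowest-bit units
    if scaled != value * (1 << f):
        raise ValueError("value is not exactly representable")
    if not -(1 << (total - 1)) <= scaled < (1 << (total - 1)):
        raise ValueError("value is out of range")
    bits = format(scaled & ((1 << total) - 1), f"0{total}b")
    return bits[:i] + "." + bits[i:]


def from_fixed(pattern):
    """Decode a string such as '1010.1000' back to a number."""
    whole, frac = pattern.split(".")
    raw = int(whole + frac, 2)
    if whole[0] == "1":                       # sign bit set: subtract 2^(total bits)
        raw -= 1 << len(whole + frac)
    return raw / (1 << len(frac))


print(to_fixed(5.5))             # 0101.1000
print(to_fixed(-5.5))            # 1010.1000
print(from_fixed("1010.1000"))   # -5.5
```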
Basic Floating Point Representation
In decimal you can express numbers in Standard Form, e.g.:
28,353.25
can be expressed as
2.835325 x 10^4
The 2.835325 part is called the mantissa, and the 4 part is called the exponent. The exponent tells you
how many places to shift the decimal point to the right. It can be negative, and shifting -3 places to the
right means shifting 3 places to the left.
You can have the same sort of representation in binary:
0101000000|000010
This number has a 10-bit mantissa (the first ten bits) and a 6-bit exponent (the last six bits) and can be
written 0.101000000 x 2^000010. (Don’t be afraid – it’s not as complicated as it looks.)
Let’s see if we can find out what this number is.
Step 1: First we’ll shift the point. (We can’t use the term decimal point any more; technically it’s a
bicimal point in binary, but you more often hear the general term radix point.) The exponent is in two’s
complement format and its high bit (the first one from the left) is 0, so it is a positive number:
000010 is 2. So we shift the point 2 places to the right, giving:
10.1000000
Step 2: Having moved the point, we can interpret this as a binary number with the value:
2^1 + 2^-1 = 2.5dec
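Instead of moving the point by hand, you can get the same answer arithmetically: read the mantissa as a signed fraction and multiply it by 2 raised to the exponent. The Python sketch below assumes this guide’s layout (point after the first mantissa bit, both fields in two’s complement); the function names twos and decode are made up for illustration.

```python
def twos(bits):
    """Integer value of a two's complement bit string."""
    raw = int(bits, 2)
    return raw - (1 << len(bits)) if bits[0] == "1" else raw


def decode(mantissa, exponent):
    """Value of mantissa x 2^exponent, with the radix point after the
    first mantissa bit."""
    fraction = twos(mantissa) / (1 << (len(mantissa) - 1))
    return fraction * 2.0 ** twos(exponent)


print(decode("0101000000", "000010"))   # 2.5
```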
Now let’s look at three simple examples:
0010100000|000011
0001010000|000100
0000000101|001000
If you look at these carefully you will see they are all equal to 2.5! Computer scientists hate waste and
inefficiency, and it is wasteful and inefficient to have lots of different ways of storing the same number.
We use a process called normalization to prevent this.
Decimal Standard Form
Before we look at normalization in binary, let’s look at something you have studied in mathematics.
Here are lots of ways of expressing the same number:
0.003
0.03 x 10^-1
0.3 x 10^-2
3.0 x 10^-3
All of these numbers are equal to 0.003, but only 3.0 x 10^-3 is in “Standard Form”. The rule for standard
form is that the mantissa (here the 3.0 part) is greater than or equal to 1 and less than 10. If it’s not, then
we move the decimal point and change the exponent to compensate.
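Python’s built-in "e" number format already applies this rule, so it makes a handy way to check your answers. The wrapper standard_form below is just a hypothetical convenience for this guide that rewrites the output in the notation used here.

```python
def standard_form(x, digits=1):
    """Format x as 'mantissa x 10^exponent' with 1 <= |mantissa| < 10."""
    mantissa, exponent = f"{x:.{digits}e}".split("e")
    return f"{mantissa} x 10^{int(exponent)}"


print(standard_form(0.003))         # 3.0 x 10^-3
print(standard_form(28353.25, 6))   # 2.835325 x 10^4
```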
Exactly the same technique is used to represent real (fractional) numbers in binary, except that in binary
the rule (for positive numbers) is that the mantissa begins with 0.1….
Normalization
Using our method, with a 10-bit two’s complement mantissa and a 6-bit two’s complement exponent, the
number 2.5 is represented like this:
0101000000|000010
It is not represented in any other way.
The reasoning is exactly the same as for decimal standard form. Any leading zeroes are removed from
the mantissa and the exponent is changed accordingly.
Now consider this number:
0000011101|000111
If you look carefully at it you will see it is equal to 111.01, or 7.25dec. But at the moment it is
not normalized and so it would not actually be represented like this.
When it is normalized it becomes:
0111010000|000011
Two things have happened.
1. The four leading 0’s after the point have been removed (i.e. the point has been shifted to the
right four places).
2. The exponent has been decreased by four.
So instead of moving the point seven to the right in this number:
0.000011101
We move it three to the right in this number:
0.111010000
Either way we end up with 111.01, or 7.25dec.
So with positive mantissas, the left-most bit is 0 and we remove all leading 0’s from the fractional part
of the mantissa.
This gives you a handy check: any positive mantissa expressed in normalized two’s complement floating
point format must begin with 0.1… If it doesn’t, you’ve done it wrong.
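Normalizing a positive mantissa can be sketched as a simple loop: while the bit after the sign bit is 0, drop it, pad a 0 on the right, and take 1 off the exponent. The function normalize_positive is an illustrative helper for this guide, and for simplicity it keeps the exponent as an ordinary Python integer rather than a bit string.

```python
def normalize_positive(mantissa, exponent):
    """Normalize a non-zero positive mantissa given as a bit string."""
    if mantissa[0] != "0" or "1" not in mantissa:
        raise ValueError("expects a non-zero positive mantissa")
    while mantissa[1] == "0":
        # Remove one leading 0 after the sign bit and pad on the right.
        mantissa = mantissa[0] + mantissa[2:] + "0"
        exponent -= 1
    return mantissa, exponent


print(normalize_positive("0000011101", 7))   # ('0111010000', 3)
```

The result always starts with 01, which is the 0.1… check described above.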
Negative Numbers
Firstly, bear in mind that a floating point number is only negative when the mantissa is negative. A
negative exponent doesn’t mean a negative number, it just means a number of small magnitude (e.g.
compare 1 x 10^-3 with -1 x 10^3).
With negative mantissas, instead of removing all the leading 0’s, we remove all leading 1’s.
Using our 10-bit mantissa and 6-bit exponent, let’s look at the representation of -0.125dec.
First we need the 10-bit binary representation of 0.125, which is:
0.001000000
Now we negate it by finding the two’s complement, which will give us the 10-bit binary representation
of -0.125:
(Flip the bits: 1.110111111)
(Add the lowest bit value: 1.111000000)
So the unnormalized mantissa is 1.111000000 (and the exponent is 000000).
We now remove the three leading 1’s and subtract 3 from the exponent, giving:
1000000000|111101
This gives us 1.000000000|111101 as the normalized floating-point representation.
(Notice that the exponent is -3 in two’s complement format!)
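To check conversions like this one, here is a Python sketch that goes straight from a decimal value to the normalized form. It relies on the fact that a normalized positive mantissa lies between 0.5 (inclusive) and 1, and a normalized negative mantissa lies between -1 (inclusive) and -0.5. The function name encode and its default sizes (10-bit mantissa, 6-bit exponent) are assumptions chosen to match this guide.

```python
def encode(value, m_bits=10, e_bits=6):
    """Return (mantissa, exponent) bit strings for a non-zero value."""
    if value == 0:
        raise ValueError("zero has no normalized form in this scheme")
    exponent = 0
    # Halve or double until the mantissa is in the normalized range.
    while not (0.5 <= value < 1 or -1 <= value < -0.5):
        if value >= 1 or value < -1:
            value, exponent = value / 2, exponent + 1
        else:
            value, exponent = value * 2, exponent - 1
    scaled = value * (1 << (m_bits - 1))      # mantissa in lowest-bit units
    if scaled != int(scaled):
        raise ValueError("value needs more mantissa bits than are available")
    if not -(1 << (e_bits - 1)) <= exponent < (1 << (e_bits - 1)):
        raise ValueError("exponent is out of range")
    mantissa = format(int(scaled) & ((1 << m_bits) - 1), f"0{m_bits}b")
    return mantissa, format(exponent & ((1 << e_bits) - 1), f"0{e_bits}b")


print(encode(-0.125))   # ('1000000000', '111101')
print(encode(7.25))     # ('0111010000', '000011')
```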
To get back to the number from the normalized floating-point format you need to bear in mind that
leading 1’s are irrelevant in a two’s complement negative number, just as leading 0’s are irrelevant in a
positive number.
Example:
000000000101 is the same as 101, no problem there. But in two’s complement:
111111111101 is the same as 101!
Why? Well complement them and see:
101 = -011 = -3dec
111111111101 = -000000000011 = -011 = -3dec
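You can confirm this in Python. The helpers twos_value and sign_extend are illustrative names: the first reads a bit string as two’s complement, the second widens a pattern by repeating its left-most bit.

```python
def twos_value(bits):
    """Integer value of a two's complement bit string."""
    raw = int(bits, 2)
    return raw - (1 << len(bits)) if bits[0] == "1" else raw


def sign_extend(bits, width):
    """Pad on the left with copies of the sign bit up to the given width."""
    return bits[0] * (width - len(bits)) + bits


print(twos_value("101"))            # -3
print(sign_extend("101", 12))       # 111111111101
print(twos_value("111111111101"))   # -3
```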
Now, to get back from our normalized floating-point representation of -0.125dec, which was
1.000000000|111101, we have to do the following.
1. Work out what the exponent is: 111101 = -000011 = -3 dec
2. This means we have to move the point to the left three places.
3. Remember now that in a negative two’s complement number the bits to the left of the MSB
(most significant bit) are 1’s and not 0’s! So it’s like this: …1111111111111.000000000, not
like this: …0000000000001.000000000.
4. Move the point, so 1.000000000 becomes 1.111000000.
5. Complement this to check its value: 1.111000000 → -0.001000000 = -0.125dec
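The steps above can be mirrored in Python by working on the bit string directly. This is a sketch with made-up helper names (shift_point and value); when the point moves left, the gap is filled with copies of the sign bit, exactly as in step 3.

```python
def shift_point(mantissa, exponent):
    """Move the radix point of a mantissa written like '1.000000000'.

    Bits brought in on the left are copies of the sign bit; bits brought
    in on the right are zeros.
    """
    sign, frac = mantissa.split(".")
    if exponent >= 0:
        frac = frac + "0" * exponent
        return sign + frac[:exponent] + "." + frac[exponent:]
    return sign + "." + sign * (-exponent) + frac


def value(pattern):
    """Value of a two's complement bit string that contains a radix point."""
    whole, frac = pattern.split(".")
    raw = int(whole + frac, 2)
    if whole[0] == "1":
        raw -= 1 << len(whole + frac)
    return raw / (1 << len(frac))


exponent = int("111101", 2) - (1 << 6)        # step 1: 111101 is -3
moved = shift_point("1.000000000", exponent)  # steps 2 to 4
print(moved, value(moved))                    # 1.111000000000 -0.125
```

The sketch keeps every bit instead of trimming back to ten, so it prints three extra trailing 0’s; they do not change the value.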
Test your understanding:
This exercise will focus on the following skills.
- Converting decimal numbers to fixed-point binary and back
- Minimum and maximum possible values given certain integer and fractional parts
- Normalizing unnormalized floating-point numbers
- Converting decimal numbers to normalized floating-point format
- Calculating maximum and minimum possible values given certain mantissa and exponent sizes
- Defining/explaining terms
1. Using fixed point two’s complement, with 4 integer bits + 4 fractional bits, calculate the
maximum and minimum values that can be stored.
2. Using 8 integer bits and 4 fractional bits, represent the following decimal numbers in fixed-point
two’s complement:
a. 125.5
b. -63.25
c. -17.625
d. -113.1875
3. Calculate the value of the following two’s complement fixed-point binary numbers:
a. 0101.1010
b. 1101.1100
4. What is the maximum value that can be stored using fixed-point two’s complement binary with 6
integer bits and 3 fractional bits?
5. The following unnormalized two’s complement floating-point numbers have an 8-bit mantissa
and a 4-bit exponent. Write them in normalized form and state their decimal values.
a. 000010100100
b. 000100101101
c. 111001010011
d. 111111011101
6. The following normalized two’s complement floating-point numbers have an 8-bit mantissa and
a 4-bit exponent. Calculate their decimal values.
a. 011000110101
b. 101000001110
7. Using an 8-bit mantissa and a 4-bit exponent, calculate the two’s complement floating-point
representation of the following decimal numbers.
a. 128.75
b. -58.625
8. Using an 8-bit mantissa and a 4-bit exponent calculate:
a. The highest number that can be represented
b. The lowest number that can be represented
9. Define fixed and floating-point number representation.
10. Compare and contrast fixed and floating-point number representations.
11. Explain the need for normalization in floating-point number representations.
12. Define underflow and overflow and give an example of each.