10.2 Sign Magnitude Representation

Sign magnitude is a straightforward method for representing both positive and negative integers. It uses the most significant digit of the digit string to indicate the sign of the number and the rest of the digits to represent its magnitude. This means, of course, that every number represented in this fashion has one more digit (the sign digit) than the conventional number.

Let us agree that a 0 in the sign position means that the number is positive. We indicate a negative number by the digit resulting from the subtraction of 1 from the base of the original number. We will see later that this decision has consequences for other representation schemes. But for now,

    $(+907)_{10} = (0907)_{10}$        $(-907)_{10} = (9907)_{10}$
    $(+605)_8 = (0605)_8$              $(-605)_8 = (7605)_8$
    $(+101)_2 = (0101)_2$              $(-101)_2 = (1101)_2$

Suppose we are using a 16 bit word to store positive and negative numbers in sign magnitude representation. What are the largest and smallest integers that we can accommodate? The smallest has a 1 in the sign position and 1's in all other positions:

    $(1111111111111111)_2 = -(2^{15} - 1) = (-32767)_{10}$

The largest has a 0 in the sign position and 1's in all other positions:

    $(0111111111111111)_2 = +(2^{15} - 1) = (+32767)_{10}$

Although sign magnitude representation of numbers is straightforward, it has two significant drawbacks. First, there are two ways to represent the number zero: there can be a positive zero and a negative zero. Second, the computer hardware necessary to do sign magnitude arithmetic is more complicated and, consequently, more expensive than that necessary to perform arithmetic on numbers stored in other representations.

10.3 Complement Representation

The additive inverse of a number is the number which, when added to the original number, results in the value zero. For example, the additive inverse of 23 is -23 because 23 + (-23) = 0. So the negation of a number is its additive inverse.
The fact that the size of the numbers a computer can represent is limited by the number of bits allocated to their representation has interesting consequences for additive inverses. Consider what would happen if we were to add $(433)_{10}$ and $(567)_{10}$ on a calculator which only handled numbers up to three digits long.

       433
     + 567
     -----
    (1)000    no digit in which to store the carry

The result is zero, so if we limit ourselves to three digits, 567 is the additive inverse of 433. That is, when 567 is added to 433 it produces a zero. The opposite, of course, is also true. In this three digit system the additive inverse of 1 is 999. So, as long as we restrict ourselves to three digits:

    -433 = 567
    -001 = 999

Recall that when we developed sign magnitude representation we said that any positive number has a 0 in the most significant digit position and that any negative number has (base - 1) in that position. So that we are consistent with how we defined positive and negative numbers in sign magnitude representation, let us put the sign digit back in. By necessity, we have to modify our calculator to handle four digits. Now we are looking for the additive inverse of $(0433)_{10}$ in four digits. That is, what number when added to 0433 will produce 0's in all four positions as well as a carry into the fifth position (that will then be lost)?

       0433
     + 9567
     ------
    (1)0000

So, 9567 is the additive inverse of 0433 when using four digits. Notice that 9567, a negative number, has a 9 in the most significant digit position. This is consistent with how we have defined the sign of a negative number: base - 1. In this case, the base is 10.

The additive inverse in base 10 when restricted to a fixed number of digits is called the 10's complement. It can be generalized to numbers in any base, B. To find the B's complement of a number, subtract the number whose complement you want to find from the base raised to the number of digits you have to work with.
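The subtraction rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `b_complement` and its parameters are hypothetical choices made for this sketch, and it assumes the number is supplied as a non-negative integer that fits in the stated number of digits.

```python
def b_complement(n: int, base: int, digits: int) -> int:
    """Return the base-`base` complement of n in a fixed number of digits.

    Hypothetical helper for illustration; n is the value of the digit
    string, including the sign digit.
    """
    modulus = base ** digits
    if not 0 <= n < modulus:
        raise ValueError("n does not fit in the given number of digits")
    # Subtract from base**digits. The final modulo maps the complement of 0
    # back to 0, mirroring the carry that is lost off the left end.
    return (modulus - n) % modulus


if __name__ == "__main__":
    print(f"{b_complement(433, 10, 3):03d}")   # three-digit calculator
    print(f"{b_complement(433, 10, 4):04d}")   # four digits, sign digit included
    print(format(b_complement(0b0010, 2, 4), "04b"))  # same rule in base 2
```

Adding a number to its complement and discarding the carry gives zero, which is a quick way to check any result: `(433 + b_complement(433, 10, 3)) % 10**3` is 0.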
Formally, we define the B's complement like this:

    $(N')_B = B^D - (N)_B$

where:

    $(N)_B$   is the number whose complement you want to find
    $(N')_B$  is the complement
    $B$       is the base
    $D$       is the number of digits you are working with, including the sign digit

Using this technique, the complement of 150 base 10 in four digits is given by:

    $N' = 10^4 - 0150 = 9850$

10.3.1 2's Complement Representation

Though we can do complement arithmetic in any base, it is binary arithmetic using 2's complement representation that is of most interest to us. To keep things simple, imagine that we are working with a four bit computer. These are the positive integers we can represent using three magnitude bits and a sign bit:

    Decimal    Binary
       0        0000
       1        0001
       2        0010
       3        0011
       4        0100
       5        0101
       6        0110
       7        0111

Now, let's take the complement of the first few of these, keeping only four bits of each result:

      10000      10000      10000      10000
    -  0000    -  0001    -  0010    -  0011
      -----      -----      -----      -----
       0000       1111       1110       1101

At this point a pattern should be emerging that could save us a lot of work. Notice that in every case the complement and the original number are identical from right to left up to and including the first 1 of the original number. Beyond that point, all the digits are opposite. The complement of 0010, for example, is 1110.

This, of course, is a consequence of binary arithmetic. Beginning from the right, subtracting a 0 from a 0 produces a 0. Thus all 0's in the original number remain 0's in the complement until you encounter the first 1. When this happens, subtracting a 1 from a 0 results in a borrow from the next column. So, 1 from 10 in binary is 1. The column after the one where the first borrow occurred is where the changes begin. We borrowed from this column, but this column originally had a 0 in it. So this column was forced to borrow from the column to its left, and so on. Once this column borrowed successfully, it became 10. But, of course, the column to its right borrowed from it and so changed it to 1. Now, if the number in the original column is 1, subtracting it from 1 produces a 0.
If the number in the original column is 0, subtracting it from 1 produces 1.

Here is the algorithm to find the 2's complement of a number, working from the least significant bit:

    1. Leave the 0's before the least significant 1 untouched.
    2. Leave the least significant 1 untouched.
    3. After the least significant 1, change 0's to 1's and 1's to 0's.

Students often confuse 2's Complement as a representation scheme with taking the 2's complement of a number. As a representation scheme, 2's Complement is a set of numbers. The number of members in this set depends on how many digits we have to work with. Taking the 2's complement of a number is the act of applying the algorithm just described. Thus there are positive 2's Complement numbers as well as negative 2's Complement numbers, and we can find the 2's complement of each of them.

Clearly, finding the 2's complement of a positive 2's Complement number produces its additive inverse, a negative 2's Complement number. This scheme would not be consistent with the laws of arithmetic if the opposite were not true: finding the 2's complement of a negative 2's Complement number produces the positive 2's Complement number. For example, 0111 is a positive 2's Complement number. Its complement is 1001. Now, using the algorithm above, the complement of 1001 is 0111, as predicted.

Here is the table of binary numbers and their complements using four bits.

    Decimal    Binary    2's Complement
      -8       -1000         1000
      -7       -0111         1001
      -6       -0110         1010
      -5       -0101         1011
      -4       -0100         1100
      -3       -0011         1101
      -2       -0010         1110
      -1       -0001         1111
       0        0000         0000
       1        0001         0001
       2        0010         0010
       3        0011         0011
       4        0100         0100
       5        0101         0101
       6        0110         0110
       7        0111         0111
       8        1000          -

Using 2's Complement representation and four bits, we can represent numbers from -8 to +7 in base 10. Can you see why there is no 2's Complement representation for +8 base 10 in four bits?

This is an improvement over sign magnitude representation for two reasons. First, there is a single representation for 0. Second, 2's Complement is a positional number scheme, not a code.
To convince yourself of this, try adding -5 and +2 in 2's Complement representation:

      1011    (-5)
    + 0010    (+2)
      ----
      1101    (-3)

1101 is -3 in 2's Complement, just as we would expect.

10.4 Storing Characters

Although we can store signed and unsigned numbers to varying degrees of precision on a computer, this is not the end of the story. A computer manipulates textual data as well. This text, for instance, is being written using a word processor on a microcomputer. Somehow, the letters and punctuation symbols, as typed, are transformed into strings of 1's and 0's and stored in the computer. Clearly, these letters and punctuation symbols (let's call them "characters") are encoded and then decoded when the time comes to print the document out.

The only real requirement for an encoding scheme is that it be unambiguous. If we assign a string of bits to a letter, we cannot assign that same string of bits to another letter. One encoding scheme that fulfills this requirement is ASCII, an acronym for American Standard Code for Information Interchange.

ASCII is a seven bit code. It represents character data as strings of seven 1's and 0's. This means that we can use ASCII to encode $2^7 = 128$ different characters. These 128 characters are shown in the ASCII table handed out in class. The table has four groups of three columns. The leftmost column shows the hexadecimal code for a given character. If we translate this to binary and ignore the most significant bit, we will have the actual string of 1's and 0's stored in memory. The middle column shows the decimal equivalent of the hexadecimal code. The rightmost column shows the character that the hexadecimal code represents.

For example, find decimal code 84 in the third group of columns. The hexadecimal code here is 54. Since $(54)_{16} = (84)_{10}$, we can see that the decimal code is listed for convenience only. The binary equivalent of hexadecimal 54 is 01010100. The least significant seven bits of this binary number are the ASCII code for upper case "T".
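The table lookup for "T" can be reproduced with a short Python sketch. The function name `ascii_views` is a hypothetical helper introduced here; it relies only on the built-in `ord` and `format`, and it assumes the character falls in the seven bit ASCII range.

```python
def ascii_views(ch: str):
    """Return (decimal code, hex code, seven bit ASCII string) for one character.

    Hypothetical helper for illustration only.
    """
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a seven bit ASCII character")
    # Mask off the most significant bit of the byte to keep the seven ASCII bits.
    seven_bits = format(code & 0x7F, "07b")
    return code, format(code, "02X"), seven_bits


if __name__ == "__main__":
    decimal, hexadecimal, bits = ascii_views("T")
    print(decimal, hexadecimal, bits)
```

For "T" this gives decimal 84, hexadecimal 54, and the seven bit string 1010100, matching the three columns of the table entry discussed above.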
The characters encoded through ASCII are:

    26  upper case alphabetic characters
    26  lower case alphabetic characters
    32  punctuation and miscellaneous symbols
    10  numeric characters
    34  control characters

Notice that the binary code for each of the alphabetic characters is one greater than that of the preceding character. For example, the upper case characters range from hex 41 for "A" through hex 5A for "Z". The actual ASCII codes are 1000001 through 1011010. This means that sorting operations on alphabetic data can be done using arithmetic operations: "A" comes before "B" and 1000001 comes before 1000010.

Notice also the relationship between the codes for upper and lower case alphabetic characters:

    Character    Hex Code        Character    Hex Code
        A           41               a           61
        B           42               b           62
        .           .                .           .
        .           .                .           .
        .           .                .           .
        Y           59               y           79
        Z           5A               z           7A

Since ASCII encodings are just strings of 1's and 0's, they can be treated as binary numbers. We can, therefore, translate from upper case to lower case by adding $(20)_{16}$ to the code for the upper case character. But let's look more closely. The following table shows not just the hexadecimal code, but its binary equivalent.

    Character    Hex    Binary          Character    Hex    Binary
        A        41     01000001            a        61     01100001
        B        42     01000010            b        62     01100010
        .        .      .                   .        .      .
        .        .      .                   .        .      .
        .        .      .                   .        .      .
        Y        59     01011001            y        79     01111001
        Z        5A     01011010            z        7A     01111010

Notice that the upper case letter differs from its lower case equivalent only in the fifth bit, where the least significant bit is designated the zeroth. So, we can convert from upper case to lower case by changing the fifth bit from 0 to 1. We can convert from lower to upper case by changing the fifth bit from 1 to 0.

It is important not to be misled by the ASCII representations of the numeric characters. These are codes, not numbers. They are used for encoding numbers on which computations will not be performed.

Finally, the control characters are used to send messages to hardware devices. Hex 0D, for example, is the encoding for a carriage return character.
When an output device encounters this character, it begins to display text on the next line.

Because ASCII is a seven bit code, we can store one ASCII encoded character in a byte. We will discuss a possible use to which the most significant bit might be put in the next section. For now, assume it is a zero. Here is the ASCII encoding for the word "computer" starting at hexadecimal address 00000040:

    Address     Character    Hex Code    As Stored
    00000040       "c"          63       01100011
    00000041       "o"          6F       01101111
    00000042       "m"          6D       01101101
    00000043       "p"          70       01110000
    00000044       "u"          75       01110101
    00000045       "t"          74       01110100
    00000046       "e"          65       01100101
    00000047       "r"          72       01110010
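As a rough sketch, the storage table above, together with the fifth-bit case rule from earlier in this section, can be reproduced in Python. The names `memory_layout`, `to_lower`, `to_upper`, and `CASE_BIT` are hypothetical and introduced only for this illustration. The sketch assumes one character per byte with the most significant bit set to zero.

```python
CASE_BIT = 1 << 5  # bit five, counting the least significant bit as bit zero


def to_lower(ch: str) -> str:
    """Set bit five of an upper case letter; leave other characters alone."""
    return chr(ord(ch) | CASE_BIT) if "A" <= ch <= "Z" else ch


def to_upper(ch: str) -> str:
    """Clear bit five of a lower case letter; leave other characters alone."""
    return chr(ord(ch) & ~CASE_BIT) if "a" <= ch <= "z" else ch


def memory_layout(text: str, start: int):
    """Return (address, character, hex code, byte as stored) for each character."""
    rows = []
    for offset, ch in enumerate(text):
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("not a seven bit ASCII character")
        # One character per byte; the most significant bit is left as zero.
        rows.append((start + offset, ch, format(code, "02X"), format(code, "08b")))
    return rows


if __name__ == "__main__":
    for address, ch, hex_code, stored in memory_layout("computer", 0x40):
        print(f"{address:08X}  {ch!r}  {hex_code}  {stored}")
    print(to_upper("c"), to_lower("C"))
```

The range checks in `to_lower` and `to_upper` matter: flipping bit five of a character that is not a letter would produce an unrelated character rather than a change of case.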