2. MACHINE REPRESENTATION OF TYPICAL ARITHMETIC DATA FORMATS (NATURAL AND INTEGER NUMBERS). 2.1. Natural Binary Code (NBC). The positional code with base 2 (B=2), introduced in Exercise 1, is used to encode the integer numbers without the sign (that means natural numbers). This code is usually called NBC (Natural Binary Code). In the machine representation the length of the code word is constant and depends on internal processor’s architecture – it’s equal to the length of basic machine / ALU word (32 or 64 bits in most of the today’s CPUs, for example) or the multiplicity of this basic machine word (for 8 or 16-bit processors, for example). In most of upper-level programming languages the natural numbers are represented as 32-bit NBC codes (“normal” natural numbers) and 64-bit NBC codes (“long” natural numbers). In the C-language these two types are declared as: unsigned int var1; unsigned long int var2; // // // // The variable var1 is a “normal”, 32-bit natural number The variable var2 is a “long”, 64-bit natural number On the other hand, the “short” version of natural number is also provided in most of languages that means 16-bit NBC value (this type of data was a basic one in 16-bit processors like Intel 8086/88 or Motorola 68000, for example). In the C-language notation declaration of such a type looks like this: unsigned short int var3; // The variable var3 is a “short”, 16-bit // natural number, “word” in assembly language The shortest version of the natural number used in C-language is called char from “character”. As a matter of fact this type of data stands for 8-bit NBC value: unsigned char var4; // The variable var4 is a “tiny” 8-bit // natural number, “byte” in assembly language The natural value which we can encode using 8 bits is from 0 to 255, so it has limited rather application in mathematical calculations (but can be useful as an error code returned by function, for example). In most of cases the char type is used for encoding the ASCII (American Standard Code for Information Interchange) characters. Example 1. The C-language code shown below uses one unsigned char value (a variable). It shows, that the same binary code can be used for a letter or number, depending on interpretation – by the printf() function (with formatted output), this time. #include <stdlib.h> #include <stdio.h> int main() { unsigned char a; a = 'A'; printf( "a = %d as decimal, %c as character.\n", a, a ); printf( "a+1 = %d as decimal, %c as character.\n", a+1, a+1 ); } The output from the program is: a = 65 as decimal, A as character. a+1 = 66 as decimal, B as character. 1 Knowing the construction of the NBC code we can try some basic mathematical operations, like addition or subtraction. Example 2. Try to add two 8-bit NBC values 00101110b and 01101101b: 00101110b = 2Eh = 46d + 01101101b = 6Dh = 109d ______________________ 10011011b = 9Bh = 155d Explanation: Adding two binary digits on each i-position (xi and yi) of the n-bit code, to obtain the sum on this position (si) we have to deal with these digits according to this table: xi 0 0 1 1 yi 0 1 0 1 si = xi + yi 0 1 1 0 ci 0 0 0 1 The ci value means carry from current position to the next (upper, left-side) position in the 8bit number. Only the right-most position in the sum has no potentially non-zero carry value from the lower (right-side) position. So we should modify the table above to include the carry value from the “younger” bit on each position (ci-1): ci-1 0 0 0 0 1 1 1 1 xi 0 0 1 1 0 0 1 1 yi 0 1 0 1 0 1 0 1 si = xi + yi+ ci-1 0 1 1 0 1 0 0 1 ci 0 0 0 1 0 1 1 1 Now we can write down the binary addition of two numbers with these comments this way: c7 c6 c5 c4 c3 c2 c2 c0 0 1 1 0 1 1 0 0 x7 x6 x5 x4 x3 x2 x1 x0 0 0 1 0 1 1 1 0 y7 y6 y5 y4 y3 y2 y1 y0 0 1 1 0 1 1 0 1 + __________________________________________________________________ s7 s6 s5 s4 s3 s2 s1 s0 1 0 0 1 1 0 1 1 Please notice, that carry value from the most significant position (cn-1 = c7) is equal to zero, so we have correct 8-bit result (the sum is still 8-bit binary number). More complicated operation is a subtraction of two numbers in any positional system. The borrowing (credit) from the more significant (left-side) position(s) is necessary if the digit on current position in minuend is lower than in subtrahend. Sometimes we have to borrow not 2 from the nearest more significant position, because the digit in this position is 0 and can’t borrow us any value. Example 3. Decimal subtraction of two numbers: 202 – 143. The minuend (202) is greater than the subtrahend (143), so we shouldn’t have any problem (the result expected will be still the natural number – the sign of the result will not change to negative). How the positional subtraction is performed? 9 10 Results of borrowing (credits) 2 0 2 - 1 4 3 0 5 9 The binary subtraction is done exactly in the same way, but the values of credit are 2 and 1 respectively. Example 4. Binary subtraction of two 4-bit NBC numbers: 1001 – 0110. The minuend (1001) is greater than the subtrahend (0110), the result expected will be still the 4-bit natural number. 1 2 Results of borrowing (credits) 1 0 0 1 - 0 1 1 0 0 0 1 1 In general the subtraction is more complicated operation than addition. The implementation of the borrowing in the logic circuits of the ALU is even much more difficult than carry. So the optimal implementation for subtraction is the adding of the subtrahend multiplied by -1: A – B = A + (-B). The two’s complement code (see section 2.3.) usually used for encoding integer numbers (with sign) makes it very easy (it’s very easy to obtain a -B value from B without many additional operations in the ALU, the adder designed for NBC values works correctly for two’s complement values). 2.2. Sign-modulus (SM) code. The simplest code that can be used for encoding integer numbers with sign uses the most significant (the first from the left side) bit in the binary code word as a sign (0 means “+”, 1 means “–“). So the number -1d in 8-bit “sign-modulus” (SM) code will be written as 10000001b. We can try to write down the definition of this code in more “mathematical” way. Notice, that bit 1 on the most significant position has a value 2n-1 (27 = 128 for 8-bit code). So the formula for the n-bit SM code of the number x can be written as: xSM = x x 0, xSM = 2n-1 + |x| (or 2n-1 – x) x 0. 3 It’s easy to see that the number 0d has two codes ( 0), 00000000b and 10000000b encoded in 8-bit SM code for example. 2.3. Two’s complement (TC) code. The idea of n-positional, base-complement (BC) code (first time used in XVII century by Blaise Pascal in his mechanical calculator) can be explained with the formula: xBC = x x 0, xBC = Bn – |x| (or Bn + x) x < 0. That means the code for negative number x is obtained by subtracting the absolute value of this number from the largest power of base (Bn in the code with n positions). The word “complement” means the difference between the Bn and |x|. The binary (B = 2) code of this kind is called two’s complement and can be defined by formula given below: xTC = x x 0, xTC = 2n – |x| (or 2n + x) x < 0. Example 5. In the 4-bit two’s complement code the values for negative numbers can be counted from the formula: xTC = 24 – |x| (or 24 + x). Let’s calculate these values (24 = 16): xd xTC d xTC b -1 15 1111 -2 14 1110 -3 13 1101 -4 12 1100 -5 11 1011 -6 10 1010 -7 9 1001 -8 8 1000 5 5 0101 6 6 0110 7 7 0111 The codes for zero and positive numbers will be as follows: xd xTC d xTC b 0 0 0000 1 1 0001 2 2 0010 3 3 0011 4 4 0100 Basic considerations about two’s complement code: Most significant bit has the meaning of the “sign” (0 for “+”, 1 for “–“), like in the signmodulus code. So it’s easy to distinguish between positive and negative values. All 2n code values (16 for n = 4) are used, each of them for a different number. Exactly 2n / 2 codes (8 for n = 4) are used for encoding negative numbers, one code is used for 0 number, 2n / 2 – 1 codes (7 for n = 4) are used for positive numbers. Example 6. Try to subtract (binary, 4-bit) two numbers 7 – 3 by adding the ten’s compliment and two’s complement code for -3 to code of 7, that means calculate 7 + (-3): 10-compliment of 3: -310C = 10 – 3 = 7 7 +7 14 C3 C2 C1 C0 1 1 1 1 0 1 1 1 + 1 1 0 1 0 1 0 0 4 7d = 0111b -3d = 1101b Carry bits The result is correct (0100b = 4d), but we can notice the non-zero carry value from the last significant bit (c3). The same (carry to the next position) we can see in decimal addition. In the NBC code this was a signal of error – the result didn’t fit to length of code. This time the error signal (called overflow, V) is detected (in most of processors) in a bit different way: V = cn-1 cn-2. The operation (exclusive or – EXOR) gives positive result (1) if only one of the arguments is positive: cn-1 0 0 1 1 cn-2 0 1 0 1 V = cn-1 cn-2 0 1 1 0 In our example V = cn-1 cn-2 = c3 c2 = 1 1 = 0, so the result is really correct. As a matter of fact, the V signal is a useful and effective solution, but it wasn’t implemented in Intel 8080 CPU for example. So the programmers was forced to detect the overflow in operations with TC numbers from these considerations: Addition: when signs of both arguments are the same but sign of result is different, then result is not correct as TC integer. Subtraction: when signs of both arguments are different and sign of result is different than the sign of first argument, then result is not correct as TC integer. Can you give rules for multiplication and division of the TC integers? 2.4. One’s complement (OC) code. The code called “one’s complement” (OC) is made by bitwise NOT operation on the bites of the absolute value of the encoded number. We can show this idea on 4-bit example: Positive numbers (just equal to their absolute values): xd xOC d XOC b 0 0 0000 1 1 0001 2 2 0010 3 3 0011 4 4 0100 5 5 0101 6 6 0110 7 7 0111 Negative numbers (bitwise negations of the absolute values – complement to 1 for each bit): xd xOC d XOC b 0 15 1111 -1 14 1110 -2 13 1101 -3 12 1100 -4 11 1011 -5 10 1010 -6 9 1001 -7 8 1000 -5 11 1011 -6 10 1010 -7 9 1001 Comparing these codes with negative two’s complement codes: xd xTC d xTC b -1 15 1111 -2 14 1110 -3 13 1101 -4 12 1100 we can notice, that the xTC value for each negative number can be easily calculated from the xOC value by adding 1. 5 The mathematical definition (formula) for the n-bit “one’s complement” code can be written down this way: xOC = x x 0, xOC = 2n – 1 – |x| (or 2n – 1 + x) x 0. We can also assume, that xTC = xOC + 1 x 0. So we can obtain the two’s complement value very easily: Make the bitwise negation on the binary code of the absolute value of the negative number we want to encode (it’s very easy for digital hardware too). Add 1 to code obtained in the first step. Example 7. Try to encode the number -5d in 4-bit two’s complement code: 5d = 0101b, the bitwise negation is 1010b, plus 1 gives 1011b (to check this see chapter 2.3.). 2.5. Binary Coded Decimal (BCD) code. In computing and electronic systems, binary-coded decimal (BCD) is an encoding for decimal numbers in which each digit (0,…, 9) is represented by its own binary (4-bit) sequence. In most of cases “BCD” stands for a code in which each decimal digit is represented by a 4-bit NBC (weights 8-4-2-1) word, used in limited range: Decimal digit BCD “digit” 0 1 2 3 4 5 6 7 8 9 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 The NBC codes larger than 1001b are not used as digits in BCD code (6 wasted values). Example 8. The decimal value 1984 in the BCD code looks like this: 0001 1001 1000 0100. The purpose for the BCD code is limited not for arithmetical operations in computer programs rather, but for driving specialized digital displays (decimal counters, clocks, digital measuring devices etc.). Typical display device used as output for BCD-encoded numbers is 7-segment LED or LCD display module (see Fig. 1.). In fact special logic circuit (decoder / driver) is connected between BCD-encoded output from measuring system (or computer) and display. A typical 7-segment LED display component, with decimal point (8-segments in fact). 6 A widely used variant of this code is called “packed BCD” (or simply packed decimal), where numbers are stored with two decimal digits “packed” into one byte each. The last decimal digit (or “nibble”) is used as a sign indicator. The preferred sign values are 1100b (Ch) for positive (+) and 1101b (Dh) for negative (−). Other allowed signs are 1010b (Ah) and 1110b (Eh) for positive and 1011b (Bh) for negative. Some implementations also provide unsigned BCD values with a sign nibble of 1111b (Fh). Example 9. In packed BCD, the decimal number +127 is represented as two bytes: 0001.0010 0111.1100 (hex. 12 7C), and −127 as 0001.0010 0111.1101 (hex. 12 7D). The processors usually don’t support BCD-oriented arithmetic operations (like “addBCD”, for example), upper-level languages don’t have the basic types like “BCD int”. But in most of cases it’s possible to write assembly program which can operate on BCDencoded numbers, using special correction for normal binary additions, subtractions etc. when needed. Example 10. Let’s analyze the binary addition of two BCD values: C7 C6 C5 C4 47d = 0100.0111b + 19d = 0001.1001b 0 66d = 0110.0110b + 0 0 1 C3 C2 C1 C0 1 1 1 1 0 1 0 0 0 1 1 1 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 0 Carry bits The result is not correct in BCD code, although both the digits (nibbles) are less than 10d (1010b). Please notice, that there was non-zero carry between lower and upper nibble (c3). This carry bit is often called an auxiliary carry. This fact is a signal for us, that the BCD correction should be applied to the result. The second condition can be the value of nibble greater than 9d (1001b), but this doesn’t happen in this example. What kind of correction should be done? We need to subtract 10d from the lower nibble of the result (like we do in “normal” decimal adding). How to do it as fast as it’s possible? Just add the two’s compliment of the 10d number. Let’s count -10: 10d = 1010b, bitwise negation is 0101b, plus 1 is 0110b (6d). So let’s do it: C7 C6 C5 C4 47d = 0100.0111b + 19d = 0001.1001b 0 66d = 0110.0110b + + 0 0 1 1 C3 C2 C1 C0 1 1 1 0 1 0 0 0 1 1 1 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0 0 1 1 0 7 Carry bits BCD correction “plus 6 correction” This kind of correction is often called “a plus 6 correction”. Of course, if the result on upper nibble fulfills one of the BCD correction conditions (value of the nibble greater than 1001b or carry bit c7 set to 1) we will have to apply the “plus 6 correction” to the upper nibble too. The minimum support from the CPU for BCD arithmetic we should count on is checking auxiliary carry (AC = c3) and carry (CY = c7) bits. In fact most of processors have the packed BCD correction implemented as one instruction, so programmers should only remember to make it after each operation on BCD arguments. 8 Exercises: 1. Fill the table below for the sign-module (SM), one’s complement (OC) and two’s complement (TC) codes. Assume 16-bit code word (n = 16). Binary notation ZERO Maximum number Minimum number Hexadecimal notation Decimal value SM OC TC SM OC TC SM OC TC 2. Try to draw the diagram for the (discrete) function f(x) = xSM (n=8). 3. Try to draw the diagram for the (discrete) function f(x) = xOC (n=8). 4. Try to draw the diagram for the (discrete) function f(x) = xTC (n=8). 5. Using the numbers from your date of birth (day, month, year) write down the following arguments as 16-bit integers: A = mmddd = ?h = ?b; B = yyyyd = ?h = ?b. Then perform on binary arguments (two’s complement code) the following operations: A + B = ?; A – B = ?, B – A = ? 6. Using the numbers from your date of birth (day, month) write down the following arguments as packed BCD-coded values (one byte, two decimal digits): A = mm; B = dd Try to add these arguments checking conditions for BCD correction (“plus 6 correction”) and apply correction if needed. 9