1.4.1 – Data Types
A-level Computer Science

Specification Overview

Specification Points: 1.4.1 Data Types
• Primitive data types: integer, real/floating point, character, string and Boolean.
• Represent positive integers in binary.
• Use of sign and magnitude and two’s complement to represent negative numbers in binary.
• Addition and subtraction of binary integers.
• Represent positive integers in hexadecimal.
• Convert positive integers between binary, hexadecimal and denary.
• Representation and normalisation of floating point numbers in binary.
• Floating point arithmetic, positive and negative numbers, addition and subtraction.
• Bitwise manipulation and masks: shifts, combining with AND, OR and XOR.
• How character sets (ASCII and UNICODE) are used to represent text.

Overview
Computers use electricity to represent information. An electrical signal can be treated as having two possible states: on or off. Computers represent ‘on’ using the digit 1 and ‘off’ using the digit 0. These 0s and 1s are called binary, and combinations of 0s and 1s can be used to represent anything from a simple number to a film.

Each 0 or 1 in binary is called a bit (short for binary digit). Because it takes so many bits to represent even quite small numbers, let alone complex things like games or music files, bits are commonly grouped together and the groups given names. A group of 4 bits is called a nibble and can represent 2^4 (16) combinations. A group of 8 bits is called a byte; a byte (for example 00000000) can provide 2^8 (256) different combinations of 0s and 1s.
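The link between the number of bits and the number of combinations can be checked with a short Python sketch. The function name `combinations` is an illustrative choice, not something from the specification; the only fact it relies on is that n bits give 2^n distinct patterns.

```python
def combinations(bits: int) -> int:
    """Return how many distinct patterns a group of `bits` bits can hold."""
    return 2 ** bits


# A bit, a nibble and a byte, as named in the overview above.
for name, bits in [("bit", 1), ("nibble", 4), ("byte", 8)]:
    print(f"{name}: {bits} bit(s) -> {combinations(bits)} combinations")
```

Every extra bit doubles the count, which is why a byte holds 16 times as many patterns as a nibble.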
Bytes are grouped together in the following ways:
• 1 024 bytes = 1 kilobyte (kB) = 2^10 bytes
• 1 048 576 bytes = 1 megabyte (MB) = 2^20 bytes
• 1 073 741 824 bytes = 1 gigabyte (GB) = 2^30 bytes
• 1 099 511 627 776 bytes = 1 terabyte (TB) = 2^40 bytes

Primitive Data Types
Nearly all programming languages provide programmers with five primitive (or simple) data types that can be combined to create more complex (or composite) data types. The five primitive data types are integer, real, character, string and Boolean.

The amount of memory that each data type takes up varies from language to language and processor to processor (for example, some types take up more bits on a 64-bit system than on a 32-bit system). A good estimate is that a character takes up 8 bits and an integer 32 bits. The rest of this chapter shows how binary is used to represent the values held using these data types.

Task
Copy this table into your blog and complete it:

Field/Variable                           Data Type    Justification for choice of Data Type
Shoe Size
Password
Multiple Choice Answer (Single Letter)
Age
Over 18?
Goal Difference

Representing Positive Numbers in Binary
Binary is a base 2 number system, which means that each column is worth twice as much as the column to its right. Our usual number system is base 10, also known as denary. In denary, each column is worth ten times as much as the column to its right.

In the denary system, 1011 means one thousand and eleven, because 1000 + 10 + 1 equals one thousand and eleven. In the binary system, 1011 means eleven, because 8 + 2 + 1 = 11.

Denary columns:
1000   100   10   1
   1     0    1   1

Binary columns:
8   4   2   1
1   0   1   1

Converting from Denary to Binary
To convert 98 from denary into binary:

98 divided by 2 = 49 remainder 0
49 divided by 2 = 24 remainder 1
24 divided by 2 = 12 remainder 0
12 divided by 2 = 6 remainder 0
6 divided by 2 = 3 remainder 0
3 divided by 2 = 1 remainder 1
1 divided by 2 = 0 remainder 1
0 divided by 2 = 0 remainder 0 (an extra step to pad the answer to 8 bits)

Write out the remainders as the binary value, placing the first remainder (from the top of the list) at the far right-hand side and working leftwards.
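The repeated-division method above can be sketched in Python as follows. The function name `denary_to_binary` and the default width of 8 bits are assumptions made for this illustration.

```python
def denary_to_binary(n: int, width: int = 8) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n < 0:
        raise ValueError("only non-negative integers are handled here")
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        # The first remainder found is the right-most (least significant) bit.
        remainders.append(str(remainder))
    bits = "".join(reversed(remainders)) or "0"
    # Pad with leading zeros, as the extra division step does in the worked example.
    return bits.zfill(width)


print(denary_to_binary(98))  # 01100010
```

Python's built-in `int("01100010", 2)` performs the reverse conversion and is a quick way to check an answer.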
The result is 01100010, which equals 98.

Converting from Binary to Denary
To convert 10000111 from binary to denary, write the bits under the column values and add up the columns that contain a 1:

128   64   32   16   8   4   2   1
  1    0    0    0   0   1   1   1

128 + 4 + 2 + 1 = 135

Task
Convert the following, showing your workings:
1. 01111001 into denary
2. 44 into binary
3. 01011010 into denary
4. 111 into binary
5. 11111001 into denary

Representing negative numbers in binary
Binary can also be used to represent negative numbers (−1, −2, −3 and so on). There are two ways in which this can be done, so you’ll need to read exam questions carefully to make sure you use the correct method:
• Sign and Magnitude
• Two’s Complement

Sign and Magnitude
Sign and magnitude is the simplest way of representing negative numbers in binary. The most significant bit (MSB) is simply used to represent the sign: 1 means the number is negative, 0 means it is positive. So to represent −4 in 8-bit binary you would write out 4 in binary, 00000100, then change the MSB to a 1, creating 10000100.

This notation isn’t used very often, because using the MSB as a sign bit means that the largest number that can be represented using 8 bits is 127, much less than 255. It also makes calculations harder, as different bits mean different things: some represent magnitude, another represents the sign. In addition, zero is represented twice, as both positive and negative 0. 8-bit sign and magnitude can represent numbers in the range −127 to 127.

Two’s Complement
Two’s complement is a much more useful way of representing negative numbers: zero has a single representation, ordinary binary addition works for negative values, and 8 bits cover the range −128 to 127.

The easiest way to show a negative number using two’s complement is to write it out as a positive number using the usual binary method. Then, starting at the right-hand side, leave every bit up to and including the first 1 alone, but invert all the bits after this.

https://www.youtube.com/watch?v=aoQrPv8CxWA&index=4&list=PLCiOXwirraUBO3Z2dxnIfuNDspmJmorJB
For example, to show −90 in binary, first work out 90. 90 is 01011010 because 64 + 16 + 8 + 2 = 90.

128   64   32   16   8   4   2   1
  0    1    0    1   1   0   1   0

Then start at the right-hand side and leave everything alone up to and including the first 1 (here, the right-most two bits, 10):

128   64   32   16   8   4   2   1
  0    1    0    1   1   0   1   0

Finally, invert everything to the left of this, so all the 1s become 0s and vice versa:

(−)128   64   32   16   8   4   2   1
     1    0    1    0   0   1   1   0

Thus, −90 in binary is 10100110 in two’s complement notation (−128 + 32 + 4 + 2 = −90).

Binary Addition
Binary addition is fairly straightforward. Just remember these simple operations:
0 + 0 = 0
1 + 0 = 1
0 + 1 = 1
1 + 1 = 10 (write 0, carry 1)
1 + 1 + 1 = 11 (write 1, carry 1)

To add together two binary numbers, just arrange them above each other in a table. The following table shows the number 59 above the number 117:

Carry     1   1   1   1   1   1   1
59        0   0   1   1   1   0   1   1
117       0   1   1   1   0   1   0   1
Result    1   0   1   1   0   0   0   0

So our final result is 10110000 or 176. We know this is correct because we can check our answer: 59 + 117 = 176.

Task
Add together the following binary values, showing your workings:
1. 01111001 + 00011011
2. 01010101 + 00111101
3. 11101110 + 00011111

Exam Tip!
In an exam it can be tempting just to work out the answer in denary and then write down the binary equivalent of this number. Avoid this temptation, as questions of this type may require you to show your working.

Binary Subtraction
Binary subtraction is the same as binary addition, except that you convert the number to be subtracted into negative binary (using two’s complement) before adding the two together.

For example, if you want to calculate 123 − 77 in binary, the first thing you do is write down the binary for 123 (01111011). Then you work out what −77 is in binary using two’s complement and write this beneath it (77 is 01001101, thus −77 is 10110011). Finally, you add them together using the same operations as for binary addition. (Notice that the final 1 that would be carried off the end of the final (8th) bit is just ignored.)
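As a cross-check on the two’s complement and subtraction steps described above, here is a minimal Python sketch. The function names and the fixed 8-bit width are assumptions made for illustration; the masking with `& ((1 << bits) - 1)` plays the role of ignoring the carry off the end of the 8th bit.

```python
BITS = 8  # assumed word size, matching the 8-bit examples in these notes


def to_twos_complement(n: int, bits: int = BITS) -> str:
    """Return the two's complement bit pattern of n in the given width."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given number of bits")
    return format(n & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Interpret a bit pattern as a two's complement integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":            # MSB set means the value is negative
        value -= 1 << len(pattern)
    return value


def subtract(a: int, b: int, bits: int = BITS) -> str:
    """Compute a - b by adding the two's complement of b to a."""
    total = int(to_twos_complement(a, bits), 2) + int(to_twos_complement(-b, bits), 2)
    # Keep only the lowest `bits` bits, discarding any carry out of the MSB.
    return format(total & ((1 << bits) - 1), f"0{bits}b")


print(to_twos_complement(-90))   # 10100110
print(to_twos_complement(-77))   # 10110011
print(subtract(123, 77))         # 00101110
```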
So the answer is 00101110 = 46, and we know this is correct because 123 − 77 = 46. The working looks like this:

Carry     1   1   1           1   1
123       0   1   1   1   1   0   1   1
−77       1   0   1   1   0   0   1   1
Result    0   0   1   0   1   1   1   0

Task
Subtract the following binary values, showing your workings:
1. 11110001 – 00110011
2. 75 – 37
3. 103 – 42

Representing positive integers in hexadecimal and converting between binary, hexadecimal and denary
Binary can be difficult for people to read, and it’s easy to make mistakes when recording long binary numbers. To get around this problem we often use a base 16 number system, called hexadecimal, to represent numbers in computing.

Hex is used in computing…
• to represent colours using the RGB colour model (#FF00A1)
• in MAC addresses (00:0a:95:9d:68:16)
• to express error codes (0x32DD – see https://msdn.microsoft.com/en-us/library/ms681381(VS.85).aspx)

You’ll need to know how to convert between denary, binary and hexadecimal. Luckily it isn’t very hard. Hexadecimal uses 16 symbols: the digits 0–9 and the letters A–F.

Denary   Binary (4-bit)   Hexadecimal
0        0000             0
1        0001             1
2        0010             2
3        0011             3
4        0100             4
5        0101             5
6        0110             6
7        0111             7
8        1000             8
9        1001             9
10       1010             A
11       1011             B
12       1100             C
13       1101             D
14       1110             E
15       1111             F

Converting Binary/Denary to Hexadecimal
To convert a denary number to hexadecimal, first convert it to 8-bit binary, e.g. 108 = 01101100. Then split this 8-bit binary number into two 4-bit sections:

8   4   2   1       8   4   2   1
0   1   1   0       1   1   0   0

Finally, convert each of these 4-bit segments to hexadecimal using the table above:

4 + 2 = 6 (6)
8 + 4 = 12 (C)

108 in hex is 6C.

Converting Hexadecimal to Denary
To convert a hexadecimal number into denary we use column values just as we did for binary, but this time for a base 16 numbering system.
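The hexadecimal conversions in this section can be cross-checked with a short Python sketch. The names `byte_to_hex` and `hex_to_denary` are illustrative assumptions; the first follows the nibble-splitting method above, and the second sums each digit multiplied by its base 16 column value.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex(n: int) -> str:
    """Convert 0..255 to two hex digits by splitting the 8-bit pattern into nibbles."""
    if not 0 <= n <= 255:
        raise ValueError("this sketch only handles values that fit in one byte")
    bits = format(n, "08b")
    high, low = bits[:4], bits[4:]
    return HEX_DIGITS[int(high, 2)] + HEX_DIGITS[int(low, 2)]


def hex_to_denary(text: str) -> int:
    """Add up digit * 16**position, working from the right-hand column."""
    total = 0
    for position, digit in enumerate(reversed(text.upper())):
        total += HEX_DIGITS.index(digit) * 16 ** position
    return total


print(byte_to_hex(108))       # 6C
print(hex_to_denary("A2C"))   # 2604
```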
The column values are powers of 16:

16^3 = 4096   16^2 = 256   16^1 = 16   16^0 = 1

For example, to convert A2C:

256        16        1
A          2         C
10 × 256   2 × 16    12 × 1

Thus, A2C is 2560 + 32 + 12 = 2604 in denary.

Task
Convert the following numbers into hex:
1. 11110101
2. 01010101
3. 87
4. 131

Convert the following hex values into denary:
1. A3
2. 69
3. 8AF
4. 1C2E

Floating Point Numbers in Binary
Floating point binary is used to hold really big, really small or really accurate numbers using only a small, fixed number of bits (8 bits in the first examples below). In GCSE Mathematics you learned about floating point notation (sometimes called scientific notation). Using floating point, the number 92 can be represented as 0.92 × 100, which is the same as 0.92 × 10^2. It is this last form that we refer to as floating point notation. It is called floating point because the number of digits is fixed, but the decimal point floats around.

Using the example 0.92 × 10^2, 0.92 is the mantissa, 10 is the base and 2 is the exponent. The mantissa is the part of the floating point number that represents the significant digits of that number. The exponent is the power to which the base is raised; it tells us how far to move the point.

Floating point notation allows us to store a much larger range of numbers and store them with much more accuracy. To show your understanding of floating point notation in binary, you will usually be given two binary values (one for the mantissa and one for the exponent) and asked to convert them to denary.

E.g. What is the decimal equivalent of the floating point number 01101 011, where 01101 is the mantissa and 011 is the exponent?
• Step 1: Convert the exponent to denary; in this example 011 = 3.
• Step 2: The mantissa started as 0.1101. Move the binary point 3 places to the right to get 0110.1.
• Step 3: Write 0110.1 as denary: 6.5 (4 + 2 + 0.5).

8   4   2   1   .   1/2   1/4   1/8   1/16
0   1   1   0   .    1     0     0     0

Example 2
What is the decimal equivalent of the floating point number 0100101000 000100?

Exponent = 000100 = 4, so the binary point must move four places to the right. The mantissa started as 0.100101000; after moving 4 places it becomes 01001.01000:

8   4   2   1   .   1/2   1/4   1/8   1/16   1/32
1   0   0   1   .    0     1     0     0      0

Thus 0100101000 000100 = 9.25 in denary.

Converting a Denary Number to Floating Point
Suppose we were given the number +4.25 and needed to convert it to floating point binary, using 6 bits for the mantissa and 3 bits for the exponent. (Six mantissa bits are needed here: one sign bit plus the five bits of 0.10001.)

+4.25 = 0100.01 in pure binary:

8   4   2   1   .   1/2   1/4   1/8   1/16   1/32
0   1   0   0   .    0     1     0     0      0

Move the point 3 places to the left to get 0.10001.

Mantissa = 010001
Exponent = 011 (3)
Result: 010001 011

Task
Convert the following floating point numbers into denary:
1. 0101001000 000100
2. 0101100100 000110
3. 0111000000 111111

Convert the denary numbers into floating point numbers (use a 6 bit mantissa and 3 bit exponent):
1. +2.25
2. +6.75
3. +4.5

Normalisation of Floating Point Numbers
The precision of floating point binary depends on the number of bits used to represent the mantissa. Using more bits for the mantissa allows a number to be represented with greater accuracy; however, for a fixed total number of bits this reduces the number of bits in the exponent and consequently the range of values which can be represented.

A key point to note is that there will always be a trade-off between accuracy and range when storing real numbers using floating point notation: a set number of bits is allocated to storing a real number, and any bit given to the mantissa is a bit taken away from the exponent, and vice versa. In more complex calculations the balance between exponent and mantissa can change (this is second-year university material).
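The mantissa and exponent conversions worked through above can be checked with a small Python sketch. It assumes the convention these notes use: the mantissa is a two’s complement fraction with the binary point after its first bit, and the exponent is a two’s complement integer. The function names are illustrative, and `Fraction` is used only so that the results are exact.

```python
from fractions import Fraction


def twos_complement_value(bits: str) -> int:
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":               # leading 1 means the value is negative
        value -= 1 << len(bits)
    return value


def decode(mantissa: str, exponent: str) -> Fraction:
    """Return mantissa * 2**exponent for the given bit strings."""
    # Dividing by 2**(len - 1) places the binary point after the first bit.
    m = Fraction(twos_complement_value(mantissa), 1 << (len(mantissa) - 1))
    e = twos_complement_value(exponent)
    return m * Fraction(2) ** e


print(float(decode("01101", "011")))           # 6.5
print(float(decode("0100101000", "000100")))   # 9.25
print(float(decode("010001", "011")))          # 4.25
```

Because both parts are read as two’s complement, the same function also handles negative exponents and negative mantissas.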
The specification refers to this trade-off as accuracy and range. For example, 67849 could be 0.67849 × 10^5 with 5 digits for the mantissa, or 0.6785 × 10^5 with 4 digits for the mantissa (which loses some accuracy).

In order to achieve the most accurate representation possible with the number of bits available for a mantissa, the number should be written with no unnecessary leading 0s. Normalisation is the process of removing these leading 0s.

Mantissa   Exponent
0000111    100

Step 1: Convert the exponent to denary: 100 = 4.
Step 2: Move the binary point in the mantissa so that it is directly before the first 1, so 0.000111 becomes 0.111. The binary point has moved 3 places to the right.
Step 3: Because we’ve just moved the binary point, we need to adjust the exponent to take this into account. As we moved to the right, we subtract the number of moves from our exponent: 4 − 3 = 1. So our new exponent is 1.
Step 4: Our new, normalised, floating point number is 0111 001 (0111000 001 when the mantissa is padded back to 7 bits), which has the same value as 0000111 100 (work it out if you don’t believe it!). But now we have much more space in the mantissa, so we could store a more accurate number if we wanted to.

Example 2
A binary number is presented in this format, with a 5 bit mantissa and 3 bit exponent:

00111 011

Q: Is this in normalised form?
Answer: No. Consider the first two bits: a normalised positive mantissa starts 01, but this one starts 00.

Exponent = 011 = 3
Mantissa = 0.0111, which becomes 0.111. The number of moves right is 1.
As we moved right by 1, we subtract this from the exponent, which becomes 010.

00111 011 becomes 01110 010

Example 3
This time we are going to look at a question where the bit lengths of the mantissa and exponent are provided.

E.g. A real binary number may be represented in floating point binary notation using 5 bits for the mantissa and 3 bits for the exponent, both in two’s complement binary. Give the normalised version of the number 00010 011.

The answer would be 01000 001.

Why? The mantissa 0.0010 becomes 0.1000 when the binary point moves 2 places to the right, so the exponent drops from 3 to 3 − 2 = 1, which is 001.

Task
Give the normalised version of all the following numbers using 6 bits for the mantissa and 3 bits for the exponent.
1. 000011 010
2. 001010 100
3. 000110 011

Negative Floating Point Binary Numbers
It is also possible to have a negative exponent, which moves the binary point to the left. This is achieved by storing the exponent as a two’s complement binary number.

Mantissa   Exponent
01110      110

• Step 1: Convert the exponent to denary; in this example 110 is negative, and its two’s complement is 010, so it represents −2.
• Step 2: The mantissa started as 0.1110. Move the binary point left 2 places to get 0.001110.
• Step 3: Write 0.001110 as denary: 0.21875 (0.125 + 0.0625 + 0.03125).

8   4   2   1   .   1/2   1/4   1/8   1/16   1/32   1/64
0   0   0   0   .    0     0     1     1      1      0

The same method can be used to deal with negative numbers, by representing the mantissa using two’s complement.

Mantissa   Exponent
11111      010

• Step 1: Convert the exponent to denary: 010 = 2.
• Step 2: The mantissa starts with a 1, so it is negative. Take its two’s complement to find its size: in this example 11111 becomes 00001, which is 0.0001.
• Step 3: The mantissa is now 0.0001. Move the binary point 2 places to the right to get 0.01.
• Step 4: Write 0.01 as denary: −0.25. Don’t forget it’s negative!

8   4   2   1   .   1/2   1/4
            0   .    0     1

Example 2

Mantissa   Exponent
10101      011

Exponent = 011 = 3
Mantissa = 10101, which is negative; its two’s complement is 01011, i.e. 0.1011
Move the binary point 3 places right: 0101.1

8   4   2   1   .   1/2
    1   0   1   .    1

101.1 is 5.5, so the number represented is −5.5.

Adding Floating Point Numbers
You will be pleased to know that addition and subtraction of floating point numbers are very similar operations. This example will walk you through adding the two numbers below:

Mantissa (10 bit)   Exponent (6 bit)
0100100000          000100
0110100000          000011

Step 1: In order to add two floating point numbers together, their exponents must be the same. We achieve this by changing the exponent of our second number from 000011 (3) to 000100 (4). Because we’ve changed the exponent of the second number, we need to adjust its mantissa to take account of this, so the mantissa becomes 0011010000 (as 0011010000 000100 has the same value as 0110100000 000011).
Step 2: Now we can add together the two mantissas using our usual method:

   0 1 0 0 1 0 0 0 0 0
+  0 0 1 1 0 1 0 0 0 0
=  0 1 1 1 1 1 0 0 0 0

Step 3: So the result is 0111110000 000100.

Let’s check it back. Our original data looked like this:

Mantissa (10 bit)   Exponent (6 bit)
0100100000          000100
0110100000          000011

0100100000 000100 = 01001.00000 = 9
0110100000 000011 = 0110.100000 = 6.5
9 + 6.5 = 15.5
Answer: 0111110000 000100 = 01111.10000 = 15.5

Subtracting Floating Point Numbers
Subtraction is the same as addition, except that you convert the number to be subtracted to a negative number using two’s complement.

Mantissa (10 bit)   Exponent (6 bit)
0110000000          000011
0100100000          000010

Step 1: Just like addition, the exponents of the two numbers must be the same. We achieve this by changing the exponent of our second number from 000010 (2) to 000011 (3). Because we’ve changed the exponent of the second number, we need to adjust its mantissa to take account of this, so the mantissa becomes 0010010000.

Step 2: Now convert the mantissa of the second number to a negative one, using two’s complement: 0010010000 becomes 1101110000.

Step 3: Now we can add together the two mantissas. A 1 is carried into the left-most column, and the final carry out of that column is discarded:

Carry   1
        0 1 1 0 0 0 0 0 0 0
+       1 1 0 1 1 1 0 0 0 0
=       0 0 1 1 1 1 0 0 0 0

Step 4: The result is 0011110000 000011 or 3.75 (remember to ignore any carry beyond the original bit length when using two’s complement).

Let’s check it back. Our original data looked like this:

Mantissa (10 bit)   Exponent (6 bit)
0110000000          000011
0100100000          000010

0110000000 000011 = 0110.000000 = 6
0100100000 000010 = 010.0100000 = 2.25
6 – 2.25 = 3.75
0011110000 000011 = 0011.110000 = 3.75

Character Sets
As well as representing numbers, binary can also be used to represent characters. For example, 01100001 might represent the character ‘a’.
The different representations of letters and symbols in binary are called character sets. The two most commonly used character sets are called ASCII and UNICODE.

ASCII encodes 128 different letters and symbols using 7-bit binary codes, which is perfect for writing in English. You won’t need to know the ASCII value of every character, but it’s worth taking the time to learn where lower case letters (97) and upper case letters (65) start.

http://www.ascii-code.com/

Languages such as Japanese, Arabic and Hebrew use characters that ASCII cannot represent. To get around this problem, UNICODE was invented. UNICODE uses up to four bytes to represent each character (depending on the encoding used). This means that it can represent more than 110 000 different characters, more than enough to cope with most of the world’s languages.

http://unicode-table.com/en/
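The character codes mentioned above can be explored with Python’s built-in `ord` and `chr` functions. This is only a sketch: the helper name `utf8_length` is made up for the example, and UTF-8 is assumed as the Unicode encoding (the text above does not name one). The non-English characters are written as escape sequences so the code itself stays plain ASCII.

```python
def utf8_length(character: str) -> int:
    """Return how many bytes one character needs when encoded as UTF-8."""
    return len(character.encode("utf-8"))


# ASCII codes for the first lower case and upper case letters.
for ch in ["a", "A"]:
    print(ch, ord(ch), format(ord(ch), "08b"))

print(chr(97))                      # a

# Characters outside ASCII take more than one byte in UTF-8.
print(utf8_length("a"))             # 1 byte  (Latin letter a)
print(utf8_length("\u00e9"))        # 2 bytes (e with acute accent)
print(utf8_length("\u65e5"))        # 3 bytes (a Japanese kanji)
print(utf8_length("\U0001F600"))    # 4 bytes (an emoji)
```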