Higher Computing Systems Unit
Higher Systems: Data Representation

Topic 1: Data Representation

- Representation of positive numbers in binary including place values and range up to and including 32 bits
- Conversion from binary to decimal and vice versa
- Description of the representation of negative numbers using two’s complement using examples of up to 8 bit numbers
- Description of the relationship between the number of bits assigned to the mantissa/exponent and the range and precision of floating point numbers
- Conversion to and from bit, byte, Kilobyte, Megabyte, Gigabyte, Terabyte (Kb, Mb, Gb, Tb)
- Description of Unicode and its advantages over ASCII
- Description of the bit map method of graphic representation using examples of colour/greyscale bit maps
- Description of the relationship of bit depth to the number of colours using examples up to and including 24 bit depth (true colour)
- Description of the vector graphics method of graphic representation
- Description of the relative advantages and disadvantages of bit mapped and vector graphics
- Description of the relationship between the bit depth and file size
- Explanation of the need for data compression using the storage of bit-map graphic files as examples

Introduction

Computers are called two-state devices because all data is stored using two values. All the logic circuits used in digital computers are based upon two-state logic. That is, quantities can only take one of two values, typically 0 or 1. These quantities are represented internally by voltages on lines, zero voltage representing 0 and the operating voltage of the device representing 1. Two-state logic is used because such devices are easy and economical to produce.

Measures:
- A bit is a binary digit: a 0 or a 1.
- 8 bits make a byte.
- 1024 bytes in 1 Kilobyte (1024 = 2^10)
- 1024 Kbytes in 1 Megabyte (2^20 bytes)
- 1024 Mbytes in 1 Gigabyte (2^30 bytes)
- 1024 Gbytes in 1 Terabyte.
To go from bits to bytes, divide by 8. To go from Kbytes to Mbytes, divide by 1024, etc.

Numbers

We use the base 10 number system to represent whole numbers, integers and fractional numbers. This number system uses the 10 digits 0 to 9 to represent numbers. The value of a decimal digit is given by its position within the number.

Example: 34 043 is

10 000   1 000   100   10   1
     3       4     0    4   3

3 x 10 000 + 4 x 1 000 + 0 x 100 + 4 x 10 + 3 x 1

The binary number system

When numbers are represented electronically, the most convenient base is 2, where each column, reading from the right, is a power of two. The base 2 number system uses 2 symbols, 0 and 1, to represent a value.

Example 1: 10110 is

16   8   4   2   1
 1   0   1   1   0

1 x 16 + 1 x 4 + 1 x 2 = 22

Example 2: 1101100110011010 is

32768 16384 8192 4096 2048 1024 512 256 128 64 32 16 8 4 2 1
    1     1    0    1    1    0   0   1   1  0  0  1 1 0 1 0

= 32768 + 16384 + 4096 + 2048 + 256 + 128 + 16 + 8 + 2 = 55 706

So, to convert a binary number to our numbers (denary, or base 10), you put the headings, starting from the right (... 32 16 8 4 2 1), above the number and add up the headings where there is a 1.

The binary number system has the huge advantage that only two symbols are required, 0 and 1. These can easily be represented in a computer system by a switch or transistor being on or off, or by a high or low voltage level. Imagine how difficult it would be to represent 10 discrete logic values for the base 10 number system. You can also easily store data in binary, e.g. magnetic discs using N/S magnetism or CDs using pits and lands to reflect light.

Binary representation also reduces the number of arithmetic rules that need to be applied in calculations. You need 100 rules for adding our numbers (every pair of digits from 0+0 to 9+9), but you just need to know 0+0, 0+1, 1+0 and 1+1 (=10) to add binary numbers.
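The "put the headings above the number and add up the headings where there is a 1" method can be sketched in a few lines of Python. This is only an illustration: the function name binary_to_decimal is made up for this example and is not part of the course material.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the place-value headings (1, 2, 4, 8, ...) wherever there is a 1."""
    total = 0
    heading = 1  # the rightmost column is worth 1
    for digit in reversed(bits):
        if digit == "1":
            total += heading
        elif digit != "0":
            raise ValueError(f"not a binary digit: {digit!r}")
        heading *= 2  # each column to the left is worth twice as much
    return total


# The two worked examples from the text
print(binary_to_decimal("10110"))
print(binary_to_decimal("1101100110011010"))
```

Python's built-in int("10110", 2) does the same job; the loop is written out here only to mirror the headings method.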
So the advantages of binary are:
- Simple arithmetic
- Simple electronic circuits
- A wide range of storage devices can use 2 values.

Another advantage is resistance to ‘signal degradation’. If you used 0V for 0, 1V for 1 and so on up to 9V for 9, voltages are never perfectly stable, and if that 9V drops to 8.5V, is it an 8 or a 9? With binary you can have a large difference between the two values (e.g. 0V for 0 and 8V for 1).

Converting our numbers to binary:

There are two ways to convert decimal numbers into binary.

Method 1. To convert 29 into binary: write down the binary headings (don’t go past 29), then work out which headings add up to 29:

16   8   4   2   1
 1   1   1   0   1

To get 29 we need a 16, an 8, a 4 and a 1, so 29 = 11101

Method 2. This works on any number and is useful for very large numbers. Here you repeatedly divide by 2, writing down the remainder each time until there is nothing left. The binary number is formed by reading the remainders from the bottom up.

29 / 2 = 14 R 1
14 / 2 =  7 R 0
 7 / 2 =  3 R 1
 3 / 2 =  1 R 1
 1 / 2 =  0 R 1

Reading the remainders upwards: 29 = 11101

Something to know about storing numbers on a computer is that a fixed number of bits is always used. Let’s say a computer uses 32 bits to store numbers, then

          3 would be stored as: 00000000000000000000000000000011
324 564 046 would be stored as: 00010011010110000111010001001110

The programming would be too difficult if variable length numbers were used: the computer wouldn’t know when a number ended! This means that there is a fixed range of values that can be used, determined by how many bits you use.

The range of positive integers.

With 1 bit you can get 2 possible values: 0 or 1.
With 2 bits you can get 4 values: 00, 01, 10 and 11
With 3 bits you get 8 values: 000, 001, 010, 011, 100, 101, 110 and 111

1 bit     2^1 = 2 values      range 0 to 1
2 bits    2^2 = 4 values      range 0 to 3
3 bits    2^3 = 8 values      range 0 to 7
...
8 bits    2^8 = 256 values    range 0 to 255
...
n bits    2^n values          range 0 to 2^n - 1

The largest positive number you can code is always one less than the number of values because we start at 0. For n bits the range is 0 to 2^n - 1.

If we lived in a binary world we would probably have a 16 hour clock (and 32 hour day), but the clock numbers would range from 0000 to 1111 (0 to 15)!

You can add, subtract, multiply and divide in binary exactly the same as with our numbers, although you would find division difficult to get the hang of. The only thing you need to remember is 1 + 1 = 10 (and 1 + 1 + 1 = 11). Also 10 - 1 = 1. Simple really!

So 1010 + 1011 is:

   1010
 + 1011
 ------
  10101

EXERCISE 1:

1. Convert these binary numbers to decimal:
   a) 10101  b) 11001  c) 11100010  d) 10101010  e) 11110000
2. Convert these decimal numbers to binary:
   a) 27  b) 37  c) 56 (use 8 bits)  d) 97 (use 8 bits)  e) 765 (use the divide by 2 method)
3. Why are computers called two state devices?
4. Give two reasons why computers use binary.
5. What range of positive numbers can be stored using:
   a) 1 byte  b) 16 bits
   Use powers of 2 for these answers:  c) 20 bits  d) 32 bits
6. Add these binary numbers:
   a) 1010 + 11
   b) 110111 + 101101
   c) 10101100 + 11010101
7. If you use a fixed number of bits to store numbers, what will happen if there is a carry at the end?
8. What number has been tattooed on this leg [picture not reproduced]:
   a) If you go ankle to knee?
   b) If you go knee to ankle?

Negative numbers

An obvious way of getting computers to store negative numbers would be to make the first bit a 1 for negative and 0 for positive. This is called sign and magnitude, but unfortunately it does not work well because ordinary binary addition gives the wrong answer and there is both a +ve 0 and a -ve 0.
So two’s complement is used because there is only one 0 and arithmetic works correctly.

At its simplest, the ALU in a processor only needs to carry out two operations:
- Binary addition
- Inverting (or flipping) bits, changing 1 to 0 and vice versa.

Two’s complement allows the ALU to store negatives and to subtract using these two operations. Subtraction is just adding the negative (i.e. 7 - 2 = 7 + (-2)).

To store a negative number in two’s complement:
Step 1: Write down the positive value in binary
Step 2: Flip the bits (1 becomes 0 and 0 becomes 1)
Step 3: Add 1

For two’s complement to work correctly you must use a fixed number of bits for each number, so add 0s to the front to get the required amount.

Example 1: What is -17 in two’s complement using 8 bits?
Step 1:  17    = 00010001
Step 2:  flip:   11101110
Step 3:  add 1:  11101111
So 11101111 is -17 in 8-bit two’s complement.

Example 2: What is -88 in 8-bit two’s complement?
Step 1:  88    = 01011000
Step 2:  flip:   10100111
Step 3:  add 1:  10101000
So 10101000 is -88 in two’s complement.

You must remember that positive numbers are still stored in ordinary binary, so you only need to do step 1 for +ve numbers.

Using two’s complement, -ve numbers will always start with a 1 and +ve numbers with a 0. Also, two’s complement is its own inverse, so to convert back you flip and add 1 again.

So going from binary to our numbers:

Example 3: What is this two’s complement number? 10110001
We can see it is negative because it starts with 1. So
Flip:   01001110
Add 1:  01001111
Put the usual binary headings above this number and you get: 64 + 8 + 4 + 2 + 1 = 79
So the original number was -79.

Example 4: What is the value of the two’s complement number 11001111?
Flip:   00110000
Add 1:  00110001
Work it out as 49, so the answer is -49.

N.B. If the number starts with a 0 it is positive; just put the headings above it to find out what it is.

EXERCISE 2:

1. Write these numbers in 8-bit two’s complement:
   a) -34  b) -19  c) -97  d) -64  e) 28
2.
These binary numbers are stored in 8-bit two’s complement; work out their value in denary.
   a) 11000000  b) 10111111  c) 10011000  d) 01010001
3. Why do computers use two’s complement?
4. Work out these subtractions by i) subtracting them (remember 10 - 1 = 1) and ii) adding the two’s complement.
   a) 01001010 - 00000110
   b) 01100111 - 00111111
   (Throw away the carry at the end when you add the two’s complement.)
5. What range of values can be stored using 8 bits in two’s complement?

Floating point

That covers how to store integers. What about numbers outside the range of integers? And what about decimals (fractions)? For these numbers, computers use Floating Point. This is the same as Standard Form except that the computer does not store the point or the base, and it counts the places from the BEGINNING OF THE NUMBER.

93 000 000 in Standard Form:   9.3 x 10^7
93 000 000 in Floating Point:  93  8   (that is, .93 x 10^8)

The 93 is called the MANTISSA. The 8 is called the EXPONENT.

Again, computers use a fixed number of bits to store floating point numbers. For instance, with 32 bits they might use 24 bits for the mantissa and 8 bits for the exponent. Or they could use 20 bits for the mantissa and 12 bits for the exponent. This has an effect on the ACCURACY with which numbers are stored and the RANGE of values that can be stored.

With our numbers, imagine we have a calculator that can only store 3 digits for the mantissa and 1 digit for the exponent (E):

                               Mantissa   E
93 000 000 would be stored:    9 3 0      8
12 875 000 would be stored:    1 2 9      8

Note the loss of accuracy: 12 875 000 has become 12 900 000. The length of the mantissa determines how accurately (or precisely) floating point numbers can be stored.

2 340 000 000 cannot be stored at all. The point moves 10 places, and an exponent of 10 does not fit in one digit. So the exponent determines the range of numbers that can be stored in floating point.

The more bits for the mantissa, the higher the accuracy.
The more bits for the exponent, the bigger the range.
And vice versa.
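The 3-digit-mantissa, 1-digit-exponent calculator described above can be imitated in Python to see the accuracy and range limits in action. This is a rough sketch in decimal, not how real binary floating point hardware works; the function names store and restore and the choice to round to the nearest value are assumptions made for this example.

```python
def store(value, mantissa_digits=3, exponent_digits=1):
    """Return (mantissa, exponent), read as 0.mantissa x 10^exponent."""
    if value <= 0:
        raise ValueError("this sketch only handles positive whole numbers")
    exponent = len(str(value))  # places counted from the beginning of the number
    if exponent >= mantissa_digits:
        scale = 10 ** (exponent - mantissa_digits)
        mantissa = (value + scale // 2) // scale  # keep leading digits, round to nearest
    else:
        mantissa = value * 10 ** (mantissa_digits - exponent)  # pad with zeros
    if mantissa == 10 ** mantissa_digits:  # rounding spilled over, e.g. 9996 -> 1000
        mantissa //= 10
        exponent += 1
    if exponent > 10 ** exponent_digits - 1:
        raise OverflowError(f"exponent {exponent} needs more than {exponent_digits} digit(s)")
    return mantissa, exponent


def restore(mantissa, exponent, mantissa_digits=3):
    """Rebuild the (possibly less accurate) number that was actually stored."""
    shift = exponent - mantissa_digits
    return mantissa * 10 ** shift if shift >= 0 else mantissa // 10 ** (-shift)


for number in (93_000_000, 12_875_000, 2_340_000_000):
    try:
        mantissa, exponent = store(number)
        print(number, "->", mantissa, exponent, "-> reads back as", restore(mantissa, exponent))
    except OverflowError as error:
        print(number, "-> cannot be stored:", error)
```

Calling store(12_875_000, mantissa_digits=5) keeps every digit, which shows the mantissa controlling accuracy, while raising exponent_digits to 2 lets 2 340 000 000 fit, which shows the exponent controlling range.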
TEXT

ASCII

American Standard Code for Information Interchange (ASCII) was first developed for teletypewriters and is now an internationally agreed standard for storing information. ASCII uses 7 bits per character, giving a possible 128 different characters. It has 95 printable characters (including the space), enough to represent every letter, digit and punctuation mark used in English, which forms its character set. Each character has its own unique code. The remaining 33 codes are known as control characters. They make something happen, like new line, clear screen etc.

Now computers use bytes (groups of 8 bits), so the extra bit can be used either for error checking or for extending the character set to include French, German etc. characters like ê, å, ñ.

The problem with ASCII is that it was designed for our Latin alphabet. As we have seen, it can be extended to cover Western European character sets, but what about Urdu, Arabic, Chinese and so on? So ASCII is being superseded by Unicode.

For this course Unicode is treated as a 16 bit code giving 65 536 characters, which was intended to be enough to include all the world’s alphabets. (Modern Unicode has since grown beyond 16 bits.) ASCII forms the first 128 characters and a Western European extended set forms the next 128 codes. So Unicode has the advantage of including existing ASCII but extending to all character sets, so you can code every alphabet in the world (including ancient unused ones). The disadvantage is that, at 16 bits per character, text takes up twice as much storage or twice as much bandwidth as 8 bit ASCII.

EXERCISE 3

1. Describe how very large or small numbers are stored.
2. What is the effect of increasing the number of bits allocated to the mantissa?
3. When running a program you can get an ‘overflow’ error. What do you think this error means in relation to floating point storage?
4. Give one advantage and one disadvantage of using Unicode rather than ASCII.
5. How many different characters (or ‘glyphs’) can be stored in Unicode?
6.
How many bytes of storage would be needed for the sentence inside the rectangle below if it was stored in Unicode?

   This is quite a short sentence, only 9 words.!!

GRAPHICS

There are two methods of storing graphics: bitmap and vector.

BITMAP

In Maths a mapping is a correspondence between two sets, like whole numbers and their squares:

1   2   3   4
1   4   9   16

A bitmap is the correspondence between the pixels and the bits in memory. In black and white (monochrome) graphics there is a simple 1 to 1 correspondence between each pixel and each bit. Every bit maps to a pixel.

For colour we can use more than 1 bit per pixel (called the colour depth or bit depth). There will still be a mapping, but now 1 pixel will map to more than 1 bit. If we have 4 colours you will need 2 bits per pixel, possibly with one of the codes 00, 01, 10 and 11 for each colour. Now the bit map for a small picture [picture and colour key not reproduced] will be this:

01001011
00010000
11110010
10000100

If you use n bits you will get 2^n different colours or shades; n is called the colour depth or bit depth. So:

No. of Bits    No. of Colours
1              2
2              4
4              16
8              256
16             65 536
24             16 777 216
n              2^n

24 bit colour depth is also called true colour, as that is roughly the maximum number of shades the human eye can distinguish.

The actual bitmaps are stored in the VRAM (Video RAM) on the graphics card. The amount of VRAM on the card determines the maximum colour depth and resolution (number of pixels) your screen can display.

You can easily calculate the storage requirements for a screen using the formula:

No. of BITS = Pixels across x Pixels down x bit depth

Example 1: A 1280 by 800 screen using 8 bit colour depth needs:
1280 x 800 x 8 BITS = 8 192 000 bits
/8    = 1 024 000 bytes
/1024 = 1000 Kbytes

Example 2: Calculate the storage required for a screen using 1620 by 1280 pixels with 65 536 colours.
Bits = 1620 x 1280 x 16 (because 2^16 = 65 536)
      = 33 177 600 bits
/8    = 4 147 200 bytes
/1024 = 4050 Kbytes
/1024 = 3.96 Mbytes

Note how the number of bits per pixel increases the file size. 24 bit colour produces a file 3 times larger than 8 bit colour.

VECTOR GRAPHICS

The second way of storing graphics is vector or object orientated graphics. Here graphics are stored by their objects and their attributes.

In a graphics package you have a set of drawing tools. You can choose rectangle, circle, polygon etc. You can also choose colours for the lines, fill patterns, line thickness and so on. The objects are rectangle, circle etc. The attributes are line colour, line thickness, fill patterns etc.

[Screenshot from Fireworks not reproduced: a circle drawn by choosing the object tool, with a panel where its attributes (or properties) can be changed.]

The drawing tools in Word use vector graphics; Paint uses bitmap.

Vector (or object orientated) graphics are stored as objects and their attributes. This is just a list of numbers such as 3, 100, 200, 250, 3, 4, 2, 1 where:

3          could be a circle
100, 200   its centre
250        the radius
3          the line thickness
4          the line colour
2          the fill pattern
1          the layer

CAD uses vector graphics.

You should get a feel yourself for the differences between vector and bitmap by trying Appleworks, where Draw is vector and Paint is bitmap, or the Paint program on a PC compared to Word graphics. You can also examine the size of any file you save by right clicking and choosing properties. Uncompressed bitmaps are usually big files.

Differences between bitmap and vector:

First of all you need to understand the effect of enlarging a graphic. Scaling a bit map in a painting package is done by applying a scale factor to each pixel. For an enlargement, there is the same number of pixels as in the original; each pixel just gets bigger. A screen’s resolution might be 100 dpi. If you print that on a printer with 600 dpi then each pixel is scaled by a factor of 6, i.e. it gets larger and appears ‘blocky’.
With vector, the computer is using a sort of formula to draw each object, so it can use all 600 dots per inch on the printer. In fact the graphic actually gets finer and improves. Bitmap is called resolution dependent and vector is resolution independent.

[Printouts comparing the two are not reproduced here; both graphics looked the same on screen.]

Advantages of bit mapped graphic representation

A bit mapped image can be manipulated at the pixel level. Thus a designer may apply particular colour values to a selected pixel area to produce shading or texture effects. You can use spray cans and you can use an eraser. It is possible to create a wider range of irregular shapes and patterns by simply deleting or adding pixels anywhere on the image.

Disadvantages of bit mapped representation

- Requires large amounts of storage space.
- The image becomes ‘blocky’ when scaled.
- Does not take advantage of resolutions that are higher than the resolution of the image.

Advantages of vector graphic representation

- Requires less storage space than a bit mapped image.
- Images can be edited at the "object" level, allowing the user to reposition, scale and delete entire objects, or groups of objects, with ease.
- Objects can be grouped to form larger objects that can then be manipulated as a single image.
- Objects can be layered.
- Images are resolution independent, meaning that they can use the full quality of the display or print device.

Summary of the comparison:

BITMAP
FOR:
- Edit individual pixels.
- Can create irregular shapes and lines.
- Use brush and spray can for paint effects.
AGAINST:
- Large amount of storage required.
- Resolution dependent, loses quality on a printer.

VECTOR
FOR:
- Less storage required.
- Edit individual objects, e.g. move, scale, change colour, layer.
- Resolution independent, can use the full quality of a printer.
AGAINST:
- Cannot edit pixels.
- Do not have eraser or spray can.
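To put numbers on the "large amount of storage" point, the bitmap formula given earlier (pixels across x pixels down x bit depth) can be sketched in Python. The function names here are invented for illustration, and the sketch assumes an uncompressed bitmap with no file header.

```python
def bit_depth_for(colours: int) -> int:
    """Smallest number of bits per pixel giving at least this many colours."""
    return max(1, (colours - 1).bit_length())


def bitmap_bits(pixels_across: int, pixels_down: int, bit_depth: int) -> int:
    """No. of bits = pixels across x pixels down x bit depth."""
    return pixels_across * pixels_down * bit_depth


def show_sizes(bits: int) -> None:
    """Print the size in bits, bytes, Kbytes and Mbytes (dividing by 8, then 1024s)."""
    size_in_bytes = bits / 8
    print(f"{bits} bits = {size_in_bytes:.0f} bytes"
          f" = {size_in_bytes / 1024:.2f} Kbytes"
          f" = {size_in_bytes / 1024 / 1024:.2f} Mbytes")


# A 1280 by 800 screen at 8 bit depth, then the same screen in 24 bit true colour
show_sizes(bitmap_bits(1280, 800, 8))
show_sizes(bitmap_bits(1280, 800, 24))
# A 1620 by 1280 screen with 65 536 colours
show_sizes(bitmap_bits(1620, 1280, bit_depth_for(65_536)))
```

Changing only the bit depth argument shows directly how file size grows in step with the number of bits per pixel.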
Data compression

Because bitmaps take up so much storage space you often need to compress files for saving on digital camera storage cards or displaying on the Web. Specialised compression techniques can result in images that are very hard to tell apart from the original as far as the human eye is concerned, but with much smaller file sizes. Two standards that produce compressed image files are GIF (Graphics Interchange Format), better suited to drawings and cartoons that have only a few colours in them, and JPEG (Joint Photographic Experts Group), which can compress as much as 10 times more than GIF and is also more suitable for photo-realistic images.

EXERCISE 4

1. What does a ‘bitmap’ mean?
2. How many colours can be displayed using a colour depth of 8 bits?
3. Calculate the storage required for a black and white (monochrome) screen using 800 by 600 pixels.
4. Calculate the storage required for a screen using 2048 by 1600 pixels with true colour.
5. How much storage would be required for a scan of a 6” by 4” photograph at 400 dpi using 65 536 colours?
6. What does the term resolution independent mean?
7. Name two operations you can carry out on a vector graphic that cannot be done on a bitmap.
8. If a systems analyst has to work out how much backing storage to have for a system which uses a great number of graphics, why might they prefer to know that only bitmaps are used on the system?
9. Give one advantage and one disadvantage of bitmap graphics.
10. Give one advantage and one disadvantage of vector graphics.
11. An animated gif has 12 frames, each 40 by 40 pixels, using 256 shades of grey. Calculate the total storage for the gif animation.
Summary

You should know / understand:
- Data representation as positive binary numbers and range of values (using up to 32 bits) and negative binary numbers (using two’s complement with up to 8 bits);
- Conversion between binary and decimal numbers;
- Representation of numbers using the floating point technique, and accuracy/range depending on the size of the mantissa and exponent;
- Conversion to and from bit, byte, Kbyte, Mbyte, Gbyte, Tbyte;
- Unicode and the limitations of ASCII when representing character sets;
- Bit map methods of graphic representation including greyscale and colour;
- Relationship of bit depth and colour (up to 24 bits);
- Vector graphic methods of graphic representation (using objects);
- Relative advantages and disadvantages of bit mapped and vector graphics;
- Relationship between the bit depth and file size;
- The need for data compression.

Higher Systems: Computer Structure

Topic 2: Computer Structure

- Detailed description of the purpose of the ALU and control unit
- Description of the purposes of registers: to hold data being processed, instructions being executed, and addresses to be accessed
- Description of the function of the data bus and the address bus
- Description of the read, write and timing functions of the control lines
- Identification of other control lines, including reset and interrupt lines
- Simple description, referring to the appropriate buses and control lines, of the steps in the fetch-execute cycle
- Description of the following elements of computer memory: registers, cache, main memory, backing storage
- Distinction between the above elements of memory according to function and speed of access
- The concept of addressability
- Description and evaluation of the following measures of performance: clock speed, MIPS, FLOPS, and application based tests
- Description of the effect the following factors have on system performance: data bus width, use of cache memory, rate of data transfer to and from peripherals
- Description of current trends in
computer hardware, including increasing clock speeds, increasing memory and backing storage capacity

The main aim is to understand how binary programs (machine code) are stored and executed. You should also understand what factors make one computer faster or more powerful than another.

In the 1940s the Hungarian-born mathematician John von Neumann described how a computer should be designed. This revolved around the idea of the ‘stored program concept’, where a list of instructions is loaded into memory and a processor fetches an instruction, decodes it and executes it, then fetches the next instruction and so on. This ‘von Neumann architecture’ is still the basis of the way computers operate today.

Computer Organisation

A computer system is made up of Input / Output devices, backing storage, a processor and memory.

[Diagram: Input Devices, Processor (CPU), Output Devices, Memory and Backing Storage.]

The input / output / storage devices are peripherals that plug into the system unit. We are going to take a closer look at the processor and memory.

First of all, there are three main parts to the processor:
- The ALU (Arithmetic & Logic Unit), which performs all the calculations and logical comparisons. A logical comparison is deciding whether one thing equals another or if it is bigger or smaller (e.g. IF answer$ = “PARIS” THEN ...).
- The Control Unit, which manages the fetching, decoding and executing of instructions.
- Registers, which are temporary storage locations inside the processor.

The CPU communicates with the Memory by BUSES. A bus is a set of lines working in parallel to carry bits back and forward. Here is an 8 bit bus carrying a byte from the CPU to RAM:

PROCESSOR  |  0 1 0 0 1 1 1 0  |  MEMORY

The byte will have come from a register in the processor and is going to a storage location in RAM.
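The stored program concept described above, where instructions sit in memory and the processor repeatedly fetches, decodes and executes them, can be sketched as a toy machine in Python. Everything about this machine is hypothetical: the four instructions (LOAD, ADD, STORE, HALT), the single accumulator register and the memory layout are invented for illustration and do not correspond to any real processor.

```python
def run(memory):
    """Fetch, decode and execute instructions held in memory until HALT."""
    program_counter = 0  # register holding the address of the next instruction
    accumulator = 0      # register holding the data being processed
    while True:
        instruction = memory[program_counter]  # fetch
        program_counter += 1
        opcode, address = instruction          # decode
        if opcode == "LOAD":                   # execute
            accumulator = memory[address]
        elif opcode == "ADD":
            accumulator += memory[address]
        elif opcode == "STORE":
            memory[address] = accumulator
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction {opcode!r}")


# Program and data share the same memory: locations 0-3 hold instructions,
# locations 4-6 hold data.
memory = [
    ("LOAD", 4),   # 0: copy the contents of location 4 into the accumulator
    ("ADD", 5),    # 1: add the contents of location 5
    ("STORE", 6),  # 2: write the result to location 6
    ("HALT", 0),   # 3: stop
    7,             # 4: data
    2,             # 5: data
    0,             # 6: the result goes here
]
print(run(memory)[6])
```

The point of the sketch is that the program is just more data in memory, so loading a different list of instructions changes what the machine does without changing the machine.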
There are three buses:

- The DATA BUS: this carries actual data back and forward between memory and the processor. It links to the Memory Data Register (MDR) inside the CPU.
- The ADDRESS BUS: this carries the address of the actual storage location that data is being read from or written to. It links to the Memory Address Register (MAR) inside the CPU.
- The CONTROL BUS: this is not an actual bus but a set of lines that each carry out a different function. One of these is the READ line, which tells memory that the data has to come out of that storage location. Another is the WRITE line, which tells memory the data is to go into that storage location. There are others that we’ll deal with later.

So this is how the CPU communicates with memory:

A Memory READ:
1. The address is put on the address bus
2. The read line is set on the control bus
3. The contents of that location are put on the data bus and so transferred to the CPU.

A Memory WRITE:
1. The data is put on the data bus
2. The address is put on the address bus
3. The write line is set on the control bus
4. The contents of the data bus go into that storage location.

Each step in each of these operations has to be synchronised. There is another line on the control bus called the clock. The clock pulses billions of times a second (e.g. a 3 GHz Pentium 4 microprocessor has a clock speed of 3 billion pulses per second). Each time a clock pulse is received on the clock line, the processor goes on to the next step.

Registers inside the processor have an important role in all of this. For instance, for a memory write:
- The data is put in the MDR, which connects to the data bus.
- The address is put in the MAR, which connects to the address bus.
Likewise for a memory read, except that the data coming in goes from the data bus to the MDR.

THE FETCH-EXECUTE CYCLE
[Diagram: Fetch, Decode, Execute, repeating in a loop.]

Going back to the von Neumann stored program concept, all a processor is doing over and over again is fetching an instruction, decoding it and executing it. This is called the fetch-execute cycle.

The cycle involves doing a memory read (to get the instruction), decoding the instruction and then carrying it out (executing). The actual execution of the instruction could be just one step or many steps depending on what the instruction is.

So the fetch-execute cycle involves:
1. Put the address on the address bus
2. Set the READ line
3. The instruction comes in on the data bus
4. The instruction goes to the Instruction Register where it is decoded
5. The instruction is executed

All of this is synchronised by the clock line on the control bus.

Summary of what you need to know so far:

You should know that the purpose of the ALU is to perform calculations and make logical comparisons.

The control unit manages the fetch-execute cycle by using clock pulses to synchronise each step, by decoding instructions and by managing their execution.

Registers are storage locations inside the processor. They hold data that is being processed, the address that is to be accessed and the instruction that is to be decoded.

There are three buses connecting the processor with memory. The data bus carries the actual data back and forward, and the address bus carries the address of the storage location that is to be accessed. The control bus is not really a bus as its lines do different things. Examples of lines on a control bus are:
- READ: tells the storage location being accessed to put its contents on the data bus.
- WRITE: tells the location being accessed to take in the contents of the data bus.
- CLOCK: synchronises all operations by regular pulses.
- RESET: sets the processor back to its start up state.
- INTERRUPT: carries a signal to tell the processor to stop what it is doing and deal with a peripheral.
- NON-MASKABLE interrupt.
The interrupt line can be masked by programmers so that the CPU does not ‘see it’. However some things are too important to be masked, e.g. a power failure, when the computer has a few seconds to save anything before RAM discharges.

The fetch-execute cycle consists of:
1. Putting the address of the next instruction on the address bus
2. Setting the read line
3. The instruction comes into the processor
4. The instruction is decoded
5. The instruction is executed

EXERCISE 5

1. What is meant by the stored program concept?
2. Name the three main parts of the CPU.
3. Give two examples of the purposes of registers.
4. Name the three buses that connect the processor to memory.
5. Why is the control bus not really a bus?
6. Give two different examples of the lines on the control bus.
7. What does it mean to say an interrupt can be masked?
8. Give an example of a non-maskable interrupt.
9. What is the purpose of the address bus?
10. Explain why the data bus is two way and the address bus is one way.
11. List the steps involved in performing a memory write.
12. By referring to the buses and control lines involved, state the steps in the fetch & execute cycle.

MEMORY

We are now going to take a closer look at memory and all its different types. You will certainly know RAM and ROM. These form the computer’s MAIN MEMORY.

Read Only Memory is permanent; it is ‘fixed’ in the factory by blowing fuses inside the memory module. If we could look inside we would see circuits like this:

[Diagram of fuse circuits holding the bits 1 0 0 1 1 1 0 0.]

So ROM doesn’t need electricity to hold its data and cannot be altered.

Random Access Memory is volatile: it needs power to hold its data. There are two main ways of doing this:

Dynamic RAM (DRAM) uses capacitors. This is a cheap form of memory and the chips can be organised in large modules (SIMMs or DIMMs), in groups of 256 Mbytes for instance. However, the charge is minuscule and capacitors ‘leak’, so they have to be constantly refreshed. This slows down access to DRAM.
SIMM is an acronym for single in-line memory module, a small circuit board that can hold a group of memory chips. Typically, SIMMs hold nine RAM chips. On PCs, the ninth chip is often used for parity error checking. The bus from a SIMM to the actual memory chips is 32 bits wide. A newer technology, called dual in-line memory module (DIMM), provides a 64-bit bus. For Pentium microprocessors that have a 64-bit bus, you must use either DIMMs or pairs of SIMMs.

Static RAM (SRAM) uses transistors. This is more expensive and you cannot organise it in large quantities; however, it is perfectly stable and has very fast access.

RAM in a computer uses DRAM: cheap, large scale modules.

Because DRAM is slow to access, computer manufacturers try to speed things up by adding another kind of memory between the CPU and RAM called cache. Cache memory uses SRAM. There are two kinds, called L1 and L2. Level 1 comes with the processor and is attached to it. Level 2 is just off the processor on the motherboard. Cache usually comes in 512 Kbytes or nowadays 1 Mbyte size.

The idea of cache is that once data is fetched from RAM it is stored in cache; then, if it is needed again, it can be recovered very quickly.

Although rare nowadays, computers can run out of main memory when running programs. If this happens they use an area of hard disk to store data. This is called VIRTUAL MEMORY. Obviously you can get a huge amount of Virtual Memory (these notes shorten it to VRAM, not to be confused with Video RAM); however, it is very slow (excruciatingly slow!) and is to be avoided.

So there is a hierarchy of memory available to the processor:

Type of memory   Description                       Speed of access   Cost / Availability
Registers        Locations inside the processor    Lightning         Expensive, very limited amount.
Cache            SRAM right next to processor      Very fast         Dear, limited amount.
RAM              DRAM on motherboard               Slow (ish)        Fairly cheap, large amount.
Virtual Memory   Using the hard disk               Snail’s pace      Dirt cheap, huge amount.
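The idea behind cache (keep a copy of what was fetched from RAM so that a repeat request is fast) can be sketched in Python. This is a simplified model, not how real cache hardware is organised: the class name CachedMemory, the tiny cache size and the "throw out the oldest entry" replacement rule are all assumptions made for this example.

```python
class CachedMemory:
    """A small, fast cache sitting in front of a large, slow RAM."""

    def __init__(self, ram, cache_size=4):
        self.ram = ram                # the slow main memory (a list of values)
        self.cache = {}               # address -> value, kept in order of arrival
        self.cache_size = cache_size
        self.hits = 0                 # reads answered from cache (fast)
        self.misses = 0               # reads that had to go to RAM (slow)

    def read(self, address):
        if address in self.cache:
            self.hits += 1
            return self.cache[address]
        self.misses += 1
        value = self.ram[address]     # the slow trip to main memory
        if len(self.cache) >= self.cache_size:
            oldest = next(iter(self.cache))  # assumed rule: discard the oldest entry
            del self.cache[oldest]
        self.cache[address] = value   # keep a copy in case it is needed again
        return value


ram = list(range(100, 116))           # 16 locations holding the values 100..115
memory = CachedMemory(ram)
for address in [3, 4, 3, 3, 4, 5, 3]:  # a program that keeps re-using the same data
    memory.read(address)
print("hits:", memory.hits, "misses:", memory.misses)
```

Programs tend to re-use the same instructions and data (loops are the obvious case), which is why even a small cache catches a large share of reads.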
To get an idea of the relative differences, fetching data from memory could be described like this. If the processor is the teacher sitting at his desk in room D10a and the data is a sheet of paper that he needs:
o A register is when the paper is lying on his desk.
o Cache is when the paper is in the filing cabinet.
o RAM is when the paper is in Mr. Cairns’ room.
o Virtual memory is when the paper is in the office.

Clearly more registers will speed up a computer, but they are numbered in hundreds. More cache will speed up a computer, but that comes in limited amounts. It is essential to have enough RAM in your computer because if your machine has to resort to virtual memory it will slow down very noticeably.

Whatever type of memory is being used, it is divided into locations which hold a set of bits. The number of bits in each location is called the WORD size. This is equal to the number of lines on the data bus. So:

Word size = lines on data bus = bits in each storage location.

A word is the number of bits that can be processed in one cycle. When a computer is described as a 16 bit machine it means the word size is 16 bits, there are 16 lines on the data bus and there are 16 bits in each memory location. A Nintendo 64 games machine had 64 bits in each storage location, a 64 bit data bus and a word size of 64.

Each storage location in memory has its own unique address. This address is used (via the address bus) to pinpoint whichever location the processor wishes to access. The width of the address bus determines the amount of memory that a processor can access.

Imagine an address bus had 2 lines on it; then the number of unique addresses it could carry would be 4: 00, 01, 10 and 11. 3 lines could carry 8 different addresses, and n lines could carry 2^n unique addresses.

The number of lines on the address bus determines the number of unique locations you can have in memory. If the address bus has ab lines then the CPU can access 2^ab different storage locations.
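The relationship between address bus width, word size and memory capacity can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not part of the course materials; the function names are made up for the example.

```python
def addressable_locations(address_lines: int) -> int:
    """Number of unique addresses an address bus with this many lines can carry."""
    return 2 ** address_lines


def max_memory_bytes(address_lines: int, word_size_bits: int) -> int:
    """Maximum memory in bytes: locations x bits per location, divided by 8."""
    total_bits = addressable_locations(address_lines) * word_size_bits
    return total_bits // 8


# 2 lines give 4 addresses, 3 lines give 8
print(addressable_locations(2), addressable_locations(3))

# an illustrative machine (assumed values): 16 address lines, 8 bits per location
print(max_memory_bytes(16, 8) // 1024, "Kbytes")
```

Doubling the word size doubles the memory for the same address bus, while each extra address line doubles the number of locations.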
Question 1: So if an address bus has 12 lines, what is the maximum number of storage locations it can access?
Answer: 2^12 = 4 096 different storage locations.

Now in order to work out the amount of memory in bytes, we need to know how much is stored in each location (the computer’s word size).

Question 2: How much memory can a computer have if the address bus is 24 bits wide and each storage location holds 16 bits?
Answer:
BITS = 2^ab x No. of bits in each location (where ab is the number of lines on the address bus)
     = 2^24 x 16 bits
     = 16 777 216 x 16
     = 268 435 456 bits
/8    = 33 554 432 bytes
/1024 = 32 768 Kbytes
/1024 = 32 Mbytes

EXERCISE 6
1. What does it mean to describe ROM as non-volatile?
2. Explain the difference between Dynamic RAM and Static RAM.
3. What is cache memory?
4. What is Virtual Memory?
5. Why does using cache speed up the operation of a computer?
6. Why do we not use a lot more SRAM instead of DRAM for main memory?
7. What is a word?
8. A computer has a 32 bit address bus with a 16 bit word size. Calculate the maximum amount of memory the processor can access.
9. A computer has an 8 bit data bus and a 12 bit address bus. Calculate the maximum memory it can use.
10. (hard!) A 32 bit computer can have a maximum memory of 16 Gbytes. How many lines are there on the address bus?

COMPUTER PERFORMANCE
By performance we mean how quickly a computer can carry out instructions. However, this is not as simple as it sounds: there are many factors to take into account and the instructions themselves vary a great deal, so measuring the speed of a computer is very difficult. Because of this there are many different ways of trying to measure the speed of a computer:

1. Clock speed. This is a prime factor: the clock pulses synchronise the steps in the fetch & execute cycle, so the faster the clock, the quicker it will get through the cycle.
A processor is always quoted with its clock speed (the number of pulses per second) given in Gigahertz, e.g. a Pentium IV processor 2.5 GHz, an AMD Athlon 3400+ and so on. However, clock speed doesn’t take into account any other factors like data bus width, amount of cache and so on. Secondly, different processors have different machine codes and this can greatly affect their speed, so clock speed can only properly be used to compare the same make of processor.

2. MIPS: Millions of Instructions Per Second. The more instructions that are fetched and executed per second, the faster the computer will be. This is better than just clock speed; however, it is not as straightforward as that. Again, different types of processor use different instructions: some can be simpler and faster, some longer.

3. FLOPS: Floating Point Operations Per Second. Floating point arithmetic is a good measure of a processor because the steps carried out are common to all types of processor. However, it doesn’t tell you how the processor deals with other types of instructions.

4. BENCHMARKS: The idea of a benchmark is to give the computer a specific task to do and time it. Examples are the Whetstone test, which is a series of arithmetic functions, and the Dhrystone test, which uses a series of very common programming statements and string comparisons. The benchmarks are coded in the machine code of that processor and timed from start to finish. Using a series of benchmarks you can get a good measure of a system’s overall performance. You can also just run a program like Excel with a very complex spreadsheet and time how long it takes to recalculate.

What are the main factors that affect a computer’s performance in terms of speed of processing?

1. A faster processor. A Pentium IV 3 GHz processor will be a lot faster than a Pentium IV 2 GHz processor.
However, as we saw above, you cannot readily compare the speed of two different makes of processor, but that doesn’t change the general fact that faster processors make for faster computers.

2. Data bus width. The more lines on the data bus, the larger the word size and the more bits that are processed each cycle. A 32 bit data bus will give you a much faster computer than one with a 16 bit data bus. (N.B. the address bus affects the amount of memory, not the speed.)

3. Amount of cache. Cache cuts down on the amount of accessing of slow RAM, and this speeds up the operation of the computer. More registers could help, but as they are numbered in hundreds they are not going to have a big effect. 1 Mbyte of cache can have a big effect on a computer’s speed.

4. Speeding up the accessing of peripherals will make a computer appear a lot faster. If loading and saving, or reacting to key presses or mouse clicks, are speeded up by faster interfaces then the computer will be faster (more on interfaces later).

The trend in computers since the very first PC was brought out nearly 30 years ago has been for more powerful, faster processors, more main memory and more backing store. Some years ago a computer would come with an 8 MHz processor, a floppy disc drive, a 40 Mbyte hard drive and 16 Mbytes of RAM. Nowadays it comes with a 4 GHz processor (500 x faster), floppy, CD and DVD writers, a 200 Gbyte hard drive (5 000 times larger) and 1 Gbyte of RAM (64 times larger). What is even more dramatic is that the new computer is much cheaper than the old one!

Someone once said if cars had progressed as much as computers over the last 25 years then a Rolls Royce would have a top speed of Mach 2, it would have a turning circle of 10cm and it would cost £5.
Someone else said if cars had progressed the same way as computers over the last 25 years then simple warning lights like fuel low would be replaced by ‘system error 32xx450, please send us a report’, it would randomly crash twice a week and the airbag would ask ‘are you sure’ before activating.

However, the trends are continuing: faster and faster processors, more RAM to handle all the overblown software we use and masses of hard disk space to store all our videos, MP3s, photos etc.

EXERCISE 7
1. Name four ways of measuring the speed of a computer.
2. Create a table for your four methods and outline the advantages and disadvantages of each.
3. How does the data bus width affect the speed of a computer?
4. Describe one other method of increasing computer speed.
5. How would you describe the trends in the power of computers?

Topic 3 PERIPHERALS
o Description of the use and advantages of buffers and spooling
o Description of a suitable selection of hardware, including peripherals, to support typical tasks including production of a multimedia catalogue, setting up a LAN in a school, development of a school website
o Justification of the hardware selected in terms of appropriate characteristics including resolution, capacity, speed, cost and compatibility
o Description of the features, uses and advantages of solid state storage devices including flash cards
o Description of the development trends in backing storage devices
o Description of the following functions of an interface: buffering, data format conversion (serial to parallel and analogue to digital), voltage conversion, protocol conversion, handling of status signals
o Distinction between parallel and serial interfaces
o Description and explanation of the current trends towards increasing interface speeds and wireless communication between peripherals and CPU

Buffering and Spooling
The processor is very fast, peripherals are relatively slow.
It is impossible therefore for the processor to deal directly with a peripheral, or else the processor would be sitting idle for exceedingly long times. There are two possible ways of dealing with this, which we shall illustrate by considering a printer:

Buffer: A buffer is an area of memory. The laser printer in the corner of the room has a 4 Mbyte buffer, i.e. its own 4 Meg of RAM. When anyone in the room prints, the processor sends the pages to the buffer, which it can do reasonably quickly. The pages are then printed from the buffer, which is fairly slow because printing takes time. Meanwhile the processor gets on with other jobs.

Spooler: A spooler uses backing storage to store the print job. In a busy office network where a hundred print jobs might go to the printer at the same time, a print server is used. The print job goes to the print server, which ‘spools’ it to the hard disk. The server then sends the pages from the hard disk to the printer while the network computers get on with other things. (In the old days a tape spool was used for backing store.)

So buffering is using an area of memory, spooling is using backing storage. Both allow the processor to send the print job quickly and then let it get on with something else.

(Diagrams: print jobs spooled to the server’s hard disk; print jobs sent to the printer’s buffer.)

As we have seen, the processor is too fast to deal directly with a peripheral. It also uses different values or methods for storing data (e.g. the processor might use 8V for a 1 and 3V for a 0, while a keyboard uses 15V for a 1 and -15V for a 0). If you plug 15V into the motherboard then you are going to blow it.

For a peripheral to communicate with the processor, it has to connect through an interface. The peripheral will plug into a port, usually at the back of the computer. Behind this port is a card of electronics. This is the interface.
An interface has a number of jobs to do including: buffering, data format conversion (serial to parallel and analogue to digital), voltage conversion, protocol conversion and handling of status signals.

Buffering: a buffer is an area of memory built into the interface which stores data temporarily before it is sent on to the processor (or vice versa). A keyboard has a buffer that holds 256 characters; when you press return the contents are processed.

Data conversion could include serial to parallel. Data that comes in on a serial port (e.g. USB, Universal Serial Bus) is coming in one bit at a time. This data will go onto the data bus for transferring to the processor. The data bus is parallel, so the interface buffers the bits coming in until it has enough to send them on the parallel bus.

(Diagram: serial to parallel conversion. The bits 1 1 0 0 1 1 0 1 arrive one at a time on the serial line and are sent on together on the parallel bus.)

Some interfaces are parallel (most computers still have the big parallel printer interface), however nearly all communication with peripherals nowadays is serial. The reason is that parallel is only reliable over a short distance. With longer distances the bits get out of sync and this causes errors.

(Diagram: four bits, 0 1 1 0, arriving out of sync on parallel lines.)

Analogue to digital conversion is necessary for any analogue device, which must have its data converted to digital for the computer and vice versa. Video input is an example of analogue to digital.

Voltage conversion: as mentioned before, a peripheral might use different voltage values for ones and zeros compared to the processor.

Protocol conversion: protocols are agreed rules used in data transmission about how data is formatted, timing, error checking etc. A network interface will have to handle protocol conversion to ensure data is sent out according to the rules of that network.

Status signals: this could be something as simple as the printer letting the interface know it is on and ready.
If the printer is switched off or out of paper, when you print the interface will inform you of that fact.

As with all aspects of computing there has been a great increase in the speed of interfaces. USB 2 transfers data at 480 Mbits per sec., compared with USB 1.1 which had a speed of 12 Mbits per sec. (USB is Universal Serial Bus.)

Before USB, SCSI (scuzzy, Small Computer System Interface) was popular. It was parallel and transmitted at 5 Mbytes per sec. Then ultra SCSI transferred at 20 Mbytes per sec., then ultra-wide SCSI (16 bit) at 40 Mbytes per sec.

Firewire was a successor to SCSI, but is serial. It can transfer data at 800 Mbits per sec. It was developed by Apple and is particularly useful for connecting video cameras. It is also used on all iPods. It can be used for printers / cameras / scanners etc. and devices can be daisy chained through one port. The main reason USB is used rather than Firewire is that Firewire adds a couple of pounds to a system’s cost.

Another trend is towards wireless communication with peripherals. Infra red is quite common for keyboards and mice. Laptops can communicate with printers via infra red. WAP (Wireless Application Protocol) can be used to govern how data is sent wirelessly. Bluetooth will possibly be very common in the future, especially in setting up a WPAN (a Wireless Personal Area Network) where all your personal devices (phone, laptop, PDA, computer, printer, camera etc.) can communicate.

EXERCISE 8
1. Why does it improve computer performance if the attached printer has a buffer?
2. What method do print servers on networks use instead of buffering?
3. The difference between buffering and spooling could be summarised by simply stating (copy out): Buffering is using ________, spooling is using ________ _______ to take data quickly from a computer to let it get on with other jobs while the peripheral takes the data from the buffer/spooler in its own time.
4.
What is the difference between a port and an interface?
5. What does buffering mean for an interface?
6. Why might an interface have to convert analogue to digital?
7. Give another example of data format conversion.
8. What is a network protocol?
9. Give an example of a status signal.
10. Which type of interface is very fast and suited to video transfer?
11. Why is wireless connection, as with keyboards and mice, becoming popular?
12. State a drawback with wireless connection.

Backing Storage
There has been a great deal of development in storage devices over the last 10 years and this is continuing at a great rate.

Magnetic storage devices include hard disks, floppy disks, Zip disks and magnetic tape. They are called magnetic storage devices because their recording surfaces are coated with a material that responds to magnetic fields to enable data to be stored. These storage devices can be fixed or removable. Removable storage devices allow the user to disconnect the device and physically transport data from one computer to another.

Hard disks and floppy disks have been around for 20 years in PCs but lately the floppy disk has almost become extinct. It only holds 1.44 MB and for anything other than text files that is no use. The storage capacity of hard drives has increased enormously, with 200 Gigabytes now normal. Hard disks can spin at 7 200 rpm and transfer data at 50 MB per sec.; this depends a lot on the interface, IDE or SCSI being the most common.

All discs are divided into tracks and sectors when they are FORMATTED. This divides the disc into blocks, and the track and sector give co-ordinates for each block so that the data can be found. Hard disks contain more than one disk (called platters).

When a file is saved, the name, the track, the sector and the length of the file are recorded in a table called the FAT (File Allocation Table). When a file is to be loaded, the computer looks up the FAT to find where it is.
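The lookup described above can be pictured with a small Python sketch. It models the table exactly as these notes describe it (one entry per file holding track, sector and length); it is a deliberately simplified picture, not the real on-disc FAT format, and the file name and numbers are made up for illustration.

```python
# Simplified model of the table described above: one entry per file.
file_table = {}


def save_file(name: str, track: int, sector: int, length_bytes: int) -> None:
    """Record where a file starts on the disc and how long it is."""
    file_table[name] = {"track": track, "sector": sector, "length": length_bytes}


def locate_file(name: str) -> dict:
    """Look up a file's position, as the computer does before loading it."""
    if name not in file_table:
        raise FileNotFoundError(name)
    return file_table[name]


# hypothetical example file
save_file("essay.doc", track=12, sector=3, length_bytes=24576)
print(locate_file("essay.doc"))
```

The point of the table is that the drive can move straight to the right track and sector instead of searching the whole disc.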
Magneto-optical storage devices combine magnetic and optical technologies to read and record data. With a magneto-optical disk, a laser beam and a magnetic field are used to write the data. Only the laser is used to read the data. Rewriteable CDs and DVDs, and now flash memory, have made them almost redundant.

OPTICAL STORAGE
CD drives which both read and write, and now DVD-R/W drives, are commonplace. CDs store about 650 MB and DVDs store 4.7 GB (there are dual layer discs which store 8.5 GB, and double sided is possible). These also have different rates of data transfer. The basic CD transfer rate is 0.15 MB per second, so a 32X CD transfers data at 4.8 MB per second. A DVD’s original speed is 1.4 MB per second, so an 8X DVD can transfer at 11 MB per second. CDs have probably reached their maximum speed at 52X; DVDs will probably not go higher than 16X. DVDs suffer from format differences in that you get DVD-R and DVD+R and not all drives play both.

Tape storage
Storing data on tapes used to be the only solution to backing up hard disks of large capacity. Now, with the advent of large, removable magnetic disks and optical CD-RW or DVD technology, this is no longer the case. However, removable storage media is comparatively expensive, with overall costs up to ten times that of tape. Tape, therefore, still has the edge in this market.

Tape is read and written on a tape drive. This drive winds the tape from one reel to the other, causing it to move past a read/write head. Data is written to tape in blocks with inter-block gaps between them. The tape runs continuously and a single operation writes each block.

Capacity
Magnetic tapes have large capacities, reaching up to several gigabytes, and come in a variety of sizes and formats. DAT tape is now the most popular for backups of servers.
Access
Tapes are sequential access devices, which means that to get to a particular block of data on the tape, the drive must go through all the preceding blocks of data. Accessing data on tapes is therefore much slower than accessing data on disks with direct access.

SOLID STATE STORAGE
Solid state storage means there are no moving parts; everything is done electronically. It is non-volatile (i.e. it does not need power to hold its data) and so is called ROM, however the stored data can be changed as often as you wish when it is attached and power is going into it. (Technically it should be called EEPROM: Electrically Erasable Programmable Read Only Memory!) Solid state storage comes in the shape of flash drives or USB sticks, SIM cards and the various memory cards in cameras. Solid state storage is at the 1 Gbyte stage in 2005, but this will increase. In fact a lot of research is going into this area and these devices could replace hard drives in a few years.

PERSONAL INVESTIGATION
www.webopedia and www.whatis.com are always useful sites.

Investigating Magnetic Tape Technologies
Using the Internet or current technical magazines, investigate the cost, capacity and access times of tape media. You may wish to look at the following manufacturers in your search: Athana, Gigatek, Hewlett-Packard, Imation, Verbatim.

Hard disk
You should use the Internet or current technical magazines to investigate the following characteristics of hard disks:
o Capacity
o Speed
o Cost
Using textbooks or the Internet, find out how accuracy of stored data is achieved. Keywords you might like to explore are cyclic redundancy checks and error correction codes.

Optical Storage: find out how data is stored (pits and lands), capacity, speed and cost for CDs and DVDs.

Solid State Storage: find out about EPROM, EEPROM, flash ROM, capacities and cost.

Peripherals
Here we look at the hardware devices required to carry out typical tasks using a computer system.
For example, to set up a LAN, or to develop a multimedia catalogue or a school website. When considering devices we specifically look at speed, cost, resolution, capacity and compatibility. Cost is self explanatory; compatibility is concerned with whether the peripheral will work with your system, e.g. does it have the right interface? Speed, resolution and capacity do not always apply: for some peripherals speed might be important, for others irrelevant.

Scanners
The flat-bed scanner is like a photocopier where you put a sheet or photo up to A4 size flat on the screen. Light is reflected onto photo cells called CCDs (charge-coupled devices) that detect the light, and the values are digitised (by an analogue-to-digital converter) to form the pixels of the bitmap. You can get hand scanners that you drag across the sheet / photo.

Accuracy is determined by resolution and bit depth. The resolution is how many CCDs there are per inch (dpi) and bit depth is how many bits are used to record the colour of each pixel.

Capacity does not apply to scanners, but scans (bitmaps) take up a lot of storage on disc.

Speed does not really come into consideration: there is little difference in the time taken to scan, but it could be a factor. The transfer rate from scanner to computer would also come under speed.

Compatibility would include the type of interface and whether the software runs on your computer.

Digital Cameras
Like scanners, the digital camera uses millions of photosensitive diodes called charge-coupled devices (CCDs) to record the intensity of light in an image. These analogue values are then converted to digital using an A-D-C. Digital photographs are bitmaps, made up of thousands or millions of pixels with values to represent image brightness and colour.

Accuracy
As in scanners, accuracy refers to how well the computer representation of the image matches the original. This will depend on the resolution and bit depth.
Resolution
This is measured in megapixels and can also be given as, say, 2560 x 1960.

Bit-Depth
The number of bits per pixel determines the number of colours that can be represented.

Speed does not apply at all to cameras.

For compatibility you might take into account ‘PictBridge enabled’ and the type of storage card as well as the interface.

Capacity is the number of photos that can be stored. Most cameras come with a very low capacity card, say 32 Mbyte. To be useful you need at least a 256 Mbyte card. (It might seem that a 5 megapixel camera using true colour, at 3 bytes per pixel, would need 15 Meg to store 1 photo; however, the photo is compressed, usually as JPEG, to about 3 Meg.)

Printers

Liquid ink-jet
Also known as bubble-jet, this device operates by squirting tiny droplets of ink onto the page. The ink is first heated by passing an electric current through a coil. In milliseconds a bubble of vapour appears, forcing a tiny drop of ink (measured in picolitres) from the nozzle onto the paper.

Resolution is typically 600 to 1200 dots per inch. They support the printing of text and graphics, colour and a range of shades.

Speed is pretty slow, with a range of 4 pages per minute to 8 or maybe 12 pages per minute, depending upon the model.

Cost is relatively cheap, though the cost of ink can be high.

Photo-quality ink jets are becoming popular with digital cameras and there are small dedicated photo printers. These can have very high resolutions.

Laser Printers
This type of printer uses lasers to "write" a page image onto a special drum as an electrostatic charge. The charged drum attracts toner particles that are transferred to the page and heated to set the image.

Resolution is typically 600 to 1200 dpi, although higher resolutions are available if you are prepared to pay the price. They print a complete page at a time to a predefined maximum page length and width.
Colour has now become affordable for laser printers.

Speed ranges between 4 pages per minute and 40 pages per minute.

Capacity could include buffer size, important for network printers.

Cost can be from £50 to thousands of pounds.

Standards of peripherals are changing all the time and the only way to keep up to date is for you to carry out your own investigation. Try to compare features of an expensive model and a cheaper one. The main features for all devices are: speed, cost, resolution, capacity and compatibility.
o For compatibility mainly look at the interface (e.g. USB) and possibly the software.
o Speed will not apply to some devices like cameras, but will for printers.
o Capacity could be a storage card, a buffer or not apply, as in a scanner.
o Resolution revolves around dpi and colour depth.
o For cost, take into account possible running costs as well.
For each peripheral take into account other obvious factors. For a printer the ‘footprint’ is important (the size on a desk). For a camera, consider the quality of lens, whether it takes short video and the size of the view screen.

Exercise
1. Using the Web find two different digital cameras (not too close in price). Compare the features of both cameras (in particular the characteristics mentioned above). Write a short report comparing them and recommend which one to buy, justifying your reasons.
2. As with question 1, but this time search for two scanners.
3. Imagine you have a budget of £2 500 for buying a complete computer system with software and peripherals for producing a school magazine. Investigate the cost of a computer that would be up to the task plus the cost of software. Decide what essential peripherals will be required. Justify your choices particularly in terms of speed, cost, resolution, capacity and compatibility.

P. S. I. Peripheral Scene Investigation.
Topic 4 NETWORKING
o Comparison of LANs, WANs, Intranet and Internet work in terms of transmission media, bandwidth, geographical spread and functions
o Distinction between a mainframe with terminals and a network of computers
o Descriptive comparison of peer-to-peer networks and client server networks
o Description of the functions of file, print and web servers
o Description of a node and a channel
o Description of bus, star, ring and mesh topologies using the terms node and channel
o Description of the consequences for each of the above topologies of node and channel failure
o Simple description of the functions and uses of a hub, switch and router
o Identification of the need for a network interface card (NIC)
o Description and explanation of the trends towards higher bandwidth and wireless communications
o Description of the following technical reasons for the increasingly widespread use of networks: advances in computer hardware, including processors, main memory capacity, backing storage and data transfer rates; improved network related software, including browsers and network operating systems
o Description of the misuse of networks for the following illegal purposes: breaching copyright, hacking and planting viruses
o Description of the application of the Computer Misuse Act, the Copyright Designs and Patents Act and the Data Protection Act to the misuse of networks

Definitions:
Bandwidth is the speed of a connection: the bits per second that can be transmitted.

Transmission media is what is used to transport the bits. It could be wireless or satellite, it could be copper phone lines or network cabling. Network cabling can be coaxial (like TV aerial cable) or twisted pair. Twisted pair can be shielded from electrical interference, so you get STP (shielded twisted pair) or, much more commonly, UTP (unshielded twisted pair). UTP is by far the most common and is what is used in this school.
There are various standards but by far the most common is Category 5, which transmits at 100 Mbits per second (compare with broadband at 512 Kbits per second or even fast broadband at 2 Mbits per sec). This cable is often just called Cat 5. You can also get fibre optic cable, which is very expensive but transmits over long distances and has very high bandwidth.

Comparing LAN / WAN / Intranet / Internet

LAN: Local Area Network, in one room or building or site. Cabling and hardware are usually owned by the company. High bandwidth (100 Mbits per sec), used for sharing peripherals (e.g. printers) and sharing data and files. Can also have an application server, which saves having programs installed on each machine.

WAN: Wide Area Network, covers a large geographical area, can be private or open. The cabling is not owned by the company as they will use phone lines (bandwidth from 56 Kbits to 2 Mbits per sec); the hardware might be owned by one company, as in a private WAN, or it could be like the web, which does not have individual ownership. Can be used for file sharing, data sharing, emailing, instant messaging or chat.

Intranet: This is a network (either LAN or WAN) that uses web technology to create a private web for use by only that company. So within their own network they can display information in the form of web pages, they can download (or upload) files and have email, instant messaging or chat. It is a private web.

Internet: The internet is the actual hardware: the servers, the cabling, the modems, the routers and so on. There are 5 applications that run on the internet:
o The Web (http) - what most people call the Internet.
o File downloading (ftp)
o Chat (IRC)
o Email (SMTP, POP)
o Newsgroups

Comparison of LANs, WANs, Intranet and Internet work in terms of transmission media, bandwidth, geographical spread and functions:

Transmission media
   LAN:      Cat 5 (twisted pair). Owned by company.
   WAN:      Phone lines
   Intranet: Possible mixture of Cat 5 and phone lines
   Internet: Phone lines
Bandwidth
   LAN:      Very high, 100Mbps
   WAN:      56Kbps to 2Mbps
   Intranet: Externally probably 1Mbps, internally 100Mbps
   Internet: Varies, 56Kbps to 2Mbps
Geographical spread
   LAN:      One site
   WAN:      Up to worldwide
   Intranet: From one office to worldwide
   Internet: Worldwide
Functions
   LAN:      Sharing peripherals, files, email
   WAN:      Sharing files, email
   Intranet: Sharing files, email. Central access to up-to-date documents
   Internet: Web / information / shopping / email / chat etc.

NETWORKS
Original networks were multi-access systems where you had a central mainframe with lots of terminals attached. These were usually ‘dumb terminals’ in that they did not have any processing power of their own; they ran the programs off the mainframe and stored all their data on the mainframe.

You can still get these types of system today, though they are called ‘thin client’ nowadays rather than dumb! They need a very powerful central mainframe but are easy to manage as all upgrades etc. are done on the mainframe rather than servicing hundreds of terminals. The terminals are very cheap and are easily replaced.

Most networks however are not ‘thin client’ but use proper, quite powerful PCs as the terminals, and each PC has its own software installed. There are two types of such networks:

Peer to Peer: (don’t mix this up with Kazaa and its like, though it is the same idea). In P2P networks there is no central server; each PC is equal on the network, and each machine runs its own software and saves its own data. However, the PCs are joined together and can share data, send messages and share peripherals.
Each user decides which files on their machine can be shared and whether they are read only or read/write. If a user is not logged on then their files are not available to everyone else. There is no central manager and each user is responsible for their own backups. Security is very low level. On this course by P2P we mean a LAN in an office. Kazaa, Morpheus etc. have just taken that idea and turned it into a WAN by running software on a server on the Net.

Client server: Here there is a central file server that controls access to the network through logins and passwords. Files are stored centrally on this file server (your H drive). Files can be made available for sharing (e.g. My Network Places). You can have internal / external email, file downloading and peripheral sharing. Backups are managed centrally from the file server.

SUMMARY

Peer to Peer:
o Every node equal.
o Share peripherals / send messages.
o Users make files available for sharing.
o Have to be logged on for files to be available.
o Each user responsible for backups.
o Low security.

Client / Server:
o Central server controlling access.
o Share peripherals / send messages.
o Files available on server.
o Backup organised centrally.
o High level of security.

For client / server you must have a Network Interface Card (NIC) fitted to your computer. Each NIC has a unique number (the MAC address, short for Medium Access Control) that the server uses to identify your computer.

SERVERS

The file server stores everyone’s data and manages access. You can also have a print server (though we don’t) to spool print jobs and manage the printing; this could be just a basic PC. There can be application servers for thin client systems, email servers and web servers. Web servers store and transmit web pages; they are sometimes the same machine as a proxy server or can be separate. A proxy server connects a LAN to the Internet.
All requests for web pages go through the proxy server, and as far as the Internet is concerned there is only one machine connected, although there might be hundreds of PCs on the LAN. A firewall usually runs on the proxy server.

SUMMARY:
o File server: stores data centrally and handles logins and passwords.
o Print server: manages printing on a network, spools and schedules print jobs.
o Web server: stores and transmits web pages.

Exercise
1. List 3 differences between the characteristics of a LAN and a WAN, choosing from transmission speed, geographical spread, functions and bandwidth.
2. What is an Intranet and what are its main functions?
3. What are the main functions of the Internet?
4. What is the difference between a mainframe with terminals and a network of computers?
5. Describe the functions of:
   a. A file server
   b. A print server
   c. A web server
6. Compare and contrast peer to peer networks with client server.

There are various designs for the layout of networks. In these designs a terminal is called a node and the wire (or wireless!) is called a channel. These designs are called the network topology. You should be able to describe (or draw) each of these 4 topologies and explain what would happen if there is a node or channel failure.

BUS

A typical bus network uses the Ethernet standard. It has a single channel to which all the nodes are attached. It can use co-axial cable or Cat 5 and transmits at 10 to 100 Mbits per sec. It needs terminators at the ends to kill any messages that are not picked up by the nodes, as only one message can be on the channel at one time. Cable is usually limited to 100 metres maximum, but repeaters can extend this. This is a very simple network and cheap to set up, easy to extend and simple to add or remove nodes. If a node is down it has no effect on the rest of the network (unless it is the file server).
If the main channel (backbone) goes down then the whole network is down.

STAR

For a LAN this will also use Ethernet protocols and tends to be twisted pair or coaxial cable. All data passes through a central hub, which usually has a file server attached. This set up uses more cable than a bus. The whole network will be down if the hub or the central server fails. Node failure will not affect the rest of the network. It is fairly simple to add or remove nodes.

RING

A monitor station is required for a ring to remove any signal that is not being picked up by any node. If the main cable goes then the whole network will be down. If a node is down, then the network might go down, but you can have a ‘bypass’. Nodes can only transmit when the ‘token’ is available. The token is a message container and only one message can be on the ring at one time. Rings can be large networks because each node retransmits the token as it goes round.

MESH

The example here is a fully connected mesh; in general a mesh does not need every node connected to every other node. There will always be multiple routes from node A to node B. This is the most reliable of networks, as a channel or node failure has no effect on the rest of the network; however it is the most expensive to cable.

Exercise
1. On a network what is meant by:
   a. A node
   b. A channel
2. Draw a labelled diagram of a bus network.
3. What is the effect of node failure on a star network?
4. Why is a mesh network the most reliable?
5. What is the effect of node failure on a ring network?

HARDWARE

We have already come across a hub as the centre of a star network. It can also act as a repeater, extending the distance of UTP cable. A switch, or switched hub, also reads addresses on the network traffic and only sends the data to the correct node.
An ordinary hub broadcasts the data to every node. A switch will also effectively divide the network up into a series of different segments, thus reducing the likelihood that a cable fault will bring the whole network down.

A router is used to connect networks (internetworks). It uses IP addresses to determine which route or path data should take to get to its destination.

A Network Interface Card (NIC) is the circuit board (interface) behind the Ethernet port at the back of a computer. You cannot connect to a LAN without an NIC. Every NIC has a unique MAC address that identifies the node on a network.

Summary:
o A Network Interface Card is essential for connecting a node to a network.
o A hub is where all the nodes connect together in a star network. When a message comes in for a node the hub transmits the message to every node; the message has an address on it and the node it is for picks it up.
o A switch is an ‘intelligent’ hub: when a message comes in it looks at the address and only transmits it to that node.
o A router is for forwarding data through an internetwork. It uses the IP address to determine the best route to send the data on its way.

TRENDS

Networks are becoming faster. There has always been a trend towards higher bandwidth on LANs and WANs due to the amount of data that has to be transferred, particularly multimedia data, music and video, which all have large files. LANs now normally run at 100 Mbits per sec, compared to 10 Mbits per sec previously. Modems were originally 14 Kbits per sec, quickly superseded by 28 Kbps, then after a while came 56 Kbps. More recently we have ISDN and broadband. Home ADSL was 512 Kbps; recently that has become 2 Mbits per sec for those that want it. Faster speeds will still be sought so that eventually the Internet could provide video on demand, personal TV scheduling and so on. Another recent trend is towards wireless connections and home networks, and this will inevitably continue.
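To get a feel for what these bandwidth figures mean, the sketch below works out roughly how long one file would take to send over some of the links mentioned above. It is only an illustration: the function name and the 5 Mbyte file size are made up for the example, it assumes 1 Mbyte = 1024 x 1024 bytes (as in Topic 1) and 1 Kbit per sec = 1000 bits per sec, and it ignores all overheads, so real transfers would be slower.

```python
# Rough transfer times for one file over the link speeds quoted in the notes.
# Assumptions (not from the notes): a 5 Mbyte file, 1 Kbit = 1000 bits on the
# link, and no protocol overhead at all.

LINKS_KBITS_PER_SEC = {
    "56K modem": 56,
    "Home ADSL (512 Kbps)": 512,
    "Fast broadband (2 Mbps)": 2000,
    "Cat 5 LAN (100 Mbps)": 100000,
}


def transfer_seconds(file_mbytes, link_kbits_per_sec):
    """Return the ideal time in seconds to send file_mbytes over the link."""
    file_bits = file_mbytes * 1024 * 1024 * 8       # Mbytes -> bytes -> bits
    return file_bits / (link_kbits_per_sec * 1000)  # bits / (bits per second)


if __name__ == "__main__":
    for name, speed in LINKS_KBITS_PER_SEC.items():
        print(f"{name}: {transfer_seconds(5, speed):.1f} seconds")
```

On these assumptions the same file takes over twelve minutes on a 56K modem but under half a second on the LAN, which is why multimedia only became practical as bandwidth increased.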
Technical reasons for the increasing use of networks include advances in processors, main memory, storage and transfer rates.

Processors have become much more powerful and capable of handling all the processing concerned with network traffic, both for servers (which often use multiple processors) and for clients.

The Web is multimedia, and with LANs moving to intelligent clients, huge amounts of RAM are necessary. However large quantities of DRAM are readily and cheaply available for modern computers, allowing them to handle these applications.

Huge amounts of storage are required for client server networks as well as web servers. Client PCs also need enormous storage for multimedia files. Again this has become readily available, with even cheap PCs coming with 200 Gbytes of storage.

As stated earlier, transfer rates have increased enormously. Home broadband of 2 Mbps is readily available and this has transformed the web in terms of multimedia content, which needs large bandwidth. For LANs, UTP used to be 10 Mbps and this has increased, with 100 Mbps being normal nowadays for Cat 5 cable. This allows larger, more data intensive LANs to operate.

So: faster and more powerful processors, cheap and large quantities of RAM, cheap, large capacity hard discs and increasing bandwidth have all been important hardware factors in the rising use of networks.

In addition to hardware advances, there have been great advances in software. A browser is a program that lets you view pages and interact with the World Wide Web. Browsers have advanced a great deal since their inception, and all a user has to do nowadays is enter an address in the address line or click a link to go to a site. Various plugins can be added that greatly enhance their capabilities. Software like Flash, RealPlayer and so on has also been integrated with browsers, giving us the present multimedia web that we all know.
Operating Systems have also advanced greatly. Since Windows 95, peer to peer networking has been possible. Windows NT, updated to Windows 2000 Pro and now Windows XP Pro, allows client server networking. (NT stood for New Technology.) All client computers must have a Network O/S on their machines to connect to the network. These Operating Systems have become more reliable and more feature rich and so are another technical reason for the increasing use of networks. Security has also been greatly enhanced, which is very important nowadays with hacking and viruses.

MISUSE OF NETWORKS

Hacking is illegal access to a network.

Controlling access to the network

The network operating system is responsible for security on the network. The most obvious example of this is when a user logs on. The user must supply an identity and a password. The operating system compares the data entered with the identities and passwords in its database, and if the two do not match up then it will not allow that user any access to the resources on the network. If the identity and password do match, then the resources which the user has access to will depend on the level of permissions that user has been given by the network manager.

There will always be a network manager who has access to everything, including everybody’s files and passwords. They are able to trace anyone misusing the system, they organise the permissions for different categories of users, allocate ids and passwords to new users, and remove old users from access.

The easiest way for a hacker to operate is to find out a user’s login name. This is often freely available or can be found out from emails. Then they find the password. This might seem fairly impossible, but can often be remarkably easy:
1. Bribery (everyone has their price!)
2. By knowing about them e.g. family names, football team, favourite singer etc.
3. By a ‘con’ e.g. phone them saying you are from the helpdesk and need their password to check something.
4. Set up a dummy website (‘phishing’) which seems like a real one, where they have to enter their password.
5. Planting software on their system that records keystrokes.
6. Simple burglary, finding it written down in their desk drawer.
7. Actual hacking using ‘backdoors’ and faults in the Windows software.

Hackers often work for the imagined kudos rather than to do anything destructive. However there are also plenty of vandals out there who get a kick out of disrupting or destroying. Then there are professional criminals, and Eastern Europe has some very clever unemployed computer scientists working on hacking into financial systems.

To keep yourself safe you must have the latest version of your Operating System installed and make sure you get automatic updates. You must have an effective firewall, and you must run anti-adware programs regularly. Be aware of the cons in the list above.

Viruses

A virus is a piece of programming code that causes some unexpected and usually undesirable event in a computer system. Viruses are often designed so that they automatically spread to other computer users on a network. Viruses can be transmitted as attachments to an e-mail, as a download, or be present on a disk being used for something else. Some viruses take effect as soon as their code takes residence in a system, whilst others lie dormant until something triggers their code to be executed by the computer. Viruses can be extremely harmful and may erase data or require the reformatting of a hard disk once they have been removed. Up to date virus protection software is essential on any computer connected to the Web.

Copyright

P2P software like Kazaa, Morpheus or BitTorrent has led to a huge market in free downloading of music, videos and software.
Revenge of the Sith was on the Web before it was released to cinemas. Anything made available to a file sharing network can be downloaded a million times within a day and it is quite socially acceptable to steal in this way, nobody thinks anything of it, though court cases are arising as music companies try to strike back. One of the main problems is in the copyright laws themselves. It is illegal to make a copy of anything (text, picture, music etc.) in electronic or any other form. Now when you access a website a copy is ‘cached’ on your hard disk. So technically it is illegal to access most websites. This means the Prime Minister, the Chief Constable and everyone else is breaking copyright. If you keep a video recording of a program, you are breaking the law. Showing a video to friends is illegal. It all becomes a bit absurd, so downloading some mp3 files doesn’t seem very wrong either. Computer Misuse Act In the United Kingdom, the Computer Misuse Act (1990) covers using computers to damage or steal data. The Computer Misuse Act covers crimes such as breaking into computer systems or networks to destroy or steal data and propagating viruses that destroy or damage information or computer systems. Data Protection Act In the United Kingdom, the Data Protection Act (1998) describes the duties and responsibilities of those holding data on individuals. It also describes the right of these individuals. In general, it is the duty of those holding data on individuals to register with the Data Protection Registrar, to keep the information secure, make sure it is accurate, and to divulge it only to those persons who are authorised to view it. It is the right of an individual who has data stored concerning them to view that information and to have it changed if it is inaccurate. There are a number of organisations that may be given exemption from this act -namely the Police, Customs, National Security and Health Authorities. 
Copyright Designs and Patents Act

This protects anyone’s rights to anything they have ‘created’, whether it is an essay, a song, a piece of artwork etc. It makes it illegal for anyone to make a copy, never mind sell it or anything else.

Exercise
1. What is the main difference between a hub and a switch?
2. What is the job of a router?
3. Explain 2 technical reasons for the growth of networks.
4. One user of a network gets access to all files and folders, another user is restricted to only some files. How is this enabled?
5. Give some examples of how hackers can gain access to a network.
6. What is a virus?
7. Which law makes the spreading of viruses illegal?
8. Why has downloading mp3s become socially acceptable?

Topic 5 SOFTWARE

o Description of the function of a bootstrap loader
o Description and exemplification of the main functions of a single user operating system:
   o interpreting users’ commands,
   o file management,
   o memory management,
   o input/output management,
   o resource allocation,
   o managing processes
o Definition of a utility program
o Description of utility programs (including virus checker, disk editor and defragmenter)
o Description of the standard file formats for graphics files: jpeg, gif, TIFF
o Description of a suitable selection of software to support typical tasks including production of a multimedia catalogue, setting up a LAN in a school, development of a school website
o Description and exemplification of software compatibility issues (including memory, storage requirements, and OS compatibility)
o Classification of viruses by type of file infected: file virus, boot sector virus, macro virus
o Description of the following virus code actions: replication, camouflage, watching, delivery
o Distinction between a virus, a worm and a Trojan horse
o Description of anti-virus software detection techniques:
   o use of checksum,
   o searching for virus signature,
   o heuristic detection and
   o memory resident monitoring

There are two kinds of software: Applications and System Software. Applications are what we run in order to do something useful on the computer e.g. Microsoft Office. System Software is the set of programs that let our Applications access the hardware, control the running of the computer, protect the system and perform housekeeping tasks like tidying up disks, e.g. Windows and utilities.

Systems software

The operating system is part of the system software. The purpose of an operating system is to provide the user with a means of accessing the computer hardware and operating the system without knowing anything technical. The operating system can be viewed as providing a layer of software between the user applications and the underlying hardware of the machine.

Nowadays, as the O/S is so large, it is held on hard disc. However the computer cannot do anything without a program, so there must be something in ROM that is there as soon as the computer is switched on. This program checks the hardware, the RAM, peripherals etc. Its main task however is to load the O/S from disk. This ROM based program is called the bootstrap loader. Starting a computer is often called ‘booting’. (The Americans have an expression ‘Pull yourself up by your bootstraps’ meaning to get yourself organised. Bootstraps are the loops used to pull boots on.)

As the O/S is a very complex program it usually consists of many programs working together. It can be best described as having six functions:
1 interpreting user commands;
2 file management;
3 input and output;
4 memory management;
5 resource management;
6 managing processes.

1. Interpreting user commands is undertaken by the ‘Command Language Interpreter’. This name comes from old command driven Operating Systems; nowadays it takes the commands from mouse clicks or menus. However the object is the same: take commands from the user, interpret them and pass them to the appropriate layer.
2. File management is about organising files on the disc and maintaining the hierarchical file structure. It does this by maintaining tables called the FAT. These File Allocation Tables contain the names of all the files, where they are on disc, the length etc.

3. Input / Output. The BIOS in a PC is the Basic Input Output System. This layer deals with all the peripherals and their interfaces. It sends data to the printer, loads data from a disc and so on. All the device drivers are part of the Input / Output System.

4. Memory Management. Computers can run more than one program at once. Each program will take up an area of memory and it is Memory Management’s job to allocate space in RAM to the different programs. However, as programs run they produce data which must also be stored in the RAM. It is very important that this data does not overwrite the memory space of another program, or else that program will crash. This is the most important job: ensuring that one program does not interfere with the memory allocated to another program. In particular it must ensure that no program interferes with the O/S.

Possible Memory Map for a PC running I.E. and Word:
   0 to 100 Meg: Windows
   100 to 125 Meg: Word
   125 to 165 Meg: Internet Explorer
   165 to 256 Meg: free

If Word is creating data and it runs out of space in its block from 100 to 125 Meg, then there are locations free between 165 and 256 Meg. However, if Windows has also been using that area to store data, then this would crash the PC.

5. Resource Allocation. This could be needed in a multi-tasking environment. When a computer is running more than one program it cannot have two programs printing at the same time. Resource management ensures this does not happen. It is certainly needed in a multi-access environment where many users might be accessing a peripheral at the same time.

6. Managing processes (the Kernel). A process is a running program.
Even when you think nothing is running on your computer, just check the Task Manager to see all the processes going on. The CPU can only be used by one program at a time, so the job of the kernel is to schedule access to the processor by all the processes that are running. The simplest way to do this is ‘round robin’, where it just goes round each process in turn giving it say 0.01 seconds access to the processor, then switching to the next process and so on. However it is usually more complicated than that, as a system of priority operates.

All these functions of an Operating System work together as you use a computer. Imagine you start up a PC, load Word, load a file and print it. Here is what will be going on:

1. Switch on
The bootstrap loader in the ROM BIOS checks your PC, what memory you have and what peripherals are attached, and then loads Windows from your hard disk.

2. Load Word
The Command Language Interpreter takes your command to load Word and passes it to File Management, which has the location of Word on the hard disk. Once Word is located on disk, the Input / Output System takes in the data from the disk and passes it to Memory Management, which positions Word in its own block of RAM.

3. Load a File
Exactly the same as for loading Word, though the file goes in a separate area of RAM.

4. Print the File
The Command Language Interpreter takes your command and passes it to the Input / Output system, which communicates with the interface and sends the data to the printer.

In addition you have the kernel, Process Management. It switches all these programs in and out of the processor, so that the Command Language Interpreter is running, then Process Management removes that program from the CPU and brings in File Management, and so on. The Input / Output system would also be showing all this on the monitor. Resource Management would handle any conflict in use of disks, printers etc. if you were multi-tasking (e.g. playing a CD while doing your work).
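The ‘round robin’ idea described above for the kernel can be sketched in a few lines. This is only a toy model, not how Windows actually schedules: the process names and running times are invented, times are in milliseconds (so the 0.01 second slice becomes 10), and priorities are ignored.

```python
from collections import deque


def round_robin(processes, time_slice):
    """Simulate round robin scheduling.

    processes is a list of (name, cpu_time_needed) pairs and time_slice is
    the longest a process may hold the CPU in one go. Returns the list of
    (name, time_run) turns in the order the CPU was handed out.
    """
    queue = deque(processes)   # processes waiting for the CPU, in order
    turns = []
    while queue:
        name, remaining = queue.popleft()   # next process gets the CPU
        ran = min(time_slice, remaining)    # it runs for at most one slice
        turns.append((name, ran))
        if remaining > ran:                 # not finished: back of the queue
            queue.append((name, remaining - ran))
    return turns


if __name__ == "__main__":
    # Hypothetical processes and the CPU time (ms) each one still needs.
    jobs = [("Word", 25), ("Explorer", 10), ("Print spooler", 15)]
    for name, ran in round_robin(jobs, 10):
        print(f"{name} runs for {ran} ms")
```

Each process gets a turn before any process gets a second one, which is the whole point of round robin; a priority system would change the order in which processes rejoin the queue.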
So, although you click Print or Load file in Word, the program just passes this command to the Operating System, which handles all the processes involved.

Utility programs

Utility programs perform housekeeping tasks on a computer. Some come with the Operating System, others can be added. Most operating systems will include:
1 disk cleanup;
2 disk defragmentation tools;
3 disk formatting programs.

1. Disk cleanup (disc editor)
This frees up space on your hard disk by:
• Removing temporary Internet files.
• Removing Windows temporary files.
• Removing optional Windows components that you are not using.
• Removing installed programs that you no longer use.

2. Disk defragmentation tools
Disk defragmentation tools are used to combat the problem of files being split up on a disc. When there has been a lot of deleting on a disc, empty space is all over the disc instead of just at the end. When a large file is saved the O/S will save part in one free space, then another part in free space elsewhere, and so on. A ‘defragger’ finds these files and puts them together. In doing so it also squeezes all the files together so that the free space goes to the end.

3. Disk formatting programs
Disk formatting programs are used to prepare the surface of a disk for use. The process often involves laying down ‘markers’ for tracks and sectors. Formatting deletes any data on a disk.

Third party utilities

Third party utilities are tools provided by others than the supplier of the operating system, like WinZip. These programs can often be free and include:

Anti-virus software
Anti-virus software is used to prevent the spread of viruses, which are small, usually malignant programs that spread amongst machines. As programs, viruses need to be executed before they can be effective, so they often attach themselves to program files. Another way is as email attachments often masquerading as something else, e.g.
AnnaKournikova.jpg was an infamous virus which caught out millions throughout the world, though luckily it was not harmful: it just sent an email to everyone in your address book and directed them to a webpage. Other viruses are not so harmless and can destroy all your data. Commercial anti-virus programs provide regular updates for their products, and given the speed with which viruses can propagate through the system, it is a good policy to update virus protection as often as possible. There is more on viruses later.

Exercise
1. What is the function of the bootstrap loader?
2. What is the job of file management?
3. What goes on in the Process management layer of the Operating System?
4. When you save a file to disc, outline the layers of the Operating System that will be involved.
5. What is a utility program?
6. What does defragging a disc mean?
7. Give an example of a disc editor.

SOFTWARE

When you buy a program it comes with a System Specification (see the note below). This will list the requirements the software needs to run on your computer. Examples are:

Memory: A minimum amount of RAM will be required e.g. 128 Mb.
Processor: A minimum processor or a certain processor type e.g. Pentium III 800 MHz or better.
Hard Disk: A minimum amount of free space on your disk drive, e.g. 50 Mb free.

These are hardware requirements. There is usually also a software requirement, e.g. Operating System: Windows ME or later.

Note: Always take system specs with a pinch of salt. They’ll say it needs Windows XP and 128 Mb RAM; I wouldn’t run Notepad on that system, as it will be using virtual memory the whole time. They just want to sell software and not put you off by making you think you need a new computer. If it says 128 Mb RAM, then 256 Mb MINIMUM will really be needed; likewise for processor, hard disk space etc.

For your computer, try and answer these questions.
What operating system is installed?
What is the available RAM?
What is the processor speed?
What is the disk capacity?
What peripherals are available?

Exercise:
1. Go to www.dabs.com and under software, Graphics and Media > Illustration & Drawing find Photoshop. Under ‘Specification’ scroll down to the bottom and check to see if this program will run on your computer.
2. Try to find Adobe Premiere Pro video editing and check to see if that will run on your computer. While in DABS software, have a look at Utilities for sale.
3. The iPod Nano is on sale at the Apple store (Google iPod nano). PC requirements are given on the bottom right of the page. Does your PC meet these requirements? Are there any hardware requirements?

Various tasks a computer is used for have certain software requirements.

Newsletter: Word plus some graphics / photo editing software. Possibly a Desk Top Publishing program like Microsoft Publisher.

Website Development: A web authoring package like Dreamweaver, graphics software such as Fireworks, possibly photo editing, Animation (possibly Fireworks for gifs) or Flash.

4. Using the web find a list of software for the above tasks with their costs and system requirements.

IN SUMMARY

The main software compatibility issues are:
o Memory requirements
o Storage requirements
o Operating System requirements

A program that requires 256 Mb RAM will not run on a machine with 128 Mb RAM. A program that needs 2 Gbytes of space to install on a hard disk won’t install if your disk is full. A program that requires Windows XP will not work on a Windows ME computer.

STANDARD GRAPHIC FILE FORMATS

A standard file format means any program of that type can load the file. In Word Processing, Microsoft Word cannot load Appleworks files and vice versa. However they can both save and load a file as RTF (Rich Text Format), so RTF is called a standard file format for Word Processing.
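Looking back at the three compatibility issues in the summary (memory, storage and operating system), the check can be sketched as below. The function name, the dictionary keys and the example figures are all invented for illustration; real installers do their own checks in their own way.

```python
def compatibility_problems(requirements, machine):
    """Return a list of reasons why the software will not run (empty if OK)."""
    problems = []
    if machine["ram_mb"] < requirements["ram_mb"]:
        problems.append("not enough RAM")
    if machine["free_disk_mb"] < requirements["disk_mb"]:
        problems.append("not enough free disk space")
    if machine["os"] not in requirements["supported_os"]:
        problems.append("operating system not supported")
    return problems


if __name__ == "__main__":
    # Hypothetical specification printed on a software box.
    spec = {"ram_mb": 256, "disk_mb": 2048, "supported_os": ["Windows XP"]}
    # Hypothetical PC: too little RAM and the wrong O/S, but plenty of disk.
    pc = {"ram_mb": 128, "free_disk_mb": 40000, "os": "Windows ME"}
    print(compatibility_problems(spec, pc))
```

An empty list means all three requirements are met; any entry in the list is enough to stop the program installing or running.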
JPEG

JPEG (Joint Photographic Experts Group) is a group of experts that develops and maintains standards for compression algorithms for computer image files. JPEG processing makes image files small by removing detail. This is called lossy compression. This will reduce the number of colours used to store the image and avoid unnecessary repetition of bit patterns.

GIF

The Web also supports GIF (Graphics Interchange Format) images. These images are based on a compression algorithm that creates a codebook or dictionary of particular bit patterns. These patterns are then substituted by shorter codes, resulting in a smaller file. When decoding, the algorithm uncompresses the file to generate the original image. This is lossless compression. An algorithm that allows compressed data to be decompressed is called a CODEC (COmpress - DECompress); see the note below.

TIFF

Tagged Image File Format; can be any resolution, any number of colours. Used for bitmaps, especially scanner images.

Note: LOSSLESS COMPRESSION using a CODEC

Here is a simple sentence. There are simply too many ways from here to there to impel one to go there by the impossible route.

Our CODEC is:
Here replaced by X
Imp replaced by Y

Our sentence becomes:
X is a sYle sentence. TX are sYly too many ways from X to tX to Yel one to go tX by the Yossible route.

Not counting spaces we have replaced 102 characters by 79, about a 23% reduction. The full message can be reassembled as we know the codec.

Viruses

A virus is programming code that causes some undesirable and unexpected event to happen in a computer. Viruses are usually disguised as something innocent and are designed so that they automatically spread between computer systems. Viruses can enter a system as attachments to an e-mail, a download from the Web, or from a disk or CD. Some viruses take effect as soon as their code is executed and others can wait until circumstances cause their code to be executed by the computer e.g. a certain date.
Viruses can be quite harmful and erase data or close down a system. Virus types Viruses are classed by three main types: � File virus; � Boot sector virus; � Macro virus. File virus A file virus can be attached to a program file (.exe) so that when you load the program, you load the virus. A file viruses can also take the form of a complete program attached to something else, e.g. an e-mail. They then take up residence in the computer ready to cause havoc. These are often disguised as something else. Boot sector virus These viruses infect the boot sector on disks where programs are executed when the Operating System starts. Every time your computer starts up, the virus is loaded in with the O/S. Macro virus Macro viruses are fairly common viruses, but tend to do the least damage. Macro viruses infect applications and typically cause a sequence of actions within the application e.g. inserting unwanted words or phrases in a document. These also often come attached to documents in an email. Page -71- Higher Systems Software Virus code actions Viruses don’t all follow the same course of action. They can, and do, use a combination of the following actions: � Replication; � Camouflage; � Watching; � Delivery. Replication Like a biological virus they can spread quickly and are can be difficult to control. They can attach themselves to almost any type of file and spread as files copied and sent between computer users. A virus programmed to operate on a certain date can have a long time to replicate itself before activation happens. This gives it time to be spread over many computers before being discovered. Camouflage It is possible for a virus to avoid detection by taking on the characteristics that detection software is programmed to look for and ignore. However, detection software has evolved to prevent this happening. (Today’s anti-virus software does much more than simply check particular characteristics (or signature) of a virus. 
It also checks the virus code and even checksums the virus code to identify it. With these cross-checks it would be extremely difficult for a virus to camouflage itself and get past detection. This is dealt with later.)

Watching

A virus can lie in wait and ambush a computer when something routine is carried out, e.g. opening a particular application. The damage routines activate when certain conditions are met. The trigger may be a certain date or a particular action performed by the infected user.

Delivery

Infected disks brought in from outside used to be the main source of viruses until e-mail provided the ideal delivery vehicle. Downloads from peer-to-peer sites are another common source. Once delivered, the virus will wait for the trigger to wreak its havoc.

Worm

A worm is self-replicating code that does not alter any files but takes up residence in the computer’s active memory and duplicates itself. Worms only become noticeable once their replication consumes memory to the extent that the system slows down or is unable to carry out particular tasks. Worms tend to use the parts of the computer’s operating system that are not seen by the user, until it’s too late.

Trojan horse

A Trojan horse is a program in which harmful code is contained inside other code that appears to be harmless. Once the apparently harmless code is in the computer, it releases the malicious code to do its damage. A Trojan horse may even claim to be anti-virus software in order to get the user to install it.

Anti-virus techniques

The best protection against a virus is to know that each file you open from an e-mail, disk or the Web is free from any virus. This requires anti-virus software that can screen e-mail attachments and Web downloads, and that checks all of your files from time to time, removing any viruses that are found. Techniques used by anti-virus software to detect a virus include:
o Checksum;
o Signature;
o Heuristic;
o Memory monitoring.
Checksum

In checksum detection, the binary values of all the machine code in key files (particularly boot files) are added up as numbers and the total is stored in the system. When one of these files is called to execute, its checksum is recalculated and compared with the stored value. If the two differ, the file about to be run could have been infected and a warning is given.

Virus signatures

A virus signature is a unique pattern of bits within a virus. Once it is known, anti-virus software uses the signature to scan for the presence of the malicious code and remove it. This is why anti-virus software has to be updated regularly: newly identified signatures are added so that those viruses can then be easily detected.

Heuristic detection

Heuristic detection describes the technique of approaching a problem through previous experience. It is used to find unknown viruses that have not yet been identified by their signatures, by looking for characteristics in a file that have previously been associated with a known virus. Heuristics can also detect a virus that has disguised its signature, by recognising a particular sort of behaviour. For example, if a file attempts to access your address book then that might be suspicious. If the same file includes code that checks a date, then the suspicion rises. There will come a point when a warning is issued about the possibility of a virus.

Memory resident monitoring

Some anti-virus software is memory resident and is loaded on start-up. It actively monitors the system for viruses whilst the computer is switched on and checks programs for infection every time they run. This includes checking the boot files on start-up, any disk as it is accessed, any files accessed during operation, and any files being loaded onto the hard drive. The price to be paid with memory resident programs is that they can cause delays in program loading and execution whilst the checks are being carried out.
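The checksum and signature techniques described above can be illustrated with a short Python sketch. It is only a toy model, not how any particular anti-virus product works: the function names (simple_checksum, is_unchanged, find_signatures), the file contents and the signature called ExampleVirus are all made up for illustration, and the checksum is the simple "add up the bytes" method from these notes rather than the stronger checks real tools are likely to use.

```python
def simple_checksum(data: bytes) -> int:
    """Add up every byte of the file as a number, kept within 32 bits."""
    return sum(data) % (2 ** 32)


def is_unchanged(data: bytes, stored_checksum: int) -> bool:
    """Recalculate the checksum and compare it with the stored value."""
    return simple_checksum(data) == stored_checksum


def find_signatures(data: bytes, signatures: dict) -> list:
    """Return the names of any known bit patterns found inside the data."""
    return [name for name, pattern in signatures.items() if pattern in data]


# Hypothetical data, invented purely for this example.
clean_file = b"LOAD OS; START SHELL"
stored = simple_checksum(clean_file)          # saved while the file is known to be clean

infected_file = clean_file + b"\xde\xad\xbe\xef"   # pretend a virus has attached itself
known_signatures = {"ExampleVirus": b"\xde\xad\xbe\xef"}

print(is_unchanged(clean_file, stored))                  # True
print(is_unchanged(infected_file, stored))               # False
print(find_signatures(clean_file, known_signatures))     # []
print(find_signatures(infected_file, known_signatures))  # ['ExampleVirus']
```

Note that adding up bytes gives the same total if the same bytes are merely rearranged, so a checksum this simple can miss some changes. That is one reason the notes describe checksums being used alongside signature and heuristic checks.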
Using a virus information library

Activity

Literally thousands of viruses have been detected and catalogued using a Virus Information Library (VIL). One such VIL can be found at: http://vil.nai.com/vil/default.asp

Using this or another source, find the details of at least one of each type of virus:
o file
o boot
o macro
o worm
o Trojan
o a hoax [5]

Use the search box to enter your criteria, although this won’t always work. Also use Google to search for Virus Information Library, or try general searches like ‘boot virus’. For each one, make a note of the following:
1 name:
2 type:
3 symptoms:
4 date discovered (try to find one from this year):
5 delivery (e.g. email, website):
6 cure (if any):

[5] Another problem with viruses is that some of them are hoaxes (and some of the hoaxes are themselves hoaxes). A simple example that has become quite common is for an email to arrive warning that such and such is a terrible virus, to be avoided at all costs. The email goes into great detail and exhorts you to pass this urgent message on. Victims (suckers?) immediately spread this news to everyone in their office, who spread it on, and so on. The whole thing of course is just a load of rubbish. I have known many people in Computing and IT who have passed such messages to me.

Exercise

1. Give two examples of hardware requirements that would have to be considered when purchasing software.
2. Give one example of a lossy graphic compression and one example of lossless.
3. In simple terms, how can compression be accomplished losslessly?
4. List three virus types.
5. Which type of virus starts up with your computer?
6. Which type of virus could be found in a Word document?
7. What does it mean to say a virus is ‘watching’?
8. What is a worm?
9. Why are some viruses called Trojans?
10. How does checksum detection work?
11. What is a virus signature?