CPU Structure, Function, Performance

Key Words

BUSES - The communication channels linking the CPU with the RAM and I/O devices.
CLOCK SPEED - The speed at which the CPU can process instructions.
CORES - Separate processing units contained in a single CPU.
PIPELINING - The process of queuing up jobs so that time is used efficiently, leading to greater productivity.
EXECUTE - The action of a CPU performing an instruction.
REGISTERS - Tiny memory locations situated inside the CPU.

What is the purpose of the CPU?

Its job is to process data. By processing we mean things like searching, sorting, calculating and decision making. Whenever you are working on your computer, it is the CPU which is at the heart of everything.

The Fetch-Decode-Execute Cycle

The CPU follows three steps in order to process data: Fetch, Decode, Execute.

To begin with, whenever you open and work with a program, its data and instructions are loaded into the RAM. As the RAM is accessed directly by the CPU, the CPU can get to work!

Fetch

In this step the CPU fetches some data and instructions from main memory (RAM) and then stores them in its own temporary memory locations called 'registers'. For this to happen, the CPU uses a hardware path called the 'address bus'. The address of the next item that the CPU wants is put onto the address bus. The data at that address then travels from the RAM to the CPU on another hardware path called the 'data bus'.

Decode

The decode step is where the CPU works out what the instruction it has just fetched actually means. The CPU 'decodes' the instruction and gets things ready for the next step.

Execute

The Execute stage is where data processing happens. Instructions are carried out on the data. Once a cycle has completed, another begins.
CPU Structure

[Diagram: the CPU (Control Unit, ALU, Registers), Main Memory, Input Devices and Output Devices]

The CPU is made up of 3 main areas:

Control Unit - Controls the signals required to coordinate the running of the processor, including the movement of data between components and the decoding of instructions.
Arithmetic Logic Unit (ALU) - This is responsible for the calculation and decision making carried out by the CPU.
Registers - This is the CPU's own memory store, which is extremely quick and contains several specialised registers which are at the heart of the fetch-decode-execute cycle.

CPU Buses and the FDE Cycle

[Diagram: the CPU (Control Unit, ALU, Registers), Main Memory, Input Devices and Output Devices]

Connecting the CPU to the main memory and I/O devices are 3 buses: the Address Bus, the Data Bus and the Control Bus.

BUSES - These are simply a collection of wires/channels which allow data and instructions to be communicated between devices.
DATA BUS - This bus is responsible for moving data between the CPU and the main memory (and I/O devices).
ADDRESS BUS - This bus is responsible for communicating the memory locations of the data/instructions that need to be fetched or moved between devices.
CONTROL BUS - This bus is responsible for communicating control signals from the Control Unit (for example the clock signal used to synchronise the processes).

Fetch: The Control Unit fetches the next instruction/data item by sending the address of the instruction/data along the address bus, whilst sending control signals (such as the clock signal) along the control bus. The instruction/data is then sent back to the CPU along the data bus and is stored in the appropriate register(s).

Decode: The Control Unit then receives the instruction and decodes it (makes sense of it) by looking it up in its instruction set.

Execute: The ALU will then perform the appropriate calculation, after which the Control Unit will use the address bus to request a memory location to store the output.
The Control Unit will then send the data back to memory along the data bus.

The CPU's structure and 'Assembly Language'

All 'Von Neumann' CPUs have an instruction set. This means that each CPU has a list of instructions that its Control Unit understands and knows what to do with. Surprisingly, a CPU's instruction set only consists of a few simple instructions. Here is an example instruction set:

Instruction    Purpose
ADD            Add numbers together
SUB            Subtract numbers from each other
STA            Store a value
LDA            Load a value
BRA            Branch to a different part of the program (often used to repeat code)
OUT            Output a value (after processing)

CPU Registers

A CPU is able to perform each of those instructions and it does so using some very important parts of its architecture. These parts are called registers, which are simply tiny memory locations (memory stores). The important ones have special names. The following registers are used during each Fetch-Decode-Execute cycle in order to carry out each instruction:

o The Accumulator (ACC)
o The Program Counter (PC)
o The Memory Address Register (MAR)
o The Current Instruction Register (CIR)
o The Memory Data Register (MDR)

An example of the CPU at work:

Below is an example of some instructions and data from a program which has just been loaded into main memory ready for processing.

Fetch Stage: In the first part of the cycle, the instructions and data are fetched from main memory and loaded into the specialised registers.

Decode Stage: Now that the CPU has fetched the first instruction, it needs to decode it. It does this by looking up the instruction held in the CIR in its instruction set. The Control Unit does this job and recognises LDA 4 as meaning LOAD the contents of address 04. The CPU has now decoded the first instruction and is ready to execute it!

Execute Stage: Now that the CPU has made sense of the instruction, it needs to execute it. This then repeats for another cycle starting at the next address.
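The Decode stage described above can be pictured as a lookup in a table. The short Python sketch below is only an illustration of that idea, not how a real Control Unit is built: the names `INSTRUCTION_SET` and `decode` are made up for this example, and it assumes the instruction is held in the CIR as text such as "LDA 4".

```python
# The example instruction set from the table above: mnemonic -> purpose.
INSTRUCTION_SET = {
    "ADD": "Add numbers together",
    "SUB": "Subtract numbers from each other",
    "STA": "Store a value",
    "LDA": "Load a value",
    "BRA": "Branch to a different part of the program",
    "OUT": "Output a value",
}


def decode(cir):
    """Split the instruction held in the CIR into an operation and an address."""
    parts = cir.split()
    mnemonic = parts[0]
    if mnemonic not in INSTRUCTION_SET:
        raise ValueError(f"{mnemonic} is not in the instruction set")
    # Assumption: an instruction such as OUT has no address part.
    address = int(parts[1]) if len(parts) > 1 else None
    return mnemonic, address


mnemonic, address = decode("LDA 4")
print(mnemonic, address, "-", INSTRUCTION_SET[mnemonic])  # LDA 4 - Load a value
```

An instruction that is not in the table raises an error, which mirrors the idea that a Control Unit only understands the instructions in its own instruction set.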
Common Characteristics of a CPU and how they affect performance

There are a number of common characteristics of a CPU:

Clock Speed: The speed of the Fetch-Decode-Execute cycle is determined by the CPU's clock chip. This chip uses a vibrating crystal that maintains a constant rate. The speed of the clock is measured in hertz (Hz), which is the number of cycles per second. A clock speed of 500Hz means 500 cycles per second. Current computers have CPU clock speeds of around 3GHz, which means 3 billion cycles per second.

Overclocking: It is possible to increase the clock speed of a CPU. This is known as overclocking. In theory, if the clock is faster then the CPU can perform more calculations in the same time. The problem is that CPUs get hotter the more work they do, so overclocking is dangerous without appropriate heat management.

CPU Cores: There are two ways in which a CPU with more than one core can process more instructions in the same time period:

They can carry out Parallel Processing, which is when the same program has two instructions processed at the same time on different cores.
More cores also enable Multi-Tasking, where each core processes a different program's instructions at the same time.

Either way, more cores means more instructions being processed at once. However, it doesn't necessarily mean the CPU works faster, as some programs do not allow more than one instruction to be processed at once.

Primary and Secondary Storage

A computer system will have two types of storage:

o Primary - Cache, RAM (aka main memory), ROM
o Secondary - Hard Disk, Flash Drive

They differ in a number of ways:

Primary Storage | Secondary Storage
Most primary storage is volatile (temporary). | All secondary storage is non-volatile (permanent).
Primary storage is expensive and smaller. | Secondary storage is usually cheaper and larger.
Primary storage is smaller in capacity (L2 cache is only 2MB). | Secondary storage is bigger in capacity.
Primary storage is closer to the CPU and/or is integrated onto it, and is therefore faster. | Secondary storage connects to the CPU via cables and is therefore slower.

The Speeds of Different Storage Types

The higher up on the pyramid a storage type is, the faster it is.

The Cache

As we have just seen, CPUs can work very quickly indeed, but unfortunately CPUs can only work when supplied with data. The RAM (that supplies the data) cannot work at the same speed. To overcome this, the CPU's cache memory will not just copy the instruction needed at that time; it will also copy the instructions that follow it. Cache memory has read speeds similar to the CPU and is therefore much faster than RAM. So, to improve efficiency, the CPU's Control Unit will look first in the cache for the next instruction to see if it has already been copied, which reduces the time taken to access data. If the cache is larger, it is more likely that the next required instruction has already been transferred from the RAM to the CPU, thus improving processing time.

Pipelining

Pipelining is where the output of one stage is the input for the next stage. It is an important computational method, as the idea can be used to speed up tasks being carried out.

Consider the process of washing and drying 4 sets of laundry:

In this example, each wash and each dry takes 1 hour, so it takes 8 hours to complete 4 sets of laundry if each set is washed and dried before the next one is started. But on closer analysis, there are 4 separate pipelines in this entire process, as the output of the drying stage is not the input of the washing stage. Because of this, we can use the principles of pipelining to better organise this process and make use of concurrency: while one set is drying, the next set can be washing.

In this example, queuing up jobs as separate pipelines reduces the time of the entire process down to 5 hours. Pipelining is therefore a really important computational method.
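The 8 hour and 5 hour figures in the laundry example can be checked with a little arithmetic. The Python sketch below is just one way to work it out: the function names are made up for this example, and it assumes one washer, one dryer, and that every set of laundry takes the same time at each stage (1 hour to wash, 1 hour to dry).

```python
def sequential_hours(loads, stage_hours):
    """Each load goes through every stage before the next load starts."""
    return loads * sum(stage_hours)


def pipelined_hours(loads, stage_hours):
    """The next load starts a stage as soon as that machine is free.

    The first load passes through every stage; after that, one more load
    finishes each time the slowest stage completes.
    """
    return sum(stage_hours) + (loads - 1) * max(stage_hours)


stages = [1, 1]  # hours to wash, hours to dry
print(sequential_hours(4, stages))  # 8
print(pipelined_hours(4, stages))   # 5
```

The same idea applies to instruction pipelining in a CPU, where the stages are fetch, decode and execute rather than wash and dry.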
Pipelining has many uses in computer systems, but most notably instruction pipelining is used in CPUs to speed up processing time.

“As one instruction is executed, the next can be decoded and the next can be fetched and in this way, instructions can be queued up reducing processing time.”

Limitations of Pipelining

Pipelining does have some limitations. It only works if the CPU knows what instruction needs to be fetched next. If it fails to correctly forecast the next instruction and as a result fetches the wrong instruction, the entire pipeline needs to be flushed, cancelling out any benefit that pipelining gave in that round of cycles.

Workbook

1. Copy the diagram opposite, label all aspects of the diagram and for each structure, explain its role during the processing of an instruction. [18]

[Diagram with parts labelled 1 to 6]

1- This is the control unit. The control unit uses its instruction set to decode the instruction that has been fetched from memory. After decoding the instruction, it coordinates the movement of data (for example into the accumulator) so that the instruction can be executed.

2- This is the ALU. The ALU carries out the calculations and logical decisions required by the instruction held in the CIR. After the instruction has been executed, the result is copied into the accumulator.

3- These are the registers. The registers are small memory locations inside the CPU which are used during the FDE cycle. The Program Counter holds the address of the next instruction to be fetched. The MAR holds the address of the instruction or data that needs to be fetched. The MDR holds the instruction or data that has been fetched. The CIR holds the instruction that is currently being decoded and executed. Finally, the accumulator holds the result of the instruction that has been executed.

4- This is the main memory. This contains the data and instructions of the programs in use, including those which have not been copied into the cache. This is where instructions and data are fetched from.

5- These are input devices.
These devices are used to take inputs from the user so that the user can tell the computer what they want it to do next.

6- These are output devices. These devices output the result of the processing, such as playing a song or displaying an image on the screen.

2. Using the tables below, explain how the CPU's specialist registers will be used during the processing of the following assembly code. [9]

The address in the PC, 00, is copied to the MAR. The address is sent along the address bus to memory. The data bus then brings back the instruction stored in address 00, LDA 4, which is placed in the MDR and copied to the CIR. The Control Unit then decodes the instruction in the CIR using its instruction set: Load 4. The data stored in address 04 is now fetched and loaded into the accumulator. The program counter is now incremented.

The address in the PC, 01, is copied to the MAR. The address is sent along the address bus to memory. The data bus then brings back the instruction stored in address 01, ADD 5, which is placed in the MDR and copied to the CIR. The Control Unit then decodes the instruction in the CIR using its instruction set: ADD 5. The data stored in address 05 is now fetched and added onto the data in the accumulator. The accumulator now reads 8. The program counter is now incremented.

The address in the PC, 02, is copied to the MAR. The address is sent along the address bus to memory. The data bus then brings back the instruction stored in address 02, STA 6, which is placed in the MDR and copied to the CIR. The Control Unit then decodes the instruction in the CIR using its instruction set: Store 6. The data in the accumulator is now stored in address 06. The program counter is now incremented.

The address in the PC, 03, is copied to the MAR. The address is sent along the address bus to memory. There is no instruction in address 03 and so the program now ends.
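The register walkthrough above can be followed step by step with a small simulation. The Python sketch below is an illustration only. The tables of memory contents are not reproduced here, so the values 3 and 5 at addresses 04 and 05 are assumptions chosen so that the accumulator ends up reading 8, and the `run` function and the use of `None` to mark an empty address are made up for this example.

```python
def run(memory):
    """Run a tiny LDA/ADD/STA program and return the final register values."""
    pc, acc = 0, 0
    mar = mdr = cir = None
    while True:
        # Fetch: PC -> MAR, then the contents of that address -> MDR -> CIR.
        mar = pc
        mdr = memory[mar]
        if mdr is None:  # nothing stored at this address, so the program ends
            break
        cir = mdr

        # Decode: split the instruction into an operation and an address.
        mnemonic, address = cir.split()
        address = int(address)

        # Execute: move data between memory and the accumulator.
        if mnemonic == "LDA":
            mar = address
            mdr = memory[mar]
            acc = mdr
        elif mnemonic == "ADD":
            mar = address
            mdr = memory[mar]
            acc = acc + mdr
        elif mnemonic == "STA":
            mar = address
            mdr = acc
            memory[mar] = mdr
        else:
            raise ValueError(f"{mnemonic} is not handled in this sketch")

        pc = pc + 1  # incremented after execution, as in the walkthrough
        print(f"CIR={cir}  ACC={acc}  PC={pc:02d}")
    return {"PC": pc, "ACC": acc, "MAR": mar, "MDR": mdr, "CIR": cir}


# Addresses 00 to 02 hold the program, 03 is empty, 04 to 06 hold data.
# The data values 3 and 5 are assumed, not taken from the original tables.
memory = ["LDA 4", "ADD 5", "STA 6", None, 3, 5, 0]
registers = run(memory)
print(registers["ACC"], memory[6])  # 8 8
```

Each pass through the loop is one fetch-decode-execute cycle, and the printed lines show the CIR, accumulator and program counter after each instruction.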