Lecture 10: Buses, CPU and I/O System
Pradondet Nilagupta
Spring 2000
204521 Digital System Architecture

Buses, CPU and I/O System
• HSA - hardware system architecture
• Specifies the structural characteristics:
  – organization of hardware elements
  – performance
• We will look at the bus architecture, the CPU, and the I/O system, and how these components work together to execute program instructions

BUSES
• Move information between components of the computer.
• Information can be:
  – Control signals (control bus)
  – Data and program instructions (data bus)
  – Addresses (address bus)
• Composed of a set of wires that allow current to flow between components.
• 1 wire = 1 bit, 8 wires for 1 byte; typical widths are powers of 2 such as 8, 16, 32

Types of Buses
• Local Buses
  – Address buses
  – Data buses
  – Control buses
• System Buses
  – Bus controller
  – Arbiter
• Expanded Local Buses

Local Buses
• The simplest buses consist of a set of wires
• These buses are local because they are part of the device that uses and controls them.
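
The "1 wire = 1 bit" rule from the BUSES slide can be illustrated with a little arithmetic. The sketch below is only an illustration; the function names are hypothetical and the widths are the typical ones listed on the slide.

```python
def values_on_bus(width_bits: int) -> int:
    """Number of distinct bit patterns that width_bits wires can carry at once."""
    return 2 ** width_bits


def transfers_needed(item_bits: int, bus_width_bits: int) -> int:
    """Bus transfers needed to move one item of item_bits over the bus."""
    # Ceiling division: a partly filled final transfer still costs one transfer.
    return -(-item_bits // bus_width_bits)


if __name__ == "__main__":
    for width in (8, 16, 32):
        print(width, "wires:", values_on_bus(width), "distinct values")
    print("32-bit item over an 8-bit bus:", transfers_needed(32, 8), "transfers")
```

On an address bus, the same count gives the number of distinct locations that can be addressed; on a data bus, a narrower bus means more transfers per item (a 32-bit item needs 4 transfers on an 8-bit bus).
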
Address buses
  – tend to be specialized in purpose
  – usually unidirectional
  – most frequently transfer addresses from the program counter (PC), stack, or address-computation circuitry to memory

Local Buses
Data buses
  – tend to be more general in purpose
  – bidirectional
  – carry data, instructions, and also addresses to and from the main memory system, attached I/O devices, and the ALU
Control buses
  – carry signals from the control unit to other components of the computer and back to the control unit

System Buses
• independent components of the computer, used to connect devices together
• each has its own control circuit, called a bus controller; within each bus controller is an arbiter, which decides which device may use the bus
• tend to have well-documented and stable definitions so that designers can attach a wide variety of devices to them
• e.g. DEC UNIBUS, S-100 Bus, Apple NuBus

[Figure: Typical system bus with two attached devices. Labels: bus controller, bus arbiter, address lines, data lines, bus-grant lines, bus-request line, Device 1, Device 2.]

[Figure: A CPU bus and expanded local bus. Labels: CPU, bus controller, bus arbiter, control lines, address lines, data lines, bus-grant lines, bus-request line, Device 1, Device 2.]

Expanded Local Buses
• found mostly in PCs; these are local buses with extensions for devices outside of the CPU.
• similar to system buses except that the CPU’s clock and timing circuits regulate them.
• processor-specific
• flexible operationally and provide a platform for system expansion.
• e.g. IBM PC I/O Channel Bus and IBM Micro Channel Architecture (MCA)

Bus Transfers and Control Signals
• Bus transfer - transmission of one or more items across the bus.
  Each type of transmission is called a bus cycle
• Bus cycle types include: memory read, memory write, I/O read, I/O write, interrupt
• Bus states - the stages in which a transfer occurs
• A bus cycle consists of a well-defined sequence of bus states.
• The clock regulates bus states

Bus Control Signals
• Bus masters are the devices that can use the system buses. A device must first obtain permission to use the bus from the bus arbiter.
• Slaves are passive devices that respond to requests from the bus master, e.g. main memory
• Devices are connected to the bus arbiter by a bus-request line. A device sends a bus-request signal to the arbiter and receives the accept signal over a bus-grant line

Typical read cycle using an Expanded Local Bus
• The CPU sends the read request (R) and address-enable signal (AEN) via the control bus, and the address via the address bus, to the storage system
• The storage system receives the request and address, decodes the control signals and address, and retrieves the required datum
• The storage system places the recalled datum on the data bus, which returns it to the CPU
[Figure, repeated for each step: CPU and storage system connected by the address bus, data bus, and control bus (R, W, and AEN lines).]

CPU
• Central Processing Unit (also called the processor, or the microprocessor in PCs)
• Consists of
  – Register set
  – ALU (series of hardware circuits for performing the arithmetic, logic and shift operations)
  – Control Unit

ALU
• Functional units - hardware elements that perform arithmetic, logic and shift operations
• Some computers have multiple independent
  functional units, others have a single functional unit. Either way, these make up the ALU.
• Includes a register for temporary storage, a flags/status register, and a multiplexor to select the appropriate hardware circuit.

[Figure: ALU of a simple computer. Labels: ALU data input bus, arithmetic and logic circuitry, shifter, multiplexor, temporary register, flags (C, V, N, Z), control signals (from control unit), status signals (to control unit), ALU data output bus.]

More on the ALU
• Some computers use a math coprocessor to perform floating-point operations.
• RISC computers use two or three independent functional units for some operations, such as branch processing and floating-point operations.
• Cray computers have many independent functional units for parallelism. Other architectures pipeline their functional units

Control Unit
• Fetches the next instruction from memory, places it into the instruction register (IR), and increments the PC
• Decodes the instruction and executes it
• Decoding means translating the instruction into a microinstruction or into microorders: control signals sent to various hardware elements (such as ALU components, registers, the bus arbiter, etc.)

[Figure: A simple von Neumann machine. Labels: main memory holding a program (current instruction, next instruction), address bus, data bus, and a CPU containing address generation, PC, operational register, IR, control unit, and ALU.]

[Figure: During the fetch cycle. Same machine, with numbered steps 1 and 2 marked near the address bus and address generation.]

Types of Control Units
• Microprogrammed - most computers built during the 1970s and 1980s have this type, in which the computer architect programs the microinstructions for each machine-language instruction.
• Conventional (or hard-wired) - used in high-performance and RISC computers; as the name implies, the control commands for each instruction are hard-wired into the computer. This is much faster but much less flexible.

Microprogram
• The IR stores the current machine-language instruction from the program in main memory
• Each machine instruction is represented as a series of microinstructions in the control store
• The microPC points to the location of the current microinstruction in the control store
• Executing a machine instruction requires finding its microinstructions and executing them until the microprogram for that instruction has completed

[Figure: Main components of a microprogrammed control unit. Labels: IR, address computation circuitry, microPC (with increment), control store, microinstruction buffer, microinstruction decoder, microorders, sequencer, status bits, control and clock inputs, internal address bus.]

Ordinary Operation
• The sequencer causes the control-store address of the microcode for the current machine instruction to be placed in the microPC
• It may also clear the microinstruction buffer
• The sequencer initiates a control-store read and transfers the microinstruction to the microinstruction buffer
• The microinstruction decoder decodes the microinstruction into microorders and issues those orders over control lines
• For a non-branching microinstruction the microPC is incremented; otherwise a new control-store address is calculated and placed in the microPC

Organization of Microprogram
• Each machine-language operation is actually a sequence of microinstructions
• Each of these microprograms is placed in the control store
• Also included are the microprograms for instruction fetch, interrupt initiation, and other routines separate from machine-language instructions

One organization for microcode in a control store
Control store contents, by starting address:
  A0   Microcode for op code 0
  A1   Microcode for op code 1
  ...
  An   Microcode for op code n
  AIF  Microcode for instruction fetch
  AII  Microcode for interrupt initiation
       Other microcode

Branching/Non-Branching
• Branching microinstructions hold the branch address inside the microinstruction itself.
• Branching occurs only within the control store (as opposed to a machine-language branch to another location in main memory)
• A special bit designates whether a microinstruction is a branch or a non-branch.
• The internal address bus is used to compute the new control-store location
• Non-branching microinstructions require that the microPC be incremented

Machine Startup
• Instead of the ordinary operation, a special routine occurs at machine startup:
  – Registers are initialized
  – A hardware-generated address found in the reset vector is moved into the microPC, or an indirect address is used to find the address to be moved into the microPC

[Figure: Machine startup using a hardware-generated reset vector. Labels: main memory, initial program, first instruction, PC, reset vector (hardware generated).]

[Figure: Machine startup using a hardware-generated reset vector address. Labels: main memory, reset vector, initial program, first instruction, PC, reset vector (hardware generated).]

Microinstruction Forms
• Microinstructions are simply a list of control orders for the various buses and gates in the computer (predominantly in the CPU)
• Horizontal control - each microinstruction is a series of bits (perhaps 50-150), each of which represents a single control line
• Vertical control - groups of bits in the microinstruction represent commands.
  This requires decoding or demultiplexing before the control commands can be issued
• Horizontal control is faster but requires longer microinstructions

[Figure: Horizontal control microinstruction. Labels: microinstruction, branch address, individual microorders, control unit microorders.]

[Figure: Vertical control microinstruction using decoders. Labels: microinstruction, branch address, decoders, individual microorders, control unit microorders.]

Examples of Microprogramming
• Instruction fetch
• Fetch instruction from memory:
  – Place content of PC on the A-bus (1D)
  – Enable memory to D-bus (MD)
  – Signal read operation (RM)
  – Transfer value to the IR (CI)
• Increment PC (IP)
• Branch to microprogram instruction (1 in bits 0 and 16)
• Resulting microinstructions:
  – 00000100001000001100
  – 00000000000010000000
  – 00010000000000000001

Complement value in accumulator
• Perform NOT operation
  – Send signal to functional unit to perform NOT (101)
  – Transfer result into X (CX)
  – Instruct functional unit to send X to D-bus (UD)
  – Transfer result to accumulator (CA)
• Branch to fetch microprogram
  – 10100011010000000000
  – 000--------------001 where --…-- is the location of fetch

ADD operand from memory to accumulator
• Perform the add
  – Send address field of IR to the A-bus (IA)
  – Signal read-memory operation (RM)
  – Send data from memory to D-bus (MD) and on to the functional unit; this might take some time
  – Instruct functional unit to perform add (001)
• Store result
  – Send functional unit result to D-bus (UD)
  – Store result in accumulator (CA)
• Branch to fetch microprogram

Conventional Units
• The alternative to microprogrammed control is to hard-wire the control; that is, each machine-language instruction causes the control unit to send out its sequenced control commands directly.
• Used in supercomputers and some RISC computers. Much faster performance.
• Some computers combine both approaches, using conventional units for arithmetic types of instructions.
• The outcome is the same whether using conventional or microprogrammed control.

Exception Processing
• Exceptions - branches initiated by hardware (interrupts) or by the program (traps).
• Interrupts are asynchronous (the computer’s clock does not control them; instead other devices such as I/O devices control them)
• Traps are synchronous - examples include arithmetic overflow, memory protection violation, or illegal op code.

Exception Processing Hardware
• Preserve processor context (register values) by saving it to a dedicated location (e.g. a save area in memory or a stack)
• Branch to exception-handling code (which may cause additional information to be saved, such as parts of main memory)
• After exception handling terminates, restore state from the stack and continue normal execution
• Special instructions are usually available, such as RETURN FROM INTERRUPT, which restores state

Priority Exceptions
• Exceptions may arise from more than one source. Exceptions are often prioritized so that lower-priority exceptions may be disabled while handling higher-priority exceptions.
• If a device requests an interrupt, it sets a flag in the Interrupt Code Register. A priority encoder selects the highest-priority interrupt, and the interrupt-disabled flip-flop is set to disallow further interrupts.

Interrupt Polling
– Once an exception is raised, the hardware must determine which device raised the exception and which exception handler is appropriate.
– Interrupt polling is used to determine which device raised the interrupt.
  An alternative is to have the interrupting device place a designation code in a register.
– The correct exception handler is determined by searching the exception vector table for the starting address of the handler for the given exception (which can be identified from the flag or flags set in the Interrupt Code Register, ICR).
– Placing this new address in the PC causes a branch to the exception handler.

Exception Masking
• Rather than disabling all interrupts, an exception handler at one priority level may wish to disable only exceptions at lower priority levels.
• Exception masking allows higher-priority exceptions to interrupt the current exception but causes lower-priority exceptions to wait until the current exception terminates.
• An Interrupt Mask Register is used to disable lower-priority exceptions.

I/O System
• Set of all I/O devices in the computer system, including physical devices and I/O interface devices
• In the early days of computers, I/O devices were limited to line printers and punch-card readers. Today there are numerous types of I/O devices (terminal, mouse, scanner, etc.)
• CPU-controlled I/O - the CPU would interrupt its current process to directly handle all I/O. I/O commands were Write A to Device N, Read A from Device N, and possibly Test Device N.

Faster methods of I/O
• Multiprogramming operating system - the OS loads several programs in memory. When one program starts I/O, the OS sets it aside and runs another program until the I/O completes.
• Multiported storage system - allows several processes to directly access memory simultaneously
• I/O processors - provide I/O devices with special interfaces that control I/O without CPU intervention

Multiprogramming OS
• When a program wants I/O, it requests an I/O service from the OS by placing an I/O request into a prespecified memory location
• It then calls the OS with a supervisor call
• The supervisor call generates a trap, causing the CPU to save this process, perform the exception handling (which initiates the I/O operation), and commence a new process until interrupted again to signal the end of the operation

Multiported Storage
• Memory-port controller - a switching circuit that accepts requests from any device connected to it and arbitrates traffic on the bus connecting memory and I/O
• Each device connects to the controller through a read-request line, a write-request line, a request-granted line, and address and data buses.
• The controller uses some priority scheme to arbitrate which device gains access next.

DMA I/O
• Direct-memory-access (DMA) controllers allow hardware to transfer data to and from memory directly without going through an arbiter or the CPU.
• A DMA controller interrupts the CPU only when finished.
• It generates control signals to the bus while it is bus master (i.e. while it has control of the bus)
• Note that the CPU may actually have to wait for the DMA device to finish before the CPU itself can access the bus or main memory!
• More sophisticated types of devices are called PPUs (peripheral processing units)

DMA Channels
• Used mostly by IBM computers; simple von Neumann devices that have their own register set, including a PC
• Have a simple instruction set dealing with I/O transfer
• Used primarily to control I/O devices
• Can operate in single-cycle mode (one byte at a time) or burst mode (in which case the controller does not relinquish the bus until all information has been transferred)

Cycle Stealing
• If the CPU and a memory-port controller both request memory access at the same time, priority is usually given to the controller:
  – If the controller controls a slow device, its transfer is usually small compared with the CPU's, so the time taken will be short
  – If the device is fast (e.g. hard disk) and is not given access, it may miss an access and have to wait before it can try again (e.g. the disk may spin past the file's location under the head)

Memory-Mapped I/O
• Used so that I/O addresses and memory addresses are uniform
• A computer that uses memory-mapped I/O does not need specific I/O instructions; instead it uses one “language” to communicate all storage requests, whether to memory or to an I/O device
• To differentiate between memory and I/O, separate port addresses are used

I/O devices
• Tape drives - sequential access; cassettes or reel-to-reel
• Disk drives - hard vs. floppy vs. optical
  – Use indexed or direct access, with files spread across different platters, sectors and tracks
  – Disk access time is measured by seek time and rotational latency
• Printers, monitors, keyboards, mice, etc.
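
The disk access time named on the last slide (seek time plus rotational latency) can be estimated with a short calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, the seek time and spindle speed are illustrative values rather than figures from the lecture, and transfer time and controller overhead are ignored because the slide lists only the two components.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector: half of one full rotation."""
    ms_per_rotation = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_rotation / 2.0


def average_access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time modelled as seek time plus average rotational latency."""
    return seek_ms + average_rotational_latency_ms(rpm)


if __name__ == "__main__":
    # Assumed example drive: 9 ms average seek, 7200 rpm spindle.
    print(f"{average_access_time_ms(seek_ms=9.0, rpm=7200):.2f} ms")
```

At 7200 rpm one rotation takes about 8.33 ms, so the average rotational latency is about 4.17 ms; with the assumed 9 ms seek, the estimate is about 13.17 ms per access.
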