Modern Computers
von Neumann Machines and Other Computers

Computer Types
• Recall that a “von Neumann” machine is a standard stored-program computer.
• Most of the computers we use are of this type.
• We shall discuss the essential organization of such a computer, and then mention a few changes that yield a faster CPU.
• We then mention several other designs, called “non Von”, that are not plain stored-program devices.

The von Neumann Inheritance
• The EDVAC, designed in 1945, was one of the first stored-program machines.
• All modern computers are modifications of this design; only the technical implementation has changed.
• Other designs have been suggested, emulated on a von Neumann machine, and abandoned because the emulation worked so well.

Types of Modern Computers
1. General numerical computation, involving both integers and real numbers.
2. Device automation and control.
3. Message switching, including routers and firewalls on the Internet.
4. Computer-generated graphics.
5. Graphics-based computer games.
6. Computer-enhanced video. (How about those extra lines superimposed on football fields?)

General-purpose and Special-purpose
• Most computers are general-purpose devices, tailored to a given purpose by the software.
• Sometimes a high-volume market demand justifies a special-purpose design. Examples:
  1. Routers used on the Internet.
  2. Graphics cards, such as the NVIDIA cards.
• NVIDIA has launched CUDA (Compute Unified Device Architecture), which allows the graphics card to be used for general-purpose computation.

Components of a Stored Program Computer
• The four major components of a modern stored program computer are:
  1. The Central Processing Unit (CPU)
  2. The Primary Memory (also called “core memory” or “main memory”)
  3. The Input / Output system
  4. One or more system busses to allow the components to communicate.

Components of a Stored Program Computer (Page 2)

Major Components
• The system memory (of which this computer has 512 MB) is used for transient storage of programs and data. It is accessed much like an array, with the memory address serving the function of an array index.
• The Input / Output system (I/O System) is used by the computer to save data and programs, to accept input data, and to communicate output data. Technically the hard drive is an I/O device. It is also considered a memory device.

Major Components (Page 2)
The Central Processing Unit (CPU) handles execution of the program. It has four main components:
1. The ALU (Arithmetic Logic Unit), which performs all of the arithmetic and logical operations of the CPU, including logic tests for branching.
2. The Control Unit, which causes the CPU to follow the instructions found in the machine language program being executed.
3. The register file, which stores data internally in the CPU. There are general purpose (user) registers and special purpose registers used by the Control Unit.
4. A set of 3 internal busses to allow the CPU units to communicate. This number is set by the ALU structure.

The ALU
• The Arithmetic Logic Unit is designed to handle standard arithmetic, such as the dyadic (2 input) operation C = A + B.
• It has two inputs and one output.

More on the CPU
• The CPU has a number of special-purpose registers, including the IP, IR, MAR (Memory Address Register), and MBR (Memory Buffer Register).
• The IR (Instruction Register) holds the instruction being executed. It is accessed by the control unit, which emits control signals to cause the CPU to execute the instruction properly.
• The IP (Instruction Pointer) holds the address of the instruction to be executed next.

Structure of the CPU
• The control unit interprets the instruction in the Instruction Register.

The Simple Top Level Model
Logically speaking, the computer has one bus used to allow all components to communicate. Early designs, such as the PDP-11, had only one bus.
For these slow machines, it worked well.

Realistic Organizations
• The design on the previous slide is logically correct, but IT WON’T WORK.
• IT IS TOO SLOW. Problem: A single system level bus cannot handle the load.
• Modern gamers demand fast video; this requires a fast bus to the video chip.
• The memory system is always a performance bottleneck. We need a dedicated memory bus in order to allow acceptable performance.

First Revision of the Design
• Here we have separate dedicated busses to each of memory and the graphics system.
• Here, the slow I/O devices do not hold back the very much faster ones.

Second Revision of the Design
• Legacy I/O devices of various speeds must be accommodated by the design.
• Here an I/O Control Hub (ICH) manages two busses, one for legacy devices.

Current State of the Design

The Motherboard
• In personal computers, a motherboard is the central printed circuit board (PCB) used to connect many of the system components.
• All busses external to the CPU are on the motherboard.
• The motherboard includes connector sockets for the CPU and many I/O peripherals.
• The motherboard must be matched to the CPU, especially in the bus speed.

Picture of a Motherboard

The Chip Set
• The term “chip set” refers to a set of VLSI chips that are designed to work together.
• The chip set is specific to a given CPU. It mostly relates to chips on the motherboard.
• For systems based on Intel Pentium-class microprocessors, the term chipset often refers to a specific pair of chips on the motherboard: the Northbridge and the Southbridge.

North and South
• The Northbridge links the CPU to very high-speed devices, especially main memory and graphics controllers.
• The Southbridge connects to lower-speed peripheral busses (such as PCI or ISA).
• In many modern chipsets, the Southbridge contains some on-chip integrated peripherals, such as Ethernet, USB, and audio devices.

The Memory Component
• The memory stores the instructions and data for an executing program.
• Memory is characterized by the smallest addressable unit:
  Byte addressable - the smallest unit is an 8-bit byte.
  Word addressable - the smallest unit is a word, usually 16 or 32 bits in length.
• Most modern computers are byte addressable, facilitating access to character data.

Logical View of Memory
• Logically, computer memory should be considered as an array. The index into this array is called the address or “memory address”.
• A logical view of such a byte addressable memory might be written in code as:
    Const MemSize = 65536
    byte Memory[MemSize] // Indexed 0 … (MemSize - 1)

Simplistic View of Memory
• Memory is seen as a single linear array.
• The MAR holds the address; the MBR holds the data.

Memory Operations
• Read sequence: Put the address into the MAR; command a READ.
• Then copy the contents of the MBR into an internal CPU register.
• Write sequence: Put the address into the MAR and the data into the MBR.
• Then command a WRITE.

Multi-Level Memory
• What we want is a very large memory, in which each memory element is fabricated from very fast components. But fast means expensive.
• What we can afford is a very large memory, in which each memory element is fabricated from moderately fast, but inexpensive, components.
• Modern computers achieve good performance from a large, moderately fast, main memory by using a multi-level cache memory.
• Quite often, there are 2 or 3 levels of cache.

The Principle of Locality
• It is the principle of locality that allows cache memory to deliver faster performance.
• This principle is based on observed memory access patterns of real programs.
• When a program accesses a memory location:
  1. It will likely access memory close to it (spatial locality).
  2. It will likely access that location again (temporal locality).

A Modern Two-Level Cache
• All Pentium designs have at least two levels of cache memory. L2 holds 1 to 2 MB.
• The L1 cache is split, with an Instruction Cache and a Data Cache.

The Split L1 Cache
• Memory can do only one thing at a time.
• The split L1 cache is realized as 2 independent very fast memories. The CPU can access both at the same time.
• This parallel access allows modern CPU design tricks, especially pipelining.
• Note that the CPU does not write back to the I cache. That makes it simpler and faster.

Multicore Designs Have 3 Cache Levels

A Reasonably Modern Memory Chip

The Fetch-Execute Cycle
• Each instruction is fetched from the memory and executed. The IP (Instruction Pointer) holds the address of the instruction.
• The common fetch cycle follows this plan.
• MAR ← IP. // Get address of the instruction.
• READ. // Put the address into the MAR and read memory.
• IR ← MBR. // Place the instruction into IR.

Memory Stores Both Instructions and Data
• A stored program computer can allow instructions and data to be intermixed.
• This normally causes no problems, but it does have that potential.

Conventional Division of Memory
• One conventional way to handle memory is to divide it into logical segments.
• Here we see 2 segments for one program: data and instructions.
• For historical reasons, the instructions may be called “text”.

Two Programs in Memory
• Here is a depiction of 2 programs in memory.
• Each has a unique subroutine and also uses common trig functions.
• Note the two copies of the trig functions.

Share Common Subprograms
• Dynamic Link Libraries are an outgrowth of the idea that common subprograms can be shared.
• Any shared subprogram must be written in a special style, called “reentrant”.
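
Sketch: Memory Access and the Fetch Cycle
The read and write sequences through the MAR and MBR, and the fetch cycle that uses them, can be sketched in Python as below. This is only an illustration under stated assumptions, not a model of any real CPU: the class names Memory and CPU, the helper methods load and store, the one-byte instruction size, and the IP increment after each fetch are all hypothetical choices made for this example.

```python
MEM_SIZE = 65536  # same size as MemSize on the "Logical View of Memory" slide


class Memory:
    """Byte-addressable memory, reached only through the MAR and MBR."""

    def __init__(self, size=MEM_SIZE):
        self.cells = bytearray(size)  # indexed 0 .. size - 1
        self.mar = 0                  # Memory Address Register
        self.mbr = 0                  # Memory Buffer Register

    def read(self):
        # READ: copy the byte at address MAR into the MBR.
        self.mbr = self.cells[self.mar]

    def write(self):
        # WRITE: copy the MBR into the byte at address MAR.
        self.cells[self.mar] = self.mbr


class CPU:
    """Hypothetical CPU holding only the IP and IR registers."""

    def __init__(self, memory):
        self.memory = memory
        self.ip = 0  # Instruction Pointer
        self.ir = 0  # Instruction Register

    def load(self, address):
        # Read sequence: address into MAR, command READ, copy MBR out.
        self.memory.mar = address
        self.memory.read()
        return self.memory.mbr

    def store(self, address, value):
        # Write sequence: address into MAR, data into MBR, command WRITE.
        self.memory.mar = address
        self.memory.mbr = value & 0xFF  # keep the value within one byte
        self.memory.write()

    def fetch(self):
        self.memory.mar = self.ip   # MAR <- IP
        self.memory.read()          # READ
        self.ir = self.memory.mbr   # IR <- MBR
        self.ip += 1                # assumption: every instruction is one byte
        return self.ir


mem = Memory()
cpu = CPU(mem)
cpu.store(0, 0x2A)    # place two made-up "instructions" at addresses 0 and 1
cpu.store(1, 0x07)
first = cpu.fetch()
second = cpu.fetch()
print(first, second, cpu.ip)  # prints: 42 7 2
```

Routing every access through the MAR and MBR, rather than indexing the array directly from the CPU class, mirrors the slides' point that these two registers are the CPU's only interface to memory.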