+ CS 325: Hardware and Software Organization and Architecture
  Computer Evolution and Performance

+ Outline
  - Generations in Computer Organization
  - Milestones in Computer Organization
  - Von Neumann Architecture
  - Moore's Law
  - CPU transistor sizes and counts
  - Memory Hierarchy Performance
  - Cache Memory
  - Performance Issues and Solutions

+ History – Generations in Computer Organization
  - Zeroth generation – mechanical computers (1642–1945): Blaise Pascal's mechanical calculator
  - First generation – vacuum tubes (1945–1955): ENIAC, Colossus
  - Second generation – transistors (1955–1965): Harwell CADET, TX-0
  - Third generation – integrated circuits (1965–1980): IBM System/360
  - Fourth generation – VLSI (Very Large Scale Integration) (1980–?): microprocessors, supercomputers

+ History – Milestones in Computer Organization
  - 1642 – Blaise Pascal: mechanical machine with gears used to add and subtract.
  - c. 1670 – Baron Gottfried Wilhelm von Leibniz: mechanical machine with gears used to add, subtract, multiply, and divide.
  - 1834 – Charles Babbage: Analytical Engine; Ada Lovelace wrote its "assembler".
  - 1936 – Konrad Zuse: Z1, calculator made of electromagnetic relays.

+ History – Milestones in Computer Organization
  - 1940s – John Atanasoff, George Stibitz, Howard Aiken: each worked independently on calculating machines with properties such as binary arithmetic and capacitors for memory.
  - 1943 – Alan Turing and the British Government: COLOSSUS, the first electronic computer.
  - 1946 – John Mauchly, John Presper Eckert: ENIAC, vacuum tubes.
  - 1949 – Maurice Wilkes: EDSAC, the first stored-program computer.

+ History – Milestones in Computer Organization
  - 1952 – John von Neumann: von Neumann architecture, a stored-program computer with shared memory for instructions and data; most current computers use this basic design.
  - 1950s – Researchers at MIT: Transistorized Experimental Computer Zero (TX-0), the first computer to use transistors (3600 transistors).
  - 1960 – Digital Equipment Corporation (DEC): PDP-1, the first minicomputer.
  - 1961 – IBM: 1401, popular small business computer.

+ History – Milestones in Computer Organization
  - 1962 – IBM: 7094, a transistorized version of the 709.
  - 1963 – Burroughs: B5000, the first computer designed for a high-level language (Algol, a precursor to C).
  - 1964 – Seymour Cray, Control Data Corporation: 6600, 10x faster than the IBM 7094; used mainly for scientific computing; highly parallelized CPU.
  - 1965 – DEC: PDP-8, the first mass-market minicomputer; used a single bus.

+ History – Milestones in Computer Organization
  - 1964 – IBM: System/360, a family of compatible computers (low end to high end); first computers with multiprogramming; used integrated circuits (dozens of transistors on one "chip").
  - 1974 – Intel: 8080, the first general-purpose computer on a chip.
  - 1974 – Cray-1: the first vector computer (single instructions operate on vectors of numbers).
  - 1978 – DEC: VAX, the first 32-bit minicomputer.

+ History – Milestones in Computer Organization
  - 1978 – Steve Jobs, Steve Wozniak: Apple personal computer.
  - 1981 – IBM: IBM PC becomes the most popular personal computer; used Microsoft's MS-DOS as its OS; CPU developed by Intel.
  - 1985 – MIPS (company): MIPS, the first commercial RISC computer.
  - 1987 – Sun Microsystems: SPARC, popular RISC computer.
  - 1990 – IBM: RS6000, the first superscalar machine.

+ History – Milestones in Computer Organization
  - 1993 – Intel: Pentium CPU released. 32-bit, 60 MHz, 3.2 million transistors, $878.
  - 1993 – NVIDIA is founded.
  - 1997 – Intel: Pentium 2 CPU released.
  - 1999 – Intel: Pentium 3 CPU released. 32-bit, 550 MHz.
+ History – Milestones in Computer Organization
  - 2003 – AMD: Athlon 64 CPU is released.
  - 2005 – AMD, Intel: dual-core CPUs released.

+ Computer and CPU Organization
  Definition: the terms processor, CPU, and computational engine refer broadly to any mechanism that drives computation.

+ Von Neumann Architecture
  - Stored program/data concept; characteristic of most modern CPUs.
  - Main memory stores programs (instructions) and data.
  - ALU operates on binary data.
  - Control unit interprets instructions from memory and passes information along to the ALU.

+ Von Neumann Architecture – Three Basic Components
  - CPU
  - Memory
  - I/O facilities
  Together with control units and busses, all interact to form a complete computer.

+ Structure of Von Neumann Architecture
  (figure)

+ Milestones in Computer Organization
  Moore's Law: the number of transistors on a chip doubles every 18 months.

+ Milestones in Computer Organization
  Moore's Law: increased density of components on a chip.
  - Originally, it was thought that the number of transistors on a chip would double every year.
  - Since the 1970s, development has slowed: the number of transistors doubles every 18 months.
  - The cost of a chip has remained almost unchanged.
  - Higher packing density means shorter electrical paths, giving higher performance.
  - Smaller size gives increased flexibility.
  - Reduced power and cooling requirements.
  - Fewer interconnections increases reliability.

+ CPU Transistor Sizes – Intel
  - 8086 – 29K transistors, 3 µm
  - 80186 – 29K transistors, 2 µm
  - 80286 – 134K transistors, 1.5 µm
  - 80386 – 855K transistors, 1 µm
  - 80486 – 1.6M transistors, 0.6 µm
  - Pentium 1 – 4.5M transistors, 0.35 µm
  - Pentium 2 – 7.5M transistors, 0.35 µm
  - Pentium 3 – 9.5M transistors, 0.25 µm
  - Pentium 4 – 42M transistors, 0.18 µm
  - Pentium M – 140M transistors, 0.13 µm
  - Pentium D – 230M transistors, 90 nm
  - Core 2 – 291M transistors, 65 nm
  - Current Intel architecture, Core i series "Haswell" i7 – 1.4B transistors, 22 nm

+ Growth in CPU Transistor Count
  (figure)
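To make the 18-month figure concrete, here is a minimal sketch of the doubling rule stated above. C is used purely for illustration; the 29K-transistor 8086 from the list above is taken as an assumed baseline, and its 1978 introduction year is a detail added here, not from the slides.

```c
/* Sketch of Moore's Law as stated above: if the transistor count
 * doubles every 18 months, the count after t years is
 * N(t) = N0 * 2^(t / 1.5).
 * Baseline: the 8086's 29K transistors (from the list above);
 * treating it as the starting point is an illustrative assumption. */
#include <stdio.h>
#include <math.h>

int main(void) {
    const double n0 = 29e3;   /* 8086: 29K transistors (assumed baseline) */

    for (int years = 0; years <= 15; years += 3) {
        double doublings = years / 1.5;          /* 18-month doubling period */
        double factor    = pow(2.0, doublings);  /* growth factor so far     */
        printf("after %2d years: x%-6.0f -> ~%.2e transistors\n",
               years, factor, n0 * factor);
    }
    return 0;
}
```

Under this rule the count grows roughly 1000x every 15 years (about 100x per decade), which is the qualitative trend the transistor-count chart above illustrates.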
+ Memory Hierarchy Importance
  - 1980: no cache memory on the CPU.
  - 1989: first Intel CPU to include cache memory.
  - 1995: 2-level cache on the CPU.
  - 2003: 3-level cache on the CPU.
  - 2013: 4-level cache on the CPU (Intel Haswell architecture with integrated Iris Pro Graphics).

+ How to Increase Performance?
  - Pipelining
  - On-board cache memory
  - Branch prediction
  - Data flow analysis
  - Speculative execution

+ Performance Balance
  - CPU performance increasing
  - Memory capacity increasing
  - Memory speed lagging behind CPU performance

+ Core Memory
  - 1950s – 1970s
  - 1 core = 1 bit; polarity determines logical "1" or "0"
  - Roughly 1 MHz clock rate
  - Up to 32 kB storage

+ Semiconductor Memory
  - 1970s – today
  - Fairchild: the size of a single core (i.e., 1 bit of magnetic core storage), yet holds 256 bits
  - Non-destructive read, but volatile
  - Much faster than core; today: 1.3 – 3.1 GHz
  - SDRAM most common, uses capacitors
  - Capacity approximately doubles each year; today: 64 GB per single DIMM

+ CPU (Logic) and Memory Performance Gap
  (figure)

+ Solutions
  - Increase the number of bits retrieved at one time
    - Make DRAM "wider" rather than "deeper"
  - Change the DRAM interface
    - Cache
  - Reduce the frequency of memory access
    - More complex cache, and cache on chip
  - Increase interconnection bandwidth
    - High-speed buses
    - Hierarchy of buses

+ Improvements in CPU Organization and Architecture
  - Increase the hardware speed of the processor
    - Fundamentally due to shrinking logic gate size
    - More gates, packed more tightly, increasing clock rate
    - Propagation time for signals reduced
  - Increase the size and speed of caches
    - Dedicating part of the processor chip to cache
    - Cache access times drop significantly
  - Change processor organization and architecture
    - Increase effective speed of execution
    - Parallelism

+ Problems with Clock Speed and Logic Density
  - Power
    - Power density increases with density of logic and clock speed
    - Dissipating the heat becomes difficult
  - Resistor-Capacitor (RC) delay
    - The speed at which electrons flow is limited by the resistance and capacitance of the metal wires connecting them
    - Delay increases as the RC product increases
    - Wire interconnects become thinner, increasing resistance
    - Wires are closer together, increasing capacitance
  - Memory latency
    - Memory speeds lag processor speeds
  - Solution: more emphasis on organizational and architectural approaches

+ Intel CPU Performance
  (figure)

+ Increased Cache Capacity
  - Typically two or three levels of cache between processor and main memory
  - Chip density increased
    - More cache memory on chip
    - Faster cache access
  - The Pentium chip devoted about 10% of its chip area to cache; the Pentium 4 devotes about 50%.

+ Increased Cache Capacity
  (figure)

+ More Complex Execution Logic
  - Enable parallel execution of instructions
  - A pipeline works like an assembly line
    - Different stages of execution of different instructions proceed at the same time along the pipeline
  - Superscalar allows multiple pipelines within a single processor
    - Instructions that do not depend on one another can be executed in parallel

+ Diminishing Returns
  - The internal organization of processors is already complex
    - Can get a great deal of parallelism
    - Further significant increases are likely to be relatively modest
  - Benefits from cache are reaching a limit
  - Increasing the clock rate runs into the power dissipation problem
    - Some fundamental physical limits are being reached

+ New Approach – Multiple Cores
  - Multiple processors on a single chip, with a large shared cache
  - Within a processor, the increase in performance is proportional to the square root of the increase in complexity
  - If software can use multiple processors, doubling the number of processors almost doubles performance
  - So, use two simpler processors on the chip rather than one more complex processor
  - With two processors, larger caches are justified
  - Power consumption of memory logic is less than that of processing logic
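As a rough numeric illustration of the argument on the last slide, the sketch below compares spending a doubled transistor budget on one more complex core (performance growing with the square root of complexity, as stated above) against two simpler cores running parallel software. The 0.9 parallel-efficiency factor is an assumed value standing in for "almost doubles"; it is not a figure from the slides.

```c
/* Rough sketch of the multi-core trade-off described above, assuming:
 *   - single-core performance ~ sqrt(complexity), per the slide, and
 *   - two cores give "almost" 2x on parallel software
 *     (0.9 efficiency, an illustrative value chosen here). */
#include <stdio.h>
#include <math.h>

int main(void) {
    const double budget = 2.0;             /* transistor budget vs. one baseline core */
    const double parallel_efficiency = 0.9;

    double one_complex_core = sqrt(budget);              /* ~1.41x baseline */
    double two_simple_cores = 2.0 * parallel_efficiency; /* ~1.80x baseline */

    printf("one complex core (2x transistors): %.2fx baseline\n", one_complex_core);
    printf("two simple cores (parallel code) : %.2fx baseline\n", two_simple_cores);
    return 0;
}
```

For purely serial code the two-core option falls back to roughly 1x, which is why the argument hinges on software being able to use multiple processors.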