Computer Architecture Lecture Notes
Spring 2005
Dr. Michael P. Frank
Competency Area 1: Computer System Components
Lecture 2

ENIAC - background
• Electronic Numerical Integrator And Computer
• Eckert and Mauchly
• University of Pennsylvania
• Proposed to compute trajectory tables for weapons during WWII (Army Ballistics Research Laboratory)
• Started 1943
• Finished 1946
  - Too late for war effort
  - Used to help determine feasibility of H-bomb
• Used until 1955

ENIAC - details
• Decimal (not binary)
• Its memory contained 20 accumulators of 10 digits.
• 10 vacuum tubes represented each digit.
• Programmed manually by switches
• 18,000 vacuum tubes
• 30 tons
• 1500 square feet
• 140 kW power consumption
• 5,000 additions per second

von Neumann/Turing
• Stored Program concept
• Main memory storing programs and data
• “Turing Machine” (Alan Turing): Given enough memory and sufficient time, a general-purpose computer can compute every function that is computable.
• ALU operating on binary data
• Control unit interpreting instructions from memory and executing them
• Input and output equipment operated by control unit
• Princeton Institute for Advanced Study
  - IAS Computer (major components in a computer system)
  - Foundation for general-purpose computer
• Completed 1952

[Figure: Picture of the IAS Computer (Smithsonian Image 95-06151)]

[Figure: Structure of von Neumann machine]

IAS - details
• 1000 x 40 bit words
  - 1000 storage locations of 40-bit words
  - Each word holds either a binary number or 2 x 20 bit instructions
• Set of registers (storage in CPU)
  - Memory Buffer Register (MBR)
  - Memory Address Register (MAR)
  - Instruction Register (IR)
  - Instruction Buffer Register (IBR)
  - Program Counter (PC)
  - Accumulator (AC)
  - Multiplier Quotient (MQ)

Structure of IAS - detail
MBR: Contains a word to be stored in memory, or is used to receive a word from memory.
MAR: Specifies the address in memory of the word to be written from or read into the MBR.
IR: Contains the 8-bit opcode of the instruction being executed.
IBR: Temporarily holds the right-hand instruction from a word in memory.
PC: Contains the address of the next instruction pair to be fetched from memory.

IAS - details
• The IAS computer had 21 instructions, grouped as follows:
  - Data Transfer: Moves data between memory and ALU registers or between two ALU registers
  - Unconditional Branch: Changes the sequence of instructions, for example to execute repetitive operations
  - Conditional Branch: The branch can be made dependent on a condition, thus allowing decision points
  - Arithmetic: Operations performed by the ALU
  - Address Modify: Permits addresses to be computed in the ALU and then inserted into instructions stored in memory

Commercial Computers
• 1947 - Eckert and Mauchly formed the Eckert-Mauchly Computer Corporation
• UNIVAC I (Universal Automatic Computer)
• Intended for both scientific and commercial applications (first used for the US Bureau of the Census 1950 calculations)
• Became part of Sperry-Rand Corporation
• Late 1950s - UNIVAC II
  - Faster
  - More memory

IBM
• Punched-card processing equipment
• 1953 - the 701
  - IBM’s first stored program computer
  - Scientific calculations
• 1955 - the 702
  - Business applications
• Led to 700/7000 series

Transistors
• The second generation of technology: transistors replaced vacuum tubes
• Smaller
• Cheaper
• Less heat dissipation
• Solid-state device
• Made from silicon (sand)
• Invented 1947 at Bell Labs
• William Shockley et al.
• Discrete components

Transistor Based Computers
• Second generation machines
• More complex arithmetic and logic units
• Incorporated the use of high-level programming languages
• Also used system software with machines (e.g. operating systems)
• NCR & RCA produced small transistor machines
• IBM 7000 Series
• Digital Equipment Corporation (DEC) - 1957
  - Produced PDP-1, which began the minicomputer phenomenon

Transistors
Computer Generations:

Generation  Approximate Dates  Technology                           Typical Speed (ops/sec)
1           1946-1957          Vacuum tubes                         40,000
2           1958-1964          Transistor                           200,000
3           1965-1971          Small- and medium-scale integration  1,000,000
4           1972-1977          Large-scale integration (LSI)        10,000,000
5           1978-              Very-large-scale integration (VLSI)  100,000,000

Microelectronics
• Up to this point, computers were manufactured from discrete components, which became more expensive and cumbersome as computers continued to improve in performance.
• Microelectronics dominated the next generation of computers.
• Literally - “small electronics”
• A computer is made up of gates, memory cells and interconnections
• These can be manufactured on a semiconductor
• e.g. silicon wafer

Generations of Computer
• Vacuum tube - 1946-1957
• Transistor - 1958-1964
• Small scale integration - 1965 on
  - Up to 100 devices on a chip
• Medium scale integration - to 1971
  - 100-3,000 devices on a chip
• Large scale integration - 1971-1977
  - 3,000-100,000 devices on a chip
• Very large scale integration - 1978 to date
  - 100,000-100,000,000 devices on a chip
• Ultra large scale integration
  - Over 100,000,000 devices on a chip

Moore’s Law
• As microelectronics grew in the computer industry, an increase in the density of components on a chip became evident.
• Gordon Moore - cofounder of Intel
• Moore’s observation: the number of transistors on a chip will double every year.
• Since the 1970s development has slowed a little
  - Number of transistors doubles every 18 months
• Cost of a chip has remained almost unchanged
• Higher packing density means shorter electrical paths, giving higher performance
• Smaller size gives increased flexibility
• Reduced power and cooling requirements
• Fewer interconnections increases reliability

Moore’s Law
Formal Consequences of Moore’s Law:
1. The cost of a chip has remained relatively stable during a period of rapid growth in density. This implies that the cost of computer logic and memory circuitry has fallen at a dramatic rate.
2. Because logic and memory elements are placed closer together on more densely packed chips, the electrical path length is shortened, increasing operating speeds.
3. The computer becomes smaller, making it more convenient to place in a variety of environments.
4. There is a reduction in power and cooling requirements.
5. The interconnections on the integrated circuit are much more reliable than solder connections. With more circuitry on each chip, there are fewer interchip connections.

[Figure: Growth in CPU Transistor Count]

IBM 360 series
• Introduced in 1964
• Replaced (& not compatible with) 7000 series
• First planned “family” of computers
  - Similar or identical instruction sets
  - Similar or identical O/S
  - Increasing speed
  - Increasing number of I/O ports (i.e. more terminals)
  - Increased memory size
  - Increased cost
• Multiplexed switch structure
• The introduction of this family cemented IBM as a world leader in the computer manufacturing industry.

[Figure: Picture of IBM 360]

DEC PDP-8
• Also introduced in 1964
• First minicomputer
• Did not need an air-conditioned room
• Small, could sit on a lab bench
• Relatively cheap: $16,000
  - Compared to $100k+ for IBM 360
• Embedded applications & Original Equipment Manufacturers (OEMs): OEMs could buy PDP-8 machines and integrate them into a total system for resale.
• Bus structure

DEC - PDP-8 Bus Structure
[Figure: Console Controller, CPU, Main Memory and I/O Modules all attached to the OMNIBUS]
OMNIBUS
• Highly flexible
• All systems share a common set of signal paths
• Allows other modules to be plugged into the bus to create various configurations

Intel
• 1971 - 4004
  - First microprocessor
  - Whole CPU on a single chip
  - 4 bit
• Followed in 1972 by 8008
  - 8 bit
  - Both for specific applications
• 1974 - 8080
  - Intel’s first general purpose microprocessor

Speeding it up
• Pipelining
• On board cache
• On board L1 & L2 cache
• Branch prediction
• Data flow analysis
• Speculative execution

Pentium Evolution (1)
• 8080
  - first general purpose microprocessor
  - 8 bit data path
  - Used in the first personal computer, the Altair
• 8086
  - much more powerful
  - 16 bit
  - instruction cache, prefetches a few instructions
  - 8088 (8 bit external bus) used in first IBM PC
• 80286
  - 16 MB memory addressable
  - up from 1 MB
• 80386
  - 32 bit
  - Support for multitasking

Performance
1970s Processors:

                       4004       8008       8080       8086                  8088
Introduced             1971       1972       1974       1978                  1979
Clock Speeds           108 kHz    108 kHz    2 MHz      5 MHz, 8 MHz, 10 MHz  5 MHz, 8 MHz
Bus Width              4 bits     8 bits     8 bits     16 bits               8 bits
Number of Transistors  2,300      3,500      6,000      29,000                29,000
Addressable Memory     640 bytes  16 KBytes  64 KBytes  1 MB                  1 MB
Virtual Memory         --         --         --         --                    --

Performance
1980s Processors:

                       80286           386 DX         386 SX         486 DX CPU
Introduced             1982            1985           1988           1989
Clock Speeds           6 MHz-12.5 MHz  16 MHz-33 MHz  16 MHz-33 MHz  25 MHz-50 MHz
Bus Width              16 bits         32 bits        16 bits        32 bits
Number of Transistors  134,000         275,000        275,000        1.2 million
Addressable Memory     16 MB           4 GB           16 MB          4 GB
Virtual Memory         1 GB            64 TB          64 TB          64 TB

Performance
1990s Processors:

                       486 SX         Pentium         Pentium Pro      Pentium II
Introduced             1991           1993            1995             1997
Clock Speeds           16 MHz-33 MHz  60 MHz-166 MHz  150 MHz-200 MHz  200 MHz-300 MHz
Bus Width              32 bits        32 bits         64 bits          64 bits
Number of Transistors  1.185 million  3.1 million     5.5 million      7.5 million
Addressable Memory     4 GB           4 GB            64 GB            64 GB
Virtual Memory         64 TB          64 TB           64 TB            64 TB

Performance
Recent Processors:

                       Pentium III  Pentium 4
Introduced             1999         2000
Clock Speeds           450 MHz      1.3-1.8 GHz
Bus Width              64 bits      64 bits
Number of Transistors  9.5 million  42 million
Addressable Memory     64 GB        64 GB
Virtual Memory         64 TB        64 TB

Performance Mismatch
• Processor speed increased
• Memory capacity increased
• Memory speed lags behind processor speed!!

[Figure: DRAM and Processor Characteristics]

Pentium Evolution (2)
• 80486
  - sophisticated powerful cache and instruction pipelining
  - built in maths co-processor
• Pentium
  - Superscalar
  - Multiple instructions executed in parallel
• Pentium Pro
  - Increased superscalar organization
  - Aggressive register renaming
  - branch prediction
  - data flow analysis
  - speculative execution

Pentium Evolution (3)
• Pentium II
  - MMX technology
  - graphics, video & audio processing
• Pentium III
  - Additional floating point instructions for 3D graphics
• Pentium 4
  - Note Arabic rather than Roman numerals
  - Further floating point and multimedia enhancements
• Itanium
  - 64 bit
• See Intel web pages for detailed information on processors

Intel Itanium 2 (McKinley)
• 64-bit processor
• 221 million transistors! (~US adult population)
• How are they used?
• What will we do as transistor counts continue to grow?
• Most of the chip is used for memories, instruction decoding, dynamic scheduling…
• Why is it done this way?
• How much more efficient could it be if more of the area went to actual processing?

Even More Recent Example
• Runs 64-bit IA-64 ISA
• Die: 3.74 cm²
• 0.13 µm process
• 410M transistors
• 1.5 GHz core
• 1.3 V logic
• 130 W power consumption!
• 6.4 GB/s bus
• Cost: $2,247-$4,226
• 9 MB L3 cache later this year…

Internet Resources
• http://www.williamstallings.com
  - Computer Organization and Architecture
• http://www.intel.com/
  - Search for the Intel Museum
• http://www.ibm.com
• http://www.dec.com
• Charles Babbage Institute
• PowerPC
• Intel Developer Home
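
Worked example: checking Moore’s Law against the tables
As a rough numerical check on the Moore’s Law slides, the sketch below takes two transistor counts from the processor tables in these notes (2,300 for the 4004 in 1971 and 42 million for the Pentium 4 in 2000) and estimates the doubling period they imply. The function names `doubling_time_years` and `projected_count` are hypothetical, chosen only for illustration, and the calculation assumes smooth exponential growth, which real product lines only approximate.

```python
import math


def doubling_time_years(year0, count0, year1, count1):
    """Doubling period implied by exponential growth between two data points."""
    if count0 <= 0 or count1 <= count0 or year1 <= year0:
        raise ValueError("expects a later year and a strictly larger count")
    # Number of doublings needed to go from count0 to count1.
    doublings = math.log2(count1 / count0)
    return (year1 - year0) / doublings


def projected_count(year0, count0, year, period_years):
    """Projected count at `year`, assuming one doubling every `period_years`."""
    return count0 * 2 ** ((year - year0) / period_years)


# Data points taken from the processor tables in these notes.
period = doubling_time_years(1971, 2300, 2000, 42_000_000)
print(f"Implied doubling period, 4004 to Pentium 4: {period:.2f} years")

# For comparison: what a strict 18-month doubling rule would predict for 2000.
strict = projected_count(1971, 2300, 2000, 1.5)
print(f"18-month rule projected to 2000: {strict:.3g} transistors")
```

With these two data points the implied period comes out at roughly two years, so a strict 18-month rule overshoots the Pentium 4 count by a wide margin. The result depends heavily on which two chips are compared, so it is an order-of-magnitude check rather than a measurement.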