Four decades of microprocessors

P K Mukherjee
Dept. of Electronics Engineering
Institute of Technology
Banaras Hindu University
Varanasi – 221 005

Agenda
• Pre Vacuum Tube
• Post Vacuum Tube
• Bipolar Junction Transistor
• MOSFET
• Integrated Circuit
• Microprocessor
• Segmentation
• Paging, Cache and Cache Coherence
• Pipelining
• RISC
• Protection, Paging and Multitasking
• Super Scalar Architecture
• VLIW
• Multi threading
• Multiple Core Architecture

Ancient Calculators
• Abacus
• Napier's bones

The First Computer
The Babbage Difference Engine (1832)
• 25,000 parts
• Cost: £17,470
The invention of the Difference Engine made Charles Babbage “The Father of the Computer”.

ENIAC (the first of the vacuum tube era)
ENIAC contained approximately 18,000 vacuum tubes, 70,000 resistors, 10,000 capacitors, and 6,000 switches. It was 100 feet long, 10 feet high, and 3 feet deep. It consumed 140 kilowatts of power.

The Transistor Revolution!
Advantages over the vacuum tube:
1. Smaller and lighter
2. No heater required
3. Lower power consumption
4. Lower operating voltages
Co-inventors: Dr William Shockley, Dr John Bardeen and Dr Walter H. Brattain.

The Transistor
John Bardeen, Walter Brattain and William Shockley discovered the transistor effect and developed the first device in December 1947, while the three were members of the technical staff at Bell Laboratories in Murray Hill, NJ. They were awarded the Nobel Prize in Physics in 1956. Developed as a replacement for bulky and inefficient vacuum tubes and mechanical relays, the transistor later revolutionized the entire electronics world.

Intel
1950s: Shockley leaves Bell Labs to establish Shockley Labs in California. Some of the best young electronic engineers and solid-state physicists come to work with him. These include Robert Noyce and Gordon Moore.
1969: Intel (Integrated Electronics) is a tiny start-up company in Santa Clara, headed by Noyce and Moore, working in MOS memory. Marcian E. “Ted” Hoff Jr. joins Intel as its 12th employee.
1970: Busicom Corporation of Japan places an order with Intel for custom calculator chips. Intel has no experience of custom-chip design and sets out to design a general-purpose solution. Ted Hoff and Stanley Mazor do the design.
1971: Intel has problems translating architectures into working chip designs, and the project runs late. Federico Faggin joins Intel and solves the problems in weeks. The result is the Intel 4000 family (later renamed MCS-4, Microcomputer System 4-bit), comprising the 4001 (2k-bit ROM), the 4002 (320-bit RAM), the 4003 (10-bit I/O shift register) and the 4004, a 4-bit CPU.

Intel 4004: The Serendipitous Invention
Introduced in 1971, the Intel 4004 "Computer-on-a-Chip" was a 2,300-transistor device capable of performing 60,000 operations per second. It was the first-ever microprocessor and had approximately the same performance as the 18,000-vacuum-tube ENIAC. The 4-bit Intel C4004 ran at a clock speed of 108 kHz and occupied 12 sq. mm.

Intel 4004
Federico Faggin designed the Intel 4004 processor. His initials were printed on the circuit.

The Busicom Calculator
The Busicom calculator used five Intel 4001s, two 4002s, three 4003s and the 4004 CPU.
The original engineering prototype of the Busicom desk-top printing calculator, the world’s first commercial product to use a microprocessor.

Intel 8085 Microprocessor
• Introduced in 1976
• First single-chip microprocessor in the true sense
• 8-bit architecture
• Still used in some microcontroller applications!

8085: Concepts
• 8-bit data bus and 16-bit address bus, time multiplexed for the first time.
• READY pin made interfacing with slower memories possible.
• Two dedicated pins for reception and generation of serial data (SID and SOD).
• Could be interfaced with 256 input and 256 output (8-bit) devices.
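The time-multiplexed bus can be illustrated with a small simulation. On the 8085 the low address byte shares pins AD0-AD7 with the data byte, and the ALE (address latch enable) signal marks the part of the cycle in which those pins carry an address. The sketch below is a simplified model under those assumptions, not a cycle-accurate description: the names AddressLatch and memory_read are hypothetical, and memory is represented by a plain dictionary.

```python
from dataclasses import dataclass

# Sizes implied by the slide: 16-bit addresses, 8-bit I/O port numbers.
ADDRESS_SPACE = 2 ** 16   # 65,536 memory locations
NUM_PORTS = 2 ** 8        # 256 input ports and 256 output ports


@dataclass
class AddressLatch:
    """Hypothetical model of the external latch that holds A0-A7."""
    value: int = 0

    def clock(self, ad_bus: int, ale: bool) -> None:
        # The latch follows AD0-AD7 only while ALE is high.
        if ale:
            self.value = ad_bus & 0xFF


def memory_read(memory: dict, address: int) -> int:
    """Model one read cycle on the multiplexed bus (illustrative only)."""
    latch = AddressLatch()
    high = (address >> 8) & 0xFF   # A8-A15 stay valid for the whole cycle
    low = address & 0xFF           # A0-A7 share pins with the data byte

    # Phase 1: ALE high, AD0-AD7 carry the low address byte.
    latch.clock(ad_bus=low, ale=True)

    # Phase 2: ALE low, AD0-AD7 now carry data; the latch keeps the address.
    full_address = (high << 8) | latch.value
    data = memory.get(full_address, 0xFF) & 0xFF
    latch.clock(ad_bus=data, ale=False)   # ignored, because ALE is low
    return data
```

For example, memory_read({0x2050: 0x3E}, 0x2050) returns 0x3E: the low byte 0x50 is captured while ALE is high, so the same eight pins are free to carry the data byte afterwards.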
Intel 8086 Microprocessor
• Introduced in 1978
• 29,000 transistors
• 33 mm²
• Clock: 5 MHz
• 16-bit architecture

8086: New concepts
• Segmentation
• Pipelining
• Multiprocessing
• Min/Max modes
• Multiple ground pins
• Aligned and non-aligned word access
• Little endian, as described by Danny Cohen. The term comes from Gulliver's Travels by Jonathan Swift.

IBM PC
The IBM PC was introduced in 1981 to much fanfare in the computer industry. IBM, then the biggest computer company in the world, helped legitimize the PC revolution by participating in it.
Released: 1981
O/S: IBM BASIC / PC-DOS 1.0
CPU: Intel 8088 @ 4.77 MHz
Memory: 16 kB to 256 kB

Intel 80286 Microprocessor
• Introduced in 1982
• 134,000 transistors
• x86-16 architecture
• Clock: 6 to 25 MHz
• Double the performance per clock cycle compared to the 8086

IBM PC/AT
The IBM PC/AT was the first upgrade from the basic PC architecture introduced in 1981 (the XT expanded storage options, but not the processor, bus, chipset, etc.). The first ATs were released with a conservative 6 MHz processor, which was quickly upgraded to 8 MHz. The BIOS allowed for the definition of aftermarket hard drives. The AT also introduced the 1.2 MB high-density 5.25" floppy drive, which remained an industry standard until the advent of 3.5" disk drives.
Released: 1984
O/S: PC-DOS 3.0+, OS/2 1.x
CPU: Intel 80286 @ 6 and 8 MHz
Memory: 256 KB to 16 MB

Intel 80386 Microprocessor
• Introduced in 1985
• 275,000 transistors
• 43 mm²
• Clock: 16 MHz
• 32-bit architecture

80386: The Concepts, on chip
• 4 GB memory addressing
• 64 TB virtual memory
• Paging was introduced
• Protected environment, imperative for multi-tasking and multi-user operating systems like UNIX
• On-the-fly switching between Real, Protected and Virtual modes
• Debugging features on chip
• POST on chip

Intel 80486 Microprocessor
• Introduced in 1989
• 1,200,000 transistors
• 81 mm²
• Clock: 25 MHz
• 32-bit architecture
• 1st pipelined implementation of IA-32
• Coprocessor was part of the die for the first time

Intel Pentium Microprocessor
• Introduced in 1993
• 3,100,000 transistors
• 296 mm²
• Clock: 60 MHz
• 32-bit architecture
• Split cache for instructions and data
• 1st superscalar implementation of IA-32

Pentium Processor Details
• State
  ◦ Registers
  ◦ Memory
• Control ROM
• Combinational logic

Intel Pentium III
• Introduced in 1999
• 9,500,000 transistors
• 125 mm²
• Clock: 450 MHz
• 32-bit architecture

Intel Pentium IV
• Introduced in 2000
• 42,000,000 transistors
• Clock: 1.3 to 3.8 GHz
• 32-bit architecture
• Superscalar, super-pipelined: Intel’s NetBurst microarchitecture
• In-order issue, out-of-order execution
• Reorder buffer

Moore’s Law
In 1965, Gordon Moore noted that the number of transistors on a chip doubled every 18 to 24 months.

Die Size
[Figure: die size (mm) versus year, 1970 to 2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® processor and P6. Annotation: ~7% growth per year, ~2X growth in 10 years.]
Die size grows by 14% to satisfy Moore’s Law

Frequency
[Figure: frequency (MHz) versus year, 1970 to 2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® processor and P6. Annotation: doubles every 2 years.]
Lead microprocessors' frequency doubles every 2 years

Power Dissipation
[Figure: power (Watts) versus year, 1971 to 2000, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® processor and P6.]
Lead microprocessors' power continues to increase

Power Dissipation
is a major problem!
[Figure: power density (W/cm²) versus year, 1970 to 2010, for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® processor and P6, with a hot plate, a nuclear reactor and a rocket nozzle marked for comparison.]
Power density too high to keep junctions at low temperature

What is in today’s processor?
• Multi-core processors
• Combine superscalar and multithreading: SMT, or simultaneous multithreading

Pipelining
Multiple instructions are overlapped in execution, just like an assembly line. Pipelining takes advantage of the parallelism that exists among the actions needed to execute an instruction.

Hazards in pipeline
• Structural hazards
• Data hazards
• Control hazards
Ways to remove hazards:
• Forwarding
• Multiple functional units
• Branch prediction, etc.

Concepts tried and proposed
• Static code scheduling
• Dynamic code scheduling (Tomasulo)
• VLIW
• Software pipelining
• Trace scheduling
• IRAM

Harvard architecture
• Physically separate instruction and data memory
• Prevents structural hazards: IF and MEM can occur in the same clock cycle
• System throughput increases

RISC
• Berkeley School, Prof Patterson
• Stanford School, Prof Hennessy
• Reduced number of instructions
• Hardwired
• Massive pipelining
• Sufficient number of registers
• Load-store architecture
• Intelligent compilers

RISC Vs CISC
Reduced Instruction Set Computer
Faster and cheaper processors: an Apple Mac G3 offers a significant performance advantage over its Intel equivalent. Instructions are executed over 4x faster, providing a significant performance boost! RISC programs require more lines of code to produce the same results and are increasingly complex. Despite the speed advantages of the RISC processor, it cannot compete with a CISC CPU that boasts twice the clock rate.
Complex Instruction Set Computer
CISC microprocessors are more expensive to make than their RISC cousins. The x86 market has been opened up by the development of several competing processors from the likes of AMD, Cyrix, and Intel. This has continually reduced the price of a CPU for many months.
In contrast, the PowerPC Macintosh market is dictated by Apple. Competition reduces the cost of x86-based microprocessors, while the PowerPC market remains stagnant.

Future challenges:
• Dissipation of power generated
• Technology limitation
• Track length and associated capacitance
• Dynamic branch prediction in super-pipelined processors

References:
• Intel Corporation, Santa Clara
• Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition

Thanks
• Professor A K Agrawal
• Mr. Praharsh Sharma
Special Thanks
• Mr. Shantanu Singh
• Mr. Himanshu

Questions
To be answered to the best of my ability. If not able, Shantanu and Himanshu will answer.