EEL 5708 High Performance Computer Architecture
Lecture 2
Introduction: the big picture
August 27, 2004
Lotzi Bölöni
Fall 2004
Fall 2004 EEL5708/Bölöni Lec 2.1

Acknowledgements
• All the lecture slides were adapted from the slides of David Patterson (1998, 2001) and David E. Culler (2001), Copyright 1998-2002, University of California Berkeley

Research Paper Reading
• As graduate students, you are now researchers.
• Most information of importance to you will be in research papers.
• The ability to rapidly scan and understand research papers is key to your success.
• So: about 1 paper / week in this course
  – Quick one-paragraph summaries will be due as homework
  – Important supplement to the book
  – Will discuss papers in class
• Links to the papers will be posted on the course webpage

First reading
• G. Amdahl, G. A. Blaauw, F. P. Brooks, Jr.: Architecture of the IBM System/360
• Link from the course website
• A good paper to improve your skills in reading papers.

Why take EEL5708?
• To design the next great instruction set?...well...
  – instruction set architecture has largely converged
  – especially in the desktop / server / laptop space
  – dictated by powerful market forces
• Tremendous organizational innovation relative to established ISA abstractions
• Many new instruction sets or equivalent
  – embedded space, controllers, specialized devices, ...
• Design, analysis, implementation concepts vital to all aspects of EE & CS
  – systems, PL, theory, circuit design, VLSI, comm.
• Equip you with an intellectual toolbox for dealing with a host of systems design challenges

Example Hot Developments ca. 2002
• Manipulating the instruction set abstraction
  – Itanium: translate ISA64 -> micro-op sequences
  – Pentium IV: hyperthreading
  – Transmeta: continuous dynamic translation of IA32
  – Tensilica: synthesize the ISA from the application
  – reconfigurable HW
• Virtualization
  – vmware: emulate full virtual machine
  – JIT: compile to abstract virtual machine, dynamically compile to host
• Parallelism
  – wide issue, dynamic instruction scheduling, EPIC
  – multithreading (SMT)
  – chip multiprocessors
• Communication
  – network processors, network interfaces
• Exotic explorations
  – nanotechnology, quantum computing

Forces on Computer Architecture
[Figure: Computer Architecture in the center, surrounded by the forces acting on it: Technology, Programming Languages, Applications, Operating Systems, History (A = F / M)]

Amazing Underlying Technology Change

Original Big Fishes Eating Little Fishes

1988 Computer Food Chain
[Figure labels: Mainframe, Supercomputer, Minisupercomputer, Minicomputer, Workstation, PC, Massively Parallel Processors]

1998 Computer Food Chain
[Figure labels: Mainframe, Server, Supercomputer, Workstation, PC, Massively Parallel Processors, Minisupercomputer, Minicomputer]
Now who is eating whom?

Why Such Change in 10 years?
• Performance
  – Technology advances
    » CMOS VLSI dominates older technologies (TTL, ECL) in cost AND performance
  – Computer architecture advances improve the low end
    » RISC, superscalar, RAID, …
• Price: lower costs due to …
  – Simpler development
    » CMOS VLSI: smaller systems, fewer components
  – Higher volumes
    » CMOS VLSI: same development cost for 10,000 vs. 10,000,000 units
  – Lower margins by class of computer, due to fewer services
• Function
  – Rise of networking/local interconnection technology

Technology Trends: Microprocessor Capacity
[Figure: transistors per chip (log scale, 1,000 to 100,000,000) vs. year (1970 to 2000), with the Moore's Law trend line through i4004, i8080, i8086, i80286, i80386, i80486 and Pentium, and a "Graduation Window" marked at the right edge]
  ATI Radeon 9700: 110 million (graphics processor)
  Pentium 4: 55 million
  Athlon XP: 37.5 million
  Alpha 21264: 15 million
  Alpha 21164: 9.3 million
  PowerPC 620: 6.9 million
  Pentium Pro: 5.5 million
  Sparc Ultra: 5.2 million
CMOS improvements:
• Die size: 2X every 3 yrs
• Line width: halve / 7 yrs

Processor Performance Trends
[Figure: relative performance (log scale, 0.1 to 1000) vs. year (1965 to 2000) for Supercomputers, Mainframes, Minicomputers and Microprocessors]

Memory Capacity (Single Chip DRAM)
[Figure: bits per chip (log scale, 1,000 to 1,000,000,000) vs. year (1970 to 2000)]
  year    size (Mb)    cycle time
  1980    0.0625       250 ns
  1983    0.25         220 ns
  1986    1            190 ns
  1989    4            165 ns
  1992    16           145 ns
  1996    64           120 ns
  2000    256          100 ns

Technology Trends (Summary)
           Capacity         Speed (latency)
  Logic    2x in 3 years    2x in 3 years
  DRAM     4x in 3 years    2x in 10 years
  Disk     4x in 3 years    2x in 10 years

Technology Trends
• Clock Rate: ~30% per year
• Transistor Density: ~35% per year
• Chip Area: ~15% per year
• Transistors per chip: ~55% per year
• Total Performance Capability: ~100% per year
• by the time you graduate...
  – 3x clock rate (3-4 GHz)
  – 10x transistor count (1 billion transistors)
  – 30x raw capability
• plus 16x DRAM density, 32x disk density

Newest trends (Fall 2004)
• Moore’s law is probably over.
• Future VLSI improvements will probably be linear (as opposed to exponential).
• Multi-core chips will be the new standard, from as early as 2005.
• Parallel programs will become much more important, even for mainstream.
• And many developments which we cannot foresee at this moment.

What is “Computer Architecture”?
[Figure: stack of abstraction levels, top to bottom: Application, Operating System, Compiler, Firmware, Instruction Set Architecture (Instr. Set Proc., I/O system), Datapath & Control, Digital Design, Circuit Design, Layout]
• Coordination of many levels of abstraction
• Under a rapidly changing set of forces
• Design, Measurement, and Evaluation
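
A quick arithmetic check of the Technology Trends slides: the yearly rates quoted there compound, so the "by the time you graduate" multiples (3x clock rate, 10x transistor count, 30x raw capability) each imply a time horizon. The sketch below is only an illustration under the assumption that the quoted rates are annual and compound yearly; the function names are made up for this example and are not part of the course material.

```python
import math

# Annual growth rates as quoted on the Technology Trends slide
# (assumed here to be per-year, compounding rates).
CLOCK_RATE = 0.30
TRANSISTOR_DENSITY = 0.35
CHIP_AREA = 0.15
TRANSISTORS_PER_CHIP = 0.55
TOTAL_CAPABILITY = 1.00


def growth_factor(annual_rate, years):
    """Total multiplier after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def years_to_reach(factor, annual_rate):
    """Years of compounding at annual_rate needed to grow by factor."""
    return math.log(factor) / math.log(1.0 + annual_rate)


def annual_rate_from(factor, years):
    """Annual rate equivalent to growing by factor over the given years."""
    return factor ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Density and area growth multiply to give transistors per chip.
    combined = (1 + TRANSISTOR_DENSITY) * (1 + CHIP_AREA) - 1
    print(f"density x area     : {combined:.1%} per year")

    # Horizon implied by each "by the time you graduate" multiple.
    print(f"3x clock rate      : {years_to_reach(3, CLOCK_RATE):.1f} years")
    print(f"10x transistors    : {years_to_reach(10, TRANSISTORS_PER_CHIP):.1f} years")
    print(f"30x raw capability : {years_to_reach(30, TOTAL_CAPABILITY):.1f} years")

    # Summary-table entries converted to annual rates.
    print(f"2x in 3 years      : {annual_rate_from(2, 3):.1%} per year")
    print(f"4x in 3 years      : {annual_rate_from(4, 3):.1%} per year")
    print(f"2x in 10 years     : {annual_rate_from(2, 10):.1%} per year")
```

Under these assumptions the three graduation multiples all correspond to a horizon of roughly four to five years, and 35% density growth combined with 15% area growth gives about the 55% per year quoted for transistors per chip.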