Computer Science 246 Advanced Computer Architecture Spring 2008 Harvard University Instructor: Prof. David Brooks dbrooks@eecs.harvard.edu Course Outline • • • • • • Instructor Prerequisites Topics of Study Course Expectations Grading Class Scheduling Conflicts 2 Instructor • Instructor: David Brooks (dbrooks@eecs.harvard.edu) • Office Hours: TBD, MD141, stop by/email whenever 3 Prerequisites • CS141 (or equivalent) • • • • • • Logic Design (computer arithmetic) Basic RISC ISA Pipelining (control/data hazards, forwarding) i.e. Hennessey & Patterson Jr. (HW/SW Interface) C Programming, UNIX for Project (or similar skills) Compilers, OS, Circuits/VLSI background is a plus, not needed 4 Readings and Resources • • • Text: “Computer Architecture: A Quantitative Approach,” Hennessy and Patterson Key research papers (will be available on the web or paper copies) SimpleScalar toolset • • • • Widely used architectural simulator SPEC2000 benchmarks Will be used for some HW/Projects Power/thermal modeling extensions available 5 Course Expectations • Lecture Material • Short homework/quizzes to cover this material • Seminar style part of course: • Expectation: you will read the assigned papers before class so we can have a lively discussion about them • Paper reviews –short “paper review” highlighting interesting points, strengths/weaknesses of the paper • Bring discussion points to class • Discussion leadership – Students will be assigned to present the paper/lead the discussions 6 Course Expectations • Course project • Several possible ideas will be given • Also you may come up with your own • Depending on enrollment, I will schedule weekly/bi-weekly meetings with each individual/group (1/2hr per project) to discuss results/progress • There will be two presentations –First, a short “progress update” (late April) –Second, a final presentation scheduled at the end of reading week –Finally, a project writeup written in research paper style 7 Grading • Grade Formula • Homeworks and/or Quiz – 25% • Class Participation – 25% • Project (including final project presentation)– 50% 8 Topics of CS246 • • Introduction to Computer Architecture and Power-Aware Computing Modern CPU Design • • • • • • • • • • • Deep pipelines/Multiple Issue Dynamic Scheduling/Speculative Execution Memory Hierarchy Design Pentium Architecture Case Study Multiprocessors and Multithreading Embedded computing devices Dynamic Frequency/Voltage Scaling Thermal-aware processor design Power-related reliability issues System-Level Power Issues Software approaches power management 9 Readings from Previous Years • • • • • • • • • • • • • • • High-level Power modeling (power abstractions) Power Measurement for OS control Temperature Aware Computing Di/Dt Modeling Leakage Power Modeling and Control Frequency and Voltage Scheduling Algorithms Power in Data Centers, Google Cluster Architecture Dynamic Power Reduction in Memory Disk Power Modeling/Management Application and Compiler Power Management Dynamic adaptation for Multimedia applications Architectures for wireless sensor devices Low-power routing algorithms for wireless networked devices Intel XScale Microprocessor, IBM Watchpad Human Powered Computing 10 Why take CS246? • • Learn how modern computing hardware works Understand where computing hardware is going in the future • And learn how to contribute to this future… • How does this impact system software and applications? • Essential to understand OS/compilers/PL • For everyone else, it can help you write better code! • How are future technologies going to impact computing systems? 11 Architectural Inflection Point • Computer architects strive to give maximum performance with programmer abstraction • Compilers, OS part of this abstraction • e.g. Pipelining, superscalar, speculative execution, branch prediction, caching, virtual memory… • Technology has brought us to an inflection point • Multiple processors on a single chip -- Why? –Design complexity, ILP/pipelining-limits, power dissipation, etc • How to provide the abstraction? • Some burden will shift back to programmers 12 Topics of Study • • • • Focus on what modern computer architects worry about (both academia and industry) Get through the basics of modern processor design Look at technology trends: multithreading, CMP, power-, reliability-aware design Recent research ideas, and the future of computing hardware 13 What is Computer Architecture? Operating System Applications (AI, DB, Graphics) Software Instruction Set Architecture Microarchitecture System Architecture Technology Trends Prog. Lang, Compilers Application Trends Hardware VLSI/Hardware Implementations 14 Application Areas • General-Purpose Laptop/Desktop • Productivity, interactive graphics, video, audio • Optimize price-performance • Examples: Intel Pentium 4, AMD Athlon XP • Embedded Computers • PDAs, cell-phones, sensors => Price, Energy efficiency, Form-Factor • Examples: Intel XScale, StrongARM (SA-110) • Game Machines, Network uPs => PricePerformance • Examples: Sony Emotion Engine, IBM 750FX 15 Application Areas • Commercial Servers • Database, transaction processing, search engines • Performance, Availability, Scalability • Server downtime could cost a brokerage company more than $6M/hour • Examples: Sun Fire 15K, IBM p690, Google Cluster • Scientific Applications • Protein Folding, Weather Modeling, CompBio, Defense • Floating-point arithmetic, Huge Memories • Examples: IBM DeepBlue, BlueGene, Cray T3E, etc. 16 Moore’s Law • Every 18-24 months • Feature sizes shrink by 0.7x • Number of transistors per die increases by 2x • Speed of transistors increases by 1.4x • • But we are starting to hit some roadblocks… Also, what to do with all of these transistors??? 17 Moore’s Law (Tx Density) Transistors Per Die 1,000,000,000 Itanium2 100,000,000 Itanium Pentium III 10,000,000 PentiumII Pentium 1,000,000 486 386 100,000 80286 8086 10,000 4004 1,000 1970 1980 1990 2000 2010 18 How have we used these transistors? • More functionality on one chip • Early 1980s – 32-bit microprocessors • Late 1980s – On Chip Level 1 Caches • Early/Mid 1990s – 64-bit microprocessors, superscalar (ILP) • Late 1990s – On Chip Level 2 Caches • Early/Mid 2000s – Chip Multiprocessors, On Chip Level 3 Caches • What is next? • How much more cache can we put on a chip? (Itanium2) • How many more cores can we put on a chip? (Niagara, etc) • What else can we put on chips? 19 Moore’s Law (Performance) Relative Performance 1600 1400 1200 1000 Pentium III Why has Performance Outstripped Moore’s Law? 800 DEC Alpha 600 400 200 0 1984 1.58x per year MIPS R2000 1986 1988 IBM Power1 1990 DEC HP Alpha 9000 1992 1994 HP 9000 1.35x per year 1996 1998 2000 20 Performance vs. Technology Scaling • Architectural Innovations • Massive pipelining (good and bad!) • Huge caches • Branch Prediction, Register Renaming, OOOissue/execution, Speculation (hardware/software versions of all of these) • Circuit/Logic Innovations • New logic circuit families (dynamic logic) • Better CAD tools • Advanced computer arithmetic 21 Power Dissipation Trends 2 Power Density (W/cm ) 1000 Nuclear Reactor 100 Pentium 4 (Prescott) Pentium 4 Pentium 3 Hot Plate Pentium 2 10 Pentium Pro Pentium 1 1980 386 486 1990 2000 2010 22 Power Issues in Microprocessors Capacitive (Dynamic) Power Static (Leakage) Power Vdd Vin VIN Vout VOUT IGate CL ISub CL Di/Dt (Vdd/Gnd Bounce) Voltage (V) Current (A) Temperature 20 cycles Minimum Voltage 23 Temperature/di-dt-Constrained Power-Aware Computing Applications Energy-Constrained Computing 24 The Battery Gap Diverging Gap Between Actual Battery Capacities and Energy Needs 5000 10kbps 64kbps 384kbps Interactive Energy (mAh) 4000 3000 2000 2Mbps Downlink dominated PIM, SMS, Voice Mobile videoConferencing, Collaboration Video email, Voice recognition, Mobile commerce Fuel Cells Web browser, MMS, Video clips 1000 Lithium Polymer Lithium Ion 0 2000 2001 2002 2003 2004 2005 2006 2007 Battery capacity (mAh) Energy requirement (mAh) Source: Anand Raghunathan, NEC Labs 25 Where does the juice go in laptops? • Others have measured ~55% processor increase under max load in laptops [Hsu+Kremer, 2002] 26 Packaging cost From Cray (local power generator and refrigeration)… Source: Gordon Bell, “A Seymour Cray perspective” http://www.research.microsoft.com/users/gbell/craytalk/ 27 Packaging cost To today… • IBM S/390: refrigeration: • Provides performance (2% perf for 10ºC) and reliability Source: R. R. Schmidt, B. D. Notohardjono “High-end server low temperature cooling” 28 IBM Journal of R&D Intel Itanium packaging Complex and expensive (note heatpipe) Source: H. Xie et al. “Packaging the Itanium Microprocessor” Electronic Components and Technology Conference 2002 29 P4 packaging • Simpler, but still… Total Power-Related PC System Cost ($) 40 30 20 10 0 0 10 20 30 40 Power (Watts) From Tiwari, et al., DAC98 Source: Intel web site 30 31 Cooking Aware Computing 32 Server Farms • Internet data centers are like heavy-duty factories • e.g. small Datacenter 25,000 sq.feet, 8000 servers, 2MegaWatts • Intergate Datacenter, Tukwila, WA: 1.5 Mill. Sq.Ft, ~500 MW • Wants lowest net cost per server per sq foot of data center space • Cost driven by: • • • • • Racking height Cooling air flow Power delivery Maintenance ease (access, weight) 25% of total cost due to power 33 Environment • Environment Protection Agency (EPA): computers consume 10% of commercial electricity consumption • • • • • • • • • This incl. peripherals, possibly also manufacturing A DOE report suggested this percentage is much lower (3.0-3.5%) No consensus, but it’s still a lot Interesting to look at the numbers: – http://enduse.lbl.gov/projects/infotech.html Data center growth was cited as a contribution to the 2000/2001 California Energy Crisis Equivalent power (with only 30% efficiency) for AC CFCs used for refrigeration Lap burn Fan noise 34 Now we know why power is important • What can we do about it? • Two components to the problem: • #1: Understand where and why power is dissipated • #2: Think about ways to reduce it at all levels of computing hierarchy • In the past, #1 is difficult to accomplish except at the circuit level • Consequently most low-power efforts were all circuit related 35 Modeling + Design • First Component (Modeling/Measurement): • Come up with a way to: –Diagnose where power is going in your system –Quantify potential savings • Second Component (Design) • Try out lots of ideas • Or characterize tradeoffs of ideas… • This class will focus on both of these at many levels of the computing hierarchy 36 Next Time • • Course website: http://www.eecs.harvard.edu/~dbrooks/cs246 Some readings for next week: • • • • Web Page Browsing: • • • “Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors,” IEEE MICRO. “Power: A First-Class Architectural Design Constraint,” IEEE Computer “Thermal Crisis: Challenges and Potential Solutions” Information Technology and Resource Use – http://enduse.lbl.gov/projects/infotech.html Sun’s ECO-responsible data center – http://www.sun.com/aboutsun/environment/epa.jsp Questions? 37