Computer Science 246 Advanced Computer Architecture

advertisement
Computer Science 246
Advanced Computer
Architecture
Spring 2008
Harvard University
Instructor: Prof. David Brooks
dbrooks@eecs.harvard.edu
Course Outline
•
•
•
•
•
•
Instructor
Prerequisites
Topics of Study
Course Expectations
Grading
Class Scheduling Conflicts
2
Instructor
•
Instructor: David Brooks
(dbrooks@eecs.harvard.edu)
• Office Hours: TBD, MD141, stop by/email
whenever
3
Prerequisites
•
CS141 (or equivalent)
•
•
•
•
•
•
Logic Design (computer arithmetic)
Basic RISC ISA
Pipelining (control/data hazards, forwarding)
i.e. Hennessey & Patterson Jr. (HW/SW
Interface)
C Programming, UNIX for Project (or similar
skills)
Compilers, OS, Circuits/VLSI background is
a plus, not needed
4
Readings and Resources
•
•
•
Text: “Computer Architecture: A
Quantitative Approach,” Hennessy and
Patterson
Key research papers (will be available on
the web or paper copies)
SimpleScalar toolset
•
•
•
•
Widely used architectural simulator
SPEC2000 benchmarks
Will be used for some HW/Projects
Power/thermal modeling extensions available
5
Course Expectations
•
Lecture Material
• Short homework/quizzes to cover this material
•
Seminar style part of course:
• Expectation: you will read the assigned papers
before class so we can have a lively discussion
about them
• Paper reviews –short “paper review”
highlighting interesting points,
strengths/weaknesses of the paper
• Bring discussion points to class
• Discussion leadership – Students will be
assigned to present the paper/lead the
discussions
6
Course Expectations
•
Course project
• Several possible ideas will be given
• Also you may come up with your own
• Depending on enrollment, I will schedule
weekly/bi-weekly meetings with each
individual/group (1/2hr per project) to discuss
results/progress
• There will be two presentations
–First, a short “progress update” (late April)
–Second, a final presentation scheduled at the
end of reading week
–Finally, a project writeup written in research
paper style
7
Grading
•
Grade Formula
• Homeworks and/or Quiz – 25%
• Class Participation – 25%
• Project (including final project presentation)–
50%
8
Topics of CS246
•
•
Introduction to Computer Architecture and
Power-Aware Computing
Modern CPU Design
•
•
•
•
•
•
•
•
•
•
•
Deep pipelines/Multiple Issue
Dynamic Scheduling/Speculative Execution
Memory Hierarchy Design
Pentium Architecture Case Study
Multiprocessors and Multithreading
Embedded computing devices
Dynamic Frequency/Voltage Scaling
Thermal-aware processor design
Power-related reliability issues
System-Level Power Issues
Software approaches power management
9
Readings from Previous Years
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
High-level Power modeling (power abstractions)
Power Measurement for OS control
Temperature Aware Computing
Di/Dt Modeling
Leakage Power Modeling and Control
Frequency and Voltage Scheduling Algorithms
Power in Data Centers, Google Cluster Architecture
Dynamic Power Reduction in Memory
Disk Power Modeling/Management
Application and Compiler Power Management
Dynamic adaptation for Multimedia applications
Architectures for wireless sensor devices
Low-power routing algorithms for wireless networked devices
Intel XScale Microprocessor, IBM Watchpad
Human Powered Computing
10
Why take CS246?
•
•
Learn how modern computing hardware
works
Understand where computing hardware is
going in the future
• And learn how to contribute to this future…
•
How does this impact system software
and applications?
• Essential to understand OS/compilers/PL
• For everyone else, it can help you write better
code!
•
How are future technologies going to
impact computing systems?
11
Architectural Inflection Point
•
Computer architects strive to give
maximum performance with programmer
abstraction
• Compilers, OS part of this abstraction
• e.g. Pipelining, superscalar, speculative
execution, branch prediction, caching, virtual
memory…
•
Technology has brought us to an
inflection point
• Multiple processors on a single chip -- Why?
–Design complexity, ILP/pipelining-limits,
power dissipation, etc
• How to provide the abstraction?
• Some burden will shift back to programmers
12
Topics of Study
•
•
•
•
Focus on what modern computer architects
worry about (both academia and industry)
Get through the basics of modern
processor design
Look at technology trends: multithreading,
CMP, power-, reliability-aware design
Recent research ideas, and the future of
computing hardware
13
What is Computer Architecture?
Operating
System
Applications
(AI, DB,
Graphics)
Software
Instruction Set Architecture
Microarchitecture
System Architecture
Technology
Trends
Prog. Lang,
Compilers
Application
Trends
Hardware
VLSI/Hardware
Implementations
14
Application Areas
•
General-Purpose Laptop/Desktop
• Productivity, interactive graphics, video, audio
• Optimize price-performance
• Examples: Intel Pentium 4, AMD Athlon XP
•
Embedded Computers
• PDAs, cell-phones, sensors => Price, Energy
efficiency, Form-Factor
• Examples: Intel XScale, StrongARM (SA-110)
• Game Machines, Network uPs => PricePerformance
• Examples: Sony Emotion Engine, IBM 750FX
15
Application Areas
•
Commercial Servers
• Database, transaction processing, search
engines
• Performance, Availability, Scalability
• Server downtime could cost a brokerage
company more than $6M/hour
• Examples: Sun Fire 15K, IBM p690, Google
Cluster
•
Scientific Applications
• Protein Folding, Weather Modeling, CompBio,
Defense
• Floating-point arithmetic, Huge Memories
• Examples: IBM DeepBlue, BlueGene, Cray T3E,
etc.
16
Moore’s Law
•
Every 18-24 months
• Feature sizes shrink by 0.7x
• Number of transistors per die increases by 2x
• Speed of transistors increases by 1.4x
•
•
But we are starting to hit some
roadblocks…
Also, what to do with all of these
transistors???
17
Moore’s Law (Tx Density)
Transistors Per Die
1,000,000,000
Itanium2
100,000,000
Itanium
Pentium
III
10,000,000
PentiumII
Pentium
1,000,000
486
386
100,000
80286
8086
10,000
4004
1,000
1970
1980
1990
2000
2010
18
How have we used these transistors?
•
More functionality on one chip
• Early 1980s – 32-bit microprocessors
• Late 1980s – On Chip Level 1 Caches
• Early/Mid 1990s – 64-bit microprocessors,
superscalar (ILP)
• Late 1990s – On Chip Level 2 Caches
• Early/Mid 2000s – Chip Multiprocessors, On Chip
Level 3 Caches
•
What is next?
• How much more cache can we put on a chip?
(Itanium2)
• How many more cores can we put on a chip?
(Niagara, etc)
• What else can we put on chips?
19
Moore’s Law (Performance)
Relative Performance
1600
1400
1200
1000
Pentium III
Why has Performance
Outstripped Moore’s
Law?
800
DEC
Alpha
600
400
200
0
1984
1.58x per year
MIPS
R2000
1986 1988
IBM
Power1
1990
DEC
HP
Alpha
9000
1992 1994
HP 9000
1.35x per year
1996 1998
2000
20
Performance vs. Technology
Scaling
•
Architectural Innovations
• Massive pipelining (good and bad!)
• Huge caches
• Branch Prediction, Register Renaming, OOOissue/execution, Speculation (hardware/software
versions of all of these)
•
Circuit/Logic Innovations
• New logic circuit families (dynamic logic)
• Better CAD tools
• Advanced computer arithmetic
21
Power Dissipation Trends
2
Power Density (W/cm )
1000
Nuclear Reactor
100
Pentium 4 (Prescott)
Pentium 4
Pentium 3
Hot Plate
Pentium 2
10
Pentium Pro
Pentium
1
1980
386
486
1990
2000
2010
22
Power Issues in Microprocessors
Capacitive (Dynamic) Power
Static (Leakage) Power
Vdd
Vin
VIN
Vout
VOUT
IGate
CL
ISub
CL
Di/Dt (Vdd/Gnd Bounce)
Voltage (V)
Current (A)
Temperature
20 cycles
Minimum Voltage
23
Temperature/di-dt-Constrained
Power-Aware Computing Applications
Energy-Constrained Computing
24
The Battery Gap
Diverging Gap Between Actual Battery Capacities and Energy Needs
5000
10kbps
64kbps
384kbps
Interactive
Energy (mAh)
4000
3000
2000
2Mbps
Downlink
dominated
PIM, SMS,
Voice
Mobile videoConferencing,
Collaboration
Video email,
Voice recognition,
Mobile commerce
Fuel Cells
Web browser,
MMS, Video clips
1000
Lithium
Polymer
Lithium Ion
0
2000
2001
2002
2003
2004
2005
2006
2007
Battery
capacity
(mAh)
Energy
requirement
(mAh)
Source:
Anand
Raghunathan,
NEC Labs
25
Where does the juice go in laptops?
•
Others have measured ~55% processor
increase under max load in laptops
[Hsu+Kremer, 2002]
26
Packaging cost
From Cray (local power generator and refrigeration)…
Source: Gordon Bell, “A Seymour Cray perspective”
http://www.research.microsoft.com/users/gbell/craytalk/
27
Packaging cost
To today…
• IBM S/390: refrigeration:
•
Provides performance (2% perf for 10ºC) and
reliability
Source: R. R. Schmidt, B. D. Notohardjono “High-end server low temperature cooling”
28
IBM Journal of R&D
Intel Itanium packaging
Complex and expensive (note heatpipe)
Source: H. Xie et al. “Packaging the Itanium Microprocessor”
Electronic Components and Technology Conference 2002
29
P4 packaging
•
Simpler, but still…
Total Power-Related
PC System Cost ($)
40
30
20
10
0
0
10
20
30
40
Power (Watts)
From Tiwari, et al., DAC98
Source: Intel web site
30
31
Cooking Aware Computing
32
Server Farms
•
Internet data centers are like heavy-duty
factories
• e.g. small Datacenter 25,000 sq.feet, 8000
servers, 2MegaWatts
• Intergate Datacenter, Tukwila, WA: 1.5 Mill. Sq.Ft,
~500 MW
• Wants lowest net cost per server per sq foot of
data center space
•
Cost driven by:
•
•
•
•
•
Racking height
Cooling air flow
Power delivery
Maintenance ease (access, weight)
25% of total cost due to power
33
Environment
•
Environment Protection Agency (EPA): computers
consume 10% of commercial electricity
consumption
•
•
•
•
•
•
•
•
•
This incl. peripherals, possibly also manufacturing
A DOE report suggested this percentage is much lower
(3.0-3.5%)
No consensus, but it’s still a lot
Interesting to look at the numbers:
– http://enduse.lbl.gov/projects/infotech.html
Data center growth was cited as a contribution to the
2000/2001 California Energy Crisis
Equivalent power (with only 30% efficiency) for AC
CFCs used for refrigeration
Lap burn
Fan noise
34
Now we know why power is important
•
What can we do about it?
•
Two components to the problem:
• #1: Understand where and why power is
dissipated
• #2: Think about ways to reduce it at all levels of
computing hierarchy
• In the past, #1 is difficult to accomplish except
at the circuit level
• Consequently most low-power efforts were all
circuit related
35
Modeling + Design
•
First Component (Modeling/Measurement):
• Come up with a way to:
–Diagnose where power is going in your
system
–Quantify potential savings
•
Second Component (Design)
• Try out lots of ideas
• Or characterize tradeoffs of ideas…
•
This class will focus on both of these at
many levels of the computing hierarchy
36
Next Time
•
•
Course website:
http://www.eecs.harvard.edu/~dbrooks/cs246
Some readings for next week:
•
•
•
•
Web Page Browsing:
•
•
•
“Power-Aware Microarchitecture: Design and Modeling
Challenges for Next-Generation Microprocessors,” IEEE
MICRO.
“Power: A First-Class Architectural Design Constraint,”
IEEE Computer
“Thermal Crisis: Challenges and Potential Solutions”
Information Technology and Resource Use
– http://enduse.lbl.gov/projects/infotech.html
Sun’s ECO-responsible data center
– http://www.sun.com/aboutsun/environment/epa.jsp
Questions?
37
Download