MIT OpenCourseWare 6.189 Multicore Programming Primer, January (IAP) 2007

MIT OpenCourseWare
http://ocw.mit.edu
6.189 Multicore Programming Primer, January (IAP) 2007
Please use the following citation format:
Saman Amarasinghe, 6.189 Multicore Programming Primer, January
(IAP) 2007. (Massachusetts Institute of Technology: MIT
OpenCourseWare). http://ocw.mit.edu (accessed MM DD, YYYY).
License: Creative Commons Attribution-Noncommercial-Share Alike.
Note: Please use the actual date you accessed this material in your citation.
For more information about citing these materials or our Terms of Use, visit:
http://ocw.mit.edu/terms
6.189 IAP 2007
Lecture 1
Multicore Programming Primer
and Programming Competition
Introduction
Prof. Saman Amarasinghe, MIT.
6.189 IAP 2007 MIT
The “Software Crisis”
“To put it quite bluntly: as long as there were no
machines, programming was no problem at all;
when we had a few weak computers,
programming became a mild problem, and now
we have gigantic computers, programming has
become an equally gigantic problem.”
-- E. Dijkstra, 1972 Turing Award Lecture
The First Software Crisis
● Time Frame: ’60s and ’70s
● Problem: Assembly Language Programming
• Computers could handle larger, more complex programs
● Needed to get Abstraction and Portability without
losing Performance
How Did We Solve the First Software Crisis?
● High-level languages for von Neumann machines
• FORTRAN and C
● Provided “common machine language” for
uniprocessors
● Common properties
• Single flow of control
• Single memory image
● Differences
• Register file
• ISA
• Functional units
The Second Software Crisis
● Time Frame: ’80s and ’90s
● Problem: Inability to build and maintain complex and
robust applications requiring multi-million lines of code
developed by hundreds of programmers
• Computers could handle larger, more complex programs
● Needed to get Composability, Malleability and
Maintainability
• High performance was not an issue: left for Moore’s Law
How Did We Solve the Second Software Crisis?
● Object Oriented Programming
• C++, C# and Java
● Also…
• Better tools
– Component libraries, Purify
• Better software engineering methodology
– Design patterns, specification, testing, code reviews
Today: Programmers are Oblivious to Processors
● Solid boundary between Hardware and Software
● Programmers don’t have to know anything about the
processor
• High-level languages abstract away the processors
– Ex: Java bytecode is machine independent
• Moore’s law does not require the programmers to know
anything about the processors to get good speedups
● Programs are oblivious of the processor: they work on all
processors
• A program written in the ’70s using C still works and is much
faster today
● This abstraction provides a lot of freedom for the
programmers
The Origins of a Third Crisis
● Time Frame: 2005 to 20??
● Problem: Sequential performance is left behind by Moore’s law
● Needed continuous and reasonable performance improvements
• to support new features
• to support larger datasets
● While sustaining portability, malleability and maintainability
without unduly increasing complexity faced by the programmer
• critical to keep up with the current rate of evolution in software
The March to Multicore:
Moore’s Law
Image removed due to copyright restrictions.
Graph of number of transistors versus year. From Hennessy, J. L., D. A.
Patterson, and A. C. Arpaci-Dusseau. Computer Architecture: A Quantitative
Approach. 4th ed. Amsterdam, The Netherlands: Morgan Kaufmann, 2006.
ISBN: 9780123704900.
The March to Multicore:
Uniprocessor Performance (SPECint)
[Figure: SPECint2000 performance (log scale, 1.00 to 10000.00) versus year, 1985 to 2007, for Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4 and Itanium; Alpha 21064, 21164 and 21264; Sparc, SuperSparc and Sparc64; Mips; HP PA; Power PC; AMD K6, K7 and x86-64. From David Patterson.]
The March to Multicore:
Uniprocessor Performance (SPECint)
● General-purpose unicores have stopped historic
performance scaling
• Power consumption
• Wire delays
• DRAM access latency
• Diminishing returns of more instruction-level parallelism
Power Consumption (watts)
[Figure: power in watts (log scale, 1 to 1000) versus year, 1985 to 2007, for Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4 and Itanium; Alpha 21064, 21164 and 21264; Sparc, SuperSparc and Sparc64; Mips; HP PA; Power PC; AMD K6, K7 and x86-64. From David Patterson.]
Power Efficiency (watts/spec)
[Figure: watts per SPEC (0 to 0.7) versus year, 1982 to 2006, for Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4 and Itanium; Alpha 21064, 21164 and 21264; Sparc, SuperSparc and Sparc64; Mips; HP PA; Power PC; AMD K6, K7 and x86-64.]
Range of a Wire in One Clock Cycle
• 400 mm² die
• From the SIA Roadmap
[Figure: process size (microns, 0 to 0.28) versus year, 1996 to 2014, annotated with clock frequencies of 700 MHz, 1.25 GHz, 2.1 GHz, 6 GHz, 10 GHz and 13.5 GHz.]
DRAM Access Latency
Images removed due to copyright restrictions.
● Access times are a speed of light issue
● Memory technology is also changing
• SRAM is getting harder to scale
• DRAM is no longer cheapest cost/bit
● Power efficiency is an issue here as well
[Figure: performance (log scale, 1 to 1000000) versus year, 1980 to 2004. µProc: 60%/yr. (2X/1.5yr). DRAM: 9%/yr. (2X/10 yrs).]
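The two growth rates on this slide compound into the processor-memory gap. The sketch below is an illustration added for clarity, not part of the original lecture: it takes the slide's 60%/yr and 9%/yr figures at face value and computes the implied doubling times and the ratio between the two curves after a given number of years. The function names (`growth_factor`, `doubling_time`, `gap_after`) are made up for this example.

```python
import math


def growth_factor(rate_per_year, years):
    """Total improvement after `years` of compounding at `rate_per_year`."""
    return (1.0 + rate_per_year) ** years


def doubling_time(rate_per_year):
    """Years needed to double at a constant annual improvement rate."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)


def gap_after(years, cpu_rate=0.60, dram_rate=0.09):
    """Ratio of processor improvement to DRAM improvement after `years`.

    The default rates are the ones quoted on the slide (assumed constant).
    """
    return growth_factor(cpu_rate, years) / growth_factor(dram_rate, years)


if __name__ == "__main__":
    print(f"processor doubling time: {doubling_time(0.60):.2f} years")
    print(f"DRAM doubling time:      {doubling_time(0.09):.2f} years")
    for years in (1, 5, 10, 20):
        print(f"gap after {years:2d} years: {gap_after(years):.1f}x")
```

At exactly 9%/yr the doubling time works out to roughly 8 years, so the slide's "2X/10 yrs" is best read as a rounded figure. The gap itself widens by a factor of about 1.47 every year.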
Diminishing Returns
● The ’80s: Superscalar expansion
• 50% per year improvement in performance
• Transistors applied to implicit parallelism
– pipeline processor (10 CPI --> 1 CPI)
● The ’90s: The Era of Diminishing Returns
• Squeaking out the last implicit parallelism
– 2-way to 6-way issue, out-of-order issue, branch prediction
– 1 CPI --> 0.5 CPI
• performance below expectations
• projects delayed & canceled
● The ’00s: The Beginning of the Multicore Era
• The need for Explicit Parallelism
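The CPI figures on this slide can be turned into speedups with one line of arithmetic. This is a minimal sketch, assuming the instruction count and clock rate stay fixed so that execution time is proportional to CPI. The helper names are hypothetical and do not come from the lecture.

```python
def speedup(cpi_before, cpi_after):
    """Speedup from a CPI change at fixed instruction count and clock rate."""
    return cpi_before / cpi_after


def issue_utilization(cpi, issue_width):
    """Fraction of peak issue bandwidth actually used.

    Achieved instructions per cycle is 1 / CPI; a machine that can issue
    `issue_width` instructions per cycle peaks at IPC == issue_width.
    """
    achieved_ipc = 1.0 / cpi
    return achieved_ipc / issue_width


if __name__ == "__main__":
    # '80s: pipelining, 10 CPI --> 1 CPI
    print("pipelining speedup:", speedup(10, 1))
    # '90s: wide out-of-order issue, 1 CPI --> 0.5 CPI
    print("wide-issue speedup:", speedup(1, 0.5))
    # How much of a 6-way issue machine does 0.5 CPI use?
    print("6-way utilization: ", issue_utilization(0.5, 6))
```

Under these assumptions, pipelining bought a tenfold reduction in CPI, while the much more elaborate wide-issue machinery bought a further factor of two. That is about a third of what a 6-way machine could issue at peak.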
Unicores are on the verge of extinction
Multicores are here
[Timeline figure: 2H 2004, 1H 2005, 2H 2005, 1H 2006, 2H 2006]
• MIT Raw: 16 Cores, Since 2002
• Intel Montecito: 1.7 Billion transistors, Dual Core IA/64
• Intel Pentium D (Smithfield)
• Intel Tanglewood: Dual Core IA/64
• Intel Dempsey: Dual Core Xeon
• Intel Pentium Extreme: 3.2GHz Dual Core
• Intel Tejas & Jayhawk: Unicore (4GHz P4), Cancelled
• Intel Yonah: Dual Core Mobile
• AMD Opteron: Dual Core
• Sun Olympus and Niagara: 8 Processor Cores
• IBM Cell: Scalable Multicore
• IBM Power 6: Dual Core
• IBM Power 4 and 5: Dual Cores Since 2001
• …
Multicores are Here
[Figure: number of cores (log scale, 1 to 512) versus year, 1970 to 20??. Processors plotted: 4004, 8008, 8080, 8086, 286, 386, 486, Pentium, P2, P3, P4, Athlon, Itanium, Itanium 2, Power4, Power6, PA-8800, Opteron, Opteron 4P, Xeon MP, Tanglewood, PExtreme, Yonah, Xbox360, Cell, Niagara, Raw, Raza XLR, Cavium Octeon, Broadcom 1480, Intel Tflops, Cisco CSR-1, Ambric AM2045, Picochip PC102.]
Requirements and Outcomes
● Requirements
• A good programmer with experience
• Fluent in C
● Outcomes
• Know fundamental concepts of parallel programming
(both hardware and software)
• Understand issues of parallel performance
• Able to synthesize a fairly complex parallel program
• Hands-on experience with the IBM Cell processor
The Project
● You proposed the projects
● We selected 7 teams
• Mainly by the strength of the project proposals
● Seven Great Projects
• Distributed Real-time Ray Tracer
• Global Illumination
• Linear Algebra Pack
• Molecular Dynamics Simulator
• Speech Synthesizer
• Soft Radio
• Backgammon Tutor
● Project Characteristics
• Ambitious but accomplishable
• Important and Relevant
• Opportunity to sizzle
● Get them started ASAP!
[Image courtesy of Sony Computer Entertainment Inc. Used with permission.]
A Note of Caution
● Cell processor is very new
● It is not an easy architecture to work with
● The tool chain is thin and brittle
● Most of the staff have limited experience
● Projects you are doing are of your own making.
They aren’t canned exercises that are tried and proven.
● You will face unexpected problems.
● WE ARE ALL IN THIS TOGETHER!!
Grading
● Mini Quizzes: 16%
• At the beginning of each class day
• 5 minutes each
● Lab Projects: 24%
● Final Group Project: 60%
Final Competition
● The competition will be decided on
• Performance
• Completeness
• Algorithmic complexity
• Demo and Presentation
● The winning team will
• Get gift certificates ($150 each)
• Be invited to IBM TJ Watson Research Center for a day
– Tour of the facilities
– Present your project
Staff
● Prof. Saman Amarasinghe
• Interested in languages, compilers and computer architecture
• Raw Processor (with Prof. Anant Agarwal)
• StreamIt language
• SUIF parallelizing compiler
● Dr. Rodric Rabbah
• Currently a researcher at IBM Watson Research Center
• Was a research scientist at CSAIL before that
• Interested in compilers, computer architecture and FPGAs
Guest Lectures
● Dr. Michael Perrone
• IBM Watson Research Center
• Expert in Cell Architecture and Application Development
● Prof. Alan Edelman
• Math and CS. Interested in parallel algorithms
● Prof. Arvind
• Parallel architectures, compilers and languages
● Dr. Bradley Kuszmaul
• Research scientist at CSAIL working on Cilk
● Mike Acton
• Professional game developer
● Bill Thies
• CSAIL PhD candidate
• Architect of StreamIt
Lecture Organization
Extracting Parallelism
● Implicit
• Hardware
– Superscalar Processors (start of Lecture 3)
• Compiler
– Parallelizing Compilers (Lectures 11 & 12)
● Explicit
• Languages
– StreamIt (Lecture 8)
– Star-P (Lecture 13)
– BlueSpec (Lecture 14)
– Cilk (Lecture 15)
• Library
– Concurrency (Lecture 4)
– Design Patterns (Lectures 5, 6, 7)
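To make the "Explicit, Library" branch of this outline concrete, here is a small sketch of library-based explicit parallelism: the programmer splits the work into chunks and hands them to a thread pool. It is an illustration only and is not taken from the course, which targets C on the Cell processor. It uses Python's standard `concurrent.futures` module, and the helper names (`chunk_bounds`, `partial_sum_of_squares`, `parallel_sum_of_squares`) are invented for the example. In standard CPython builds the global interpreter lock keeps these threads from executing Python bytecode simultaneously, so the code shows the structure of the decomposition rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, parts):
    """Split range(n) into `parts` contiguous [start, stop) chunks."""
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum_of_squares(bounds):
    """Work done by one task: sum i*i over its own chunk."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=4):
    """Explicitly decompose the sum, run the chunks as tasks, then combine."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(partial_sum_of_squares, chunk_bounds(n, workers))
        return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

The decomposition, the mapping of chunks to workers and the final reduction are all written out by the programmer, which is what distinguishes this from the implicit parallelism extracted by superscalar hardware or a parallelizing compiler.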
Schedule
[Table: weekly calendar, Monday through Friday, for the weeks of Jan 8, Jan 15, Jan 22 and Jan 29; sessions run 10:00 – 10:55 and 11:05 – 12:00.]
● Lectures
• Lecture 1: Course Introduction (10:00 – 10:55)
• Lecture 2: Introduction to Cell Processor (11:05 – 12:00)
• Lecture 3: Introduction to Parallel Architectures
• Lecture 4: Introduction to Concurrent Programming
• Lecture 5: Parallel Programming Concepts
• Lecture 6: Design Patterns for Parallel Programming I
• Lecture 7: Design Patterns for Parallel Programming II
• Lecture 8: StreamIt Language
• Lecture 9: Debugging and Performance Monitoring
• Lecture 10: Performance Optimizations
• Lecture 11: Classic Parallelizing Compilers (10:00 – 10:55)
• Lecture 12: StreamIt Parallelizing Compiler (11:05 – 12:00)
• Lecture 13: Star-P
• Lecture 14: Synthesizing Parallel Programs
• Lecture 15: Cilk
• Lecture 16: Anatomy of a Game
• Lecture 17: The Raw Experience (10:00 – 10:55)
• Lecture 18: The Future (11:05 – 12:00)
● Recitations
• Recitation 1: Getting to Know Cell
• Recitation 2-3: Cell Programming Hands-On
• Recitation 4: Cell Debugging Tools
• Recitation 5, 6: Cell Performance Monitoring Tools
● Other
• Holiday
• Project Reviews
• Group Presentations
• Awards & Reception