
Multicore Chips and Parallel
Programming
Mary Hall
Dept. of Computer Science and Information
Sciences Institute
March 18, 2008
SSE Meeting
The Multicore Paradigm Shift:
Technology Drivers
Part 1: Technology Trends
What to do with all these transistors?
• Key ideas:
– Movement away from increasingly complex
processor design and faster clocks
– Replicated functionality (i.e., parallel) is
simpler to design
– Resources more efficiently utilized
– Huge power management advantages
The Architectural Continuum
Supercomputer: IBM BG/L
Commodity Server: Sun Niagara
Embedded: Xilinx Virtex 4
Multicore:
Impact on Software
Consequences:
– Individual processors will no longer get faster. At
first, they might get a little slower.
– Today’s software may not perform as well on
tomorrow’s hardware as written.
• And forget about adding capability!
The very future of the computing industry
demands successful strategies for applications
to exploit parallelism across cores!
The Multicore Paradigm Shift:
Computing Industry Perspective
We are at the cusp of a transition to
multicore, multithreaded architectures,
and we still have not demonstrated the
ease of programming the move will
require… I have talked with a few people at
Microsoft Research who say this is also at
or near the top of their list [of critical CS
research problems].
Justin Rattner, CTO, Intel Corporation
The Rest of this Talk
• Convergence of high-end, conventional and embedded computing
– Application development and compilation strategies for high-end
systems (supercomputers) are now becoming important for the masses
• Why?
– Technology trends (Motivation)
• Looking to the future
1. Automatically generating parallel code is useful, but insufficient.
2. Parallel computing for the masses demands better parallel
programming paradigms.
3. Compiler technology will become increasingly important to deal
with a diversity of optimization challenges, and must be engineered
for managing complexity and adapting to new architectures.
4. Potential to exploit vast machine resources to automatically
compose applications and systematically tune application
performance.
5. New tunable library and component technology.
1. Automatic Parallelization
From Hall et al., “Maximizing Multiprocessor Performance with the SUIF
Compiler”, IEEE Computer, Dec. 1996.
• Old approaches:
– Limited to loops and array computations
– Difficult to find sufficient granularity (parallel work between
synchronization)
– Success from fragile, complex software
• New ideas in this area:
– Finer granularity of parallelism -- more plentiful
– Combine with hardware support (e.g., speculation and multithreading)
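The granularity point can be illustrated with a small sketch. It is not taken from the SUIF compiler; the function names and the default of four workers are assumptions made for illustration. A loop whose iterations are independent is split into a few coarse chunks, so each worker has plenty of work between synchronization points. A loop with a loop-carried dependence (a running sum here) cannot be split this way without restructuring.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_sequential(a, b, factor):
    """Original loop: every iteration is independent of the others."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] * factor + b[i]
    return out


def scale_chunk(a, b, factor, lo, hi):
    """Body of one coarse-grain task covering iterations lo..hi-1."""
    return [a[i] * factor + b[i] for i in range(lo, hi)]


def scale_parallel(a, b, factor, workers=4):
    """Split the iteration space into at most `workers` contiguous chunks."""
    n = len(a)
    step = max(1, (n + workers - 1) // workers)
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: scale_chunk(a, b, factor, *r), bounds)
        out = []
        for part in parts:  # single synchronization point: gather results
            out.extend(part)
    return out


def prefix_sum(a):
    """Loop-carried dependence: iteration i needs the result of i-1."""
    out = []
    total = 0
    for x in a:
        total += x
        out.append(total)
    return out
```

In CPython the global interpreter lock means the threads here show the decomposition rather than deliver a speedup; the same chunking applies when the chunks run on separate cores.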
2. Parallel Programming
State of the Art
Three dominant classes of applications
(Domains / Appl. Characteristics / Programming Paradigms)
• Scientific Computing
– Appl. Characteristics: Very large arrays representing simulation
region, loops, data parallel
– Programming Paradigms: MPI dominant. Also, OpenMP, PGAS,
Grids & distributed computing
• Databases
– Appl. Characteristics: Queries over large data sets, often distributed
– Programming Paradigms: Query languages like SQL
• Systems and Embedded Software
– Appl. Characteristics: Fine-grain threads, small number of processors
– Programming Paradigms: Low-level threading such as Pthreads
Domain-specific, intellectually challenging and low-level
programming models not suitable for the masses.
2. New Parallel Programming
Paradigms
• Transactional memory
– Section of code executes atomically with subsequent
commit or rollback
– Programming model + hardware support
• Streams and data-parallel models
– Data streams describe the flow of data
– Well-suited for certain applications and hardware
(IBM Cell, GPUs)
• Domain-specific languages and libraries
– Parallelism implicit within implementation
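The commit-or-rollback idea behind transactional memory can be sketched in software. This is a minimal illustration under stated assumptions, not the hardware-supported model the slide refers to: `TVar` and `atomically` are hypothetical names, the transaction touches a single shared cell, and conflicts are detected with a version counter. The transaction body runs speculatively without holding a lock; it commits only if no other thread committed in the meantime, and otherwise its result is discarded and the body is re-executed.

```python
import threading


class TVar:
    """A shared cell with a version number, used for conflict detection."""

    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()  # serializes only snapshot and commit steps


def atomically(transaction, tvar):
    """Run transaction(old_value) -> new_value; retry on conflict."""
    while True:
        with _commit_lock:
            seen_version = tvar.version  # consistent snapshot
            snapshot = tvar.value
        new_value = transaction(snapshot)  # speculative work, no lock held
        with _commit_lock:
            if tvar.version == seen_version:  # nobody committed meanwhile
                tvar.value = new_value
                tvar.version += 1
                return new_value
        # Conflict: discard new_value (rollback) and re-execute.


def deposit(account, amount, times):
    for _ in range(times):
        atomically(lambda balance: balance + amount, account)


account = TVar(0)
threads = [
    threading.Thread(target=deposit, args=(account, 1, 1000)) for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(account.value)
```

The programmer states only what must appear atomic; retry and conflict handling stay inside `atomically`, which is the part hardware support would accelerate.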
Different applications and users demand different
solutions. Convergence unlikely. Architecture
independence?
3. Engineering a Compiler
• Compiler research will play a crucial role in
achieving performance and programmability
of multi-core hardware.
• What is the state of compilers today?
– Roughly 5 year lag between introducing a new
architecture and a robust compiler
– Many interesting new architectures fail in the
marketplace due to inadequate software tools
• Today’s compilers are complex and monolithic
– SUIF has ~500K LOC, Open64 has ~12M LOC
The best research ideas do not always
make it into practice
3. A New Kind of “Compiler”
Traditional view:
[Figure labels: code, Batch Compiler, input data]
3 & 4. Performance Tuning
“Compiler”
[Figure labels: code, transformation script(s), Code Translation,
Experiments Engine, input data (characteristics), search script(s)]
4. Auto-tuner
[Figure labels: code, transformation script(s), Code Translation,
Experiments Engine, input data (characteristics), search script(s)]
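The roles named in the auto-tuner diagram can be sketched as a small empirical search. This is an illustration under assumptions, not the system shown in the talk: the tunable kernel, the candidate block sizes and all function names are hypothetical. A parameterized code variant stands in for the output of code translation, `measure` plays the experiments engine, and `autotune` plays a search script that runs each variant on representative input data and keeps the fastest.

```python
import time


def blocked_sum_of_squares(data, block):
    """One code variant, parameterized by a block size."""
    total = 0
    for start in range(0, len(data), block):
        for x in data[start:start + block]:
            total += x * x
    return total


def measure(variant, data, param, repeats=3):
    """Experiments engine: best-of-N wall-clock time for one variant."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        variant(data, param)
        best = min(best, time.perf_counter() - t0)
    return best


def autotune(variant, data, candidates):
    """Search script: exhaustive search over candidate parameter values."""
    timings = {p: measure(variant, data, p) for p in candidates}
    best_param = min(timings, key=timings.get)
    return best_param, timings


data = list(range(10_000))  # representative input data
best_block, timings = autotune(blocked_sum_of_squares, data, [16, 64, 256, 1024])
```

Exhaustive search is the simplest choice; with many parameters a search script would typically prune or sample the space instead.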
Heterogeneous:
Additional Complexity
• Partitioning: Where to execute?
• Staging data to/from global memory
• Managing data movement and synchronization
• Other:
– Utilizing highly tuned libraries
– Differences in programming models
(GPP + FPGA is extreme example)
[Figure labels: Device Type 1, Device Type 2, Device Type 3,
Device Type 4, Memory]
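The partitioning question can be made concrete with a toy cost model. Everything here is an assumption for illustration: the `Device` fields, the two example devices and their rates are made-up numbers, not measurements of real hardware. The sketch estimates total time as compute time plus the time to stage data to and from the device, then runs the work where that estimate is lowest.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    ops_per_second: float    # hypothetical sustained compute rate
    bytes_per_second: float  # hypothetical bandwidth to/from global memory


def estimated_time(device, ops, bytes_moved):
    """Compute time plus the time to stage data to and from the device."""
    return ops / device.ops_per_second + bytes_moved / device.bytes_per_second


def choose_device(devices, ops, bytes_moved):
    """Partitioning decision: run where the estimated total time is lowest."""
    return min(devices, key=lambda d: estimated_time(d, ops, bytes_moved))


devices = [
    Device("general-purpose core", ops_per_second=1e9, bytes_per_second=1e10),
    Device("accelerator", ops_per_second=1e11, bytes_per_second=1e9),
]

# Little computation per byte: staging cost dominates.
small = choose_device(devices, ops=1e5, bytes_moved=1e6)
# Heavy computation on the same data: the faster device wins.
large = choose_device(devices, ops=1e12, bytes_moved=1e6)
```

With these made-up rates the small task stays on the general-purpose core and the large one moves to the accelerator, which is why data movement has to be part of the partitioning decision.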
5. Libraries and Component
Technology
Traditional View:
– Interface: Provides/Requires
– Code (source or binary)
Expanded View:
– Interface: Abstract Provides/Requires
– Interface: Device Dependencies
– Partial Code (source or tunable binary)
– Code Generator
– Performance: Device, Data Features
– Data Description: Types, Sizes
– Data Description: Map Features to Optimization
Support for automatic selection, tuning, scheduling, etc.
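Automatic selection in the expanded view can be sketched as a component that carries several implementations plus a rule mapping data features to one of them. This is a hypothetical illustration: `Variant`, `SortComponent` and the size threshold of 16 are assumptions, not part of any existing component framework. The caller sees one abstract interface, and the component inspects a data description (type and size) to pick the variant.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Variant:
    name: str
    applies: Callable[[dict], bool]  # maps data features to applicability
    run: Callable[[list], list]


def insertion_sort(xs):
    """Simple variant assumed to suit very small inputs."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


class SortComponent:
    """Abstract interface 'provides sort'; implementation chosen per call."""

    def __init__(self, variants: List[Variant]):
        self.variants = variants

    def describe(self, xs):
        """Data description: element type and size."""
        return {"size": len(xs), "type": type(xs[0]).__name__ if xs else None}

    def select(self, xs):
        features = self.describe(xs)
        for variant in self.variants:
            if variant.applies(features):
                return variant
        raise LookupError("no variant matches the data description")

    def sort(self, xs):
        return self.select(xs).run(xs)


component = SortComponent([
    Variant("insertion", lambda f: f["size"] <= 16, insertion_sort),
    Variant("builtin", lambda f: True, sorted),
])
```

The threshold is hard-coded here; in a tunable component it would be one of the values an auto-tuner sets per device.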
Summary
• Parallel computing is everywhere!
– And we need software tools
– Can we find some common ground?
• Strategies
– Automatic parallelization
– Libraries, component technology and domain-specific tools
that hide parallelism
– New programming languages
– Auto-tuners to “test” alternative solutions
• General approach to solving challenges
– Education: CS503, Parallel Programming
– Organize the community to support incremental LONG
TERM development.