Slides on algorithm design

Advertisement
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Parallel Programming in C with MPI and OpenMP
Michael J. Quinn
Tuesday, April 14, 15
Algorithm design and basic algorithms
Slides are modified from those found in Parallel Programming in C with MPI and OpenMP, Michael Quinn
Outline
- Task/channel model
- Algorithm design methodology
- Case studies
Task/Channel Model
- Parallel computation = set of tasks
- Task
  - Program
  - Local memory
  - Collection of I/O ports
- Tasks interact by sending messages through channels
Task/Channel Model
[Figure: a directed graph in which the nodes are tasks and the directed edges are channels]
Foster's Design Methodology
- Partitioning
- Communication
- Agglomeration
- Mapping
Foster's Methodology
[Figure: Problem → Partitioning → Communication → Agglomeration → Mapping]
Partitioning
- Dividing computation and data into pieces
- Domain decomposition
  - Divide data into pieces
  - Determine how to associate computations with the data
- Functional decomposition
  - Divide computation into pieces
  - Determine how to associate data with the computations
Example Domain Decompositions
[Figure: 1-D, 2-D, and 3-D decompositions of a three-dimensional block of data]
The number of primitive tasks sets the scope, or order of magnitude, of the parallelism. In the example, the 1-D decomposition yields n-way parallelism along one face, the 2-D decomposition yields n^2-way parallelism, and the 3-D decomposition yields n^3-way parallelism.
Example Functional Decomposition
[Figure: task graph with tasks "Track position of instruments", "Determine image location" (two instances), "Register image", and "Display image"]
Types of parallelism
- Numerical algorithms often have data parallelism.
- Non-numerical algorithms often have functional parallelism.
- Many algorithms, especially complex numerical algorithms, have both, e.g., data parallelism within a function and many functions that can be done in parallel.
- Functional parallelism often scales worse with increasing data size (it is concurrency-limited in isoefficiency terms).
Partitioning Checklist
- At least 10x more primitive tasks than processors in target computer
- Minimize redundant computations and redundant data storage
- Primitive tasks roughly the same size
- Number of tasks an increasing function of problem size
Communication
- Determine values passed among tasks
- Local communication
  - Task needs values from a small number of other tasks
  - Create channels illustrating data flow
- Global communication
  - Significant number of tasks contribute data to perform a computation
  - Don't create channels for them early in design
Communication Checklist
- Communication operations balanced among tasks
- Each task communicates with only a small group of neighbors
- Tasks can perform communications concurrently
- Tasks can perform computations concurrently
Agglomeration
- Grouping tasks into larger tasks
- Goals
  - Improve performance
  - Maintain scalability of program
  - Simplify programming
- In MPI programming, the goal is often to create one agglomerated task per processor
Agglomeration Can Improve Performance
- Eliminate communication between primitive tasks agglomerated into a consolidated task
- Combine groups of sending and receiving tasks
Agglomeration Checklist
- Locality of parallel algorithm has increased
- Replicated computations take less time than the communications they replace
- Data replication doesn't affect scalability
- Agglomerated tasks have similar computational and communication costs
- Number of tasks increases with problem size
- Number of tasks suitable for likely target systems
- Tradeoff between agglomeration and code modification costs is reasonable
Mapping
- Process of assigning tasks to processors
- Shared-memory system: mapping done by operating system
- Distributed-memory system: mapping done by user
- Conflicting goals of mapping
  - Maximize processor utilization
  - Minimize interprocessor communication
Mapping Example
While this may reduce communication, load balance may be an issue.
Optimal Mapping
- Finding an optimal mapping is NP-hard
- Must rely on heuristics
- Metis is a popular package for partitioning graphs
  - Minimizes the number of edges cut between the partitions of a graph
  - Edges, for our purposes, can be thought of as communication
Mapping Decision Tree
- Static number of tasks
  - Structured communication
    - Constant computation time per task
      - Agglomerate tasks to minimize communication
      - Create one task per processor
    - Variable computation time per task
      - Cyclically map tasks to processors
      - GSS (guided self-scheduling)
  - Unstructured communication
    - Use a static load-balancing algorithm
- Dynamic number of tasks
Mapping Strategy
- Static number of tasks
- Dynamic number of tasks
  - Frequent communications between tasks
    - Use a dynamic load-balancing algorithm
  - Many short-lived tasks
    - Use a run-time task-scheduling algorithm
    - Cilk and Galois, discussed in the next couple of weeks, do this
Mapping Checklist
- Considered designs based on one task per processor and multiple tasks per processor
- Evaluated static and dynamic task allocation
- If dynamic task allocation chosen, task allocator is not a bottleneck to performance
- If static task allocation chosen, ratio of tasks to processors is at least 10:1
Case Studies
- Boundary value problem
- Finding the maximum
- The n-body problem
- Adding data input
Boundary Value Problem
[Figure: a rod surrounded by insulation, with its ends in ice water]

Rod Cools as Time Progresses
Want to use finite-difference method over multiple time steps
[Figure: grid of nodes, with position on the horizontal axis and time on the vertical axis]
- Each circle represents a computation
- Temperature at time t+1 for a position on the rod represented by a node depends on the temperature of its neighbors at time t
Partitioning
[Figure: grid of nodes, with position on the horizontal axis and time on the vertical axis]
- One data item per grid point
- Associate one primitive task with each grid point
- Two-dimensional domain decomposition
Communication
[Figure: grid of nodes with channels connecting each node to its neighbors]
- Identify communication pattern between primitive tasks
- Each interior primitive task has three incoming and three outgoing channels
Agglomeration and Mapping
[Figure: panels (a), (b), (c) showing the primitive tasks being agglomerated and mapped.]
Sequential execution time
- χ – time to update one element
- n – number of elements
- m – number of iterations
- Sequential execution time: m(n-1)χ
Parallel Execution Time
- χ – time to update one element
- n – number of elements
- m – number of iterations
- p – number of processors
- λ – message latency
- Parallel execution time: m(χ⌈(n-1)/p⌉ + 2λ)
Finding the Maximum Error from
measured data
Need to do a reduction.
[Figure: per-element error values; the maximum error shown is 6.25%.]
Parallel Reduction Evolution
[Figure: (a) n-1 tasks; (b) two groups of n/2-1 tasks; (c) four groups of n/4-1 tasks.]
Binomial Trees
[Figure: binomial trees of increasing size, built up over several slides.]
A binomial tree is a subgraph of a hypercube.
Finding Global Sum
[Figure: binomial-tree summation of 16 values, shown one step per slide.]
Initial values: 4 2 0 7 / -3 5 -6 -3 / 8 1 2 3 / -4 4 6 -1
After step 1: 1 7 -6 4 / 4 5 8 2
After step 2: 8 -2 9 10
After step 3: 17 8
After step 4: 25
The communication pattern forms a binomial tree.
Agglomeration
Agglomeration leads to actual
communication
[Figure: each agglomerated task computes a local sum before communicating.]
The n-body Problem
[Figure: a set of particles, each influenced by every other particle, built up over several slides.]
Partitioning
- Domain partitioning
- Assume one task per particle
- Task has particle's position, velocity vector
- Iteration
  - Get positions of all other particles
  - Compute new position, velocity
Gather
[Figure: every task sends its data to a single task.]
All-gather
[Figure: every task ends up with the data of all tasks.]
Complete Graph for All-gather (operations shown, no ordering required)
[Figure: four tasks holding a, b, c, d exchange data directly over a complete graph; after the exchanges every task holds a, b, c, d.]
Hypercube-based All-gather (ordering required)
[Figure: first exchange step along one hypercube dimension: the tasks holding a and b exchange to get {a, b}; the tasks holding c and d exchange to get {c, d}.]
Complete Graph for All-gather
[Figure: final state, every task holds a, b, c, d.]
Communication Time
Complete graph:
  (p - 1)(λ + n/(βp)) = (p - 1)λ + n(p - 1)/(βp)
Hypercube:
  Σ_{i=1}^{log p} (λ + 2^{i-1} n/(βp)) = λ log p + n(p - 1)/(βp)
Adding Data Input
[Figure: Input distributed to tasks 0, 1, 2, 3; Output collected from them.]
Scatter
[Figure: one task divides its data and sends a distinct piece to every task.]
Scatter in log p Steps
[Figure: data shown in an application buffer vs. in a system buffer at each step.]
Summary: Task/Channel Model
- Parallel computation
  - Set of tasks
  - Interactions through channels
- Good designs
  - Maximize local computations
  - Minimize communications
  - Scale up
Summary: Design Steps
- Partition computation
- Agglomerate tasks
- Map tasks to processors
- Goals
  - Maximize processor utilization
  - Minimize inter-processor communication
Summary: Fundamental Algorithms
- Reduction
- Gather and scatter
- All-gather