Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Parallel Programming in C with MPI and OpenMP
Michael J. Quinn
Tuesday, April 14, 15

Algorithm Design and Basic Algorithms
Slides are modified from those found in Parallel Programming in C with MPI and OpenMP, Michael Quinn

Outline
- Task/channel model
- Algorithm design methodology
- Case studies

Task/Channel Model
- Parallel computation = set of tasks
- Task
  - Program
  - Local memory
  - Collection of I/O ports
- Tasks interact by sending messages through channels
[Figure: tasks (nodes) connected by channels (directed edges)]

Foster's Design Methodology
- Partitioning
- Communication
- Agglomeration
- Mapping
Foster's Methodology
[Figure: Problem → Partitioning → Communication → Agglomeration → Mapping]

Partitioning
- Dividing computation and data into pieces
- Domain decomposition
  - Divide data into pieces
  - Determine how to associate computations with the data
- Functional decomposition
  - Divide computation into pieces
  - Determine how to associate data with the computations

Example Domain Decompositions
[Figure: 1-D, 2-D, and 3-D decompositions of a cube of data]
The number of primitive tasks indicates the scope, or order of magnitude, of the parallelism: in this example the 1-D decomposition exposes n-way parallelism along a face, the 2-D decomposition n^2-way, and the 3-D decomposition n^3-way.

Example Functional Decomposition
[Figure: graph of functions — Track position of instruments, Determine image location, Register image, Display image]

Types of Parallelism
- Numerical algorithms often have data parallelism
- Non-numerical algorithms often have functional parallelism
- Many algorithms, especially complex numerical algorithms, have both, e.g., data parallelism within a function and many functions that can be done in parallel
- Functional parallelism often scales worse with increasing data size (concurrency-limited in isoefficiency terms)

Partitioning Checklist
- At least 10x more primitive tasks than processors in target computer
- Minimize redundant computations and redundant data storage
- Primitive tasks roughly the same size
- Number of tasks an increasing function of problem size
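The checklist items "primitive tasks roughly the same size" and "number of tasks an increasing function of problem size" are easy to satisfy for a 1-D domain decomposition with a balanced block scheme. A minimal sketch in plain C (the function names here are illustrative, not from the slides):

```c
/* Balanced 1-D block decomposition of n elements over p tasks.
 * Task id owns elements [block_low, block_high]; block sizes differ
 * by at most one, and the number of blocks grows with n and p. */
static int block_low(int id, int p, int n)  { return (int)(((long)id * n) / p); }
static int block_high(int id, int p, int n) { return block_low(id + 1, p, n) - 1; }
static int block_size(int id, int p, int n) {
    return block_high(id, p, n) - block_low(id, p, n) + 1;
}
/* Inverse map: which task owns element i? */
static int block_owner(int i, int p, int n) {
    return (int)((((long)p * (i + 1)) - 1) / n);
}
```

For n = 10 and p = 4 this yields block sizes 2, 3, 2, 3 — all within one element of each other, as the checklist asks.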
Communication
- Determine values passed among tasks
- Local communication
  - Task needs values from a small number of other tasks
  - Create channels illustrating data flow
- Global communication
  - Significant number of tasks contribute data to perform a computation
  - Don't create channels for them early in design
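For local communication the channels can be derived directly from the decomposition geometry. A sketch in plain C (hypothetical names) for a 1-D chain of p tasks, where each task exchanges boundary values only with its left and right neighbor; returning -1 marks "no channel", much as MPI programs use MPI_PROC_NULL:

```c
/* Neighbors of task `id` in a 1-D chain of p tasks.
 * Endpoint tasks have no neighbor on one side (-1 = no channel). */
static int left_neighbor(int id, int p)  { (void)p; return (id > 0) ? id - 1 : -1; }
static int right_neighbor(int id, int p) { return (id < p - 1) ? id + 1 : -1; }

/* Channels per task: interior tasks have two, endpoint tasks one --
 * communication is balanced across tasks, per the checklist below. */
static int channel_count(int id, int p) {
    return (left_neighbor(id, p) >= 0) + (right_neighbor(id, p) >= 0);
}
```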
Communication Checklist
- Communication operations balanced among tasks
- Each task communicates with only a small group of neighbors
- Tasks can perform communications concurrently
- Tasks can perform computations concurrently

Agglomeration
- Grouping tasks into larger tasks
- Goals
  - Improve performance
  - Maintain scalability of program
  - Simplify programming
- In MPI programming, the goal is often to create one agglomerated task per processor

Agglomeration Can Improve Performance
- Eliminate communication between primitive tasks agglomerated into a consolidated task
- Combine groups of sending and receiving tasks
Agglomeration Checklist
- Locality of parallel algorithm has increased
- Replicated computations take less time than the communications they replace
- Data replication doesn't affect scalability
- Agglomerated tasks have similar computational and communication costs
- Number of tasks increases with problem size
- Number of tasks suitable for likely target systems
- Tradeoff between agglomeration and code-modification costs is reasonable
Mapping
- Process of assigning tasks to processors
- Shared-memory system: mapping done by operating system
- Distributed-memory system: mapping done by user
- Conflicting goals of mapping
  - Maximize processor utilization
  - Minimize interprocessor communication

Mapping Example
[Figure: tasks grouped onto processors]
While this may reduce communication, load balance may be an issue.

Optimal Mapping
- Finding an optimal mapping is NP-hard
- Must rely on heuristics
- Metis is a popular package for partitioning graphs
  - Minimizes the number of edges between nodes in a graph
  - Edges, for our purposes, can be thought of as communication
Mapping Decision Tree
- Static number of tasks
  - Structured communication
    - Constant computation time per task
      - Agglomerate tasks to minimize communication
      - Create one task per processor
    - Variable computation time per task
      - Cyclically map tasks to processors
      - GSS (guided self-scheduling)
  - Unstructured communication
    - Use a static load-balancing algorithm
- Dynamic number of tasks
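The two static-mapping choices in the tree can be written down directly. A plain-C sketch (function names are illustrative): block mapping for the constant-cost branch, cyclic mapping for the variable-cost branch.

```c
/* Map task t (0..n-1) onto processor (0..p-1). */

/* Block mapping: contiguous chunks of tasks per processor, which
 * minimizes boundary communication when every task costs the same. */
static int block_map(int t, int p, int n) {
    return (int)((((long)p * (t + 1)) - 1) / n);
}

/* Cyclic mapping: deal tasks out round-robin, which evens out loads
 * when computation time varies systematically with t. */
static int cyclic_map(int t, int p) { return t % p; }
```

With n = 10 tasks on p = 4 processors, block mapping sends tasks 0-1 to processor 0 and tasks 7-9 to processor 3, while cyclic mapping gives processor 0 tasks 0, 4, 8.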
Mapping Strategy
- Static number of tasks
- Dynamic number of tasks
  - Frequent communications between tasks
    - Use a dynamic load-balancing algorithm
  - Many short-lived tasks
    - Use a run-time task-scheduling algorithm
    - Cilk and Galois, discussed in the next couple of weeks, do this
Mapping Checklist
- Considered designs based on one task per processor and multiple tasks per processor
- Evaluated static and dynamic task allocation
- If dynamic task allocation chosen, task allocator is not a bottleneck to performance
- If static task allocation chosen, ratio of tasks to processors is at least 10:1
Case Studies
- Boundary value problem
- Finding the maximum
- The n-body problem
- Adding data input

Boundary Value Problem
[Figure: rod between ice-water baths, surrounded by insulation]

Rod Cools as Time Progresses
[Figure: temperature profile of the rod flattening over time]

Want to Use Finite-Difference Method over Multiple Time Steps
[Figure: grid of circles — position on one axis, time on the other]
- Each circle represents a computation
- Temperature at time t+1 for a position on the rod represented by a node depends on the temperature of its neighbors at time t

Partitioning
- One data item per grid point
- Associate one primitive task with each grid point
- Two-dimensional domain decomposition

Communication
- Identify communication pattern between primitive tasks
- Each interior primitive task has three incoming and three outgoing channels

Agglomeration and Mapping
[Figure (a)-(c): agglomeration of primitive tasks onto processors]
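The computation at each circle is the standard explicit finite-difference update for the 1-D heat equation: the new temperature of an interior point is a weighted combination of its own value and its two neighbors' values at the previous time step. A serial sketch (names and the boundary convention are assumptions; r = k·Δt/Δx² is the usual stability parameter, which must satisfy r ≤ 1/2 for the explicit scheme):

```c
/* One time step of the explicit finite-difference update for a rod
 * of n points.  Endpoints are held at the ice-water temperature
 * (fixed boundary condition). */
static void update_rod(const double *u, double *u_new, int n, double r) {
    u_new[0]     = u[0];      /* boundary: ice water */
    u_new[n - 1] = u[n - 1];  /* boundary: ice water */
    for (int i = 1; i < n - 1; i++)
        u_new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
}
```

After agglomerating a column of time steps into one task, each task applies this update to its block of positions, exchanging only the boundary values with its neighbors between iterations.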
- Sequential execution time: m(n-1)χ

Parallel Execution Time
- χ: time to update one element
- n: number of elements
- m: number of iterations
- p: number of processors
- λ: message latency
- Parallel execution time: m(χ⌈(n-1)/p⌉ + 2λ)

Finding the Maximum
- Example: find the maximum error in a set of measured data (6.25% in the figure).
- This requires a reduction.
Parallel Reduction Evolution
- (a) One task receives the values of the other n-1 tasks directly: n-1 communication steps.
- (b) Two tasks each reduce n/2-1 others, then combine their partial results.
- (c) Four tasks each reduce n/4-1 others, and so on.

Binomial Trees
- Carried to its limit, this evolution makes the reduction communicate along a binomial tree.
- A binomial tree is a subgraph of a hypercube.

Finding Global Sum
- Start with sixteen values: 4 2 0 7 -3 5 -6 -3 8 1 2 3 -4 4 6 -1.
- First round of pairwise sums: 1 7 -6 4 4 5 8 2.
- Second round: 8 -2 9 10.
- Third round: 17 8.
- Final round: 25, the global sum, computed along a binomial tree.

Agglomeration
- Agglomeration leads to actual communication: each task first computes the sum of its local values, and only the partial sums travel along the binomial tree.

The n-body Problem
- Given n particles that exert forces (e.g., gravitational attraction) on one another, compute the position and velocity of every particle over a series of time steps.
Partitioning
- Domain partitioning: assume one task per particle.
- Each task holds its particle's position and velocity vector.
- In each iteration:
  - Get the positions of all other particles.
  - Compute the particle's new position and velocity.

Gather and All-gather
- A gather collects a data item from every task onto a single task; an all-gather leaves a copy of every task's item on all tasks.
- Complete graph for all-gather: every pair of tasks exchanges items directly (operations shown in the figure; no ordering required).
- Hypercube-based all-gather: tasks exchange along one hypercube dimension per step, doubling the data each task holds (ordering required).

Communication Time
- Complete graph: (p-1)(λ + n/(βp)) = (p-1)λ + n(p-1)/(βp)
- Hypercube: Σ_{i=1}^{log p} (λ + 2^{i-1}n/(βp)) = λ log p + n(p-1)/(βp)
- Here λ is the message latency and β the channel bandwidth (data items per unit time).
Adding Data Input
- [Figure: input enters through task 0, is distributed to tasks 1-3, and output is funneled back through task 0.]

Scatter
- A scatter divides a data structure on one task into pieces and delivers a different piece to every task.
- Scatter in log p steps: in each step a task sends half of the data remaining in its system buffer to a task that has none; after ⌈log p⌉ steps every task's piece is in its application buffer.
Summary: Task/Channel Model
- Parallel computation = a set of tasks interacting through channels.
- Good designs:
  - Maximize local computations
  - Minimize communications
  - Scale up

Summary: Design Steps
- Partition computation
- Agglomerate tasks
- Map tasks to processors
- Goals:
  - Maximize processor utilization
  - Minimize inter-processor communication
Summary: Fundamental Algorithms
- Reduction
- Gather and scatter
- All-gather