csinparallel.org
Patterns and Exemplars: Compelling
Strategies for Teaching Parallel and
Distributed Computing to CS
Undergraduates
Libby Shoop
Joel Adams
Dick Brown
Today’s messages
• Parallel Design Patterns provide an
established, practical set of principles for
teaching PDC
• “Exemplar” example applications with
multiple implemented solutions provide
motivation for students and teaching
materials for instructors
• Patterns and Exemplars fit together naturally
and are ready for deployment
Parallel Design Patterns
• Following on the original Gang of Four design
patterns work
Active work on parallel design patterns and
parallel pattern languages:
• Catalog the parallel patterns used in solutions and
describe a methodology for using them
Past Work
• Lea (1999):
– Java Concurrency Patterns book
• Mattson, Sanders, and Massingill (2004):
– PPLP book
• Ralph Johnson et al.:
– Parallel Programming Patterns online;
books of Visual C++, .NET examples (2010, 2011)
• Ortega-Arjona book (2010)
• McCool, Reinders, and Robison book (2012)
• Keutzer, Mattson, et al.:
– Our Pattern Language (OPL) online
• ParaPLoP ’10: Workshop on Parallel
Programming Patterns
Pattern Approach
• Using existing design knowledge when
designing new parallel programs
• Leads to parallel software systems that are:
– modular, adaptable, understandable, and easy
to evolve
• Also provides an effective problem-solving
framework and a guide for teaching about
good parallel solutions
PATTERNLETS
Patternlets…
… are minimalist, scalable, executable programs,
each illustrating a particular pattern’s behavior:
– Minimalist so that students can grasp the concept
without non-essential details getting in the way
– Scalable so that students see different behaviors as
the number of threads changes
– Executable so that
• Instructors can use them in a live-coding demo
• Students can use them in a hands-on exercise
Patternlets let students see the pattern in action
Existing Patternlets (so far)
• MPI
– SPMD
– Master-Worker
– Message Passing
– Parallel For Loop (stripes)
– Parallel For Loop (blocks)
– Broadcast
– Reduction
– Scatter
– Gather
– Barrier
• OpenMP
– Fork-Join
– SPMD
– Master-Worker
– Parallel For Loop (blocks)
– Parallel For Loop (stripes)
– Reduction
– Private
– Atomic
– Critical
– Critical2
– Sections
– Barrier
/* masterWorker.c (MPI) … */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
    int id = -1, numProcs = -1, length = -1;
    char hostName[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);         // this process's rank
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);   // total number of processes
    MPI_Get_processor_name(hostName, &length);  // name of the host we run on

    if ( id == 0 ) {  // process with ID == 0 is the master
        printf("Greetings from the master, #%d (%s) of %d processes\n", id, hostName, numProcs);
    } else {          // processes with IDs > 0 are workers
        printf("Greetings from a worker, #%d (%s) of %d processes\n", id, hostName, numProcs);
    }

    MPI_Finalize();
    return 0;
}
Sample Executions
$ mpirun -np 1 ./masterWorker
Greetings from the master, #0 (node-01) of 1 processes
$ mpirun -np 8 ./masterWorker
Greetings from the master, #0 (node-01) of 8 processes
Greetings from a worker, #1 (node-02) of 8 processes
Greetings from a worker, #5 (node-06) of 8 processes
Greetings from a worker, #3 (node-04) of 8 processes
Greetings from a worker, #4 (node-05) of 8 processes
Greetings from a worker, #7 (node-08) of 8 processes
Greetings from a worker, #2 (node-03) of 8 processes
Greetings from a worker, #6 (node-07) of 8 processes
/* masterWorker.c (OpenMP) … */
#include <stdio.h>
#include <omp.h>

int main(int argc, char** argv) {
    // Uncomment the pragma below to run the block on multiple threads.
    // #pragma omp parallel
    {
        // Declared inside the block so that each thread has its own copy.
        int id = omp_get_thread_num();
        int numThreads = omp_get_num_threads();
        if ( id == 0 ) {  // thread with ID 0 is master
            printf("Greetings from the master, #%d of %d threads\n\n", id, numThreads);
        } else {          // threads with IDs > 0 are workers
            printf("Greetings from a worker, #%d of %d threads\n\n", id, numThreads);
        }
    }
    return 0;
}
Sample Executions
$ ./masterWorker
// pragma omp parallel disabled
Greetings from the master, #0 of 1 threads
$ ./masterWorker
// pragma omp parallel enabled
Greetings from a worker, #1 of 8 threads
Greetings from a worker, #2 of 8 threads
Greetings from a worker, #5 of 8 threads
Greetings from a worker, #3 of 8 threads
Greetings from a worker, #6 of 8 threads
Greetings from the master, #0 of 8 threads
Greetings from a worker, #4 of 8 threads
Greetings from a worker, #7 of 8 threads
EXEMPLARS
Motivation
• Everyone in CS needs PDC
• Not everyone is naturally drawn to PDC topics
How shall we motivate every CS
undergraduate to learn the PDC they
will need for their careers?
Proposal: Teach PDC concepts with compelling
applications.
• Some CS students are drawn by concepts and technology
• Other CS students are drawn by the applications
Exemplars
An exemplar is:
• A representative applied problem
plus
• multiple code solutions implemented in
various PDC technologies, with commentary
Exemplar A (from EAPF Practicum)
• Compute π via numerical integration
• Implemented solutions
– Serial
– Shared memory (OpenMP, TBB, pthreads, Windows
Threads, go language)
– Distributed computing (MPI)
– Accelerators (CUDA, Array Building Blocks)
• Comments:
– Flexible uses: demo, concepts, tech, compare
– But not a compelling application
Exemplar B (from EAPF Practicum)
• Drug design
• Implemented solutions
– Serial
– Shared memory (OpenMP, boost threads, go lang)
– Map-reduce framework (Hadoop)
Excerpt shown on slide: Introduction to the Drug Design Exemplar
Problem definition
An important problem in the biological sciences is the drug design problem. The goal is to find
small molecules, called ligands, that are good candidates for use as drugs.
At a high level, the problem is simple to state. A protein associated with an interesting disease
is identified. The three-dimensional structure of a target protein for the desired drug is found by
some means (experimentally or through a molecular modeling computation). A collection of
ligands is tested against the protein using a docking algorithm: for every orientation of the
ligand relative to the protein, a computation tests if the ligand binds with the protein in useful
ways (for example, tying up a biologically active region on the protein). A score is set
depending on these binding properties and the best scores are flagged to identify the ligands
that would make good drug candidates.
Excerpt shown on slide: Application Architecture Level Patterns
The application architecture level patterns constitute the highest level in the OPL hierarchy of
patterns, and concern the architectural design of large software. As described in the module
Introduction to Parallel Design Patterns, there are two kinds of application architecture level
patterns: structural patterns describe the overall organization of a software application, and how …
Exemplar B (from EAPF Practicum)
• Comments
– Compelling application
– Molecular dynamics, docking algorithm
– Substitute for docking algorithm to score ligands
(score is maximal match count)
• Relates to genetic alignment algorithm
• Multiple ways to scale: # ligands, ligand length, # cores
• Random strings with random lengths for variable
computational load per ligand
Excerpt shown on slide (lines cut off at the right edge):
Proteins and ligands will be represented as (randoml…
The docking-problem computation will be represente…
a protein string P. The score for a pair [L, P] will be t…
characters among all possibilities when L is compare…
allowing possible insertions and deletions. For exam…
is the string “lcacxtqvivg” then the score is 4, arising…
segment of P:
l c a c x e t q v i v g
c x t b c r v
This is not the only comparison of that ligand to that…
characters. Another is
l c a c x e t q v i v g
c x t b c r v
However, there is no comparison that matches five c…
right, so the score is 4.
Exemplars + Patterns
• Exemplar implementations offer a rich
opportunity for learning patterns
• Examples
– π as area (among 8 PDC implementations):
• Data Decomposition, Geometric Decomposition; Parallel
For Loop, Master-Worker, Strict Data Parallel, Distributed
Array; SIMD, Thread Pool, Message Passing, Collective
Communication, Mutual Exclusion
– Drug design (among 4 PDC implementations):
• Map-Reduce; Data Decomposition; Parallel For Loop, Fork-Join, BSP, Master-Worker, Task Queue, Shared Array,
Shared Queue; Thread Pool, Message Passing, Mutual
Exclusion
Conclusion
• Patterns – a meaning for “parallel thinking,”
best practice from industry
• Patternlets – minimalist, scalable, executable
programs, each illustrating a particular
pattern’s behavior
• Exemplars – motivation, hands-on/demo,
teaching resource, opportunities for PDC
• These are naturally combined and ready for
deployment