csinparallel.org

Patterns and Exemplars: Compelling Strategies for Teaching Parallel and Distributed Computing to CS Undergraduates
Libby Shoop, Joel Adams, Dick Brown

Today’s messages
• Parallel Design Patterns provide an established, practical set of principles for teaching PDC
• “Exemplar” applications with multiple implemented solutions provide motivation for students and teaching materials for instructors
• Patterns and Exemplars fit together naturally and are ready for deployment

Parallel Design Patterns
• Following on the original Gang of Four design patterns work
Active work on parallel design patterns and parallel pattern languages:
• Catalog the parallel patterns used in solutions and describe a methodology for using each pattern

Past Work
• Lea: Java Concurrency Patterns book
• Mattson, Sanders, and Massingill: PPLP book
• Ralph Johnson et al.: Parallel Programming Patterns online; books of Visual C++, .NET examples
• Ortega-Arjona book
• McCool, Reinders, and Robison book
• Keutzer, Mattson, et al.: Our Pattern Language (OPL) online
• ParaPLoP Workshop on Parallel Programming Patterns (ParaPLoP ’10)
[Timeline years shown on the slide: 1999, 2004, 2010, 2010, 2011, 2012]

Pattern Approach
• Uses existing design knowledge when designing new parallel programs
• Leads to parallel software systems that are:
  – modular, adaptable, understandable, and easy to evolve
• Also provides an effective problem-solving framework and a guide for teaching about good parallel solutions

PATTERNLETS

Patternlets…
… are minimalist, scalable, executable programs, each illustrating a particular pattern’s behavior:
– Minimalist so that students can grasp the concept without non-essential details getting in the way
– Scalable so that students see different behaviors as the number of threads changes
– Executable so that
  • Instructors can use them in a live-coding demo
  • Students can use them in a hands-on exercise
Patternlets let students see the pattern in action

Existing Patternlets (so far)
• MPI Patternlets
  – SPMD
  – Master-Worker
  – Message Passing
  – Parallel For Loop (stripes)
  – Parallel For Loop (blocks)
  – Broadcast
  – Reduction
  – Scatter
  – Gather
  – Barrier
• OpenMP Patternlets
  – Fork-Join
  – SPMD
  – Master-Worker
  – Parallel For Loop (blocks)
  – Parallel For Loop (stripes)
  – Reduction
  – Private
  – Atomic
  – Critical
  – Critical2
  – Sections
  – Barrier

    /* masterWorker.c (MPI) … */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char** argv) {
        int id = -1, numProcs = -1, length = -1;
        char hostName[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &id);        // this process's rank
        MPI_Comm_size(MPI_COMM_WORLD, &numProcs);  // total number of processes
        MPI_Get_processor_name(hostName, &length);

        if ( id == 0 ) {  // process with ID == 0 is the master
            printf("Greetings from the master, #%d (%s) of %d processes\n",
                   id, hostName, numProcs);
        } else {          // processes with IDs > 0 are workers
            printf("Greetings from a worker, #%d (%s) of %d processes\n",
                   id, hostName, numProcs);
        }

        MPI_Finalize();
        return 0;
    }

Sample Executions

    $ mpirun -np 1 ./masterWorker
    Greetings from the master, #0 (node-01) of 1 processes

    $ mpirun -np 8 ./masterWorker
    Greetings from the master, #0 (node-01) of 8 processes
    Greetings from a worker, #1 (node-02) of 8 processes
    Greetings from a worker, #5 (node-06) of 8 processes
    Greetings from a worker, #3 (node-04) of 8 processes
    Greetings from a worker, #4 (node-05) of 8 processes
    Greetings from a worker, #7 (node-08) of 8 processes
    Greetings from a worker, #2 (node-03) of 8 processes
    Greetings from a worker, #6 (node-07) of 8 processes

    /* masterWorker.c (OpenMP) … */
    #include <stdio.h>
    #include <omp.h>

    int main(int argc, char** argv) {
        // #pragma omp parallel
        {
            // declared inside the block so that each thread gets its own copy
            // (declaring them before the block would make them shared: a data race)
            int id = omp_get_thread_num();
            int numThreads = omp_get_num_threads();

            if ( id == 0 ) {  // thread with ID 0 is master
                printf("Greetings from the master, #%d of %d threads\n\n",
                       id, numThreads);
            } else {          // threads with IDs > 0 are workers
                printf("Greetings from a worker, #%d of %d threads\n\n",
                       id, numThreads);
            }
        }
        return 0;
    }

Sample Executions

    $ ./masterWorker    // pragma omp parallel disabled
    Greetings from the master, #0 of 1 threads

    $ ./masterWorker    // pragma omp parallel enabled
    Greetings from a worker, #1 of 8 threads
    Greetings from a worker, #2 of 8 threads
    Greetings from a worker, #5 of 8 threads
    Greetings from a worker, #3 of 8 threads
    Greetings from a worker, #6 of 8 threads
    Greetings from the master, #0 of 8 threads
    Greetings from a worker, #4 of 8 threads
    Greetings from a worker, #7 of 8 threads

EXEMPLARS

Motivation
• Everyone in CS needs PDC
• Not everyone is naturally drawn to PDC topics
How shall we motivate every CS undergraduate to learn the PDC they will need for their careers?
Proposal: Teach PDC concepts with compelling applications.
• Some CS students are drawn by concepts and tech
• Other CS students are drawn by the applications

Exemplars
An exemplar is:
• A representative applied problem, plus
• Multiple code solutions implemented in various PDC technologies, with commentary

Exemplar A (from EAPF Practicum)
• Compute π via numerical integration
• Implemented solutions
  – Serial
  – Shared memory (OpenMP, TBB, pthreads, Windows Threads, Go language)
  – Distributed computing (MPI)
  – Accelerators (CUDA, Array Building Blocks)
• Comments:
  – Flexible uses: demo, concepts, tech, compare
  – But not a compelling application

Exemplar B (from EAPF Practicum)
• Drug design
• Implemented solutions
  – Serial
  – Shared memory (OpenMP, Boost threads, Go language)
  – Map-reduce framework (Hadoop)

[Slide background: excerpt from “Introduction to the Drug Design Exemplar”]
Problem definition
An important problem in the biological sciences is the drug design problem. The goal is to find small molecules, called ligands, that are good candidates for use as drugs.
At a high level, the problem is simple to state. A protein associated with an interesting disease is identified. The three-dimensional structure of a target protein for the desired drug is found by some means (experimentally or through a molecular modeling computation). A collection of ligands is tested against the protein using a docking algorithm: for every orientation of the ligand relative to the protein, a computation tests if the ligand binds with the protein in useful ways (for example, tying up a biologically active region on the protein). A score is set depending on these binding properties and the best scores are flagged to identify the ligands that would make good drug candidates.

[Slide background: excerpt from “Application Architecture Level Patterns”]
The application architecture level patterns constitute the highest level in the OPL hierarchy of patterns, and concern the architectural design of large software. As described in the module Introduction to Parallel Design Patterns, there are two kinds of application architecture level patterns: structural patterns describe the overall organization of a software application, and how …

Exemplar B (from EAPF Practicum)
• Comments
  – Compelling application
  – Molecular dynamics, docking algorithm
  – Substitute for docking algorithm to score ligands: (score is maximal match count)
    • Relates to genetic alignment algorithm
    • Multiple ways to scale: # ligands, ligand length, # cores
    • Random strings with random lengths for variable computational load per ligand

[Slide background: excerpt from the exemplar text, cut off at the right edge of the screenshot]
· Proteins and ligands will be represented as (randoml…
· The docking-problem computation will be represente… a protein string P. The score for a pair [L, P] will be t… characters among all possibilities when L is compare… allowing possible insertions and deletions. For exam… is the string “lcacxtqvivg” then the score is 4, arising … segment of P:
[Alignment figure: protein row “l c a c x e t q v i v g”, ligand row “c x t b c r v”]
This is not the only comparison of that ligand to that … characters. Another is
[Alignment figure: protein row “l c a c x e t q v i v g”, ligand row “c x t r b c v”]
However, there is no comparison that matches five c… right, so the score is 4.

Exemplars + Patterns
• Exemplar implementations offer a rich opportunity for learning patterns
• Examples
  – π as area (among 8 PDC implementations):
    • Data Decomposition, Geometric Decomposition; Parallel For Loop, Master-Worker, Strict Data Parallel, Distributed Array; SIMD, Thread Pool, Message Passing, Collective Communication, Mutual Exclusion
  – Drug design (among 4 PDC implementations):
    • Map-Reduce; Data Decomposition; Parallel For Loop, Fork-Join, BSP, Master-Worker, Task Queue, Shared Array, Shared Queue; Thread Pool, Message Passing, Mutual Exclusion
[Figures: π as area; Drug design]

Conclusion
• Patterns – a meaning for “parallel thinking,” best practice from industry
• Patternlets – minimalist, scalable, executable programs, each illustrating a particular pattern’s behavior
• Exemplars – motivation, hands-on/demo, teaching resource, opportunities for PDC
• These are naturally combined and ready for deployment