CS 495 Senior Project Phase I
Parallel Computing Concepts
October 22, 2014

Schedule
• Today: Parallel Computing Concepts, Patternlets
• Friday, Monday, …: Patternlets, etc.
• Source: CSinParallel: Parallel Computing in the Computer Science Curriculum (http://csinparallel.org/), specifically the Parallel Computing Concepts module and the Patternlets in Parallel Computing module
• Reminder: Requirements & Specifications due on Friday

Outline
• Motivation
• Terminology
• Parallel speedup
• Options for communication
• Issues in concurrency
• Patternlets in parallel computing

Motivation
• Moore's "Law": an empirical observation made by Intel co-founder Gordon Moore in 1965 that the number of components in computer circuits had doubled each year since 1958
• Four decades later, that number has continued to double every two years or less
• But the speedups in software performance were due to increasing clock speeds
• This was the era of the "free lunch": just wait 18-24 months and your software runs faster
• Increased clock speed, however, means increased heat

[Figure: actual and projected chip temperatures over time; at projected clock-scaling rates, chip heat would pass that of a hot plate and approach that of the sun around 2020. Source: J. Adams, 2014 CCSC:MW Conference Keynote]

• Around 2005, manufacturers stopped increasing clock speed; instead, they created multi-core CPUs, and the number of cores per CPU chip is growing exponentially
• Most software has been designed for a single core; this is the end of the "free lunch" era
• Future software development will require understanding how to take advantage of multi-core systems

Terminology
• parallelism: multiple (computer) actions physically taking place at the same time
• concurrency: programming in order to take advantage of parallelism (or virtual parallelism)
• Parallelism takes place in hardware; concurrency takes place in software
• sequential programming: programming for a single core
• concurrent programming: programming for multiple cores or multiple computers
• process: the execution of a program
• thread: a sequence of execution within a program; each process has at least one thread
• multi-core computing: computing with systems that provide multiple computational circuits per CPU package
• distributed computing: computing with systems consisting of multiple computers connected by computer network(s)
• Systems can have both
• data parallelism: the same processing is applied to multiple subsets of a large data set in parallel
• task parallelism: different tasks or stages of a computation are performed in parallel (the sketch below contrasts the two)
• shared memory multiprocessing: e.g., a multi-core system, and/or multiple CPU packages in a single computer, all sharing the same main memory
• cluster: multiple networked computers managed as a single resource and designed for working as a unit on large computational problems
• grid computing: distributed systems at multiple locations, typically under separate management, coordinated for working on large-scale problems
• cloud computing: computing services accessed via networking on large, centrally managed clusters at data centers, typically at unknown remote locations
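To make the data/task distinction concrete, here is a minimal sketch in C with OpenMP (the toolchain the CSinParallel patternlets use); the array size, the two example tasks, and the file name in the compile command are illustrative assumptions, not part of the module:

    /* par_demo.c: data vs. task parallelism (illustrative sketch).
       Compile with: gcc -fopenmp -o par_demo par_demo.c            */
    #include <stdio.h>
    #include <omp.h>

    #define N 16                      /* illustrative data-set size */

    int main(void) {
        int a[N];

        /* Data parallelism: the SAME operation (squaring) is applied to
           different subsets of the array; the loop iterations are
           divided among the threads. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++) {
            a[i] = i * i;
        }
        printf("a[%d] = %d\n", N - 1, a[N - 1]);

        /* Task parallelism: DIFFERENT tasks run at the same time,
           one per section. */
        #pragma omp parallel sections
        {
            #pragma omp section
            printf("Task A (e.g., compute a sum) on thread %d\n",
                   omp_get_thread_num());
            #pragma omp section
            printf("Task B (e.g., compute a maximum) on thread %d\n",
                   omp_get_thread_num());
        }
        return 0;
    }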
Parallel Speedup
• The speedup of a parallel algorithm over a corresponding sequential algorithm is the ratio of the compute time for the sequential algorithm to the compute time for the parallel algorithm.
• If the speedup factor is n, then we say we have n-fold speedup.
• The observed speedup depends on all implementation factors, including:
• Number of processors
• Other processes running at the same time
• Communication overhead
• Synchronization overhead
• Inherently sequential computation
• Rarely do n processors give n-fold speedup, but occasionally you get better than n-fold speedup

Amdahl's Law
• Amdahl's Law is a formula for estimating the maximum speedup of an algorithm that is part sequential and part parallel:

    overall speedup = 1 / ((1 - P) + P/S)

• where P is the proportion of the algorithm's running time that can be parallelized
• and S is the speedup factor for that portion of the algorithm due to parallelization
• Note that the sequential portion has a disproportionate effect: for example, with P = 0.9 and S = 10, the overall speedup is 1 / (0.1 + 0.09) ≈ 5.3, far short of 10-fold.

Options for Communication
• message passing: communicating with basic operations send and receive to transmit information from one computation to another (see the MPI sketch at the end of this handout)
• shared memory: communicating by reading and writing local memory locations that are accessible by multiple computations
• distributed memory: some parallel computing systems provide a service for sharing memory locations on a remote computer system, enabling non-local reads and writes to a memory location for communication

Issues in Concurrency
• Fault tolerance is the capacity of a computing system to continue to satisfy its specification in the presence of faults (causes of error).
• Scheduling means assigning computations (processes or threads) to processors (cores, distributed computers, etc.) over time.
• Mutually exclusive access to shared resources means that at most one computation (process or thread) can access a resource (such as a shared memory location) at a time. This is one of the requirements for correct interprocess communication (IPC). (See the OpenMP critical-section sketch at the end of this handout.)

Patternlets for Shared Memory
• Log into Linux
• Browse to the course webpage and click on the lab exercises link
• Launch a terminal window and log into csserver
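As referenced under "Options for Communication," here is a minimal sketch of message passing in C with MPI; the payload value, the tag, the file name, and the run command are illustrative assumptions, not code from the module:

    /* msg.c: message-passing sketch.
       Compile and run with something like:
       mpicc -o msg msg.c && mpirun -np 2 ./msg                      */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char** argv) {
        int rank, value;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I? */

        if (rank == 0) {
            value = 42;                        /* arbitrary payload */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("process 1 received %d from process 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }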
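And, as referenced under "Issues in Concurrency," here is a sketch of mutually exclusive access to a shared counter using OpenMP's critical construct (one mechanism among several; a pthread mutex would serve the same purpose). The thread and iteration counts are illustrative:

    /* mutex_demo.c: mutual-exclusion sketch.
       Without the critical section, concurrent counter++ updates can
       be lost; with it, the final value is reliably 4 * 100000.      */
    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        long counter = 0;

        #pragma omp parallel num_threads(4)
        {
            for (long i = 0; i < 100000; i++) {
                #pragma omp critical   /* at most one thread at a time */
                counter++;
            }
        }

        printf("counter = %ld\n", counter);    /* expect 400000 */
        return 0;
    }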