Programming with Concurrency: Threads, Actors and Coroutines
Zhen Li, Eileen Kraemer
Computer Science Department, University of Georgia
EduPar 2013, Boston, MA, USA

Motivation & Observations
Motivation
- Users depend on the effectiveness, efficiency, and reliability of parallel and distributed computing, which now permeates most computing activities
- CS education research and materials for teaching concurrency, particularly to undergraduates, are underdeveloped
Observations
- In existing courses on concurrency, a disconnect often occurs between conceptual knowledge and practical skills
- In the absence of hands-on practice, students do not retain the conceptual knowledge
- Practical skills of interest to students: programming with concurrency constructs in Java, Scala and Python

Principle & Implementation
Course Design Principle:
- Provide repeated practice to support acquisition of integrated conceptual knowledge (concurrency constructs) and practical skills (programming languages)
Student Demographics:
- Little prior knowledge of concurrency topics
- Some level of familiarity with 1 to 2 programming languages
Implementation:
- A scaffolding approach for the acquisition of a 2nd and 3rd programming language
- A hands-on, practice-centered learning environment

Course Design

Elements of Teaching
- Models of concurrency
- Approaches to concurrency
- Classical problems in concurrency

Doctoral Prospectus by Zhen Li, Computer Science Department, University of Georgia

Models of Concurrency
Shared memory model
- Single, unified memory image
- Synchronization: different computing processes communicate through this shared memory
Message passing model
- Private memory
- Synchronization: different computing processes communicate through the exchange of messages

Approaches to Concurrency
Thread-based approach
- Most operating systems (kernel level, user level, hybrid)
- Java, C, C++, etc.
Actor-based approach
- Web frameworks (LiftWeb, SOAP); chat & messaging (Facebook, Twitter)
- Scala, Erlang
Coroutine-based approach
- Network library (Gevent)
- Haskell, Python

Approaches to Concurrency
Java
    Approach: threads
    Constructs: java.lang.Object, java.lang.Runnable, java.lang.Thread; Runnable interface; wait(), notify(), notifyAll()
    General design procedure: discern shared (passive) vs. thread (active) objects; apply monitor pattern
Scala
    Approach: actors
    Constructs: scala.actors._, Actor._; !, receive, react, mailbox-related functions
    General design procedure: design protocols of message types and behaviors; apply react pattern
Python
    Approach: coroutines
    Constructs: PEP 342 enhanced generator functions; yield, send, next, StopIteration exception
    General design procedure: discern shared (passive) objects; discern coroutine and progress conditions; apply generator pattern

Classical Problems in Concurrency
- Multi-tasking (race condition): ornamental garden; sum and worker
- Conditional synchronization: bank account; bounded buffer
- Deadlock & fairness: dining philosophers; readers and writers
- Multiple issues combined: party matching; sleeping barber; book inventory; single-lane bridge

Content and Course Implementation

Overview of Parallelism and Concurrency (2 weeks)
Content
- Multi-core architecture
- Two models of concurrency
Reading
- Introduction to Parallel Computing (book)
- Parallel Computer Architectures (book)
- Multi-core Processors and Systems (book)
Assignment
- Lab: observing multi-core architecture's performance
- Homework: survey on contemporary supercomputers

UML and Concurrency (1.5 weeks)
Content
- UML state and sequence diagrams (multimedia tutorial)
- Mapping of UML diagrams to C++ code
Assignment
- Lab: modeling a book inventory system as shared memory and message passing systems using UML

Comprehension of Concurrency with Pseudocode (4 weeks)
Content
- Introduction to the pseudocode system
- Concurrency concepts (race conditions, etc.)
- Implementation details with pseudocode
Reading
- Semaphore versus Mutex (online material)
- MySQL bug reports (online material)
Assignment
- Homework: pseudocode completion of the dining philosophers and readers-writers problems
- Lab: implement the book inventory system with pseudocode

Pseudocode System (original)
Simple Statement
    variable = expression
Simple statements are executed atomically. Assignment is an example of a simple statement.
If Statement (Conditional)
    IF condition THEN
        statement(s)
    ELSE IF condition THEN
        statement(s)
    ELSE
        statement(s)
    ENDIF
The calculation of condition is not necessarily atomic if it involves function call statements. However, the choice of branch based on a calculated condition value is executed atomically.
Examples
    total = 0
    name = "John Smith"
    condition = True
    height = 3.3

    testScore = 88
    IF testScore >= 90 THEN
        PRINTLN "A"
    ELSE IF testScore >= 80 THEN
        PRINTLN "B"
    ELSE IF testScore >= 70 THEN
        PRINTLN "C"
    ELSE
        PRINTLN "F"
    ENDIF
    Output
    B

Pseudocode System (extended)
Parallel Execution Statements
    PARA
        statement(s)
    ENDPARA
- Statements within the PARA/ENDPARA block are executed concurrently.
- Atomic statements within PARA/ENDPARA are executed in any order.
- Statements defined in a function that is called within the PARA/ENDPARA block are executed sequentially.
- Statements defined in functions that are called within a PARA/ENDPARA block are executed in any order of interleaving with simple statements within the same PARA/ENDPARA block.
- Statements defined in two functions that are called within the same PARA/ENDPARA block are executed in any order of interleaving, while statements from any one of the functions are executed in their order of definition.

Example 1
    PARA
        PRINT "hello "
        PRINT "world "
    ENDPARA
    Output
    possibility 1: hello world
    possibility 2: world hello

Example 2
    DEFINE print()
        PRINT "hi"
        PRINT "there"
    ENDDEF
    PARA
        print()
    ENDPARA
    Output
    hi there

Example 3
    DEFINE print()
        PRINT "hi"
        PRINT "there"
    ENDDEF
    PARA
        print()
        PRINT "world"
    ENDPARA
    Output
    possibility 1: world hi there
    possibility 2: hi world there
    possibility 3: hi there world

Pseudocode System (extended)
Shared Memory Concurrency
Exclusively Accessed Statement
    EXC_ACC
        statement(s)
    END_EXC_ACC
- Only appears within a function definition.
- When one function call executes statements inside an EXC_ACC/END_EXC_ACC block, other function calls that read or modify the same variables that appear inside the markers may not execute until the first function call completes or executes a WAIT function.
Wait and Notify Functions
    WAIT()
    NOTIFY()
- May only be called inside an EXC_ACC/END_EXC_ACC block.
- Once a WAIT() function starts execution, another function call that reads or modifies variables inside the EXC_ACC/END_EXC_ACC block may execute.
- Once a NOTIFY() function is executed, all WAIT() functions finish their execution.
- Both WAIT() and NOTIFY() functions are atomic.

Example 1
    x = 10
    DEFINE changeX(diff)
        EXC_ACC
            x = x + diff
        END_EXC_ACC
    ENDDEF
    PARA
        changeX(1)
        changeX(-2)
    ENDPARA
    PRINTLN x
    Output
    9

Example 2
    x = 10
    DEFINE changeX(diff)
        EXC_ACC
            WHILE x + diff < 0 DO
                WAIT()
            ENDWHILE
            x = x + diff
            NOTIFY()
        END_EXC_ACC
    ENDDEF
    PARA
        changeX(-11)
        changeX(1)
    ENDPARA
    PRINTLN x
    Output
    0

Pseudocode System (extended)
Message Passing Concurrency
Message Variable
    MESSAGE.message-name(value...)
A special message variable that carries a collection of values. The message-name is used to distinguish message variables from one another.
    m1 = MESSAGE.h("hello")
    m2 = MESSAGE.w("world")
Send Statement
    Send(message variable).To(object)
Sends the message specified by the message variable to a receiver object. A send statement is asynchronous: the sender does not wait for the message to be received, and the order in which messages are received may differ from the order in which they were sent.
    m1 = MESSAGE.h("hello")
    m2 = MESSAGE.w("world")
    Send(m1).To(r1)
    Send(m2).To(r1)
Receive Statement
    ON_RECEIVING
        message
            statement(s)
        message
            statement(s)
        ...
Accepts the next message and executes statement(s) according to the type of the message.
    CLASS Receiver
        DEFINE receive
            ON_RECEIVING
                MESSAGE.h(var)
                    PRINT var
                MESSAGE.w(var)
                    PRINTLN var
        ENDDEF
    ENDCLASS
    m1 = MESSAGE.h("hello")
    m2 = MESSAGE.w("world")
    r1 = new Receiver()
    r1.receive()
    Send(m1).To(r1)
    Send(m2).To(r1)
    Output
    possibility 1: hello world
    possibility 2: world hello

Pseudocode System (sample)
    CLASS Buffer
        DEFINE initialize Buffer(capacityVal)
            items = []
            capacity = capacityVal
        ENDDEF
        DEFINE produce(itemVal)
            EXC_ACC
                WHILE length(items) >= capacity DO
                    WAIT()
                ENDWHILE
                items[length(items)] = itemVal
                NOTIFY()
            END_EXC_ACC
        ENDDEF
        DEFINE consume()
            EXC_ACC
                WHILE length(items) < 1 DO
                    WAIT()
                ENDWHILE
                item = items[0]
                del items[0]
                NOTIFY()
            END_EXC_ACC
            return item
        ENDDEF
    ENDCLASS

    CLASS Producer
        DEFINE initialize Producer(bufferVal)
            buffer = bufferVal
        ENDDEF
        DEFINE run()
            WHILE True DO
                buffer.produce(randNum(0,10))
            ENDWHILE
        ENDDEF
    ENDCLASS

    CLASS Consumer
        DEFINE initialize Consumer(bufferVal)
            buffer = bufferVal
        ENDDEF
        DEFINE run()
            WHILE True DO
                PRINTLN buffer.consume()
            ENDWHILE
        ENDDEF
    ENDCLASS

Midterm Exam (1 week)
Content
- Comprehension of a concurrent system (single-lane bridge), expressed using pseudocode

Midterm Exam Performance
    Group            Shared Memory (mean)   Message Passing (mean)   Overall (mean)
    S (9 students)   56.67 / 100 (1st)      81.72 / 100 (2nd)        138.39 / 200
    D (7 students)   76.14 / 100 (2nd)      65.93 / 100 (1st)        142.07 / 200
    All              65.19 / 100            74.81 / 100

Concurrent Program Comprehension
Shared memory version:
    PARA
        redCarA.run()
        redCarB.run()
        blueCarA.run()
    ENDPARA
Suppose redCarA has called the redEnter() method on line 9 but has not returned. Then redCarB invokes its run() method and calls the redEnter() method but also has not returned. Decide if each of the scenarios below (k-t) could happen immediately after the above. Circle YES if the sequence is possible; otherwise, circle NO.
Then please provide a brief explanation of your reasoning.
(m) redCarB returns from the redEnter() method, then calls the redExit() method on line 19 and blocks on the EXC_ACC marker on line 20.
    YES    NO    Explanation:

Message passing version:
    PARA
        bridge.start()
        redCarA.start()
        redCarB.start()
        blueCarA.start()
    ENDPARA
Suppose redCarA has sent the redEnter message but has not yet received any messages. Then redCarB invokes its start() method and sends the redEnter message but has not yet received any messages. Decide if each of the scenarios below (k-t) could happen immediately after the above. Circle YES if the sequence is possible; otherwise, circle NO. Then please provide a brief explanation of your reasoning.
(m) redCarB receives a succeedEnter message, then sends a redExit message and receives MESSAGE.succeedExit(2).
    YES    NO    Explanation:

General Misconception Hierarchy
    Level                  Code   Description
    Description Level      D1     Misconceptions of the system and/or problem descriptions
    Terminology Level      T1     Misinterpretation of a term that describes thread or process behavior
    Concurrency Level      C1     Misconceptions about thread or process behaviors
    Implementation Level   I1     Misconceptions about synchronous mechanisms
                           I2     Misconceptions about asynchronous mechanisms
    Uncertainty Level      U1     Confusion about the space of executions: including impossible execution sequences or failing to consider possible execution sequences

Misconceptions about Shared Memory
- [D1] S1: Conflate order of cars with their thread's name (#students: 3)
- [T1] S2: Misinterpret “race condition” as “different interleaving” (#students: 1)
- [T1] S3: Misinterpret the term “block on” (#students: 2)
- [C1] S4: Conflate order of method return with order of entering/exiting bridge (#students: 4)
- [C1] S5: Conflate locking with conditional waiting (#students: 9)
- [I1] S6: Misinterpret the effect of the WAIT() function and conflate waiting with continuous execution of the enclosing while loop (#students: 1)
- [I1] S7: Conflate order of method invocation/return with getting/releasing the lock (#students: 10)
- [U1] S8: Uncertainty (#students: 2)
  Increased size of the state space causes illogical (self-contradictory) reasoning or the occurrence of misconceptions not seen in simpler scenarios

Misconceptions about Message Passing
- [D1] M1: Question setting (#students: 6)
- [T1] M2: Misinterpret “race condition” as “different order of messages” (#students: 1)
- [C1] M3: Send semantics: assume ability to send depends on a condition at the receiver, or interpret send as a synchronous method call (#students: 7)
- [C1] M4: Receive semantics: assume receipt of the acknowledgement message is synchronous with the occurrence of the event (bridge entered or exited) (#students: 7)
- [I2] M5: Conflate message sending order with receiving order (#students: 6)
  Four scenarios:
  1) different senders, same receiver (covered by test problem)
  2) different senders, different receivers
  3) same sender, different receivers (covered by test problem)
  4) same sender, same receiver
- [U1] M6: Uncertainty (#students: 7)
  Increased size of the state space causes illogical (self-contradictory) reasoning or the occurrence of misconceptions not seen in simpler scenarios

Misconceptions: the “fall back” phenomenon
- Extensive training, rare misconceptions on the terminology level
- Large number of misconceptions on the description level?
fall back from uncertainty

Implementation of Concurrency (8 weeks)
Content (flipped classroom)
- Practice Java, Scala and Python without concurrency constructs
- Implement concurrency with threads, actors and coroutines
Reading
- Java API and concurrency tutorial
- Scala API and actors tutorial
- Python API, Google's Python class, and coroutine tutorial
Assignment
- Lab: implement party matching and sleeping barber with threads, actors and coroutines

Final Exam (0.5 week)
Content
- Implementation of concurrency (single-lane bridge)
- Choose one of the threads, actors or coroutines approaches to finish

Related Curriculum Topics Covered
Flynn's Taxonomy
                     Single instruction   Multiple instruction
    Single data      SISD                 MISD
    Multiple data    SIMD                 MIMD
Topics
- Why and what is parallel/distributed computing
- Concurrency
- Non-determinism
- Concurrency defects
- Shared memory
- Task/thread spawning
- Language extensions
- Tasks and threads
- Synchronization
- Critical regions
- Producer-consumer
- Monitors
- Deadlocks
- Data races
- Distributed memory
- Message passing
- Functional/logic languages
- Work stealing
- Tools to detect concurrency defects

Survey Data
- Surveys on effort and preferences were collected with each lab and homework assignment
- Students consistently reported difficulties with shared memory systems
- In homework 2 (shared memory) and homework 3 (message passing), students were asked to write pseudocode for the bounded-buffer and dining philosophers problems discussed in class. In a survey conducted after homework 3, only 1 student indicated that message passing is more difficult, and 10 indicated that shared memory is more difficult
- In lab 2 (shared memory) and lab 3 (message passing), students were asked to design a book inventory system.
  In the post-lab survey, 8 of the 11 students who responded indicated that shared memory is more difficult, 1 indicated that message passing is more difficult, and 2 found the assignments equally difficult.

Survey Data
Midterm exam on comprehending concurrency
- 11 of the 15 students who responded indicated that questions in the shared memory section were harder to answer than those in the message passing section.
- 10 of the 15 chose the message passing section as the final graded part. Of the 5 students who chose the shared memory section, 4 took the shared memory portion in the 2nd session.
- Of these 15 students, 13 chose correctly, in that they selected the section in which they actually scored higher. The 2 students who chose incorrectly chose the shared memory section but actually scored slightly higher on the message passing section.

Conclusions
- This class is challenging, especially for undergraduate students who have limited knowledge of concurrency and are inexperienced in programming.
- Some students report time pressure in completing homework and lab projects. Of the 3 students who withdrew from the course and gave feedback, 2 cited an unmanageable course workload as their major reason for dropping.
- The pseudocode system is useful for students to comprehend and reason about concurrent systems, but it requires further refinement of its wording and further validation.
- A standard glossary of well-defined terminology is essential.
- Shared memory is harder for students to understand, design, write pseudocode for, and reason about.

Thanks! Questions?
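One way to sanity-check the pseudocode semantics against a real runtime is to translate an example by hand. The sketch below renders the second changeX example from the shared memory pseudocode slides in Python. It assumes that an EXC_ACC block corresponds to holding a `threading.Condition`, WAIT() to `wait()`, and NOTIFY() to `notify_all()`; that mapping and the name `change_x` are assumptions made for illustration, not part of the course materials.

```python
import threading

x = 10
cond = threading.Condition()   # assumed stand-in for EXC_ACC plus WAIT/NOTIFY


def change_x(diff):
    """Add diff to x, waiting while the result would be negative."""
    global x
    with cond:                  # EXC_ACC ... END_EXC_ACC
        while x + diff < 0:     # WHILE x + diff < 0 DO
            cond.wait()         #     WAIT() releases the lock while waiting
        x = x + diff
        cond.notify_all()       # NOTIFY() wakes every waiting call


threads = [
    threading.Thread(target=change_x, args=(-11,)),
    threading.Thread(target=change_x, args=(1,)),
]
for t in threads:               # PARA: start both calls concurrently
    t.start()
for t in threads:               # ENDPARA: wait for both to finish
    t.join()

print(x)                        # 0 under either interleaving
```

Whichever thread runs first, `change_x(-11)` can only complete after `change_x(1)` has raised x to 11, so the final value matches the pseudocode's stated output of 0.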
Monitor Pattern
    Data Object Class
        Function
            lock
            while condition is false
                wait
            execution
            unlock
        end function
    end class
    Active Object Class
        Run function
            invoking data object class's functions
        end run function
    end class

Monitor Pattern: Bounded Buffer (Buffer.java)
    import java.util.ArrayList;
    import java.util.List;

    class Buffer<T> {
        private final List<T> buf = new ArrayList<>();
        private final int capacity;

        Buffer(int capacity) {
            this.capacity = capacity;
        }

        // Wait while the buffer is full, then add the item and wake waiters.
        synchronized void produce(T item) throws InterruptedException {
            while (buf.size() >= capacity) wait();
            buf.add(item);
            notifyAll();
        }

        // Wait while the buffer is empty, then remove the oldest item and wake waiters.
        synchronized T consume() throws InterruptedException {
            while (buf.isEmpty()) wait();
            T item = buf.remove(0);
            notifyAll();
            return item;
        }
    }

Monitor Pattern: Bounded Buffer (Producer.java, Consumer.java)
    class Producer implements Runnable {
        private final Buffer<Integer> buffer;

        Producer(Buffer<Integer> buffer) {
            this.buffer = buffer;
        }

        @Override
        public void run() {
            try {
                buffer.produce(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    class Consumer implements Runnable {
        private final Buffer<Integer> buffer;

        Consumer(Buffer<Integer> buffer) {
            this.buffer = buffer;
        }

        @Override
        public void run() {
            try {
                Integer item = buffer.consume();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

React Pattern
    Data object class
        Receive message
            while condition is false
                delay processing
            processing
        End receive message
    End class
    Active object class
        Run function
            send messages
        End run function
    End class

React Pattern: Bounded Buffer (Buffer class)
    import scala.actors.Actor
    import scala.actors.Actor._

    // Message types; their definitions are not shown on the slides and are assumed here.
    case class Produce[T](item: T)
    case class Consume(consumer: Actor)
    case class Cargo[T](item: T)

    class Buffer[T](capacity: Int) extends Actor {
      private var buf: List[T] = List()

      def prod(m: T) {
        if (buf.size >= capacity) self ! Produce(m)   // full: requeue the request
        else buf = buf :+ m
      }

      def cons(consumer: Actor) {
        if (buf.isEmpty) self ! Consume(consumer)     // empty: requeue the request
        else {
          val m = buf.head
          buf = buf.tail
          consumer ! Cargo(m)
        }
      }

      def act() {
        loop {
          react {
            case Produce(m) => prod(m.asInstanceOf[T])
            case Consume(consumer) => cons(consumer)
          }
        }
      }
    }

React Pattern: Bounded Buffer (Producer class, Consumer class)
    class Producer(buffer: Actor) extends Actor {
      def prod() { buffer ! Produce(10) }
      def act() { prod() }
    }

    class Consumer(buffer: Actor) extends Actor {
      def cons() { buffer ! Consume(self) }
      def act() {
        cons()
        loop {
          react {
            case Cargo(m) => m   // item received
          }
        }
      }
    }

Generator Pattern
    Data object class
        Function
            If condition is false
                return false
            Else
                processing
                return true
            End if
        End function
    End class
    Active object class
        Run function
            while invoking data object class's function returns false
                yield
            yield
        End run function
    End class

Generator Pattern: Bounded Buffer (Buffer class)
    class Buffer:
        def __init__(self, capacity):
            self.capacity = capacity
            self.items = []

        def produce(self, item):
            # Return False when the buffer is full so the caller can yield and retry.
            if len(self.items) >= self.capacity:
                return False
            self.items.append(item)
            return True

        def consume(self):
            # Return None when the buffer is empty so the caller can yield and retry.
            if not self.items:
                return None
            return self.items.pop(0)

Generator Pattern: Bounded Buffer (producer generator, consumer generator)
    def producer(buf):
        while True:
            while not buf.produce(10):
                yield          # buffer full: let other coroutines run
            yield

    def consumer(buf):
        while True:
            item = buf.consume()
            while item is None:
                yield          # buffer empty: let other coroutines run
                item = buf.consume()
            yield
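The generator slides show the buffer and the two generators but not what drives them. As a minimal sketch of how the pieces could fit together, the block below adds a round-robin scheduler that resumes each generator in turn. The scheduler `run_round_robin`, the finite `values` and `count` parameters, and the `out` list are hypothetical additions for illustration; the slides' generators loop forever and always produce 10.

```python
from collections import deque


class Buffer:
    """Passive bounded buffer: methods never block, they report failure."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def produce(self, item):
        if len(self.items) >= self.capacity:
            return False          # full: caller should yield and retry
        self.items.append(item)
        return True

    def consume(self):
        if not self.items:
            return None           # empty: caller should yield and retry
        return self.items.popleft()


def producer(buf, values):
    """Coroutine that puts each value into buf, yielding while buf is full."""
    for value in values:
        while not buf.produce(value):
            yield                 # progress condition not met
        yield                     # give other coroutines a turn


def consumer(buf, count, out):
    """Coroutine that takes count items from buf, yielding while it is empty."""
    for _ in range(count):
        item = buf.consume()
        while item is None:
            yield
            item = buf.consume()
        out.append(item)
        yield


def run_round_robin(tasks):
    """Hypothetical scheduler: resume each generator in turn until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)    # still running: back of the queue
        except StopIteration:
            pass                  # finished: drop it


received = []
buf = Buffer(capacity=2)
run_round_robin([producer(buf, range(5)), consumer(buf, 5, received)])
print(received)                   # [0, 1, 2, 3, 4]
```

Because each generator runs until its next `yield`, no locking is needed: the buffer is only ever touched by one coroutine at a time, which is the main contrast with the monitor version.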