Concurrency

• Concurrency can occur at four levels:
1. Machine instruction level - a machine may have both an adder and a multiplier that are used at the same time.
2. High-level language statement level - might have a loop (a, b, c) where c from one iteration and a from the next are executed at the same time.
3. Unit level - several methods execute together.
4. Program level - several programs execute together.

Suppose we have two methods:

Populate marsh {
  for (i = 0; i < 1000; i++) {
    create frog       // high-level language statement
    create carp
    create mosquitos
  }
}

Populate prehistoric world {
  for (i = 0; i < 10; i++)
    create dinosaur(i)
}

Concurrency can occur at four levels (termed granularity):
1. Machine instruction level - "create frog" is decomposed into basic parts. If one basic instruction is to fold both sides into the center, perhaps one "processor" folds the left side and one folds the right.
2. High-level language statement level - different parts of "make frog" happen together.
3. Unit level - "populate marsh" occurs together with "populate prehistoric world".
4. Program level - several programs (to do other things not shown here) execute together.

What would be the advantages/disadvantages of each type of parallelism?

The Evolution of Multiprocessor Architectures
1. Late 1950s - one general-purpose processor and one or more special-purpose processors for input and output operations.
2. Early 1960s - multiple complete processors, used for program-level concurrency.
3. Mid-1960s - multiple partial processors, used for instruction-level concurrency.
4. Single-Instruction Multiple-Data (SIMD) machines - the same instruction goes to all processors, each with different data, e.g., vector processors.
5. Multiple-Instruction Multiple-Data (MIMD) machines - independent processors that can be synchronized (unit-level concurrency).

Making a Frog
Fold in the sides. Take the lower corner and fold it up to the top. Repeat with the other side. Fold into the middle. Repeat.

Examples
• SIMD - all do the same things at the same time.
All fold; all open; all fold again.
• Pipelined - one person does a fold, then passes it on. Problems?
• MIMD - all do different things.

• Def: A thread of control in a program is the sequence of program points reached as control flows through the program.
• Categories of concurrency:
1. Physical concurrency - multiple independent processors (multiple threads of control).
2. Logical concurrency - the appearance of physical concurrency is presented by timesharing one processor (software can be designed as if there were multiple threads of control).

What would be the advantage of logical concurrency?
Consider the TV remote as performing a "context switch". Why does one switch between multiple programs? What is the downside to switching?
Example: smart remote - ads play when you are not watching; assume a "program" doesn't continue when you aren't watching it.
• You might be an e-mail junkie...
• You might be a computer science major.
• Attraction to computer scientists.

Concerns?
• Is switching between tasks confusing? What would need to be retained?
• Is switching between tasks expensive? Would there be a minimal size at which you spawn more tasks?
• What is the gain?

• What is the gain?
- Models the actual situation better.
- Response time.
- Uses delays in processing.

Why do we want parallelism?
• Price-performance curves.
• It used to be that you paid more for a computer and got more (a linear relationship between price and performance).
• Now, for little money, you get a lot of power. As you add more money, the performance curve levels off - not an efficient way to get more performance.
• Parallelism is the answer - string cheap computers together to do more work.

What is a Thread?
• Just as a multitasking OS can run more than one process "concurrently", a process can do the same by running more than a single thread.
• Each thread is a different stream of control that can execute its instructions independently.
• Compared to a process, a thread is inexpensive to create, terminate, schedule, or synchronize.

What is a Thread?
• A process is a HEAVY-WEIGHT kernel-level entity (the process struct).
• A thread is a LIGHT-WEIGHT entity comprising the registers, stack, and some other data.
• The rest of the process struct is shared by all threads (address space, file descriptors, etc.).
• Most of the thread structure lives in user space, allowing very fast access.

So for our example
• If we had two processes to populate the marsh and to populate the prehistoric world, each process would be able to stand alone.
• If we had two threads to populate the marsh and to populate the prehistoric world, they would have some shared resources (like the table or the paper supply).

Concurrency vs. Parallelism
• Concurrency means that two or more threads can be in the "middle" of executing code.
• Only one can be on the CPU, though, at any given time.
• Parallelism actually involves multiple CPUs running threads at the same time.
• Concurrency is the illusion of parallelism.

What can threads do that can't be done by processes sharing memory?
• Answer: Nothing!... if you have
- plenty of time to kill programming,
- more time to kill processing,
- and are willing to burn money buying RAM.
• Debugging cross-process programs is tough.
• In Solaris, creating a thread is 30 TIMES FASTER than forking a process.
• Synchronization is 10 times faster with threads.
• Context switching is 5 times faster.

What Applications to Thread?
• Multiplexing (communicating two or more signals over a common channel)
- Servers
• Synchronous waiting (definition?)
- Clients
- I/O
• Event notification
• Simulations
• Parallelizable algorithms
- Shared-memory multiprocessing
- Distributed multiprocessing

Which Programs NOT to Thread?
• Compute-bound threads on a uniprocessor.
• Very small threads (threads are not free).
• Old code.
• Parallel execution of threads can interfere with each other.
• WARNING: Multithreaded applications are more difficult to design and debug than single-threaded apps. Threaded program design requires careful preparation!

Synchronization
• The problem
- Data race - occurs when more than one thread is trying to update the same piece of data.
- Critical section - any piece of code to which access needs to be controlled.
• The solution
- Mutex
- Condition variables
- Operations: init, lock, unlock

MUTEX
• A MUTual EXclusion lock allows exactly one thread access to a variable or critical section of code.
• Access attempts by other threads are blocked until the lock is released.

• Kinds of synchronization:
1. Cooperation - task A must wait for task B to complete some specific activity before task A can continue its execution, e.g., you cut the paper and then I fold it.
2. Competition - two or more tasks must use some resource that cannot be used simultaneously, e.g., we both want the scissors.

• Liveness means the unit will eventually complete its execution. I'm currently blocked from finishing my frog, but I will eventually get to finish.
• In a concurrent environment, a task can easily lose its liveness. You were supposed to wake me up when the scissors became available, but you forgot.
• If all tasks in a concurrent environment lose their liveness, it is called deadlock. I take the paper and wait for the scissors. You take the scissors and wait for the paper. Circular wait is deadlock.

Livelock: the task theoretically can finish, but it never gets the resources it needs to finish.
How do you prevent deadlock? How do you prevent livelock?

Questions?

• Methods of providing synchronization:
1. Semaphores
2. Monitors
3.
Message passing

Semaphores
• Dijkstra, 1965.
• A semaphore is a data structure consisting of a counter and a queue for storing task descriptors.
• Semaphores can be used to implement guards (controlling access) on the code that accesses shared data structures.
• Semaphores have only two operations, wait and signal (originally called P and V by Dijkstra).
• Semaphores can be used to provide both competition and cooperation synchronization.

Example
• Suppose I was in a "frog renting" business.
• I have a collection of frogs.
• I keep track of my frogs via a semaphore.
• When you come to rent a frog, if I have some, I just adjust my semaphore (count).
• If you come and I don't have one, I place you in a queue.
frogAvail = 4

• Cooperation synchronization with semaphores
- Example: a shared buffer - e.g., a holding area for frogs.
- The buffer is implemented as an ADT with the operations DEPOSIT and FETCH as the only ways to access the buffer.
- Use two semaphores for cooperation: emptyspots (number of empty spots) and fullspots (number of full spots).

• DEPOSIT must first check emptyspots to see if there is room in the buffer (for a new frog).
• If there is room, the counter of emptyspots is decremented and the value is inserted.
• If there is no room, the caller is stored in the queue of emptyspots (to wait for room).
• When DEPOSIT is finished, it must increment the counter of fullspots.

• FETCH must first check fullspots to see if there is an item.
- If there is a full spot, the counter of fullspots is decremented and the value is removed.
- If there are no values in the buffer, the caller must be placed in the queue of fullspots.
- When FETCH is finished, it increments the counter of emptyspots.
• The operations of FETCH and DEPOSIT on the semaphores are accomplished through two semaphore operations named wait and signal.
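The DEPOSIT/FETCH protocol above can be sketched with Java's java.util.concurrent.Semaphore, where acquire plays the role of wait and release the role of signal. This is a minimal illustrative sketch, not from the slides: the class name FrogBuffer and the choice of BUFLEN = 4 are invented for the example, and synchronized blocks stand in for a binary access semaphore protecting the buffer itself.

```java
import java.util.concurrent.Semaphore;

// Sketch of the DEPOSIT/FETCH buffer: emptyspots counts room left,
// fullspots counts items held. acquire() ~ wait, release() ~ signal.
class FrogBuffer {
    private static final int BUFLEN = 4;               // illustrative size
    private final Object[] buf = new Object[BUFLEN];
    private int in = 0, out = 0;
    private final Semaphore emptyspots = new Semaphore(BUFLEN); // empty spots
    private final Semaphore fullspots  = new Semaphore(0);      // full spots

    public void deposit(Object v) {
        emptyspots.acquireUninterruptibly(); // wait(emptyspots): block if no room
        synchronized (this) {                // protect the buffer itself
            buf[in] = v;
            in = (in + 1) % BUFLEN;
        }
        fullspots.release();                 // signal(fullspots): one more item
    }

    public Object fetch() {
        fullspots.acquireUninterruptibly();  // wait(fullspots): block if empty
        Object v;
        synchronized (this) {
            v = buf[out];
            out = (out + 1) % BUFLEN;
        }
        emptyspots.release();                // signal(emptyspots): one more spot
        return v;
    }
}
```

A producer thread would call deposit and a consumer fetch; each blocks exactly where the slides' wait operations block.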
Semaphores

wait(aSemaphore)
  if aSemaphore's counter > 0 then
    decrement aSemaphore's counter
  else
    put the caller in aSemaphore's queue
    attempt to transfer control to some ready task
  end

signal(aSemaphore)
  if aSemaphore's queue is empty then
    increment aSemaphore's counter
  else
    put the calling task in the task-ready queue
    transfer control to a task from aSemaphore's queue
  end

Producer Code

semaphore fullspots, emptyspots;
fullspots.count = 0;
emptyspots.count = BUFLEN;

task producer;
  loop
    -- produce VALUE --
    wait(emptyspots);   // wait for space
    DEPOSIT(VALUE);
    signal(fullspots);  // increase filled
  end loop;
end producer;

Consumer Code

task consumer;
  loop
    wait(fullspots);      // wait till not empty
    FETCH(VALUE);
    signal(emptyspots);   // increase empty
    -- consume VALUE --
  end loop;
end consumer;

Competition Synchronization with Semaphores
- A third semaphore, named access, is used to control access to the buffer itself, since trying to produce and consume at the same time may be a problem (competition synchronization).
• The counter of access will only have the values 0 and 1.
• Such a semaphore is called a binary semaphore.
- Note that wait and signal must be atomic!

Producer Code

semaphore access, fullspots, emptyspots;
access.count = 1;
fullspots.count = 0;
emptyspots.count = BUFLEN;

task producer;
  loop
    -- produce VALUE --
    wait(emptyspots);   // wait for space
    wait(access);       // wait for access
    DEPOSIT(VALUE);
    signal(access);     // relinquish access
    signal(fullspots);  // increase filled
  end loop;
end producer;

Consumer Code

task consumer;
  loop
    wait(fullspots);      // wait till not empty
    wait(access);         // wait for access
    FETCH(VALUE);
    signal(access);       // relinquish access
    signal(emptyspots);   // increase empty
    -- consume VALUE --
  end loop;
end consumer;

Semaphores
• Evaluation of semaphores:
1. Misuse of semaphores can cause failures in cooperation synchronization, e.g., the buffer will overflow if the wait of emptyspots is left out.
2.
Misuse of semaphores can cause failures in competition synchronization, e.g., the program will deadlock if the release of access is left out.

Monitors
• Concurrent Pascal, Modula, Mesa, Java.
• The idea: encapsulate the shared data and its operations to restrict access.
• A monitor is an abstract data type for shared data.

Monitor Buffer Operation

Monitors
• Evaluation of monitors:
- Support for competition synchronization is great. There is less chance for error, since the system controls access.
- Support for cooperation synchronization is very similar to that with semaphores, so it has the same problems.

Message Passing
• Message passing is a general model for concurrency.
- It can model both semaphores and monitors.
- It is not just for competition synchronization.
• Central idea: task communication is like seeing a doctor - most of the time he waits for you or you wait for him, but when you are both ready, you get together, or rendezvous (don't let tasks interrupt each other).

Message Passing
• In terms of tasks, we need:
a. a mechanism to allow a task to indicate when it is willing to accept messages;
b. a way for tasks to remember who is waiting to have a message accepted, and some "fair" way of choosing the next message.
• Def: When a sender task's message is accepted by a receiver task, the actual message transmission is called a rendezvous.

Thank You!
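The rendezvous just defined, where neither side proceeds until both are ready, can be sketched in Java with a SynchronousQueue, which has no internal capacity: put blocks until another thread takes, and take blocks until another thread puts. This is an illustrative sketch, not from the slides; the class and method names are invented for the example.

```java
import java.util.concurrent.SynchronousQueue;

// Sketch: a SynchronousQueue has zero capacity, so the handoff only
// happens when sender and receiver meet - a rendezvous.
public class RendezvousDemo {
    public static String demo() {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        final String[] received = new String[1];

        Thread receiver = new Thread(() -> {
            try {
                received[0] = channel.take(); // blocks until a sender arrives
            } catch (InterruptedException e) {}
        });
        receiver.start();

        try {
            channel.put("cut the paper");     // blocks until the receiver takes it
            receiver.join();
        } catch (InterruptedException e) {}
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println("received: " + demo());
    }
}
```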
Java Threads
• Competition synchronization with Java threads
- A method that includes the synchronized modifier disallows any other synchronized method from running on the object while it is in execution.
- If only a part of a method must be run without interference, it can be synchronized (a synchronized statement).

Java Threads
• Cooperation synchronization with Java threads
- The wait and notify methods are defined in Object, which is the root class in Java, so all objects inherit them.
- The wait method must be called in a loop.

• Basic thread operations
- A thread is created by creating a Thread or Runnable object.
- Creating a thread does not start its concurrent execution; that must be requested through the start method.
- A thread can be made to wait for another thread to finish with join.
- A thread can be suspended with sleep.

C# Threads
• Synchronizing threads
- The Interlocked class
- The lock statement
- The Monitor class
• Evaluation
- An advance over Java threads, e.g., any method can run in its own thread.
- Thread termination is cleaner than in Java (Abort).
- Synchronization is more sophisticated.

Message Passing
Concepts:
- synchronous message passing - channel
- asynchronous message passing - port (send and receive / selective receive)
- rendezvous - bidirectional communication: entry - call and accept ... reply
Models:
- channel : relabelling, choice & guards
- port : message queue, choice & guards
- entry : port & channel
Practice:
- distributed computing (disjoint memory)
- threads and monitors (shared memory)

Synchronous Message Passing - channel

Sender --send(e,c)--> Channel c --> Receiver: v = receive(c)   (one-to-one)

send(e,c) - send the value of the expression e to channel c. The process calling the send operation is blocked until the message is received from the channel.

v = receive(c) - receive a value into local variable v from channel c. The process calling the receive operation is blocked waiting until a message is sent to the channel.

cf.
distributed assignment v = e.

Demonstration of Channel
• Try to pass all objects to the final destination.
• To send: hold out the object, but it must be taken before you can do anything else.
• Advantages? Disadvantages?

Synchronous message passing applet
A sender communicates with a receiver using a single channel. The sender sends a sequence of integer values from 0 to 9 and then restarts at 0 again. The sender and receiver run in instances of ThreadPanel and display in instances of SlotCanvas.

Channel chan = new Channel();
tx.start(new Sender(chan, senddisp));
rx.start(new Receiver(chan, recvdisp));

Java implementation - channel

class Channel extends Selectable {
  Object chann = null;

  public synchronized void send(Object v) throws InterruptedException {
    chann = v;
    signal();
    while (chann != null) wait();
  }

  public synchronized Object receive() throws InterruptedException {
    block();
    clearReady();     // part of Selectable
    Object tmp = chann;
    chann = null;
    notifyAll();      // could be notify()
    return tmp;
  }
}

Java implementation - sender

class Sender implements Runnable {
  private Channel chan;
  private SlotCanvas display;

  Sender(Channel c, SlotCanvas d) { chan = c; display = d; }

  public void run() {
    try {
      int ei = 0;
      while (true) {
        display.enter(String.valueOf(ei));
        ThreadPanel.rotate(12);
        chan.send(new Integer(ei));
        display.leave(String.valueOf(ei));
        ei = (ei + 1) % 10;
        ThreadPanel.rotate(348);
      }
    } catch (InterruptedException e) {}
  }
}

Java implementation - receiver

class Receiver implements Runnable {
  private Channel chan;
  private SlotCanvas display;

  Receiver(Channel c, SlotCanvas d) { chan = c; display = d; }

  public void run() {
    try {
      Integer v = null;
      while (true) {
        ThreadPanel.rotate(180);
        if (v != null) display.leave(v.toString());
        v = (Integer) chan.receive();
        display.enter(v.toString());
        ThreadPanel.rotate(180);
      }
    } catch (InterruptedException e) {}
  }
}

Selective receive
Sender, Sender, ..., Sender[n] send on channels c1, c2, ..., cn: send(en, cn).
How should we deal with multiple channels? A select statement...
select
  when G1 and v1 = receive(chan1) => S1;
or
  when G2 and v2 = receive(chan2) => S2;
or
  when Gn and vn = receive(chann) => Sn;

Selective receive example: CARPARK - ARRIVALS (arrive) and DEPARTURES (depart) are handled by CARPARK CONTROL.

Asynchronous Message Passing - port

Sender, Sender, ..., Sender[n] --send(e,p)--> Port p --> Receiver: v = receive(p)   (many-to-one)

send(e,p) - send the value of the expression e to port p. The process calling the send operation is not blocked. The message is queued at the port if the receiver is not waiting.

v = receive(p) - receive a value into local variable v from port p. The process calling the receive operation is blocked if there are no messages queued at the port.

Asynchronous message passing applet
Two senders communicate with a receiver via an "unbounded" port. Each sender sends a sequence of integer values from 0 to 9 and then restarts at 0 again. The senders and receiver run in instances of ThreadPanel and display in instances of SlotCanvas.

Port port = new Port();
tx1.start(new Asender(port, send1disp));
tx2.start(new Asender(port, send2disp));
rx.start(new Areceiver(port, recvdisp));

Java implementation - port
The implementation of Port is a monitor that has synchronized access methods for send and receive.

class Port extends Selectable {
  Vector queue = new Vector();

  public synchronized void send(Object v) {
    queue.addElement(v);
    signal();
  }

  public synchronized Object receive() throws InterruptedException {
    block();
    clearReady();
    Object tmp = queue.elementAt(0);
    queue.removeElementAt(0);
    return tmp;
  }
}

Rendezvous - entry
Rendezvous is a form of request-reply to support client-server communication. Many clients may request service, but only one is serviced at a time.

Client: res = call(entry, req) - sends the request message; the client is suspended until the reply.
Server: req = accept(entry) - accepts the next request and performs the service.
Server: reply(entry, res) - sends the reply message, releasing the suspended client.
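The call/accept/reply pattern above can be sketched with two zero-capacity SynchronousQueues, one carrying requests and one carrying replies. This is an illustrative sketch, not the book's implementation: the class name Entry and its generic parameters are invented here, and it assumes a single server thread that replies before its next accept (so a reply always reaches the client whose request was just accepted).

```java
import java.util.concurrent.SynchronousQueue;

// Sketch of an entry: call() sends a request and suspends until the
// server's reply; accept() takes the next request; reply() releases
// the suspended client. One client is serviced at a time.
class Entry<Req, Res> {
    private final SynchronousQueue<Req> requests = new SynchronousQueue<>();
    private final SynchronousQueue<Res> replies  = new SynchronousQueue<>();

    // Client side: res = call(entry, req)
    public Res call(Req req) throws InterruptedException {
        requests.put(req);      // request message
        return replies.take();  // suspended until the reply message
    }

    // Server side: req = accept(entry)
    public Req accept() throws InterruptedException {
        return requests.take();
    }

    // Server side: reply(entry, res)
    public void reply(Res res) throws InterruptedException {
        replies.put(res);
    }
}
```

A server loop would repeatedly accept a request, perform the service, and reply; each client stays blocked inside call for the whole exchange.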