Concurrent Programming

The Cunning Plan
• We’ll look into:
– What concurrent programming is
– Why you care
– How it’s done
• We’re going to skim over *all* the
interesting details
One-Slide Summary
• There are many different ways to do concurrent
programming
• There can be more than one thread running at a time
• We need synchronization primitives (e.g.
semaphores) to deal with shared resources
• Message passing is complicated
What?
• Concurrent Programming
– using multiple threads on a single machine
• the OS simulates concurrency, or
• we use multiple cores/processors
– using message passing
• memory is not shared between threads
• more general in terms of hardware requirements
What (Shorter version)
• There are a million different ways to do
concurrent programming
• We’ll focus on three:
– co-begin blocks
– threads
– message passing
Concurrent Programming: Why?
1. Because it is intuitive for some
problems (say you’re writing httpd)
2. Because we need better-than-sequential
performance
3. Because the problem is inherently
distributed (e.g. BitTorrent)
Coding Challenges
• How do you divide the problem across
threads?
– easy: matrix multiplication using threads
– hard: heated plate using message passing
– harder: n-body simulation for large n
One Slide on Co-Begin
• We want to execute commands simultaneously, m’kay. Solution:

int x;
int y;
// ...
run-in-parallel {
    functionA(&x) | functionB(&x, &y)
}

[Diagram: Main forks into threads A and B, which then join back into Main]
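A co-begin block is not standard C, but OpenMP’s sections construct expresses the same idea. A minimal sketch (the use of OpenMP and the function bodies are assumptions of this example, not something the slides prescribe):

#include <stdio.h>

void functionA(int *x) { *x = 1; }              /* hypothetical body */
void functionB(int *x, int *y) { *y = *x + 1; } /* hypothetical body */

int main(void) {
    int x = 0, y = 0;
    /* Each section may run in its own thread, like the
       run-in-parallel block above. Note: both sections touch x,
       just as on the slide; without synchronization that is a
       data race (more on this later). */
    #pragma omp parallel sections
    {
        #pragma omp section
        functionA(&x);
        #pragma omp section
        functionB(&x, &y);
    }
    printf("x=%d y=%d\n", x, y);
    return 0;
}

Compile with cc -fopenmp; without the flag the pragmas are ignored and the calls simply run sequentially.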
Threads
• Most common in everyday applications
• Instead of a run-in-parallel block,
we want explicit ways to create and
destroy threads
• Threads can all see a program’s global
variables (i.e. they share memory)
Some Syntax:

Thread mythread = new Thread(new Runnable() {
    public void run() {
        // your code here
    }
});
mythread.start();
mythread.join();
Some Syntax:

void *foo(void *arg) {
    int *x = (int *)arg;
    // your code here
    return NULL;
}

// ...
int bar = 5;
pthread_t my_id;
pthread_create(&my_id, NULL, foo, (void *)&bar);
// ...
pthread_join(my_id, NULL);
Example: Matrix Multiplication
• Given: matrices A and B
• Compute: C = A × B
• Each entry of C is the dot product of a row of A with a column of B, e.g.:

(9, 7, 4) · (2, 5, -3) = 9·2 + 7·5 + 4·(-3) = 18 + 35 - 12 = 41
Matrix Multiplication ‘Analysis’
• We have:
– p = 4, size(A) = (p, q)
– q = 3, size(B) = (q, r)
– r = 4, size(C) = (p, r)
• Complexity:
– p × r elements in C
– O(q) operations per element
– so O(p · q · r) operations in total
• Note: calculating each element of C is independent of the other elements
Matrix Multiplication using Threads

pthread_t threads[P][R];
struct location locs[P][R];
for (i = 0; i < P; ++i) {
    for (j = 0; j < R; ++j) {
        locs[i][j].row = i;
        locs[i][j].col = j;
        pthread_create(&threads[i][j], NULL, calc_cell,
                       (void *)&locs[i][j]);
    }
}
for (i = 0; i < P; ++i) {
    for (j = 0; j < R; ++j) {
        pthread_join(threads[i][j], NULL);
    }
}
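The slides never define calc_cell; here is a minimal sketch of what it might look like (the global matrices, the double element type, and the P/Q/R macros are assumptions matching the analysis slide):

#define P 4
#define Q 3
#define R 4

/* Hypothetical globals matching the slide's dimensions. */
double A[P][Q], B[Q][R], C[P][R];

struct location { int row; int col; };

/* Computes one entry of C: the dot product of row 'row' of A with
   column 'col' of B. O(q) work, and no synchronization is needed
   because each thread writes a distinct entry of C. */
void *calc_cell(void *arg) {
    struct location *loc = (struct location *)arg;
    double sum = 0.0;
    for (int k = 0; k < Q; ++k)
        sum += A[loc->row][k] * B[k][loc->col];
    C[loc->row][loc->col] = sum;
    return NULL;
}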
Matrix Multiplication using Threads
for each element in C:
create a thread:
call the function 'calc_cell'
for each created thread:
wait until the thread finishes
// Profit
Postmortem
• Relatively easy to parallelize:
– matrices A and B are ‘read only’
– each thread writes to a unique entry in C
– entries in C do not depend on each other
• What are some problems with this?
– overhead of creating threads
– use of shared memory
Synchronization
• So far, we have only covered how to
create & destroy threads
• What else do we need? (See title)
Synchronization
• We want to do things like:
– event A must happen before event B
and
– events A and B cannot occur simultaneously
• Is there a problem here?

Thread 1:                  Thread 2:
counter = counter + 1      counter = counter + 1

• Yes: counter = counter + 1 is a load, an add, and a store; the two threads can interleave those steps and lose an update
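A minimal C sketch of this race using pthreads (the loop count is just for demonstration):

#include <pthread.h>
#include <stdio.h>

static long counter = 0;

/* counter = counter + 1 compiles to a load, an add, and a store;
   two threads can interleave those steps and lose updates. */
static void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; ++i)
        counter = counter + 1;
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);  /* usually less than 2000000 */
    return 0;
}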
Semaphores
• A number n (initialized to some value)
• Can only increment, sem.V(), and decrement, sem.P()
• n > 0 : P() doesn’t block
• n ≤ 0 : P() blocks
• V() unblocks some waiting process
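In C these operations map onto POSIX semaphores: sem_wait is P() and sem_post is V(). A minimal sketch:

#include <semaphore.h>

int main(void) {
    sem_t sem;
    sem_init(&sem, 0, 1);  /* n starts at 1 */

    sem_wait(&sem);        /* P(): n becomes 0; would block if n <= 0 */
    /* ... use the shared resource ... */
    sem_post(&sem);        /* V(): n back to 1, waking a waiter if any */

    sem_destroy(&sem);
    return 0;
}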
More Semaphore Goodness
• Semaphores are straightforward to
implement on most types of systems
• Easy to use for resource management
(set n equal to the number of resources)
• Some additional features are common
(e.g. bounded semaphores)
Semaphore Example
• Let’s try this again:

Main:
Semaphore wes = new Semaphore(0)
// start threads 1 and 2 simultaneously

Thread 1:                  Thread 2:
counter = counter + 1      wes.P()
wes.V()                    counter = counter + 1

• Thread 2 cannot increment until Thread 1 has finished and called V(), so the two updates can no longer interleave
Semaphore Example 2
• Suppose we want two threads to “meet up” at specific points in their code:

Semaphore aArrived = new Semaphore(0)
Semaphore bArrived = new Semaphore(0)
// start threads A and B simultaneously

Thread A        Thread B
foo_a1          foo_b1
aArrived.V()    bArrived.V()
bArrived.P()    aArrived.P()
foo_a2          foo_b2

• If each thread instead does its P() before its V(), both block forever (see the next slide)
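The same rendezvous in C with pthreads and POSIX semaphores (a sketch; the printf calls stand in for foo_a1, foo_a2, foo_b1, foo_b2):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t aArrived, bArrived;

static void *threadA(void *arg) {
    (void)arg;
    printf("foo_a1\n");
    sem_post(&aArrived);   /* aArrived.V() */
    sem_wait(&bArrived);   /* bArrived.P() */
    printf("foo_a2\n");    /* runs only after foo_b1 */
    return NULL;
}

static void *threadB(void *arg) {
    (void)arg;
    printf("foo_b1\n");
    sem_post(&bArrived);   /* bArrived.V() */
    sem_wait(&aArrived);   /* aArrived.P() */
    printf("foo_b2\n");    /* runs only after foo_a1 */
    return NULL;
}

int main(void) {
    sem_init(&aArrived, 0, 0);
    sem_init(&bArrived, 0, 0);
    pthread_t a, b;
    pthread_create(&a, NULL, threadA, NULL);
    pthread_create(&b, NULL, threadB, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}

Swapping each thread's sem_wait and sem_post reproduces the deadlocking variant: both threads block on a P() that no one will ever V().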
Deadlock
• ‘Deadlock’ refers to a situation in which one or more threads are waiting for something that will never happen
• Theorem: You will, at some point in your life, write code that deadlocks
Readers/Writers Problem
• Let’s do a slightly bigger example
• Problem:
– some finite buffer b
– multiple writer threads
(only one can write at a time)
– multiple reader threads
(many can read at a time)
– can only read if no writing is happening
Readers/Writers Solution #1
int readers = 0
Semaphore mutex = new Semaphore(1)
Semaphore roomEmpty = new Semaphore(1)
Writers:
roomEmpty.P()
// write here
roomEmpty.V()
Readers/Writers Solution #1

Readers:
mutex.P()
readers++
if (readers == 1) roomEmpty.P()
mutex.V()
// read here
mutex.P()
readers--
if (readers == 0) roomEmpty.V()
mutex.V()
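The same solution in C with POSIX semaphores (a sketch; the reader/writer bodies are placeholders, and initialization must happen before any threads start):

#include <semaphore.h>

static int readers = 0;
static sem_t mutex;      /* protects the readers counter */
static sem_t roomEmpty;  /* held by the writer, or by the readers as a group */

/* Call once before starting threads:
   sem_init(&mutex, 0, 1); sem_init(&roomEmpty, 0, 1); */

void reader(void) {
    sem_wait(&mutex);
    if (++readers == 1)
        sem_wait(&roomEmpty);  /* first reader locks out writers */
    sem_post(&mutex);

    /* read here */

    sem_wait(&mutex);
    if (--readers == 0)
        sem_post(&roomEmpty);  /* last reader lets writers back in */
    sem_post(&mutex);
}

void writer(void) {
    sem_wait(&roomEmpty);
    /* write here */
    sem_post(&roomEmpty);
}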
Starvation
• Starvation occurs when a thread is
continuously denied resources
• Not the same as deadlock: it might
eventually get to run, but it needs to wait
longer than we want
• In the previous example, a writer might
‘starve’ if there is a continuous onslaught of
readers
Guiding Question
• Earlier, I said sem.V() unblocks some waiting thread
• If we don’t unblock in FIFO order, we could cause starvation
• Do we care?
Synchronization Summary
• We can use semaphores to enforce
synchronization:
– ordering
– mutual exclusion
– queuing
• There are other constructs as well
• See your local OS Prof
Message Passing
• Threads and co. rely on shared memory
• Semaphores make very little sense if they
cannot be shared between n > 1 threads
• What about systems in which we can’t
share memory?
Message Passing
• Processes (not threads) are created for us (they just exist)
• We can do the following:

blocking_send(int destination, char *buffer, int size)
blocking_receive(int source, char *buffer, int size)
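In MPI these primitives correspond roughly to MPI_Send and MPI_Recv; a minimal sketch (the tag and communicator arguments are MPI details the slide's API elides):

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char buffer[16];
    if (rank == 0) {
        strcpy(buffer, "hello");
        /* blocking_send(destination, buffer, size) */
        MPI_Send(buffer, 6, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* blocking_receive(source, buffer, size) */
        MPI_Recv(buffer, 6, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("got: %s\n", buffer);
    }
    MPI_Finalize();
    return 0;
}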
Message Passing: Motivation
• We don’t care if threads run on
different machines:
– same machine - use virtual memory
tricks to make messages very quick
– different machines - copy and send
over the network
Heated Plate Simulation
• Suppose you have a metal plate:
• Three sides are chilled to 273 K
• One side is heated to 373 K
Heated Plate Simulation
Problem: Calculate the heat distribution after some time t:

[Figure: the plate’s heat distribution at t = 10, t = 30, and t = 50]
Heated Plate Simulation
• We model the problem by dividing the
plate into small squares:
• For each time step, take the average of a
square’s four neighbors
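One time step of this update as a C sketch (the grid size and the double-buffered, Jacobi-style update are assumptions; boundary cells hold the fixed temperatures):

#define N 64

/* One time step: every interior cell becomes the average of its four
   neighbors. Writing into 'next' keeps the update order-independent. */
void step(double cur[N][N], double next[N][N]) {
    for (int i = 1; i < N - 1; ++i)
        for (int j = 1; j < N - 1; ++j)
            next[i][j] = (cur[i - 1][j] + cur[i + 1][j] +
                          cur[i][j - 1] + cur[i][j + 1]) / 4.0;
}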
Heated Plate Simulation
• Problem: need to communicate for each time step
• Sending messages is expensive…

[Diagram: the plate divided into strips owned by processes P1, P2, P3]
Heated Plate Simulation
• Problem: need to communicate for each time step
• Sending messages is expensive…
• Solution: send fewer, larger messages, limit the longest message path

[Diagram: the plate repartitioned among P1, P2, P3 to shorten the longest message path]
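A common shape for this in MPI is a halo (ghost-row) exchange: each process owns a strip of rows and swaps only its boundary rows with its neighbors each step. A sketch (the strip layout, sizes, and neighbor handling are assumptions); note that MPI_Sendrecv pairs the send and receive, so it cannot deadlock the way the next slide's code does:

#include <mpi.h>

#define N    64   /* columns */
#define ROWS 16   /* rows owned by this process */

/* strip has one ghost row above (index 0) and one below (index ROWS+1).
   'up' and 'down' are neighbor ranks, or MPI_PROC_NULL at the edges. */
void exchange_halos(double strip[ROWS + 2][N], int up, int down) {
    /* Send my top real row up; receive my upper ghost row from 'up'. */
    MPI_Sendrecv(strip[1], N, MPI_DOUBLE, up, 0,
                 strip[0], N, MPI_DOUBLE, up, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    /* Send my bottom real row down; receive my lower ghost row. */
    MPI_Sendrecv(strip[ROWS], N, MPI_DOUBLE, down, 0,
                 strip[ROWS + 1], N, MPI_DOUBLE, down, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}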
How to cause deadlock in MPI

Process 1:                           Process 2:
char *buff = "Goodbye";              char *buff = ", cruel world\n";
char *buff2 = new char[15];          char *buff2 = new char[8];
send(2, buff, 8);                    send(1, buff, 15);
recv(2, buff2, 15);                  recv(1, buff2, 8);

• Both blocking sends wait for a matching receive, so neither process ever reaches its recv
Postmortem
• Our heated plate solution does
not rely on shared memory
• Sending messages becomes complicated
in a hurry (easy to do the wrong thing)
• We need to reinvent the wheel constantly
for different interaction patterns
Example Summary
• Matrix Multiplication Example
– used threads and implicitly shared memory
– this is common for everyday applications (especially useful for servers, GUI apps, etc.)
• Heated Plate Example
– used message passing
– this is more common for big science and big business (also e.g. peer-to-peer)
– it is not how you would code your average firefox
Guiding Question
If you’re writing a GUI app
(let’s call it “firefox”)
would you prefer to use threads or
message passing?
Summary
• Looked at three ways to do concurrent
programming:
– co-begin
– threads, implicitly shared memory,
semaphores
– message passing
• Concerns of scheduling, deadlock,
starvation