Threads

What do we have so far?
 The basic unit of CPU utilization is a process.
 To run a program (a sequence of code), create a process.
 Processes are well protected from one another.
 Switching between processes is fairly expensive.
 Communication between processes is done through inter-process communication (IPC) mechanisms.
 Running a process requires the memory management system.
 Process I/O is done by the I/O subsystem.
Processes for all concurrency needs?
 Consider developing a PC game:
 Different code sequences for different characters (soldiers, cities, airplanes, cannons, user-controlled heroes).
 Each of the characters is more or less independent.
 We can create a process for each character.
 Any drawbacks?
 The action of a character usually depends on the game state (locations of other characters).
 Implications for a process-based implementation of characters?
A lot of context switching.
What do we really need for a PC game?
 A way to run different sequences of code (threads of control) for different characters.
Processes do this.
 A way for different threads of control to share data effectively.
Processes are NOT designed to do this.
 Protection is not very important; a game is one application anyway.
A process is overkill.
 Switching between threads of control must be as efficient as possible.
Context switching is known to be expensive!!!
 Threads are created to do all of the above.
Thread
 Process context:
Process ID, process group ID, user ID, and group ID
Environment
Working directory
Program instructions
Registers (including the PC)
Stack
Heap
File descriptors
Signal actions
Shared libraries
Inter-process communication tools
 What is absolutely needed to run a sequence of code (a thread of control)?

Process/Thread context
 What is absolutely needed to support a stream of instructions, given the process context?
Process ID, process group ID, user ID, and group ID
Environment
Working directory
Program instructions
Registers (including the PC)
Stack
Heap
File descriptors
Signal actions
Shared libraries
Inter-process communication tools
(The next slide gives the answer: just the registers, including the PC, and a stack.)
Process and Thread
Threads
 Threads are executed within a process.
 Sharing the process context means:
Easy inter-thread communication.
No protection among threads, but threads are intended to be cooperative.
 Thread context: PC, registers, a stack, misc. info.
 Much smaller than the process context!!
Faster context switching.
Faster thread creation.
Threads
 Threads are also called light-weight processes.
 Traditional processes are considered heavy-weight.
 A process can be single-threaded or multithreaded.
 Threads have become so ubiquitous that almost all modern computing systems use the thread as the basic unit of CPU utilization.
More about threads
 OS view: a thread is an independent stream of instructions that can be scheduled to run by the OS.
 Software developer view: a thread can be considered a “procedure” that runs independently of the main program.
Sequential program: a single stream of instructions in a program.
Multi-threaded program: a program with multiple streams.
 Multiple threads are needed to use multiple cores/CPUs.
Threads…
 Exist within processes
 Die if the process dies
 Use process resources
 Duplicate only the essential resources needed for the OS to schedule them independently
 Each thread maintains:
Stack
Registers
Scheduling properties (e.g., priority)
Set of pending and blocked signals (to allow different threads to react differently to signals)
Thread-specific data
The Pthreads (POSIX threads) API
 Three types of routines:
Thread management: creating, terminating, joining, and detaching threads.
Mutexes: mutual exclusion; creating, destroying, locking, and unlocking mutexes.
Condition variables: event-driven synchronization.
Mutexes and condition variables are concerned with synchronization.
 Why is there nothing related to inter-thread communication?
 The concept of opaque objects pervades the design of the API.
The Pthreads API naming convention

Routine prefix       Function
pthread_             General pthread routines
pthread_attr_        Thread attribute objects
pthread_mutex_       Mutexes
pthread_mutexattr_   Mutex attribute objects
pthread_cond_        Condition variables
pthread_condattr_    Condition variable attribute objects
pthread_key_         Thread-specific data keys
Compiling pthread programs
 Pthread header file: <pthread.h>
 Compiling a pthread program: gcc aaa.c -lpthread
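
For concreteness, here is a minimal sketch of a compilable pthread program (the file name hello_thread.c is made up for illustration):

/* hello_thread.c -- minimal pthread sketch (hypothetical file name).
   Compile with: gcc hello_thread.c -lpthread                         */
#include <pthread.h>
#include <stdio.h>

void *hello(void *arg)
{
    printf("hello from the child thread\n");
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, hello, NULL);  /* create one thread   */
    pthread_join(tid, NULL);                  /* wait for it to end  */
    return 0;
}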
Thread management routines
 Creation: pthread_create
 Termination:
Return from the thread's start routine
pthread_exit
Can we still use exit?
 Wait (parent/child synchronization): pthread_join
Creation
 Thread equivalent of fork()
 int pthread_create(
     pthread_t *thread,
     const pthread_attr_t *attr,
     void *(*start_routine)(void *),
     void *arg);
 Returns 0 if OK, and an error number (> 0) on failure.
 Parameters for the start routine are passed through void *arg.
What if we want to pass a structure?
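
A common answer: pass a pointer to the structure through void *arg and cast it back inside the start routine. A minimal sketch (the struct work type and its fields are hypothetical):

#include <pthread.h>
#include <stdio.h>

struct work { int id; double rate; };        /* hypothetical argument bundle */

void *worker(void *arg)
{
    struct work *w = (struct work *)arg;     /* cast the void * back */
    printf("thread %d, rate %.2f\n", w->id, w->rate);
    return NULL;
}

int main(void)
{
    struct work w = { 1, 0.5 };              /* must stay valid while the thread runs */
    pthread_t tid;
    pthread_create(&tid, NULL, worker, &w);
    pthread_join(tid, NULL);
    return 0;
}

Note that the structure must outlive the thread that reads it (or be allocated on the heap).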
Termination
Thread termination:
Return from the start routine.
void pthread_exit(void *status)
Process termination:
exit() called by any thread
main() returns
Waiting for child thread
 int pthread_join(pthread_t tid, void **status)
 Equivalent of waitpid() for processes
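
A sketch combining termination and waiting: the child passes a value out through pthread_exit (or return), and the parent collects it in pthread_join (the value 42 is arbitrary):

#include <pthread.h>
#include <stdio.h>

void *task(void *arg)
{
    pthread_exit((void *)42L);     /* same effect as: return (void *)42L; */
}

int main(void)
{
    pthread_t tid;
    void *status;
    pthread_create(&tid, NULL, task, NULL);
    pthread_join(tid, &status);    /* blocks until task() terminates */
    printf("child returned %ld\n", (long)status);
    return 0;
}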
Detaching a thread
 A detached thread can act as a daemon thread.
 The parent thread doesn't need to wait for it.
 int pthread_detach(pthread_t tid)
 Detaching self: pthread_detach(pthread_self())
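
A short sketch of both forms (the background() routine is hypothetical):

#include <pthread.h>
#include <unistd.h>

void *background(void *arg)
{
    pthread_detach(pthread_self());   /* detach self: nobody will join us */
    /* ... daemon-style work ... */
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, background, NULL);
    /* alternatively, the parent detaches the child:
       pthread_detach(tid);                          */
    sleep(1);    /* give the detached thread a chance to run */
    return 0;
}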
Some multi-thread program examples
 A multi-thread program example: example1.c
 Making multiple producers: example2.c
What is going on in this program?
How to fix it?
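
example2.c is not reproduced here, and whether this is its actual bug is a guess; however, a classic mistake in this kind of program is to hand every thread a pointer to the same loop variable. A sketch of the bug and one fix:

#include <pthread.h>
#include <stdio.h>
#define N 4

void *producer(void *arg)
{
    int id = *(int *)arg;                 /* read the per-thread argument */
    printf("producer %d running\n", id);
    return NULL;
}

int main(void)
{
    pthread_t tid[N];
    int ids[N];
    for (int i = 0; i < N; i++) {
        /* BUG version: pthread_create(&tid[i], NULL, producer, &i);
           every thread would share the single variable i, which keeps
           changing under them as the loop advances.                  */
        ids[i] = i;                       /* fix: a per-thread copy   */
        pthread_create(&tid[i], NULL, producer, &ids[i]);
    }
    for (int i = 0; i < N; i++)
        pthread_join(tid[i], NULL);
    return 0;
}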
Matrix multiply and threaded matrix multiply
 Matrix multiply: C = A × B

\[ C[i, j] = \sum_{k=1}^{N} A[i, k] \cdot B[k, j] \]

\[
\begin{pmatrix}
C[0,0] & C[0,1] & \cdots & C[0,N-1] \\
C[1,0] & C[1,1] & \cdots & C[1,N-1] \\
\vdots & \vdots & & \vdots \\
C[N-1,0] & C[N-1,1] & \cdots & C[N-1,N-1]
\end{pmatrix}
=
\begin{pmatrix}
A[0,0] & A[0,1] & \cdots & A[0,N-1] \\
A[1,0] & A[1,1] & \cdots & A[1,N-1] \\
\vdots & \vdots & & \vdots \\
A[N-1,0] & A[N-1,1] & \cdots & A[N-1,N-1]
\end{pmatrix}
\begin{pmatrix}
B[0,0] & B[0,1] & \cdots & B[0,N-1] \\
B[1,0] & B[1,1] & \cdots & B[1,N-1] \\
\vdots & \vdots & & \vdots \\
B[N-1,0] & B[N-1,1] & \cdots & B[N-1,N-1]
\end{pmatrix}
\]
Matrix multiply and threaded matrix multiply
 Sequential code:
for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
        for (k = 0; k < N; k++)
            C[i][j] = C[i][j] + A[i][k] * B[k][j];
 Threaded program:
The calculation of C[i][j] does not depend on any other entry of C.
mm_pthread.c (a sketch follows the matrix below)
\[
\begin{pmatrix}
C[0,0] & C[0,1] & \cdots & C[0,N-1] \\
C[1,0] & C[1,1] & \cdots & C[1,N-1] \\
\vdots & \vdots & & \vdots \\
C[N-1,0] & C[N-1,1] & \cdots & C[N-1,N-1]
\end{pmatrix}
\]
(Each entry of C can be computed independently.)
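
mm_pthread.c itself is not shown here; the following is a minimal sketch of the idea, with one thread per row of C (the names and the row-per-thread split are illustrative assumptions):

#include <pthread.h>
#define N 4                         /* small size for illustration */

double A[N][N], B[N][N], C[N][N];

/* Each thread computes one full row of C; rows are independent,
   so no synchronization is needed.                              */
void *row_worker(void *arg)
{
    long i = (long)arg;             /* row index smuggled through void * */
    for (int j = 0; j < N; j++) {
        C[i][j] = 0.0;
        for (int k = 0; k < N; k++)
            C[i][j] += A[i][k] * B[k][j];
    }
    return NULL;
}

int main(void)
{
    pthread_t tid[N];
    for (long i = 0; i < N; i++)
        pthread_create(&tid[i], NULL, row_worker, (void *)i);
    for (int i = 0; i < N; i++)
        pthread_join(tid[i], NULL);
    return 0;
}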

PI calculation

\[ \pi = \lim_{n \to \infty} \sum_{i=1}^{n} \frac{4.0}{1.0 + \left( \frac{i - 0.5}{n} \right)^{2}} \cdot \frac{1}{n} \]
 Sequential code: pi.c
 Threaded version: homework
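
pi.c is not reproduced here; a sketch of what the sequential sum might look like (n = 1000000 is an arbitrary choice):

#include <stdio.h>

int main(void)
{
    long n = 1000000;                      /* number of terms (arbitrary) */
    double sum = 0.0;
    for (long i = 1; i <= n; i++) {
        double x = (i - 0.5) / n;          /* midpoint of the i-th slice  */
        sum += 4.0 / (1.0 + x * x);
    }
    printf("pi ~= %.10f\n", sum / n);      /* the trailing factor 1/n     */
    return 0;
}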
Type of Threads
 Independent threads
 Cooperative threads
Independent Threads
 No state shared with other threads
 Deterministic computation
Output depends only on the input
 Reproducible
Output does not depend on the order and timing of other threads
 Scheduling order does not matter.
Cooperating Threads
 Shared state
 Nondeterministic
 Nonreproducible
 Example: two threads sharing the same display
Thread A: cout << “ABC”;
Thread B: cout << “123”;
 You may get “A12BC3”
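
The display example can be reproduced in C by printing one character at a time; whether you actually observe interleaving on a given run depends on scheduling:

#include <pthread.h>
#include <stdio.h>

void *print_str(void *arg)
{
    for (const char *p = arg; *p != '\0'; p++)
        putchar(*p);               /* one character at a time */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, print_str, "ABC");
    pthread_create(&b, NULL, print_str, "123");
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    putchar('\n');
    return 0;
}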
So, Why Allow Cooperating Threads?
 Shared resources
e.g., a single processor
 Speedup
Occurs when threads use different resources at different times
mm_pthread.c
 Modularity
An application can be decomposed into threads
Some Concurrent Programs
 If threads work on separate data, scheduling does not matter:
Thread A: x = 1;
Thread B: y = 2;
Some Concurrent Programs
 If threads share data, the final values are not as obvious:
Thread A:
x = 1;
x = y + 1;
Thread B:
y = 2;
y = y * 2;
 What are the indivisible operations?
Atomic Operations
 An atomic operation always runs to completion; it’s all or nothing.
e.g., memory loads and stores on most machines
 Many operations are not atomic:
A double-precision floating-point store on a 32-bit machine
Suppose…
 Each C statement is atomic
 Let’s revisit the example…
All Possible Execution Orders
Thread A:
x = 1;
x = y + 1;
Thread B:
y = 2;
y = y * 2;
[Originally shown as a decision tree enumerating every interleaving, annotated with the (x, y) values after each step.]
With each statement atomic and program order preserved within each thread, there are six possible interleavings (y starts at an unknown value “?”):
x=1; x=y+1; y=2; y=y*2  →  (x = ?, y = 4)
x=1; y=2; x=y+1; y=y*2  →  (x = 3, y = 4)
x=1; y=2; y=y*2; x=y+1  →  (x = 5, y = 4)
y=2; x=1; x=y+1; y=y*2  →  (x = 3, y = 4)
y=2; x=1; y=y*2; x=y+1  →  (x = 5, y = 4)
y=2; y=y*2; x=1; x=y+1  →  (x = 5, y = 4)
Another Example
 Assume each C statement is atomic
Thread A:
j = 0;
while (j < 10) {
    ++j;
}
cout << “A wins”;

Thread B:
j = 0;
while (j > -10) {
    --j;
}
cout << “B wins”;
So…
 Who wins?
 Can the computation go on forever?
 Race conditions occur when threads share data and their results depend on the timing of their executions.
 The take-home point: sharing data between threads can cause problems; we need some mechanism to deal with such situations.
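
Those mechanisms are the mutexes and condition variables listed earlier in the Pthreads API. As a preview, a minimal sketch of one of them, protecting a shared counter with a mutex (the counter and loop bound are arbitrary):

#include <pthread.h>
#include <stdio.h>

int counter = 0;                                   /* shared state */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *bump(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);     /* make the update indivisible */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %d\n", counter); /* 200000 with the lock; often
                                          less without it (a race)   */
    return 0;
}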