Lecture 20

Log into Linux. Copy directory /home/hwang/cs375/lecture20/

Questions?

Thursday, November 5
CS 375 UNIX System Programming - Lecture 20

Outline
● Introduction to Threads
● Thread Management
● Thread Synchronization
● Reentrancy
● Note: the man pages for this topic's routines are in the glibc-doc package.

Introduction to Threads
● Multiple strands of execution within a single process are called threads.
● Threads share the same global memory (data and heap), but each thread has its own stack (automatic variables).
● Threads also share the process ID, controlling terminal, user and group IDs, open file descriptors, signal handlers, and more (see the pthreads man page).

Introduction to Threads
● Changes made by one thread to shared system resources (such as closing a file) are seen by all other threads.
● Two pointers having the same value point to the same data, even when they are used by different threads.
● Threads can read and write the same memory locations, so the programmer must synchronize access explicitly.

Advantages of Threads
● Threads are useful when you want one program to appear to do two (or more) things at once (for example, a live word count in a word processor).
● Performance may improve if I/O and computation are in separate threads.
● Threads can utilize multicore CPUs.
● Switching between threads requires less work by the OS than switching between processes.

Disadvantages of Threads
● Writing multithreaded programs requires very careful design. Subtle errors in synchronization or shared memory access are easy to make.
● Debugging multithreaded programs is hard.
● Splitting a large calculation into two parts that run as different threads does not necessarily result in better performance.
● Reentrant functions must be used.

Linux Threads
● The POSIX pthreads API is the most popular API for multithreaded programming.
● There have been two implementations of pthreads on Linux: LinuxThreads and NPTL (Native POSIX Threads Library). NPTL conforms better to the POSIX standard, and LinuxThreads is now obsolete.

Linux Threads
● Programs using the POSIX thread routines should include the <pthread.h> header file.
● Compile and link these programs using the "-pthread" option (older instructions use "-lpthread"). (It is not necessary to define the _REENTRANT macro as described in BLP.)

Creating Threads
● All processes contain a main thread. Additional threads are created using the pthread_create() routine:

    int pthread_create(pthread_t *thread,
                       const pthread_attr_t *attr,
                       void *(*start_routine)(void *),
                       void *arg);

● On success, 0 is returned and the new thread's ID is stored in the location pointed to by thread. On failure, an error number is returned.

Creating Threads
● The attr argument specifies thread attributes. See the pthread_attr_init man page for a list of attributes. It can be NULL if the default attributes are to be applied.
● The new thread executes the start_routine function. arg is passed as the sole argument to that function.
● All threads are peers. New threads can create other threads.

Terminating Threads
● You can use pthread_exit() to exit a thread explicitly. (A simple return from the thread routine also terminates the thread.)
● If main() ends by calling pthread_exit(), the other threads continue to run. If main() returns instead, or any thread calls exit(), the whole process terminates, including all of its threads.
● pthread_cancel() can be used by one thread to cancel another one.
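The creation and termination rules above can be sketched as follows. This is a minimal illustration, not the course's example file: the names square_job, square_worker, and square_in_thread are hypothetical. It passes a small struct through arg, ends the thread with a plain return, and uses pthread_join() (covered under Joining Threads) so that the caller's stack data outlives the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical argument block: the input goes in, the result comes out. */
struct square_job {
    int input;
    int result;
};

/* Start routine: it must take a void * and return a void *. */
static void *square_worker(void *arg)
{
    struct square_job *job = arg;

    job->result = job->input * job->input;
    return NULL;    /* returning ends the thread, like pthread_exit(NULL) */
}

/* Hypothetical helper: squares input in a new thread.
 * Returns 0 on success or a pthread error number on failure. */
int square_in_thread(int input, int *result)
{
    struct square_job job = { input, 0 };
    pthread_t tid;
    int err;

    assert(result != NULL);

    /* NULL attributes request the defaults (a joinable thread). */
    err = pthread_create(&tid, NULL, square_worker, &job);
    if (err != 0) {
        /* pthread routines return the error number; they do not set errno. */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return err;
    }

    /* job lives on this function's stack, so wait before returning. */
    err = pthread_join(tid, NULL);
    if (err != 0)
        return err;

    *result = job.result;
    return 0;
}
```

A caller would write something like `int r; square_in_thread(7, &r);` and then read r. Passing the address of a stack variable is safe here only because the creating function waits for the thread before it returns.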
Joining Threads
● Use pthread_join() to wait for a specific thread to terminate:

    int pthread_join(pthread_t thread, void **thread_return);

● pthread_join() waits for the thread with ID thread to complete. If thread_return is not NULL, the thread's return value is stored at the location pointed to by thread_return.

Joining Threads
● A thread is either joinable or detached. Only joinable threads (the default) can be joined.
● If a thread is created as detached, it can never be joined. pthread_detach() can be used to detach a thread created as joinable. The resources of a detached thread are released automatically when it terminates, so no join is needed to reclaim them.
● See file pthread_intro.cpp for an example.

Other Thread Routines
● pthread_self() returns the unique, system-assigned thread ID of the calling thread. pthread_equal() compares two thread IDs. If the two IDs are different, 0 is returned; otherwise a nonzero value is returned.

    pthread_t pthread_self(void);
    int pthread_equal(pthread_t thr_id1, pthread_t thr_id2);

● The pthread_once() routine can be used to ensure that initialization code is run exactly one time.

Thread Synchronization
● A join is one mechanism for synchronizing threads.
● The threads library also provides mutexes and condition variables for synchronization. These will be discussed in the next lecture.
● System V or POSIX semaphores can also be used for thread synchronization. POSIX unnamed semaphores are especially attractive.
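As a sketch of how a join both synchronizes threads and collects a result, the fragment below returns a value from a thread through the thread_return argument of pthread_join(). The names sum_to_n and threaded_sum are hypothetical, and allocating the result on the heap is only one possible design; it is used here because the thread's own stack is gone once the thread has finished.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Start routine: adds 1..n and returns the sum in heap storage,
 * which stays valid after the thread's stack has been released. */
static void *sum_to_n(void *arg)
{
    int n = *(int *)arg;
    long *sum = malloc(sizeof *sum);

    if (sum == NULL)
        return NULL;
    *sum = 0;
    for (int i = 1; i <= n; i++)
        *sum += i;
    return sum;    /* handed to whoever calls pthread_join() */
}

/* Hypothetical helper: returns 1 + 2 + ... + n, or -1 on error. */
long threaded_sum(int n)
{
    pthread_t tid;
    void *ret = NULL;
    long total;

    assert(n >= 0);

    if (pthread_create(&tid, NULL, sum_to_n, &n) != 0)
        return -1;

    /* The new thread's ID differs from the caller's own ID. */
    assert(!pthread_equal(tid, pthread_self()));

    /* Block until the thread ends; ret receives its return value. */
    if (pthread_join(tid, &ret) != 0 || ret == NULL)
        return -1;

    total = *(long *)ret;
    free(ret);    /* the joiner owns the returned storage */
    return total;
}
```

Calling `threaded_sum(100)` should produce the same value as computing the sum directly. The join is what makes reading the result safe: the caller does not touch it until the worker has finished.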
POSIX Unnamed Semaphores
● Instead of using sem_open() to create a semaphore, you can create one directly:

    sem_t sem;                            // method 1
    sem_t *sem = malloc(sizeof(sem_t));   // method 2
    sem_t *sem = new sem_t;               // method 3 (C++ only)

● This is known as an unnamed semaphore. It must be initialized using sem_init():

    int sem_init(sem_t *sem, int pshared, unsigned int value);

POSIX Unnamed Semaphores
● If pshared is 0, the semaphore is shared between the threads of one process and must be located where all of those threads can reach it, such as a global variable (method 1 at file scope) or the heap (method 2 or 3).
● If pshared is nonzero, the semaphore is shared between processes and must be allocated in a shared memory region.
● value is the desired initial value of the semaphore.

POSIX Unnamed Semaphores
● An unnamed semaphore is destroyed using sem_destroy():

    int sem_destroy(sem_t *sem);

● The routines sem_close() and sem_unlink() are not used with unnamed semaphores.
● sem_wait() and sem_post() are used to acquire (decrement) and release (increment) an unnamed semaphore.
● Access to unnamed semaphores is faster than to named semaphores.
● See file pthread_sem.cpp for an example.

Reentrant Functions
● Thread routines must call only functions that are reentrant or thread-safe.
● A reentrant function is one that can be safely executed concurrently; that is, the routine can be reentered while it is already running. To be reentrant a function must:
  1. not use static or global variables,
  2. not return the address of static data,
  3. work only on the data provided to it by the caller,
  4. not call non-reentrant functions.
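When a routine does break rule 1 by updating a global variable, an unnamed semaphore can serialize the access. The sketch below assumes a hypothetical shared counter and hypothetical names (counter_sem, add_to_counter, run_counter_demo); it is not the contents of pthread_sem.cpp. It initializes the semaphore to 1 with pshared set to 0, so only one thread at a time can be between sem_wait() and sem_post().

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define NUM_THREADS 4
#define INCREMENTS  10000

static sem_t counter_sem;    /* global, so every thread can reach it */
static long counter;         /* shared data guarded by counter_sem */

/* Start routine: each thread adds INCREMENTS to the shared counter. */
static void *add_to_counter(void *arg)
{
    (void)arg;    /* unused */
    for (int i = 0; i < INCREMENTS; i++) {
        sem_wait(&counter_sem);    /* acquire: value goes 1 -> 0 */
        counter++;                 /* critical section */
        sem_post(&counter_sem);    /* release: value goes 0 -> 1 */
    }
    return NULL;
}

/* Hypothetical driver: returns the final counter value, or -1 on error. */
long run_counter_demo(void)
{
    pthread_t tids[NUM_THREADS];
    int started = 0;

    counter = 0;
    /* pshared = 0: shared between threads; initial value 1. */
    if (sem_init(&counter_sem, 0, 1) != 0)
        return -1;

    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&tids[i], NULL, add_to_counter, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    sem_destroy(&counter_sem);

    /* With the semaphore in place, no increment is lost. */
    assert(counter == (long)started * INCREMENTS);
    return counter;
}
```

Without the sem_wait()/sem_post() pair, the increments from different threads could interleave and the final count could come out low. The sketch ignores the return value of sem_wait(), which a production program should check because the call can be interrupted by a signal.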
Reentrant Functions
● If static variables are used, then access to them must be protected with mutexes (or semaphores), or the functions must be rewritten to avoid these variables.
● In C, local variables are allocated automatically on each thread's own stack. Therefore, any function that does not use static data or other shared resources is thread-safe.

Reentrant Functions
● Thread-unsafe functions may be used by only one thread at a time (semaphores may be used to ensure that this is so).
● Many non-reentrant functions return a pointer to static data. This can be avoided by returning dynamically allocated data or by using caller-provided storage. An example of a non-thread-safe function is strtok, which is also not reentrant. The "thread-safe" version is the reentrant version, strtok_r.

Reentrant Functions
● An example of a non-reentrant function:

    int g_var = 1;

    int f()
    {
        g_var = g_var + 2;
        return g_var;
    }

● A slight alteration for reentrancy:

    int f(int i)
    {
        return i + 2;
    }
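The caller-provided-storage idea behind strtok_r can be sketched as below. The function name count_words is hypothetical. strtok keeps its position in hidden static data, whereas strtok_r stores it in a pointer supplied by the caller, so each thread (or each nested loop) has its own state.

```c
#define _POSIX_C_SOURCE 200809L    /* ask for the strtok_r declaration */
#include <assert.h>
#include <string.h>

/* Counts the whitespace-separated words in text.
 * Note: like strtok, strtok_r writes '\0' bytes into text.
 * Reentrant: all state is in the caller's buffer and in locals. */
int count_words(char *text)
{
    char *save = NULL;    /* per-call tokenizer state, not static */
    int count = 0;

    assert(text != NULL);

    for (char *tok = strtok_r(text, " \t\n", &save);
         tok != NULL;
         tok = strtok_r(NULL, " \t\n", &save))
        count++;

    return count;
}
```

Because save is a local variable, two threads can call count_words() on different buffers at the same time without interfering with each other. The buffer must be writable, so a string literal should be copied into an array first.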