Threads
Dr. Yingwu Zhu

Chapter 4: Threads
- Overview
- Multithreading Models
- Threading Issues
- Pthreads

Threaded Applications
- Web browsers: display and data retrieval
- Web servers
- Many others

Threads
- What is a thread? A lightweight process (LWP)? The basic unit of CPU utilization
- A thread contains:
  - Thread ID
  - Program counter
  - Register set
  - Stack
- Why multithreading? Creating processes is expensive, and threads bring other advantages as well

Single and Multithreaded Processes

Benefits
- Responsiveness
- Resource sharing: threads share the memory and resources of the process they belong to; sharing code and data allows different threads of activity within the same address space
- Economy: processes are expensive to create and to context-switch. In Solaris, creating a process is about 30 times slower than creating a thread, and a process context switch is about 5 times slower.
- Utilization of MP architectures: a single-threaded process can only run on one CPU

Thread Types
- User threads
- Kernel threads

User Threads
- Thread management (creation, scheduling) is done by a user-level threads library
- Drawback: no kernel resources are allocated to the threads, so a blocking system call suspends all other threads in the same process
- Three primary thread libraries: POSIX Pthreads, Win32 threads, Java threads

Kernel Threads
- Supported by the kernel
- Advantages: non-blocking thread execution (as with processes, when a kernel thread makes a blocking call, only that thread blocks); threads can run on multiple processors
- Drawback: slower to create and manage than user-level threads
- Examples: Windows XP/2000, Solaris, Linux, Tru64 UNIX, Mac OS X

Multithreading Models
- Many-to-One
- One-to-One
- Many-to-Many

Many-to-One
- Many user-level threads are mapped to a single kernel thread
- Thread management is done by the thread library in user space, so it is efficient
- But a thread making a blocking system call blocks the entire process
- Multiple threads cannot run in parallel on multiprocessor machines (only one thread can access the kernel at a time)
- Used on systems that do not support kernel threads
- Examples: Solaris Green Threads, GNU Portable Threads

Many-to-One Model
- Even when the kernel does not support multiple threads of control, multithreading can be implemented entirely as a user-level library
- The library schedules multiple user threads onto the process's single kernel thread, multiplexing many user threads over one kernel thread

Many-to-One (cont.): Benefits
- Cheap synchronization: when a user thread wishes to synchronize, the user-level thread library checks whether the thread needs to block. If it does, the library enqueues it, dequeues another user thread from the library's run queue, and switches the active thread. No system calls are required.
- Cheap thread creation: the thread library need only create a context (i.e., a stack and registers) and enqueue it on the user-level run queue (see the sketch after this list)
- Resource efficiency: kernel memory is not wasted on a stack for each user thread, which allows as many threads as virtual memory permits
- Portability: user-level threads packages are implemented entirely with standard UNIX and POSIX library calls
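To make the "cheap thread creation" point concrete, here is a rough sketch of what a many-to-one library might do. Everything is ordinary user-space memory plus the (obsolescent but still available) POSIX ucontext routines; no kernel thread has to be created. The struct user_thread type, the run_queue list, and user_thread_create() are hypothetical names used only for illustration, not part of any real library.

    #include <stdlib.h>
    #include <ucontext.h>

    /* Hypothetical user-level thread control block: just a saved context
     * (registers + stack) and a link for the library's run queue. */
    struct user_thread {
        ucontext_t ctx;            /* saved registers and stack pointer */
        char *stack;               /* stack allocated from ordinary heap memory */
        struct user_thread *next;  /* run-queue link */
    };

    static struct user_thread *run_queue;   /* library-private ready list */

    /* "Creating" a thread is just allocating memory and filling in a
     * context -- no kernel thread is created, hence the low cost. */
    struct user_thread *user_thread_create(void (*fn)(void)) {
        struct user_thread *t = malloc(sizeof *t);
        t->stack = malloc(64 * 1024);
        getcontext(&t->ctx);
        t->ctx.uc_stack.ss_sp = t->stack;
        t->ctx.uc_stack.ss_size = 64 * 1024;
        t->ctx.uc_link = NULL;
        makecontext(&t->ctx, fn, 0);
        t->next = run_queue;                /* enqueue on the user-level run queue */
        run_queue = t;
        return t;
    }

Scheduling and switching between such threads (with swapcontext()) is likewise done entirely in user space, which is where the "cheap synchronization" benefit comes from.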
Many-to-One (cont.): Drawbacks
- Single-threaded OS interface: if a user thread blocks (e.g., in a blocking system call), the entire process blocks, and no other user thread can execute until the kernel thread (which is blocked in the system call) becomes available. A partial solution is to use nonblocking system calls.
- Cannot utilize MP architectures
- Examples: Java green threads, Netscape

One-to-One
- Each user-level thread maps to a kernel thread
- More concurrency than many-to-one: another thread can run when one thread makes a blocking system call, and multiple threads can run in parallel on multiprocessor machines
- Overhead: creating a user thread requires creating a kernel thread
- Examples: Windows NT/XP/2000, Linux, Solaris 9 and later

One-to-One Model

One-to-One (cont.): Benefits
- Scalable parallelism: each kernel thread is a separate kernel-schedulable entity, so multiple threads can run concurrently on multiprocessors
- Multithreaded OS interface: when one user thread and its kernel thread block, the other user threads can continue to execute, since their kernel threads are unaffected

One-to-One (cont.): Drawbacks
- Expensive synchronization: kernel threads require kernel involvement to be scheduled, so kernel-thread synchronization requires a system call whenever a lock is not immediately acquired. If a trap is required, synchronization is roughly 3 to 10 times more costly than in the many-to-one model.
- Expensive creation: every thread creation requires explicit kernel involvement and consumes kernel resources; it is roughly 3 to 10 times more expensive than creating a user thread
- Resource inefficiency: every thread created by the user requires kernel memory for a stack, as well as some kernel data structure to keep track of it. Many parts of the kernel cannot be paged out, so kernel threads tend to displace physical memory that applications could use.

Many-to-Many Model
- Allows many (K) user-level threads to be mapped to many (M) kernel threads, with M <= K
- Allows the operating system to create a sufficient number of kernel threads without overburdening the system
- Examples: Solaris prior to version 9; Windows NT/2000 with the ThreadFiber package

Many-to-Many Model (cont.)
- Combines the previous two models: user threads are multiplexed on top of kernel threads, which in turn are scheduled onto processors
- Takes advantage of both models while minimizing the disadvantages of each
- Creating a user thread does not necessarily require creating a kernel thread, and synchronization can be purely user-level

Threading Issues
Issues that arise due to multithreading:
- Semantics of the fork() and exec() system calls
- Thread cancellation
- Signal handling
- Thread pools
- Thread-specific data
- Scheduler activations

Semantics of fork() and exec()
- Does fork() duplicate only the calling thread (producing a single-threaded child) or all threads?
- It depends on the application. For example, if the child calls exec() immediately after fork(), duplicating only the calling thread is sufficient.

Thread Cancellation
- Terminating a thread before it has finished
- Examples: multiple threads are concurrently working on the same task (once one finishes, the others can be cancelled); cancelling a web browser's on-going tasks
- Two general approaches (a deferred-cancellation sketch follows below):
  - Asynchronous cancellation terminates the target thread immediately
  - Deferred cancellation allows the target thread to periodically check whether it should be cancelled
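A minimal sketch of deferred cancellation with Pthreads: the target thread periodically calls pthread_testcancel(), which acts as a cancellation point, so a pending pthread_cancel() request takes effect only at those checks. The worker function and its loop body are made up for illustration.

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Hypothetical worker: does a unit of work, then checks for cancellation. */
    static void *worker(void *arg) {
        for (;;) {
            /* ... do one unit of work ... */
            pthread_testcancel();   /* deferred cancellation point */
        }
        return NULL;                /* never reached */
    }

    int main(void) {
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);
        sleep(1);                   /* let the worker run for a while */
        pthread_cancel(tid);        /* request cancellation (deferred by default) */
        pthread_join(tid, NULL);    /* wait until the worker actually exits */
        printf("worker cancelled\n");
        return 0;
    }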
Signal Handling
- Signals are used in UNIX systems to notify a process that a particular event has occurred
- A signal handler is used to process signals (a user-defined handler overrides the default handler)
- A signal is processed in three steps, the last of which depends on the signal type:
  1. The signal is generated by a particular event
  2. The signal is delivered to a process
  3. The signal is handled
- Synchronous signals (e.g., division by 0, illegal memory access) are delivered to the thread causing the signal
- Asynchronous signals have several delivery options:
  - Deliver the signal to the thread to which the signal applies
  - Deliver the signal to every thread in the process (e.g., ctrl-c)
  - Deliver the signal to certain threads in the process: kill(aid, signal)
  - Assign a specific thread to receive all signals for the process

Thread Pools
- Create a number of threads in a pool, where they await work
- Advantages:
  - Usually slightly faster to service a request with an existing thread than to create a new thread
  - Allows the number of threads in the application(s) to be bound by the size of the pool

Thread-Specific Data
- Threads belonging to a process share the data of the process
- Thread-specific data allows each thread to have its own copy of certain data
- Useful when you do not have control over the thread creation process (e.g., when using a thread pool)

Scheduler Activations
- M:M models require communication between the kernel and the thread library to maintain the appropriate number of kernel threads allocated to the application, via an intermediate data structure called an LWP (lightweight process), which acts as a virtual processor
- An LWP runs a user thread; the LWP maps to a kernel thread, which the OS schedules to run on a physical processor
- Scheduler activations provide upcalls, a communication mechanism from the kernel to the thread library; the upcall handler performs the necessary work, such as mapping a user thread onto a new LWP (provided by the kernel) or removing a blocked user thread from its LWP
- This communication allows an application to maintain the correct number of kernel threads

Pthreads
- A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
- The API specifies the behavior of the thread library; the implementation is up to the developers of the library
- Common in UNIX operating systems (Solaris, Linux, Mac OS X)

Pthread Tutorial
- Creating and destroying threads
- How to use POSIX threads
- How to compile:

    $ gcc -o proj2 proj2.c -pthread

  The -pthread option specifies that the pthreads library should be linked and causes the compiler to properly handle multiple threads in the code that it generates.

Creating and Destroying Threads
- Creating threads
  - Step 1: create a thread
  - Step 2: send the thread one or more parameters
- Destroying threads
  - Step 1: destroy a thread
  - Step 2: retrieve one or more values that are returned from the thread

Creating Threads

    #include <pthread.h>

    int pthread_create(pthread_t *thread_id, const pthread_attr_t *attr,
                       void *(*thread_fun)(void *), void *args);

- The 1st parameter returns the thread ID
- The 2nd parameter points to the thread attributes; NULL means use the default attribute settings
- The 3rd parameter is a pointer to the function the thread is to execute
- The 4th parameter is the argument passed to that function
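The usual way to "send the thread one or more parameters" is to pack them into a struct and pass its address as the fourth argument of pthread_create(). A minimal sketch, with a made-up struct task_args that is not part of any library:

    #include <pthread.h>
    #include <stdio.h>

    /* Hypothetical argument bundle: several values passed through one void *. */
    struct task_args {
        int id;
        const char *label;
    };

    static void *worker(void *arg) {
        struct task_args *a = (struct task_args *)arg;  /* unpack the parameters */
        printf("thread %d: %s\n", a->id, a->label);
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        struct task_args args = { 7, "hello" };   /* must stay alive until the thread is done */
        pthread_create(&tid, NULL, worker, &args);
        pthread_join(tid, NULL);
        return 0;
    }

Note that the struct lives in main()'s stack frame, so main() must not return (or must join the thread) before the worker has finished reading it.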
Thread Termination
- A pthread terminates when its thread function returns or when the thread calls pthread_exit():

    void pthread_exit(void *status);

  status is the return value of the thread.
- A thread function returns a void *, so "return (void *)status;" is equivalent to calling pthread_exit(status)

Thread Termination (cont.)
- One thread can wait (block) on the termination of another by using pthread_join(); you can collect the exit status of every thread you created with pthread_join():

    int pthread_join(pthread_t thread_id, void **status);

  The exit status is returned in status.
- A thread can get its own thread ID:

    pthread_t pthread_self(void);

- Compare two thread IDs:

    int pthread_equal(pthread_t t1, pthread_t t2);

Example

    #include <pthread.h>

    void *thread_fun(void *arg) {
        int *inarg = (int *)arg;
        /* ... */
        return NULL;
    }

    int main() {
        pthread_t tid;
        void *exit_state;
        int val = 42;

        pthread_create(&tid, NULL, thread_fun, &val);
        pthread_join(tid, &exit_state);
        return 0;
    }

Kill Threads
- A thread can be killed before it returns normally using pthread_cancel()
- But make sure the thread has released any local resources first; unlike with processes, the OS will not clean them up
- Why? Because the threads in a process share resources

Exercise
- Write a multithreaded program that calculates the summation of a non-negative integer N in a separate thread (1 + 2 + 3 + ... + N)
- The non-negative integer N is given as a command-line parameter
- The summation result is kept in a global variable:

    int sum;   /* shared by threads */

- Step 1: write the thread function

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    int sum;   /* shared by the threads */

    void *thread_sum(void *arg) {
        int i;
        int m = *(int *)arg;
        sum = 0;                  /* initialization */
        for (i = 1; i <= m; i++)
            sum += i;
        pthread_exit(0);
    }

- Step 2: write main()

    int main(int argc, char *argv[]) {
        pthread_t tid;

        if (argc != 2) {
            printf("Usage: %s <integer-para>\n", argv[0]);
            return -1;
        }
        int i = atoi(argv[1]);
        if (i < 0) {
            printf("integer para must be non-negative\n");
            return -2;
        }
        pthread_create(&tid, NULL, thread_sum, &i);
        pthread_join(tid, NULL);
        printf("sum = %d\n", sum);
        return 0;
    }

Exercise
- Write a program that creates 10 threads. Have each thread execute the same function, and pass each thread a unique number. Each thread should print "Hello, World (thread n)" five times, where 'n' is replaced by the thread's number. Use an array of pthread_t objects to hold the various thread IDs. Be sure the program does not terminate until all the threads are complete. (Try running your program on more than one machine. Are there any differences in how it behaves?) One possible skeleton is sketched below.
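One possible skeleton for the 10-thread exercise; the names hello, NUM_THREADS, and the nums array are illustrative choices, not required by the exercise. Each thread receives its number through a pointer into its own slot of nums, so no two threads share an argument variable.

    #include <pthread.h>
    #include <stdio.h>

    #define NUM_THREADS 10

    static void *hello(void *arg) {
        int n = *(int *)arg;                      /* this thread's unique number */
        for (int i = 0; i < 5; i++)
            printf("Hello, World (thread %d)\n", n);
        return NULL;
    }

    int main(void) {
        pthread_t tids[NUM_THREADS];              /* array holding the thread IDs */
        int nums[NUM_THREADS];                    /* one argument slot per thread */

        for (int n = 0; n < NUM_THREADS; n++) {
            nums[n] = n;
            pthread_create(&tids[n], NULL, hello, &nums[n]);
        }
        for (int n = 0; n < NUM_THREADS; n++)     /* don't exit until all threads finish */
            pthread_join(tids[n], NULL);
        return 0;
    }

Because the threads run concurrently, the interleaving of the output lines can differ from run to run and from machine to machine, which is the behavior the exercise asks you to observe.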
Returning Results from Threads
- A thread function returns a pointer to void (void *)
- There are several pitfalls in how that return value is produced

Pitfall #1

    void *thread_function(void *arg) {
        int code = DEFAULT_VALUE;
        return (void *)code;
    }

  This only works on machines where an integer can be converted to a pointer and then back to an integer without loss of information.

Pitfall #2

    void *thread_function(void *arg) {
        char buffer[64];
        /* fill up the buffer with something useful */
        return (void *)buffer;
    }

  The buffer disappears (its stack frame is destroyed) as soon as the thread function returns.

Pitfall #3

    void *thread_function(void *arg) {
        static char buffer[64];
        /* fill up the buffer with something useful */
        return (void *)buffer;
    }

  This does not work in the common case of multiple threads running the same thread function, because they all share the single static buffer.

The Right Way

    void *thread_function(void *arg) {
        char *buffer = (char *)malloc(64);
        /* fill up the buffer with something useful */
        return (void *)buffer;
    }

The Right Way (cont.)

    int main() {
        void *exit_state;
        char *buffer;
        /* ... */
        pthread_join(tid, &exit_state);
        buffer = (char *)exit_state;
        printf("from thread %lu: %s\n", (unsigned long)tid, buffer);
        free(exit_state);
    }

Exercise
- Write a program that computes the square roots of the integers from 0 to 99 in a separate thread and returns an array of doubles containing the results. In the meantime, the main thread should display a short message to the user and then display the results of the computation when they are ready. (A sketch using the malloc-and-return pattern above appears at the end of these notes.)

Exercise
- Which of the following components of program state are shared across threads in a multithreaded process?
  a. register values
  b. heap memory
  c. global variables
  d. stack memory

Example

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int value = 0;
    void *runner(void *param);

    int main(int argc, char *argv[]) {
        int pid;
        pthread_t tid;

        pid = fork();
        if (!pid) {                 /* child process */
            pthread_create(&tid, NULL, runner, NULL);
            pthread_join(tid, NULL);
            printf("CHILD: value = %d\n", value);
        } else if (pid > 0) {       /* parent process */
            wait(NULL);
            printf("PARENT: value = %d\n", value);
        }
    }

    void *runner(void *param) {
        value = 5;
        pthread_exit(0);
    }
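A possible sketch for the square-roots exercise, using the malloc-and-return pattern from the "Right Way" slides. The function name compute_roots and the constant N are my own choices, not part of the original exercise.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    #define N 100

    /* Compute sqrt(0..N-1) into a heap array and hand it back to the joiner. */
    static void *compute_roots(void *arg) {
        double *roots = malloc(N * sizeof(double));   /* heap memory outlives the thread */
        for (int i = 0; i < N; i++)
            roots[i] = sqrt((double)i);
        return roots;                                  /* collected via pthread_join */
    }

    int main(void) {
        pthread_t tid;
        void *exit_state;

        pthread_create(&tid, NULL, compute_roots, NULL);
        printf("Computing square roots, please wait...\n");  /* main thread keeps going */

        pthread_join(tid, &exit_state);                /* results are ready after this */
        double *roots = exit_state;
        for (int i = 0; i < N; i++)
            printf("sqrt(%d) = %f\n", i, roots[i]);
        free(roots);
        return 0;
    }

Since the sketch calls sqrt(), link the math library as well as pthreads, e.g. gcc sqrt_threads.c -pthread -lm (file name assumed).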