Chapter 4: Multithreaded Programming
Lecture 3, 10.7.2008
Hao-Hua Chu

I/O Brush (MIT Media Lab)

Outline
• Review what a process is
• What is a thread, and why use threads?
• Multithreading models
• Thread libraries
• Threading issues

Process Overview
• What is a process?
  – A program in execution on the computer
• What is multiprogramming?
• What resources does the OS allocate to run a process?
  – CPU, memory, files (I/O), network (I/O)
  – What is a process made up of in memory?
• What is the cost of creating & running a process?
• What is the cost of context-switching between processes?
  – Heavy vs. light load

Consider a Web Browser
• How can it load & render these 9+ image files quickly?
  – Issue parallel HTTP requests, one per process (for illustration only, not the real implementation).
  – What is the speedup?
  – What is the cost of using many processes?
  – How can this cost be lowered?

Single vs. Multithreaded Processes

Benefit-Cost Analysis (multithreaded vs. single-threaded processes)
• Benefits
  – Responsiveness
    • While some threads (image-loading threads) are blocked, others (the UI thread) can continue to run.
    • Other examples?
  – Resource sharing
    • Threads reuse the same address space and data/code segments.
  – Economy
    • Thread creation is about 30x faster than process creation.
    • A thread context switch is about 5x faster than a process context switch.
  – Utilization of multiprocessor architectures
    • One process can run multiple threads on multiple CPUs.
• What are the costs?

Implementing Multithreading
• A poor implementation reduces the benefits & increases the costs.
  – What are good evaluation metrics for a threading implementation?
• How can multithreading support be implemented?
  – At user level (via a user library): user threads
    • POSIX Pthreads, Win32 threads, Java threads
  – At kernel level: kernel threads
    • Windows XP, Mac OS X, Solaris, almost all modern OSes
  – Both
• What does it mean to implement at the user level and at the kernel level?
• Kernel threads are also used for VMM, I/O, and handling syscalls.

Multithreading Models
• What are the possible mappings between user threads and kernel threads?
  – Many-to-one
  – One-to-one
  – Many-to-many
• How about one-to-many?

Many-to-one Model
• All threads from one process map to the same kernel thread.
  – The user-level thread library manages context switching.
• Is this implementation good (efficiency vs. concurrency)?
  – Yes, but …
  – What if one user thread issues a blocking syscall?
  – What about a multiprocessor system?
• How can concurrency be improved?

One-to-one Model
• Each user thread maps to one kernel thread.
• Is this implementation good (concurrency vs. efficiency)?
  – Good concurrency. Why? (A blocking syscall does not affect other threads.)
  – Expensive. Why? (User-thread creation implies kernel-thread creation.)
• How can we get both good concurrency and efficiency?

Many-to-many Model
• Many user threads are mapped to a smaller or equal number of kernel threads.
  – Why is this better than many-to-one? (concurrency & multiprocessors)
  – Why is this better than one-to-one? (efficiency)
• Want one-to-one-like concurrency for selected threads?
  – Two-level model

Two-Level Model
• Allows a user thread to also be bound to a kernel thread.

Thread Library APIs: POSIX Pthreads
• (A minimal pthread_create()/pthread_join() sketch appears right after the next slide.)

Threading Issues
• Semantics of the fork() and exec() system calls
• Thread cancellation
• Signal handling
• Thread pools
• Thread-specific data
• Scheduler activations

Semantics of fork() and exec()
• Does fork() duplicate only the calling thread or all threads?
  – The best answer, when you are not sure, is both (some UNIX systems provide both versions of fork()).
  – What would you do if exec() is called immediately after fork()?
• (See the C sketch that follows this slide.)
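The following is a minimal C sketch, not taken from the original slides, that ties the Pthreads slide and the fork()/exec() question together. It creates one worker thread with pthread_create(), waits for it with pthread_join(), then forks and calls execlp() in the child. Under POSIX, fork() in a multithreaded process gives the child a copy of only the calling thread; and because exec() immediately replaces the whole process image here, duplicating the other threads would have been wasted work anyway. The worker function and the "/bin/ls" command are arbitrary illustrative choices.

/* Minimal sketch: Pthreads creation/join plus fork()+exec() semantics.
   Compile with: gcc demo.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Worker thread: runs concurrently with the main thread. */
static void *worker(void *arg) {
    printf("worker thread %lu running\n", (unsigned long)pthread_self());
    return NULL;
}

int main(void) {
    pthread_t tid;

    /* Whether this is a many-to-one or one-to-one mapping is decided by
       the thread library and OS, not by this code. */
    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        perror("pthread_create");
        return EXIT_FAILURE;
    }
    pthread_join(tid, NULL);        /* wait for the worker to finish */

    pid_t pid = fork();             /* does the child get all threads or one? */
    if (pid == 0) {
        /* Child: exec() replaces the entire address space, so duplicating
           only the calling thread was sufficient. */
        execlp("/bin/ls", "ls", (char *)NULL);
        perror("execlp");           /* reached only if exec fails */
        _exit(EXIT_FAILURE);
    }
    waitpid(pid, NULL, 0);          /* parent waits for the child */
    return EXIT_SUCCESS;
}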
Thread Cancellation
• Terminating a thread before it has finished.
• Two general approaches:
  – Asynchronous cancellation terminates the target thread immediately.
  – Deferred cancellation allows the target thread to periodically check whether it should be cancelled.
• Why have both?
  – Concurrency control … (more about this later)
• (A deferred-cancellation sketch appears after the summary.)

Signal Handling
• Signals are used in UNIX systems to notify a process that a particular event has occurred.
• A signal handler is used to process signals:
  1. A signal is generated by a particular event.
  2. The signal is delivered to a process.
  3. The signal is handled.
• Two types of signals: asynchronous vs. synchronous
  – What is the difference?
• How should a signal be delivered in a multithreaded process?
  – To which thread? One or all?

Signal Handling (continued)
• Options:
  – Deliver the signal to the thread to which the signal applies
    • Applies to synchronous signals
  – Deliver the signal to every thread in the process
  – Deliver the signal to certain threads in the process
  – Assign a specific thread to receive all signals for the process

Thread Pools
• Create a number of threads in a pool, where they await work.
• Advantages:
  – Usually slightly faster to service a request with an existing thread than to create a new thread.
  – Allows the number of threads in the application(s) to be bounded by the size of the pool.

Thread-Specific Data
• Allows each thread to have its own copy of data (not shared!).
• Useful when you do not have control over the thread creation process (e.g., when using a thread pool).

Scheduler Activations
• Both the many-to-many and two-level models require communication to maintain the appropriate number of kernel threads allocated to the application.
• Scheduler activations provide upcalls: a communication mechanism from the kernel up to the thread library.
• This communication allows an application to maintain the correct number of kernel threads.

Summary
• Thread overview
  – Single-threaded vs. multithreaded processes
• Multithreading models
  – Many-to-one, one-to-one, many-to-many, two-level model
• Threading issues
  – Semantics of the fork() and exec() system calls, thread cancellation, signal handling, thread pools, thread-specific data, scheduler activations

End of Chapter 4
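Supplementary sketch (referenced from the Thread Cancellation slide above): a minimal, hedged illustration of deferred cancellation using the standard Pthreads calls pthread_cancel(), pthread_setcanceltype(), and pthread_testcancel(). The worker's busy loop and the one-second delay are placeholders, not part of the lecture material; a real worker would do useful work between cancellation points.

/* Minimal sketch of deferred thread cancellation with Pthreads.
   Compile with: gcc cancel_demo.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *worker(void *arg) {
    int oldtype;
    /* Deferred cancellation is the POSIX default; make it explicit here. */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &oldtype);

    for (;;) {
        /* ... a unit of work that must not be interrupted halfway ... */
        pthread_testcancel();   /* explicit cancellation point: the thread
                                   terminates here, cleanly, if a cancellation
                                   request is pending */
    }
    return NULL;                /* not reached */
}

int main(void) {
    pthread_t tid;

    pthread_create(&tid, NULL, worker, NULL);
    sleep(1);                   /* let the worker run for a moment */

    pthread_cancel(tid);        /* request cancellation; in deferred mode it
                                   only takes effect at a cancellation point */
    pthread_join(tid, NULL);    /* reap the cancelled thread */
    puts("worker cancelled");
    return 0;
}

The design point the slide raises shows up directly here: asynchronous cancellation could kill the worker in the middle of its unit of work (e.g., while holding a lock or updating shared data), whereas deferred cancellation lets the thread choose safe points at which to die.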