Managing Processors
Jeff Chase
Duke University

The story so far: protected CPU mode

Any kind of machine exception transfers control to a registered (trusted) kernel handler running in a protected CPU mode.

[Figure: timeline of user mode and kernel mode. Labels: syscall trap, fault, u-start, u-return, kernel “top half”, kernel “bottom half” (interrupt handlers), clock interrupt, interrupt return.]

Kernel handler manipulates CPU register context to return to selected user context.

The kernel

Every entry to the kernel is the result of a trap, fault, or interrupt. The core switches to kernel mode and transfers control to a handler routine.

[Figure: the OS kernel and its entry points. Labels: syscall trap/return, fault/return, interrupt/return, I/O completions, timer ticks. The kernel holds code and data for system calls (files, process fork/exit/wait, pipes, binder IPC, low-level thread support, etc.) and virtual memory management (page faults, etc.).]

The handler accesses the core register context to read the details of the exception (trap, fault, or interrupt). It may call other kernel routines.

Exceptions: trap, fault, interrupt

• synchronous (caused by an instruction)
– intentional (happens every time): trap: system call (open, close, read, write, fork, exec, exit, wait, kill, etc.)
– unintentional (contributing factors): fault (invalid or protected address or opcode, page fault, overflow, etc.)
• asynchronous (caused by some other event)
– intentional: “software interrupt” (software requests an interrupt to be delivered at a later time)
– unintentional: interrupt (caused by an external event: I/O op completed, clock tick, power fail, etc.)

Every entry to the kernel is the result of a trap, fault, or interrupt. The core sets its mode to protected kernel mode and transfers control to the corresponding handler.

Kernel Stacks and Trap/Fault Handling

Processes execute user code on a user stack in the user virtual memory in the process virtual address space.

Each process has a second kernel stack in kernel space (VM accessible only to the kernel).
[Figure: process address spaces and kernel space. Labels: data, stack, syscall dispatch table.]

System calls and faults run in kernel mode on the process kernel stack.

Kernel code running in P’s process context (i.e., on its kstack) has access to P’s virtual memory.

The syscall handler makes an indirect call through the system call dispatch table to the handler registered for the specific system call.

The kernel

A trap or fault handler may suspend (sleep) the current thread, leaving its state (call frames) on its kernel stack and a saved context in its TCB.

[Figure: kernel entry points and thread queues. Labels: syscall traps, faults, interrupts, sleep queue, ready queue.]

The TCB for a blocked thread is left on a sleep queue for some synchronization object. A later event or action may wake up the thread.

Thread states and transitions

[Figure: thread state diagram. States: running, blocked, ready. Transition labels: sleep (wait, STOP, read, write, listen, receive, etc.), wakeup, yield.]

Scheduler governs these transitions.

Sleep and wakeup are internal primitives. Wakeup adds a thread to the scheduler’s ready pool: a set of threads in the ready state.

Contention on ready thread queues

• A large-scale system may have significant contention on the spinlock for the ready thread queue.
– Each core removes a thread from the ready queue with GetNextThreadToRun() on each context switch.
– Every wakeup adds a thread to the ready queue.
• On average, the frequency of these events grows linearly with the number of cores.
– What is the average wait time for the spinlock?
• To reduce contention, an OS may have a separate run queue for each machine partition.
– Each queue serves a partition of N cores.

Per-CPU ready queues

• lock per runqueue
• preempt on queue insertion
• recalculate priority on expiration

Let’s talk about priority…

CPU dispatch and ready queues

In a typical OS, each thread has a priority, which may change over time.

When a core is idle, pick a thread with the highest priority.

If a higher-priority thread becomes ready, then preempt the thread currently running on the core and switch to the new thread.
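The idle-core rule and the wakeup-preemption rule above can be sketched in a few lines of Python. This is a minimal single-core model, not any particular kernel's code: the `Core` class, its method names, and the convention that a lower number means a higher priority are all assumptions made for illustration.

```python
import heapq
import itertools


class Core:
    """Hypothetical single-core model of priority dispatch (illustration only)."""

    def __init__(self):
        self.running = None            # (priority, name), or None when idle
        self.ready = []                # min-heap of (priority, seq, name)
        self._seq = itertools.count()  # tie-breaker: FIFO within a priority

    def _enqueue(self, prio, name):
        heapq.heappush(self.ready, (prio, next(self._seq), name))

    def dispatch(self):
        """If the core is idle, pick a ready thread with the highest priority."""
        if self.running is None and self.ready:
            prio, _, name = heapq.heappop(self.ready)
            self.running = (prio, name)
        return self.running

    def wakeup(self, prio, name):
        """Add a thread to the ready pool; preempt if it outranks the running thread."""
        self._enqueue(prio, name)
        # Assumption: a lower number means a higher priority.
        if self.running is not None and prio < self.running[0]:
            old_prio, old_name = self.running
            self.running = None
            self._enqueue(old_prio, old_name)  # preempted thread becomes ready again
        return self.dispatch()
```

For example, after `wakeup(20, "editor")` on an idle core the editor thread runs; a later `wakeup(30, "batch")` leaves it running, while `wakeup(5, "audio")` preempts it and returns the editor thread to the ready pool.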
If the quantum expires (timer), then preempt, select a new thread, and switch.

Priority

Most modern OS schedulers use priority scheduling.
– Each thread in the ready pool has a priority value.
– The scheduler favors higher-priority threads.
– Threads inherit a base priority from the associated application/process.
– User-settable relative importance within an application.
– Internal priority adjustments as an implementation technique within the scheduler.
– How to set the priority of a thread?
– How many priority levels? 32 (Windows) to 128 (OS X)

Per-CPU ready queues

On most architectures, a find-first-bit-set instruction is used to find the highest-priority bit set in one of five 32-bit words (for the 140 priorities).
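The bitmap lookup described above can be sketched as follows. This is a minimal Python model under stated assumptions: bit p is set when priority level p has at least one ready thread, and a lower index means a higher priority, so the first set bit marks the best level. The expression `(word & -word).bit_length() - 1` stands in for the hardware find-first-bit-set instruction, and the class and method names are hypothetical.

```python
WORD_BITS = 32
NUM_PRIORITIES = 140
# Five 32-bit words are enough to hold one bit per priority level.
NUM_WORDS = (NUM_PRIORITIES + WORD_BITS - 1) // WORD_BITS


def find_first_set(word):
    """Index of the least-significant set bit of a nonzero word.

    Software stand-in for a find-first-bit-set instruction.
    """
    return (word & -word).bit_length() - 1


class PriorityBitmap:
    """One bit per priority level; a set bit means that level has a ready thread."""

    def __init__(self):
        self.words = [0] * NUM_WORDS

    def _check(self, prio):
        if not 0 <= prio < NUM_PRIORITIES:
            raise ValueError("priority out of range")

    def set(self, prio):
        self._check(prio)
        self.words[prio // WORD_BITS] |= 1 << (prio % WORD_BITS)

    def clear(self, prio):
        self._check(prio)
        self.words[prio // WORD_BITS] &= ~(1 << (prio % WORD_BITS))

    def highest(self):
        """Return the best nonempty level, or None if no bit is set.

        Assumption: a lower index means a higher priority, so scan the
        words in order and take the first set bit found.
        """
        for i, word in enumerate(self.words):
            if word:
                return i * WORD_BITS + find_first_set(word)
        return None
```

The scan touches at most five words regardless of how many threads are ready, which is why the lookup cost does not depend on the length of the ready queues.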