CS 6560 Operating System Design
Lecture 6: Interrupts: Bottom Halves; Kernel Synchronization

References
• Our textbook: Robert Love, Linux Kernel Development, 2nd edition, Novell Press, 2005.
• Understanding the Linux Kernel, 3rd edition, O'Reilly, 2005. (covers 2.6)
• The kernel code and its own documentation.

Plan
• Chap 7: Bottom halves
• Chap 8: Concurrency Issues and Principles
• Chap 9: Concurrency methods in Linux 2.6

Recall: Top and Bottom Halves
• Conflicting needs for interrupts:
– Return quickly
– Get lots of work done
• Solution: Break the interrupt processing into two parts:
– Top half: returns quickly, after setting up the work to be done
– Bottom half: scheduled by the kernel to get the work done at a later time

Bottom Halves
• Bottom halves = general term for mechanisms that defer work generated by interrupt handlers (also an old name for one now-obsolete such mechanism)

When to Run a Bottom Half?
• When?
– In interrupt context - no sleeping allowed
– In process context - sleeping allowed
• Each has advantages, so Linux 2.6 offers both types of bottom halves.

Process and Interrupt Contexts
• Process context: execution of kernel code that is scheduled by the process scheduler.
• Interrupt context: execution of kernel code that is responding to an interrupt and is not scheduled by the process scheduler.

Bottom Half Mechanisms in Linux 2.6
• The current mechanisms:
– Softirqs
– Tasklets
– Work queues
• Previous versions of the kernel had others that are now obsolete:
– BH - removed in the 2.5 kernel
– Task queues - removed in the 2.5 kernel

Softirq
• Runs in interrupt context - no blocking allowed
• Statically defined at compile time - limited to 32
• Available since the 2.3 version of the kernel
• Handles a few special cases such as network and SCSI disk processing.
• Also handles tasklets
• Code is mainly in kernel/softirq.c (and other places such as include/linux/irq_cpustat.h)

Properties
• A softirq is never preempted by another softirq - only by an interrupt
• A softirq can run on more than one processor at a time.

Implementation of Softirqs
• Data structure: softirq_action
– Has a function reference and a data reference
• Static array of these: softirq_vec[32]
• The handler function has a simple prototype:
void softirq_handler(struct softirq_action *)

Registering Softirqs
• Softirqs are enumerated at compile time
• They are registered via the open_softirq function.
• This just places them in softirq_vec.

When Do Softirqs Run?
• When needed, they are marked for running with the raise_softirq function. This turns on a bit in the softirq_pending word.
• They are later run by the kernel with the do_softirq function. This is called
– in the return-from-hardware-interrupt code
– in the ksoftirqd kernel thread
– elsewhere, such as in the networking subsystem

Tasklets
• Tasklets are a mechanism implemented on top of softirqs.
• While softirqs are reserved for certain purposes, tasklets are the recommended way to do low-overhead bottom-half processing.
• They are dynamically created and managed.

Tasklet Operations
• Declaring: DECLARE_TASKLET and DECLARE_TASKLET_DISABLED
• Scheduling (mark as pending): tasklet_schedule
• Disabling: tasklet_disable and tasklet_disable_nosync
• Enabling: tasklet_enable
• Removing: tasklet_kill

Workqueues
• Workqueues run in process context and contain a list of work to be done.
• Each unit of work has a function and a pointer to some data.
• Each workqueue has a name, a list of work, and a set of worker threads, one per CPU. The threads are named after the workqueue.
• Initially, there is one workqueue whose worker threads are called "events". Individual threads are called "events/0", "events/1", and so on.
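The (function, data) work items described above can be sketched as a minimal single-threaded userspace analogue in plain C. This is not the kernel API: the names work_item, queue_work_item, and flush_work_items are hypothetical, chosen only to mirror the roles of DECLARE_WORK, queue_work, and flush_workqueue.

```c
#include <stddef.h>

/* A unit of deferred work: a function plus a data pointer,
 * mirroring the kernel's (function, data) pairs. */
struct work_item {
    void (*func)(void *data);
    void *data;
    struct work_item *next;
};

/* A queue is just a linked list of pending work. */
struct work_queue {
    struct work_item *head, *tail;
};

/* Append an item to the pending list; in the kernel, queue_work()
 * similarly marks work pending for a worker thread. */
void queue_work_item(struct work_queue *q, struct work_item *w)
{
    w->next = NULL;
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
}

/* Run everything pending, in order, like flushing a workqueue.
 * The real kernel runs the items in per-CPU worker threads. */
void flush_work_items(struct work_queue *q)
{
    struct work_item *w = q->head;
    q->head = q->tail = NULL;
    while (w) {
        struct work_item *next = w->next;
        w->func(w->data);
        w = next;
    }
}

/* Example work function: increments an int counter. */
void count_up(void *data)
{
    ++*(int *)data;
}
```

In the real interface the worker threads sleep until work arrives; here the deferral lasts only until flush_work_items is called, which is enough to show why the work runs in process context, after the interrupt handler that queued it has long since returned.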
Workqueue Operations
• Creating workqueues: create_workqueue makes a new workqueue with the given name.
• Creating work (give name, function, data):
– Statically: DECLARE_WORK
– Dynamically: INIT_WORK
• Scheduling work:
– Generically: queue_work and queue_delayed_work
– Default workqueue: schedule_work and schedule_delayed_work
• Flushing work:
– Generically: flush_workqueue
– Default workqueue: flush_scheduled_work
• Cancelling work: cancel_delayed_work

Kernel Synchronization
• Chapters 8 and 9
• Issue: sharing data in a multiprocessor, preemptive environment
• Solutions:
– Limit sharing
– Use locks

Concurrency Issues
• True concurrency: multiple processors may be running code at the same time. They may contend for the same data.
• Preemption: running code may be interrupted by the running of other code. The new code may contend for the same data as the old.

Blocking and Sleeping
• Blocking: code may stop running while it waits for an event to occur.
• Sleeping: blocking can only happen in process context, where there is a process to put to sleep.

Scalability
• Adding more processors adds more synchronization overhead, so a quad-core system does not run four times faster.

Granularity
• Granularity: how much data should one lock protect?
– Bits
– Words
– Structures

Danger of Data Contention
• The danger of data contention is corruption.
• Example: updates - read, modify, write

Solutions
• Don't share:
– Don't use global variables.
– Have separate copies of variables for each CPU.
• Carefully design how data is shared, and use a variety of locking and synchronization methods.

Basic Principles
• Protect data, not code.
• Critical region: code that accesses shared data
• Typical protection: bracket critical regions

Linux 2.6 Uses
• Per-CPU copies of variables
• Preemption disabling
• Interrupt disabling
• Atomic operations
• Spin locks
• Semaphores
• Completion variables
• Seq locks
• Barriers

Per-CPU Variables
• Covered under memory management in LKD, Chapter 11.
• Linux 2.4 used an explicit array, indexed by CPU.
• The function get_cpu gives the index number of the current CPU and disables preemption. The function put_cpu re-enables preemption.
• Linux 2.6 has a percpu interface.

Atomic Operations
• Linux 2.6 has atomic integer variables and atomic bitwise operations.

Atomic Integer Operations
• ATOMIC_INIT(int i) // static initializer for an atomic_t with value i
• int atomic_read(atomic_t *v) // atomically returns the value of v
• void atomic_set(atomic_t *v, int i) // atomically sets v equal to the value of i
• void atomic_add(int i, atomic_t *v) // atomically adds i to v
• void atomic_sub(int i, atomic_t *v) // atomically subtracts i from v
• void atomic_inc(atomic_t *v) // atomically increments v
• void atomic_dec(atomic_t *v) // atomically decrements v
• And more - see page 133 for details on atomic_sub_and_test, atomic_add_negative, atomic_dec_and_test, atomic_inc_and_test

Atomic Bitwise Operations
• Operations:
– Set, clear, and flip individual bits
– Test and set, clear, or flip individual bits
– Return a bit's value
– Find the first set bit and the first cleared bit
• Atomic and non-atomic forms (the latter prefixed by a double underscore)
• Danger: the bit position can overflow the actual data

Spin Locks
• Why?
– A way of waiting without sleeping
– Used in SMP systems
• Notes:
– Sleeping has high overhead and is not available in interrupt context
– Spin locks must only be held for a short time - maybe five lines of code

Spin Lock Use
• Bracket the critical region with spin_lock and spin_unlock
• Example:
spin_lock(&my_lock);
/* critical section */
spin_unlock(&my_lock);

Spin Lock Implementation
• See include/asm-i386/spinlock.h
static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
        asm volatile("\n1:\t"
                     LOCK_PREFIX " ; decb %0\n\t"
                     "jns 3f\n"
                     "2:\t"
                     "rep;nop\n\t"
                     "cmpb $0,%0\n\t"
                     "jle 2b\n\t"
                     "jmp 1b\n"
                     "3:\n\t"
                     : "+m" (lock->slock) : : "memory");
}

Spin Lock Methods
• spin_lock() // acquires the spin lock
• spin_unlock() // releases the spin lock
• spin_lock_irq() // disables local interrupts and acquires the spin lock
• spin_unlock_irq() // releases the spin lock and enables local interrupts
• spin_lock_irqsave() // saves the current interrupt state, disables local interrupts, and acquires the spin lock
• spin_unlock_irqrestore() // releases the spin lock and restores the saved interrupt state
• spin_lock_init()
• spin_trylock()
• spin_is_locked()

More Coming
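The busy-wait idea behind __raw_spin_lock can be sketched in portable userspace C with a C11 atomic_flag. This is an illustrative analogue, not the kernel's implementation: the my_spin_* names are hypothetical, chosen to mirror spin_lock, spin_trylock, and spin_unlock, and there is no interrupt disabling here.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag locked;   /* set while the lock is held */
} my_spinlock_t;

/* Start in the unlocked state, like spin_lock_init(). */
void my_spin_lock_init(my_spinlock_t *l)
{
    atomic_flag_clear(&l->locked);
}

/* Spin (busy-wait) until the flag was previously clear; the atomic
 * test-and-set plays the role of the "lock; decb" in the x86 code. */
void my_spin_lock(my_spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked,
                                             memory_order_acquire))
        ;  /* spin; the kernel inserts a pause ("rep;nop") here */
}

/* Return true if the lock was acquired without spinning,
 * like spin_trylock(). */
bool my_spin_trylock(my_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

void my_spin_unlock(my_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The acquire/release memory orders correspond to the "memory" clobber in the inline assembly: they stop loads and stores in the critical section from being reordered past the lock operations, which is exactly why a spin lock protects the data it brackets.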