CSC 660: Advanced OS
Interrupts

CSC 660: Advanced Operating Systems

Topics
1. Types of Interrupts
2. PIC and IRQs
3. Interrupt Handlers
4. Top Halves and Bottom Halves
5. Enabling/Disabling Interrupts
6. SoftIRQs
7. Tasklets
8. Work Queues
9. Timer Interrupts

How can hardware communicate with CPU?
Busy Wait
  Issue hardware request.
  Wait in a tight loop until the answer arrives.
Polling
  Issue hardware request.
  Periodically check hardware status.
Interrupts
  Issue hardware request.
  Hardware signals CPU when answer is ready.

Types of Interrupts
Synchronous
  Produced by CPU while executing instructions.
  Issued only after the CPU finishes executing an instruction.
  Often called exceptions.
  Ex: page faults, system calls, divide by zero
Asynchronous
  Generated by other hardware devices.
  Occur at arbitrary times, including while CPU is busy executing an instruction.
  Ex: I/O, timer interrupts

Types of Exceptions
• Faults
  – Correctable errors like page faults.
  – Execution resumes at the instruction identified by saved EIP.
• Traps
  – Triggered by int or similar instructions.
  – Used for system calls and debugging.
  – Saved value of EIP is the instruction after the trap.
• Aborts
  – Serious errors, where it may be impossible to save EIP.
  – The affected process is terminated.

x86 Exceptions
#   Type   Signal   Description
0   Fault  SIGFPE   Integer divide by zero error.
1   Trap   SIGTRAP  Debug.
3   Trap   SIGTRAP  Debugger breakpoint created by int3 instruction.
4   Trap   SIGSEGV  Overflow: into instruction executed with OF flag set.
5   Fault  SIGSEGV  Address bound check failed.
6   Fault  SIGILL   Invalid opcode.
7   Fault  None     Device (FP, MMX, SSE) not available.
8   Abort  None     Double fault (two exceptions that cannot be handled serially).
10  Fault  SIGSEGV  Invalid TSS.
11  Fault  SIGBUS   Segment not present (in memory).
12  Fault  SIGBUS   Stack segment fault.
13  Fault  SIGSEGV  General protection.
14  Fault  SIGSEGV  Page fault.
16  Fault  SIGFPE   Floating point error.
17  Fault  SIGBUS   Alignment check (misaligned data item).

Programmable Interrupt Controller
PIC connects
  Hardware devices that issue IRQs.
  CPU: INTR pin and data bus.
PIC features
  15 IRQ lines
  Sharing and dynamic assignment of IRQs.
  Masking (disabling) of selected IRQs.
  CPU masking of all maskable interrupts: cli, sti.
APIC: Advanced PIC
  Handles multiprocessor systems.

Interrupt Vectors
Vector Range  Use
0-19          Nonmaskable interrupts and exceptions.
20-31         Intel-reserved
32-127        External interrupts (IRQs)
128           System Call exception
129-238       External interrupts (IRQs)
239           Local APIC timer interrupt
240           Local APIC thermal interrupt
241-250       Reserved by Linux for future use
251-253       Interprocessor interrupts
254           Local APIC error interrupt
255           Local APIC spurious interrupt

IRQ Example
IRQ  INT  Hardware Device
0    32   Timer (required)
1    33   Keyboard
2    34   PIC Cascading (required)
3    35   Second serial port
4    36   First serial port
6    38   Floppy Disk
8    40   System Clock
10   42   Network Interface
11   43   USB port, sound card
12   44   PS/2 Mouse
13   45   Math Coprocessor
14   46   EIDE first controller
15   47   EIDE second controller

Interrupt Handling

IRQ Handling
1. Monitor IRQ lines for raised signals.
   If multiple IRQs raised, select lowest # IRQ.
2. If raised signal detected
   1. Converts raised signal into vector (0-255).
   2. Stores vector in I/O port, allowing CPU to read.
   3. Sends raised signal to CPU INTR pin.
   4. Waits for CPU to acknowledge interrupt.
   5. Kernel runs do_IRQ().
   6. Clears INTR line.
3. Goto step 1.

Nested Kernel Control Paths
• Interrupt handlers can interrupt other handlers.
• Exception handlers can only interrupt exception handlers.

do_IRQ
1. Kernel jumps to entry point in entry.S.
2. Entry point saves registers, calls do_IRQ().
3. Finds IRQ number in saved %EAX register.
4. Looks up IRQ descriptor using IRQ #.
5. Acknowledges receipt of interrupt.
6. Disables interrupt delivery on line.
7. Calls handle_IRQ_event() to run handlers.
8. Cleans up and returns.
9. Jumps to ret_from_intr().

handle_IRQ_event()
  fastcall int handle_IRQ_event(unsigned int irq, struct pt_regs *regs,
                                struct irqaction *action)
  {
      int ret, retval = 0, status = 0;

      /* Handlers without SA_INTERRUPT run with local interrupts enabled. */
      if (!(action->flags & SA_INTERRUPT))
          local_irq_enable();

      /* Run every handler registered on this (possibly shared) line. */
      do {
          ret = action->handler(irq, action->dev_id, regs);
          if (ret == IRQ_HANDLED)
              status |= action->flags;
          retval |= ret;
          action = action->next;
      } while (action);

      if (status & SA_SAMPLE_RANDOM)
          add_interrupt_randomness(irq);
      local_irq_disable();
      return retval;
  }

Interrupt Handlers
Function kernel runs in response to interrupt.
  More than one handler can exist per IRQ.
Exception vs interrupt handlers
  Exception handlers send signal to current process.
  Interrupt handlers can’t as they’re asynchronous.
Must run quickly.
  Resume execution of interrupted code.
  How to deal with high work interrupts?
  Ex: network, hard disk

Top and Bottom Halves
Top Half
  The interrupt handler.
  Current interrupt disabled, possibly all disabled.
  Runs in interrupt context, not process context.
    Can’t sleep.
  Acknowledges receipt of interrupt.
  Schedules bottom half to run later.
Bottom Half
  Runs later with interrupts enabled.
  Performs most work required.
  Can sleep only when run in process context (work queues); softIRQs and tasklets run in interrupt context.
  Ex: copies network data to memory buffers.

Interrupt Context
Not associated with a process.
  Cannot sleep: no task to reschedule.
  current macro points to interrupted process.
Shares kernel stack of interrupted process.
  Be very frugal in stack usage.

Registering a Handler
request_irq()
  Register an interrupt handler on a given line.
free_irq()
  Unregister a given interrupt handler.
  Disable interrupt line if all handlers unregistered.

Registering a Handler
  int request_irq(unsigned int irq,
                  irqreturn_t (*handler)(int, void *, struct pt_regs *),
                  unsigned long irqflags,
                  const char *devname,
                  void *dev_id)

  irqflags = SA_INTERRUPT | SA_SAMPLE_RANDOM | SA_SHIRQ

Writing an Interrupt Handler
  irqreturn_t ih(int irq, void *devid, struct pt_regs *r)
Differentiating between devices
  Pre-2.0: irq
  Current: dev_id
Registers
  Pointer to registers before interrupt occurred.
Return Values
  IRQ_NONE: Interrupt not for handler.
  IRQ_HANDLED: Interrupt handled.

RTC Handler
  irqreturn_t rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
  {
      spin_lock(&rtc_lock);
      rtc_irq_data += 0x100;
      rtc_irq_data &= ~0xff;
      if (rtc_status & RTC_TIMER_ON)
          mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
      spin_unlock(&rtc_lock);

      /* Now do the rest of the actions */
      spin_lock(&rtc_task_lock);
      if (rtc_callback)
          rtc_callback->func(rtc_callback->private_data);
      spin_unlock(&rtc_task_lock);
      wake_up_interruptible(&rtc_wait);
      kill_fasync(&rtc_async_queue, SIGIO, POLL_IN);
      return IRQ_HANDLED;
  }

Interrupt Control
Disable/Enable Local Interrupts
  local_irq_disable();
  /* interrupts are disabled */
  local_irq_enable();
Saving and Restoring IRQ state
  Useful when the prior IRQ state is unknown.
  unsigned long flags;
  local_irq_save(flags);
  /* interrupts are disabled */
  local_irq_restore(flags);
  /* interrupts in original state */

Interrupt Control
Disabling Specific Interrupts
  For legacy hardware, avoid for shared IRQ lines.
  disable_irq(irq)
  enable_irq(irq)
What about other processors?
  Disable local interrupts + spin lock.
  We’ll talk about spin locks next time…

Returning from Interrupts
Runs ret_from_intr() or ret_from_exception()
– Resumes any interrupted kernel control paths.
– Returns to User Mode if no more kernel control paths.
– Performs any pending work (bottom halves).
– Calls schedule() if process switch pending.
– Handles pending signals.
– Deals with virtual-8086 mode.
– Handles debugger single stepping.

Bottom Halves
Perform most work required by interrupt.
Run with interrupts enabled; only work queues run in process context.
Three forms of deferring work
  SoftIRQs
  Tasklets
  Work Queues

SoftIRQs
Statically allocated at compile time.
Only 32 softIRQs can exist (only 6 currently used).
  struct softirq_action {
      void (*action)(struct softirq_action *);
      void *data;
  };
  static struct softirq_action softirq_vec[32];
Tasklets built on SoftIRQs.
  All tasklets share the HI and TASKLET softIRQs.
  Dynamically allocated.

SoftIRQ Handlers
Prototype
  void softirq_handler(struct softirq_action *)
Calling
  my_softirq->action(my_softirq);
Pre-emption
  SoftIRQs don’t pre-empt other softIRQs.
  Interrupt handlers can pre-empt softIRQs.
  Another softIRQ can run on other CPUs.

Executing SoftIRQs
Interrupt handler marks softIRQ.
  Called raising the softirq.
SoftIRQs checked for execution:
  In return from hardware interrupt code.
  In ksoftirqd kernel thread.
  In any code that explicitly checks for softIRQs.
do_softirq()
  Loops over all softIRQs.
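
The raise-then-run cycle can be sketched in user space. This is a minimal illustration, not kernel code: the names open_softirq_sim, raise_softirq_sim, do_softirq_sim, run_order, and the two demo handlers are hypothetical stand-ins, and the pending mask is a plain global rather than per-CPU state. It shows only the idea of marking a bit when a softIRQ is raised and later looping over the bits from lowest number (highest priority) upward.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NR_SOFTIRQS 32
#define LOG_SIZE    64

/* Hypothetical user-space stand-ins; not the kernel's definitions. */
typedef void (*softirq_fn)(void);

softirq_fn softirq_vec[NR_SOFTIRQS];   /* one handler slot per softIRQ */
uint32_t   softirq_pending;            /* bit n set => softIRQ n raised */
int        run_order[LOG_SIZE];        /* numbers of handlers run, in order */
int        run_count;

/* Register a handler for softIRQ number nr. */
void open_softirq_sim(unsigned nr, softirq_fn fn)
{
    assert(nr < NR_SOFTIRQS);
    softirq_vec[nr] = fn;
}

/* "Raising" a softIRQ only marks it pending; nothing runs yet. */
void raise_softirq_sim(unsigned nr)
{
    assert(nr < NR_SOFTIRQS);
    softirq_pending |= (uint32_t)1 << nr;
}

/* Run every pending softIRQ once, lowest number first. */
void do_softirq_sim(void)
{
    uint32_t pending = softirq_pending;

    /* Snapshot and clear, so a handler that re-raises itself
     * is picked up on the next call rather than this one. */
    softirq_pending = 0;

    for (unsigned nr = 0; pending != 0; nr++, pending >>= 1) {
        if ((pending & 1) && softirq_vec[nr]) {
            if (run_count < LOG_SIZE)
                run_order[run_count++] = (int)nr;
            softirq_vec[nr]();
        }
    }
}

/* Demo handlers (illustrative only). */
void timer_action(void)  { puts("timer bottom half"); }
void net_rx_action(void) { puts("receive network packets"); }
```

Registering timer_action on slot 1 and net_rx_action on slot 3, raising 3 and then 1, and calling do_softirq_sim() runs slot 1 before slot 3, because the loop walks the mask from bit 0 upward regardless of the order in which the softIRQs were raised.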
Current SoftIRQs
SoftIRQ  Priority  Description
HI       0         High priority tasklets.
TIMER    1         Timer bottom half.
NET_TX   2         Send network packets.
NET_RX   3         Receive network packets.
SCSI     4         SCSI bottom half.
TASKLET  5         Tasklets.

Tasklets
• Implemented as softIRQs.
  – Linked list of tasklet_struct objects.
• Two priorities of tasklets:
  – HI: tasklet_hi_schedule()
  – TASKLET: tasklet_schedule()
• Scheduled tasklets run via do_softirq()
  – HI action: tasklet_hi_action()
  – TASKLET action: tasklet_action()

ksoftirqd
SoftIRQs may occur at high frequencies.
  SoftIRQs may re-raise themselves.
Kernel will not handle re-raised softIRQs immediately in do_softirq().
Kernel thread ksoftirqd solves problem.
  One thread per processor.
  Runs at lowest priority (nice +19).

Work Queues
Defer work into a kernel thread.
  Execute in process context.
  One thread per processor: events/n.
  Processes can create own threads if needed.
  struct workqueue_struct {
      struct cpu_workqueue_struct cpu_wq[NR_CPUS];
      const char *name;
      struct list_head list; /* Empty if single thread */
  };

Work Queue Data Structures
[Figure: worker thread; cpu_workqueue_struct (1/CPU); workqueue_struct (1/thread type); work_struct (1/deferrable function)]

Worker Thread
Each thread runs worker_thread()
1. Marks self as sleeping.
2. Adds self to wait queue.
3. If linked list of work empty, schedule().
4. Else, marks self as running, removes from queue.
5. Calls run_workqueue() to perform work.

run_workqueue()
1. Loops through list of work_structs
     struct work_struct {
         unsigned long pending;
         struct list_head entry;
         void (*func)(void *);
         void *data;
         void *wq_data;
         struct timer_list timer;
     };
2. Retrieves function, func, and arg, data
3. Removes entry from list, clears pending
4. Invokes function

Which Bottom Half to Use?
1. If needs to sleep, use work queue.
2. If doesn’t need to sleep, use tasklet.
3. What about serialization needs?

Bottom Half  Context    Serialization
Softirq      Interrupt  None
Tasklet      Interrupt  Against same tasklet
Work queues  Process    None

Timer Interrupt
Executed HZ times a second.
  #define HZ 1000 /* <asm/param.h> */
  Called the tick rate.
  Time between two interrupts is a tick.
  Driven by Programmable Interval Timer (PIT).
Interrupt handler responsibilities
  Updating uptime, system time, kernel stats.
  Rescheduling if current has exhausted time slice.
  Balancing scheduler runqueues.
  Running dynamic timers.

Jiffies
Jiffies = number of ticks since boot.
  extern unsigned long volatile jiffies;
  Incremented each timer interrupt.
  Uptime = jiffies/HZ seconds.
  Convert for user space: jiffies_to_clock_t()
Comparing jiffies, while avoiding overflow.
  time_after(a, b): a > b
  time_before(a, b): a < b
  time_after_eq(a, b): a >= b
  time_before_eq(a, b): a <= b

Time Calculations
• Do not assume HZ is 100 or 1000.
  – Use HZ constant in calculations instead.
• Use time conversion functions from jiffies.h
  – msecs_to_jiffies: converts ms to jiffies
  – jiffies_to_msecs: converts jiffies to ms
  – timespec_to_jiffies: struct timespec -> jiffies
  – jiffies_to_timespec: jiffies -> struct timespec

Timer Interrupt Handler
1. Increments jiffies.
2. Update resource usages (sys + user time).
3. Run dynamic timers.
4. Execute scheduler_tick().
5. Update wall time.
6. Calculate load average.
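
Why the jiffies comparison helpers matter is easiest to see with a counter that wraps. The sketch below is a user-space illustration under stated assumptions: tick_t is a hypothetical 32-bit stand-in for jiffies, and tick_after, tick_before, and ms_to_ticks are illustrative names that mimic the idea behind time_after, time_before, and msecs_to_jiffies rather than reproducing the kernel's macros. The comparison looks at the sign of the difference instead of comparing the raw values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit tick counter standing in for jiffies. */
typedef uint32_t tick_t;

/* Nonzero if a is later than b, even when the counter has wrapped.
 * The unsigned subtraction wraps modulo 2^32; reading the result as a
 * signed value gives the direction, provided the two times are less
 * than 2^31 ticks apart. (The unsigned-to-signed conversion is
 * implementation-defined in C; it is two's complement on common
 * compilers.) */
int tick_after(tick_t a, tick_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Nonzero if a is earlier than b. */
int tick_before(tick_t a, tick_t b)
{
    return tick_after(b, a);
}

/* Convert milliseconds to ticks for a given tick rate, rounding up,
 * so the caller never hard-codes an assumption such as HZ == 100. */
tick_t ms_to_ticks(uint32_t ms, uint32_t hz)
{
    assert(hz > 0);
    return (tick_t)(((uint64_t)ms * hz + 999) / 1000);
}
```

With a counter value of 0xFFFFFFF0 and a second reading taken 0x20 ticks later, the second reading wraps to 0x10. A plain > comparison reports it as earlier, while tick_after() reports it as later, which is the behavior the overflow-safe helpers are meant to provide.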
Using Timers
Declare the timer
  struct timer_list my_timer;
Initialize the timer
  init_timer(&my_timer);
Set your desired timeout and callback
  my_timer.expires = jiffies + delay;
  my_timer.data = 0;
  my_timer.function = my_function;
Activate the timer
  add_timer(&my_timer);
This timer will cause my_function(0) to be executed after delay ticks have passed.

Timers
Timers are executed via TIMER_SOFTIRQ
  run_timer_softirq() executes all expired timers.
To change the expiration time of a timer
  mod_timer(&my_timer, jiffies + new_delay);
To deactivate a timer prior to expiration
  del_timer(&my_timer);

Delaying Execution
Busy Looping
  unsigned long timeout = jiffies + delay;
  while (time_before(jiffies, timeout))
      ;
Small Delays
  void udelay(unsigned long usecs)
  void mdelay(unsigned long msecs)
schedule_timeout()

schedule_timeout()
Puts task to sleep until specified time elapsed.
  Sleep time may be longer than requested.
  Creates a local timer to do the sleep.
  Returns the time remaining if awakened prematurely, 0 otherwise.
Using schedule_timeout()
  /* Set state to interruptible sleep */
  set_current_state(TASK_INTERRUPTIBLE);
  /* Sleep at least delay jiffies */
  schedule_timeout(delay);
  /* Task will resume in TASK_RUNNING */

References
1. Daniel P. Bovet and Marco Cesati, Understanding the Linux Kernel, 3rd edition, O’Reilly, 2005.
2. Jonathan Corbet et al., Linux Device Drivers, 3rd edition, O’Reilly, 2005.
3. Robert Love, Linux Kernel Development, 2nd edition, Prentice-Hall, 2005.
4. Claudia Rodriguez et al., The Linux Kernel Primer, Prentice-Hall, 2005.
5. Peter Salzman et al., Linux Kernel Module Programming Guide, version 2.6.1, 2005.
6. Andrew S. Tanenbaum, Modern Operating Systems, 3rd edition, Prentice-Hall, 2005.