Lecture 44: Bottom Halves
Ch 7 Bottom Halves

Why Bottom Halves?
• Two conflicting goals
  – An interrupt handler must execute as quickly as possible because it
    • runs with locks held and interrupts disabled
    • blocks other work
    • is very time-critical
  – But it also needs to perform a lot of work
• Solution: divide the work into two halves
  – Top half: time-critical, quick and simple work; runs with interrupts disabled.
  – Bottom half: less critical, larger amount of work; deferred, runs with interrupts enabled.

Top Half & Bottom Half
(1) An interrupt request arrives and the top half runs.
(2) The top half schedules bottom halves (i.e., sets bits in softirq_pending[cpu]).
(3) The interrupt handler (I.H.) exits.
(4) do_softirq() is invoked; it executes each bottom half whose mask bit is set.

[Figure: the softirq_pending[cpu] bit mask next to softirq_vec[cpu]; each softirq_vec entry holds an action pointer and a data pointer.]

Who invokes do_softirq()?
- the returning interrupt handler, or
- a low-priority kernel thread, or
- the network subsystem

    struct softirq_action {
        void (*action)(struct softirq_action *);
        void *data;
    };

3 Ways to Register Bottom Half Handlers

Mechanism  | Registering handlers | Trade-off
Softirq    | open_softirq()       | Runtime efficiency
Tasklet    | DECLARE_TASKLET()    | -
Work Queue | DECLARE_WORK()       | Easy to code

(1) Softirq

Defining Softirq
• Software interrupt (softirq) vs. hardware interrupt (IRQ)
• softirq_pending[cpu] is a 32-bit bitmask
  – The interrupt handler marks an entry by setting its bit
  – If a bit is set, a particular bottom half is requested
  – Each bit corresponds to an entry in softirq_vec[32], which points to a particular softirq action struct
• do_softirq() scans the mask bits and executes the softirq handlers

    struct softirq_action {
        void (*action)(struct softirq_action *);  /* function to run */
        void *data;                               /* data for function */
    };

In Linux, only the first few entries are used (Bovet p.
148)

Softirq         | Description
HI_SOFTIRQ      | High-priority tasklets
TIMER_SOFTIRQ   | Timer bottom half
NET_TX_SOFTIRQ  | Transmit network packets
NET_RX_SOFTIRQ  | Receive network packets
SCSI_SOFTIRQ    | SCSI bottom half
TASKLET_SOFTIRQ | Regular tasklets

do_softirq()

[Figure: pending is a local copy of softirq_pending[cpu]; h walks softirq_vec[cpu], whose entries point to softirq handlers such as f(), SCSI(), and IP().]

    asmlinkage void do_softirq(void)
    {
        int max_restart = MAX_SOFTIRQ_RESTART;
        __u32 pending;
        unsigned long flags;

        if (in_interrupt())
            return;

        local_irq_save(flags);
        pending = local_softirq_pending();   /* copy the bit mask into a local variable */
        if (pending) {
            struct softirq_action *h;

            local_bh_disable();
    restart:
            local_softirq_pending() = 0;     /* clear the original bit mask */
            local_irq_enable();
            h = softirq_vec;                 /* array name alone is a pointer to the array */
            do {
                if (pending & 1)
                    h->action(h);            /* bottom half softirq handler for this bit */
                h++;                         /* pointer arithmetic: next array element */
                pending >>= 1;               /* next mask bit */
            } while (pending);
            local_irq_disable();
            .....
    }

Love, Chapter 7

Invoking do_softirq()
1. Returning from a hardware interrupt handler
   – before do_IRQ() returns, it calls irq_exit(), which calls do_softirq()
2. Kernel thread (Bovet, p. 150)
   – a low-priority kernel thread called ksoftirqd_CPUn
   – it runs the ksoftirqd() function, which calls do_softirq()
3.
Any code (such as the network subsystem)
   – checks the softirq pending bits and calls do_softirq()

Concurrent Execution of Softirq
• IRQm arrives and CPUk is selected: do_IRQ() runs and the ISR sets softirq bit (m).
• IRQm arrives again and CPUi is selected: do_IRQ() runs and the ISR sets softirq bit (m).
• CPUk invokes do_softirq(): it checks softirq bit (m), clears the original mask bits, and invokes the IP() softirq handler.
• CPUi invokes do_softirq(): it checks softirq bit (m), clears the original mask bits, and invokes the IP() softirq handler.

[Figure: timeline with points t1, t2, t3. irq_desc[] has entries for IRQ1 ... IRQm (timer, network, SCSI), each with ISR actions such as f1(), f2(), g1(), g2(). do_IRQ() -> handle_IRQ_event() runs the ISRs, which set bits in softirq_pending[cpu]; do_softirq() then runs the softirq handlers h(), SCSI(), and IP() from softirq_vec[cpu].]

Mutual Exclusion in Softirq Execution

On CPUj:
1. At t0, an IRQ arrives at the PIC
2. HW dynamic IRQ distribution: CPUj is chosen
3. CPUj is interrupted and sets the softirq bit
4. CPUj calls do_softirq()
5. CPUj checks the softirq mask bit and calls action(h) /* no lock */

On CPUk:
1. At t1, an IRQ arrives at the PIC
2. HW dynamic IRQ distribution: CPUk is chosen
3. CPUk is interrupted and sets the softirq bit
4. CPUk calls do_softirq()
5. CPUk checks the softirq mask bit and calls action(h) /* no lock */

Timeline: t0 (CPUj) f(), t1 (CPUk) f(), t2 (CPUj) g()

• Any CPU can execute softirq handlers at any time
• Maximum concurrency, but careful coding is required
  – mutual exclusion on shared data access
  – reentrant code

(2) Tasklet

Softirq vs. Tasklet
• Softirq
  – handlers may run simultaneously on different CPUs
  – maximum throughput, difficult coding (reentrancy, shared data access)
  – good for system-level work (e.g., network packet handling on SMP)
• Some devices do not need softirq concurrency
  – for example
    • a device driver that needs exclusive access to data
    • a device driver that transfers a series of bits
  – Such handlers should run serially.
    (They do not need concurrency.)
• Tasklet
  – a special type of softirq
  – the same tasklet cannot run simultaneously on different CPUs
  – if f() starts to run on one CPU, other CPUs cannot run f()
  – (another CPU may run a different tasklet, say g(), in parallel)
  – a tasklet is easy to code (no shared data concerns, no need for reentrant code)
  – less concurrency

Data Structure for Tasklet

tasklet_struct
  state   run/pending
  count   enabled/disabled
  *func   pointer to the tasklet function
  data    data for the function
  next    next tasklet in the list

To prevent concurrent execution of tasklet handlers, we need a lock for each tasklet function.
struct tasklet_struct has a pointer to the function (*func) and a state flag that acts as the lock for the function:
  0: no CPU is running this tasklet function; set the state flag and run this tasklet
  1: this tasklet function is running on another CPU

For a tasklet, "which CPU is executing this tasklet?" is important.
Hence, tasklets are linked to individual CPUs.

[Figure: tasklet_vec[cpu] has one tasklet_head per CPU (cpu0, cpu1, ...), each pointing to a linked list of tasklet_struct.]

Some tasklets are high priority, others regular.

[Figure: tasklet_hi_vec[cpu] has the same layout as tasklet_vec[cpu], with its own per-CPU lists of tasklet_struct.]

Activating the Tasklet
Bovet p.
152
• Invoke tasklet_schedule() or tasklet_hi_schedule()
• tasklet_schedule() {
      get the logical number of the CPU that is executing this function
      add the tasklet descriptor to the list pointed to by tasklet_vec[cpu] or tasklet_hi_vec[cpu]
      invoke cpu_raise_softirq() to activate the corresponding softirq
  }

[Figure: tasklet_hi_vec[cpu] and tasklet_vec[cpu], each with per-CPU tasklet_head entries (cpu0, cpu1, ...) pointing to tasklet_struct nodes for f() and g(); fields: state (run/pending), count (enabled/disabled), function pointer, data (for function), next.]

Softirq entries and the tasklet lists they serve:
  HI_SOFTIRQ (tasklet_hi_vec[])
  TIMER_SOFTIRQ
  NET_TX_SOFTIRQ
  TASKLET_SOFTIRQ (tasklet_vec[])

[Figure: irq_desc[] entries IRQ1 ... IRQm with ISR actions f1(), f2(), g1(), g2(). do_IRQ() -> handle_IRQ_event() runs the ISRs, which set bits in softirq_pending[cpu]; do_softirq() runs the softirq handlers tasklet_hi_action() and tasklet_action() from softirq_vec[cpu].]

tasklet_action() or tasklet_hi_action() {
    get the number of the CPU executing the function
    list = tasklet_vec[cpu]; tasklet_vec[cpu] = NULL
    for each tasklet in the list:
        if its state is running or it is disabled, give up on this tasklet
        else, set the state flag and run this tasklet
}
Bovet p.
153

[Figure: per-CPU tasklet_hi_vec[cpu] and tasklet_vec[cpu] lists of tasklet_struct (state, count, function pointer, data, next).]

    asmlinkage void do_softirq(void)
    {
        int max_restart = MAX_SOFTIRQ_RESTART;
        __u32 pending;
        unsigned long flags;

        if (in_interrupt())
            return;

        local_irq_save(flags);
        pending = local_softirq_pending();
        if (pending) {
            struct softirq_action *h;

            local_bh_disable();
    restart:
            local_softirq_pending() = 0;
            local_irq_enable();
            h = softirq_vec;                 /* array name alone is a pointer to the array */
            do {
                if (pending & 1)
                    h->action(h);            /* bottom half handler for this bit */
                h++;                         /* pointer arithmetic: next array element */
                pending >>= 1;               /* next mask bit */
            } while (pending);               /* repeat at most 32 times or until zero */
            local_irq_disable();
            pending = local_softirq_pending();
            if (pending && --max_restart)
                goto restart;
            if (pending)
                wakeup_softirqd();
            __local_bh_enable();
        }
        local_irq_restore(flags);
    }

[Figure: pending is a local copy of softirq_pending[cpu]; h walks softirq_vec[cpu], whose handlers tasklet_hi_action() and tasklet_action() process tasklet_hi_vec[cpu] and tasklet_vec[cpu].]

(3) Work Queue

Which bottom half should I use?
• Softirq
  – the fastest alternative, for timing-critical and frequent users
  – must be coded with great care
• Tasklet
  – an alternative to softirq
  – easier to use, at some cost in performance
  – tasklets of different types can run concurrently on different CPUs
• Work queues
  – run in process context (kernel thread)
    • can sleep and are schedulable (tasklets cannot sleep)
    • can use semaphores, block I/O, a lot of memory, …
  – context switching overhead

Comparison of the 3 Bottom Half Handlers

                    | Softirq                | Tasklet                         | Work Queue
Performance         | 1: maximum concurrency | 2: tasklet execution is         | 3: process context,
                    |                        |    serialized                   |    context switch overhead
Ease of code        | Must be reentrant;     | No need to be reentrant;        | No need to be reentrant; can sleep;
                    | must not sleep         | must not sleep                  | can use semaphores, block I/O,
                    |                        |                                 | large memory, …; coding is easiest
Registering handler | open_softirq()         | DECLARE_TASKLET()               | DECLARE_WORK()
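
As a rough illustration of the pending-mask scan that do_softirq() performs in the softirq part of this lecture, the following user-space C sketch models one CPU's mask and handler table. It is only a simulation under stated assumptions: every name ending in `_sim` is hypothetical and is not a kernel API, there is a single "CPU", and interrupt disabling, the restart limit, and ksoftirqd are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SOFTIRQ_SIM 32   /* one handler slot per bit of the 32-bit mask */

/* Same shape as the struct softirq_action shown in the lecture. */
struct softirq_action {
    void (*action)(struct softirq_action *);
    void *data;
};

/* Single-CPU stand-ins for softirq_vec[] and softirq_pending[cpu] (hypothetical names). */
static struct softirq_action softirq_vec_sim[NR_SOFTIRQ_SIM];
static uint32_t softirq_pending_sim;

/* Register a handler for bit nr; a simplified analogue of open_softirq(). */
static void open_softirq_sim(int nr, void (*action)(struct softirq_action *), void *data)
{
    assert(nr >= 0 && nr < NR_SOFTIRQ_SIM);
    softirq_vec_sim[nr].action = action;
    softirq_vec_sim[nr].data = data;
}

/* What a top half does: mark bottom half nr as requested by setting its bit. */
static void raise_softirq_sim(int nr)
{
    assert(nr >= 0 && nr < NR_SOFTIRQ_SIM);
    softirq_pending_sim |= UINT32_C(1) << nr;
}

/*
 * Scan a local copy of the mask from bit 0 upward and run each requested
 * handler. Returns the number of handlers that ran.
 */
static int do_softirq_sim(void)
{
    uint32_t pending = softirq_pending_sim;        /* local copy of the mask */
    struct softirq_action *h = softirq_vec_sim;    /* entry for bit 0 */
    int ran = 0;

    softirq_pending_sim = 0;   /* clear the original mask before running handlers */
    while (pending) {
        if ((pending & 1) && h->action) {
            h->action(h);      /* bottom half handler for this bit */
            ran++;
        }
        h++;                   /* next table entry */
        pending >>= 1;         /* next mask bit */
    }
    return ran;
}

/* Example handler: counts its invocations through the data pointer. */
static void count_handler_sim(struct softirq_action *h)
{
    (*(int *)h->data)++;
}
```

In this sketch, registering `count_handler_sim` on two bits, raising both, and calling `do_softirq_sim()` once runs each handler exactly once. Raising the same bit twice before the scan still produces a single run, because the mask records only that a bottom half was requested, not how many times.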