The Structuring of Systems using Upcalls

David D. Clark, "The Structuring of Systems using Upcalls", Proc. of the 10th Symposium on Operating Systems Principles, pp. 171-180, 1985.

Ralf Juengling, Portland State University

Layers

When you bake a big cake or write a big program, you will probably do it in layers.

Layers as one way of abstracting

When writing big code you need abstractions to be able to:
• Think about your code
• Communicate your code to others
• Test your code
• Adapt your code later to changed requirements

For many applications, layered abstractions are natural:
• Protocol stacks
• Compilers
• Database management
• Scientific computing applications
• Operating systems

Flow of control in layered code

[Figure: clients calling down into a stateless XYZ library]
• A stateless library may serve any number of concurrent threads if its code is reentrant

Additional requirements in OS kernel code

• Handle device interrupts in a timely manner
• Support dynamic updating of modules (e.g., device drivers), but don't compromise safety

Solutions:
• Have interrupt handlers communicate with devices, and let other code communicate with interrupt handlers asynchronously (buffers, messages)
• Contain modules in their own address spaces
• Use IPC to let different modules communicate across protection boundaries

In kernel code…

…we have:
1. Abstraction boundaries
2. Protection boundaries
3. Downward control flow
4. Upward control flow

…communication between layers is more costly because of:
• Control flow across protection boundaries (RPC, messages, …)
• Upward control flow across abstraction boundaries (buffers)

Flow of control in kernel code

[Figure: layered kernel modules with per-layer tasks and buffers]
• Layers have state
• Shared data must be synchronized
• A call across layers crosses a protection boundary
• Upward data flow is asynchronous (buffers)
• For some layers there is a dedicated task (pipeline)
• Downward control flow may be asynchronous or synchronous

In kernel code…

…communication between layers is more costly because of:
• Control flow across protection boundaries
• Upward control flow across abstraction boundaries

Clark's solution:
• Let upward control flow proceed synchronously with upcalls
• Get rid of protection boundaries

Upcalls

Idea:
• Leave "blanks" in lower-level code
• Let higher-level code "fill in the blanks" in the form of handlers

In functional programming this technique is used every day; in OO programming, every other day. Other terms: handler function, callback function, virtual method.

Does using upcalls abolish abstraction boundaries?
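Before returning to that question, here is a minimal sketch in C of the "fill in the blanks" idea (not Clark's code; the names net_open, net_deliver, and handler_table are invented for illustration): the lower layer leaves a blank in the form of a function pointer, the higher layer fills it in at open time, and delivery upcalls the handler synchronously.

    #include <stdio.h>

    /* Upcall type: the "blank" the lower layer leaves open. */
    typedef void (*receive_handler)(const char *data);

    /* Lower layer: one handler per port (fixed-size table for brevity). */
    #define MAX_PORTS 8
    static receive_handler handler_table[MAX_PORTS];

    /* Registration: the higher layer fills in the blank at open time. */
    int net_open(receive_handler h) {
        for (int port = 0; port < MAX_PORTS; port++) {
            if (handler_table[port] == NULL) {
                handler_table[port] = h;
                return port;
            }
        }
        return -1; /* no free port */
    }

    /* On arrival, the lower layer upcalls the registered handler
     * synchronously -- no queue, no context switch. */
    void net_deliver(int port, const char *data) {
        if (port >= 0 && port < MAX_PORTS && handler_table[port] != NULL)
            handler_table[port](data);
    }

    /* Higher layer: its receive routine is the handler. */
    static void display_receive(const char *data) {
        printf("display: %s\n", data);
    }

    int main(void) {
        int port = net_open(display_receive);
        net_deliver(port, "hello"); /* control flows upward via the upcall */
        return 0;
    }

Note that the lower layer knows nothing about its client except the handler's signature; this is why upcalls need not leak implementation details across the abstraction boundary.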
Flow of control in kernel code (with upcalls)

• It looks a bit more like layered library code
• Procedure calls instead of IPC
• Plus upcalls
• But we can't do completely without buffering

Protocol package example

[Figure: three layers — display (display-start, display-receive), transport (transport-open, transport-receive, transport-get-port), net (net-open, net-receive, net-dispatch, create-task, wakeup)]
• transport-receive is a handler for net-receive
• display-receive is a handler for transport-receive
• A handler gets registered by an xxx-open call

Protocol package example: opening

    display-start():
        local-port = transport-open(display-receive)
    end

    transport-open(receive-handler):
        local-port = net-open(transport-receive)
        handler-array(local-port) = receive-handler
        return local-port
    end

    net-open(receive-handler):
        port = generate-uid()
        handler-array(port) = receive-handler
        task-array(port) = create-task(net-receive, port)
        return port
    end

Protocol package example: dispatching

    transport-get-port(packet):
        // determine whose packet this is
        extract port from packet
        return port
    end

    net-dispatch():
        read packet from device
        restart device
        port = transport-get-port(packet)  // not quite clean: the net
                                           // layer calls up into the
                                           // transport layer here
        put packet on per-port queue
        task-id = task-array(port)
        wakeup-task(task-id)
    end

Protocol package example: receiving

    display-receive(char):
        write char to display
    end

    transport-receive(packet, port):
        handler = handler-array(port)
        validate packet header
        for each char in packet:
            handler(char)
    end

    net-receive(port):
        handler = handler-array(port)
        do forever
            remove packet from per-port queue
            handler(packet, port)
            block()
        end
    end

Full protocol package example

[Figure: the complete call graph of the protocol package, combining the routines above]

What if an upcall fails?

This must not leave any shared data inconsistent! Two things need to be recovered:
1. The task
2. The per-client data in each layer/module

Solution:
• Cleanly separate shared state from per-client data
• Have a per-layer cleanup procedure and arrange for the system to call it in case of a failure
• Unlock everything before an upcall

May upcalled code call down?

This is a source of potential, subtle bugs: an indirect recursive call may change state unexpectedly.

Some solutions:
1. Check state after an upcall (ugly)
2. Don't allow a handler to downcall (simple & easy)
3. Have the upcalled procedure trigger a future action instead of down-calling (example: transport-arm-for-send)

How to use locks?

We don't really know. With downcalls only there is a simple locking discipline:
• Have each layer use its own set of locks
• Have each subroutine release its locks before returning
• No deadlock, as a partial order is implied by the call graph

This discipline doesn't work when upcalls are allowed (see the sketch below).
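To make the two recipes above concrete — keep per-client data separate from shared state, release locks before an upcall, and run a cleanup procedure if the upcall fails — here is a hedged sketch in C with pthreads. Everything in it (transport_lock, transport_cleanup, and the use of setjmp/longjmp as a stand-in for task failure and recovery) is invented for this example; it is not how Swift actually recovers a task.

    #include <pthread.h>
    #include <setjmp.h>
    #include <stdio.h>

    typedef void (*receive_handler)(const char *data);

    /* Shared layer state, protected by a per-layer lock. */
    static pthread_mutex_t transport_lock = PTHREAD_MUTEX_INITIALIZER;
    static int packets_delivered;            /* shared state */

    /* Per-client data, kept separate so it can be discarded on failure. */
    struct client { receive_handler handler; int chars_seen; };

    static jmp_buf upcall_failure;           /* stand-in for task recovery */

    static void transport_cleanup(struct client *c) {
        /* Reset only this client's data; the shared state is consistent
         * because we finished updating it before the upcall. */
        c->chars_seen = 0;
    }

    void transport_receive(struct client *c, const char *packet) {
        pthread_mutex_lock(&transport_lock);
        packets_delivered++;                 /* shared state, under lock */
        c->chars_seen++;                     /* per-client bookkeeping */
        pthread_mutex_unlock(&transport_lock); /* unlock BEFORE the upcall */

        if (setjmp(upcall_failure) == 0) {
            c->handler(packet);              /* the upcall; may fail */
        } else {
            transport_cleanup(c);            /* recover per-client data */
        }
    }

    /* A handler that "fails", standing in for a crashed client. */
    static void flaky_handler(const char *data) {
        (void)data;
        longjmp(upcall_failure, 1);
    }

    int main(void) {
        struct client c = { flaky_handler, 0 };
        transport_receive(&c, "packet");
        printf("delivered=%d chars_seen=%d\n",
               packets_delivered, c.chars_seen);
        return 0;
    }

Because the lock is released before the handler runs, a failing or even a down-calling handler can never deadlock against transport_lock; the price is that the layer must finish its shared-state updates first, which is exactly the recipe the next slide explains.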
The principle behind their recipe "release any locks before an upcall" is asymmetry of trust:
• Trust the layers you depend on, but not your clients

Upcalls & abstraction boundaries

We get rid of protection boundaries for the sake of performance and to make upcalls practical. We seemingly keep abstraction boundaries intact, since we:
• Don't leak information about the implementation by offering an upcall interface
• Don't know our clients; they must register handlers

But we need to observe some constraints to make it work:
• A downcall policy
• A locking discipline
• A cleanup interface

Other things in Swift

• Monitors for synchronization
• Task scheduling with a "deadline priority" scheme
• Dynamic priority adjustment if a higher-priority task waits for a lower-priority task ("deadline promotion")
• Inter-task communication via shared memory
• A high-level implementation language (CLU, anyone?)
• A mark & sweep garbage collector

Oh, and "multi-task modules" are just layers with state, prepared for multiple concurrent execution.

Time for coffee