Vassal: Loadable Scheduler Support for Multi-Policy Scheduling
George M. Candea, Oracle Corporation
Michael B. Jones, Microsoft Research

The Problem
  OS multiplexes CPU among tasks
  OS not always aware of scheduling requirements
  No algorithm is good enough for all task mixes
  Compromise: Hardcode set of scheduling policies into the operating system
  Desirable: Dynamically extensible set of policies

Overview of Vassal
  Tasks can use custom scheduling policies
  Custom schedulers
    – are special Windows NT drivers
    – coexist with the Windows NT scheduler
    – have negligible impact on global performance
  In current prototype, one external scheduler loaded at once

Outline
  Motivation and Overview
  Windows NT Scheduling
  Vassal Design and Implementation
  Sample Scheduler
  Results
  Conclusions

Windows NT Scheduling
  Schedulable unit = thread
  Priority-based thread scheduling
  Two policies, in distinct priority ranges:
    – Variable (dynamic priority round-robin)
    – Real-Time (fixed priority round-robin)

NT Scheduling Precedence
  1. Interrupts
  2. Deferred Procedure Calls (DPCs)
  3. Threads
  Not all time gets scheduled based on priorities
  Scheduling predictability is limited

NT Scheduling Events
  Scheduling decisions triggered by:
    – End of thread quantum
    – Priority or affinity changes
    – Transition to Wait state
    – Wakeups

Windows NT Timers
  Hardware Abstraction Layer (HAL) provides kernel with a periodic timer
  Resolution selectable from 1 to 15 ms (default: 10 or 15 ms)
  Not all HALs implement all values
    – MP HAL provides 1, 2, 4, 8, 16 ms
    – Some HALs just implement 10 ms

Separate Policy from Mechanism
  NT scheduler = thread dispatcher with scheduling policies interspersed
  Vassal = separate scheduling and dispatching modules
    – Schedulers: policy modules that decide which threads to run
    – Dispatcher: runs threads selected by schedulers

Details of Present Prototype
  Standard NT policies remain in kernel
  Schedulers are in a hierarchy
    – Give loaded scheduler first choice
    – Ask native scheduler if loaded scheduler makes no choice
    – Could easily support deeper hierarchy
  By default, threads use NT policies

Vassal Entities
  Schedulers
    – Register decision making routines with dispatcher
  Dispatcher
    – Invokes decision routines when scheduling events occur
  Threads
    – Communicate with schedulers to request services

Vassal Architecture
  [Figure: Application Thread in user space; NT Scheduler, Thread Dispatcher, External Scheduler and Drivers in the kernel; Hardware Abstraction Layer (HAL) underneath]

Interface Modifications
  Extend driver interface for schedulers:
    – RegisterScheduler
    – SetSchedulerEvent
  Extend syscall interface for threads:
    – MessageToScheduler

Registering a Scheduler
  RegisterScheduler (scheduler identifier, decision making routine, message dispatcher routine)
  Invoked by driver at initialization time
  Dispatcher checks for conflicts and updates scheduler hierarchy
  Dispatcher queries scheduler by invoking the decision making routine

Communicating with a Scheduler
  MessageToScheduler (scheduler identifier, message buffer, message length)
  Thread sends message to specific scheduler
  Corresponding scheduler’s message dispatcher extracts message from buffer and responds

Precisely Timed Events
  SetSchedulerEvent (scheduler identifier, absolute time value)
  Scheduler requests control of CPU at certain absolute time
  Dispatcher invokes scheduler’s decision routine at specified time

Vassal Interfaces
  [Figure: the architecture diagram annotated with the new interfaces: MessageToScheduler between the Application Thread in user space and the kernel; RegisterScheduler and SetSchedulerEvent at the External Scheduler; NT Scheduler, Thread Dispatcher and Drivers in the kernel; Hardware Abstraction Layer (HAL) underneath]

Sample Real-Time Scheduler
  Allows threads to get scheduled at application-specified time instances
  Demonstrates potential for more interesting time-based policies

Using The Real-Time Scheduler
  Tell system to use the real-time scheduler
    status = MessageToScheduler (RT_scheduler, {JOIN})
    if status != SUCCESS
      error (“Could not join R/T scheduling class.”)
  We want one iteration every 1 ms
    while TRUE do {
      status = MessageToScheduler (RT_scheduler, {SET, wakeup_time})
      …
      wakeup_time = wakeup_time + 1 msec
    }

Execution of Sample Code
  [Figure: timeline with three lanes. Thread: RUN (join R/T scheduling class, set time constraint), WAIT, RUN. Kernel: request, dispatch thread. Scheduler: update data structures, set precisely timed event, make scheduling decision when the event occurs at Tpredicted]

Windows NT Kernel Changes
  Added 188 lines of C code
  Added 61 assembly instructions
  Replaced 6 assembly instructions

Context Switch Times
  System Version                       Median   Avg.    Std. Dev.
  Vanilla NT 4.0 (released)            17.03    18.71   4.17
  Vanilla NT 4.0 (rebuilt)             19.95    19.88   1.64
  Vassal (no loaded scheduler)         19.71    19.71   1.56
  Vassal (sample scheduler loaded)     21.32    21.17   1.28
  Context switch times on original and modified systems (µs, P-133)
  No significant difference when external schedulers not loaded
  8% overhead on untuned prototype when using loaded schedulers

Writing a Scheduler
  Proof-of-concept real-time scheduler:
    – 116 lines of C code
    – No assembly language
  Only need to code the policy

Periodic Wakeup Times
  Method                     Min.   Max.   Avg.   Std. Dev.
  NT Multimedia Timers       75     1566   996    82
  Sample Scheduler Events    996    1485   1002   21
  Wakeup times using multimedia timers on vanilla system and sample scheduler on Vassal (µs, P-133). Desired value is 1000.
  No early wakeups when using our scheduler
  Predictability significantly improved
  Believe late samples due to unscheduled activities

Vassal Take-Home
  Demonstrates viability and effectiveness of loadable schedulers
  Frees OS from anticipating all possible application scheduling requirements
  Encourages scheduling research by making it easy to develop and test new policies
  Insignificant performance impact

Limitations and Future Work
  Timing precision limited by HAL
  Predictability limited by interrupts and DPC activity
  Only one loaded scheduler supported
  External schedulers not fully MP aware

Related Work
  Solaris scheduler class drivers
    – Must map scheduling decisions onto global thread priority space
  Extensible OS work
    – Spin, Exokernel, Vino
  Hierarchical schedulers
    – Utah CPU inheritance scheduling
    – UIUC Windows NT soft real-time scheduler

For More Information...
  http://pdos.lcs.mit.edu/~candea/research.html
  http://research.microsoft.com/~mbj/
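
Illustrative sketch: scheduler hierarchy
  The dispatch rule from the design slides (the loaded scheduler gets first choice, and the native scheduler is asked when it makes no choice) can be modelled in a few lines. What follows is a toy user-space model in Python, not the Vassal kernel code. The class and method names, the string thread identifiers, the boolean status values, and the "JOIN-only" stand-in policy are all assumptions made for illustration; the real sample scheduler is time-based and is written in C as an NT driver.

```python
from typing import Callable, List, Optional, Set, Tuple

# Hypothetical types: a decision routine picks a ready thread or returns None
# ("no choice"); a message routine handles a thread's request and returns a status.
DecisionRoutine = Callable[[List[str]], Optional[str]]
MessageRoutine = Callable[[str, str], bool]


class Dispatcher:
    """Toy model of a dispatcher with at most one loaded scheduler."""

    def __init__(self, native_decide: DecisionRoutine) -> None:
        self._native_decide = native_decide
        self._loaded: Optional[Tuple[str, DecisionRoutine, MessageRoutine]] = None

    def register_scheduler(self, sched_id: str, decide: DecisionRoutine,
                           on_message: MessageRoutine) -> bool:
        """Models RegisterScheduler; a second registration counts as a conflict."""
        if self._loaded is not None:
            return False
        self._loaded = (sched_id, decide, on_message)
        return True

    def message_to_scheduler(self, sched_id: str, thread: str, message: str) -> bool:
        """Models MessageToScheduler; fails if no such scheduler is loaded."""
        if self._loaded is None or self._loaded[0] != sched_id:
            return False
        return self._loaded[2](thread, message)

    def on_scheduling_event(self, ready: List[str]) -> Optional[str]:
        """Loaded scheduler gets first choice; otherwise ask the native policy."""
        if self._loaded is not None:
            choice = self._loaded[1](ready)
            if choice is not None:
                return choice
        return self._native_decide(ready)


def native_decide(ready: List[str]) -> Optional[str]:
    # Stand-in for the NT policies: take the first ready thread.
    return ready[0] if ready else None


class SampleScheduler:
    """Stand-in external scheduler that only picks threads that sent JOIN."""

    def __init__(self) -> None:
        self.joined: Set[str] = set()

    def decide(self, ready: List[str]) -> Optional[str]:
        for thread in ready:
            if thread in self.joined:
                return thread
        return None  # no choice: the dispatcher falls back to the native policy

    def on_message(self, thread: str, message: str) -> bool:
        if message == "JOIN":
            self.joined.add(thread)
            return True
        return False
```

  In this model a thread that never sends JOIN keeps being scheduled by the native stand-in policy, which mirrors the slide's point that threads use NT policies by default. Returning None from the decision routine is one possible way to express "no choice"; the deck does not say how the prototype encodes it.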