Operating System Support for Mobile Interactive Applications
Thesis Oral
Dushyanth Narayanan
Carnegie Mellon University

Motivation: resource mismatch
Resource-intensive app: 2 GHz, 1 GB, 3-D graphics
Resource-poor wearable: 200 MHz, 32 MB, no 3-D, no FPU
Poor performance!

Focus: mobile interactive applications
Speech recognition, language translation, augmented reality, …
• Resource-heavy, but need bounded response time
(Columbia U. MARS project)

Strawman solution
Scale demand down to supply
• degraded version of app
Supply varies dynamically
• turbulent mobile environment
Demand varies dynamically
• depends on app state, runtime parameters
Static solution → worst-case fidelity

Multi-fidelity computation
Traditional algorithm → fixed output spec.
• resource consumption varies
Multi-fidelity algorithm → many allowable outputs
• different fidelities → different outputs, resource usage
• bound performance despite resource variation
• e.g. sort only first n of m items

Using multi-fidelity computation
Make each interactive operation a multi-fidelity computation.
System chooses fidelity for each operation
• relieves application programmer burden
Using predictive resource management:
• predict resource supply
• predict resource demand as a function of fidelity
• predict performance from supply and demand
• adapt fidelity to meet performance goals

Thesis Statement
Multi-fidelity computation, with system support for predictive resource management, can significantly improve interactive response when mobile.
History-based resource demand prediction is feasible, and essential to such a system.
Legacy applications can use multi-fidelity computation with modest changes.

Thesis validation
Multi-fidelity API, case studies
• can real apps use multi-fidelity computation?
System architecture
• can we support this programming model?
Evaluation
• does history-based demand prediction work?
• is application performance improved?
• how much code modification is required?
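
The "sort only first n of m items" example and the four predictive steps (supply, demand, performance, adaptation) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the linear demand model, and the numbers are assumptions chosen for illustration, and the fidelity search here is a plain exhaustive scan over a few candidates.

```python
import heapq


def choose_fidelity(candidates, predict_demand, supply_rate, latency_bound):
    """Pick the highest candidate fidelity whose predicted latency meets the bound.

    Hypothetical helper: an exhaustive search over a small discrete set.
    """
    feasible = [
        f for f in candidates
        # predicted latency (sec) = predicted demand (cycles) / supply (cycles/sec)
        if predict_demand(f) / supply_rate <= latency_bound
    ]
    # Fall back to the lowest fidelity when no candidate meets the bound.
    return max(feasible) if feasible else min(candidates)


def multi_fidelity_sort(items, fidelity):
    """Return only the first n of the m sorted items, where n = fidelity * m."""
    n = max(1, round(fidelity * len(items)))
    return heapq.nsmallest(n, items)


def predict_demand(f):
    """Assumed demand model: cycles grow linearly with fidelity."""
    return 8e6 * f


fidelity = choose_fidelity([0.25, 0.5, 0.75, 1.0], predict_demand,
                           supply_rate=5e6, latency_bound=1.0)
top = multi_fidelity_sort([5, 3, 8, 1, 9, 2, 7, 4], fidelity)
print(fidelity, top)
```

With these assumed numbers, fidelity 0.75 would need 6 M cycles at 5 M cycles/sec (1.2 sec), so the sketch settles on the highest fidelity that still fits the 1 sec bound.
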

Outline
• Motivation and thesis statement
• What is fidelity?
• System architecture
• History-based demand prediction
• Evaluation
• Conclusion

What is fidelity?
Fidelity: runtime tunable quality
Application-specific metric(s)
• resolution for augmented reality rendering
• vocabulary size for speech recognition
• JPEG compression level for web images
• …

Fidelity in rendering
[Figure: renderings at 160 K polygons and at 16 K polygons]

Outline
• Introduction
• What is fidelity?
• System support
  • Design principles
  • Architecture
• History-based demand prediction
• Evaluation
• Conclusion

Design principles
Middleware layer supports multi-fidelity API
• user-level server running on Linux
• no functionality in kernel
Gray-box approach to resource prediction
• read load statistics from standard interfaces
• use knowledge of kernel internals if necessary
Only supports application adaptation
• allocation decisions left to OS

Before each interactive operation
[Diagram: the Application calls begin_fidelity_op through the API. The Iterative Solver evaluates candidate fidelity f=0.7: the Demand predictor returns 4 M cycles, the Supply predictor returns 5 M cycles/sec, the Performance predictor returns 0.8 sec, and the Utility function returns 0.9 for (0.8 sec, f=0.7). The API returns f=0.7 to the Application.]

After each interactive operation
[Diagram: the Application calls end_fidelity_op through the API. The Demand monitor reports 4.2 M cycles; the observation (f=0.7, 4.2 M cycles) is passed to the Logger and the Demand predictor.]

In the document …
Resource model
• CPU, memory, network, file cache → latency
• energy → battery lifetime
Iterative solver
• gradient descent + exhaustive search
Supply predictors
• sample/filter load observations
• CPU, memory, network, energy, file cache

Outline
• Introduction
• What is fidelity?
• System support
• History-based demand prediction
  • Case study
  • Evaluation
• Evaluation
• Conclusion

Why demand prediction?
Resource demand varies
• with fidelity
• with runtime parameters
• with input data
[Diagram: scene=“Notre Dame”, fidelity=0.7, position=<1.3, 5.7, 0>, orientation=<0.1, 0.5, 0> → rendering/CPU demand predictor → cycles=3.7e6]

Constructing a demand predictor
Static analysis not always sufficient
• data-dependence, platform-dependence
History-based demand prediction
• monitor, log, learn
• static analysis can guide learning stage
• online learning at runtime
Can be written by third party
• with small amount of domain knowledge
• predictor code is decoupled from app code

Case study: GLVU
Virtual 3-D walkthrough
Rendering is CPU-bound
• multi-resolution models to adjust demand
[Figure: renderings at 160 K polygons and at 16 K polygons]

CPU demand vs. resolution
CPU demand is linear in resolution
• linear fit is within 10%, 90% of the time
for a given scene, camera position

CPU demand with camera movement
Demand varies with camera position
• even at fixed resolution (160 K polygons)

Dynamic demand prediction
Exploit locality:
• camera moves along continuous path
• slope, offset change in small increments
Linear regression with online update
• more weight for recent data
• slope, offset updated after each operation

Demand prediction: accuracy
Evaluated predictor in realistic scenario
• user navigating scene
• time-varying background load
Prediction error: 29% (90th percentile)
• static linear predictor had 61%
• more tuning / better ML to improve further
Online update improves accuracy significantly

Performance impact of demand prediction
[Bar chart: latency (sec) for No prediction, Demand prediction, and Supply and demand prediction; latency bound: 1 sec]
Demand prediction improves performance
• supply prediction → further improvement

Demand prediction for other resources
Operation                  Resource              Dynamic Range  Prediction Error
Radiosity                  Memory                14 – 60 MB     3%
Web image fetch            Energy                1.4 – 25 J     9%
Remote speech recognition  Network transmission  4 – 219 KB     0.3%
Better accuracy than GLVU/CPU
• also have CPU predictors for Radiosity, speech

Outline
• Introduction
• What is fidelity?
• System support
• History-based demand prediction
• Evaluation
  • Performance improvement
  • Modification cost
• Conclusion

Evaluation scenario
Two concurrent applications
• competing for CPU
GLVU is a “virtual walkthrough” [Brooks @ UNC]
• moving user renders continuously
• 1 sec latency bound (best achievable on platform)
Radiator does lighting effects [Willmott @ CMU]
• runs sporadically in background
• 10 sec latency bound

Evaluation objective
We want latency close to target value
• too high → poor interactivity
• too low → unnecessarily low quality
• high variability annoys users
Does dynamic adaptation help?
Compare with static option:
• static-best: best possible static values
• static-naive: arbitrarily chosen static values

Mean latency
[Bar chart: normalized latency of GLVU and Radiator under Adaptive, Static-best, and Static-naive]

Variation in latency
[Bar chart: coefficient of variation of latency for GLVU and Radiator under Adaptive, Static-best, and Static-naive]

Evaluation summary
Adaptation improves latency and variability
• static can improve mean at best, not variation
More experimental results (in document):
• sensitivity analysis (peak load, frequency)
• memory load variation
Other results (not in document):
• adaptation for energy [Flinn99]
• adaptive remote execution [Flinn01]

Cost of modification
Application          Size (LOC)      Modified (LOC)
Virtual walkthrough  27 K            560
Radiosity            51 K            599
Speech recognition   126 K           1081
Web browsing         Netscape + 4 K  861
Legacy apps ported at modest cost
(also language translation, face recognition, LaTeX, …)

Outline
• Introduction
• What is fidelity?
• System support
• History-based demand prediction
• System evaluation
• Conclusion
  • Contributions
  • Related Work

Novel research contributions
Multi-fidelity computation
• subsumes approximate, anytime algorithms
Resource model
• file cache state as resource
History-based demand prediction
• as a function of fidelity
Predictive resource management

Related Work
Other system components
• remote execution [Flinn01]
• energy adaptation [Flinn99]
• adaptation to network bandwidth [Noble97]
Application adaptation [Fox96, Noble97, Flinn99, de Lara01, …]
QoS
• allocation is complementary to adaptation
• good survey of QoS systems in [Nahrstedt01]

Related Work (cont)
Solver
• efficient heuristics [Lee99]
Resource prediction
• load (supply) [Dinda99, …]
• demand from probability distribution [Harchol-Balter96]
• demand from runtime parameters [Kapadia99, Abdelzaher00]
User intent
• mixed-initiative approach [Horvitz99]

Conclusion
Multi-fidelity computation works
• applications able to meet latency goals
• with modest source code modification
History-based demand prediction is feasible
• essential to good adaptation
• not (yet) completely automated
Underlying OS limits user-level approach
• scheduling constants limit granularity
• system call overheads limit performance
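
As a supplement to the history-based demand prediction summarized above, the GLVU predictor ("linear regression with online update", "more weight for recent data") could be sketched as follows. This is a minimal illustration, not the thesis code: the class name, the exponential decay scheme, and the decay value are assumptions, since the slides do not say how recent data is weighted.

```python
class OnlineLinearPredictor:
    """Weighted least-squares fit of demand = slope * x + offset, updated online.

    Hypothetical sketch: older observations are down-weighted by a constant
    decay factor on every update, so recent operations dominate the fit.
    """

    def __init__(self, decay=0.9):
        self.decay = decay
        # Exponentially decayed sufficient statistics of the observations.
        self.sw = self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, x, y):
        """Record one observation, e.g. x = resolution, y = measured CPU cycles."""
        d = self.decay
        self.sw = d * self.sw + 1.0
        self.sx = d * self.sx + x
        self.sy = d * self.sy + y
        self.sxx = d * self.sxx + x * x
        self.sxy = d * self.sxy + x * y

    def coefficients(self):
        """Return (slope, offset) of the current weighted fit."""
        if self.sw == 0.0:
            return 0.0, 0.0  # no history yet
        denom = self.sw * self.sxx - self.sx ** 2
        if abs(denom) <= 1e-12 * self.sw * self.sxx:
            # All observations share one x value: fall back to the weighted mean.
            return 0.0, self.sy / self.sw
        slope = (self.sw * self.sxy - self.sx * self.sy) / denom
        offset = (self.sy - slope * self.sx) / self.sw
        return slope, offset

    def predict(self, x):
        slope, offset = self.coefficients()
        return slope * x + offset


# Usage: update after each operation, predict before the next one.
predictor = OnlineLinearPredictor(decay=0.9)
for resolution in [1.0, 2.0, 3.0, 4.0, 5.0]:
    predictor.update(resolution, 2.0 * resolution + 1.0)  # illustrative data
print(predictor.predict(10.0))
```

A smaller decay value forgets old camera positions faster, which tracks changes in slope and offset more quickly but makes the fit noisier.
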