Data-flow Analysis for Interrupt-driven Microcontroller Software
Nathan Cooprider
Advisor: John Regehr
Dissertation defense
School of Computing, University of Utah

Data-flow Analysis for Interrupt-driven Microcontroller Software
• A whole-program analysis
• Targeting embedded C programs
• Suitable for use in a compiler

Microcontrollers (MCUs)
• 10 billion units / year
• $12.5 billion market in 2006
• Cheap
• Resource constrained
• e.g. Wireless sensor networks
  – Mica2 mote: ATmega128L (4 MHz 8-bit MCU), 128 kB code, 4 kB data SRAM

Problem
• Resources are constrained
• Software outlives hardware
  – Code reuse leads to bloat
• Low-level code confuses analysis
  – Interrupt-driven concurrency
  – Device register access

Solution (thesis statement)
• Traditional data-flow analysis
  – Not precise enough for MCU software
• New techniques to increase precision
  – Deal with concurrency
  – Track volatile data
• Use in code transformations
  – Optimizations

Contributions
• Analysis techniques
  – Interatomic concurrent data-flow (ICD)
  – Tracking data through volatile variables
• Tool
  – cXprop
• Applications
  – Practical memory safety: Safe TinyOS
  – Offline RAM compression

TinyOS
• Open-source OS for WSNs
• Written in nesC
  – Dialect of C
• Concurrency
  – Tasks and interrupts
  – No threads
  – Atomic sections
[Figure: timeline of main running tasks, with interrupts arriving between and during tasks]

[Figure: cXprop overview. Components: abstract interpretation, conditional X propagation, pointer analysis, ICD, volatile tracking. Applications: Safe TinyOS, RAM compression]

Abstract interpretation
• Abstract domain
  – Abstract values
  – Form poset
    • Subset relation (⊆)
  – Lattice
    • Undefined (⊤)
    • Unknown (⊥)
• Data-flow analysis
  – Transfer functions
  – Merging
  – Fixed point

Example code:

    switch (x) {
      ...
      break;
    case 42:
    case 7:
    case -1:
      if (x < 0)
        x *= -1;
      x++;
      if (x == 0)
        assert(0);
      break;
      ...

Value-set lattice for x:

    {} or ⊤
    {42}    {7}    {-1}
    {42,7}  {7,-1}  {42,-1}
    {42,7,-1} or ⊥

Analysis of the example:
• On entry to the three cases: x = {42,7,-1}
• x < 0: true branch x = {-1}, false branch x = {42,7}
• After x *= -1: x = {1}
• After merging the two branches: x = {42,7,1}
• After x++: x = {43,8,2}
• x == 0 never holds for {43,8,2}, so assert(0) stays undefined (⊤), i.e. unreachable

Interrupt-driven concurrency
• Problems
  – C statements not necessarily atomic

        x = 0x4242;      ldi r24, 0x42
                         ← Interrupt
                         ldi r25, 0x42

  – Preempts sequential control flow
    • Complicated control flow
• Synchronization
  – One flow does not “break” another
  – Bad synchronization happens (a race)
    • Difficult or impossible to reason about
    • Must deal with conservatively (⊥)

Related work
• Thread-based concurrency
  – M. B. Dwyer, L. A. Clarke, J. M. Cobleigh, and G. Naumovich. Flow analysis for verifying properties of software systems. TOSEM 2004.
  – M. C. Rinard. Analysis of multithreaded programs. SAS 2001.
• Leveraging race detection
  – R. Chugh, J. W. Voung, R. Jhala, and S. Lerner. Dataflow analysis for concurrent programs using datarace detection. PLDI 2008.
• Formal semantics
  – X. Feng, Z. Shao, Y. Dong, and Y. Guo. Certifying low-level programs with hardware interrupts and preemptive threads. PLDI 2008.
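
The value-set walk-through on the abstract interpretation slides (x = {42,7,-1} narrowed by the branch, transformed, merged, and incremented) can be sketched in C as follows. This is only an illustrative sketch, not cXprop's implementation: the `vset` type, the `VS_MAX` cap, and every function name are hypothetical, and a real analysis would fall back to unknown (⊥) when a set overflows instead of ignoring the extra value.

```c
#include <stdbool.h>
#include <stddef.h>

#define VS_MAX 8 /* hypothetical cap on set size, chosen for this example only */

/* Abstract value of one variable: a small set of possible concrete values.
 * n == 0 models the empty set {}, the "undefined" element on the slide. */
typedef struct {
    int vals[VS_MAX];
    size_t n;
} vset;

bool vs_has(const vset *s, int v) {
    for (size_t i = 0; i < s->n; i++)
        if (s->vals[i] == v) return true;
    return false;
}

/* Adds v if absent. Overflow is ignored here, which is only acceptable
 * because the example never exceeds VS_MAX members. */
void vs_add(vset *s, int v) {
    if (!vs_has(s, v) && s->n < VS_MAX) s->vals[s->n++] = v;
}

/* Branch refinement for "x < 0": keep members where (v < 0) == want. */
vset vs_filter_neg(const vset *s, bool want) {
    vset r = { .n = 0 };
    for (size_t i = 0; i < s->n; i++)
        if ((s->vals[i] < 0) == want) vs_add(&r, s->vals[i]);
    return r;
}

/* Transfer function for x = x * k + c, applied to every member. */
vset vs_affine(const vset *s, int k, int c) {
    vset r = { .n = 0 };
    for (size_t i = 0; i < s->n; i++) vs_add(&r, s->vals[i] * k + c);
    return r;
}

/* Merge at a control-flow join: set union. */
vset vs_merge(const vset *a, const vset *b) {
    vset r = *a;
    for (size_t i = 0; i < b->n; i++) vs_add(&r, b->vals[i]);
    return r;
}

/* Replays the slide's example; true would mean assert(0) may execute. */
bool assert_reachable(void) {
    vset x = { .vals = {42, 7, -1}, .n = 3 };
    vset neg = vs_filter_neg(&x, true);     /* {-1}            */
    vset nonneg = vs_filter_neg(&x, false); /* {42,7}          */
    neg = vs_affine(&neg, -1, 0);           /* x *= -1: {1}    */
    vset joined = vs_merge(&nonneg, &neg);  /* {42,7,1}        */
    joined = vs_affine(&joined, 1, 1);      /* x++: {43,8,2}   */
    return vs_has(&joined, 0);              /* is x == 0 possible? */
}
```

Calling `assert_reachable()` reports that 0 is not in the final set, which is the fact a compiler could use to delete the `assert(0)` branch as dead code.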

Race detection
• Lockset analysis: standard technique
  – Lock status = interrupt enable bit status
  – Only one lock, so no lock aliasing
  – nesC uses lexical nesting
• Data classification
  – Unshared: accessed only from main
  – Shared: accessed from interrupts
[Figure: diagram relating three conditions to RACE: accessed without locking; written in shared or unlocked unshared code; accessed in shared code]

Race detection case analysis
[Figure: case analysis of writes, reads, and uses in an interrupt against writes, reads, and accesses in an interrupt or task, with and without atomic sections, each labeled racing or not racing]

Data classification
[Figure: classification tree with labels Data, Heap, Stack, Static (Global), Sequential, Concurrent, Unshared (50%), Shared, Racing (6%), Not racing (44%), and ⊥]

Interatomic Concurrent Data-flow
Published at LCTES 2006
[Figure: atomic interleaving of main and an interrupt between atomic sections]

Volatile
• C type qualifier
  – volatile int
• Special case of C’s memory model
  – Read value may change “randomly”
  – Write may affect system state
• E.g., racing data, device registers
• Behavior opaque at C level
• Prevents compiler optimizations

Tracking volatile RAM
• Locate variables backed by RAM
• Introduce concurrency information
  – Interatomic concurrent data-flow
• Have sound approximation of mutators
  – Behavior not opaque at system level
• Safely analyze volatile variables in RAM

Tracking volatile device registers
• Hardware registers
  – Memory mapped I/O
  – Hardware not actually random (volatile)
• Can track using MCU-specific information
  – OK to track individual bits
    • Instead of whole register
    • Interrupt bit of status register

Pointer analysis
• Points-to sets
  – must and may alias
  – Two pluggable domains
  – Subtleties from context-insensitivity
• Targets:
  – Device registers
  – Scalars
  – Structs
  – Arrays
  – not-NULL
  – Heap

Conditional X propagation
• Pluggable abstract domains
  – From conditional constant propagation
• Clean domain interface
  – Transfer functions
  – Abstract domain utility functions
[Figure: diagram labeled Abstract interpretation, Conditional X propagation, Analysis]

Domains
• Constant
• Bitwise
• Interval
• Value set

Implemented as a CIL extension
[Figure: cXprop stages shown: struct splitter, inliner, cleaner, fixed point computation (value-flow, pointer-flow, ICD, volatile tracking), transformations, cleaner]
• Transformations
  – Constant propagation
  – Dead code elimination
  – Dead data elimination

Suppose we have a WSN…
• What happened?
  – State got corrupted: array out-of-bounds (a memory safety error)
  – Hard to debug
    • Limited visibility into executing systems
    • Difficult to replicate complex bugs
• Memory safety can
  – Catch all pointer and array bounds errors
    • Before they corrupt state
  – Provide a choice of recovery action
    • Display error message or reboot

Safe TinyOS
Published at SenSys 2007
Expand Deputy (an existing solution for making C safe) into system safety
• Modify TinyOS to work with Deputy
• Enforce Deputy’s safety model under concurrency
• Reduce overhead
[Callout: cXprop]

Safe TinyOS toolchain
• Annotate Safe TinyOS code

      int post(val_t* buf, int n);
  becomes
      int post(val_t* COUNT(n) buf, int n);

• Modify TinyOS to work with Deputy
• Enforce Deputy’s safety model under concurrency
• Reduce overhead
[Figure: toolchain from TinyOS code to Safe TinyOS app: run modified nesC compiler; enforce safety using Deputy; deal with concurrency (cXprop); compress error messages; whole-program optimization (cXprop)]

Concurrency
• Deputy enforces safety in sequential code
• cXprop avoids extraneous protection
  – Only racing variables need protection
[Figure: a Deputy check of the form "if (potentially unsafe read)" can be preempted by an interrupt; in the protected version an atomic block performs the potentially unsafe read into a local, and the Deputy check then reads the local]

Code size
[Figure: code size results for Safe TinyOS; values shown: 35%, 13%, -11%]

A closer look at RAM usage
• On-chip RAM for MCUs expensive
  – Kilobytes, not megabytes or gigabytes
  – Data in SRAM: 6 transistors / bit
  – SRAM can dominate power consumption of a sleeping chip
• On-chip RAM is persistently scarce in tiny MCU-based systems
• Is RAM used efficiently?
  – Performed value profiling for MCU apps
    • Apps already heavily tuned for RAM usage
  – Result: Average byte stores four values!

Offline RAM compression
Published at PLDI 2007
• Automated sub-word packing for statically allocated scalars, pointers, structs, arrays
  – No heap on targeted MCUs
  – Trades ROM and CPU cycles for RAM

Method
• x ≝ variable that occupies n bits
• V_x ≝ conservative estimate of value set
• log2 |V_x| < n ⇒ RAM compression possible
• C_x ≝ another set such that |C_x| = |V_x|
• f_x ≝ bijection between V_x and C_x
• n - log2 |C_x| ⇒ bits saved through compression of x

Example compression

    void (*function_queue[8])(void);

• x: an element of function_queue; n = size of a function pointer = 16 bits
• V_x = { &function_A, &function_B, &function_C, NULL }
• |V_x| = 4, so log2 |V_x| = 2 < 16 = n
• C_x = { 0, 1, 2, 3 }
• f_x: V_x to C_x ≝ compression (table scan)
• f_x^-1: C_x to V_x ≝ decompression (table lookup); the table of V_x lives in ROM
• 128 bits reduced to 16 bits: 112 bits of RAM saved

RAM compression results
• cXprop (no compression): 10% RAM reduction, 20% ROM reduction, 5.9% duty cycle reduction
• Compression: 22% RAM reduction, 3.6% ROM reduction, 29% duty cycle increase

Conclusion
• Interatomic concurrent data-flow
• Volatile data may be tracked
• Better analysis, more optimizations
  – Safe TinyOS: practical memory safety
  – RAM compression: 22% RAM reduction

http://www.cs.utah.edu/~coop/research/cxprop/
http://www.cs.utah.edu/~coop/safetinyos/
http://www.cs.utah.edu/~coop/research/ccomp/

Thank you

Cost/Benefit Ratio

    \frac{\sum_i C_i (A_i + B_i V)}{S_u - S_c}

• C ≝ access profile
• A, B ≝ platform-specific costs
• V ≝ cardinality of value set
• S_u ≝ original size
• S_c ≝ compressed size

Turning the RAM Knob
[Figure: animation frames at 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, and finally 95%]

Future work
• Triggering and sequencing
  [Figure: timer interrupt handler fires and triggers sensing; data ready interrupt handler fires with the data]
• Caching compressed values
  [Figure: three consecutive "read x" operations, each paired with a decompress of x]

More related work
• Safe TinyOS
  – R. K. Rengaswamy, E. Kohler, and M. Srivastava. Software-based memory protection in sensor nodes. EmNets 2006.
  – B. L. Titzer. Virgil: Objects on the head of a pin. OOPSLA 2006.
  – S. Kowshik, D. Dhurjati, and V. Adve. Ensuring code safety without runtime checks for real-time control systems. CASES 2002.
• Offline RAM compression
  – Y. Zhang and R. Gupta. Compressing heap data for improved memory performance. Software: Practice and Experience 2006.
  – L. S. Bai, L. Yang, and R. P. Dick. Automated compile-time and run-time techniques to increase usable memory in MMU-less embedded systems. CASES 2006.
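
The function_queue example from the Offline RAM compression slides (compression by table scan, decompression by table lookup, 128 bits packed into 16) can be sketched in C as below. This is a hand-written illustration under stated assumptions, not the code the compressor generates: `task_fn`, `value_table`, `packed_queue`, `compress_fn`, `decompress_fn`, `queue_store`, and `queue_load` are hypothetical names, the three task functions are empty stand-ins, and the choice to start every slot at the NULL index is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

/* Hypothetical stand-ins for function_A..function_C on the slide. */
void function_A(void) {}
void function_B(void) {}
void function_C(void) {}

/* V_x: the four values the analysis found. Being const, this table
 * could be placed in ROM on an MCU. Index i is the compressed value. */
static task_fn const value_table[4] = { function_A, function_B, function_C, NULL };

/* 8 slots x 2 bits = 16 bits of RAM. 0xFFFF makes every slot start at
 * index 3, i.e. NULL (an assumption about the initial queue contents). */
uint16_t packed_queue = 0xFFFF;

/* f_x: compression by scanning the table for fn. */
unsigned compress_fn(task_fn fn) {
    for (unsigned i = 0; i < 4; i++)
        if (value_table[i] == fn) return i;
    return 3; /* outside V_x: impossible if the value-set analysis is sound */
}

/* f_x^-1: decompression by table lookup. */
task_fn decompress_fn(unsigned idx) {
    return value_table[idx & 3u];
}

/* Replaces "function_queue[slot] = fn". */
void queue_store(unsigned slot, task_fn fn) {
    unsigned shift = (slot & 7u) * 2u;
    packed_queue = (uint16_t)((packed_queue & ~(3u << shift)) |
                              (compress_fn(fn) << shift));
}

/* Replaces a read of "function_queue[slot]". */
task_fn queue_load(unsigned slot) {
    unsigned shift = (slot & 7u) * 2u;
    return decompress_fn((packed_queue >> shift) & 3u);
}
```

Every store now pays for a scan and every load for a shift, mask, and lookup, which is the ROM and CPU cost the slides say is traded for RAM.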

PAG
• Program Analysis Generator
  – Domain-specific language input describes
    • Domain lattice
    • Transfer functions
    • Language-describing grammar
    • Fixed point solution method
  – Data-flow analyzer as output
• Does not deal with concurrency
• Used to evaluate fixed point solutions

Feature comparison
[Figure: feature comparison chart; values shown: 12%, 5.5%]

Domain comparison
[Figure: domain comparison chart]

Resource reduction
[Figure: resource reduction chart; values shown: 12%, 8.3%, 2.5%, 1.8%]

Context insensitivity
a is a global variable

    foo:          int x = 7; bar(&x);    a = {27}; x = {7}, later {7,42}
    bar(int *y):  goo(y);                a = {27}; y = {&x}
    goo(int *z):  *z = 42; a = *z;       a = {27}, later {7,27,42}; z = {&x}

Benchmark descriptions
• AVR ATmega128 code
• TinyOS
• 3,000-26,000 lines of C code
• Analysis times: seconds to an hour
• Metrics
  – Duty cycle
    • % of time processor is on
    • Obtained from Avrora
      – Cycle-accurate simulator for WSNs
  – Code size and data size