Lecture 2: Software Platforms Anish Arora CIS788.11J Introduction to Wireless Sensor Networks Lecture uses slides from tutorials prepared by authors of these platforms Outline • Discussion includes OS and also prog. methodology some environments focus more on one than the other • Focus is on node centric platforms (vs. distributed system centric platforms) composability, energy efficiency, robustness reconfigurability and pros/cons of interpreted approach • Platforms TinyOS (applies to XSMs) slides from UCB EmStar (applies to XSSs) slides from UCLA SOS slides from UCLA Contiki slides from Upsaala Virtual machines (Maté) slides from UCB 2 References • • NesC, Programming Manual, The Emergence of Networking Abstractions and Techniques in TinyOS, TinyOS webpage EmStar: An Environment for Developing Wireless Embedded Systems Software, Sensys04 Paper, EmStar webpage • SOS Mobisys paper, SOS webpage • Contiki Emnets Paper, Sensys06 Paper, Contiki webpage • Mate ASPLOS Paper, Mate webpage, (SUNSPOT) 3 Traditional Systems • Application Application User System Network Stack • Strict boundaries • Ample resources • Transport Threads Network Address Space Data Link Files Well established layers of abstractions Physical Layer Drivers • Independent applications at endpoints communicate pt-pt through routers Well attended Routers 4 Sensor Network Systems • Highly constrained resources processing, storage, bandwidth, power, limited hardware parallelism, relatively simple interconnect • Applications spread over many small nodes self-organizing collectives highly integrated with changing environment and network diversity in design and usage • Concurrency intensive in bursts streams of sensor data & network traffic • Robust inaccessible, critical operation • Unclear where the Need a framework for: • Resource-constrained concurrency • Defining boundaries • Appl’n-specific processing allow abstractions to emerge 5 Choice of Programming Primitives • Traditional approaches command processing loop (wait request, act, respond) monolithic event processing full thread/socket posix regime • Alternative provide framework for concurrency and modularity never poll, never block interleaving flows, events 6 TinyOS • Microthreaded OS (lightweight thread support) and efficient network interfaces • Two level scheduling structure Long running tasks that can be interrupted by hardware events • Small, tightly integrated design that allows crossover of software components into hardware 7 Tiny OS Concepts Scheduler + Graph of Components Event Handlers Frame (storage) Tasks (concurrency) • Constrained Storage Model frame per component, shared stack, no heap • Very lean multithreading • Efficient Layering Messaging Component internal thread Internal State RX_packet_done (buffer) Commands init Component: init Power(mode) TX_packet(buf) • power(mode) model: threads + events Events msg_rec(type, data) msg_send_done) Commands TX_packet_done (success) constrained two-level scheduling send_msg (addr, type, data) • 8 application Application = Graph of Components Route map Router Sensor Appln packet Radio Packet byte Radio byte bit Active Messages RFM Serial Packet UART Temp ADC Photo SW Example: ad hoc, multi-hop routing of photo sensor readings HW 3450 B code 226 B data clock Graph of cooperating state machines on shared stack 9 TOS Execution Model • commands request action ack/nack at every boundary message-event driven call command or post task events notify occurrence HW interrupt at lowest level active message event-driven 
(figure: component graph of application comp, data processing, Radio Packet, Radio byte, and RFM, with event-driven packet-pump, byte-pump, bit-pump and CRC; components may signal events, post tasks, and call commands) tasks provide logical concurrency preempted by events 10 Event-Driven Sensor Access Pattern (figure: SENSE diagram with Timer, Photo, LED) command result_t StdControl.start() { return call Timer.start(TIMER_REPEAT, 200); } event result_t Timer.fired() { return call sensor.getData(); } event result_t sensor.dataReady(uint16_t data) { display(data); return SUCCESS; } • clock event handler initiates data collection • sensor signals data ready event • data event handler calls output command • device sleeps or handles other activity while waiting • conservative send/ack at component boundary 11 TinyOS Commands and Events { ... status = call CmdName(args) ... } command CmdName(args) { ... return status; } event EvtName(args) { ... return status; } { ... status = signal EvtName(args) ... } 12 TinyOS Execution Contexts (figure: execution contexts spanning Hardware, Interrupts, and Tasks, with events and commands crossing between them) • Events generated by interrupts preempt tasks • Tasks do not preempt tasks • Both essentially process state transitions 13 Handling Concurrency: Async or Sync Code Async methods call only async methods (interrupts are async) Sync methods/tasks call only sync methods Potential race conditions: any update to shared state from async code any update to shared state from sync code that is also updated from async code Compiler rule: if a variable x is accessed by async code, then any access of x outside of an atomic statement is a compile-time error Race-Free Invariant: any update to shared state is either not a potential race condition (sync code only) or is within an atomic section 14 Tasks • provide concurrency internal to a component longer running operations • are preempted by events • not preempted by tasks • able to perform operations beyond event context • may call commands • may signal events { ... post TskName(); ... } task void TskName() { ...
} 15 Typical Application Use of Tasks • event driven data acquisition • schedule task to do computational portion event result_t sensor.dataReady(uint16_t data) { putdata(data); post processData(); return SUCCESS; } task void processData() { int16_t i, sum=0; for (i=0; i ‹ maxdata; i++) sum += (rdata[i] ›› 7); display(sum ›› shiftdata); } • 128 Hz sampling rate • simple FIR filter • dynamic software tuning for centering the magnetometer signal (1208 bytes) • digital control of analog, not DSP • ADC (196 bytes) 16 Task Scheduling • Typically simple FIFO scheduler • Bound on number of pending tasks • When idle, shuts down node except clock • Uses non-blocking task queue data structure • Simple event-driven structure + control over complete application/system graph instead of complex task priorities and IPC 17 Maintaining Scheduling Agility • Need logical concurrency at many levels of the graph • While meeting hard timing constraints sample the radio in every bit window Retain event-driven structure throughout application Tasks extend processing outside event window All operations are non-blocking 18 The Complete Application SenseToRfm generic comm IntToRfm AMStandard packet RadioCRCPacket UARTnoCRCPacket CRCfilter noCRCPacket Timer photo byte MicaHighSpeedRadioM SecDedEncode ChannelMon bit SPIByteFIFO phototemp RadioTiming RandomLFSR SW UART ClockC ADC HW SlavePin 19 Programming Syntax • TinyOS 2.0 is written in an extension of C, called nesC • Applications are too just additional components composed with OS components • Provides syntax for TinyOS concurrency and storage model commands, events, tasks local frame variable • Compositional support separation of definition and linkage robustness through narrow interfaces and reuse interpositioning • Whole system analysis and optimization 20 Component Interface • logically related set of commands and events StdControl.nc interface StdControl { command result_t init(); command result_t start(); command result_t stop(); } Clock.nc interface Clock { command result_t setRate(char interval, char scale); event result_t fire(); } 21 Component Types • Configuration links together components to compose new component configurations can be nested complete “main” application is always a configuration • Module provides code that implements one or more interfaces and internal behavior 22 Example of Top Level Configuration configuration SenseToRfm { // this module does not provide any interface } implementation { components Main, SenseToInt, IntToRfm, ClockC, Photo as Sensor; Main.StdControl -> SenseToInt; Main.StdControl -> IntToRfm; SenseToInt.Clock -> ClockC; SenseToInt.ADC -> Sensor; SenseToInt.ADCControl -> Sensor; SenseToInt.IntOutput -> IntToRfm; } Main StdControl SenseToInt Clock ClockC ADC ADCControl Photo IntOutput IntToRfm 23 Nested Configuration includes IntMsg; configuration IntToRfm { provides { StdControl IntToRfmM interface IntOutput; interface StdControl; } IntOutput SubControl } SendMsg[AM_INTMSG]; GenericComm implementation { components IntToRfmM, GenericComm as Comm; IntOutput = IntToRfmM; StdControl = IntToRfmM; IntToRfmM.Send -> Comm.SendMsg[AM_INTMSG]; IntToRfmM.SubControl -> Comm; } 24 IntToRfm Module includes IntMsg; command result_t StdControl.start() { return call SubControl.start(); } module IntToRfmM { uses { interface StdControl as SubControl; interface SendMsg as Send; } provides { interface IntOutput; interface StdControl; } } implementation { bool pending; struct TOS_Msg data; command result_t StdControl.init() { pending = 
FALSE; return call SubControl.init(); } command result_t StdControl.stop() { return call SubControl.stop(); } command result_t IntOutput.output(uint16_t value) { ... if (call Send.send(TOS_BCAST_ADDR, sizeof(IntMsg), &data)) return SUCCESS; ... } event result_t Send.sendDone(TOS_MsgPtr msg, result_t success) { ... } } 25 Atomicity Support in nesC • Split phase operations require care to deal with pending operations • Race conditions may occur when shared state is accessed by preemptible executions, e.g. when an event accesses shared state, or when a task updates state (preemptible by an event which then uses that state) • nesC supports atomic blocks implemented by turning off interrupts; for efficiency, no calls are allowed inside a block; access to a shared variable outside an atomic block is not allowed 26 Supporting HW Evolution • Distribution broken into apps: top-level applications tos/lib: shared application components tos/system: hardware independent system components tos/platform: hardware dependent system components, includes HPLs and hardware.h tos/interfaces tools: development support tools contrib, beta • Component design so HW and SW look the same example: temp component may abstract a particular channel of the ADC on the microcontroller, or may be a SW I2C protocol to a sensor board with a digital sensor or ADC • HW/SW boundary can move up and down with minimal changes 27 Example: Radio Byte Operation • Pipelines transmission: transmits byte while encoding next byte • Trades 1 byte of buffering for easy deadline • Encoding task must complete before byte transmission completes • Decode must complete before next byte arrives • Separates high level latencies from low level real-time requirements (figure: pipelined timing, the encode task for byte n+1 overlaps the bit transmission of byte n over the RFM) 28 Dynamics of Events and Threads bit event filtered at byte layer bit event => end of byte => end of packet => end of msg send thread posted to start send next message radio takes clock events to detect recv 29 Sending a Message bool pending; struct TOS_Msg data; command result_t IntOutput.output(uint16_t value) { IntMsg *message = (IntMsg *)data.data; if (!pending) { pending = TRUE; message->val = value; message->src = TOS_LOCAL_ADDRESS; if (call Send.send(TOS_BCAST_ADDR, sizeof(IntMsg), &data)) return SUCCESS; pending = FALSE; } return FAIL; } (figure labels: destination, length) • Refuses to accept command if buffer is still full or network refuses to accept send command • User component provides structured msg storage 30 Send done Event event result_t IntOutput.sendDone(TOS_MsgPtr msg, result_t success) { if (pending && msg == &data) { pending = FALSE; signal IntOutput.outputComplete(success); } return SUCCESS; } • Send done event fans out to all potential senders • Originator determined by match: free buffer on success, retry or fail on failure • Others use the event to schedule pending communication 31 Receive Event event TOS_MsgPtr ReceiveIntMsg.receive(TOS_MsgPtr m) { IntMsg *message = (IntMsg *)m->data; call IntOutput.output(message->val); return m; } • Active message automatically dispatched to associated handler: knows format, no run-time parsing; performs action on message event • Must return free buffer to the system: typically the incoming buffer if processing complete 32 Tiny Active Messages • Sending: declare buffer storage in a frame; request transmission; name a handler; handle completion signal • Receiving: declare a handler; firing a handler: automatic • Buffer management: strict ownership exchange; tx: reuse buffer on send done event; rx: must return a
buffer 33 Tasks in Low-level Operation • transmit packet send command schedules task to calculate CRC task initiates byte-level data pump events keep the pump flowing • receive packet receive event schedules task to check CRC task signals packet ready if OK • byte-level tx/rx task scheduled to encode/decode each complete byte must take less time that byte data transfer 34 TinyOS tools • • TOSSIM: a simulator for tinyos programs ListenRaw, SerialForwarder: java tools to receive raw packets on PC from base node • Oscilloscope: java tool to visualize (sensor) data in real time • Memory usage: breaks down memory usage per component (in contrib) • Peacekeeper: detect RAM corruption due to stack overflows (in lib) • Stopwatch: tool to measure execution time of code block by timestamping at entry and exit (in osu CVS server) • Makedoc and graphviz: generate and visualize component hierarchy • Surge, Deluge, SNMS, TinyDB 35 Scalable Simulation Environment • target platform: TOSSIM whole application compiled for host native instruction set event-driven execution mapped into event-driven simulator machinery storage model mapped to thousands of virtual nodes • radio model and environmental model plugged in bit-level fidelity • Sockets = basestation • Complete application including GUI 36 Simulation Scaling 37 TinyOS 2.0: basic changes • Scheduler: improve robustness and flexibility • New nesC 1.2 features: • Reserved tasks by default ( fault tolerance) Priority tasks Network types enable link level cross-platform interoperability Generic (instantiable) components, attributes, etc. Platform definition: simplify porting Structure OS to leverage code reuse Decompose h/w devices into 3 layers: presentation, abstraction, device-independent Structure common chips for reuse across platforms so platforms are a collection of chips: msp430 + CC2420 + • Power mgmt architecture for devices controlled by resource reservation • Self-initialisation • App-level notion of instantiable services 38 TinyOS Limitations • • • • Static allocation allows for compile-time analysis, but can make programming harder No support for heterogeneity Support for other platforms (e.g. stargate) Support for high data rate apps (e.g. acoustic beamforming) Interoperability with other software frameworks and languages Limited visibility Debugging Intra-node fault tolerance Robustness solved in the details of implementation nesC offers only some types of checking 44 Em* • • Software environment for sensor networks built from Linux-class devices Claimed features: Simulation and emulation tools Modular, but not strictly layered architecture Robust, autonomous, remote operation Fault tolerance within node and between nodes Reactivity to dynamics in environment and task High visibility into system: interactive access to all services 45 Contrasting Emstar and TinyOS • • Similar design choices programming framework Component-based design “Wiring together” modules into an application event-driven reactive to “sudden” sensor events or triggers robustness Nodes/system components can fail Differences hardware platform-dependent constraints Emstar: Develop without optimization TinyOS: Develop under severe resource-constraints operating system and language choices Emstar: easy to use C language, tightly coupled to linux (devfs,redhat,…) TinyOS: an extended C-compiler (nesC), not wedded to any OS 46 Em* Transparently Trades-off Scale vs. 
Reality Em* code runs transparently at many degrees of “reality”: high visibility debugging before low-visibility deployment Pure Simulation Deployment Scale Data Replay Ceiling Array Portable Array Reality 47 Em* Modularity • Dependency DAG • Each module (service) Collaborative Sensor Processing Application Manages a resource & State Sync resolves contention 3d MultiLateration Has a well defined interface Has a well scoped task Acoustic Ranging Topology Discovery Encapsulates mechanism Exposes control of policy Neighbor Discovery Leader Election Hardware Radio Reliable Unicast Time Sync Minimizes work done by client library • Application has same structure as “services” Audio Sensors 48 Em* Robustness • Fault isolation via multiple processes • Active process management (EmRun) • Auto-reconnect built into libraries scheduling “Crashproofing” prevents cascading failure • Soft state design style depth map path_plan EmRun Services periodically refresh clients Avoid “diff protocols” camera motor_x motor_y 49 Em* Reactivity • Event-driven software structure React to asynchronous notification e.g. reaction to change in neighbor list • scheduling Notification through the layers Events percolate up path_plan notify filter Domain-specific filtering at every level motor_y e.g. neighbor list membership hysteresis time synchronization linear fit and outlier rejection 50 EmStar Components • Tools EmRun EmProxy/EmView EmTOS • Standard IPC FUSD Device patterns • Common Services NeighborDiscovery TimeSync Routing 51 EmRun: Manages Services • Designed to start, stop, and monitor services • EmRun config file specifies service dependencies • Starting and stopping the system Starts up services in correct order Can detect and restart unresponsive services Respawns services that die Notifies services before shutdown, enabling graceful shutdown and persistent state • Error/Debug Logging Per-process logging to in-memory ring buffers Configurable log levels, at run time 52 EmSim/EmCee • • • Em* supports a variety of types of simulation and emulation, from simulated radio channel and sensors to emulated radio and sensor channels (ceiling array) In all cases, the code is identical Multiple emulated nodes run in their own spaces, on the same physical machine 53 EmView/EmProxy: Visualization Emulator emview emproxy neighbor linkstat motenic Mote nodeN nodeN nodeN Mote … … Mote 54 Inter-module IPC : FUSD • • • Creates device file interfaces Text/Binary on same file Standard interface Client Server /dev/servicename /dev/fusd User Kernel Language independent No client library required kfusd.o 55 Device Patterns • FUSD can support virtually any semantics What happens when client calls read()? 
• But many interfaces fall into certain patterns • Device Patterns encapsulate specific semantics take the form of a library: objects, with method calls and callback functions priority: ease of use 56 Status Device • Designed to report current state no queuing: clients not guaranteed to see every intermediate state • Supports multiple clients • Interactive and programmatic interface ASCII output via “cat” binary output to programs • Supports client notification Server Config Handler notification via select() • Client configurable O State Request Handler I Status Device client can write command string server parses it to enable per-client behavior Client1 Client2 Client3 57 Packet Device • Designed for message streams • Supports multiple clients • Supports queuing Round-robin service of output queues Delivery of messages to all/ specific clients Server Packet Device O F • Client-configurable: Input and output queue lengths I I F O I F O I O Input filters Optional loopback of outputs to Client1 Client2 Client3 other clients (for snooping) 58 Device Files vs Regular Files • • Regular files: Require locking semantics to prevent race conditions between readers and writers Support “status” semantics but not queuing No support for notification, polling only Device files: Leverage kernel for serialization: no locking needed Arbitrary control of semantics: queuing, text/binary, per client configuration Immediate action, like an function call: system call on device triggers immediate response from service, rather than setting a request and waiting for service to poll 59 Interacting With em* • • Text/Binary on same device file Text mode enables interaction from shell and scripts Binary mode enables easy programmatic access to data as C structures, etc. EmStar device patterns support multiple concurrent clients IPC channels used internally can be viewed concurrently for debugging “Live” state can be viewed in the shell (“echocat –w”) or using emview 60 SOS: Motivation and Key Feature • Post-deployment software updates are necessary to • customize the system to the environment • upgrade features • remove bugs • re-task system • Remote reprogramming is desirable • Approach: Remotely insert binary modules into running kernel • software reconfiguration without interrupting system operation no stop and re-boot unlike differential patching Performance should be superior to virtual machines 61 Architecture Overview Tree Routing Application Dynamic Memory Timer Clock Function Pointer Control Blocks Scheduler Serial Framer UART Comm. Stack ADC • • Provides hardware abstraction & common services Maintains data structures to enable module loading Costly to modify after deployment Sensor Manager SPI Static Kernel • Dynamically Loadable Modules Light Sensor I2C Kernel Services Low-level Device Drivers Hardware Abstraction Layer Dynamic Modules • • • Drivers, protocols, and applications Inexpensive to modify after deployment Position independent 62 SOS Kernel • Hardware Abstraction Layer (HAL) • • Low layer device drivers interface with HAL • • Clock, UART, ADC, SPI, etc. Timer, serial framer, communications stack, etc. 
Kernel services • Dynamic memory management • Scheduling • Function control blocks 63 Kernel Services: Memory Management Fixed-partition dynamic memory allocation • • Constant allocation time • Low overhead Memory management features • • • Guard bytes for run-time memory overflow checks • Ownership tracking • Garbage collection on completion pkt = (uint8_t*)ker_malloc(hdr_size + sizeof(SurgeMsg), SURGE_MOD_PID); 64 Kernel Services: Scheduling • • SOS implements non-preemptive priority scheduling via priority queues Event served when there is no higher priority event • Low priority queue for scheduling most events • High priority queue for time critical events, e.g., h/w interrupts & sensitive timers • Prevents execution in interrupt contexts • post_long(TREE_ROUTING_PID, SURGE_MOD_PID, MSG_SEND_PACKET, hdr_size + sizeof(SurgeMsg), (void*)packet, SOS_MSG_DYM_MANAGED); 65 Modules • Each module is uniquely identified by its ID or pid • Has private state • Represented by a message handler & has prototype: int8_t handler(void *private_state, Message *msg) • Return value follows errno SOS_OK for success. -EINVAL, -ENOMEM, etc. for failure 66 Kernel Services: Module Linking • Orthogonal to module distribution protocol • Kernel stores new module in free block located in program memory and critical information about module in the module table • Kernel calls initialization routine for module • Publish functions for other parts of the system to use char tmp_string = {'C', 'v', 'v', 0}; ker_register_fn(TREE_ROUTING_PID, MOD_GET_HDR_SIZE, tmp_string, (fn_ptr_t)tr_get_header_size); • Subscribe to functions supplied by other modules char tmp_string = {'C', 'v', 'v', 0}; s->get_hdr_size = (func_u8_t*)ker_get_handle(TREE_ROUTING_PID, MOD_GET_HDR_SIZE, tmp_string); • Set initial timers and schedule events 67 Module–to–Kernel Communication System Call System Jump Table HW Specific API • Module A System Messages High Priority Message Buffer SOS Kernel Hardware Interrupt Kernel provides system services and access to hardware ker_timer_start(s->pid, 0, TIMER_REPEAT, 500); ker_led(LED_YELLOW_OFF); • Kernel jump table re-directs system calls to handlers • • upgrade kernel independent of the module Interrupts & messages from kernel dispatched by a high priority message buffer • low latency • concurrency safe operation 68 Inter-Module Communication Module A Module B Module A Module B Indirect Function Call Post Module Function Pointer Table Inter-Module Function Calls • • • • Synchronous communication Kernel stores pointers to functions registered by modules Message Buffer Inter-Module Message Passing • • Blocking calls with low latency • Type-safe runtime function binding • Asynchronous communication Messages dispatched by a twolevel priority scheduler Suited for services with long latency Type safe binding through publish / subscribe interface 69 Synchronous Communication • Module A Module can register function for low latency blocking call (1) Module B • 3 2 Modules which need such function can subscribe to it by getting function pointer pointer (i.e. 
**func) (2) 1 Module Function Pointer Table • When service is needed, module dereferences the function pointer pointer (3) 70 Asynchronous Communication 3 2 Module A 4 Module B Network 1 Msg Queue 5 Send Queue • Module is active when it is handling the message (2)(4) • Message handling runs to completion and can only be interrupted by hardware interrupts • Module can send message to another module (3) or send message to the network (5) • Message can come from both network (1) and local host (3) 71 Module Safety • • Problem: Modules can be remotely added, removed, & modified on deployed nodes Accessing a module • • • If module doesn't exist, kernel catches messages sent to it & handles dynamically allocated memory If module exists but can't handle the message, then module's default handler gets message & kernel handles dynamically allocated memory Subscribing to a module’s function • • • • Publishing a function includes a type description that is stored in a function control block (FCB) table Subscription attempts include type checks against corresponding FCB Type changes/removal of published functions result in subscribers being redirected to system stub handler function specific to that type Updates to functions w/ same type assumed to have same semantics 72 Module Library • • Some applications created by combining already written and tested modules SOS kernel facilitates loosely coupled modules • Passing of memory ownership • Efficient function and messaging interfaces Surge Photo Sensor Surge Application with Debugging Memory Debug Tree Routing 73 Module Design #include <module.h> typedef struct { uint8_t pid; uint8_t led_on; } app_state; DECL_MOD_STATE(app_state); DECL_MOD_ID(BLINK_ID); int8_t module(void *state, Message *msg){ app_state *s = (app_state*)state; switch (msg->type){ case MSG_INIT: { s->pid = msg->did; s->led_on = 0; ker_timer_start(s->pid, 0, TIMER_REPEAT, 500); break; } case MSG_FINAL: { ker_timer_stop(s->pid, 0); break; } case MSG_TIMER_TIMEOUT: { if(s->led_on == 1){ ker_led(LED_YELLOW_ON); } else { ker_led(LED_YELLOW_OFF); } s->led_on++; if(s->led_on > 1) s->led_on = 0; break; } default: return -EINVAL; } return SOS_OK; • Uses standard C • Programs created by “wiring” modules together 74 Sensor Manager Module A Periodic Access Module B Signal Data Ready • Polled Access • Sensor Manager • getData dataReady MagSensor • Enables sharing of sensor data between multiple modules Presents uniform data access API to diverse sensors Underlying device specific drivers register with the sensor manager Device specific sensor drivers control • • ADC I2C • Calibration Data interpolation Sensor drivers are loadable: enables • • post-deployment configuration of sensors hot-swapping of sensors on a running node 75 Application Level Performance Comparison of application performance in SOS, TinyOS, and MateVM Surge Tree Formation Latency Surge Forwarding Delay Platform ROM RAM SOS Core Dynamic Memory Pool TinyOS with Deluge Mate VM 20464 B 1163 B 1536 B 21132 B 597 B 39746 B 3196 B Memory footprint for base operating system with the ability to distribute and update node programs. System TinyOs SOS Mate VM Surge Packet Delivery Ratio Active Time Active Time Overhead relative to (in 1 min) (%) TOS (%) 3.31 sec 5.22% NA 3.50 sec 5.84% 5.70% 3.68 sec 6.13% 11.00% CPU active time for surge application. 
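To make the synchronous inter-module path of slides 67-70 concrete, below is a minimal sketch of a subscriber module in plain C. It reuses only calls shown on the slides above (ker_get_handle with the MOD_GET_HDR_SIZE id and the 'Cvv' type string, ker_timer_start/stop, ker_led, DECL_MOD_STATE/DECL_MOD_ID); the module id SUB_MOD_PID, the state layout, and the exact typedef and calling convention behind the function-pointer pointer are illustrative assumptions, not SOS source.

#include <module.h>

typedef struct {
  uint8_t pid;
  func_u8_t *get_hdr_size;   /* pointer to the provider's registered function pointer */
} sub_state;

DECL_MOD_STATE(sub_state);
DECL_MOD_ID(SUB_MOD_PID);    /* hypothetical module id for this sketch */

int8_t module(void *state, Message *msg) {
  sub_state *s = (sub_state*)state;
  switch (msg->type) {
    case MSG_INIT: {
      char tmp_string[] = {'C', 'v', 'v', 0};
      s->pid = msg->did;
      /* (2) subscribe: fetch the function pointer pointer from the kernel table */
      s->get_hdr_size = (func_u8_t*)ker_get_handle(TREE_ROUTING_PID,
                                                   MOD_GET_HDR_SIZE, tmp_string);
      ker_timer_start(s->pid, 0, TIMER_REPEAT, 500);
      break;
    }
    case MSG_TIMER_TIMEOUT: {
      if (s->get_hdr_size != NULL) {
        /* (3) dereference the pointer pointer for a low-latency blocking call */
        uint8_t hdr_size = (*(s->get_hdr_size))();
        ker_led(hdr_size > 0 ? LED_YELLOW_ON : LED_YELLOW_OFF);
      }
      break;
    }
    case MSG_FINAL: {
      ker_timer_stop(s->pid, 0);
      break;
    }
    default:
      return -EINVAL;
  }
  return SOS_OK;
}

If the provider's function is later removed or changes type, the kernel redirects the stored pointer to a stub handler (slide 72), so the dereference in the timer handler stays safe.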
76 Reconfiguration Performance Module Name sample_send tree_routing photo_sensor Energy (mJ) Latency (sec) Code Size (Bytes) 568 2242 372 2312.68 46.6 Code Size Write Cost Write System (Bytes) (mJ/page) Energy (mJ) SOS 1316 0.31 1.86 TinyOS 30988 1.34 164.02 Mate VM NA NA NA Energy cost of light sensor driver update Code Size Write Cost Write System (Bytes) (mJ/page) Energy (mJ) SOS 566 0.31 0.93 TinyOS 31006 1.34 164.02 Mate VM 17 0 0 Module size and energy profile for installing surge under SOS Energy cost of surge application update • Energy trade offs SOS has slightly higher base operating cost TinyOS has significantly higher update cost SOS is more energy efficient when the system is updated one or more times a week 77 Platform Support Supported micro controllers • Atmel Atmega128 • • • 4 Kb RAM 128 Kb FLASH Oki ARM • • 32 Kb RAM 256 Kb FLASH Supported radio stacks • Chipcon CC1000 • • BMAC Chipcon CC2420 • IEEE 802.15.4 MAC (NDA required) 78 Simulation Support • • • Source code level network simulation • Pthread simulates hardware concurrency • UDP simulates perfect radio channel • Supports user defined topology & heterogeneous software configuration • Useful for verifying the functional correctness Instruction level simulation with Avrora • Instruction cycle accurate simulation • Simple perfect radio channel • Useful for verifying timing information • See http://compilers.cs.ucla.edu/avrora/ EmStar integration under development 79 Contiki Dynamic loading of programs (vs. static) Multi-threaded concurrency managed execution (in addition to event driven) Available on MSP430, AVR, HC12, Z80, 6502, x86, ... Simulation environment available for BSD/Linux/Windows 81 Key ideas Dynamic loading of programs • Selective reprogramming Static/pre-linking (early work: EmNets) Dynamic linking (recent work: SENSYS) Key difference from SOS: no assumption of position independence Concurrency management mechanisms • Events and threads Trade-offs: preemption, size 82 Loadable programs One-way dependencies • Core resident in memory If programs “know” the core • Language run-time, communication Can be statically linked And call core functions and reference core variables freely Core Individual programs can be loaded/unloaded Need to register their variable and function information with core 83 84 Loadable programs (contd.) Programs can be loaded from anywhere • Radio (multi-hop, single-hop), EEPROM, etc. Core • During software development, usually change only one module 85 Core Symbol Table • Registry of names and addresses of all externally visible variables and functions of core modules and run-time libraries • • Offers API to linker to search registry and to update registry Created when Contiki core binary image is compiled multiple pass process 86 Linking and relocating a module 1. Parse payload into code, data, symbol table, and list of “relocation entries” which correspond to an instruction or address in code or data that needs to be updated with a new address consist of o o o a pointer to a symbol, such as a variable name or a function name or a pointer to a place in the code or data address of the symbol a relocation type which specifies how the data or code should be updated 2. Allocate memory for code & data is flash ROM and RAM 3. Link and relocate code and data segments — for each relocation entry, search core symbol table and module symbol table — if relocation is relative than calculate absolute address 4. 
Write code to flash ROM and data to RAM 87 Contiki size (bytes) Module Code MSP430 Code AVR RAM Kernel 810 1044 10 + e + p Program loader 658 - 8 Multi-threading library 582 678 8+s Timer library 60 90 0 Memory manager 170 226 0 Event log replicator 1656 1934 200 µIP TCP/IP stack 4146 5218 18 + b 88 Revisiting Multi-threaded Computation Thread Threads blocked, waiting Thread Thread for events Kernel unblocks threads when event occurs Thread runs until next blocking statement Kernel Each thread requires its own stack Larger memory usage 90 Event-driven vs multi-threaded Event-driven Multi-threaded - No wait() statements + wait() statements - No preemption + Preemption possible - State machines + Sequential code flow + Compact code - Larger code overhead + Locking less of a problem - Locking problematic + Memory efficient - Larger memory requirements How to combine them? 91 Contiki: event-based kernel with threads Kernel is event-based • Most programs run directly on top of the kernel • Multi-threading implemented as a library • Threads only used if explicitly needed Long running computations, ... Preemption possible • Responsive system with running computations 92 Responsiveness Computation in a thread 93 Threads implemented atop an event-based kernel Event Thread Thread Event Kernel Event Event 94 Implementing preemptive threads 1 Thread Event handler Timer IRQ 95 Implementing preemptive threads 2 Event handler yield() 96 Memory management Memory allocated when module is loaded • Both ROM and RAM Fixed block memory allocator Code relocation made by module loader • Exercises flash ROM evenly 97 Protothreads: light-weight stackless threads Protothreads: mixture between event-driven and threaded • A third concurrency mechanism • Allows blocked waiting • Requires per-thread no stack • Each protothread runs inside a single C function • 2 bytes of per-protothread state 98 Mate: A Virtual Machine for Sensor Networks Why VM? • Large number (100’s to 1000’s) of nodes in a coverage area • Some nodes will fail during operation • Change of function during the mission Related Work • PicoJava assumes Java bytecode execution hardware • K Virtual Machine requires 160 – 512 KB of memory • XML too complex and not enough RAM • Scylla VM for mobile embedded system 99 Mate features • Small (16KB instruction memory, 1KB RAM) • Concise (limited memory & bandwidth) • Resilience (memory protection) • Efficient • Tailorable (user defined instructions) (bandwidth) 10 Mate in a Nutshell • Stack architecture • Three concurrent execution contexts • Execution triggered by predefined events • Tiny code capsules; self-propagate into network • Built in communication and sensing instructions 10 When is Mate Preferable? • • For small number of executions GDI example: Bytecode version is preferable for a program running less than 5 days • In energy constrained domains • Use Mate capsule as a general RPC engine 10 Mate Architecture Stack based architecture Single shared variable • Three events: • Message reception • Message send 0 1 Less prone to bugs PC Code • Simplifies programming 3 gets/sets Hides asynchrony • 2 Receive Clock timer Events Send • Subroutines Clock gets/sets Operand Stack Return Stack 10 Instruction Set One byte per instruction Three classes: basic, s-type, x-type • basic: arithmetic, halting, LED operation • s-type: messaging system • x-type: pushc, blez 8 instructions reserved for users to define Instruction polymorphism • e.g. 
add(data, message, sensing) 10 Code Example(1) • Display Counter to LED gets pushc 1 add copy sets pushc 7 and putled halt # # # # # # # # # Push heap variable on stack Push 1 on stack Pop twice, add, push result Copy top of stack Pop, set heap Push 0x0007 onto stack Take bottom 3 bits of value Pop, set LEDs to bit pattern 10 Code Capsules • One capsule = 24 instructions • Fits into single TOS packet • Atomic reception • Code Capsule Type and version information Type: send, receive, timer, subroutine 10 Viral Code • Capsule transmission: forw • Forwarding other installed capsule: forwo (use within clock capsule) • Mate checks on version number on reception of a capsule -> if it is newer, install it • Versioning: 32bit counter • Disseminates new code over the network 10 Component Breakdown • Mate runs on mica with 7286 bytes code, 603 bytes RAM 10 Network Infection Rate • 42 node network in 3 by 14 grid • Radio transmission: 3 hop network • Cell size: 15 to 30 motes • Every mote runs its clock capsule every 20 seconds • Self-forwarding clock capsule 10 Bytecodes vs. Native Code • Mate IPS: ~10,000 • Overhead: Every instruction executed as separate TOS task 11 Installation Costs • • Bytecodes have computational overhead But this can be compensated by using small packets on upload (to some extent) 11 Customizing Mate • Mate is general architecture; user can build customized VM • User can select bytecodes and execution events • Issues: Flexibility vs. Efficiency Customizing increases efficiency w/ cost of changing requirements Java’s solution: General computational VM + class libraries Mate’s approach: More customizable solution -> let user decide 11 How to … • Select a language -> defines VM bytecodes • Select execution events -> execution context, code image • Select primitives -> beyond language functionality 11 Constructing a Mate VM This generates a set of files -> which are used to build TOS application and to configure script program 11 Compiling and Running a Program Send it over the network to a VM VM-specific binary code Write programs in the scripter 11 Bombilla Architecture Once context: perform operations that only need single execution 16 word heap sharing among the context; setvar, getvar Buffer holds up to ten values; bhead, byank, bsorta 11 Bombilla Instruction Set basic: arithmetic, halt, sensing m-class: access message header v-class: 16 word heap access j-class: two jump instructions x-class: pushc 11 Enhanced Features of Bombilla • Capsule Injector: programming environment • Synchronization: 16-word shared heap; locking scheme • Provide synchronization model: handler, invocations, resources, scheduling points, sequences • Resource management: prevent deadlock • Random and selective capsule forwarding • Error State 11 Discussion • • Comparing to traditional VM concept, is Mate platform independent? Can we have it run on heterogeneous hardware? Security issues: How can we trust the received capsule? Is there a way to prevent version number race with adversary? • • In viral programming, is there a way to forward messages other than flooding? After a certain number of nodes are infected by new version capsule, can we forward based on need? Bombilla has some sophisticated OS features. What is the size of the program? Does sensor node need all those features? 
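As a concrete illustration of the stack architecture and one-byte instructions discussed above, here is a small, self-contained C sketch of a bytecode dispatch loop. It is not Maté's code: the opcode values, stack depth, and context layout are invented for illustration, and Maté additionally wraps each instruction in a TOS task and triggers execution from events (send, receive, clock).

#include <stdint.h>
#include <stdio.h>

/* Illustrative opcodes; values are made up, not Mate's encoding. */
enum { OP_HALT = 0x00, OP_ADD = 0x01, OP_COPY = 0x02, OP_PUSHC = 0x80 };

#define STACK_DEPTH 16

typedef struct {
  int16_t stack[STACK_DEPTH];  /* operand stack */
  uint8_t sp;                  /* stack pointer */
  uint8_t pc;                  /* program counter into the capsule */
} context_t;

static void push(context_t *c, int16_t v) { if (c->sp < STACK_DEPTH) c->stack[c->sp++] = v; }
static int16_t pop(context_t *c) { return c->sp ? c->stack[--c->sp] : 0; }

/* Run one capsule (a byte array of instructions) to completion. */
static void run(context_t *c, const uint8_t *code, uint8_t len) {
  while (c->pc < len) {
    uint8_t op = code[c->pc++];
    if (op & 0x80) {                 /* x-type: pushc with embedded constant */
      push(c, op & 0x7F);
    } else switch (op) {             /* basic class */
      case OP_ADD:  push(c, pop(c) + pop(c)); break;
      case OP_COPY: { int16_t v = pop(c); push(c, v); push(c, v); break; }
      case OP_HALT: return;
      default: return;               /* unknown opcode: stop this context (illustrative safety check) */
    }
  }
}

int main(void) {
  /* capsule: pushc 1, pushc 2, add, halt */
  const uint8_t capsule[] = { OP_PUSHC | 1, OP_PUSHC | 2, OP_ADD, OP_HALT };
  context_t ctx = {0};
  run(&ctx, capsule, sizeof capsule);
  printf("top of stack: %d\n", ctx.stack[0]);
  return 0;
}

Running it leaves 3 on the operand stack, mirroring the structure of the counter example above; the per-instruction dispatch also makes visible where the interpretation overhead relative to native code comes from.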
11 .NET MicroFramework (MF) Architecture • • .NET MF is a bootable runtime environment tailored for embedded development MF services include: Boot Code Code Execution Thread Management Memory Management Hardware I/O 12 .NET MF Hardware Abstraction Layer (HAL) • Provides an interface to access hardware and peripherals Relevant only for system, not application developers • Does not require operating system Can run on top of one if available • Interfaces include: Clock Management Core CPU Communications External Bus Interface Unit (EBIU) Memory Management Power 12 .NET MF Platform Abstraction Layer (PAL) • Provides hardware independent abstractions Used by application developers to access system resources Application calls to PAL managed by Common Language Runtime (CLR) In turn calls HAL drivers to access hardware • PAL interfaces include: Time PAL Memory Management Input/Output Events Debugging Storage 12 Threading Model • User applications may have multiple threads Represented in the system as Managed Threads serviced by the CLR Time sliced context switching with (configurable) 20ms quantum Threads may have priorities • CLR has a single thread of execution at the system level Uses cooperative multitasking 12 Timer Module • MF provides support for accessing timers from C# • Enables execution of a user specified method At periodic intervals or one-time Callback method can be selected when timer is constructed • Part of the System.Threading namespace Callback method executes in a thread pool thread provided by the system 12 Timer Interface • • Callback: user specified method to be executed State: information used by callback method May be null • • • Duetime: delay before the timer first fires Period: time interval between callback invocations Change method allows user to stop timer Change period to -1 12 ADC Extension to the HAL • Extended MF HAL to support ADC API’s High-precision, low latency sampling using hardware clock Critical for many signal processing applications • Supported API functions include Initialize: initialize ADC peripheral registers and the clocks UnInitialize: reset ADC peripheral registers and uninitialize clocks ConfigureADC: select ADC parameters (mode, input channels, etc) StartSampling: starts conversion on selected ADC channel GetSamplingStatus: whether in progress or complete GetData: returns data stored in ADC data register 12 Radio Extension to the HAL • Extended the MF HAL to support radio API’s • Supported API functions include On: powers on radio, configures registers, SPI bus, initializes clocks Off: powers off radio, resets registers, clocks and SPI bus Configure: sets radio options for 802.15.4 radio BuildFrame: constructs data frame with specified parameters destination address, data, ack request SendPacket: sends data frame to specified address ReceivePacket: receives packet from a specified source address 12 MAC Extension to PAL • Built-in, efficient wireless communication protocol OMAC (Cao, Parker, Arora: ICNP 2006) Receiver centric MAC protocol Highly efficient for low duty cycle applications Implemented as a PAL component natively on top of HAL radio extensions for maximum efficiency Exposes rich set of wireless communication interfaces OMACSender OMACReceiver OMACBroadcast Easy, out-of-the-box Wireless Communication Complete abstraction of native, platform or protocol specific code from application developers 12
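The ADC extension above is described only by its entry points, so the following plain-C sketch shows one plausible way a native sampling routine could sequence those calls (Initialize, ConfigureADC, StartSampling, GetSamplingStatus, GetData, UnInitialize). All prototypes and the stub bodies are hypothetical; the lecture does not give the real signatures.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical prototypes for the HAL ADC entry points named on the slide; stubbed so the sketch runs. */
static bool     ADC_Initialize(void)                           { return true; }  /* stub */
static bool     ADC_ConfigureADC(uint8_t ch, uint32_t rate_hz) { (void)ch; (void)rate_hz; return true; }  /* stub */
static bool     ADC_StartSampling(void)                        { return true; }  /* stub */
static bool     ADC_GetSamplingStatus(void)                    { return true; }  /* stub: conversion complete */
static uint16_t ADC_GetData(void)                              { return 512; }   /* stub sample */
static void     ADC_UnInitialize(void)                         { }

/* Sketch: take one sample from a hypothetical channel 0 at 1 kHz. */
static uint16_t sample_once(void) {
  uint16_t value = 0;
  if (ADC_Initialize() && ADC_ConfigureADC(0, 1000) && ADC_StartSampling()) {
    while (!ADC_GetSamplingStatus()) {
      /* poll; the real extension drives sampling from the hardware clock for low latency */
    }
    value = ADC_GetData();
  }
  ADC_UnInitialize();
  return value;
}

int main(void) { printf("sample = %u\n", sample_once()); return 0; }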