A Network Virtual Machine for Real-Time Coordination Services
Professor Jack Stankovic, PI
Department of Computer Science
University of Virginia
June 2001

Outline
• Overview
  – Problem/Goal
  – Research Team/Team Coordination
• Specific Problems/Key Issues
• Research Approach
• Success
• Schedule and Milestones
• Deliverables

Sensor/Actuator Clouds
[Figure labels: Resource management, team formation, real-time, mobility, power; Heterogeneous Sensors/Actuators/CPUs; Smart Dust]
• battlefield awareness (more later)
• earthquake response
• tracking movements of animals

Goal
• Create a network virtual machine that is a coordination and control layer (middleware) that
  – abstracts
  – controls, and
  – guarantees
  aggregate behavior for unreliable and mobile networks of sensors, actuators, and processors.

The Team
[Figure: team diagram. Labels: Lockheed Martin, Virginia, CMU, Illinois; Applications Req., FC, Aggregate Control, Team Coord., Wireless, Data Discovery, MMDP, RT]

The Team
• University of Virginia
  – Tarek Abdelzaher, Sang Son, Jack Stankovic (PI), Gang Tao
• University of Illinois
  – Lui Sha, P. R. Kumar
• CMU
  – Bruce Krogh
• Lockheed Martin
  – Dennis Adams

Primary Responsibilities
• Applications and Transition - Adams
• Data Discovery - Son
• Team Coordination - Sha and Abdelzaher
• Aggregate Control - Stankovic, Tao, Krogh
• Wireless - Kumar

Specific Problems/Key Issues
• Application Requirements
• Aggregation - system as a whole must meet requirements
  – individual entities not critical
  – Real-Time, Power, Mobility, Wireless, Size, Cost, (Security and Privacy)
• Self-organizing protocols that organize mobile sensor control agents into teams
• Environment Data Discovery
• Wireless Communications - capacity management

Overview of Research Approach
• Application requirements
• Behavior specification language - listen, move, call-in-fire, call-in-jamming
• Integration of real-time computing theory, multi-mode MDP, and feedback control theory
• Composable and scalable micro-protocols that can self-organize distributed devices into collaborative teams to achieve aggregate goals
• Protocols for dynamic environmental data discovery
• Scaling of wireless networks and protocols for capacity management and interaction with aggregate control

A Network Virtual Machine for Real-Time Coordination Services
[Figure: Integrated Theory. Multi-Mode Markov Decision Processes (chooses modes); Set of Adaptive Controllers 1 with Elastic RT Scheduling through Set of Adaptive Controllers N with Elastic RT Scheduling; Robust Feedback Control and Real-Time Scheduling Theory combined to design each set of controllers]

Middleware Architecture
[Figure: middleware architecture diagram; no labels recovered]

Notional NEST Application: Distributed Surveillance Network
• Large, heterogeneous network of unattended sensor/communication nodes provides battlefield awareness to military commanders at all echelons.
  – Unattended ground sensors
  – Robotic ground vehicles
  – Micro air vehicles
  – Miniature aerostats
• Nodes collect, filter, and route battlefield information to client.
  – Visible and IR imagery
  – Seismic and acoustic
  – RF
  – Chemical

Distributed Surveillance Network
[Figure labels: Enemy Activity; Node 1, Node 2, Node 3; ranges a and b]
• Node communication range (a): 2x node sensor range (b)
• Each node capable of sensing and relaying data to neighbors
• Network learns patterns, recognizes anomalies, and routes information to appropriate clients

Distributed Surveillance Network
[Figure label: Decoy AAA]
• Typical Operational Situation (OPSIT)
  – Network deployed from high altitude to assess enemy air defenses prior to strike.
  – Network identifies potential enemy AAA sites, communicates locations to command structure.
  – Network associates tracks from AAA node neighbors to postulate increased vehicular traffic at specific candidate sites.
  – Nodes local to candidate sites monitor increased human activity as hostilities increase; decoy AAA sites rejected.
  – Network routes around failed nodes to distribute targeting and BDA information during and after air strike.
University of Virginia, University of Illinois, CMU, Lockheed Martin NE&SS-Akron

How the Problems Change
• Environment
  – connect to physical environment (large numbers)
  – massively parallel interfaces
  – faulty, highly dynamic, non-deterministic
• Network
  – wireless
  – structure is dynamically changing
  – sporadic connectivity
  – new resources entering/leaving
  – large amounts of redundancy
  – self-configure/re-configure

Aggregate Performance
• Specify and control emerging behavior to meet system-level requirements
  – Smart Clouds of sensors/actuators/CPUs in battlefield environments
• Combine FC, MMDP and elastic RT scheduling

FC-EDF scheduler
[Figure: FC-EDF block diagram. Labels: Submitted Tasks, Admission Controller (Admit/Reject), QoS Controller (Adjust QoS), Accepted Tasks, EDF Scheduler, CPU, Completed Tasks, MR(t), MRs, PID Controller, U]
Source: "Design and Evaluation of a Feedback Control EDF Scheduling Algorithm", IEEE RTSS'99

Performance Specs: Transient Response
[Figure: transient response of a second-order system, y(t) versus t]

FC-EDF scheduler
[Figure: FC-EDF2 block diagram. Labels: Submitted Tasks, Admission Controller (Admit/Reject), QoS Controller (Adjust QoS), Accepted Tasks, EDF Scheduler, CPU, Completed Tasks, MR(t), MRs, U(t), Us, Um, two PID Controllers, Min]

Network Architectures: Classical
[Figure: hierarchical and neighborhood topologies over nodes 1 to 15]

Distributed Control System Architecture
* Move into network for HCLOSE
* Added functionality for NCLOSE
[Figure labels: DFCS, LFCS, P-5, P-3, P-2, PID-4, PID-1, min, slr_setpoint, slr_ctrl, ctrl_signal, Node-MR, RCSL, AC, Actuator, SLR, System, CPU_Util, MR]

Network Architectures: Non-classical
• Clouds of sensors/actuators/CPUs
  – network architecture dynamically changing (fast)
  – subject to high error rate
  – new resources entering and leaving
    • due to mobility, faults, …
  – Power/mobility/communication/computation/security tradeoffs

Aggregate Control
• Feedback Control Theory
  – explicit use of real-time computer system models
  – transient performance specifications
  – adaptive/robust control
  – utilization bounds
  – elastic control
  – random algorithms

The Multi-Mode MDP Approach
• NEST applications as Markov decision processes
  – Discrete-state, discrete-time features
  – Markovian behavior
  – Influence of resource allocation decisions
• Challenges
  – size and complexity of NEST applications
  – abrupt and random changes in topology
  – abrupt and random changes in the environment
• Multi-mode approach
  – basic MDP formulation is intractable for NEST
  – behaviors can be aggregated into modes corresponding to various topologies/components
  – multi-mode policies

Multi-Mode MDPs
[Figure: two-level MDP model. Labels: Strategies; resource allocation policies P1 to Pn; action ak; switching rule; state estimation X̂k; mode estimation m̂k; NEST Virtual Machine; NEST Components; observations; Sensor/Actuator Interactions; mode MDP (mode mk); state MDP (state xk); ENVIRONMENT; multi-mode MDP resource allocation strategy]

MMDP Research Issues
• Modeling
  – state variables and validation of Markov assumption
  – action variables and influences on transition probabilities
  – network and environmental modes
  – observable states and modes
• Scalable Strategies
  – design of mode-matching policies
  – state and mode aggregation
  – mode estimation and policy switching
• Adaptive Strategies
  – run-time policy improvement
• Integration
  – data acquisition and fusion from NEST sensors
  – with local/global individual mode controllers
  – implementation via micro-protocols

Summary - Aggregate Control
[Figure: Integrated Theory. Multi-Mode Markov Decision Processes (chooses modes); Set of Adaptive Controllers with Elastic RT Scheduling; Robust Feedback Control and Real-Time Scheduling Theory combined to design each set of controllers]

Team Formation
• For each major task, a reference model for an ideal team is defined (the dream team model)
  – Roles and members needed (minimal, ideal)
  – Computational requirements (minimal, ideal)
  – Communication flow (minimal, ideal)
• Utility functions to be defined, so that we can compute the gain as a function of the members, computation, and communication resources available.
• Teams compete for resources: members, computation, and communication. Allocate resources to maximize total payoff.
• Challenge fundamental assumptions, e.g., in consensus algorithms

Data Discovery
• Find interesting information in the environment - geographic based
  – move proper resources to those areas of interest
• Procedure
  – identify target data streams and attributes needed
  – remove noise and outliers, synchronize streams, etc.
  – data discovery (find patterns of interest)
• Analogy: data mining on a non-stationary dataset

Challenges in Wireless Networks
• Networks of wireless nodes - Ad Hoc Networks
  – Spontaneously deployable anywhere
  – Adaptive to nodes, mobility, volatility
• Issues
  – How much traffic can they carry? Scalability
  – Performance of protocols for power control, routing, MAC, …
  – Clean abstraction for control and surveillance

Approach
• Power control algorithms
  – for enhancing capacity
  – for providing power-aware routes
  – for reducing MAC contention
• Media Access Control
  – build on SEEDEX protocol
  – no reservations
  – new idea of exchanging the seeds of random number generators
• Study performance and scaling of routing algorithms
• Study performance of transport layer protocols

Success
• Application Level (battlefield scenario):
  – Find information faster and more accurately via coordination, react quicker and with higher throughput, re-configure when necessary, able to scale
• Network Virtual Machine for NEST
  – hide complexity of environment
• Unified theory of QoS aggregate control
• Self-configuring team formation protocols under new constraints
• Etc.

Tasks
• 1: Application Req.
• 2: Behavioral Spec Lang
• 3: Mapping to System Level Parameters
• 4: Architecture For Data Discovery
• 5: Data Discovery Protocols
• 6: Micro-Protocols for Team Formation
  – form teams
  – timely and coherent info
• 7: Robust and Adaptive Controllers
  – decentralized control
  – MMDP
• 8: Option years
• 9: Testbed Development
• 10: Testing and Demos
• 11: Reports and Papers
• 12: Work with OEP

Schedule and Milestones
[Figure: schedule chart; no labels recovered]

Deliverables
• An API that supports behavioral abstractions
• Library routines to map behavioral abstractions into system level requirements
• Architecture design for data discovery
• Micro-protocols for team formation
• Aggregate QoS control for first part of scheduling problem (as defined in proposal)
• Simulation testbed (for first stage)
• Quarterly reports, final report

A Network Virtual Machine for Real-Time Coordination Services
[Figure: Network Virtual Machine (hides complexity of physical environment - battlefield awareness); Resource management, team formation, real-time, mobility, power; Heterogeneous Sensors/Actuators/CPUs]

New Ideas
• Integration of real-time computing theory, multi-mode MDP, and feedback control theory
• Composable and scalable micro-protocols that can self-organize distributed devices into collaborative teams to achieve aggregate goals
• Scaling of wireless networks and protocols for capacity enhancement
• Protocols for dynamic environmental data discovery

Impact
• Guaranteed aggregate behavior of NEST systems
• Control of mobile sensor/actuator/computer networks
• Large scale distributed team coordination
• Theory and practice for performance control
• Survival of essential services

Schedule
• 16 Months
  – behavior spec. language
  – self-organizing teams protocol
  – QoS aggregate control
  – demo
• Year 2
  – protocols for self-organizing nodes
  – robust and adaptive controllers
  – demo
• Year 3
  – integrated theory
  – NEST middleware
  – demo

John A. Stankovic (stankovic@cs.virginia.edu), University of Virginia
University of Illinois, CMU, Lockheed Martin
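
Supplementary sketch: feedback-controlled admission
The FC-EDF scheduler slides show a PID controller that compares the measured miss ratio MR(t) with a set point MRs and drives an admission controller that admits or rejects submitted tasks. The Python sketch below illustrates that loop shape only. It is not the algorithm from the RTSS'99 paper: the gains, the first-come admission rule, the clamping of the budget to [0, 1], and every name (PID, Task, admit, control_step) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Discrete PID controller. Gains are illustrative, not tuned values."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float) -> float:
        # Accumulate the integral term and difference the error once per sample.
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


@dataclass
class Task:
    name: str
    utilization: float  # hypothetical estimate: execution time / period


def admit(submitted, budget):
    """Admit tasks in submission order while their summed utilization fits the budget."""
    accepted, rejected, used = [], [], 0.0
    for task in submitted:
        if used + task.utilization <= budget:
            accepted.append(task)
            used += task.utilization
        else:
            rejected.append(task)
    return accepted, rejected


def control_step(pid, budget, miss_ratio, set_point):
    """One sampling period: return the new utilization budget, clamped to [0, 1]."""
    # Negative error means more deadlines were missed than the set point allows,
    # so the controller output lowers the budget and fewer tasks are admitted.
    error = set_point - miss_ratio
    delta_u = pid.step(error)
    return min(1.0, max(0.0, budget + delta_u))
```

A caller would run control_step once per sampling period with the miss ratio observed over that period, then pass the returned budget to admit for newly submitted tasks. A proportional-only controller (ki = kd = 0) is the simplest starting point for experimenting with the loop.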