SOFTWARE ENGINEERING FOR DEPENDABLE SYSTEMS
John C. Knight
Department of Computer Science
University of Virginia
Copyright 2004 - John C. Knight

Overview
• My General Interest: Systems that are REALLY important.
• Systems where failure means: injury, death, destruction, chaos, etc.
• Two halves of overall research program:
  • Safety-critical Embedded Systems
  • Crucial Application Networks

Electronic Automobile Systems
• Steering
• Brakes
• Traction Control
• Engine Control
• Transmission
• Suspension
By wire… With no physical backup…

Digital Fly By Wire
• Computers
• Failure Rate < 10^-9/hour
• Wire Not Plumbing....
• Networks

System Complexity
[Figure: Relative Complexity (scale 0 to 7) by aircraft, in four groups: Analog Avionics, Hybrid Avionics, Digital Avionics, Integrated Digital Avionics. Aircraft shown: B-777, F-22, F-15E, B-1B, F-18C, C-17C, F-14D, F-16C, F-18A, E-2C, F-15A, F-14A, with dates from 1971 to 1996. Two points are annotated 2M* and 1.5M*. *Lines of Code. 03/6/02, from Steve Miller, Rockwell Collins.]

Critical Infrastructure
These are safety-critical systems

Wide Area Augmentation System
• Free flight
• Precision approaches

Sizewell B Nuclear Plant
• Primary protection system
• 100,000 lines of code
• Over 600 processors….
• 50,000 test cases
• “Failed” 52%
• Real problem was inability to determine correct response

Wide Area Augmentation System
                     Original estimate    Current estimate
Cost ($ millions)    892.4                2,900
Delivery date        1998                 2003
Many other major modernization programs in similar states (STARS, AMASS)

What Are The Areas Of Research?
• Formal methods, especially specification
• System architectures
• Verification
• Tools
• Other miscellaneous things that are fun

Specification
• About 60% of defects in practice are specification errors
• Community solution approach:
  • Formal languages, i.e., languages with semantics defined in mathematics
  • Powerful mechanism for communication and analysis
  • Rarely used…

The Situation At Present
[Figure: Industrial Practitioners and Academics, each with the speech label “Idiots”]
We think we understand this

The Situation In The Future
[Figure: Industrial Practitioners and Academics, with the speech labels “Don’t Do It Again” and “Sorry!”]

Specific Research
• Integration of formal and informal languages:
  • They are different, both are needed in all systems
  • How should they be combined?
  • How do you analyze the combination?
• Embedded system survivability:
  • Don’t make it reliable, make it survivable
  • Complex combination of specification, analysis & architecture
• Tool support:
  • Powerful toolsets developed
  • See: http://www.cs.virginia.edu/zeus

Zeus Specification Tools
[Diagram labels: SPECIFICATION; MEANING; Natural Language; Manipulation & Analysis; MAP; Formal Structure; FUNCTION; Formal Language; Manipulation & Analysis]
Analysis:
• Symbol definitions
• Symbol uses
• Invariants
• Etc.

The Network Problem
• Very Large Networks
• Interdependent Networks
• Heterogeneous Nodes
• Non-Local Faults
• Sequential Faults

Survivability As Control
[Diagram: From Sensors, “Sensor” Signals, Control Function, “Actuator” Commands, To Actuators]

Dynamic Reconfiguration
• Single Component Reconfiguration?
• Application Reconfiguration?

Willow Architecture Logical View
[Diagram labels: Reactive; Operator; Active Control; During Attack; Intelligence; Analysis; Development; Commands; New Postures; Administrator; Trust boundary; Active Management; Proactive; Before and After Attack]

Approach to Fault Treatment
[Diagram labels: Critical Networked Application; Actuators; Sensors; Application State & Analysis Model; Self Healing; Tolerate Anticipated Faults; Planned Posture Change; System Update; External Input; System Deployment]

Willow Architectural Issues
[Diagram labels: Critical Networked Application; Actuators; Sensors; Network State & Analysis Model; Self Healing; Tolerate Anticipated Faults; Planned Posture Change; System Update; System Deployment; External Input]
• Hierarchic faults
• Control loop interactions:
• Network scale:
• Volume of software
• State model
• Wide area change
• Exceptions and results:
• Asynchronous
• Priority & resources
• Conflicting goals
• Dynamic application membership
• Absolute vs. statistical
• Result “harvesting”
• Target system actuation:
• Lightweight
• Standard interface & protocol

Summary
• Lots of crucial applications, many more than most people think
• Very challenging engineering
• Very significant research problems
• Many exciting ideas here at UVA
• Lots of opportunities to contribute
Breakout session: Thursday at 5:00PM, Olsson 236D

Questions?
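
Supplementary sketch: Survivability As Control
The "Survivability As Control" slide frames survivability as a loop: "sensor" signals from the network feed a control function, which issues "actuator" commands that reconfigure the application. The Python below is only a minimal illustration of that loop under stated assumptions; it is not the Willow implementation. Every name in it (SensorSignal, Posture, control_function, actuate) and the 25% degradation threshold are hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Posture(Enum):
    """Hypothetical operating postures for a networked application."""
    NORMAL = "normal"
    DEGRADED = "degraded"  # reduced but still acceptable service


@dataclass
class SensorSignal:
    """A "sensor" signal: one node reporting whether it is healthy."""
    node: str
    healthy: bool


def control_function(signals, degrade_threshold=0.25):
    """Map "sensor" signals to an "actuator" command (a target posture).

    Assumption for illustration: the application should move to a degraded
    posture once the fraction of unhealthy nodes reaches the threshold.
    """
    if not signals:
        return Posture.NORMAL
    failed = sum(1 for s in signals if not s.healthy)
    if failed / len(signals) >= degrade_threshold:
        return Posture.DEGRADED
    return Posture.NORMAL


def actuate(current, target):
    """Apply the command; return the new posture and whether it changed."""
    return target, target != current


# One pass around the loop: sense, decide, actuate.
signals = [
    SensorSignal("n1", True),
    SensorSignal("n2", False),
    SensorSignal("n3", True),
    SensorSignal("n4", True),
]
posture, changed = actuate(Posture.NORMAL, control_function(signals))
print(posture.value, changed)
```

In a real system the control function would work from a state and analysis model rather than a single threshold, and the issues listed on the "Willow Architectural Issues" slide (conflicting goals, asynchronous results, network scale) are exactly what this toy version leaves out.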