Self-healing networks
When the going gets tough, the tough get going
2001 IPA Spring Days on Security
L. Spaanenburg, Groningen University, Department of Computing Science, P.O. Box 800, 9700 AV Groningen.
Mail: ben@cs.rug.nl, http://www.cs.rug.nl/~ben

Motivation
What is security?
Security involves guaranteed access to all resources at all times with top quality.
Threats:
- from outside
- from inside
Here: internal diseases only
April 2001 IPA Spring Days - Security

Agenda
What we need and what we can’t
• The nature of the net
• Disasters with central control
• The nature of self-healing
• In-line monitoring
• A hardware / software perspective
• Research view

The weak spot
It is the small dog that bites!
• A network consists of billions of tightly connected, distributed, heterogeneous components
• Things happen on a wide time/spatial scale with massive interaction
• A local disturbance can spread widely in almost no time
• Relationships and interdependencies are too complex for mathematical theories

User’s perspective on networks
An integrated Power, Information and Communication technology

Telephone network
A network can be a tree with central control
[Figure labels: long distance, medium distance, short distance; 1st-order exchange, 2nd-order exchange, local exchange, connection]

Data Network
Connectionless communication by broadcast
[Figure labels: Host, Router, Subnet, LAN]

Means of Communication
Sigh, there are so many ways to communicate
• Synchronous
  PDH: Plesiochronous Digital Hierarchy
  SDH: Synchronous Digital Hierarchy
  ISDN: Integrated Services Digital Network
• Asynchronous
  FDDI: Fiber Distributed Data Interface
  FR: Frame Relay
  ATM: Asynchronous Transfer Mode

Sources of Abnormality
What can go wrong, will go wrong
• Attacks from the outside world (service attack)
• Hiccups in
the network communication
• Failures on the network nodes
It’s a detection problem!

The Keeler-Allston disaster
The network is vulnerable to local abnormalities
• On 10 August 1996, the Keeler-Allston 500 kV power line tripped, creating a voltage depression, and the McNary Dam went to maximum
• The Ross-Lexington 230 kV line also tripped and pushed the McNary Dam over the edge
• The McNary Dam set off oscillations that went to 500 MW within 1.5 minutes
• The North-South Pacific Intertie isolated 11 US states and 2 Canadian provinces

The 1998 Galactic page out
The weak belly of the Earth
• In May 1998, the Galaxy-IV satellite was disabled by unknown causes
• US National Public Radio and 40M pagers went out, airline flights were delayed and data networks had to be manually reconfigured
• Many satellites orbit at 800 to 1400 km (geostationary orbit lies near 36,000 km); 13 (60-), 35 (70-), 69 (80-) and 250 (90-)
• 10 million pieces of debris > 1 mm

Other fault cascades
Cause/effect relations occur frequently
Finagle’s Law: “Anything that can go wrong, will”
Antibiotics cause resistance (DDT)
Code replication also works for errors

Self-healing in history
The name has been used before
• 1993: AT&T announced the self-healing wireless network
• 1998: SUN bought the RedCape Policy Framework for self-healing software
• 1998: HP released the self-healing version of OpenView Network Node Manager
• 2001: Concord Com. announced self-healing for the home

Self-Healing ingredients
Self-healing = Detection + Diagnosis + Self-Repair
• Application: handling the communication
• Presentation: message formatting
• Session: controls traffic between parties
• Transport: converts packets into frames and vice versa
• Network: controls frame routing
• Data Link: frames of bit sequences
• Physical: relays physical quantities
Network Test, Node Test, Reconfigure

An Initiative in Self-Healing
The Complex Interactive Networks/Systems Initiative
• The CIN/SI is funded by the Electric Power Research Institute and the US Dept. of Defense as part of the Government-Industry Collaborative University Research program
• 28 universities in 6 consortia started in Spring 1999 to spend $30 M in 5 years
• The approach is multi-agent technology

CIN/SI consortia
The different aspects of self-healing
• [CalTech] CIN Mathematical Foundation
• [CMU] Context-dependent Agents
• [Cornell] Failure Minimization
• [Harvard] Modeling and Diagnosis
• [Purdue] Intelligent Management
• [Washington] Defense to Attacks

Key issues
Central control comes too late by definition
• Pre-programming misses the target for lack of context dependence
• No damage would have occurred if the load on the McNary Dam had been decreased by 0.4% during the next 30 minutes
• Local agents making real-time decisions would have prevented the Keeler-Allston disaster.

Basic agent types
What are agents?
• Agents are called cognitive or rational when equipped with clear rules and algorithms
• Agents are called reactive when their functioning depends on the interrogation of the environment
Both types of agents are required on the decision-making layers, which handle reaction, coordination and deliberation respectively.

CIN/SI architecture (1)
Operational control of the power plant
[Figure labels: Triggering events; Events/alarm Filtering Agents; Plans/Decisions; Model update Agents; Command Agents; Controls; Events/alarms; Faults Isolation Agents; Frequency Stability Agents; Protection Agents; Generation Agents; Power System]

CIN/SI architecture (2)
Strategic management of the power grid
[Figure labels: Hidden Failure Monitoring Agents; Reconfiguration Agents; Vulnerability Assessment Agents; Restoration Agents; Events Identification Agents; Planning Agents; Triggering events; Events/alarm Filtering Agents; Plans/Decisions; Model update Agents; Command Agents]

Monitoring the process
Strategic decisions on tactic control
[Figure labels: Monitor, Sensor, Control, Process, Actuator]

The network emphasis
The network glues the agents together
[Figure: six Agents connected through a central Network]

Defect loses all
Majority voting is a centralized consensus scheme
But what we need is:
• Mutual observation between nodes
• Group decision of testing agents
• Implied reconfiguration of the network
How can we facilitate testing with agent properties?

Agent characteristics
What is security?
[Figure: an agent takes input through sensors (mouse, messages, ..., other agents) and acts through effectors; behaviour: messages, move, change appearance, speak]
Independent, Reactive, Proactive, Social

Built-in Block Observation
Testing complex systems requires autonomy
[Figure labels: generator, process, verifier]

Linear Feedback Shift-register
Generation of ordered bit strings by EXORs
When data flows over identical nodes, the typical function can be characterized by the feedback polynomial
$x^6 + x^1 + x^0$

Friedmann model
The aim is for a locally compacted set of patterns
[Figure labels: I, O, Process, Q]

A basic function
Prototypical software on a small PIC controller
• A simple low-pass filter
$z = \frac{1}{N} \sum_{i=0}^{N-1} (c_i \, x_{t-i})$
• Takes a data sampling routine, a multiplying adder and a final function 1/N.

A neuron
Intelligence can be built from filtering
• A simple neuron
$z = f\left( \sum_{i=0}^{N-1} w_i \, x_{ij} \right)$
• Is similar to the low-pass filter except for the incoming data. Operates from the same input data ring-buffer.
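
The low-pass filter and the neuron above share one data path: both form a weighted sum over the same input ring-buffer and differ only in the final function (1/N versus f). A minimal Python sketch of that shared structure follows; it is not the PIC code itself. The names RingBuffer, low_pass and neuron, and the choice of tanh for f, are assumptions made for illustration.

```python
import math
from collections import deque


class RingBuffer:
    """Holds the N most recent samples, newest first (index i is x_{t-i})."""

    def __init__(self, size):
        # A bounded deque drops the oldest sample when a new one is pushed.
        self.samples = deque([0.0] * size, maxlen=size)

    def push(self, x):
        """Data sampling routine: store the newest sample x_t."""
        self.samples.appendleft(x)


def weighted_sum(buffer, weights):
    """Multiplying adder shared by the filter and the neuron."""
    return sum(w * x for w, x in zip(weights, buffer.samples))


def low_pass(buffer, coeffs):
    """z = (1/N) * sum_i c_i * x_{t-i}; the final function is 1/N."""
    return weighted_sum(buffer, coeffs) / len(coeffs)


def neuron(buffer, weights, f=math.tanh):
    """z = f(sum_i w_i * x_i); tanh is an assumed choice for f."""
    return f(weighted_sum(buffer, weights))
```

With a four-sample buffer holding 4, 3, 2, 1 (newest first) and all coefficients set to 1, low_pass returns the moving average 2.5; the neuron reads the same buffer and only swaps the final function.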

A neural network
Where there is one neuron, there can be more
• A feed-forward network
$z = f\left( \sum_{j=0}^{M-1} w_j \, f\left( \sum_{i=0}^{N-1} w_i \, x_{ij} \right) \right)$
• Differs only in the layer-by-layer switching of the I/O-blocks

Non-Linear Feedback SR
Generation of ordered patterns by Correlators
When data flows over identical nodes, the typical function can be characterized by the globally recurrent neural network
[Figure labels: w, x_t]

Neural Observation
Analog correlation looks like digital EXOR
• Analog correlation is about finding the functional similarity
• Digital correlation is the same except for the effect of crisping
• Random access storage is always larger than storage of an ordered function
• The neurally approximated function allows for a dense storage of ordered I/O-pairs

Data-Flow Architecture
Data discrepancy is low-level abnormal behavior
• When data flows over identical nodes, the typical function can be characterized
• Built-In Logic Block Observation (BILBO)
• Built-In Function Block Observation (BIFBO)
• The BIFBO can also be shared with neighboring nodes
• The local test does not differentiate between hardware and software

Question 1
Is there an abstractional test?
• If you cannot test it, it is not worth designing.
• Hierarchical design needs a hierarchical test.
• Abstraction gives a condensed view of reality.
• Abstraction provides for scalability.

Question 2
Is feature interaction really a static problem?
• Interaction is good, conflicts less so
• If resources have a state, access should be bounded by state
• Conflicting services basically pose a scheduling problem
• It’s hard to schedule over an arbitrary network

Question 3
Do neural networks provide for a built-in test?
• Design should be scalable; test is no exception.
• Detection can do without diagnosis; diagnosis cannot go without detection.
• Testing can be based on area (coverage) or on frontier (sensitivity)
• The boundary between software and hardware is still moving
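
The remark that detection can do without diagnosis can be illustrated with the generator / process / verifier scheme of Built-in Block Observation. The Python sketch below rests on stated assumptions: the generator is a 6-bit linear feedback shift-register with the feedback polynomial x^6 + x^1 + x^0, the verifier compacts the responses into a 6-bit signature using the same feedback, and good_process and faulty_process are hypothetical stand-ins for a node function. It shows detection only; the signature says that something is wrong, not where.

```python
from itertools import islice


def lfsr_step(state):
    """One shift for x^6 + x + 1, i.e. a[n+6] = a[n+1] XOR a[n]."""
    feedback = (state ^ (state >> 1)) & 1
    return (state >> 1) | (feedback << 5)


def lfsr6(seed=1):
    """Generator: yields the successive non-zero 6-bit states."""
    state = seed & 0x3F
    if state == 0:
        raise ValueError("the all-zero state never leaves zero")
    while True:
        yield state
        state = lfsr_step(state)


def signature(process, patterns):
    """Verifier: shift the signature, then EXOR in each 6-bit response."""
    sig = 0
    for pattern in patterns:
        sig = lfsr_step(sig) ^ (process(pattern) & 0x3F)
    return sig


def good_process(x):
    """Hypothetical fault-free node function (assumption for the demo)."""
    return (5 * x + 3) & 0x3F


def faulty_process(x):
    """Same function with one wrong response, for input pattern 42."""
    return good_process(x) ^ (1 if x == 42 else 0)


def self_test(process, golden, patterns):
    """Detection without diagnosis: compare against the golden signature."""
    return signature(process, patterns) == golden


# One full generator cycle serves as the test set.
PATTERNS = list(islice(lfsr6(1), 63))
GOLDEN = signature(good_process, PATTERNS)
```

Calling self_test(good_process, GOLDEN, PATTERNS) passes, while the single wrong response of faulty_process changes the signature. Faults that corrupt many responses can in principle alias to the golden signature, which is the usual price of compaction.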