GridStat Middleware for More Extensible and Resilient Status Dissemination for the Electric Power Grid
Faculty: Dave Bakken, Carl Hauser, Anjan Bose
Students: H. Gjermundrød, I. Dionysiou, R. Johnston, P. Jiang, S. Sheshadri, K. Swenson
School of Electrical Engineering and Computer Science
Washington State University
Pullman, Washington USA
http://gridstat.eecs.wsu.edu
November, 2003

Overview of Presentation
• Background and Motivation
• GridStat Architecture
• GridStat Implementation and Deployment Issues

Copyright © 2003 Washington State University Dave Bakken GridStat–2

Power Grid Today
Overview
• 3 fundamental roles
• Historically one vertically integrated utility
• IT/control based on this fixed hierarchy
Hierarchy
• Substation
• Control Area/utility
• Grid
[Figure labels: Generation; Transmission; Transmission Substation; Subtransmission Substations; Distribution Substations; Distribution; Customers (create load): Residential, Commercial, Industrial. Figure credit: NSTAC]

Protection and Control Today are Local
• Remedial Action Schemes (RAS): hardwired remote link to trigger a protective relay
• Otherwise almost exclusively local monitoring (status) & local control
  – Power dynamics are grid wide, and anomalies can affect a wide geographic area
[Figure labels: Utility A, Utility B, Utility C, each with Generation, Distribution, and Customers]

The Changing Landscape
• Higher demand for power transmission
  – miles x megawatts
  – More power and longer distances with little new transmission capacity
• Installed transmission capacity constrained by minimum of
  – Thermal limit
  – Stability limit
• More participants whose actions affect grid stability
• Technology in recent years is adding
  – Many more “intelligent” devices
  – Much more heterogeneity
• Lack of central authority
• Terrorist concerns
  – Labor disputes, Environmental
Liberation Front, everybody is an insider (or can be with a little effort), …

Power Companies Are More Interrelated
• Traditional generation + transmission + distribution is no longer inside a single utility
  – Utilities can be affected by many things they cannot sense/detect/measure with today’s communication infrastructure
  – Interactions between power dynamics and grid communication dynamics completely unknown
[Figure labels: Utility A, Utility B, Utility C; Generation, Distribution, Customer]

ISO and Grid Security
• Independent System Operator (ISO): layer above the control area layer added in the last few years
  – A small number of ISOs for bigger grids
  – Real-time balancing of supply and demand
• ISO is responsible for grid security
  – Means no actions being considered, or any probable contingency, can lead to a blackout or brownout
  – Roughly translates to what computer scientists would consider stability and reliability
• Grid security is an on-line, real-time activity
  – ISO monitors status from all control areas
  – Receives status info from control areas and substations in its jurisdiction
• ISO’s functionality previously performed by the vertically-integrated utilities
  – Now too much power flowing across and around them

Status Information & the Power Grid
• Changing requirements
  – More general topology and connectivity including multicast
  – Existing hardwired, hierarchical structure does not suffice
• Status items may be needed at multiple locations
  – New services require more quantity, timeliness, …
• Improved real-time controls to push stability limits nearer thermal limits
• Situation awareness: phone calls not adequate!
• 4-second SCADA cycle moving to ½ to 4 times per 60 Hz cycle
• Opportunities for new kinds of automatic real-time control
  – Local control of devices based on remotely sensed status
  – Closed loop controls with relatively long delays
  – Stochastic control

GridStat in a Nutshell
• SCADA is a distributed computing problem
  – Convey status data in a reliable, timely and secure manner (QoS)
  – Above the network layer
• Implementing these QoS properties requires considerable sophistication
• Exploit application-level semantics and QoS requirements
• Want a re-usable architecture
  – Compare to Web:TCP as SDM:network layer (SDM = status dissemination middleware)
• GridStat: status dissemination middleware tailored for the power grid
  – Common service platform for disseminating power grid status information within and between power utilities, marketers, etc.
  – Timely, robust, and secure delivery to multiple participants
  – Collaborative project with CS and EE at WSU, others
  – Also applicable to status dissemination needs of other infrastructures: transportation, water, gas, …

GridStat is Publish-Subscribe Middleware
• Publish-subscribe architecture
  – Publish: periodically announce status values
  – Subscribe: periodically receive status values
  – Simple, CORBA-compliant APIs for both publishers and subscribers, management/control infrastructure, etc.
  – Subscribers have transparent cache of latest status value
  – Network of internal servers managed for QoS
  – Optimized for semantics of status items
    • Not just arbitrary event delivery like generic publish-subscribe
[Figure labels: Publisher; Subscriber #1, Subscriber #2, …, Subscriber #N]

What is Middleware?
• Middleware == A layer of software above the operating system but below the application program that provides common programming abstractions for distributed systems
• Middleware exists to help manage the complexity and heterogeneity inherent in distributed systems
• Middleware provides higher-level building blocks for programmers than the OS provides
  – Makes code more portable
  – Makes programmers more productive
  – Final product is of higher quality
  – Analogy: MW:sockets ≈ HOL:assembler

GridStat Middleware (MW) in Context
[Figure: Host 1 and Host 2, each with a distributed application (client) above the middleware API, middleware, operating system API, and OS (comm., CPU, storage); MW routers in a wide-area network sit between the hosts, with control links from MW QoS management]

Fundamental GridStat Research Issues
• Status dissemination middleware is a new specialization of publish-subscribe MW
  – Recognize specialized requirements of status dissemination
  – Take advantage of status semantics in order to meet those requirements
  – What are the APIs? What promises are made by the middleware to the application regarding functionality and performance?
  – What architecture can deliver on these promises?
  – How can we validate the correctness, timeliness, quality, etc., of a concrete embodiment of the architecture (the framework)?
  – What are the trust issues between grid participants? What policies are required? How can they be implemented?
  – How can the architecture be made economically scalable and manageable?
• Goal of GridStat research is to begin answering some of these questions and embody those answers in the GridStat middleware framework

Overview of Presentation
• Preliminary Information
• GridStat Architecture
• GridStat Implementation and Deployment Issues

Publication and Subscription
• Status variable: located at a publisher
  – Periodic sequence of time-stamped values or
  – Sporadic sequence of time-stamped alerts
  – Types supported (initial, illustrative set):
    • Basic values: scalars of type bool or int or float
    • Derived values: moving average, change rate, moving average of change rate, max or min over an interval
  – Derived values are first-class: subscribe to them just as to a basic value
• Subscription: requested by a subscriber
  – A promise (by the MW) to deliver values from a particular status variable at a given rate within the requested delay (timeliness)
  – Subscriber’s rate must not exceed publisher’s rate
  – Delay is constrained by the network
[Figure: example status variables shown as time-stamped sequences: 3.75, 3.79, 3.76 and 27.93, 27.56, 24.33 at time3, time2, time1; an alert at time1]

Basic GridStat Functionality
[Figure labels: GridStat Management; QoS Requirements; Control; Wide Area Computer Network; Publishers; Subscribers; Area Controller; Generator; Load Following; Grid Area Controller; ISO; Department of Homeland Security]

Overview of GridStat’s … Architecture
[Figure labels: QoS Brokers; QoS Requirements; Control; Pub1 … PubN; routers (R); Sub1 … SubN]
GridStat delivers status events from publishers to subscribers

Detailed Architecture
[Figure labels: QoS brokers 1, 2, and 3; leaf QoS brokers 4, 5, and 6; status routers A through S; Publishers 1, 2, and 3. Key: Border Status Router, (Edge) Status Router, Status Router]

Route Allocation to Subscriber1
[Figure: the detailed architecture with Subscriber 1 added]

Route Allocation to Subscriber2
[Figure: the detailed architecture with Subscriber 1 and Subscriber 2 added]
Note: Sub2 may have a different rate or latency than Sub1

GridStat & UCA v2 & SCADA Co-Existence
[Figure labels: Homeland Defense; ISO; ISO; Utility #1 Control Center; Utility #2 Control Center; UCAv2 Substations; SCADA Substations; Other Critical Infrastructures. Key: GridStat Status Data (phone calls today); GridStat QoS Control (later status aggregation)]
Note: UCA v2 has no wide-area network mgmt, though it discusses exploiting “future communications services”

Programming Model: GridStat Subscriber Caches
[Figure labels: Status Routers; Status1, Status2, Status3; Subscriber1 with cache1; Subscriber2 with cache2]

Programming Model: Condensation Functions
[Figure labels: Status1, …, StatusJ, StatusN; Status Router; Condense]
• If desired derivation is not built in, condensation functions allow applications to define new derived status variables
  – Sometimes subscribers just read a large
set of status items once to calculate a derived variable
  – Supported by allowing user-defined condensation functions to be loaded in status routers

Overview of Presentation
• Preliminary Information
• GridStat Architecture
• GridStat Implementation and Deployment Issues

Pragmatic Deployment Feasibility Notes
• GridStat not completely deployable using best-effort internet technology
  – But much of the traffic might be best effort, especially with bandwidth reservation
• Most likely deployment path: grid-wide intranet
  – Washington Post: “In a book-length Electricity Infrastructure Security Assessment, the industry concluded on Jan. 7 that "it may not be possible to provide sufficient security when using the Internet for power system control." Power companies, it said, will probably have to build a parallel private network for themselves.” (emphasis mine)
    • GridStat team opinion: not only would security be lacking, but predictable timeliness and resilience too! The Internet is arguably as complex as the grid!
  – Some portions likely co-located with telcos’ and national ISPs’ facilities; others built using private facilities of electric utilities
  – In addition to short-term application, GridStat should be viewed as a platform for exploring what services to provide in status dissemination middleware

GridStat Capabilities Today
• Static routing of status variables to meet subscriber’s timeliness and redundancy requirements
• Recovery from data link and management link failure
• Hierarchical QoS brokers
• Graphic visualization of status items (strip charts, etc.) and of the internals of leaf QoS brokers (queues, etc.)
• Note: GridStat could deliver remote control commands in addition to status data
  – Just another kind of data to deliver….
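
The static routing capability listed above (routes that meet a subscriber's timeliness and redundancy requirements) can be illustrated with a small sketch. The slides do not specify GridStat's allocation algorithm, so this is only an assumed greedy heuristic: take the lowest-delay path, remove its links, and repeat until the requested number of link-disjoint paths is found. All names (`lowest_delay_path`, `allocate_routes`, the link table format) are hypothetical, and link and router capacity constraints are ignored.

```python
import heapq


def lowest_delay_path(links, src, dst):
    """Dijkstra over a dict {(u, v): delay_ms} of directed links.

    Returns (total_delay, [nodes]) or None if dst is unreachable.
    """
    adjacency = {}
    for (u, v), delay in links.items():
        adjacency.setdefault(u, []).append((v, delay))
    best = {src: 0.0}
    queue = [(0.0, src, [src])]
    while queue:
        delay, node, path = heapq.heappop(queue)
        if node == dst:
            return delay, path
        if delay > best.get(node, float("inf")):
            continue  # stale queue entry, a better path was already found
        for nxt, link_delay in adjacency.get(node, []):
            new_delay = delay + link_delay
            if new_delay < best.get(nxt, float("inf")):
                best[nxt] = new_delay
                heapq.heappush(queue, (new_delay, nxt, path + [nxt]))
    return None


def allocate_routes(links, publisher, subscriber, max_delay_ms, redundancy):
    """Assumed greedy heuristic, not GridStat's actual algorithm.

    Returns a list of (delay, path) with `redundancy` link-disjoint paths,
    or None if the heuristic cannot meet the delay bound that many times.
    """
    remaining = dict(links)
    routes = []
    for _ in range(redundancy):
        found = lowest_delay_path(remaining, publisher, subscriber)
        if found is None or found[0] > max_delay_ms:
            return None
        delay, path = found
        routes.append((delay, path))
        # Remove the links just used so the next path shares none of them.
        for u, v in zip(path, path[1:]):
            del remaining[(u, v)]
    return routes
```

Because the heuristic is greedy, it can fail to find a set of disjoint paths even when one exists; a rejected request here means only that this particular heuristic found no allocation.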

GridStat Prototype
• Finished: 2nd-Generation Distributed Prototype with
  – Hierarchy of QoS Managers performing the allocations
  – Publisher delivery rate & redundancy QoS requirements satisfied
  – Optional exception callback to subscriber if QoS is violated

Future GridStat Capabilities (Funding Pending)
• Fault Tolerance at many more levels
• Broader QoS Routing with runtime feedback
• Trust management system to allow secure runtime subscriptions
• Pre-allocated subscription “packages” for rapid deployment in contingencies
• Validation framework (SW quality, QoS delivery)
• Hardware support
• Modeling and control theory for communication dynamics and power dynamics interacting

Ongoing & Future Research Issues
• Investigating a range of optimizations
  – Periodic status items only delivered if enough change
    • percentage
    • fixed delta
  – Throttle back lower priority status flows when overload, attack, accidents, etc. using subscription info (max timeliness, min redundancy)
  – Subscription aggregation of different kinds of flows and sub-flows
• Resilience
  – Subscriber cache extrapolation
  – Adaptive path management
• Push the data path into hardware or embedded processors
  – 10% of code perhaps
  – Status routers, HW registers for publishers and subscribers

Collaborators, Funding, and Colleagues
• Faculty: David E. Bakken, Carl Hauser, Anjan Bose
• Students: Ioanna Dionysiou, Kjell “Harald” Gjermundrød, Thomas Evje, Ryan Johnston, Supreeth Sheshadri, Ping Jiang
• Funding:
  – US Dept. of Commerce, National Institute of Standards and Technology (NIST), Critical Infrastructure Protection Program, Grant #60NANB1D0116 (Dr. Tim Grance, PM)
  – The National Science Foundation, Grant CCR-0326006 (Dr. Helen Gill, PM)
  – Pending: DHS, soon NSF ITR (Feb 03)
• Collaborators:
  – Prof. Kevin Tomsovic, WSU, realtime grid control with varying feedback loops and varying time horizons
  – Prof. Sandip Roy, WSU, control theory with stochastic delays
  – Prof. Deborah Frincke, U of Idaho, security and trust
  – Other WSU professors with interests in temporal queries, hardware implementation, graph theory, software engineering
  – CMU/CERT: Easel Simulation System and GridStat; IT modeling for power grids, …

GridStat & Avista
• Now working with Avista Utilities to experiment with distribution status data dissemination
  – Utility for WSU’s area, with presence in 5 western states
• Technology demonstration deployment underway
• Avista has donated $2.4M in dark fiber to support GridStat and similar research and distributed evaluation at WSU and U. Idaho, around
  – Spokane area
  – Pullman (WSU)
  – Moscow, ID (U. Idaho; 8 miles from WSU)
• Funds for “access points” and Avista engineering labor also provided

Related Work
• Computer Science (networking, distributed computing):
  – PASS (BBN/Gatech ICDCS ’99 Zinky/O’Brien/Bakken/…)
  – Siena (U. Colorado): content-based publish-subscribe
  – InfoPipes (GaTech): fresh delivery of status info
  – SpinGlass/Astrolabe (Cornell): scalable multicast
• Electrical Engineering (power)
  – A few research papers pointing out the wide-area communication deficiencies
  – UCA version 2: nice “wrapping” of substation devices, but no QoS management across WANs.

Conclusions
• Existing power grid SCADA/DCS infrastructure is not adequate
  – Deregulation and restructuring
  – Efficient use of transmission resources
• GridStat: status dissemination middleware tailored for the power grid
  – Publish-subscribe architecture with simple, CORBA-compliant APIs for both publishers and subscribers
  – Subscribers have transparent cache of latest status value
  – Network of internal servers managed for QoS
    • Timeliness
    • Redundancy
    • Security

Questions?

Background Slides
• Power Grid 101
• Washington Post quotations on power grid cyber attacks
• Middleware 101
• More GridStat Details

Context
“The ultimate challenge in creating the power delivery system of the 21st century is in the development of a communications infrastructure that allows for universal connectivity.”
“In order to create this new power delivery system, what is needed is a national electricity-communications superhighway that links generation, transmission, substations, consumers, and distribution and delivery controllers.”
Clark Gellings, EPRI Vice President for Power Delivery and Markets, in “Smart Power Delivery: A Vision for the Future,” EPRI Journal Online, Electric Power Research Institute, June 9, 2003, http://www.epri.com/journal/details.asp?doctype=features&id=618.
GridStat is researching, implementing, and evaluating this “national electricity-communications superhighway”, and not just for the power grid but for other critical infrastructures as well.

Power Grid Today
• Three fundamental roles in the power grid:
  1. Generation
  2. Transmission
  3. Distribution
• Traditionally owned by a single, vertically-integrated company
  – Based largely on geography
  – Hierarchical infrastructure
  – Communications network is
    • Hardwired
    • Dedicated
    • Slow
• Everything is hard-coded based on this fixed hierarchy
  – Application programs
  – Status information
  – Control decisions

Components of the Power Grid
• Generator: generates power, based on requirements given it
• Substation: point of monitoring and control in the grid
  – Can service many generators, and/or other functions
    • Distribution point to customers
    • Voltage boosting
    • Control functions
  – Generally only services one fundamental role
  – Always involved in control based on status of a lot of devices
• Control area: a set of substations
  – Geographic area ranging from a county to a few states
  – Services all three fundamental roles
  – Roughly corresponds to one or a few utility companies (most 1:1)
  – Collects status info from all substations for control decisions
• Grid: a set of control areas which are synchronously controlled
  – AKA “regional reliability council” or “region”

Grids in Canada and the US

ISO and Grid Security
• Independent System Operator (ISO): new layer above the control area layer currently being added
  – A small number of ISOs for bigger grids
• ISO is responsible for grid security
  – Means no actions being considered, or any probable contingency, can lead to a blackout or brownout
  – Roughly translates to what computer scientists would consider stability and reliability
• Grid security is an online, real-time activity
  – ISO monitors status from all control areas
  – Receives all status info from any control area or substation in its jurisdiction
• ISO’s functionality used to be performed by the vertically-integrated utilities
  – Now too much power flowing
across them or around them

Pragmatic GridStat Deployment Feasibility Notes
• Allocation algorithms & frequency of subscriptions
  – In practice nearly all are likely to be pre-allocated and static
  – Number of new subscriptions (allocation algorithm runs) per hour small
  – Could be batched for offline (weekend/night) computation unless critical
  – Even brute-force solutions to NP-hard problems may be practical in many cases

Background Slides
• Power Grid 101
• Washington Post quotations on power grid cyber attacks
• Middleware 101
• More GridStat Details

Background Slides: Washington Post quotations on cyber attacks
“Cyber-Attacks by Al Qaeda Feared” (27 Jun 02, A01)
• Note: emphasis mine and [comments mine] in all cases….
• “The event I fear most is a physical attack in conjunction with a successful cyber-attack on the responders' 911 system or on the power grid,” Ronald Dick, director of the FBI's National Infrastructure Protection Center

Other Related Quotations from Post Article
• The devices are called distributed control systems, or DCS, and supervisory control and data acquisition, or SCADA, systems. …. What is new and dangerous is that most of these devices are now being connected to the Internet -- some of them, according to classified “Red Team” intrusion exercises, in ways that their owners do not suspect. Because the digital controls were not designed with public access in mind, they typically lack even rudimentary security, having fewer safeguards than the purchase of flowers online.
Much of the technical information required to penetrate these systems is widely discussed in the public forums of the affected industries, and specialists said the security flaws are well known to potential attackers.

Post Quotations (cont.)
• Digital controls are so pervasive, he said, that terrorists might use them to cause damage on a scale that otherwise would “not be available except through a very systematic and comprehensive physical attack.” [He is Director John Tritak of the Commerce Department's Critical Infrastructure Assurance Office]
• To destroy a dam physically would require “tons of explosives,” Assistant Attorney General Michael Chertoff said a year ago. To breach it from cyberspace is not out of the question. In 1998, a 12-year-old hacker, exploring on a lark, broke into the computer system that runs Arizona's Roosevelt Dam. He did not know or care, but federal authorities said he had complete command of the SCADA system controlling the dam's massive floodgates.

Post Quotations (cont.)
• Massoud Amin, a mathematician directing new security efforts in the industry, described the North American power grid as “the most complex machine ever built.” At an April 2 conference hosted by the Commerce Department, participants said, government and industry scientists agreed that they have no idea how the grid would respond to a cyber-attack. What they do know is that "Red Teams" of mock intruders from the Energy Department's four national laboratories have devised what one government document listed as "eight scenarios for SCADA attack on an electrical power grid" -- and all of them work. Eighteen such exercises have been conducted to date against large regional utilities, and Richard A.
Clarke, Bush's cyber-security adviser, said the intruders “have always, always succeeded.”

Background Slides
• Power Grid 101
• Washington Post quotations on power grid cyber attacks
• Middleware 101
• More GridStat Details

Why Middleware?
• Middleware == “A layer of software above the operating system but below the application program that provides a common programming abstraction across a distributed system”
• Middleware exists to help manage the complexity and heterogeneity inherent in distributed systems
• Middleware provides higher-level building blocks (“abstractions”) for programmers than the OS provides
  – Can make code much more portable
  – Can make programmers much more productive
  – Can make the resulting code have fewer errors
  – Analogy: MW:sockets ≈ HOL:assembler
• Middleware sometimes is informally called “plumbing”
  – Connects parts of a distributed application with “data pipes” and passes data between them

Middleware in Context
[Figure: Host 1 and Host 2, each with a distributed application (client on one host, server on the other) above the middleware API, middleware, operating system API, and OS (comm., processing, storage), connected by a network]

Middleware Benefit: Masking Heterogeneity
• Middleware’s programming building blocks mask heterogeneity
  – Makes programmer’s life much easier!!
• Kinds of heterogeneity masked by middleware (MW) frameworks
  – All MW masks heterogeneity in network technology
  – All MW masks heterogeneity in host CPU
  – Almost all MW masks heterogeneity in operating system (or family thereof)
    • Notable exception: Microsoft middleware (de facto; not de jure or de fiat)
  – Almost all MW masks heterogeneity in programming language
    • Notable exception: Java RMI
  – Some MW masks heterogeneity in vendor implementations
    • CORBA best here

Background Slides
• Power Grid 101
• Washington Post quotations on power grid cyber attacks
• Middleware 101
• More GridStat Details

Path Determination to Provide Fault Tolerance and Timeliness
• Initial QoS Specification for Subscriptions
  – Desired latency
  – Number of redundant paths desired
• Timeliness and fault tolerance
  – Choose multiple, disjoint paths that meet the requested delay constraint while respecting capacity constraints of links and routers
  – Many variants of multi-constrained QoS routing are NP-hard (of course!), including most of the ones that we think are relevant for GridStat
  – Choose paths using heuristics; if a path (or set of paths) is found, the system should guarantee its performance

GridStat Mapping Capabilities (cont.)
Note: there are no GridStat nodes deployed as above; for illustration purposes only

GridStat Status Patterns
• Pragmatic goal: give building blocks that non-distributed-systems-specialists can effectively use in grid monitoring and control applications
  – Try to capture status semantics + some QoS info
• Initial derived value examples
  – Periodic: Bandwidth use downstream limited by downstream subscriptions. Can be boolean, floating point, integer.
  – Alert: Potentially catastrophic situation
    • Propagate to subscribers immediately
    • Deliver using callback/interrupt (not just cache update)
• Note: it is very useful to distinguish between anomaly domains: is it a power grid problem or an IT infrastructure problem?
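
The difference between the two status patterns above can be sketched from the subscriber's side. This is a minimal illustration under stated assumptions, not GridStat's CORBA API: the class `SubscriberCache` and its methods (`deliver_periodic`, `deliver_alert`, `on_alert`, `read`) are hypothetical names. A periodic update only refreshes the cached latest value, while an alert also invokes registered callbacks immediately.

```python
import time


class SubscriberCache:
    """Hypothetical subscriber-side cache of the latest status values."""

    def __init__(self):
        self._latest = {}          # variable name -> (timestamp, value)
        self._alert_handlers = {}  # variable name -> list of callables

    def on_alert(self, name, handler):
        """Register handler(name, value, timestamp) for alerts on `name`."""
        self._alert_handlers.setdefault(name, []).append(handler)

    def deliver_periodic(self, name, value, timestamp=None):
        """Periodic pattern: only the cache entry is updated."""
        ts = time.time() if timestamp is None else timestamp
        self._latest[name] = (ts, value)

    def deliver_alert(self, name, value, timestamp=None):
        """Alert pattern: update the cache, then call handlers right away."""
        ts = time.time() if timestamp is None else timestamp
        self._latest[name] = (ts, value)
        for handler in self._alert_handlers.get(name, []):
            handler(name, value, ts)

    def read(self, name):
        """Return (timestamp, value) of the most recent delivery."""
        return self._latest[name]
```

In this sketch the application polls `read` for periodic variables at its own pace and relies on the callback only for alerts, which mirrors the "callback/interrupt (not just cache update)" distinction on the slide.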