Advisor: Jim French, Dept of Ecology

Team Members:
  Scott Andersen, WSDOT
  Gary Duffield, DIS
  Doug Selix, OFM
  Thelma Smith, WSDOT
  Brian Sylvester, DOP

How can the state achieve a coordinated approach to IT disaster recovery?
  How will we recover critical services and infrastructure, knowing that we share services, platforms, and customers that rely on each other for data during the recovery?
  How do we expose the risks, identify the gaps, and move toward meeting recovery time objectives?
  How do we ensure that the capacity to recover aligns with the risk tolerance of state leadership?

1. Establish and empower a central authority for ‘Enterprise’ (Statewide) D/R Planning
2. Standardize and consolidate IT infrastructure wherever possible to ease D/R Planning
3. Practice D/R Planning at the ‘Enterprise’ (not agency) level
4. Mandate D/R planning for all IT systems
5. Develop and document State guidelines on ‘risk appetite’

Resilience and Recoverability (R/R)
Leadership is about change!

Shared Vision: Changes on the horizon
  Standardization & Consolidation
  System Level R/R Focus
  R/R Designed into All Systems
  Risk Tolerance and Oversight

Senior Level Sponsorship
State Agencies’ Partnership
Strategic and Tactical Leadership
  Strategic = Resilience
  Tactical = Recoverability

Governor
Emergency Management Council
State Agency Liaisons
  DIS, OFM, DOP, DOT, etc.
Comprehensive Emergency Management Plan - CEMP
ISB Standards
State Agencies’ Plans

Existing Catch 22
  Change agency-centric approach to statewide R/R solution
  Establish shared vision for funding R/R
  Integrate R/R into Spending Plans
  Develop policy that cements R/R funding into IT initiatives

Establish Ownership and Oversight
  Align R/R efforts with similar or preexisting efforts
    Emergency management groups
    Agencies’ leadership teams
  Establish new teams or partnerships as needed
  Establish policies for:
    Compliance
    Success Metrics
    Change Management

LEADERSHIP!!!
Proactive = Resilience
Reactive = Recovery
Close Gaps and Remove Roadblocks
Leverage Existing or Empower New Program

Hardware and software consolidation and standardization are becoming the driving force behind organizations evaluating their Disaster Recovery plans. A 2009 survey from Symantec Corporation found that 64% of organizations are creating or reevaluating their DR plans based on a plan to consolidate and standardize their infrastructure.

[Figure: Hosting Service Matrix. Axis labels: "Increase provider mgmt, reduce agency resources" and "Leverage common infrastructure, consolidate hardware, reduce cost". Markers: Maturity Target, Transition Target.]

Adopt a cost-effective enterprise High Availability Architecture solution (Resilience).
Future investments in Infrastructure and Applications should include Resilience and Recoverability.

Planning for Resilience and Recoverability should be at the Enterprise Level.
  Planning for recovery by agency, technology, or individual application is not effective for an enterprise class system.
  Enterprise Level Planning is complex, and must be done for Essential Systems.
  Essential Systems support Essential Agency Functions as defined in agency COOP plans
    Must consider core agency systems, whether run by the agency or a service provider
    Must consider dependencies such as infrastructure and interface services
    Must consider dependent trading partner systems
    Must consider enterprise data at recovery point
    Must include procedures for assuring data integrity at recovery point

OFM Example - The State Payment Process
  Payment Process based upon AFRS and all systems that it connects to
  Historical DR Plan: “DIS will recover the mainframe and all will be good”
  Look at interfaces to partner agencies
  Look at known single points of failure

Enterprise Class Planning requires someone to focus on getting it done for essential systems!
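The dependency considerations listed above (core systems, infrastructure, interface services, trading partners) can be pictured as a dependency graph. The following is a minimal sketch in Python, not part of the original presentation. Only AFRS and the mainframe come from the OFM example; every other component name, and the threshold used to flag shared dependencies, is a hypothetical placeholder.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each system lists what must be recovered first.
# Only "AFRS" and "mainframe" appear in the OFM example; the rest are
# illustrative placeholders.
dependencies = {
    "network": set(),
    "mainframe": {"network"},
    "AFRS": {"mainframe"},
    "partner_interface": {"AFRS", "network"},
    "payment_process": {"AFRS", "partner_interface"},
}


def recovery_order(deps):
    """Return one valid recovery sequence in which prerequisites come first."""
    return list(TopologicalSorter(deps).static_order())


def shared_dependencies(deps, threshold=2):
    """List components that at least `threshold` systems depend on directly.

    These are candidates to review as single points of failure; the
    threshold is an assumption chosen for illustration.
    """
    counts = {}
    for needs in deps.values():
        for name in needs:
            counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, n in counts.items() if n >= threshold)


order = recovery_order(dependencies)
print("Recovery order:", order)
print("Shared dependencies:", shared_dependencies(dependencies))
```

Seen this way, "recover the mainframe" is only one step in the sequence: the payment process is not back until the interfaces it relies on are back as well.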
A single organization must facilitate Enterprise planning
Enterprise system owner and Stakeholders must fully participate in development and testing of R/R Plans
Enterprise Planning is HARD!
Enterprise Class Systems are COMPLEX!
Someone Needs to GET ‘er DONE!

Many, if not most, recent IT systems were developed without Disaster Recovery - Why?
  Elimination viewed as a ‘Cost Reduction’ strategy.
  This is a ‘false economy’ - a calculated risk
  Real consequences to State citizens:
    Missing vital systems after a disaster
    Or
    Spend too much to ensure their availability

Creation of WSRRO
  Mandate all new IT systems include R/R
  Review and approve
  Criteria
    Agency impact analysis
    Integration impact analysis
  Validate appropriateness of plan
  Types of ‘valid’ plans:
    ‘Resilience’
    ‘Warm site’
    ‘Cold site’
    Data protection only
    No recovery plan

[Figure: Time, Cost, and Assurance compared across Resilience, Recovery (Warm), Recovery (Cold), and Data Protection Only.]

Mandate R/R planning for all IT systems
  Scope for critical functions only
  Ensure ‘Enterprise’ context

If your house was on fire, what would you save?
  We all live in the same house, so we need to decide what is going to be saved! And how much!
  We won’t be able to save it all. Be careful what you choose!

What is important to the WA State Enterprise?
  Public Safety (EMD/WSP/DOC/Roads/others?)
  Citizen Systems - Licensing, Social Systems, others?
  Financial Systems - How we dispense and receive funds.
  H/R Systems, Data Centers?
  State Enterprise Approach!

How much and what loss is acceptable?
  Data?
  E-mail?
  File Systems?
  Hardware/infrastructure
  Networks, communications?
  Applications used by Citizens?
  Applications used by Agencies?

What does this look like?
How do we determine what and how much?
Identify and Develop a Risk Matrix!

Now we know what. How do we really know it will work?
  What are our expectations for Disaster Recovery?
  How do we ensure that RECOVERY WILL work?

LEADERSHIP!
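As one possible illustration of the risk matrix called for above, the sketch below scores a system by likelihood and impact and maps the score to one of the plan types named in the slides. It is not from the original presentation: the 1-to-5 rating scale, the score cut-offs, and the example systems are all assumptions, and real values would have to come from the State's own 'risk appetite' guidelines.

```python
# Plan types as named in the slides, from most to least assurance.
PLAN_TYPES = [
    "Resilience",
    "Warm site",
    "Cold site",
    "Data protection only",
    "No recovery plan",
]


def risk_score(likelihood, impact):
    """Risk-matrix cell value: likelihood x impact, each rated 1 (low) to 5 (high)."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("ratings must be between 1 and 5")
    return likelihood * impact


def suggested_plan(likelihood, impact):
    """Map a risk score to a plan type.

    The cut-offs below are illustrative assumptions, not State policy.
    """
    score = risk_score(likelihood, impact)
    if score >= 20:
        return "Resilience"
    if score >= 12:
        return "Warm site"
    if score >= 6:
        return "Cold site"
    if score >= 3:
        return "Data protection only"
    return "No recovery plan"


# Hypothetical systems with placeholder (likelihood, impact) ratings.
examples = {"System A": (5, 5), "System B": (3, 4), "System C": (1, 1)}
for name, (likelihood, impact) in examples.items():
    print(name, "->", suggested_plan(likelihood, impact))
```

Writing the cut-offs down in one place is the point: it turns "how much loss is acceptable?" into a decision that leadership can review and change, instead of one made system by system.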
Identify and apply standardized comprehensive testing (Know what and how much to test, and test it the same way across the board!)
Perform Resilience and Recoverability Plans
Review Results and apply Process Improvement! (Do it better next time!)

Target Enterprise (State Level) Programs/Systems, NOT silo agencies
Identify how much of it we really need! RISK MATRIX!
Standardized Comprehensive Testing applied
Regularly perform Resilience and Recoverability Testing
Process Improvement

1. Establish and empower a central authority for ‘Enterprise’ (Statewide) R/R Planning
2. Standardize and consolidate IT infrastructure wherever possible to ease R/R Planning
3. Practice R/R Planning at the ‘Enterprise’ (not agency) level
4. Mandate R/R planning for all new IT systems
5. Develop and document State guidelines on ‘risk appetite’

Thank you!
  Scott Andersen, WSDOT
  Gary Duffield, DIS
  Doug Selix, OFM
  Thelma Smith, WSDOT
  Brian Sylvester, DOP