Let’s Get Real: Disaster Recovery and Business Continuity in Public Safety
Is Yours Just a Paper Plan or a Real Way to Prepare and Respond to Incidents and Disasters?

Presentation Overview
• Key DR/BC Concepts and Issues
  – Report card and dashboard
  – Scenarios
  – Requirements: What has to be operational by when for work to be done by how many at what locations serving what customers who are where?
  – Facilities
  – People
  – Systems
  – Integration
  – Coordination
  – Daily readiness and simulated escalations
  – Testing and independent verification and validation
  – Implementation and triage
  – Recovery, discovery, and improvements
• Player Scorecard: Who Is In the Game and Why?
• DR/BC Framework
• Action Steps to a Real Plan
  – First steps
  – Critical functions
  – Funding and leveraging scarce resources
  – Think out of the box
  – Integration with the big picture DR/BC plan and activities of your jurisdiction
• Conclusions

Key DR/BC Concepts and Issues

The Report Card and Dashboard
• All aspects of the plan, test, and implementation should be scored simply (Red, Yellow, and Green)
• Key indicators of planning and readiness need a dashboard to enable assessment and action
  – Score or status
  – Trend
  – Key issue

Public Safety Scenarios
• Public safety entities have a more difficult challenge
• Your IT DR/BC plan is intertwined with risk scenarios
• You may be affected by the risks of a given scenario, and your IT plan must address those risks appropriately to maintain operations
• You also have a role in response to the scenario, so the events will affect your operational requirements

Scenarios Overview
• Threat-driven geographic circles of impact
• Kinds of threats and events
• Responsibility
  – What will you do, what is shared, what do others have to do for themselves
• Tolerance for risk and uncertainty
• Lesson learned: if you have a well-known and documented local risk:
  – Have a real plan or get ready for a career change…
Source: IBM

Scenarios
• Identify Possible and Likely Natural
Disasters and Environmental Conditions By Kind and Duration of Effects
  – Tornado
  – Hurricane
  – Tsunami
  – Flood
  – Snowstorm
  – Drought
  – Earthquake
  – Electrical storms
  – Fire
  – Subsidence and landslides
  – Freezing Conditions
  – Contamination, Toxic releases and environmental hazards
  – Epidemic
  – Pandemic
  – Animal or crop disease outbreak
• Organized and/or Deliberate Disruption
  – Act of terrorism
    • WMD
      – Acute and short lived (bomb)
      – Acute and long lived (dirty bomb)
      – Chronic
        » Long term (contaminants and biohazards)
        » Permanent (radioactivity, etc.)
    • WLD (suicide bombers, car bombs, utility sabotage)
    • Bioterrorism or genetically modified or inorganic organisms
      – Direct contact
      – Infectious
        » Contact
        » Airborne
  – Act of Sabotage
  – Product or food tampering
  – Act of war
  – Theft
  – Arson
  – Labor Disputes / Industrial Action
• Loss of Utilities and Services
  – Electrical power failure
  – Loss of gas supply
  – Loss of water supply
  – Petroleum and oil shortage
    • Raw materials
    • Refined materials
  – Communications services breakdown
  – Loss of drainage / waste removal and trash pickup
• Equipment or System Failure
  – Internal power failure
  – HVAC failure
  – Equipment failure (excluding IT hardware)
• Serious Information Security Incidents
  – Cyber crime
  – Malware
  – Zombie attacks
  – Denial of service
  – Loss or alteration of records or data
  – Disclosure of sensitive information
• IT system failure (local or hosted)
  – Hardware
  – Software
    • Commercial application
    • Locally developed application
  – Data
  – Communications
• Other Emergency Situations
  – Workplace violence
  – Public transportation disruption
  – Neighborhood hazard
  – Health and
safety issues
• Multiple and compound hazards and events
  – Purposeful
  – Coincidental
  – Causally connected
  – Interrelated

IT Requirements
• What systems need to function
• How fast
  – Maximum and optimum time frame for each system or function to be restored
• How well
  – Sometimes minimal functionality is sufficient
• Where will it be used and by whom, and will the communications infrastructure support it?
  – Employees
  – Users or beneficiaries
• By what priority will systems be restored
• The priority will be modified by what contingencies
  – E.g. a long-term total evacuation changes the operational needs for criminal justice systems and personnel

Facilities
• Hot, warm, cold
• Mirrored, recoverable, reload-able
• Properly located
• EOC
• Non-EOC
• Operational
• IT facilities
• For user interaction with IT systems
• New kinds of mutual aid and sister city/county/state arrangements
  – Work with friends, colleagues, associations, and vendors
  – To match you with comparable entities that are located outside the various geographic threat circles
  – Who can mirror your IT operations (hardware, software, operating systems, and culture)

People
• The right numbers, skills, location, redundancy, etc.
  – Skills and abilities inventory
• Employees
• Contractors
• Vendors
• Mutual aid and “the cavalry”
• Force in depth: who is the backup to the backup to the backup?
• Consider the actual health and physical abilities and disabilities of a person when assigning tasks for a disaster scenario
  – The disaster is not the time to find out the electrician in the hazmat suit has a heart condition
• What family and personal duties may interfere with performing official duties (e.g. save your own kids or save a stranger)?
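
Worked example: restoration priority
The IT Requirements questions (what must function, how fast, by what priority, and how contingencies change that priority) can be written down as data and sorted. What follows is only a minimal sketch under stated assumptions: the system names, hour values, priority numbers, and the override mechanism are all hypothetical illustrations, not figures or methods from this presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemRequirement:
    """One row of a hypothetical IT requirements inventory."""
    name: str
    max_hours: float      # maximum acceptable time to restore
    optimum_hours: float  # preferred time to restore
    minimal_ok: bool      # True if minimal functionality is sufficient at first
    priority: int         # baseline restore priority, 1 = restore first


def restoration_order(requirements, overrides=None):
    """Return system names in the order they should be restored.

    `overrides` maps a system name to a replacement priority for the
    contingency in effect (hypothetical mechanism). Ties are broken by the
    tighter maximum restore time, then by name.
    """
    overrides = overrides or {}

    def sort_key(req):
        return (overrides.get(req.name, req.priority), req.max_hours, req.name)

    return [req.name for req in sorted(requirements, key=sort_key)]


# Hypothetical inventory; every value here is an assumption for illustration.
EXAMPLE_REQUIREMENTS = [
    SystemRequirement("records", 24.0, 8.0, True, 2),
    SystemRequirement("court_case_management", 72.0, 24.0, False, 3),
    SystemRequirement("dispatch", 1.0, 0.25, True, 1),
]

if __name__ == "__main__":
    print("Baseline:", restoration_order(EXAMPLE_REQUIREMENTS))
    # A contingency may reorder things; this override value is made up.
    print("Contingency:", restoration_order(
        EXAMPLE_REQUIREMENTS, {"court_case_management": 1}))
```

Keeping the contingency overrides separate from the baseline inventory means the everyday priorities stay stable, and each scenario only records what it changes.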
Systems
• Daily operational
• Interdependent systems
• Emergency only
• Identity security and access management for physical and logical security
  – Follow FIPS 201 for federal/state/local interoperability

Integration
• With whom should you work closely?
• Identify integration issues between:
  – Internal systems and public safety entities
  – Other governmental systems
  – Related actors
  – Non-governmental systems and processes
• Example: 911 and 311 or its equivalent
  – Normally separate but related
  – Emergencies blur the line
  – Co-location, cross training, and system integration

Coordination
• Within organization
• Within unit of government
• Across units of government
• Across levels of government
• Across public and private boundaries

Daily Readiness and Simulated Escalations
• A disaster a day (“What, that’s not normal?”)
• Realistic scenarios
• Captured lessons
• Learning and actually responding to lessons learned within risk framework
• A quality and security framework for daily operations has substantial overlap with DR/BC

Security Capabilities Models
Like similar capability models from the Carnegie Mellon SEI, the SCMM model brings benefits:
  – Helps close security holes
  – Serves as a foundation for growth
  – Guides security leadership
  – Is evolutionary, not chaotic
  – Supports point solutions

[Figure: KPMG SCMM Model, running from Causes to Effects. Column labels: Strategy, Management, Knowledge, Technologies, Support. Elements shown: Security Leadership (Security Sponsorship, Security Strategy); Security Program (Security Program Structure, Security Program Resources and Skillsets); Security Policies (Security Policies, Standards and Guidelines); Security Management (Security Administration, Security Monitoring); User Management (User Management, User Awareness); Information Asset Security (Application Security, Database / Information Security, Host Security, Internal Network Security, Network Perimeter Security); Technology Protection and Continuity (Physical and Environment Controls, Contingency Planning Controls)]

Capability Maturity
Like the SEI CMM
models, the KPMG Security Capability Model has five levels of maturity:
• Optimizing (5): Continuously improving process
• Managed (4): Predictable process
• Defined (3): Standard, consistent process
• Repeatable (2): Disciplined process
• Initial (1): Informal process

Testing and Independent Verification and Validation
• Does the planned response or action step actually work?
• Who verifies that it does?
• What do you do if it fails the test?

Implementation and Triage
• Someone better be in charge
• Dispute resolution processes
• Who will be your Sensibility and Sanity Checker (off site, not affected by the disaster, and actually getting enough sleep to make sound decisions)?
• Baton Rouge example with Mayor Holden

Recovery, Discovery, and Improvements
• What will the new normal be and when will it happen
• Learn from history, both recent and long past
• Document while the event occurs if at all possible (make it someone’s job) or soon after, before memories fade

Player Scorecard
Who Is In the Game and Why

Overlapping and Interrelated Responsibilities
• Disaster Preparedness and Recovery and Business Continuity
• Physical Security
• Public Safety
• Quality Assurance Methodologies
• Cyber Security

The Usual Suspects in Public Safety
• Police
• Fire
• Other sworn officers (transit, game, building or branch based, etc.)
• National Guard
• Public Health
• Public Works
• Transportation
• Environmental Protection

The Usual Suspects in Emergency Management
• Federal, state and local emergency management entities
• National Guard
• NOAA, NWS, NSSL, other National Laboratories
• Corps of Engineers

IT Entities
• CIO, CTO, and Enterprise IT Shops
• Distributed IT Departments and leadership
• Government IT contractors
  – DR/BC specific entities
  – Applications developers and software
  – Hardware
  – Service providers (ASP, MSP, call centers, etc.)
• Communications providers

Policy Makers
• Executive, legislative, and judicial
  – Those who hold the seat and those who actually make the decisions…
  – Go below the top level to ensure clarity, alignment, and redundancy
• EOC designees
• Emergency authorizers

Non-Governmental Organizations
• Media
  – Broadcast and satellite
    • Emergency Broadcast System Members
  – Print
  – New media
• The Web
  – Government site managers
  – Commercial site managers
  – Citizens and bloggers
  – Self-organizing communities (e.g. Craigslist)
• Charities
• Businesses and business associations
• Community organizations
• Vital private services (hospitals, nursing homes, etc.)

A DR/BC Framework

Business Operations and Technology
• Create a matrix, not a linear or organizational view
• Strategy
• Organization
• Processes
• Applications and data
• Technology
• Facilities
Source: IBM

Action Steps to a Real Plan

First Steps
• Leadership: clarity, alignment, and commitment
• Authority or consensus?
• Stakeholder roles and responsibilities
• Be clear about risk tolerance
• Applications and IT assets inventory
  – If needed, dust off and update your Y2K work
• Good data on plan status, readiness, test results, response, and compliance
• Make a friend in accounting: actuarially accurate threat scenarios are more likely to be funded, as risk and cost can be properly balanced
• Review existing plan or make a plan
• Borrow or buy a template
• Review peer plans and conduct site visits
• Communicate until it hurts

Critical Functions

Nail Down Your Critical Functions
• Law and order essentials (people, mobility, tools, survival basics, etc.)
• Communications
• Personnel management (policies, scheduling, notification trees and systems, counseling, etc.)
• Data and the connections to data and people
• Transactional systems
• Rescue and response
• Pipeline to the health care system
• Building/location/hazmat information for fire and first responders
• Justice processing and incarceration
• Dispatch
• Records
• Mobility
  – Devices and local storage if communications are intermittent or fail (e.g. mobile maps and databases)
• Know what you can actually cover (and what you are just waving your hands at and hoping it either works or is never needed)

Funding and Leverage
• Work within your risk/threat/cost/benefit matrix and follow your own rules
• How serious are you about being prepared?
• Stop building single-purpose infrastructures and reuse what you have
  – “Ask not what an infrastructure can do for you, but what it can do for your taxpayers”
• Use shared services
• Follow standards or help create them if lacking
• Determine what pre-existing, unmet needs can be addressed by a new investment
• Determine whether existing public safety or enterprise systems will do the job and if you can use them
• Invest wisely
  – Vendors over inventors
  – COTS over customization
  – Web services over hard coding

Think Out of the Box

Think Third World
• Hand crank your computers
• Bike generators
• Solar and wind power
• Portable water purifiers
• Emergency shelter
• Runners and mountain bikes
• Hand tools

Think New World
• Internet Protocol (IP) everything
  – Bridge between radio and wireless data/Wi-Fi, and use each as IP conduits as needed
• Gigs of portable flash memory
• Satellite data and telephony
• Instant Message
• Text and mobile email
• Cell On Wheels/Boat/Balloon
• Negotiate/legislate priority and bumping rights in telecommunications provisioning

Integrate With the Big DR/BC Picture

The Big Picture
• Consult EM before, during, and after
• Once essential public safety systems have a
DR/BC IT and overall plan, it can be incorporated into the overall EM plan for the jurisdiction
• Tie it all together in formal and informal agreements
• Create a focal point such as your EOC

EOC Basics
• Not located in a hazard area (floodway)
• 500 square feet minimum floor space
• Communications section adjacent to EOC
• Three methods of communications with state EMA and local responders
• UPS and generator systems located above flood level
• Sleeping space for identified staff
• Kitchen space/food or meal contract
• New construction to International Building Code
Source: Alabama EMD

Conclusion
Essential Public Safety Systems and Organizations Must Be Disaster Resistant, Flexible, Diversified, and Redundant (Or We Are All In Big Trouble)

Contact Information
Richard J. H. Varn
Center for Digital Government
rjmvarn@msn.com

Model Plan Outline
• What follows is a private-sector-based but broadly applicable tool that sells for $199
• To buy a copy of the business continuity plan generator see http://www.eoncommerce.com/rusecure/bcp.asp
• Business Continuity - Preparing the Plan
• Initiating the BCP Project
• Project Initiation Activities
• BC 010101 Review of Existing BCP (if available)
• BC 010102 Benefits of Developing a BCP
• BC 010103 BCP Policy Statement
• BC 010104 Preliminary BCP Project Budget
• BC 010105 Procedure for Approving BCP Content
• BC 010106 Communication on BCP Project to All Employees
• Project Organization
• BC 010201 Terms of Reference for BCP Project Manager
• BC 010202 Appoint BCP Project Manager and Deputy
• BC 010203 Select and Notify BCP Project Team
• BC 010204 Initial BCP Project Meeting
• BC 010205 Project Objectives and Deliverables
• BC 010206 Project Milestones
• BC 010207 Project Reporting Requirements and Frequency
• BC 010208 Required Documents and Information
• Assessing Business Risk and Impact of Potential Emergencies
• Emergency
Incident Assessment
• BC 020101 Environmental Disasters
• BC 020102 Organized and / or Deliberate Disruption
• BC 020103 Loss of Utilities and Services
• BC 020104 Equipment or System Failure
• BC 020105 Serious Information Security Incidents
• BC 020106 Other Emergency Situations
• Business Risk Assessment
• BC 020201 Key Business Processes
• BC 020202 Establish Time-Bands for Business Service Interruption Measurement
• BC 020203 Financial and Operational Impact
• IT and Communications
• BC 020301 Specifications of IT and Communication Systems and Business Dependencies
• BC 020302 Key IT, Communications and Information Processing Systems
• BC 020303 Key IT Personnel and Emergency Contact Information
• BC 020304 Key IT and Communications Suppliers and Maintenance Engineers
• BC 020305 Existing IT Recovery Procedures
• Existing Emergency Procedures
• BC 020401 Summary of Existing Procedures for Handling Emergency Situations
• BC 020402 Key Personnel Responsible for Handling Existing Emergency Procedures
• BC 020403 External Emergency Services and Contact Numbers
• BC 020500 Premises Issues
• BC 020501 Responsibility and Authority for Building Repairs
• BC 020502 Back-up Power Arrangements
• Preparing for a Possible Emergency
• Back-up and Recovery Strategies
• BC 030101 Alternative Business Process Handling Strategy
• BC 030102 IT Systems Back-Up and Recovery Strategy
• BC 030103 Premises and Essential Equipment Back-up and Recovery Strategy
• BC 030104 Customer Service Back-up and Recovery Strategy
• BC 030105 Administration and Operations Back-up and Recovery Strategy
• BC 030106 Information and Documentation Back-up and Recovery Strategy
• BC 030107 Insurance Coverage
• Key BCP Personnel and Supplies
• BC 030201 Functional Organization Chart
• BC 030202 BCP Project Co-coordinator and Deputy for Each Functional
Area
• BC 030203 Key Personnel and Emergency Contact Information
• BC 030204 Key Suppliers and Vendors and Emergency Contact Information
• BC 030205 Manpower Recovery Strategy
• BC 030206 Establishing the Disaster Recovery Team
• BC 030207 Establishing the Business Recovery Team
• Key Documents and Procedures
• BC 030301 Documents and Records Vital to the Business Process
• BC 030302 Off-site Storage
• BC 030303 Emergency Stationery and Office Supplies
• BC 030304 Media Handling Procedures
• BC 030305 Emergency Authorization Procedures
• BC 030306 Prepare Budget for Back-up and Recovery Phase
• Disaster Recovery Phase
• Planning for Handling the Emergency
• BC 040101 Identification of Potential Disaster Status
• BC 040102 Involvement of Emergency Services
• BC 040103 Assessing Potential Business Impact of the Emergency
• BC 040104 Project Management Activities
• Notification and Reporting During Recovery Phase
• BC 040201 Mobilizing the Recovery Team
• BC 040202 Notification to Management and Key Employees
• BC 040203 Handling Personnel Families Notification
• BC 040204 Handling Media during the Disaster Recovery Phase
• BC 040205 Maintaining Event Log during Disaster Recovery Phase
• BC 040206 Disaster Recovery Phase Report
• Business Recovery Phase
• Managing the Business Recovery Phase
• BC 050101 Mobilizing the Business Recovery Team
• BC 050102 Assessing Extent of Damage and Business Impact
• BC 050103 Preparing Specific Recovery Plan
• BC 050104 Monitoring Progress
• BC 050105 Keeping Everyone Informed
• BC 050106 Handing Business Operations Back to Regular Management
• BC 050107 Preparing Business Recovery Phase Report
• Business Recovery Activities
• BC 050201 Power and Other Utilities
• BC 050202 Premises, Fixtures and Furniture (Facilities Recovery Management)
• BC 050203 Communication Systems
• BC 050204
IT Systems (Hardware and Software)
• BC 050205 Production Equipment
• BC 050206 Other Equipment
• BC 050207 Warehouse and Stock
• BC 050208 Trading, Sales and Customer Service
• BC 050209 Human Resources
• BC 050210 Information and Documentation
• BC 050211 Office Supplies
• BC 050212 Operations and Administration (Support Services)
• Testing the Business Recovery Process
• Planning the Tests
• Develop Objectives and Scope of Tests
• Setting the Test Environment
• Environmental Disasters
• Organized and / or deliberate disruption
• Loss of Utilities and Services
• Equipment or System Failure
• Serious Information Security Incidents
• Other Emergency Situations
• Prepare Test Data
• Identify Who is to Conduct the Tests
• Identify Who is to Control and Monitor the Tests
• Prepare Feedback Questionnaires
• Prepare Budget for Testing Phase
• Training Core Testing Team for each Business Unit
• Conducting the Tests
• Test each part of the Business Recovery Process
• Test Accuracy of Employee and Vendor Emergency Contact Numbers
• Assess Test Results
• Training Staff in the Business Recovery Process
• Managing the Training Process
• Develop Objectives and Scope of Training
• Training Needs Assessment
• Training Materials Development Schedule
• Prepare Training Schedule
• Communication to Staff
• Prepare Budget for Training Phase
• Assessing the Training
• Feedback Questionnaires
• Assess Feedback
• Keeping the Plan Up-to-date
• Maintaining the BCP
• Change Controls for Updating the Plan
• Responsibilities for Maintenance of Each Part of the Plan
• Test All Changes to Plan
• Advise Person Responsible for BCP Training