Information Assurance for the Enterprise: A Roadmap to Information Security, by Schou and Shoemaker
Chapter 10: Continuity Planning and Disaster Recovery
McGraw-Hill/Irwin. Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved.

Objectives
  Develop an effective business continuity approach
  Manage an effective incident response
  Plan for disaster recovery

Business Continuity
  Preserves essential organizational assets
  Protects resources from damage, destruction, and loss
  Serves as an information assurance lifeboat
  Does not preserve everything; preserves things essential to continue business operations
  Develops and maintains an up-to-date, comprehensive strategy

Business Continuity Planning
  Planning mitigates the interruption of essential services
  Seeks to re-establish operations quickly by focusing on critical functions
  Relies on contingency plans that itemize the steps to follow when needed
  First step in building the plan is to identify and prioritize critical assets through risk analysis
  Business continuity: offsite storage and recovery facilities

Continuity and Business Value: Continuity planning
  Preparedness plan: prevention and minimization of damage as well as securing or recovering information after a disaster
    • Developed through a strategic planning process
    • Characterizes the operational measures followed to prevent avoidable disasters
    • Enumerates the contingency measures to be adopted, should a disaster occur
    • Itemizes the replacement and restoration procedures used to ensure the integrity of the information assets

Continuity and Business Value: Contents of continuity plan
  Continuity planning process has two goals:
    • To avoid loss of critical information in a disaster
    • To return critical information functions to operation as quickly and efficiently as possible
  Continuity planning function targets the three components of an IT operation:
    • Systems
    • Personnel
    • Facilities

Continuity and Business Value: Contents of continuity plan (cont’d)
  Plans must be established to respond to every possible threat
    • Key concept is feasibility
  Employs ongoing threat modeling and risk assessment processes
    • To identify and prioritize threats because of the need to identify and address only the feasible options
  Establishes a risk analysis procedure
    • To decide the order in which the threats should be addressed by a formal preparedness response

Proactive Response: Ensuring “Continuous” Continuity
  To ensure continuity, build real-time survivability into the overall information function
  Immediate “recoverability”: integration of protection strategies with a range of proactive recovery technologies
  The result should be a dynamic assurance solution that blends protection elements
    • Firewalls and intrusion detection systems
  Rigor is essential
    • Survival of critical technology processes is inextricably linked to the continuing effectiveness of functions

Recovery time
  Fundamental aim of the business continuity process is to ensure the shortest realistic recovery time possible
  Estimated recovery time is calculated by determining the Maximum Tolerable Downtime (MTD)
  Estimate based on three concepts:
    • Recovery Time Objective (RTO)
    • Network Recovery Objective (NRO)
    • Recovery Point Objective (RPO)

Recovery time
  Recovery Time Objective (RTO)
    • Maximum operationally acceptable period of time that a system can be out of service without causing harm
  Network Recovery Objective (NRO)
    • Greatest amount of time a network can be out of service
  Recovery Point Objective (RPO)
    • The point in time to which data can be restored after a failure

Recovery time
  Determining RTO, NRO, and RPO for one environment
  RTO/NRO and RPO are mutually supportive, but:
    • They are different concepts
    • They support different sets of decisions and protection requirements

Alternative Sites
  In the event of a disaster, systems should be able to switch processing functions efficiently to alternative sites
  Relationship between criticality requirements and alternative processing requires an understanding of:
    • Hotsites
    • Warmsites
    • Coldsites

Data Recovery: Hotsites
  Used in critical instances requiring an immediate restoration capability
  Facilities mirror the real-time processing at the primary site
  Provide near-instantaneous backup since they operate in parallel
  Ensure the optimum potential for total recovery of the data resource and continuity of operation

Data Recovery: Warmsites
  Provide the equipment and communications interfaces for establishing an immediate backup operation
  Cannot ensure that all the data will be preserved
  Usually the most practical approach
  Extremely cost efficient

Data Recovery: Coldsites
  Provide a degree of protection
  Value: resumption of business operations as soon as the staff is moved
  Disadvantage: significant data from the primary site might be lost or have to be rebuilt

Analysis Processes
  Identify risks to critical systems and the effect their failure has on overall business processes
  Two kinds of analyses are associated with continuity plan development:
    • Business impact analysis
    • Risk analysis

Analysis Processes
  Business impact analysis
    • Determines the effect that a potential disruption might have on a function or information asset
  Risk analysis
    • Examines the critical functions and resources that support operations detailed in the impact study
    • Driven by an estimate of the overall criticality of the system
    • Major component of risk analysis is disaster tolerance

Analysis Processes: Risk analysis (cont’d)
  Disaster tolerance
    • Implies various levels of criticality
    • Varying degrees of associated responses, which form four categories:
      • Minimal criticality
      • Average criticality
      • High criticality
      • Mission-critical

Ingredients of a Continuity Plan
  Continuity plans have two steps:
  The assumptions about the circumstances of the plan
    • Events that could change or affect those assumptions
  The strategy for maintaining continuity, based on those assumptions

Ingredients of a Continuity Plan: Step 1: Assumptions
  Derived from an understanding of the threats and the associated threat modeling
  Are dynamic since:
    • The threat picture changes constantly
    • The assumptions have to be periodically updated
  Should include the:
    • Timing
    • Extent of the threat
    • Areas of potential harm

Ingredients of a Continuity Plan: Step 2: Priorities and strategy
  Strategy adopted and the philosophy that drives continuity
    • Must be understood and accepted throughout the organization
    • Must adopt and communicate a single common continuity approach
    • Should originate from and align with the stated organization strategy and philosophy

Instituting the Business Continuity Management Process
  Management goal: keep critical systems operating and react to failures as soon as possible
  Management plan: protect the maximum number of assets with the highest degree of assurance
  Five questions to ensure that the plan has the right set of elements:
    • What are the critical business systems?
    • What is the business impact of each of these systems?
    • What risks are associated with each system?
    • What is the level of integrity required for each system?
    • What are the RTO and the RPO for each system?
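
One way to keep the answers to the five questions together is a per-system record that can then be ranked for recovery. The Python sketch below is illustrative only and is not taken from the chapter: the names SystemProfile, Criticality, and recovery_order, the field names, and the sample values are all assumptions. It reuses the four disaster-tolerance categories from the risk analysis slide and orders systems by criticality first, then by the shortest RTO.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import IntEnum


class Criticality(IntEnum):
    """The four disaster-tolerance categories; the numeric ranks are an assumption."""
    MINIMAL = 1
    AVERAGE = 2
    HIGH = 3
    MISSION_CRITICAL = 4


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical record holding the answers to the five questions for one system."""
    name: str                 # Which critical business system is this?
    business_impact: str      # What is its business impact?
    risks: tuple              # What risks are associated with it?
    integrity_level: str      # What level of integrity does it require?
    criticality: Criticality  # Disaster-tolerance category from the risk analysis
    rto: timedelta            # Recovery Time Objective
    rpo: timedelta            # Recovery Point Objective


def recovery_order(profiles):
    """Rank systems: most critical first, ties broken by the shortest RTO."""
    return sorted(profiles, key=lambda p: (-int(p.criticality), p.rto))


# Sample values are invented purely for illustration.
EXAMPLE_INVENTORY = [
    SystemProfile("intranet wiki", "low", ("disk failure",), "standard",
                  Criticality.MINIMAL, timedelta(days=3), timedelta(days=1)),
    SystemProfile("order processing", "revenue stops", ("site fire", "network loss"),
                  "high", Criticality.MISSION_CRITICAL,
                  timedelta(hours=1), timedelta(minutes=5)),
    SystemProfile("payroll", "staff unpaid", ("data corruption",), "high",
                  Criticality.HIGH, timedelta(hours=24), timedelta(hours=4)),
]

if __name__ == "__main__":
    for profile in recovery_order(EXAMPLE_INVENTORY):
        print(profile.name, profile.criticality.name, profile.rto)
```

Sorting on criticality before RTO is a design choice made for this sketch; an organization could equally rank on RTO alone once the risk analysis has fixed those values.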

Four Phases of the Business Continuity Planning Process
  Business continuity planning is best done in phases
  There are four phases:
    • Identify critical business functions
    • Establish Recovery Time Objectives
    • State the explicit work (SOW)
    • Ensure acceptance and understanding of the solution

Four Phases of the Business Continuity Planning Process
  Planning process

Phase 1: Identify the Critical Business Functions
  Function criticality is derived from a characterization of the explicit value of:
    • Products
    • Services, including supporting functions
    • Governance or administration factors
  Once these have been identified and evaluated, they are assessed based on their overall contribution
  Volume and load factors: measures employed to describe the contribution

Phase 1: Identify the Critical Business Functions
  Matrix allows the organization to understand the relative contributions

Phase 1: Identify the Critical Business Functions
  Following classification characterizes the activities in the evaluation matrix:
    • Critical activities
    • Included activities
    • Non-essential activities
  Determining feasible alternatives
    • Whether there are other ways to perform a given operation
    • Whether it could be carried out by a similar set of tasks
    • This determination must consider all redundancy provisions

Phase 1: Identify the Critical Business Functions
  Know that it is an ongoing effort
  Perform needs assessments on a continuous or regular basis because organizations change constantly
  Activities designated as “critical”
    • Must be addressed appropriately
    • It must be possible to validate them by direct observation

Phase 2: Set Recovery Time Objectives (RTO)
  Specified in the order of their criticality after considering redundancy and contract alternatives
  Assign a value describing how soon it must be operational
    • An estimate of the resources required to achieve it
  Establish a mechanism to ensure the resources will be available
  Identify the internal and then any external resources and contractors
  Identify any potential shortfalls in either resources or capabilities
  Itemize and cross-reference shortfall areas to the RTO

Phase 3: Identify and Record Solution in a Statement of Work
  Statement of work:
    • Is a specification itemizing the steps to be taken to meet each RTO
    • Details the procedures followed to address foreseeable problems
    • Identifies areas of shortfall in personnel, work area, equipment, supplies, or service capability
    • Is a set of recommendations for how that shortfall will be addressed
    • Specifies the organization’s assumptions about continuity
    • Provides clear guidance for each foreseeable contingency

Phase 4: Ensure Understanding
  Ensure that all participants in the process clearly understand their role and accountability
  Make appropriate parts of the plan available to each stakeholder
  Instill continuity concepts in active projects
  Bring the entire organization to the required level of capability
  All levels of management have to understand and support the process

Disaster Recovery Planning
  Disaster recovery planning or crisis management
  Aspect of business continuity management that applies after a disaster
  Focuses on a narrower aspect of continuity
  Identifies every disaster contingency and offers a prescription that allows an effective response to each
  Oriented toward restoring the technical operations with the aim of bringing an identified set of critical systems back to a desired level of operation

Timing and DRP
  Timing is important in the design of the disaster strategy and the implementation of the recovery plan
  Estimated time to return to normal operation at the damaged site must be significantly greater than the time it would take to migrate operations
  A DRP requires understanding of the effect that the downtime has on business processes

Elements of Disaster Planning
  Disaster planning has:
  Long-term perspective: effective disaster planning centers on anticipating disasters and ensuring the proper solution
    • Planning process assumptions are based on selecting the most likely disaster scenarios and regularly updating their probability
  Short-term perspective: specify the steps taken if a particular disaster occurs
    • Anticipated events associated with a given scenario have to be clearly understood, laid out, and cross-referenced to the procedures

Elements of Disaster Planning: Types of Disasters
  Natural disasters
    • Localized or area floods
    • Tornadoes, hurricanes, or earthquakes
  Site disasters
    • Fire, water, and sewer emergencies
    • Gas leaks, chemical leaks or spills
    • Telephone or cable interruptions
    • Explosion or other building failures
  Civil disasters
    • Car, plane, or train crash
    • Civil disturbance

Elements of Disaster Planning
  A disaster recovery plan should be able to respond to all credible threats

Elements of Disaster Planning
  Three elements include:
  Disaster impact description and classification
    • Requires understanding and describing of the threat implications
  Response deployment and communication processes
    • Designates the right people to react in the case of a disaster
  Escalation and reassessment procedures
    • Helpful if the situation turns out to be worse than anticipated
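
The timing rule from the Timing and DRP slide and the escalation element above can be pictured as two small checks. The Python sketch below is a rough illustration under stated assumptions, not a procedure from the chapter: the three-step severity scale, the names Incident, needs_escalation, and should_migrate, and the margin multiplier standing in for "significantly greater" are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum


class DisasterType(Enum):
    """The three disaster types listed on the slide."""
    NATURAL = "natural"
    SITE = "site"
    CIVIL = "civil"


# Hypothetical severity scale, ordered from least to most severe.
SEVERITY_LEVELS = ("minor", "serious", "severe")


@dataclass
class Incident:
    """Hypothetical impact description and classification for one event."""
    kind: DisasterType
    description: str
    anticipated_severity: str
    observed_severity: str


def needs_escalation(incident):
    """Escalate when the situation turns out to be worse than anticipated."""
    observed = SEVERITY_LEVELS.index(incident.observed_severity)
    anticipated = SEVERITY_LEVELS.index(incident.anticipated_severity)
    return observed > anticipated


def should_migrate(repair_time, migration_time, margin=2.0):
    """Timing rule: migrate only when repairing the damaged site would take
    significantly longer than migrating. The margin multiplier is an
    assumption used here to quantify 'significantly'."""
    return repair_time > migration_time * margin


if __name__ == "__main__":
    leak = Incident(DisasterType.SITE, "gas leak in server room", "minor", "severe")
    print(needs_escalation(leak))
    print(should_migrate(timedelta(days=10), timedelta(days=2)))
```

In practice the margin and the severity scale would come from the organization's own planning assumptions, which the slides say must be updated regularly.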