Disaster Recovery in IT
David Irakiza
CSC 585 - High Availability and Performance Computing
2012

Outline
• Introduction
• Requirements/Considerations
• Categories
• Benefits
• Strategies
• Precautionary measures
• Vendors and products/tools
• Questions

Introduction
• A disaster is an unexpected event with destructive consequences
  – Natural
  – Human
  – Technical
• Disaster recovery: the processes, policies and procedures for preparing for recovery or continuation of the technology infrastructure critical to an organization after a disaster

Requirements/Considerations
• Recovery Point Objective (RPO): amount of data loss that is tolerable. Represents the point in time of the most recent backup prior to system failure
• Recovery Time Objective (RTO): system downtime that is acceptable. Includes the time to detect the failure, prepare backup servers, initialize the failed application and reroute requests via the backup site
  – Goal is to have a very low RTO
• Performance: impact on normal operation after recovery. Protection should have minimal impact on the performance of the application being protected
• Consistency: accuracy of application data and output. The application should be restored to a consistent state after failure
• Geographical location: the disaster recovery (backup) site should not be in the same location as the system being protected

Categories of backup sites
• Hot backup site: mirrored standby servers are always available to provide service in case of disaster. Uses synchronous replication to avoid data loss; minimal RPO and RTO
• Warm backup site: may use synchronous or asynchronous replication depending on the RPO the business requires. Standby servers are available but in a "warm" state; it could take some minutes to bring them online when needed
• Cold backup site: data is replicated periodically, hence an RPO of hours or days. Backup servers are not readily available; it could take hours or days to resume operation after failure, hence a high RTO
  – Very low cost option, suitable for applications that don't require strong protection or availability guarantees

Benefits
• Reduces time spent on making decisions after failure
• Provides confidence that business continuity is possible after failure
• Standby system availability is guaranteed
• Backups ensure information or documents can be provided when originals are destroyed
• Risk of human disaster is reduced

Strategies
• Tape backups that are sent off-site at regular intervals
• On-site disk backups automatically copied to off-site disk
• Data replication to an off-site location (using Storage Area Network technology)
• High availability systems that keep off-site replicas of both data and system

Precautionary measures to prevent disasters
• Disk protection technology, e.g. RAID
• Surge protectors
• Uninterruptible Power Supply and/or generators
• Alarms, fire extinguishers
• Antivirus software

Vendors and products/tools
• Three types of off-site disaster recovery replication products:
  – Array based: replicate between two proprietary storage arrays, e.g. SRDF (EMC Corp.), IBM's Peer to Peer Remote Copy, Hitachi Data Systems' TrueCopy
  – Third party: replicate from a heterogeneous environment at site A to site B, e.g. IPStor (FalconStor's software), DoubleTake & GeoCluster (NSI), Data Protection Suite (Topio Inc.), Global Disaster Recovery (SANRAD)
  – Managed services: an outsourcer operates the company's remote site or entire disaster recovery operation, e.g. WilTel Communications, StorageTek, IBM Global Services, Hewlett-Packard Co.
    • Fairly new and gaining popularity among mid-sized companies

References
1. "Disaster Recovery". bizhelp24. August 2009. Accessed January 27, 2012 <http://www.bizhelp24.com/small-business/disaster-recovery.html>
2. "Disaster Recovery". Wikipedia.org. Accessed January 27, 2012 <http://en.wikipedia.org/wiki/Disaster_recovery>
3. "The Benefits of Preparing for Disaster Recovery". bizhelp24. September 2010. Accessed January 27, 2012 <http://www.bizhelp24.com/money/commercialinsurance/the-benefits-of-preparing-for-disaster-recovery.html>
4. "Advantages of Disaster Recovery as a Service". datacenterknowledge. October 2011. Accessed January 27, 2012 <http://www.datacenterknowledge.com/archives/2011/10/25/advantages-ofdisaster-recovery-as-a-service>
5. Chad Bahan. "The Disaster Recovery Plan". 2003. SANS Institute InfoSec Reading Room. Accessed January 27, 2012
6. Shane O'Neill and Beth Pariseau. "Tech Roundup: Disaster recovery tools". August 2005. Accessed February 7, 2012 <http://searchstorage.techtarget.com/tip/Tech-Roundup-Disaster-recoverytools>

Questions?
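
Backup: checking a site against RTO and RPO (sketch)
The RTO and RPO definitions given earlier can be turned into a small calculation. The Python sketch below is only an illustration under stated assumptions: RTO is modelled as the sum of the four phases the definition lists (detect, prepare, initialize, reroute), and worst-case data loss under periodic replication is modelled as one full replication interval. All class names, function names and numbers are hypothetical and do not come from any vendor or measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPhases:
    """Minutes spent in each phase that the RTO definition lists."""
    detect_failure: float
    prepare_backup_servers: float
    initialize_application: float
    reroute_requests: float

    def total_minutes(self) -> float:
        # Assumption: the phases run one after another, so downtime is their sum.
        return (self.detect_failure + self.prepare_backup_servers
                + self.initialize_application + self.reroute_requests)


def worst_case_data_loss_minutes(replication_interval_minutes: float) -> float:
    """Assumption: a failure just before the next copy loses up to one full
    interval of data. Synchronous replication is modelled as an interval of 0."""
    return replication_interval_minutes


def meets_objectives(phases: RecoveryPhases,
                     replication_interval_minutes: float,
                     rto_minutes: float,
                     rpo_minutes: float) -> bool:
    """True when both estimated downtime and worst-case data loss fit the targets."""
    downtime_ok = phases.total_minutes() <= rto_minutes
    data_loss_ok = worst_case_data_loss_minutes(replication_interval_minutes) <= rpo_minutes
    return downtime_ok and data_loss_ok


# Hypothetical figures, chosen only to illustrate warm versus cold sites.
warm_site = RecoveryPhases(detect_failure=2, prepare_backup_servers=10,
                           initialize_application=5, reroute_requests=3)
cold_site = RecoveryPhases(detect_failure=2, prepare_backup_servers=480,
                           initialize_application=30, reroute_requests=3)

if __name__ == "__main__":
    # Business targets (also hypothetical): RTO of 30 minutes, RPO of 15 minutes.
    print(warm_site.total_minutes())                      # 20
    print(meets_objectives(warm_site, 15, 30, 15))        # True
    print(meets_objectives(cold_site, 24 * 60, 30, 15))   # False
```

With these made-up figures, the warm site (15-minute replication, about 20 minutes to bring online) fits a 30-minute RTO and 15-minute RPO, while the cold site (daily replication, hours to prepare servers) does not, which matches the qualitative comparison in the backup site categories.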