Weathering the Storm
Patricia Vella
Resilience Matters
Former Global Head Business Continuity, Nortel
patriciavella@resilience-matters.com

About Resilience Matters Ltd.
• Patricia Vella is owner of Resilience Matters, a small business continuity consultancy
• Patricia ran Nortel’s corporate-wide business continuity programme for over 5 years
  – She worked closely with key outsourced and offshore facilities
• Since Nortel, Patricia has carried out business continuity, disaster recovery and resilience work for RAC, SAB Miller, The Economist and Deutsche Bank
• Patricia moved into business continuity from a background as a technical architect for high availability telecoms solutions
• http://uk.linkedin.com/in/pvella

4/13/2015 Copyright Resilience Matters

Contents
• Before you start
• Demystifying the Jargon
  – Emergency Response
  – Crisis Management
  – Business Continuity
  – Disaster Recovery
• What sort of plans do you need?
• Case Studies of incidents

Before you Start
• Ensure you know what your company does and what is most critical
  – 999 service support almost unknown in Nortel
• Identify where your company is located
• Find work you can reuse
  – Emergency plans should already be in place
  – Quality plans may contain critical business info
• Understand your company culture

Demystifying the Jargon
• Emergency Response
• Crisis Management
• Business Continuity
• Disaster Recovery/ICT Service Continuity

Emergency Response
• These are your plans for responding to an emergency affecting a physical site, e.g. fire
• Typically developed and owned by H&S
  – Fire safety requirements are specified in legislation: Regulatory Reform (Fire Safety) Order 2005
• These must be enacted first
  – Ensure separation between personnel critical in emergency response, such as first aiders/fire wardens, and business continuity team members

Emergency Plans Must Include
• Action on discovering a fire
• Calling the fire brigade
• Evacuation of the premises, including those particularly at risk
• Power/process isolation
• Places of assembly and roll call
• Liaison with emergency services
• Identification of key escape routes

Emergency Plans May Include
• Alternative assembly points in case of bomb threat
• Premises search instructions for bombs
• Instructions in case of disease outbreak on site, e.g. include liaison with UK Health Protection Agency (HPA)
• Contents and location of emergency grab bag

Crisis Management
• Process by which a major incident is managed
• If an incident affects business processes, crisis management will invoke business continuity and manage that process
• Some incidents, such as kidnap and ransom, are managed without involving the wider business and may utilise specialist external agencies
• Good idea to have a clear definition of what constitutes a crisis and who can invoke

Business Continuity
• Business Continuity is the set of plans and processes that maintain critical operations after a major incident
• Business Continuity is defined as
  – "the strategic and tactical capability of the organization to plan for and respond to incidents and business disruptions in order to continue business operations at an acceptable pre-defined level" (BS 25999-1)

Business Continuity
• Business Continuity for large organisations is much more than a set of plans
• Business Continuity Programme needs
  – Clearly identified leader (and alternate)
  – Annual programme of updates to BIAs and BCPs
  – Contact point for customer questions
  – Defined strategy for supply chain resilience
  – Annual test programme

Business Continuity
• Business Continuity Planning includes mitigations carried out ahead of an incident that reduce impact/risk, e.g.
  – IT Service Continuity measures
  – Dual source critical components
• BCP response strategies typically include
  – Short term manual workarounds
  – Work transfer to alternate teams
  – Transfer of people to work area recovery sites

Disaster Recovery
• Disaster recovery is the process, policies and procedures that enable the recovery or continuation of technology infrastructure after a disaster
• Disaster Recovery Plan (DRP) contains
  – steps to be followed to enable recovery of the technology infrastructure
  – steps to be followed to reconcile the data
• Master DRP specifies running order for system DRPs

Disaster Recovery/ICT Continuity
• Huge variety of techniques to provide disaster recovery. Selection of what is right for you depends on your requirements (and budget).
• Before you start you need to define
  – Recovery Time Objective (RTO), i.e. how long you can tolerate the system being down
  – Recovery Point Objective (RPO), i.e. how much data you could lose

Disaster Recovery Strategies
• Mirroring
• Hot/warm/cold standby
• High availability
• Backup – tape vs hot swappable disks
• Rollback strategy in case of corruption
• UPS and standby generators

What Plans do you need?
• Depends on company size, location, function and regulatory requirements
• IT/Technology need DR plans
  – prioritise most business critical systems first
  – don’t overlook the middleware systems
• Crisis plan should be simple and succinct
  – know who will step in and take charge and when
• Start simple for your business continuity plans
  – build up complexity over time

London BC Invocation 7/7 05
• Nortel had a large Managed Services presence in London
  – This is where we managed parts of the telecoms network
  – 20 different customers
  – Managed all the switches for a major UK telecoms operator
• London bombings quickly caused the following impact to our business
  – Mass call event; unlike usual mass call events (e.g. BGT), calls were not restricted to a specific range of numbers and were sustained over several hours
  – London transport halted, which caused significant impact to shift change for 24x7 operations
  – Difficulty in moving around London impacted spares, FLM, I&C and other services around London

4/13/2015 Copyright Patricia Vella

London BC Invocation 7/7 05
• Nortel Network Operators in London noticed an unusual pattern of mobile calls from several locations in London
  – They alerted the senior manager on duty that day
  – They agreed whatever was going on was clearly bigger than a power surge on the underground
• Ops manager contacted me 20 mins after the first explosion; we reviewed the situation and invoked BC 30 mins after the first explosion
  – Followed BC process (e.g. switch off provisioning, track shift change, transfer operators to alternate sites)
  – Incident EMT managed by conference calls throughout
• Thank-you email from CEO of major UK operator Friday am
• CSF: Highly trained BC Primes
• CSF: Every manager and team leader had been involved in at least one 2 day simulation exercise; some had participated in two previous exercises

India ‘Bollywood’ Star Death April 06
• Bollywood is the informal name for the film industry in India
  – ‘Bollywood’ film stars have a huge following in India
• Rajkumar died 12th April 2006 in Bangalore
  – Approx 60,000 fans took to the streets
  – Vehicles were set alight, police used teargas to control crowds
  – 8 people died in the riots
  – 2 days national mourning
• Technology companies shut down
  – American companies hit first, then Indian companies
  – Impacted R&D; Nortel was notified of the incident, R&D schedules replanned
• Supplier-run Nortel 24x7 call centre shut for 2 days
  – Nortel BC was to temporarily transfer calls to Canadian call centre
  – CSF: Ability to transfer work to alternate supplier at short notice
• http://news.bbc.co.uk/1/hi/world/south_asia/4909432.stm

Summary
• Essential you understand what sort of plans you need and why
  – Emergency Response
  – Crisis Management
  – Business Continuity
  – Disaster Recovery
• The types of incident that can cause you to invoke your plans are many and unpredictable
• Tests prove their value in real incidents
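
Worked example: checking a strategy against RTO and RPO

The RTO and RPO definitions on the Disaster Recovery/ICT Continuity slide can be turned into a simple screening check when comparing disaster recovery strategies. The sketch below is illustrative only: the `Strategy` class, the `meets_objectives` function and all the time figures are assumptions made up for this example, not figures from Nortel or any real system. Real values come from testing your own systems.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Strategy:
    """A candidate DR strategy with illustrative (assumed) figures."""
    name: str
    recovery_time: timedelta  # time needed to restore service after a disaster
    max_data_loss: timedelta  # worst-case age of the last recoverable copy


def meets_objectives(strategy: Strategy, rto: timedelta, rpo: timedelta) -> bool:
    """True if the strategy restores service within the RTO
    and loses no more data than the RPO allows."""
    return strategy.recovery_time <= rto and strategy.max_data_loss <= rpo


# Hypothetical requirements for one business-critical system.
rto = timedelta(hours=4)     # system may be down for at most 4 hours
rpo = timedelta(minutes=15)  # at most 15 minutes of data may be lost

# Hypothetical figures for three strategies from the strategies slide.
candidates = [
    Strategy("Nightly tape backup", timedelta(hours=48), timedelta(hours=24)),
    Strategy("Warm standby", timedelta(hours=2), timedelta(hours=1)),
    Strategy("Mirroring with hot standby", timedelta(minutes=10), timedelta(minutes=1)),
]

suitable = [s.name for s in candidates if meets_objectives(s, rto, rpo)]
print(suitable)
```

With these assumed figures only the mirrored option passes: the warm standby recovers in time but could lose an hour of data, which exceeds the 15 minute RPO. Relaxing the RPO to one hour would bring the warm standby into scope, which shows how the requirements (and budget) drive the choice of strategy.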