Good Practice Guide
BUSINESS CONTINUITY

Contents
1. Introduction
2. The BCM life cycle
3. Understanding the organisation
4. Determining strategy
5. Developing a response
6. Exercise, maintain, review
7. Embedding BCM
APPENDIX 1: Incident Response
APPENDIX 2: Glossary
APPENDIX 2: Further Information

Continuing your professional development (CPD) is all about keeping on the front foot in your career. Developments in facilities management come thick and fast through technological, legislative, environmental, economic and political changes, so CPD is essential to stay informed and to help you reach your potential. Members of BIFM can access a wide range of knowledge and information, such as the Good Practice Guides, through the members' area of the BIFM website. To help you on your way and identify core activities, look out for BIFM's CPD logo. If you are not a member of the institute then be sure you don't miss out! Join today at www.bifm.org.uk/join

ISBN: 978-1-909761-17-9
Edition: Second
Date: September 2015
Authors: Steve Dance, Managing Partner, Risk Centric and BIFM Risk and Business Continuity SIG Committee Member
Peer reviewer: Mike Cronin, BIFM Risk and Business Continuity SIG Committee Member and Group Facilities Director, Haymarket Media Group

BIFM
Number One Building, The Causeway, Bishop's Stortford, Hertfordshire CM23 2EN
T: +44 (0) 1279 712620
E: membership@bifm.org.uk
www.bifm.org.uk
Advertising T: +44 (0) 1279 712620

© The Good Practice Guides series is published by the British Institute of Facilities Management (BIFM). The guides do not necessarily reflect the views of BIFM nor should such opinions be relied upon as statements of fact. All rights reserved. This publication may not be reproduced, transmitted or stored in any print or electronic format, including but not limited to any online service, any database or any part of the internet, or in any other format in whole or in part in any media whatsoever, without the prior written permission of the publisher. While all due care is taken in writing and producing this Good Practice Guide, BIFM does not accept any liability for the accuracy of the contents or any opinions expressed herein.

1. Introduction

Business continuity is the ability to continue essential business functions at all times, under all circumstances and as far as humanly possible.

The purpose of this Guide is to describe the general principles and the practical application of business continuity management (BCM), to enable facilities managers to develop an understanding of the issues. The Guide is aimed at those with little or no previous knowledge of business continuity, although familiarity with the working environment and the culture in which business continuity is to be implemented is assumed.

The Guide follows the British Standard, ISO22301, the current business continuity management code of practice, with input from the Business Continuity Institute's Good Practice Guidelines 2013 (BCI GPG 2013), in which further, more detailed, information can be found: www.thebci.org

BCM and its skills and disciplines should be seen as "common sense applied in a structured manner".

2. The BCM life cycle

The British Standard, ISO22301, introduced a life cycle model of a Business Continuity Management (BCM) programme. The model begins with a setting-up procedure and then becomes an iterative process in four areas.

[Figure: the BCM life cycle, showing BCM programme management; understanding the organisation; determining BCM strategy; developing and implementing a BCM response; exercising, maintaining and reviewing; and embedding BCM in the organisation's culture.]

The table "BCM life cycle – the key stages" gives a more detailed overview of the key stages of the life cycle.

BCM needs to be embedded in the corporate culture or it becomes overlooked, forgotten or taken for granted. If this happens, it will lack buy-in and essential support from senior management in terms of funding, resources, exercises, etc.

Managing the Programme

BCM programme management is first concerned with managing the introduction and maintenance of business continuity principles into the organisation. It should be based on a formal policy with defined responsibilities and processes, all documented as auditable evidence. Subsequently it will provide the impetus to promote, maintain and assure the implemented programme.

BCM programme management requires:
>A team to manage the programme with the authority to define and implement policies and standards and influence the prioritisation of business continuity activities. The team must operate with the board's support and endorsement, otherwise essential support is unlikely to be available.
>Policies, standards and guidelines that define the framework of the programme.

The BCM policy is a document, issued by senior management, that communicates the organisation's BCM framework, together with the responsibilities and expectations of those involved with managing and maintaining the organisation's business continuity arrangements.

The policy sets out what needs to be done and by whom.
Typically it would cover:
>Corporate business continuity organisation and responsibilities:
– Senior management team
– Steering committee
– Business continuity co-ordinators
>Criteria to determine which parts of the organisation will need plans
>Requirements for review and exercising
>Requirements for awareness and training
>Audit review and reporting
>Requirements for record keeping.

Standards and guidelines would include:
>Templates to provide a consistent format
>Guidelines on key activities and template completion.

BCM life cycle – the key stages

Life cycle stage: BCM programme management
Main activities:
>Establish management organisation
Outcome:
>Committee structure and project staffing
>BCM policy
>Standards and guidelines

Life cycle stage: Understanding the organisation
Main activities:
>Business impact analysis
>Risk assessment
>Continuity requirements analysis
Outcome:
>Product and service exposure map showing types of exposure causing operational disruption (see, for example, Table 1)
>Maximum tolerable period of disruption and recovery time objective for each operation
>Prioritised risks that could cause operational disruption (see, for example, Table 2)
>Recovery point objective and matrix showing minimum resources to maintain each operation (see, for example, Table 3)

Life cycle stage: Determining BC strategy
Main activities:
>Identify countermeasures in order to achieve resumption of operations
Outcome:
>Recovery strategies for each operation (see, for example, Table 4)

Life cycle stage: Developing and implementing a BCM response
Main activities:
>Identify detailed actions necessary and resources required to manage an interruption and maintain effective communications with all affected parties
Outcome:
>Incident management plan for notification, escalation and management of an incident (see, for example, Table 5)
>BCP to resume operations within a predefined timescale (see, for example, Table 6)
>Activity resumption plan to resume individual activities (see, for example, Table 7)

Life cycle stage: Exercising, maintaining and reviewing
Main activities:
>Establish a framework and organisation to support oversight, evaluation, maintenance and testing of BC arrangements
Outcome:
>BCP testing document (see, for example, Table 8)

Life cycle stage: Embedding BCM in the organisation's culture
Main activities:
Ongoing initiatives to:
>Provide access to details of BC arrangements
>Create quick reference resources and materials
>Implement an enforceable policy
>Measure levels of awareness
Outcome:
>Regular communication programme

3. Understanding the organisation

To begin the BCM life cycle you must understand the organisation within which the strategy is to be implemented. Three principal tools are used in this context:

Business Impact Analysis (BIA)
A means of identifying, quantifying and qualifying the consequences of a loss, interruption or disruption of business activities over time. A BIA can be used at any level on any activity in the organisation.

Risk Assessment (RA)
Estimates the likelihood of loss, interruption or disruption from known threats.

Continuity Requirements Analysis (CRA)
Assesses the resources required for a resumption of activities.

Business Impact Analysis (BIA)

A BIA needs the following information:
>What resources and services are critical to the core business activities
>The potential impact of a disruption to the provision of those resources and services
>The stage at which, in terms of the duration of the disruption, the impact on the business would become unacceptable.

Deciding the scope of the analysis may limit the maximum extent over which a disruption is considered. This could be determined by geographical considerations, regulations or statutes, products, markets or specific customer requirements.

BIA methods

Collecting information from staff responsible for core business activities and their dependencies aids the choice of continuity strategies.
Collection methods include:
>Workshops, which provide rapid results and engagement with the BCM programme
>Questionnaires, which give a lot of data although the quality varies
>Interviews, which offer good information but are time consuming.

Combinations of the above can give excellent results.

Table 1 Product and services exposure map
From Legal Financial Reputation Regulatory contract RTO MTPD MTPD MTPD MTPD Responsible manager (example names) 1 Product 2 Vision Opticals – Direct to customer 3 Manufactured Operations 3 months 14 days 12 days John Priestly 4 Factored Logistics 3 months 14 days 12 days John Simmonds 5 3rd party Logistics 30 days 25 days Eric Dickens 6 Vision Opticals – Wholesale 7 Manufactured Operations 3 months 30 days 25 days Jane Ross 8 Factored 25 days Tom Mickleson 9 Opticals – Export 10 Manufactured Operations 3 months 30 days 25 days Toby Rice 11 Factored 25 days Anna Austin 13 Manufactured Operations 3 months 14 days 12 days Jack Prince 14 Factored 12 days Mike Reason 12 days Joel Kent Logistics Logistics None None None 3 months 30 days None None 3 months 14 days 15 Solar Opticals – OneVision None 16 Manufactured Operations 3 months 14 days 17 None 3 months 30 days 12 Solar Opticals – Techstrap Logistics None None None ETC.

A standard reporting format will improve the consistency of recording and analysing information across multiple functions.

The types of questions and the objectives are the same whichever approach is chosen. They include:
>Location of activities
>The impact of losing the activity
>How long the organisation can last without the activity
>Timeframes for activity resumption
>Influences, such as peak periods or regulatory reporting
>What the alternatives are.

Factors to consider include:
>Volumes, e.g.
calls per hour, output on production line
>Contractual, regulatory or legal requirements
>Key tools to achieving continuity of the activity: buildings, processes, suppliers (how many, where and when)
>People; staff (skill set), customers
>Equipment; IT, telecommunications, manufacturing/industrial, plant
>Data; paper and electronic
>Dependencies; internal and external to the organisation
>Public/media/brand implications.

The main outputs from a BIA are:
>The Maximum Tolerable Period of Disruption (MTPD), leading to the Recovery Time Objective (RTO) – the timescale within which a function must be restored to enable continuity of the business to be maintained or resumed.
>Recovery Point Objective (RPO) – the condition to which the situation is to be restored to enable business activities to resume effectively.

The output from the first stage of the BIA process would look similar to the information shown in Table 1. The main products of the company have been listed vertically; below each are broad sub-headings of their required resources and services. For each, an MTPD has been defined: the time before the company is exposed to a given business continuity concern, such as financial or reputational damage. The person responsible for recovery management is also identified.

Several products have an MTPD of 14 days before there is a risk to the company's reputation. Therefore the company must focus on RTOs for these which provide a minimum level of acceptable service, and the RPO within this timeframe, to avoid damaging its customer relationships. The RTO in Table 1 is 12 days in order to give some margin.

DOs and DON'Ts
>DO ensure that business interruption risks are expressed in categories such as reputation, contractual/legal obligations, regulatory compliance and financial impact
>DO ensure that the MTPD and RTO thresholds have been considered for all of the above risk categories
>DO ensure that interruption threats at different stages in the business cycle have been fully enumerated
>DO ensure you have documented the outcome of the "Understand the Organisation" activities
>DON'T get bogged down in unnecessary detail

Risk Assessment (RA)

In the BCM context, RA highlights specific threats that could cause a significant business interruption to the broad categories of resources and services identified as most crucial. In large or complex organisations it is desirable to carry out the exercise in manageable sections. The RA can be used to inform the decision about where to concentrate BIA efforts.

The objective is to:
>identify internal and external threats that could cause disruption and to assess their probability and impact
>prioritise those threats according to an agreed formula
>supply input to a risk management action plan.

The key stages in an RA are:
>Agree a scoring system for impacts and probabilities with the project sponsor
>Calculate a risk from each threat using the list of vulnerabilities from the BIA
>Prioritise these risks, taking account of the ability to control them
>Obtain the sponsor's approval and sign-off on these risk priorities
>Review existing control strategies, noting where the risk level is out of step with the current strategies for that threat
>Consider appropriate measures to:
– avoid the risk, eg, remove the cause of the threat
– reduce the risk, eg, introduce further controls
– transfer the risk, eg, through insurance (but note that although insurance can provide financial compensation, it may not provide cover for the full expense of the incident or damage)
– accept the risk, eg, low impact or probability.

Ensure planned risk measures do not increase other risks. For example, outsourcing may decrease some types of risk but increase others.
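The scoring and prioritising stages above can be sketched in a few lines of Python. This is only an illustration: the three-point numeric scale and the likelihood × impact formula are assumptions, not something this Guide prescribes, and the real scoring system should be agreed with the project sponsor. The `Threat` class and `prioritise` function are hypothetical names; the likelihood and impact ratings are those given for four of the threats in Table 2.

```python
from dataclasses import dataclass

# Assumed three-point scale; the real scale is agreed with the project sponsor.
SCALE = {"Low": 1, "Medium": 2, "High": 3}


@dataclass(frozen=True)
class Threat:
    name: str
    likelihood: str  # "Low", "Medium" or "High"
    impact: str      # "Low", "Medium" or "High"

    @property
    def score(self) -> int:
        # Assumed formula: likelihood multiplied by impact.
        return SCALE[self.likelihood] * SCALE[self.impact]


def prioritise(threats):
    """Return the threats ordered from highest to lowest score.

    Threats with equal scores keep their original relative order.
    """
    return sorted(threats, key=lambda t: t.score, reverse=True)


THREATS = [
    Threat("Pandemic among staff", "Low", "High"),
    Threat("Industrial action", "Low", "Low"),
    Threat("Extreme weather", "Medium", "High"),
    Threat("Loss of utilities", "Medium", "High"),
]

if __name__ == "__main__":
    for threat in prioritise(THREATS):
        print(f"{threat.score}  {threat.name}")
```

A ranking like this is a starting point for discussion with the sponsor, not a decision in itself: it takes no account of the ability to control each risk, which the prioritisation stage also requires.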
Table 2 Suggested prioritised risks
Resource | Threat | Likelihood | Impact | Risk response | Risk treatment
People | Pandemic among staff | Low | High | Accept | Contingency plan; robust HR policy
People | Industrial action | Low | Low | Reduce | Maintain good staff links
Premises | Extreme weather | Medium | High | Reduce | Multiple locations; co-location
Premises | Loss of utilities | Medium | High | Avoid | Install backup systems
Premises | Fire | Medium | Medium | Reduce | Sprinkler system; co-location
Equipment | Lack of spares | Low | Medium | Reduce | Stock essential spares; multiple equipment
Technology | Deliberate damage | Low | High | Reduce | Install physical access controls; IT Disaster Recovery plan
Technology | Virus infected IT equipment | Medium | High | Accept | Install security software
Supplies | Transport disruption | Medium | Medium | Reduce | Contingency plan; contract private transport
Supplies | Sub-contract default | Medium | Medium | Reduce | Multiple sub-contractors
Stock | Manufacturing fault | Low | Low | Avoid | Quality control procedures; multiple supply
ETC.

The outcomes from an RA include the identification and documentation of:
>single points of failure
>a prioritised list of threats to the organisation or specific business processes
>input to the risk control management strategy and action plan to address the risks
>documented acceptance of identified risks that are not to be addressed.

This activity should result in an understanding of:
>how and why an incident could have an adverse impact on your business
>time thresholds for key activities that must be re-established
>the internal and external dependencies they rely on.

It should be remembered that:
>it is impossible to identify all threats
>estimates of probability are only estimates
>impacts increase over time at different rates
>numeric scales may distort the perceived impact of minor events.

Unacceptable concentrations of risk or "single points of failure" should be brought to the attention of the business continuity sponsor with options for addressing the issue.
The decision to avoid, reduce, transfer or accept the risk should be formally documented and signed off (see Table 2).

Continuity Requirements Analysis (CRA)

The next step is the CRA. The aim is to quantify the resources (eg, people, technology, telephony) that are required over time to resume and continue business activities to a satisfactory level. In other words, the aim is to operate at an acceptable level (the RPO) within an acceptable time (the RTO). This is usually done simultaneously with the BIA.

Its purpose is to:
>provide resource information to develop the recovery strategy to support agreed service levels
>identify resource requirements resulting from dependencies between internal activities and external suppliers.

It is important to explore whether systems must be recovered to the status they had when the failure occurred. The RPO for IT systems will be derived from the information restoration needs. The RPO is sometimes seen as "the amount of data we could afford to lose".

It is also necessary to take account of additional activities generated by the interruption and clearing of backlogs. For example, a call centre may have to cope with extra calls following an interruption.

This information feeds into the business continuity strategy. Resource requirements help us to evaluate alternative recovery solutions in terms of capacity and performance.
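The effect of a backlog on capacity requirements can be illustrated with some simple arithmetic. The sketch below is a minimal model under stated assumptions: demand is constant, all work missed during the interruption still has to be done, and the recovered operation runs at a fixed daily capacity. The function name `days_to_clear_backlog` and the figures in the example are hypothetical and chosen purely for illustration.

```python
from typing import Optional


def days_to_clear_backlog(
    outage_days: float,
    demand_per_day: float,
    recovered_capacity_per_day: float,
) -> Optional[float]:
    """Estimate how many days a recovered operation needs to clear its backlog.

    Assumptions (not from the Guide): demand is constant, nothing is processed
    during the outage, and no missed work is cancelled.
    Returns None when the backlog can never be cleared.
    """
    backlog = outage_days * demand_per_day
    if backlog == 0:
        return 0.0
    # Only capacity above normal demand is available to reduce the backlog.
    surplus = recovered_capacity_per_day - demand_per_day
    if surplus <= 0:
        return None
    return backlog / surplus


if __name__ == "__main__":
    # Hypothetical: 2-day outage, 2,000 units/day demand, 2,500 units/day capacity.
    print(days_to_clear_backlog(2, 2000, 2500))
```

If the recovered capacity only matches normal demand, the function returns `None`: the backlog never clears, which suggests the minimum resource requirement should allow some headroom above day-to-day demand.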
The dependencies would generally be mapped on a separate matrix (see Table 3), showing the product or service and the services, processes and resources that support it. Typically, there are six main areas of dependence:
>People, including partners, customers and contractors, to provide skills, knowledge and manpower
>Premises, to provide a working environment, accommodating equipment and stock
>Equipment to perform specialised tasks
>Technology to communicate and to store, manipulate and present data
>Supplies for manufacturing processes
>Stock to fulfil orders.

All activities depend on such "enablers" and will require them to be available within a given timeframe to avoid the associated risk exposures. In Table 3, one of the products identified on the BIA has been analysed to reveal its dependencies.

Table 3 Matrix of resource requirements
Vision Opticals – Direct to customer, third party products
RTO:
RPO:
Activity/Product dependencies | Current provision | Minimum requirement
People | 20 | 10
Premises | 1,000 m² office; 10,000 m² warehouse | 5,000 m² warehouse
Equipment | Racking; packing; fork lifts | Shrink wrapping equipment to process 2,000 units per day
Technology | ERP; email; file server | Five users with server access within 24 hours
Supplies | 2,500 units/day | 2,000 units/day
Stock | 5,000 units on hand | 2,000 units to handle pending delivery

4. Determining strategy

In the BIA, the MTPD for key activities will have been determined, together with an RTO and RPO for each of those activities. The BCM strategy sets out an appropriate approach to recovering each activity. It is the selection of a goal (eg, "If we lose access to building XX, we will relocate staff to YY") that needs to identify, in general terms, how many staff, what skills and what resources we might need to have available at the chosen locations, as well as any necessary travel arrangements.

The BCM strategy describes what has to be done, not how it has to be done.
It is therefore the selection of a high-level response such as:
>Replicate and restore: Keep copies in case the originals are lost or damaged (most IT recovery plans are based on this concept)
>Repair: Remedial work may be the quickest method of recovering key resources
>Replace: If supply is plentiful then key resources can be replaced quickly
>Reciprocity: Arrange to borrow another organisation's facilities
>Relocate: Move the workforce and workload temporarily to another location
>Workaround: Temporarily adopt an alternative approach to a process
>Suspend: Adjourn the activity until normal service is restored.

Different activities require different solutions. Strategy selection is influenced by practicalities such as the cost of implementation and maintenance. Transferring staff and operations takes time and effort. Normally, a fast and seamless recovery entails a more costly solution. Therefore, it is important to ensure that realistic RTOs and RPOs are set. Is it essential to recover systems to the status they had when the failure occurred or, for example, will restoring yesterday's back-ups be sufficient?

There are three main aspects to setting the BCM strategy to achieve the agreed RTO:
>Selecting the tactics for continuing the delivery of products and services
>Consolidating the resource requirements
>Sourcing these requirements.

The various options must be fully understood before selecting the appropriate tactics.

Activity continuity strategies

For each activity, the most appropriate tactics to meet the RTO must be selected based on cost, guarantees, additional benefits and other factors. Agreements may vary from verbal promises through to contractually committed service levels. The shorter the RTO, the more important the reliability of the delivery becomes.
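The trade-off between recovery speed and cost can be made concrete with a small Python sketch. It deliberately simplifies the selection to two factors, keeping only the tactics that can meet the RTO and then choosing the cheapest; guarantees, additional benefits and other factors are left out. The `Tactic` class, the `cheapest_within_rto` function and all recovery times and costs are hypothetical, labelled here as assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Tactic:
    name: str
    recovery_days: float  # assumed time to resume the activity
    annual_cost: float    # assumed yearly cost of keeping the tactic available


def cheapest_within_rto(tactics: Iterable[Tactic], rto_days: float) -> Optional[Tactic]:
    """Return the lowest-cost tactic that can recover within the RTO.

    Returns None when no tactic is fast enough, which signals that either
    the RTO or the list of options needs to be revisited.
    """
    eligible = [t for t in tactics if t.recovery_days <= rto_days]
    return min(eligible, key=lambda t: t.annual_cost, default=None)


# Hypothetical options and figures, for illustration only.
OPTIONS = [
    Tactic("Dedicated third-party site", recovery_days=1, annual_cost=50_000),
    Tactic("Syndicated third-party site", recovery_days=3, annual_cost=15_000),
    Tactic("Do nothing", recovery_days=90, annual_cost=0),
]

if __name__ == "__main__":
    choice = cheapest_within_rto(OPTIONS, rto_days=12)
    print(choice.name if choice else "No tactic meets the RTO")
```

With these made-up figures, tightening the RTO removes the cheaper options from consideration, which is the pattern the text describes: a faster recovery normally entails a more costly solution.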
People

Some of the following techniques should be considered:
>Process mapping: Allowing staff to undertake unfamiliar roles
>Multi-skill training: Of individuals
>Cross-training of skills: Across a number of individuals
>Succession planning.

Additional skills may arise from permanent or occasional use of third-party support. Alternatively, an inventory can be made of staff skills not used in existing roles. This might include previous experience in other roles, first-aid training, salvage or rescue experience, or emergency management skills.

Many stakeholders (including customers, partners and contractors) may be affected by an incident. In a major fire at your site, contractors may be injured, local residents evacuated and local businesses closed for safety reasons or because of reduced trade. The organisation's level of responsibility (both legal and moral) for these groups should be understood.

You should ensure various stakeholders' needs are satisfied or they may impede the recovery effort. For example, the local residents could press the local authorities to refuse you permission to rebuild on the site following a fire.

For civil emergencies, dialogue with local emergency responders may provide useful information, such as:
>Recommendations for assembly points and evacuation routes
>Notice of specific hazards in the vicinity
>Likely position of any traffic cordons
>Special access arrangements
>Participation in exercises.

The BC manager should be aware of threat reduction techniques, including:
>Physical security, where advice can be sought from security professionals
>Information security. ISO 27001, Information Security Management, and ISO 17799 (since renumbered ISO/IEC 27002), Code of Practice for information security management, provide useful guidelines.

Premises

The RTO is the principal determinant of worksite continuity tactics.
Once the RTO parameter has been satisfied, cost and availability will guide the choice of tactics. Premises tactics include:
>Do nothing
This may be acceptable for the least urgent activities identified in the BIA. Where the RTO exceeds a few months it allows time for buildings to be found and utilities installed post incident, all with minimal planning and preparation.
>Relocate your staff
– Move up: Use existing accommodation such as a training facility or canteen to provide recovery space, or increase office density. This needs planning and preparation.
– Displacement: High priority activity personnel could temporarily displace some of those who are performing less urgent business processes. But beware of unmanageable backlogs.
– Remote working: This includes "working from home" and from other non-corporate locations such as hotels.
– Reciprocal agreements: Great care must be taken when establishing this type of agreement. It requires contracted regular testing.
>Use third-party premises
Third-party alternative site arrangements may be considered if they meet the RTO. Commercial services include fixed, mobile and prefabricated premises. Dedicated work areas provide exclusive use of the accommodation. "Syndicated" or "Subscription" options offer access, provided the accommodation is not already in use. This can be on a first come, first served basis or an equitable share basis, whereby resources are allocated in proportion to the subscription.
>Use diverse location tactics
This option moves the activity and not the staff, via dual-site operations or continuous availability solutions. In the event of an interruption at one site, the business activities are transferred to alternative locations where staff and facilities are already prepared to handle them.

Equipment

With uninterruptible power supply (UPS) or back-up generators, some risks are acceptable.
Risk reduction can use monitoring systems to warn against utility or equipment failures and destructive threats, eg, sprinkler and fire suppression systems in buildings with a high loading of flammable materials or expensive equipment.

Possible recovery techniques to consider are:
>Maintenance contracts, preferably with local firms
>Salvage engineers, who can often restore equipment after damage by fire or water
>Asset restoration specialists, who can often minimise damage after fire and flood to equipment, buildings and papers, and may offer useful advice, as well as being available on request
>Use of local subcontractors or competitors with similar equipment.

Technology

The loss of a data centre can have a major financial impact on a business. There are several options, including in-house resilience, recovery or third-party support. It is a complex and costly area in which technical expertise and a sound working knowledge of the critical systems are invaluable.

The IT department, or the equivalent service provider, should investigate and recommend appropriate recovery options, which include:
>Ship-in contracts for IT and specialist equipment, including telephone systems. Terms of contract vary from 'best efforts' to guaranteed delivery.
>Call redirection for telephony. Most telecommunications operators offer solutions for redirecting calls from one site to another. The logistics of handling redirected calls must be addressed.
>Convergence of telephony and data networks, VoIP (Voice over IP). This creates new opportunities and issues, since telephones and email are often used as alternatives if one fails; these issues need to be assessed and the risks and impacts thoroughly analysed.

Since business continuity incidents often involve denial of access, back-up copies of records should be kept at another location.
There is no 'correct' separation distance, but one must consider denial of access factors such as loss of power or transport disruption. There may also be limits on the distances staff would be prepared to travel at short notice.

Note that after an incident the regulatory, statutory or business standards for information management still apply. Key issues to address are:
– Confidentiality
– Integrity
– Availability
– Currency.

Supplies

One must determine what supplies (including equipment) are needed, and how quickly, to meet the RTO of each activity. Replacement strategies include:
>Storing additional supplies at another location. If the supplies degrade over time they should be rotated with regular stock
>Changes in the core process may require stored supplies to be changed (eg, headed stationery may need new address or contact details)
>Delivery of stock at short notice
>Diversion of just-in-time deliveries to other locations
>Holding materials at warehouses or shipping sites
>Transferring sub-assembly operations to new locations
>Holding older equipment as emergency replacements or for spares
>Specific risk mitigation strategies are needed for unique or long lead time equipment: replacing outdated equipment with long lead time updated versions may impede recovery
>Geographical diversity of processes. Make sure that the RTO can be met by the alternative location.

Techniques for reducing the impact of supply interruptions include:
>Dual or multi-sourcing
>Inspection of supplier's business continuity arrangements. This may include a requirement for certification to ISO22301
>Holding inventories off-site, at another site or at the supplier's site
>Penalty clauses on supply contracts (no protection against bankruptcy)
>Pre-acceptance of alternative suppliers.

Resource level consolidation

The objective of resource level consolidation is to understand and locate the resources necessary to achieve the RTO and RPO. It is necessary for two reasons:
>Co-ordinating the acquisition and utilisation of resources can prevent conflicts, such as when more than one operation expects to use the same alternative workspace
>Bulk purchasing may be more efficient and cost-effective.

Resource consolidation includes the following stages:
>Aggregating resource requirements from the CRA
>Evaluating each option against the RTO and RPO and providing executive management with a strategic evaluation
>Obtaining sign-off for financial and resource provision
>Creating project and action plans
>Applying the agreed strategy.

The result is a set of recovery resources and services for the restoration of business systems within their RTO and RPO. Executive management must make a strategic evaluation and sign off the strategy, together with the requisite financial and resource provisions.

Table 4 Recovery strategy for third party products
Vision Opticals – Direct to customer, third party products
RTO:
RPO:
Activity/Product dependencies | Current provision | Minimum requirement
People | 20 | 10
Premises | 1,000 m² office; 10,000 m² warehouse | 5,000 m² warehouse
Equipment | Racking; packing; fork lifts | Shrink wrapping equipment to process 2,000 units per day
Technology | ERP; email; file server | Five users with server access within 24 hours
Supplies | 2,500 units/day | 2,000 units/day
Stock | 5,000 units on hand | 2,000 units to handle pending delivery
Recovery strategy
Replicate/restore | X
Repair | X (warehouse)
Replace |
Reciprocity |
Relocate | X X X X
Workaround |
Suspend | X (office)

In Table 4 we have addressed recovery strategies for the third-party products, part of a direct to customer business. The following issues were considered:
>We are going to relocate 10 staff. What skills are required, where will they go, how will they get there and what resources will they need?
>We will need to find alternative warehousing facilities. Who will do that and what information will they need to source this?
>We are going to replace any damaged equipment. Who will do that and identify potential suppliers in advance?
>We will need to identify alternative suppliers. Who are they and who will contact them?
>We may need to replace lost stock, so we are looking at a reciprocal fulfilment arrangement with another firm to supply products while we recover our operation.

5. Developing a response

This part of the process concerns the most detailed planning documents, which are also likely to be the most fluid. The aim is to identify the actions and resources required to manage an interruption, whatever its cause.

Key requirements for an effective response are:
>A clear procedure for escalation and incident control
>Communication with stakeholders
>Business continuity plans (BCPs) to resume interrupted activities.

A BCP is a set of guidelines that require interpretation by the business continuity team (BCT) according to circumstances. It is not possible – or even desirable – to predict what might occur.

The Incident Management Plan (IMP) defines how strategic issues would be addressed and managed by the executive. This may include incidents where there is no physical disruption, right up to a national emergency. Media response to any incident is usually managed through an IMP.

At a tactical level the BCPs address business disruption from the initial response through to the point at which normal business operations are resumed. Based on the BCM strategy, they provide procedures and processes for the BCT, allocating roles and responsibilities. They must also give details regarding liaison with external agencies such as recovery services' suppliers and emergency services. If the event falls outside the scope of the BCP, the situation should be escalated to the senior incident management team (IMT).
Operationally, Activity Resumption Plans (ARPs) provide detailed guidelines for the recovery teams to implement the resumption of normal business functions and support services.

Incident Management Plan (IMP)
The IMP provides a framework for managing any incident. The plan should contain initial prompts for action, such as a list of stakeholders to be contacted. The BIA will offer useful pointers to potential impacts which may need to be managed. Wherever a BCM response is required the IMT should be alerted.

If no IMP exists it may be useful to run an exercise with the senior management team so that the many requirements become apparent (such as the need for a plan). All incidents differ and so the IMP is a framework of components and resources that may be useful, rather than a rigid procedure.

The roles of the team and specific individuals should be documented. Deputies should be identified for each role. Responsibilities may include:
>Managing communications (see section below)
>Ensuring IMTs and BCTs are properly staffed
>Liaising with the BCT to agree the resumption timetable
>Approving significant expenditure
>Monitoring recovery progress and personnel performance
>Identifying and maximising opportunities or advantages arising from the incident
>Looking at the strategic impact, which may require significant changes in direction or open up new opportunities
>Maintaining a decision log throughout the incident.

Clear invocation criteria should be set out, and the persons able to initiate the call-out decided. This should encourage action where there is doubt; it is easier to stand down a team than to activate them once the incident is out of control. The activation procedure should be documented so decisions are not delayed.

A number of alternative meeting locations should be identified and, on invocation, the first person notified should select the most suitable, based on current information.
At least two locations should be predefined to act as an incident management centre (control room or command centre). One is likely to be on-site where the senior management team are based but the other should be off-site. The off-site location does not have to be owned by the organisation. By prior arrangement, a 24-hour hotel may provide all the facilities required.

Consideration should be given to:
>Communication: inbound and outbound
>Recording events, actions and issues
>Monitoring the media
>Access control.

The following resources should be considered:
>Whiteboards or flip charts (and pens that work)
>Telephones, including an outgoing line and a recording facility
>Hotline/helpline facility
>TV and radio
>Stationery
>A means of logging all actions
>Refreshments and nearby or on-site sleeping facilities
>An IMP
>A locked trunk (often called a ‘battlebox’ or ‘recovery box’), in which hardware and information can be kept off-site at the alternative location.

All BCM strategies should take into account welfare issues during an incident and the recovery. Staff are more likely to co-operate if their welfare needs are met. Issues to consider include individual special needs during prolonged stay-in periods.

An IMP should be succinct and clear because it will be used under pressure in stressful circumstances. The outcomes of the process include:
>An incident communications plan
>Demonstration of preparedness
>Compliance with statutory, regulatory and ethical requirements.

The IMP should be documented. The template in Table 5 gives an example of a suitable format.

Table 5 Incident Management Plan
Incident management framework
Team members and responsibilities
Role | Responsibility | Contact Details | Deputy | Contact Details
Site evacuation | | | |
Personnel accountancy | | | |
Communication (staff & others) | | | |
Emergency services liaison | | | |
Telephone reception for next of kin | | | |
Media & external communication | | | |
Transport assistance | | | |
Translation services | | | |
Incident management centre locations:
Incident management centre access arrangements:
Incident management centre resources
Location | Desks/Chairs | Phones | PCs | Fax | Other Office Materials / Equipment

Major incidents requiring an IMP can vary from those which threaten the continued existence of an organisation but have little impact outside of it, to those which, like the Buncefield oil depot explosion, can become a national emergency. The principles to be applied to the latter are exactly the same but there is increased emphasis on health and safety and liaison with emergency services. These are features which may have little or no prominence in a purely internal issue such as the failure in a supply chain.

Appendix 1 describes the incident response structure employed by the UK emergency services. The model is suitable for organisations with the potential for major health and safety incidents.

The BCPs and the ARPs are similar in structure, but focus on different aspects of recovery:
>BCPs cover the management of common resources such as facilities, information technology, finance and personnel – in essence the organisation’s infrastructure
>ARPs focus on the recovery of specific activities, often customer facing, such as order taking, customer helpdesks or claims handling.

Both types of plan have similar considerations and structure dealing with what has to be done by whom, when, where and how.

Business Continuity Plans (BCPs)
All BCPs should be ‘action orientated’, easy to reference at speed and exclude superfluous information.

The BCP should document assumptions about the maximum scale of the incident. If these are exceeded, the incident should be escalated to the IMT.
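As an illustration only, the escalation rule above can be sketched in a few lines of Python. The measures used here (staff displaced, sites affected, outage hours) and all of the names are hypothetical; a real BCP would record whatever scale assumptions suit the organisation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleAssumptions:
    """Hypothetical upper limits documented in a BCP (illustrative only)."""
    max_staff_displaced: int
    max_sites_affected: int
    max_outage_hours: float


@dataclass(frozen=True)
class IncidentEstimate:
    """Current best estimate of the incident, in the same units."""
    staff_displaced: int
    sites_affected: int
    outage_hours: float


def exceeded_assumptions(plan: ScaleAssumptions,
                         incident: IncidentEstimate) -> list[str]:
    """Return the names of the planning assumptions the incident exceeds."""
    checks = [
        ("staff_displaced", incident.staff_displaced, plan.max_staff_displaced),
        ("sites_affected", incident.sites_affected, plan.max_sites_affected),
        ("outage_hours", incident.outage_hours, plan.max_outage_hours),
    ]
    return [name for name, actual, limit in checks if actual > limit]


def should_escalate_to_imt(plan: ScaleAssumptions,
                           incident: IncidentEstimate) -> bool:
    """Escalate as soon as any documented assumption is exceeded."""
    return bool(exceeded_assumptions(plan, incident))
```

Returning the list of exceeded assumptions, rather than a bare yes or no, gives the BCT something concrete to report when it escalates.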
Key steps in developing a BCP are:
>Appoint an owner for the BCP(s)
>Define objectives and scope based on the BCM policy and strategy
>Decide the structure, format, components and content
>Gather information to populate the plan and prepare a draft plan
>Circulate the draft plan for consultation and review
>Test/exercise the plan
>Gather feedback from consultation and amend the plan as appropriate.

All BCPs should be modular in design so that separate sections can be supplied to teams on a need-to-know basis. Each section could be printed on different coloured paper to provide ease of use and reference. Dynamic information, such as contact details, should be in appendices which can be amended easily, with job titles rather than names in the main body of the document.

Software products are available to help you build and maintain a BCP. However, normal office software may well suffice and does not require special training. Customised software, though, may prove helpful in plan maintenance.

Table 6 Business Continuity Plan (BCP)
Business Continuity Plan
Location:
Activity:
Recovery management centre:
Available facilities:
Alternative location:
Contact list
Name | Office tel. | Mobile | Role/Action
Actions

Business Continuity Plan – the contents
>Basic information
– Document owner and maintainer
– Team members and their roles along with named deputies. Responsibilities may include:
  – Receiving information from other teams
  – Reporting status to the IMT
  – Mobilising suppliers of salvage and recovery services
  – Allocating available resources to recovery teams
– Invocation/mobilisation instructions. There should be a number of possible meeting locations, favouring those with the required resources. On invocation the first person notified should identify the most suitable meeting place, plus a fallback based on current information.
>Actions
– Responding to an invocation
– Decision making
– Mobilising resources
– Liaising with emergency services
– Initiating activity recovery
– Obtaining information from response teams
– Reporting to the IMT
>Resource requirements
– Personnel
– Facilities and supplies
– Technology, communications and data
– Security
– Transportation and logistics
– Welfare requirements
– Emergency cash and payments
– Any additional resource requirements for specific activities
– Contact information to access required resources.
>Vital information
– Customer information
– Contact details
– Legal documents, such as contracts and insurance policies
– Service Level Agreements.
>Forms and annexes
– Checklists to assist recovery.

DOs and DON’Ts
>DO remember that the BC response is a framework for a stressful situation
>DO ensure key roles and responsibilities are clearly defined
>DO check that staff holding BC response roles have been appropriately trained
>DO have a “quick reference” guide readily available summarising key information

A function-specific BCP generally has two main sections:
– A list of key contacts and team members identifying the roles and responsibilities of each
– An outline of the specific actions necessary for recovery.

Examples in Table 6 provide a framework within which ARPs can operate. The BCP should be signed off by the executive.

Activity Resumption Plans (ARPs)

Introduction
ARPs cover the response by a department or business unit to an incident. Examples are:
>Procedures to assist the IMT, led by the facilities department, to deal with the incident and its physical impact
>An HR response to welfare issues
>A business department (eg, finance department) plan to resume its functions within a predefined timescale
>An IT department plan for the resumption of IT services to the business.

The complexity and urgency of the business processes may determine whether one plan covers a single activity or a department with several activities.

Process
Key steps in ARP development and planning include:
>Making someone responsible for overall plan development with representatives from each operational unit
>Providing a template to encourage standardisation but allow individual variations where appropriate
>Ensuring that business units nominate individuals to fulfil roles within their plans
>Circulating the draft plan for consultation, review and challenge within and, where necessary, outside the department
>Validating and amending the plan as appropriate through a unit test
>Documenting connections with the BCP and between the ARPs for each business unit
>Consolidating the various business unit plans and reviewing for consistency
>Conducting a resource requirements analysis across all plans to define the resource requirements for support functions.

Methods
Development of an ARP is similar to that for other plans. Specific ARPs may include the following:
>Facilities
>Staff welfare plans
>Business unit resumption
>IT disaster recovery

The above plans may include information on:
>Evacuation and stay-in plans
>Emergency services liaison
>Dispersal of staff and visitors
>Salvage resources and contracted assistance
>HR and welfare issues
>Procedures for contacting and accounting for staff
>Counselling and rehabilitation resources
>Escalation criteria and procedures
>Contacting team members
>Resumption plans for each process:
– Staff numbers
– Key contacts
– Procedure for activity resumption
– Activity priorities
– Special procedures
– Work in progress issues
– Consumables required.

Outcomes and deliverables
Outcomes of planning include:
>An ARP for each business activity or department
>Criteria for escalating issues to the BCT
>Clearly defined BCM roles within each department.

Table 7 is a simple example of some of the activities needed to recover financial management after a major incident.

Table 7 Activity Resumption Plan (ARP)
Business Recovery – Finance department responsibilities
Column headings:
Action: Outline here the actions that will need to be performed to put the chosen recovery plan into effect
Primary considerations
Responsibility (Primary): Who will be responsible for performing these actions
Responsibility (Deputy): Who will deputise if the primary responsibility holder is unavailable?
Support team: Identify the support team(s) who will be responsible for these particular actions
Rows (Action | Primary considerations):
Assess financial exposures related to legal and financial issues | 1. Potential civil liabilities 2. Compliance with industry-specific regulations 3. Extent of financial penalties under Service Level Agreements
Banking | 1. Establish alternate means of accessing bank account balances and movements 2. Establish alternate means of making payments: smart cards, ‘dongles’, passwords; authentication software/devices installed on alternate PCs 3. Payroll 4. Activating company credit cards 5. Emergency limits for company credit cards 6. Emergency overdraft/funding considerations
Accounts payable | Emergency credit lines, renegotiating supplier payment terms
Insurance | Liaison with insurers and loss adjusters

6. Exercise, maintain, review
All BC documents should be reviewed and the plans exercised at least annually. Reviews and exercises should also be carried out whenever there is a significant change to the business processes or environment.
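For organisations that track their plans in a spreadsheet or simple script, the review rule above could be checked along the following lines. This is a minimal sketch: the function name and the 365-day reading of ‘annually’ are assumptions made for illustration, not part of any standard.

```python
from datetime import date, timedelta

# Assumption: "at least annually" is read as "no more than 365 days apart".
REVIEW_INTERVAL = timedelta(days=365)


def review_is_due(last_reviewed: date, today: date,
                  significant_change: bool = False) -> bool:
    """Return True if a BC document should be reviewed or exercised now.

    A review is due when a significant change to the business processes
    or environment has occurred, or when the annual interval has elapsed.
    """
    return significant_change or (today - last_reviewed) >= REVIEW_INTERVAL
```

For example, a plan last reviewed in January is not yet due in June of the same year, but becomes due at once if a significant change is flagged.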
No plan is reliable if it has not been exercised, nor can the personnel involved be relied upon until they have had some form of practice. BC exercises are crucial because they develop the necessary competence, confidence and knowledge to act.

Five stages of exercise are recognised, as detailed in Table 8. The normal progression would be to start at the bottom with a desktop exercise and work up towards the full-scale exercise at the top. (Most organisations limit themselves to stages 1-4.)

DOs and DON’Ts
>DO include requirements for exercising, maintaining and reviewing in the BC policy
>DO ensure compliance is subject to independent assessment
>DO ensure there is regular confirmation of roles, contact details and availability of BC response resources
>DO ensure plans are subject to regular exercising, at least annually
>DO ensure a formal issues log is created, maintained and reviewed

Concepts and assumptions
For any test to be ‘useful’, it needs to meet the following criteria: stringency, realism and minimal exposure to additional risk. This may require some degree of compromise.
>Stringency
Ideally, tests should be as realistic as possible; however, it may not be practical to run certain tests without altering ‘live’ procedures. This applies especially to technical testing.
>Realism
This ensures that the audience engages in the event and ultimately gains more from it.
>Minimal exposure
Testing may increase exposure to risk. The designer of the test should ensure that the risk and impact of disruption are minimised. The business must understand and accept the risk.
Table 8 Types of exercise
Stage | Purpose | Style and focus
5 Full-scale exercise | Develop the capability; demonstrate competence | Simulation and realism; scenario based
4 Command post exercise | Acquire the skills; develop the techniques | Co-ordination and communication; scenario based
3 Active testing | Practical experience; test the components | Participation and interpretation; consequence based
2 Walk-through | Challenge the assumptions; verify the dependencies | Review and discussion; effect based
1 Desktop | Validate the logic; spot the weaknesses | Introduction and familiarisation; plan based

Process
All tests must be planned, the results captured and any remedial work monitored for successful implementation. A technical test may include the following steps:
>Agree the scope, objectives and budget
>Devise a simple scenario and a set of assumptions to put the test in context
>Conduct the test and record the results
>Assess and report the results
>Address any issues raised.

A scenario exercise will require similar steps, with some additional ones, such as:
>Prepare a realistic and suitably detailed scenario
>Brief observers and prepare questionnaires to capture lessons learned
>Pre-exercise information and briefing of participants
>Debrief participants after the exercise
>Evaluate results and prepare a report with recommendations
>Circulate report to participants and senior management.

Methods
>Participants
Possible participants, in addition to staff, in desktop or scenario exercises include:
– Facilitators
– Suppliers of specialist resources, services or products
– Communications and PR
– Outsourced activity providers
– Subject experts.
>Outcomes and deliverables
The outcomes of the BC exercising process include:
– Validation of the BC strategies
– Familiarisation of participants in responding to an incident
– Testing of the plan(s) and the supporting infrastructure
– A post exercise report
– Increased awareness of BC
– An opportunity to improve preparedness.

Table 9 gives an example of part of a BCP testing document outlining the assurances required, whether they were tested and where remedial work is required. This would normally form part of a wider report and follow-up process where the results of the test were communicated and responsibilities for remedial work identified.

Table 9 Business Continuity Plan testing document
Columns: Assurance | Tested | Effective | Remedial action required
Incident discovery and notification
– The right people were notified/alerted effectively
– The required emergency services were identified and notified in a timely manner
– Effectiveness of assembling the Incident Management Team
– Effectiveness of locating and communicating with staff
Business recovery
– Recovery management organisation
– Working accommodation: relocated staff
– Staff working from home
– Insurance and disaster restoration services liaison
– Customer management activities established
– Handling and prioritisation of customer commitments performed effectively
– Telephony/voice communications successfully re-established
– IT systems fully and effectively restored
– Locating replacements for damaged or destroyed equipment/stock/raw materials
– Capacity and resources available at alternative working location
– Arrangements for relocating staff
– Order management/fulfilment resumption

Maintenance
A BCM programme must be established to ensure all relevant stakeholders have the current and relevant parts of the BCP.

Review
There are several ways to review a BCM programme including:
>Internal audit
>External audit
>Self-assessment.

DOs and DON’Ts
>DO make sure staff can easily find information relating to the plan and the structure of the supporting BCM organisation
>DO remind staff regularly about BCM arrangements
>DO ensure BC arrangements are on board meeting agendas at least once a year

7. Embedding BCM
BCM is a holistic management process which identifies risks that may threaten an organisation. It provides a framework for building resilience and the capability for an effective response to safeguard the interests of its key stakeholders, reputation, brand and value creating activities. To be successful it has to be seen as a part of normal business management, regardless of the organisation’s size or sector.

At all points in the BCM process, opportunities exist to introduce and enhance an organisation’s BCM culture. Precisely how everybody is made aware of BC and its implications will depend to a large extent on the existing culture and ways of communicating ideas. Three stages can be envisaged:

(i) Assessing BCM awareness and training needs
Before planning and designing the components of an awareness campaign, it is critical to understand what level of awareness currently exists. Assessing awareness is just another aspect of “Understanding the Organisation” (see page 6) and the same techniques (workshops, questionnaires and interviews) can be used. These will identify the level of training required. Is it in just the practicalities of your organisation or is an explanation of the basic concept required?
(ii) Developing BCM in the organisation’s culture
This will build on the training needs identified above and lead to the design and delivery of a programme of education, training and awareness, which must:
– explain the need for business continuity plans within the organisation,
– provide access to details of the organisation’s specific business continuity arrangements,
– create quick reference resources and materials, and
– implement an enforceable policy.

(iii) Monitoring cultural change
The awareness assessment, stage (i), should be maintained as an ongoing task to identify any further requirements for education and training.

The importance of a common understanding of the value of BCP should not be underestimated. It ranges from board-level support to staff commitment to exercises. The value will be clearly demonstrated in an incident.

Appendix 1 Incident Response

UK emergency services incident response structure
UK emergency services use a three-tier incident response structure (see Figure 1) with responsibilities and relationships.

Figure 1. Three-tier incident response structure
GOLD | Strategic | Senior (Incident) Management
SILVER | Tactical | Business Continuity Team
BRONZE | Operational | Incident Response & Business Unit Resumption Teams
(Control flows down the tiers; escalation flows up.)

In an incident the three levels of involvement address different issues during the various phases of the event, as the following diagram (Figure 2) shows.

Figure 2. Phases of an incident
Overall objective: back-to-normal as soon as possible
Timeline: normal, incident, then three overlapping phases:
INCIDENT RESPONSE, within minutes to hours: account for people; deal with casualties; contain damage; assess damage; invoke business continuity
BUSINESS CONTINUITY, within minutes to days: contact staff, customers, suppliers etc.; recover critical processes; rebuild lost work-in-progress
RECOVERY / RESUMPTION, within weeks to months: repair/replace damage; relocate to permanent site; recover costs from insurers

Smaller organisations may elect for a single hands-on management group with both tactical and strategic responsibilities. However, it is still important that they address the strategic issues despite the pressing issues of a tactical response.

For geographically diverse organisations a variety of models may be appropriate, perhaps with additional tiers beyond the three named above. For example:
>A response team at each site backed up by a central BC team
>A BC team at major sites with a central IMT
>BCM at a national level with limited involvement from the international board unless global reputation is threatened.

All BCM strategies should recognise people issues but in major health and safety incidents they can be the dominating issue. During such an incident someone should assume responsibility for the activities listed in Table 5 (see page 21). Subsequently there may be additional needs, including:
>Temporary accommodation
>Counselling and rehabilitation services, perhaps within an employee health package
>Welfare needs at alternative locations:
– Refreshments
– Personal safety and security
– Transport and accessibility
– Appropriate training on replacement equipment.

Someone should be appointed to liaise with the emergency services as they arrive on site and subsequently as required. The emergency services need to be advised of the whereabouts of any casualties, the status of the situation and any hazards they may encounter. While on site, the emergency services’ instructions take precedence over all others. When they depart, the organisation resumes responsibility for the site.

The incident communications plan addresses communication with all stakeholders including:
>Staff, relatives, friends and emergency contacts
>Customers and suppliers
>Shareholders, partners or owners
>Informing and liaising with regulatory authorities (legal and compliance functions)
>Issues relating to serious injuries or fatalities (with the emergency services)
>Media: local and national newspapers, radio, TV, internet and other media.

When an incident or business discontinuity gets into the public domain, effective communication plays a key role in protecting an organisation’s reputation. Answers to the following questions need to be considered:
>What are the messages?
>Who will form the IMT?
>What resources and facilities are available?
>Are the IMT and spokespeople properly trained?

It is necessary to consider:
>Ownership of the plan
Everybody involved should agree beforehand about the who, how and what of communication.
>Perception is reality
Reputation is affected by perceptions.
>Act fast
Reticence ruins reputations.
>Be as open as you legally and practically can
Show you have nothing to hide.
>Show you care
See it from your audiences’ point of view.

Appendix 2 Glossary of terms
Activity Resumption Plan (ARP): Detailed guidelines for operational recovery teams to implement the resumption of normal business functions and services.
Back-up: A reserve copy of information which is deemed to be ‘Essential for Recovery’, including data and documentation.
Business Continuity (BC): The capability to continue essential business functions under all circumstances.
Business Continuity Institute (BCI): The world’s leading membership organisation for BC practitioners.
Business Continuity Management (BCM): Those management disciplines, processes and techniques which seek to provide the means for continuous operation of the essential business functions.
Business Continuity Plan (BCP): A set of procedures and processes to guide the Business Continuity Team in the tactical management of an incident.
Business Continuity Team (BCT): Staff responsible for the tactical management of an incident.
Business Impact Analysis (BIA): The process of identifying, and quantifying, the impacts on an enterprise of the effect of an incident, in both financial and non-financial terms.
Continuity Requirements Analysis (CRA): An assessment of the resources required for a resumption of activities.
Disaster: Any event which threatens or disrupts normal operations, or services, for sufficient time to affect significantly, or to cause failure of, the enterprise.
Disaster Recovery (DR): A term normally used to describe the process for restoration and recovery of IT equipment, functions and applications.
Incident: Any event which may be, or may lead to, a disaster.
Incident Management Plan (IMP): A framework document to guide the Incident Management Team in the strategic management of any incident.
Incident Management Team (IMT): Staff responsible for the strategic management of an incident.
Maximum Tolerable Period of Disruption (MTPD): The maximum period of time for which the business can afford to be without a critical function or process.
Risk Assessment (RA): An estimate of the likelihood of loss, interruption or disruption from known threats.
Recovery Point Objective (RPO): The point in a process or function which must be restored to enable continuity of the business operation to be maintained, or achieved.
Recovery Time Objective (RTO): The time scale within which a function or business unit must be restored, usually determined by means of a Business Impact Analysis.
Appendix 3 Further Information
Further information can be found on the following websites:
>www.gov.uk/government/policies/emergency-planning
Details of what the government is doing about emergency planning.
>www.thebci.org
The Business Continuity Institute (BCI): The world’s leading membership organisation for BC practitioners. The BCI’s Good Practice Guidelines are available from its website.
>www.continuitycentral.com
Continuity Central: a free source of news and information.
>www.noaa.gov
NOAA (National Oceanic & Atmospheric Administration): covers climate and weather patterns, including storm and hurricane forecasts.
>www.bankofengland.co.uk/financialstability
The Bank of England maintains monetary and financial stability of the United Kingdom.
>www.rothsteinpublishing.com/ppbc
J Burtles, Principles and Practice of Business Continuity: Tools and Techniques (ISBN 978-1-931332-39-2) Rothstein Associates Inc, 2007.

About BIFM
The British Institute of Facilities Management (BIFM) is the professional body for facilities management (FM). Founded in 1993, we promote excellence in facilities management for the benefit of practitioners, the economy and society. We support and represent over 16,000 members around the world, both individual FM professionals and organisations, and thousands more through qualifications and training. We promote and embed professional standards in facilities management. Committed to advancing the facilities management profession, we provide a suite of membership, qualifications, training and networking services designed to support facilities management practitioners in performing to the best of their ability.

BIFM Number One Building The Causeway Bishop’s Stortford Hertfordshire CM23 2ER T: +44 (0) 1279 712620 E: membership@bifm.org.uk www.bifm.org.uk
ISBN: 978-1-909761-17-9
Price: £19.99