Service Continuity Management for Enterprises Dedicated Plans Service Description

Published: October 2011

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2011 Microsoft Corporation. All rights reserved.

Microsoft, ActiveSync, Lync, Outlook, and SharePoint are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

Service Continuity Management Service Description (Dedicated Plans) | October 2011

Contents

Introduction
Service Continuity Overview
    Service Recovery Terms
Incident Classification
Catastrophic Outage Response
    Disaster Declaration
Annual Exercise Program
    Exercise Rescheduling

Introduction

Service continuity management focuses on the ability to restore services for Microsoft® Office 365 for enterprises customers in a predetermined timeframe during a critical service outage. Achieving restored services requires preparation, planning, technical implementation, exercises that simulate outages, and execution at the time of an incident.
This document describes the common approach to service continuity management that Microsoft has adopted for Office 365 services provided under dedicated subscription plans (“dedicated plans”).* The information applies to the following services:

- Microsoft Exchange Online
- Microsoft SharePoint® Online
- Microsoft Lync™ Online

In addition, the document defines Microsoft and customer responsibilities that relate to service continuity management for the purpose of compliance and auditing. The document specifically covers:

- Service continuity provisions
- Incident classification
- Catastrophic outage response
- Managed annual exercise program

* Services provided under Office 365 dedicated plans are delivered from a Microsoft hosting environment where each customer has their own dedicated data center hardware.

Service Continuity Overview

Microsoft Office 365 service offerings are delivered by highly resilient systems that help to ensure high levels of service. Services are hosted in Microsoft enterprise-level data centers that use the same world-class operational practices as the Microsoft corporate line-of-business applications. In addition, Microsoft Office 365 has taken advantage of the extensive experience that Microsoft has in hosting services, and its close ties to the Microsoft product groups and support services, to create a service that meets the high standards that customers demand.

Service continuity provisions are built into the Microsoft Office 365 system design. These provisions enable Office 365 services to recover quickly from unexpected events such as hardware or application failure, data corruption, or other incidents that affect users. These service continuity solutions also apply when a catastrophic event occurs, such as a natural disaster or fire within a Microsoft data center that renders the entire data center inoperable.
Service Recovery Terms

Three terms that are commonly used in service continuity management to evaluate disaster recovery solutions are:

- Recovery point objective (RPO). The acceptable amount of data loss at the conclusion of the data recovery process.
- Recovery time objective (RTO). The acceptable amount of time the service can be down before being brought back online.
- Failover. To relocate an overloaded or failed resource, such as a server, a disk drive, a network, or a data center, to its redundant, or backup, component.

The RPO and RTO parameters for Office 365 dedicated plans are provided in Table 1.

Table 1. Microsoft Office 365 Dedicated Plans RPO and RTO Parameters

Microsoft Office 365 Service    RTO                RPO
Exchange Online                 2 hours or less    45 minutes or less
SharePoint Online               4 hours or less    2 hours or less
Lync Online                     8 hours or less    8 hours or less

Incident Classification

An Office 365 service outage may be due to hardware or software failure in the Microsoft data center, a faulty network connection between the customer and Microsoft, or a major data center challenge such as fire, flood, or regional catastrophe. Most service outage incidents can be addressed using Microsoft technology and process solutions, and are resolved within a short period. However, some incidents are more serious and have the potential for lengthy outages.

Microsoft uses the Service Interruption Scale that is shown in Figure 1, which classifies outage incidents as minor, critical, and catastrophic events based on their impact to customers.

Figure 1: Service Interruption Scale

The way that Microsoft manages incidents that impact availability of Office 365 services is described in the document “Microsoft Office 365 Support and Service Management.”
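As an illustration of how the RPO and RTO parameters in Table 1 might be applied when reviewing a recovery, the following minimal Python sketch records the Table 1 values and compares them with an observed downtime and data-loss window. The names OBJECTIVES and meets_objectives, and the sample figures in the usage note, are hypothetical and are not part of any Microsoft tooling.

```python
from datetime import timedelta

# Objectives copied from Table 1; each value is an upper bound ("or less").
OBJECTIVES = {
    "Exchange Online": {"rto": timedelta(hours=2), "rpo": timedelta(minutes=45)},
    "SharePoint Online": {"rto": timedelta(hours=4), "rpo": timedelta(hours=2)},
    "Lync Online": {"rto": timedelta(hours=8), "rpo": timedelta(hours=8)},
}


def meets_objectives(service, downtime, data_loss_window):
    """Return (rto_met, rpo_met) for one recovery of the named service.

    downtime is how long the service was offline; data_loss_window is the
    span of data that could not be recovered. Both are illustrative inputs.
    """
    target = OBJECTIVES[service]
    return downtime <= target["rto"], data_loss_window <= target["rpo"]
```

For example, an Exchange Online recovery with 90 minutes of downtime and a 50-minute data-loss window would satisfy the RTO but not the RPO, so meets_objectives("Exchange Online", timedelta(minutes=90), timedelta(minutes=50)) evaluates to (True, False).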
Catastrophic Outage Response

Microsoft analyzes each outage that impacts Office 365 service availability to determine the scope of the incident and possible solutions. Outages that cause customer work stoppage may be considered catastrophic outages.

In the event of a catastrophic outage, the Microsoft incident management team sends the initial outage notification to the customer via e-mail unless the e-mail service for the customer is not functional; in that case, a phone call is made to an agreed-on customer telephone number. Status updates are provided to the customer every hour or as appropriate for the particular incident.

In addition, a Microsoft CritSit Manager helps ensure that outage notifications are received by customer executive contacts. The customer provides the most current contact information for these executives to the Microsoft Technical Account Manager (TAM) assigned to the customer. Initial contact with the executive contacts is made by phone within 60 minutes of the outage being declared. Additional follow-up frequency will be set and agreed upon by the executive contact.

Disaster Declaration

An outage may be declared a disaster if it is classified as a critical or catastrophic event based on the Service Interruption Scale (Figure 1). When an outage is declared a disaster, regular customer notifications are provided by the Microsoft incident management organization until a solution is found. Declaration of a disaster does not automatically result in failover of the customer’s redundant secondary site.

Customer Responsibilities

- Provide contact information. Provide a single email group alias and phone number so that Microsoft can engage appropriate personnel at the time of an event to review the current status of the outage, disaster declaration criteria, and approval or disapproval of failing over to the secondary site.
- Provide declaration support.
  Provide executive-level contacts to the Microsoft declaration authority to help determine if failover to the customer secondary site is necessary.

Microsoft Responsibilities

- Provide contact information. Provide a single email group alias and phone number so that the customer can engage appropriate personnel at the time of an event to review the current status of the outage, disaster declaration criteria, and approval or disapproval of failing over to the secondary site.
- Decide whether failover is required. Make the decision, with input from the customer, on whether to fail over to the customer secondary site.

Annual Exercise Program

At a customer’s request, Microsoft facilitates a service continuity exercise, a best practice that is advocated by business continuity specialists. The purpose of the service continuity exercise (commonly called a failover exercise) is to validate the incident management and service restoration solution and processes, and to provide Microsoft and customer management with all supporting documentation regarding the results of the exercise.

Microsoft provides for a maximum of one failover exercise for each contract year. The first exercise cannot be scheduled until after the purchased services are fully deployed, and subsequent exercises must be at least 12 months apart.

A typical service continuity failover exercise will:

- Simulate a catastrophic outage, and validate incident management processes and declaration criteria and processes.
- Execute failover to the alternate data center as appropriate for each service.
- Impact a small number of customer-selected user mailboxes, BlackBerry user accounts, and SharePoint site collections.

In addition to conducting failover exercises, Microsoft offers table-top exercises.
A typical table-top exercise will:

- Enable participants, either in person or online via a Lync meeting, to review documented plan activities in a stress-free environment.
- Effectively validate team roles and responsibilities during an emergency.
- Identify needs for training, documentation errors, missing information, and inconsistencies across business continuity plans for both Microsoft and the customer.

Microsoft recommends that customers consider conducting a table-top exercise prior to their first failover exercise. This initial table-top exercise can be conducted prior to completion of deployment and can prepare customer teams for their first full failover exercise after service deployment is complete.

Customer Responsibilities:

- Provide a project manager to work with the Microsoft service continuity project manager during the exercise planning, execution, and post-exercise phases.
- Participate in the pre-exercise, exercise, and post-exercise phases.
- Schedule at least one service continuity exercise. Microsoft recommends conducting the initial exercise within 10 working days after the customer migration to Microsoft Office 365 services is complete.
- Submit a written request to the TAM to schedule an exercise at least 90 days prior to the proposed exercise date. The request should identify a first and second choice for dates and timeframe, and define the service or services to be validated during the exercise.
- Provide written cancellation notice within 30 days of the scheduled exercise if the exercise is no longer required.

Microsoft Responsibilities:

- Provide a service continuity project manager to work with the customer-provided project manager during the exercise planning, execution, and post-exercise phases.
- Participate in the pre-exercise, exercise, and post-exercise phases.
- Respond in writing to the customer’s request for an exercise within 10 business days, with confirmation on resource availability and projected start date.
- Review the service continuity management exercise calendar to ensure that customer migration and provisioning are completed prior to scheduling an exercise, and to ensure that the requested exercise date does not conflict with other migration, deployment, upgrade, or maintenance work being managed by Microsoft teams.
- Determine the availability of the resources necessary for the exercise.
- Document exercise results via an executive summary and then work with the customer and Microsoft teams to close any issues encountered during the exercise.

Exercise Rescheduling

All scheduled exercises will be subject to immediate cancellation or termination if Microsoft or other Office 365 customers declare a disaster and request use of the resources that have been designated for the exercise. Exercises may also be postponed if there are active Catastrophic (Severity 1) or Critical (Severity A) issues in the customer environment 24 hours prior to the exercise. In the event an exercise is cancelled, reasonable effort will be made to reschedule it.
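The scheduling rules stated in the Annual Exercise Program section (full deployment before the first exercise, at least 12 months between exercises, and a written request at least 90 days before the proposed date) can be pre-checked by a customer project manager before a request is sent to the TAM. The following Python sketch is one possible reading of those rules, not a Microsoft tool; the function names, the fully_deployed flag, and the treatment of "12 months" as the same calendar date one year later are assumptions made for illustration.

```python
from datetime import date, timedelta

# A request must reach the TAM at least 90 days before the proposed date.
MIN_NOTICE = timedelta(days=90)


def one_year_after(d):
    """Same calendar date one year later (assumed reading of '12 months')."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # February 29 has no counterpart in the following year.
        return d.replace(year=d.year + 1, day=28)


def request_problems(request_date, proposed_date, fully_deployed, last_exercise=None):
    """Return a list of reasons a failover exercise request breaks the stated rules.

    An empty list means no rule in this sketch was violated; it does not
    mean Microsoft resources are available on the proposed date.
    """
    problems = []
    if not fully_deployed:
        problems.append("services are not fully deployed")
    if proposed_date - request_date < MIN_NOTICE:
        problems.append("request made less than 90 days before the proposed date")
    if last_exercise is not None and proposed_date < one_year_after(last_exercise):
        problems.append("less than 12 months since the previous exercise")
    return problems
```

For instance, a request dated 10 January 2012 for an exercise on 1 May 2012, when the previous exercise ran on 15 June 2011, gives enough notice but falls inside the 12-month window, so the function reports only that problem. The checks cover the customer-side timing rules only; postponement due to active Severity 1 or Severity A issues, or a disaster declared elsewhere, is decided at the time of the exercise and is not modeled here.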