Microsoft Operations Framework
Version 3.0
Published: January 2004
For information on Microsoft Operations Framework, see http://www.microsoft.com/mof

Process Model for Operations

Contents
Abstract
What’s New?
Introduction
Overview of the MOF Process Model
The MOF Process Model: An In-Depth View
Using the Team and Process Models Together
Where to Start?
Summary
Appendix: Resources
Contributors

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This paper is for informational purposes only.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred.

© 2004 Microsoft Corporation. All rights reserved.

Microsoft, MSDN, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Abstract

This paper describes the Microsoft® Operations Framework (MOF) Process Model, one of the two core MOF models. (The other is the MOF Team Model.) The MOF Process Model describes Microsoft’s approach to the IT operations and service management life cycle.
The Process Model organizes the life cycle into quadrants, with each quadrant having a specific focus and set of tasks that are carried out through its corresponding set of service management functions (SMFs).

The MOF Process Model for Operations is a foundational document. For the complete list and location of each of the publications in the MOF series, see the Resources section at the end of this document.

What’s New?

Since the creation of MOF version 1.0 in the summer of 1999, only minor adjustments have been made to the core models in order to maintain synchronization with the service management function and product operations guides that were being developed. This update, MOF 3.0, is the first coordinated update to the full suite of MOF core models. The key goals of this revision are to:

- Incorporate feedback from our customers, partners, advisory councils, and internal Microsoft MOF users.
- Align with ITIL version 2.0, which was released after the original version of MOF.
- Align with the latest release (2003) of Microsoft Solutions Framework version 3.0 and incorporate a broader, end-to-end IT life cycle perspective than the original version.
- Provide more customer-focused details on the business value of effective operations, with real-life examples and metrics.
- Improve integration between the Team and Process models.
- Make the MOF guidance more practical and easier to implement, and include more specific guidance on how and where to begin.
- Include a new Infrastructure Engineering SMF in the Optimizing Quadrant, based on customer and partner feedback about the need for this set of activities in the existing MOF content, as well as to complement the ITIL publication, ICT Infrastructure Management.
- Include a new Security Management SMF in the Optimizing Quadrant to complement the existing Security Administration SMF and to align with both the security management book released as part of ITIL 2.0 and the Microsoft Solutions for Security guidance.
  This new SMF is needed because of heightened business dependence on security planning, as well as customer and partner feedback.
- Revise the Changing Quadrant content with recent updates developed as part of the MOF Changing Quadrant course to ensure consistent guidance.
- Rename the Release Approved Review to Change Initiation Review to better reflect the underlying intent of the review. (The content has not changed, just the name.)
- Remove Print and Output Management as a high-level SMF and incorporate its content into the Storage Management SMF to better reflect the overall approach to storage, file, print, backup, and recovery on the Microsoft platform.

Introduction

A Demanding Environment

Today’s business environment places increasing demands (rapid change, financial constraints, security and reliability concerns, global interconnectedness) on IT organizations in order to meet the expanding needs of a wide variety of stakeholders. Technology advances have enabled IT organizations to better meet these demands and have created many new business opportunities. One drawback, however, is that teams dealing with new technology can quickly lose track of its business intent, and the business perception of the value of IT can be questioned.

At the same time, technology projects continue to challenge IT: many projects are unsuccessful or squander precious resources through poor-quality results. Moreover, enterprise businesses are increasingly dependent on their information technology systems and are therefore more vulnerable to failures in those systems.

While rock-solid technology is necessary to meet demands for reliable, available, and secure IT services, technology alone is not sufficient; excellence in processes and people (skills, roles, and responsibilities) is also needed.
According to industry analysts, 50 percent (or more) of all IT budgets is spent operating IT systems, and 80 percent of unplanned system downtime is caused by people and process failures. It is vital that enterprise businesses augment technology with skilled IT staff having well-defined roles and responsibilities and using effective IT operations processes and management skills.

Microsoft’s Approach

Microsoft understands the challenges facing today’s enterprise computing environments and has responded with best-in-class technology and proven best practice guidance on how to effectively design, develop, deploy, operate, and support solutions built on Microsoft technologies. This knowledge comes from Microsoft’s internal IT and operations experience with large-scale software development and service operation projects, its service consultants’ experience in conducting projects for customer organizations of all sizes and geographies, and the best knowledge from the worldwide IT industry.

The guidance is organized into two complementary and well-integrated bodies of knowledge, or “frameworks.” These are Microsoft Operations Framework (MOF) and Microsoft Solutions Framework (MSF).

Microsoft Operations Framework provides guidelines on how to plan, deploy, and maintain IT operational processes in support of mission-critical service solutions. MOF is a structured, yet flexible, approach based on:

- Microsoft consulting and support teams and their experiences working with enterprise customers and partners, as well as Microsoft’s internal IT operations groups.
- The IT Infrastructure Library (ITIL), which describes the processes and best practices necessary for the delivery of mission-critical service solutions.
- ISO 15504 (also referred to as SPICE), from the International Organization for Standardization (ISO), which provides a normalized approach to assessing software process maturity.
Microsoft Solutions Framework is an adaptable software development and deployment approach for successfully delivering technology solutions faster, with fewer people and less risk, while producing higher quality results. MSF has been widely used by Microsoft customers, partners, consulting services, and product teams and observed its tenth anniversary in 2003. MSF describes how to:

- Align business and technology goals.
- Establish clear project goals, roles, and responsibilities.
- Implement an iterative, milestone-driven process.
- Manage risk proactively.
- Respond to change effectively.

MSF is a disciplined approach to managing technology projects based on Microsoft internal practices, the experiences of Microsoft Services working with customers and partners, and industry best practices in software development and project management.

The IT Life Cycle

In delivering an effective portfolio of IT services to the business, IT operations and project teams should focus on three key objectives:

- Understand the business and operational needs for the service and create a solution that delivers these within the specified constraints.
- Effectively and efficiently deploy the solution to users with as little disruption to the business as the service levels specify.
- Operate the solution with excellence in order to deliver a service that the business trusts.

Microsoft Solutions Framework and Microsoft Operations Framework combine to provide a complementary, integrated set of guidance that addresses the need for a consistent and unified approach to the overall IT life cycle. The two frameworks work together to minimize the time to value, that is, the time between recognition of the need and delivery of the service. Consistency of terminology and concepts between the two frameworks also supports the delivery of a high-quality service by facilitating effective communications throughout the life cycle.
Within the overall IT life cycle, MSF and MOF follow four basic steps to create a new solution (or change to an existing one) and to operate that solution in a production environment; these are:

- Plan the solution using MSF and MOF. Seek first to understand the business and operational requirements in order to create the right solution architecture, design, project plans, and schedules.
- Build the solution using MSF. Create and complete the features, components, and other elements described in the specifications and plans using the appropriate development tools and processes.
- Deploy the solution using MOF and MSF. Implement a smooth deployment into the production environment using strong release management processes and automation.
- Operate the solution using MOF. Follow the MOF models and processes for solution and systems management to achieve and maintain operational excellence.

This approach recognizes that a change to a currently deployed solution can originate from an operations requirement, a new business requirement, or external factors such as regulatory requirements. These changes also need to follow the four basic steps of the life cycle and, depending on their complexity, can trigger either a new (MSF) project, or a smaller-scale request for change.

Microsoft’s view of the IT life cycle unites the varied activities that take place in an IT organization to ensure smooth, coordinated, and cost-effective delivery of IT services to the business. MOF and MSF target different, but integral, phases in the end-to-end IT life cycle. Each framework provides useful and detailed information on the people, processes, and tools required to successfully function within its respective area. Both MSF and MOF provide technology-agnostic guidance for improving IT processes that can be used in any environment.

Figure 1.
How MSF and MOF work together to meet business needs

How MOF Builds on ITIL

Since the 1980s, industry best practices for IT service management have been well documented within the IT Infrastructure Library (ITIL) from the Office of Government Commerce (OGC) in the United Kingdom. The OGC is a U.K. government executive agency chartered with development of best practice advice and guidance on the use of information technology in service management and operations. To accomplish this, the OGC charters projects with leading IT companies from around the world to document and validate best practices in the disciplines of IT service management.

The core ITIL guidance and publications are organized as follows:

ITIL Service Support
- Incident Management
- Problem Management
- Configuration Management
- Change Management
- Release Management
- Service Desk Function

ITIL Service Delivery
- Service Level Management
- Availability Management
- Capacity Management
- Financial Management
- IT Service Continuity Management

MOF “adopts and adapts” ITIL, and combines these collaborative industry best practices with specific guidelines for running on the Microsoft platform in a variety of business scenarios. MOF extends the ITIL code of practice to support distributed IT environments and current industry directions such as application hosting, mobile-device computing, and Web-based transactional and e-commerce systems.

Service Solutions and IT Service Management

Two concepts are key to understanding how Microsoft Operations Framework supports IT operations: service solutions and IT service management.

Service solutions are the capabilities, or business functions, that IT provides to its customers and users.
Some examples of service solutions are:

- Line-of-business (LOB) applications
- Messaging
- Knowledge management
- E-commerce
- Web services
- File and print services
- Information publishing
- Data storage
- Network connectivity

MOF embraces the concept of IT operations providing business-focused service solutions through the use of well-defined service management functions (SMFs). These SMFs provide consistent policies, procedures, standards, and best practices that can be applied across the entire suite of service solutions found in today’s IT environments.

Overview of the MOF Process Model

Simplifying the Approach to IT Management

Defining any high-level process model requires a compromise that balances simplicity and understanding with scientific accuracy. The world of IT operations is complex; it contains a multitude of operational environments and process dynamics that are difficult to capture and define with consistent accuracy. With so many processes, procedures, and communications happening simultaneously across a diverse set of systems, applications, and platforms, it is virtually impossible to model any particular environment exactly. In practice, a fully detailed and prescriptive model is generally inappropriate and cost prohibitive for most businesses to even attempt.

In contrast, the MOF approach is to simplify process definition into a high-level framework that is easy to understand and whose principles and practices are easy to incorporate and apply selectively or comprehensively. This simplified approach enables the operations staff of a business of any size, regardless of maturity level, to realize tangible benefits in the existing, or proposed, operations environment. The intent of the MOF Process Model is to provide a simple representation of the complex components and their relationships within the model.
Process Model Principles

The MOF Process Model assists the delivery and support of IT services by addressing the following four principles:

Structured architecture. The Process Model is built upon an architecture that provides a higher-level order for all the operational activities that must be addressed in mission-critical computing. This architecture provides the structure for process integration, life cycle management, mapping of roles and responsibilities, and overall management command and control. It also provides the underlying foundation for process automation and technology-specific operations.

Rapid life cycle, iterative improvement. The rate of change for IT operations continues to accelerate. This demand for change is in direct response to the needs of business to adapt and innovate to stay competitive. As a result, MOF promotes the concept of a rapid life cycle that supports both the ability to incorporate change quickly and to continuously assess and iteratively improve the overall operations environment. Recognizing that operations does not follow a sequential set of phases as in the typical IT development project, the MOF Process Model categorizes key operational activities into quadrants that make up a spiral life cycle, with the activities occurring in parallel, 24 hours a day, seven days a week.

Review-driven management. Within an IT operations organization, several methods and techniques are used to assist management in the control and oversight of the environment. MOF recommends and describes many of these methods in the details of its service management functions (SMFs). However, these methods and techniques alone are insufficient in obtaining the most from the IT investment. MOF inserts higher-level operations management reviews (OMRs) at key points within the life cycle. These reviews can be used to evaluate performance for release-based activities as well as steady state, or daily, operational activities.
The operations management reviews add significant value to MOF. Where ITIL points out that reviews of operations activity for efficiency and effectiveness should be conducted and describes these reviews at a high level, MOF makes these reviews an explicit part of the Process Model and provides detailed guidance on how to conduct them.

Embedded risk management. Where ITIL includes a discussion of handling risks in each IT operations process description (especially in availability, IT service continuity, and problem management), MOF elevates the management of risk to its own discipline and discusses risk in the context of each SMF and Team Model role cluster. Detailed guidance for operations risk management is provided in the Risk Management Discipline for Operations paper.

The MOF Definition of “Release”

An additional definition that will assist in understanding this document is the MOF definition of the term “release,” which has a specific meaning within the MOF context. A release is considered to be any change, or group of changes, that must be incorporated into a managed IT environment. These changes are not handled separately, but rather as a packaged release that can be tracked, installed, tested, verified, and/or uninstalled as a single, logical release. Under this definition, a release is any of the following:

- A new or updated line-of-business (LOB) system
- A new or updated website including content propagation
- New hardware (server, network, client, and so on)
- New or updated operations processes or procedures
- Changes in communication processes and/or team structures
- New infrastructure software
- Physical change in the building or environment

This broad definition of release supports the fundamental principle of managing changes in people, process, and technology in the provision of service solutions. (Note that this definition of a release is broader than the ITIL definition of a release.)
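The idea of a release as a single, logical package can be illustrated with a small data model. The following Python sketch is not part of MOF; the class names, the ChangeKind categories (paraphrased from the list above), and the state names (including the hypothetical starting state "packaged") are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ChangeKind(Enum):
    # Categories paraphrased from the MOF list of what a release can be.
    LOB_SYSTEM = "new or updated line-of-business system"
    WEBSITE = "new or updated website, including content propagation"
    HARDWARE = "new hardware"
    PROCESS = "new or updated operations process or procedure"
    TEAM = "change in communication processes or team structures"
    INFRASTRUCTURE_SOFTWARE = "new infrastructure software"
    PHYSICAL = "physical change in the building or environment"


@dataclass(frozen=True)
class Change:
    summary: str
    kind: ChangeKind


# Hypothetical state names; "packaged" is an assumed starting state.
RELEASE_STATES = {"packaged", "installed", "tested", "verified", "uninstalled"}


@dataclass
class Release:
    """One or more changes handled together as a single logical unit."""
    name: str
    changes: list[Change] = field(default_factory=list)
    state: str = "packaged"

    def transition(self, new_state: str) -> None:
        # The whole package moves together; no change is handled separately.
        if not self.changes:
            raise ValueError("a release must contain at least one change")
        if new_state not in RELEASE_STATES:
            raise ValueError(f"unknown release state: {new_state}")
        self.state = new_state


release = Release(
    "Messaging upgrade",
    [
        Change("Upgrade mail servers", ChangeKind.INFRASTRUCTURE_SOFTWARE),
        Change("Update operations guide", ChangeKind.PROCESS),
    ],
)
release.transition("installed")
```

Because the state lives on the release rather than on each change, a technology change and its accompanying process change are installed, verified, or uninstalled together, which mirrors the broad people, process, and technology scope of the MOF definition.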
Key Components of the MOF Process Model

The MOF principles just discussed are instituted through the key components of the Process Model architecture, which are:

- Quadrants.
- Operations management reviews.
- Service management functions.

Normal operations activities are defined and grouped into quadrants, each of which is focused on a particular set of processes and tasks.

Significant milestones in the operations life cycle are represented by operations management reviews, which are major decision and review points in the Process Model. These are essential in ensuring that key decisions involve the proper stakeholders, include all necessary information and expectations, and are well documented for future action.

Service management functions document specific processes that are performed within each of the quadrants. These are typically quite specific and prescriptive toward a single activity, for example, capacity management or help desk functions.

Each of these key components is described in detail below.

Overview of the MOF Quadrants

Where ITIL groups core operational processes into two sets, service support (processes associated with the end-users of IT services) and service delivery (those processes associated with the paying customers of IT services), MOF organizes these core ITIL processes, plus additional MOF processes, into four quadrants of the Process Model:

- Changing
- Operating
- Supporting
- Optimizing

Each of the quadrants has a unique mission of service that is accomplished through the implementation and execution of underlying operational processes and activities contained in the SMFs. For example, in the Changing Quadrant, the underlying SMFs are: Change Management, Configuration Management, and Release Management. Together, these functions comprehensively support the Changing Quadrant’s mission of service, which is to effectively identify, approve, control, and release changes to the IT environment.
The quadrants are represented in the model shown below. It is important to note that, although the model implies a sequential nature from quadrant to quadrant, in most cases activities from all quadrants will be occurring simultaneously. Each of the quadrants is briefly described below.

Figure 2. The MOF Process Model, showing quadrants and operations management reviews

Changing Quadrant

Once a release (which contains one or more changes) has been developed and tested and is deemed ready for deployment (following MSF or another software development or project management approach), the MOF life cycle starts with a Release Readiness Review to determine whether the release is ready and approved for deployment into the target environment. This review should not be the first time the release is evaluated in this manner but is rather a final review milestone prior to the actual deployment.

Following a successful Release Readiness Review (that is, a “go” decision), rollout preparations are completed and the release is deployed and becomes fully functional in the target environment following the process guidance provided in the Changing Quadrant SMF guides. The use of these SMFs will provide a process and task road map to help ensure a successful deployment and rollout for managed releases.

The post-implementation review then includes an evaluation to determine the degree of success and overall outcome of the deployed change(s) in the release. This is covered in greater detail later in this paper, and extensively in each of the quadrant’s SMF guides.

Operating Quadrant

Assuming a successful deployment, the release is now operational and the daily activities to run the system or application are being executed according to the operations guide for the system or service. The SMFs in this quadrant can be thought of as the typical data center activities, such as system administration, monitoring, batch processing, and so forth.
These activities ensure the smooth and predictable operation of the release.

It is crucial that, as the operations staff gains experience with a process, system, or application, the staff documents this experience and retains it in some sort of knowledge management system (or at least operations guide documents). Considering staff attrition rates and the fluctuating availability of qualified personnel, possession of such a knowledge base will enable the operations group to continue to provide consistent service levels to its customers. Having this knowledge and system documentation will also provide flexibility and added ease when workflow options for support and operations, such as outsourcing, are chosen.

Supporting Quadrant

The Supporting Quadrant contains the main SMFs required to provide ongoing support to the users of the IT service solutions. As with any process, system, application, or service, problems and issues inevitably will arise when operations begins. The support and operations staff must identify, assign, and resolve incidents and problems quickly to meet the requirements set forth in the service level agreements. The underlying SMFs in this quadrant include an integrated set of reactive and proactive resolution functions, which include service desk, incident management, and problem management.

Optimizing Quadrant

MOF recognizes that running IT operations successfully is an imperative to achieving business success in the competitive marketplace. The Optimizing Quadrant specifically addresses this idea by focusing on three fundamental elements of operations:

- Business-focused service level management
- IT cost management
- Planning for improvements and change

As noted earlier, the mission of service for this quadrant is to reduce costs while maintaining or improving service levels.
This is accomplished through the management and negotiation of service levels and the evaluation of several key operational metrics in the managed environment. With a thorough evaluation and subsequent understanding of these operational attributes, the IT staff moves from simply “running” a system to proactively managing a service solution.

Planning is a key focus of this quadrant, making sure that the IT organization looks as far as possible into the future in order to align with the ever-changing business priorities, for example, in planning growth in areas such as capacity or anticipating changes in the labor pool and predicting its effects on workforce management.

Placing the MOF quadrants into the logical order described here and applying the concept of iteration forms a spiral life cycle that can be applied to a specific application, a data center, or an entire operations environment with multiple data centers, including outsourced operations and hosted applications.

Overview of the Operations Management Reviews

The operations management reviews (OMRs) are a unique feature of MOF. Where ITIL points out that reviews of operations activity for efficiency and effectiveness should be conducted and describes these reviews at a high level, MOF makes these reviews an explicit part of the Process Model and provides detailed guidance on how to conduct them.

Although there are many reviews and process checks that take place in any IT environment, these OMRs are specifically labeled on the Process Model diagram because they warrant senior management attention and can be used as a regularly reported “health check” on the state of the operations organization. (This document provides an overview of them; see the Resources section for pointers to detailed OMR documents and templates.)
The operations management reviews are:

- Release Readiness Review
- Operations Review
- Service Level Agreement (SLA) Review
- Change Initiation Review

The Process Model incorporates two types of management reviews: release-based and time-based. Two of the four reviews, Change Initiation and Release Readiness, are release-based and occur at the initiation and final installation of a release into the target environment, respectively. The remaining two reviews, Operations and Service Level Agreement, occur at regular intervals to assess the internal operations as well as performance against customer service levels.

The reason for this mix of review types within the Process Model is to support two concepts necessary in a successful IT operations environment:

- The need to schedule and order the introduction of change through the use of managed releases. Managed releases allow for a clear packaging and scope of change that can then be identified, approved, tracked, tested, implemented, and operated. The release-based reviews accomplish this.
- The need to regularly assess and adapt the operational procedures, processes, tools, and people required to deliver and optimize the specific service solutions. The time-based reviews accomplish this.

The following table summarizes the mission of service and the operations management reviews for each of the four quadrants.

Table 1. Mission of Service and Operations Management Review for Each Quadrant

Quadrant: Changing
  Mission of Service: Introduce new service solutions, technologies, systems, applications, hardware, and processes.
  Operations Management Review: Release Readiness. Performed prior to new release.
  Evaluation Criteria:
  - The release (the changes)
  - The release package (all of the tools, processes, and documentation)
  - The target (production) environment and infrastructure
  - Rollout and rollback plans
  - The risk management plan
  - Training plans
  - Support plans
  - Contingency plans

Quadrant: Operating
  Mission of Service: Execute day-to-day tasks effectively and efficiently.
  Operations Management Review: Operations. Performed periodically.
  Evaluation Criteria:
  - IT staff performance
  - Operational efficiency
  - Personnel skills and competencies
  - Operations level agreements

Quadrant: Supporting
  Mission of Service: Resolve incidents, problems, and inquiries quickly.
  Operations Management Review: Service Level Agreement. Performed periodically.
  Evaluation Criteria:
  - SLA defined targets and metrics
  - Customer satisfaction
  - Costs

Quadrant: Optimizing
  Mission of Service: Drive changes to optimize cost, performance, capacity, and availability in the delivery of IT services.
  Operations Management Review: Change Initiation. Performed at change identification.
  Evaluation Criteria:
  - Cost/benefit evaluation of proposed changes
  - Impact to other systems and existing infrastructure

Overview of the MOF Service Management Functions

As noted earlier, service management functions (SMFs) are the underlying processes and activities within each MOF quadrant that support the mission of service for that quadrant. These SMFs are at the core of the MOF Process Model.

Although all of the SMFs are cross functional (and cross quadrant) in nature, each SMF is assigned a “home” or “primary” quadrant that aligns the functions performed with the mission of service for that quadrant. This natural alignment allows the IT manager to see all the key SMFs in each MOF quadrant that are required to effectively run the operations environment. The following diagram depicts the SMF alignment with each MOF quadrant in the Process Model.

Figure 3. MOF and IT service management functions

Many of the MOF SMFs shown in this diagram are based on ITIL. The notable exceptions are the SMFs in the Operating Quadrant, as well as the Workforce Management and Infrastructure Engineering SMFs in the Optimizing Quadrant.
As a result, the Operating Quadrant is where MOF provides most of the operations guidance specific to Microsoft products and technologies. In addition, due to the focus applied to IT operations and enterprise management by Microsoft, many products (both Microsoft’s and other vendors’) now incorporate features and functions directly designed to make them more supportable, reliable, and manageable in alignment with the MOF Process Model.

The MOF Process Model: An In-Depth View

Introduction

The following sections describe the MOF Process Model in more detail. Each section describes, for one quadrant:
• The quadrant’s definition, goals, and objectives.
• The MOF Team Model role cluster primarily involved in the quadrant.
• The operations management review.
• The service management functions.

Note: In the interest of brevity, only high-level descriptions of the individual SMFs are provided here. See the Resources section for pointers to the complete set of SMF guides.

Changing Quadrant

Definition, Goals, and Objectives

The Changing Quadrant includes the processes and procedures required to identify, review, approve, and incorporate change into a managed IT environment. Change includes hard and soft assets as well as specific process and procedural changes.

The goal of the change process is to introduce new technologies, systems, applications, hardware, tools, and processes, as well as changes in roles and responsibilities, into the IT environment quickly and with minimal disruption to service.

The objectives for the Changing Quadrant are:
• Effectively respond to genuine business needs and demands.
• Maintain managed environments in a known state.
• Manage changes as a quantifiable and qualitative package.
• Smoothly deploy reliable new services.
By specifying that changes must be approved before being made in the production environment, the mission of service for the Changing Quadrant incorporates the idea that changes should be inherently business driven, and that changes are made to create a competitive advantage. A fundamental principle of the Changing Quadrant is recognizing that the ability to quickly change and adapt the operations environment is a key, sustainable business advantage. Change management should be part of the entire project life cycle, not just part of steady-state operations.

In many cases, change management processes have been hindered or blocked by red tape and bureaucratic committees. MOF recommends that the degree of scrutiny and rigor applied to change evaluation and adoption be commensurate with the cost and risk associated with the change.

Team Model Role Cluster

The Release Role Cluster from the MOF Team Model is the primary role involved with each of the SMFs in this quadrant; the Release Role Cluster is also the connecting point to the Release Management Role Cluster within MSF, which initiates a solution deployment project.

Operations Management Review

The Release Readiness Review is the final management checkpoint and approval step before deploying a release. Through the Release Readiness Review, key attributes of a given release are assessed against standards, policies, and quality metrics, as well as release criteria that evaluate the readiness of the release, production environment, supporting release package, rollout and rollback plans, training plans, support plans, and the risk management plan. The Release Readiness Review results in a go/no-go decision about whether to deploy the release.

Following deployment of the release, the changes move to the change review process, which evaluates and measures the success of the release in the production environment and documents lessons learned for future releases.
This final review is called the post-implementation review (PIR), which is also documented within ITIL.

The SMFs

The following three service management functions support the Changing Quadrant:
• Change Management
• Configuration Management
• Release Management

MOF bases these SMFs on ITIL and extends them to include Microsoft-specific practices and additional industry best practices.

Change Management

The Change Management SMF is responsible for the process of documenting, assessing the impact of, approving, scheduling, and reviewing changes in an IT environment. A key goal of the change management process is to ensure that all parties affected by a given change are aware of and understand the impact of the impending change. Since most systems are heavily interrelated, any changes made in one part of a system may have profound impacts on another. Change management attempts to identify all affected systems and processes before the change is deployed in order to mitigate or eliminate any adverse effects. Typically, the “target” or managed environment is the production environment, but it should also include key integration, testing, and staging environments.

The categories of assets that should be placed under change control are broad and include, but are not limited to, hardware, communications equipment and software, system software, applications software, processes, procedures, roles, responsibilities, and any documentation relevant to the running, support, and maintenance of systems in the managed environment. In other words, any asset that exists in the environment and is necessary for meeting the service level requirements of the solution should be placed under change control. Changes are also rated by their impact and urgency, and ITIL provides a process flow for handling changes of different levels of importance.
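The impact and urgency ratings mentioned above could be combined into a processing priority along the following lines. This is a minimal sketch only: the three-level scale, the `ChangeRequest` field names, and the scoring rule are assumptions made for illustration and are not defined by MOF or ITIL.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Level(IntEnum):
    """Hypothetical three-step scale used for both impact and urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class ChangeRequest:
    """Illustrative request for change (RFC); field names are assumptions."""
    summary: str
    impact: Level
    urgency: Level
    affected_items: List[str] = field(default_factory=list)

    def priority(self) -> str:
        # Assumed rule: add the two ratings and bucket the sum.
        score = self.impact + self.urgency
        if score >= 5:
            return "urgent"
        if score >= 4:
            return "high"
        if score >= 3:
            return "normal"
        return "low"


rfc = ChangeRequest(
    summary="Apply service pack to messaging servers",
    impact=Level.HIGH,
    urgency=Level.MEDIUM,
    affected_items=["mail-server-01", "mail-server-02"],
)
print(rfc.priority())  # prints "urgent"
```

An organization would substitute its own scales and thresholds; the point is only that the rating is recorded with the change and drives how much scrutiny the change receives.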
Configuration Management

The Configuration Management SMF is responsible for identifying and documenting the components of the environment and the relationships between them. The goal of configuration management is to ensure that the current state is known, that only authorized components, referred to as configuration items (CIs), are used in the IT environment, and that all changes to CIs are recorded and tracked through the component life cycle. The information captured and tracked will depend upon the specific CI, but will often include a description of the CI, the version, constituent components, relationships to other CIs, location/assignment, and current status.

The information about the CIs should be held in a single logical data repository, referred to as the configuration management database (CMDB). Whenever possible, this database should be self-maintaining, with automated updates to CI records. CI records are the representation of the CIs in the CMDB, including attributes and relationships. At the enterprise IT level, this repository will often be a relational database with associated support tools, but for smaller organizations a spreadsheet may suffice. In addition, configuration management is responsible for maintaining the definitive software library (DSL), which serves as the repository for all master copies of software deployed in the IT environment.

Configuration management is often confused with asset management. Asset management is an accountancy process that is a subset of the overall configuration management process and includes depreciation and cost accounting. Asset management systems typically maintain information on assets above a certain value, their business unit, purchase date, supplier, and location. The relationship to other assets is not usually recorded and the information is primarily used to track the whereabouts of expensive equipment.
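To make the idea of CI records and their relationships concrete, the following Python sketch models a toy in-memory CMDB. It is an illustration under stated assumptions, not a prescribed design: the class names, field names, and methods are hypothetical, and a production CMDB would normally be a database with supporting tools, as noted above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConfigurationItem:
    """Illustrative CI record holding the attributes listed in the text."""
    ci_id: str
    description: str
    version: str
    location: str
    status: str
    components: List[str] = field(default_factory=list)  # ids of constituent CIs
    related_to: List[str] = field(default_factory=list)  # ids of related CIs


class Cmdb:
    """Toy in-memory CMDB (hypothetical); keeps a simple change history."""

    def __init__(self) -> None:
        self._items: Dict[str, ConfigurationItem] = {}
        self.history: List[str] = []

    def register(self, ci: ConfigurationItem) -> None:
        self._items[ci.ci_id] = ci
        self.history.append(f"registered {ci.ci_id} v{ci.version}")

    def update_version(self, ci_id: str, new_version: str) -> None:
        ci = self._items[ci_id]  # raises KeyError for an unregistered CI
        self.history.append(f"{ci_id}: {ci.version} -> {new_version}")
        ci.version = new_version

    def impacted_by(self, ci_id: str) -> List[str]:
        """Return ids of CIs that list ci_id as a component or relation."""
        return sorted(
            other.ci_id
            for other in self._items.values()
            if ci_id in other.components or ci_id in other.related_to
        )


cmdb = Cmdb()
cmdb.register(ConfigurationItem("db-01", "Database server", "1.0", "Rack 4", "live"))
cmdb.register(ConfigurationItem("crm-app", "CRM application", "2.3", "Rack 2", "live",
                                related_to=["db-01"]))
cmdb.update_version("db-01", "1.1")
print(cmdb.impacted_by("db-01"))  # prints ['crm-app']
```

Recording relationships is what distinguishes this from an asset register: a question such as "what depends on db-01?" can be answered before a change is approved.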
Release Management

The focus of the Release Management SMF is to facilitate the introduction of software and hardware releases into managed IT environments and to ensure that all changes are deployed successfully. Typically, this includes the production environment as well as the managed preproduction environments. Release management coordinates and manages all releases and is typically the coordination point between the development release team and the operations groups responsible for deploying the release into production. In combining MSF and MOF in an end-to-end IT life cycle, this is the key point at which MSF-developed projects and solutions integrate fully with the MOF deployment process into a release product.

The oversight role of release management is critical in the successful deployment of complex releases that often involve multiple service providers, operations centers, and user groups. Good resource planning and management are essential to successfully packaging and distributing such releases to customers. Release management takes a holistic view of a change to an IT service and ensures that all aspects of a release are considered together, both technical and non-technical.

Releases should be defined, maintained, and scheduled for each IT service. Most organizations today implement changes on an as-needed basis or, worse, do not implement proactive changes such as service packs at all. The concept of releases and release management allows organizations to proactively schedule most changes so that high-importance and emergency changes that do not fit the change cycle are the exception, not the rule.

Operating Quadrant

Definition, Goals, and Objectives

The Operating Quadrant includes the IT operating standards, processes, and procedures that are regularly applied to service solutions to achieve and maintain service levels within predetermined parameters.
To successfully perform the underlying service management functions within this quadrant, the operations staff must ensure that specific technical guidance exists for a given service solution. Documented operations guides are the primary means for providing the prescriptive guidance and include the tasks and step-by-step procedures necessary to ensure the service solution is available and performs to stated requirements. They also reference standard service management functions and any required adaptation to these functions. Operations guides based on MOF now exist for many Microsoft server products and are available on Microsoft’s website.

The goal of the Operating Quadrant is the highly predictable execution of day-to-day tasks, both manual and automated.

The objectives of the Operating Quadrant include:
• Ensure that operations guides exist and are kept current for every service solution.
• Manage operating level agreements between the teams in support of the customer SLA.
• Provide automation to proactively monitor and self-heal system problems to the greatest extent possible.

Team Model Role Cluster

The Operations Role Cluster within the MOF Team Model is the primary role involved with the various SMFs in this quadrant.

Operations Management Review

The Operations Review is the management review within the Operating Quadrant. The primary goal of the Operations Review is to assess the effectiveness of internal operating processes and procedures and make improvements as appropriate. This review focuses on internal processes and procedures contained in the operating level agreements (OLAs) designed to support and fulfill the customers’ service level requirements, as well as how those activities can be improved. The information gathered in this review may be used in the customer-facing SLA Review. These improvements should go through the Change Management SMF processes described earlier.
A secondary goal of the Operations Review is to validate that the operations staff has documented their day-to-day activities and tasks in a corporate knowledge management system. This ensures that the key operational knowledge remains current and accessible to all members of the operations staff.

The SMFs

The seven service management functions in the Operating Quadrant are:
• System Administration
• Security Administration
• Directory Services Administration
• Network Administration
• Service Monitoring and Control
• Storage Management
• Job Scheduling

The implementation of these SMFs will vary depending on the type of service solution being provided. In relating the MOF SMFs to ITIL, the SMFs in the Operating Quadrant are the most distinctive in that they are not based on any foundational processes provided by ITIL. Instead, these SMFs focus on the key operational activities required to manage a distributed computing environment, whether on the Microsoft platform or any other. They are also distinct from other SMFs within MOF in that they contain the majority of technical and/or product-specific guidance from Microsoft around MOF processes.

System Administration

System administration is something of an umbrella process that is responsible for generally keeping IT systems running. The System Administration SMF administers centralized and distributed processing environments and often spans several tiers of operations and support. System administration is often referred to as operations management, both of which are very broad terms that need to be clarified within the specific IT organization or technology platform. However, system administration typically means overseeing a larger, enterprise-level organization and the administrative duties performed with that kind of organization.

System administration includes responsibility for:
• Application management.
• Operating system administration.
• Messaging administration.
• Database administration.
• Web server administration.
• Telecommunications systems administration.

Security Administration

At the highest level, the Security Administration SMF is responsible for maintaining a safe computing environment. Security is a critical part of the IT infrastructure; an information system with a weak security foundation will eventually experience a security breach.

The primary goals of security administration are to ensure:
• Data confidentiality. Only authorized individuals should be able to access data.
• Data integrity. All authorized users should feel confident that the data presented to them is accurate and not improperly modified.
• Data availability. Authorized users should be able to access the data they need, when they need it.

The Security Administration SMF deals specifically with the day-to-day activities and tasks related to maintaining and adjusting the IT security infrastructure. In general, this includes daily risk management and mitigation, security administration based on the platform and technology being used, patch management administration, security incident management, and auditing and intrusion detection. Defined in terms of security policy and planning, the IT security infrastructure is further discussed in the Security Management SMF in the Optimizing Quadrant later in this document.

Directory Services Administration

Directory services allow users and applications to find network resources such as users, servers, applications, tools, services, and other information. The Directory Services Administration SMF deals with the day-to-day operations, maintenance, and support of the enterprise directory. The goal of directory services administration is to ensure that information is accessible through the network using a simple and organized process by any authorized requester.

Directory services administration addresses:
• Directory-enabled applications.
• Metadirectories.
• User, group, and resource creation, management, and deletion.
• Daily support activities such as monitoring, maintaining, and troubleshooting the enterprise directory.

In addition to the SMF guide for directory services, there is a detailed operations guide for Microsoft Active Directory available on Microsoft.com.

Network Administration

Network administration is the process of managing and running all networks within an organization. The Network Administration SMF is responsible for the administration and maintenance of the physical components that make up the organization’s network, such as servers, routers, switches, and firewalls. Network administration must ensure that the network operates efficiently at all times to avoid any negative impact to the operation of the enterprise. This SMF works closely with the Infrastructure Engineering SMF (in the Optimizing Quadrant), which defines the architecture, topology, and components of the IT infrastructure.

Network administration covers:
• Local area networks, including wireless and Internet access for employees.
• Wide area networks and storage area networks.
• Virtual private networks, including remote and dial-up access, as well as broadband and mobile devices.
• Daily support activities such as monitoring, maintaining, and troubleshooting all networked components including hardware.

Service Monitoring and Control

Service monitoring allows the operations staff to observe the health of an IT service in real time. Within a distributed process environment, the accurate monitoring of a system is complicated by the integration of systems with partners and suppliers in automating a given company’s value and supply chain.
To ensure the IT service remains available, the Service Monitoring and Control SMF is typically responsible for monitoring the following system components:
• Process heartbeat
• Job status
• Queue status
• Server resource loads
• Response times
• Transaction status and availability

However, knowing the current health of a service or determining where a service outage might occur is of little benefit unless the operations staff has the ability to resolve it, or at the very least notify the appropriate group that a specific type of reactive or proactive action needs to occur. This is what is meant by the term “control.” When combined and implemented properly, the Service Monitoring and Control SMF provides the critical capability to ensure that service levels are always in compliance.

Storage Management

Storage management includes a great number of individual components such as servers, storage hardware, storage software, storage networks, tools, and operational processes that must be seamlessly melded together so that businesses can reliably safeguard their data while trying to realize cost and efficiency improvements. Businesses and organizations are also contending with explosive data growth as more and more information is stored electronically. Ensuring that these systems, and their stored data, keep operating is a critical part of business planning.

While the Storage Management SMF lies within the Operating Quadrant and now includes the Print and Output Management SMF, it is intricately tied with the Optimizing Quadrant SMFs of Capacity Management, IT Service Continuity Management, Availability Management, and Security Management. Business continuance is the process of ensuring that critical data and systems remain available even if hardware, software, or environmental problems interrupt the primary servers’ normal operation.
Storage management also works with the other SMFs in the Operating Quadrant to ensure that operating level agreements are achieved for items such as recovery time objectives and availability metrics, which then enable the customer’s SLA requirements to be met.

Job Scheduling

Job scheduling involves the continuous organization of (batch) jobs and processes into the most efficient sequence, maximizing system throughput and utilization to meet SLA requirements. The Job Scheduling SMF is closely tied to the Capacity Management and Service Monitoring and Control SMFs.

The goal of job scheduling is to ensure that:
• SLAs and user requirements are met.
• Available capacity is used most effectively (the workload running at any given time does not exceed the acceptable capacity levels).

Job scheduling entails defining:
• Job schedules. The workloads are organized by time periods (daily, weekly, monthly, annually) and jobs are scheduled for execution according to business needs, length of job, storage requirements, and associated dependencies.
• Scheduling procedures. Schedules are set up and maintained, conflicts and problems pertaining to scheduling are managed, and special needs (for example, as-needed jobs) are accommodated.
• Batch processing. Jobs are executed according to the work schedule, run priority, and job dependencies.

Supporting Quadrant

Definition, Goals, and Objectives

The Supporting Quadrant includes the processes, procedures, tools, and staff required to identify, assign, diagnose, track, and resolve incidents, problems, and user/customer requests within the approved requirements delineated in the service level agreement.

The key goal of the Supporting Quadrant is the timely resolution of these incidents, problems, and inquiries for end users of the IT services provided.
The SMFs within this quadrant achieve this goal through the following objectives:
• Ensure that both reactive and proactive functions are in place to manage service levels.
• Prioritize the service desk’s focus on meeting customer needs and business requirements.
• Work with the Operating Quadrant’s SMFs in monitoring for issues before they affect the user.

The reactive functions depend on an organization’s ability to respond to and resolve incidents and problems quickly. The more desirable, proactive functions try to avoid any disruption in service in the first place by identifying root causes and resolving problems before any service levels are impacted. This is primarily achieved through effective monitoring of the service solution against predefined thresholds and by giving the operations staff time to resolve potential problems before they develop into service disruptions.

Although MOF defines these support processes in the Supporting Quadrant, they are an integral part of the daily functioning of every other quadrant, particularly the Operating Quadrant in tracing problems to their root cause.

Team Model Role Cluster

The Support Role Cluster in the MOF Team Model is the role cluster most closely involved in implementing and facilitating the Supporting Quadrant SMFs. However, with the addition of the Service Role Cluster (in MOF version 3.0), the two role clusters (Service and Support) could potentially both be involved in the Supporting Quadrant SMFs. For example, the Service Role Cluster owns the overall end-to-end management of a specific service, such as provision of a messaging service solution across the organization, which would include the operations representation in the design and deployment of the solution as well as the operations aspects of the solution.
The Support Role Cluster would work closely with the Service Role Cluster in this case, and would focus specifically on the service level and user support of the messaging solution. (These role clusters are discussed in detail in the MOF Team Model for Operations paper.)

Operations Management Review

The SLA Review is the operations management review that assesses the effectiveness of the IT operations group in delivering the agreed-upon service levels contained in the mutually approved (by both the customer and IT) SLA. This review focuses its assessment on the delivery of services to the customer and end users and is complementary to the Operations Review discussed earlier. Whereas the Operations Review focuses on internal operational efficiencies, the SLA Review focuses on external end-user service levels and any changes required to address inadequacies in these services.

MOF recommends that customers, end users, and the operations staff use the SLA Review on a regularly scheduled basis (for example, monthly or quarterly) to monitor service delivery and to identify changes required in service levels, system functionality, new business requirements, and/or key process changes.

The SMFs

The Supporting Quadrant includes the following service management functions:
• Service Desk
• Incident Management
• Problem Management

MOF bases these SMFs on ITIL and extends them to include Microsoft-specific practices and additional industry best practices.

Service Desk

The Service Desk is the overarching service management function of the Supporting Quadrant, which is also reflected in how ITIL refers to the service desk as a “function” rather than as a process, the only exception of its kind in ITIL. The Service Desk SMF provides guidance on setting up and running the organizational unit or department that is the single point of contact between the users and provider of IT services.
It coordinates all activities and customer communications about incidents, problems, and inquiries related to production systems. Requests come to the service desk for help on solving issues and problems across a vast array of applications, communication systems, desktop configurations, and facilities.

The Service Desk focuses on two areas: execution of the Supporting Quadrant SMFs and optimization of the service desk processes.

Execution of the Supporting Quadrant SMFs includes:
• Managing the service desk staff.
• Monitoring service desk performance.
• Managing costs and charges.
• Reporting to management.

Optimization of service desk processes includes:
• Comparing actual performance to commitments (for example, comparing service level metrics with customer and operations team’s OLAs and SLAs) and to industry benchmarks (such as Help Desk Institute).
• Optimizing headcount and staffing levels.
• Monitoring and continually assessing and improving service desk workflow and business processes.
• Monitoring and continually assessing and improving tools and technologies used in automating service desk activities.

Incident Management

Incident management is the process of managing and controlling faults and disruptions in the use or implementation of IT services, including applications, networking, hardware, and user-reported service requests. The effective management of incidents is a complex process that requires interaction with many other service management functions, most notably the Service Desk, Problem Management, Configuration Management, and Change Management SMFs. Because of this complexity and the need for clear communication about an incident, a robust incident taxonomy has been developed to facilitate incident management.

The following list provides the key definitions within this taxonomy and summarizes the principal activities within the Incident Management SMF:
• Incident communication. Communicating to the enterprise the existence of and current status of service-disrupting incidents.
• Incident control. Ensuring that incidents are resolved as quickly as possible with minimal impact.
• Incident origin determination. Determining the infrastructure component or components that are causing the disruption.
• Incident recording. Ensuring that incidents are recorded as quickly as possible into the appropriate databases and support tools.
• Incident alerting. Communicating to all involved in the incident in order to ensure that action toward resolution is immediate.
• Incident diagnosis. Accurately determining the nature and cause of the incidents.
• Incident classification. Recording the incident and accurately allocating the correct degree of resources required for resolution.
• Incident investigation. Researching to determine if the incident is unique or has been experienced before.
• Incident support. Providing support throughout the entire life cycle of the incident in order to resolve the incident as quickly as possible and with the least impact to business processes.
• Incident resolution. Resolving the incident as quickly as possible through the effective use of all appropriate tools, processes, and resources available.
• Incident recovery. Returning the affected environment to stability once the incident has been resolved.
• Incident closure. Effecting proper closure of the incident, ensuring that all pertinent data surrounding the life cycle of the incident is properly discovered and recorded.
• Incident information management. Properly recording and categorizing incident-related information for future use by all levels and organizations within the enterprise.
Problem Management

The key goal for problem management is to ensure stability in service solutions by identifying and documenting errors from the IT infrastructure, and either creating a workaround or initiating a request for change (RFC) to resolve or eliminate the root cause (where supported by the business case for doing so). The Problem Management SMF takes the lead in structuring the escalation process of investigation, diagnosis, resolution, and closure of problems.

Problem management is closely interrelated with incident management performed at the service desk level. To better understand this interrelationship, it is necessary to understand the differences between incidents, problems, and known errors. The following table lists these key definitions.

Table 2. Comparison of Problem Management Terms

Item: Definition
Incident: Any event that deviates from the expected operation of a system or service.
Problem: A condition identified from multiple incidents exhibiting common symptoms, or from a single significant incident, indicative of a single error, for which the cause is unknown.
Known error: A condition identified by successful diagnosis of the root cause of a problem when it is confirmed which configuration item is at fault, and a temporary or permanent fix or workaround is in place.

The following diagram depicts the interrelationship of these items and the resultant connection with the Change Management SMF.

[Diagram labels: Incident Mgmt. (incidents); Problem Mgmt. (problem, known error, CI at fault); RFC; Change Mgmt. (change)]

Figure 4. Incident, problem, and change management relationship

An important aspect of problem management not to be overlooked is that problem management works proactively to prevent problems from occurring. For example, problem management works with availability management to ensure that increased redundancy is built into mission-critical systems and infrastructure components.
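The relationship between incidents, problems, known errors, and RFCs defined in Table 2 can be sketched as a small data model. The Python below is illustrative only; the class names, field names, identifiers, and the text of the generated RFC are assumptions for this example rather than part of MOF.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    incident_id: str
    symptom: str


@dataclass
class Problem:
    """A condition behind one or more incidents; cause unknown until diagnosed."""
    problem_id: str
    incidents: List[Incident] = field(default_factory=list)
    ci_at_fault: Optional[str] = None  # set once the root cause is diagnosed
    workaround: Optional[str] = None   # temporary or permanent fix

    @property
    def is_known_error(self) -> bool:
        # Per Table 2: faulty CI confirmed and a fix or workaround in place.
        return self.ci_at_fault is not None and self.workaround is not None

    def diagnose(self, ci_at_fault: str, workaround: str) -> None:
        self.ci_at_fault = ci_at_fault
        self.workaround = workaround

    def raise_rfc(self) -> str:
        """Hand off to change management; only valid for a known error."""
        if not self.is_known_error:
            raise ValueError("diagnose the problem before raising an RFC")
        return f"RFC: correct {self.ci_at_fault} (resolves {self.problem_id})"


problem = Problem(
    "PRB-1",
    [Incident("INC-1", "mail queue stalls"), Incident("INC-2", "mail queue stalls")],
)
problem.diagnose("mail-server-01", "restart the transport service nightly")
print(problem.raise_rfc())  # prints "RFC: correct mail-server-01 (resolves PRB-1)"
```

Keeping the incident list on the problem record preserves the link back to the service desk, so that related incidents can be closed once the resulting change is deployed.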
Optimizing Quadrant

Definition, Goals, and Objectives

The Optimizing Quadrant’s activities entail planning and improving all aspects of IT service management, with a proactive, long-term holistic view of all the processes within the MOF Process Model. This quadrant’s functions include review of outages/incidents; examination of cost structures, staff assessments, and availability; and performance analysis as well as capacity forecasting.

The goal of the Optimizing Quadrant is to add value to the business through the optimization of cost, performance, capacity, and availability in the delivery of IT services.

The objectives of the Optimizing Quadrant include:
• Identify short- and long-term recommendations for changes that will lower IT costs.
• Assess and identify ways to improve or streamline processes and improve service levels across the IT organization.
• Align with the business growth and direction to evaluate existing operations and forecast future activity for IT operations.

Team Model Role Cluster

Several role clusters could be actively involved with the SMFs in this quadrant depending on the organizational structure, size, geography, and business model. These include the Security, Infrastructure, Partner, and Service role clusters, which are described in detail in the MOF Team Model for Operations paper.

Operations Management Review

The Change Initiation Review is the operations management review in which changes are evaluated for cost and benefits and in turn become the catalyst for the Changing Quadrant to begin executing the release. Changes may originate from anywhere: internal to IT, from the business, from suppliers and partners, or from any external source. The Change Initiation Review (formerly called the Release Approved Review) results in the formal approval of a proposed change, or set of changes, to be developed and packaged into a defined release.
(The Change Initiation Review aligns with the change authorization process in ITIL.) This review is key to the operations environment because it begins the investment cycle for operations planning and deployment of a given release.

The goal of the Change Initiation Review is to ensure that due diligence is performed in the cost-benefit analysis of proposed changes. This is critical in deciding how best to spend the limited IT resources of any organization. It also ensures that the operations staff is appropriately represented in the decision-making process for these IT investments.

In larger or more complex projects, this review corresponds directly to the MSF Project Plans Approved Milestone, which is the official approval to build the product/solution according to the defined specifications and timelines. It is at this point that money, people, and equipment begin to come together to make the release a reality.

The SMFs

The following eight service management functions lie within the Optimizing Quadrant:

- Service Level Management
- Financial Management
- Capacity Management
- Availability Management
- IT Service Continuity Management
- Workforce Management
- Security Management
- Infrastructure Engineering

The first five SMFs are based upon ITIL and have been extended to include Microsoft-specific practices and additional industry best practices. The sixth SMF, Workforce Management, is based on industry and Microsoft best practices. The last two SMFs, Security Management and Infrastructure Engineering, are new in MOF 3.0. Identified as a content gap in earlier releases, they are based on experience with customers and partners using and implementing MOF.

Service Level Management

MOF advocates the best practices of IT service delivery, and the Service Level Management SMF specifically provides a structured way for consumers and providers of IT services to meaningfully discuss and assess how well a service is being delivered.
The primary objective of service level management can be summarized as providing the mechanism for setting clear expectations with the customer and user groups about the service being delivered and then measuring performance against these requirements. Satisfied customers are a result of first setting clear expectations and then consistently meeting those expectations through execution.

The key activities within the Service Level Management SMF include:

- Creating a service catalog.
- Identifying and negotiating service level requirements for service level agreements.
- Ensuring that service level requirements are met within financial budgets.
- Setting accounting policies.
- Monitoring and reviewing support services.

Financial Management

The MOF Financial Management SMF is based entirely on the ITIL process by the same name. Its importance lies in the large role that financial management plays in controlling the overall costs that the business incurs because of IT. (Latest analysts’ figures show that 50–60 percent, or more depending on which source is referenced, of an entire IT organization’s budget is typically spent on IT operations.) The efficient and effective use of MOF and ITIL best practices and guidance helps to reduce the percentage required by IT, thereby enabling the money to be allocated to other parts of the business, such as innovating new products and services.

Financial management activities include budgeting, IT accounting, charging (or charge-back models), and system decommissioning. Budgeting includes predicting and controlling the spending of money within the organization. Financial management ensures that any service solution proposed to meet the needs identified from a request for change is justified from a cost and budget standpoint. This is often referred to as a cost-benefit analysis and is included in an organization’s forecasting as well.
The IT accounting process enables the IT provider to account for how its money is spent, for example, costs by customer, service, activity, organization, or any other of the myriad ways accounting is performed.

Charging is the activity required to bill customers for services. In addition, many corporations today are utilizing cost allocation or charge-back models where business units are funding their own key IT projects. This places more accountability for the business value of IT projects in the hands of those who must justify the expenditure and prove the benefits. A consequence of these models is that they put more pressure on the IT groups to become more efficient and cost effective. With the surge in IT outsourcing, application hosting, and e-commerce, charging and charge-back models are becoming integral components of business operations.

One more activity within the Financial Management SMF addresses system decommissioning or retirement. Far too often, a system or application is deployed and continues to be supported far past its useful life span. It is critical that systems be assessed over time to consider not only upgrades and new functionality, but also replacement, outsourcing, or simple retirement. Financial as well as business intelligence must be considered when making these types of assessments.

Capacity Management

Capacity management is defined as the process of planning, sizing, and controlling business, service solution, and resource capacity such that it satisfies user demand within the performance levels established in the capacity plan and service level agreements. The Capacity Management SMF consists of the following three components:

- Business capacity management. Responsible for ensuring that the future business requirements for IT services are considered, planned, and implemented in time to be in place and functioning when the business needs them.
- Service capacity management. Focuses on the management of the performance of the production, operational IT services used by the customers.
- Resource capacity management. Focuses on the management of the individual components of the IT Infrastructure.

Accomplishing these activities requires information about usage scenarios, patterns, and peak load characteristics of the service solution as well as stated performance requirements. Obviously, server and network capacity are key components to overall capacity and, based on the usage scenarios, the IT operations staff can set predetermined thresholds that will indicate when additional capacity is required.

In addition to system parameters, it is important to consider staffing levels in capacity planning. As a service solution is required to scale to larger and larger loads, the manual activities associated with the solution may require an increased number of resources to support the increased load. An obvious example of this would be the service or help desk. Increases in user loads will generally increase the number of incidents that must be addressed.

An often overlooked element of capacity planning is the operational processes themselves. Many times the processes deployed to deliver a service solution are not reevaluated when user volume increases until process response times become problematic. Analysis typically discovers that the process, while perhaps adequate for low user volumes, could not scale to support the increased loads. Thus, “process scale” must be examined on a regular basis along with the more traditional system parameters.

Availability Management

The singular goal of the Availability Management SMF is to ensure that customers can use a given IT service when they need it. A goal of maximum availability (with total annual downtime measured in just minutes for most organizations) is a worthy objective for any operations staff to achieve.
Ensuring high availability for a service solution must begin early in the software or service development process. Here again, Microsoft frameworks add great value in that MSF is “operations aware,” or in other words, is complementary to MOF, and is positioned to ensure that designing for availability, reliability, manageability, and maintainability occurs and is documented in the specification of the product or service (beginning with the project’s Envisioning Phase). Whether the service solution is an off-the-shelf package, custom application, or outsourced operation, high availability cannot be achieved without a solid technical architecture and system design.

Assuming the service solution has been constructed to achieve high-availability requirements, it then becomes necessary to support the service with solid operational processes and skilled people. These latter elements are the key focus of this Availability Management SMF.

Availability is related to, but different from, reliability. Reliability, in statistical terms, measures how frequently and at what intervals the system fails, whereas availability measures the percentage of time the system is in its correct operational state.

The common method for calculating availability is to subtract downtime from total time and divide by total time. These numbers must be obtained from the service level agreement requirements in order to be accurate and meaningful. For example, downtime is defined as the occasions when users cannot utilize the service at the times prescribed by the SLA. For instance, if the SLA specifies six hours of downtime every Saturday for maintenance on a reporting system, those hours do not count as downtime that detracts from the availability of the service because they were originally agreed to in the user expectations. Total time is the number of hours that the service should be available for use as defined in the SLA.
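The calculation described here, availability = (total time - downtime) / total time, can be sketched in a few lines of Python. This is a minimal illustration and not a MOF-defined procedure: the function names are hypothetical, the service is assumed to be expected around the clock outside the agreed maintenance window (the SLA example in the text states only the six-hour window), and the 1.62-hour outage figure is invented for the arithmetic.

```python
HOURS_PER_WEEK = 7 * 24  # 168 clock hours in a week


def agreed_service_hours(maintenance_hours: float,
                         period_hours: float = HOURS_PER_WEEK) -> float:
    """Total time: hours the SLA says the service should be usable in the period."""
    if not 0 <= maintenance_hours < period_hours:
        raise ValueError("maintenance window must be shorter than the period")
    return period_hours - maintenance_hours


def availability(total_hours: float, downtime_hours: float) -> float:
    """Availability = (total time - downtime) / total time, as a fraction of 1."""
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    if not 0 <= downtime_hours <= total_hours:
        raise ValueError("downtime must lie between 0 and total time")
    return (total_hours - downtime_hours) / total_hours


if __name__ == "__main__":
    # Agreed six-hour Saturday maintenance window: 168 - 6 = 162 hours of total time.
    total = agreed_service_hours(maintenance_hours=6)
    # Hypothetical unplanned outage that fell inside the agreed service hours.
    unplanned = 1.62
    print(f"Weekly availability: {availability(total, unplanned):.2%}")
```

With these assumed figures the week works out to 99 percent availability. Because the maintenance window is removed from total time and never counted as downtime, only outages inside the agreed service hours lower the result.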
In this example, the weekly six-hour maintenance window would be subtracted from the total time.

So, the key question is, how do you improve your system’s availability? The only variable in the equation that will affect this is downtime. Re-examine availability in the context of downtime. In looking at likely causes of downtime, the operations and support staffs require accurate configuration data as well as access to the incident and problem records. Changes may result from initiatives to improve service reliability and availability. The availability process manager must assess RFCs to establish their likely effect on reliability and availability and then review the implemented changes for their actual effect.

The ITIL book on availability management considers availability management processes in great depth and detail. It also provides additional resources and templates, such as how to create an availability plan and how to use the IT Availability Metrics Model (ITAMM). See the Resources section for the website for ITIL books.

IT Service Continuity Management

The IT Service Continuity Management SMF, which was previously known in ITIL as contingency management, focuses on supporting the overall business continuity management process by ensuring that in the event of a business interruption, required IT services can be recovered according to an agreed schedule. The focus is on minimizing the business disruption of mission-critical systems.

This process deals with planning to cope with and recover from an IT disaster. An IT disaster is defined as a loss of service for protracted periods, which requires that work be moved to an alternative system in a nonroutine way. It also provides guidance on safeguarding the existing systems by the development and introduction of proactive and reactive countermeasures. IT service continuity management also considers which activities need to be performed in the event of a service outage not attributed to a full-blown disaster.
Many project methodologies, such as PRINCE2 and MSF, will refer to risk management as a critical area of managing a successful project. This is sound best practice, and the discipline of risk management applied to operations provides guidance in this area (see Resources section). IT service continuity management builds upon risk management principles and identifies key risks to service provision, assesses the likelihood of occurrence, determines the impacts, defines mitigation measures to reduce the probability of occurrence and/or reduce the impact of the risk condition, and provides contingency plans for business continuity in case the risk event actually occurs.

Objectives of IT service continuity management include:

- Preventing interruptions to IT services as well as recovering services after an interruption occurs.
- Producing an effective service continuity (contingency) plan. This plan will be utilized in a time of disaster and/or protracted service outage to support the overall business process by ensuring that the required IT technical resources and services facilities can be recovered within the business time-scales that the SLA requires.

Ideally, systems are designed to include sufficient levels of resilience, such as diversely routed networks and geographically distributed servers, so that the design phase of the project addresses many of the requirements for sound IT service continuity planning. Again, MSF offers detailed guidance in these areas in designing for operations and infrastructure deployment.

Note that IT service continuity and availability management are significantly interrelated, but have different imperatives. Availability management is concerned with designing and building services with the appropriate availability characteristics under normal day-to-day operating conditions and expected downtime and maintenance.
IT service continuity management is concerned with preparing for, preempting, and managing business interruptions; that is, not “business as usual” situations, but the unexpected and/or disasters. Risk management is heavily used in both SMFs.

Workforce Management

Achieving any of the objectives described in this paper requires an adequately skilled and trained workforce. It is important to put best practices in place to continuously assess changing economic conditions and impacts on the IT workforce and make the appropriate investments and adjustments. This includes recruiting, skills development, knowledge transfer, competency levels, team building, process improvements, and resource deployment.

The Workforce Management SMF is unique to MOF. The Workforce Management SMF recommends best practices to recruit, retain, maintain, and motivate the IT workforce. Although ITIL emphasizes the need for good workforce management practices across all operations processes, MOF highlights workforce management by explicitly promoting it to the status of an SMF in the Process Model.

The Workforce Management SMF is complementary to the MOF Team Model in that the Team Model describes the core operations functional roles and their activities in enabling MOF Process Model activities, while the Workforce Management SMF focuses more on the human resource components of staff development, training, and so forth.

Security Management

Security management has been elevated to a new SMF in the Optimizing Quadrant in MOF 3.0 for a number of reasons. The existing Security Administration SMF addresses the routine daily tasks of the Operating Quadrant in administering and maintaining security across the services and systems, but no higher-level guidance was available that told which security policies and guidelines should exist in the first place.
Additionally, since the initial release of MOF, ITIL has published a new book on security management, and Microsoft’s own security group has conducted extensive work in creating process, policy, and technology guidance through the “Security Push” initiative of the past several years. Both of these resources are being used and referenced in the content guidance for the Security Management SMF.

The goal of the Security Management SMF is to define and communicate the organization’s security plans, policies, guidelines, and relevant regulations defined by the associated external industry or government agencies. Security management strives to ensure that effective information security measures are taken at the strategic, tactical, and operational levels. It also has overall management responsibility for ensuring that these measures are followed as well as reporting to management on security activities. Security management has important ties with other processes; some security management activities are carried out by other SMFs, under the supervision of security management.

Infrastructure Engineering

The Infrastructure Engineering SMF is the second new SMF to be included in MOF 3.0. Again, this content was identified as a gap in the MOF Process Model and is based on experiences using MOF in both our internal IT operations as well as our customer environments with partners.

Infrastructure engineering processes focus on ensuring coordination of infrastructure development efforts, translating strategic technology initiatives into functional IT environmental elements, managing the technical plans for IT engineering, hardware, and enterprise architecture projects, and ensuring quality tools and technologies are delivered to the users.

IT personnel responsible for implementing the processes contained in the Infrastructure Engineering SMF typically perform coordination duties across many other SMFs, liaising with the staffs who implement them.
The Infrastructure Engineering SMF has close links to such SMFs as Capacity Management, Availability Management, IT Service Continuity Management, and Storage Management, as well as across ITIL functions such as Facilities Management. It provides a means of coordination between separate, but related, SMFs that was previously lacking in MOF.

The Infrastructure Engineering SMF includes the following activities:

- Ensuring that the technology and application portfolio aligns with the business strategy and direction.
- Directing solution design and creating detailed technical design documents for all infrastructure and service solution projects.
- Verifying the quality assurance efforts of infrastructure development projects and developing standard quality metrics, benchmarks, and guidelines.
- Identifying and making recommendations for reducing costs and/or increasing efficiency by employing technological solutions.

Infrastructure engineering is, in several ways, an embodiment of MSF management principles within the MOF Optimizing Quadrant. The processes primarily involve project management and coordination, within an IT operations context. They are linked with nearly every other SMF in order to communicate engineering policies and standards and to ensure that they are included and adhered to when implementing projects and production functions.

To accomplish this, those in the Infrastructure Role Cluster (of the MOF Team Model) work with management teams in each of the operations areas to apply guidance from the Infrastructure Engineering SMF. The MOF Risk Management Discipline is performed continually during this process to evaluate whether engineering standards and guidelines are helping to mitigate operations risks across the environment.

Using the Team and Process Models Together

Overview of the MOF Team Model

The MOF Process Model and the Team Model are the two core models that define Microsoft Operations Framework.
The Team Model provides a flexible set of guidelines for organizing effective operations teams and describes the key activities and competencies of each role cluster in running and operating distributed computing environments. The role clusters in the Team Model exist in synergy with the SMFs of the Process Model; the Team Model role clusters enable the SMF processes to be carried out. A pointer to the online MOF documentation, including the Team Model, is listed in the Resources section of this paper.

The diagram shown below displays the seven Team Model role clusters.

[Figure 5. The MOF Team Model role clusters]

Mapping of Team Model Role Clusters with SMFs

It is helpful to consider where the ownership of SMF processes from the MOF Process Model resides within particular role clusters of the MOF Team Model. For example, if change management is to work well within an organization, then the ownership of that process should fall within the scope of a team whose mission of service or quality goal is aligned with the successful outcome of that process.

The following table maps each Process Model SMF to the Team Model role clusters primarily responsible for it. Keep in mind that more than one role cluster can be involved in an SMF, and that a role cluster can be involved in many SMFs as well.

Table 3. Team Model Role Cluster / Process Model SMF Mapping

MOF role clusters (table columns): Security, Partner, Release, Service, Support, Infrastructure, Operations

SMFs (table rows), by quadrant:
Changing: Change Management, Configuration Management, Release Management
Operating: System Admin, Security Admin, Directory Services Admin, Network Admin, Service Monitoring and Control, Storage Management, Job Scheduling
Supporting: Service Desk, Incident Management, Problem Management
Optimizing: Service Level Management, Financial Management, Capacity Management, Availability Management, IT Service Continuity Management, Workforce Management, Infrastructure Engineering, Security Management

[The cell marks that assign each SMF to its role clusters are not recoverable from this text.]

Notice that the Partner Role Cluster does not explicitly own any processes. This is because the Partner Role Cluster is specifically responsible for working with supplier or partner groups outside of IT operations to facilitate the delivery of service to IT operations. The partner groups will often have their own processes controlling how they deliver services.

Where to Start?

Understand the Needs

If you have just started learning about MOF and ITIL, a common reaction after reading about the MOF Process Model is “Okay, interesting and helpful info, but what do I do now? How and where should I start?” It is beyond the scope of this MOF paper to go into great depth on implementing service improvement projects, and there are excellent resources listed at the end of this document for taking the next step. This section, however, will provide a few things to consider and some example “quick wins” that can provide a catalyst for future thought and action.

Each company has differing business goals that will affect how IT will be used to support the overall organization. In order for the business to meet the needs of its customers and shareholders while staying ahead of its competition, it must make the best use of its IT organization. Below are some current situations faced by many IT organizations.
These scenarios provide some possible starting points for MOF to assist IT in meeting the goals of the business.

Scenario: Is your business customer requesting you to commit to specific response time frames and/or levels of service?

If so, it may be appropriate for you to further explore the Optimizing Quadrant content and, in particular, the Service Level Management SMF. The Optimizing Quadrant will provide you with the necessary information required to support an ongoing service commitment to your customers by ensuring that the proper processes are in place to manage your infrastructure for today and plan for the future (Capacity Management/Availability Management), ensure that you have the capabilities to support your agreed-to levels of service even in the event of an unplanned occurrence or disaster (Service Level Management/IT Service Continuity Management), and confirm that all of your IT teams are working with the same understanding and expectations to deliver service (OLA in Service Level Management).

Scenario: Is your organization constantly on the “hot seat” because unscheduled outages occur and cause user downtime? Is the business requesting a large number of changes to existing infrastructure including servers, workstations, and application changes?

If so, you may want to consider the Changing Quadrant and, in particular, the Change Management and Release Management SMFs. Having a solid process structure for building, testing, and deploying a new or updated infrastructure can assist IT to better manage change while reducing downtime. Change Management handles the approval and scheduling of the change request, while Release Management handles testing and deploying the change or release into production.

Scenario: Does your organization frequently perform “fire fighting” to resolve infrastructure or application issues in production?
Do you find it takes an extended period of time to restore service in your organization because the escalation paths through various support teams are unclear?

If so, you might consider the Supporting Quadrant and, in particular, the Incident Management and Problem Management SMFs, with some alignment to Service Level Management. Incident Management ensures restoration of service as quickly as possible, while Problem Management focuses on root cause analysis and the application of lessons learned to prevent future “fire fighting,” thus assisting in expediting the restoration of service should the incident recur. The tie to Service Level Management ensures that operating level agreements are in place so that support groups understand how to escalate and respond appropriately within their support boundaries.

Example “Quick Wins”

Broad experience in the service management community (ITIL and MOF) has shown that starting with “quick wins” and “low hanging fruit” proves the value of the process changes and enables the organization to move forward with bigger, longer-term, and more complex process improvement efforts. The following examples show the types of tasks that may assist an organization in demonstrating the value of process before proceeding with a more encompassing service improvement project (SIP).

Table 4. Examples of SMFs and Quick Wins

Change Management: Establish a standard Request for Change (RFC) form.
Job Scheduling: Look at demand management techniques in order to run jobs in the most optimum manner.
Release Management: Ensure that Operations Acceptance Testing (OAT) is carried out as a separate exercise to User Acceptance Testing (UAT).
Infrastructure Engineering: Include a member of infrastructure engineering in every Change Initiation Review to ensure that IT infrastructure can accept or be ready for the change.
Configuration Management: Produce a “logical” CMDB from existing system data (for example, inventory, human resources, purchasing) rather than a “physical” CMDB.
Storage Management: Liaise with Capacity Management to agree to data retention standards. Automate this with Service Monitoring and Control.
Systems Administration: Document new users/user IDs into a standard change format.
Service Desk: Merge multifunctional service desk into one central point of contact, either virtually or physically.
Security Administration: Establish a standard password policy using strong password concepts.
Incident Management: Establish consistent definitions of incident severities to ensure proper response by support staff to logged incidents.
Network Administration: Ensure that the network has the capacity to perform full backups and restores.
Problem Management: Ensure that changes are correctly referenced to problems so that the root cause is only resolved once.
Directory Service Administration: Set up automatic input into the CMDB.
Availability Management: Use historical analysis on hardware mean time before failure (MTBF) in order to proactively replace equipment before it fails.
Service Monitoring and Control: Identify system baselines to establish accurate thresholds for monitoring and alerting on normal and abnormal system utilization.
Service Level Management: Summarize service level agreements into two pages or less so that they are easier to read.
Workforce Management: Define and document the target skill sets for operations staff so that they have training and career paths mapped for them.
Financial Management: Charge on service management issues (service desk calls, problems/changes raised) rather than CPU, memory I/O, etc.
Capacity Management: Input threshold figures for CPU, memory, and disk to Service Monitoring and Control.
IT Service Continuity Management: Ensure that documentation copies are stored at an off-site location either electronically or physically.
Security Management: Publish simple, easy to follow security guidelines that all employees can understand and act on.

Summary

Reading about all the processes and procedures required in a “best practices” IT environment based on MOF, or any other service management-centered framework, may result in a concern that simple changes to an IT environment can be very complicated. Implementing a fully functioning operations center based on service management practices may initially seem bureaucratic and costly. If MOF is implemented carefully and with common sense, however, bureaucracy can be kept to a minimum and the cost of implementing IT service management will be much less than the cost of not doing it.

It is critical to realize that implementing MOF (or ITIL) is not an all-or-nothing concept. Both are meant as frameworks for you to pick and choose which pieces will work for your organization. Very few, if any, organizations will implement 100 percent of every detail described in MOF or ITIL because of lack of need, cost, impracticality, the culture of their business, or any other possible reasons.

Many critical success factors exist in the successful implementation of IT service management. This paper has addressed just a few of these. But perhaps the guiding principle to remember when embarking on an IT service management improvement project is the need to evaluate the value to your organization of each process and procedure being considered or implemented. Evaluate the value to the business against the risk associated with failure or nonconformance. Often, a culture of fire fighting is the norm until the pain of the current situation becomes greater than the pain, or effort, required to initiate a change; at that point, change will occur.
For example, the lack of adequate change control in many production environments exists because the need for rapid change is believed to outweigh the need for managing a stable environment. This attitude tends to change quickly when even minor changes to the target environment result in business outages. For further information about MOF and the MOF Process Model, please see the Resources section at the end of this paper. 38 Microsoft Operations Framework Appendix: Resources Once you have a general understanding of the needs of your organization, you will most likely want to take the next step and learn more. The tables below show the various options available to assist you in improving IT services within your organization. Where appropriate, a brief description of the resource and its intended audience is also provided. Training and Certification* Resource Description Microsoft Operations Framework Essentials (1737A) MOF Essentials provides an overview of the three models in the frameworks and leverages a simulation for applying concepts. (2 days) http://www.microsoft.com/traincert/syllabi/1737Afinal.asp. Microsoft Operations Framework Changing Quadrant (1787A) MOF Changing Quadrant provides an in-depth focus on the processes in the Changing Quadrant and includes a simulation for applying concepts. (3 days) http://www.microsoft.com/traincert/syllabi/1787Afinal.asp. Foundation Certificate in IT Service Management (ITIL Foundation) The Foundation Certificate is intended for people working in the field of IT Service Management. The Foundation Certificate is a prerequisite for the Practitioner’s and Manager’s certificates in IT Service Management. Practitioner’s Certificate in IT Service Management (aka ITIL Practitioner) The Practitioner’s Certificate in IT Service Management Problem Management is intended for those in an IT organization responsible for activities that are part of a specific ITIL process. 
Manager’s Certificate in IT Service Management (aka ITIL Service Management)
The Manager’s Certificate in IT Service Management is aimed at managers and consultants in IT Service Management, especially those who are involved in implementing ITIL or advising on ITIL. The Foundation Certificate is a prerequisite for the Manager’s Certificate in IT Service Management.

* For course availability, see http://www.microsoft.com/mof and http://www.itil.co.uk/.

Websites

http://www.microsoft.com/mof/
All publicly available MOF content.

http://www.microsoft.com/management
Includes list of Microsoft partners certified on Microsoft Operations Framework and Microsoft Solutions for Management. Also includes tools (Microsoft and partners) that support MOF and ITIL processes.

http://www.microsoft.com/msf/
All publicly available MSF content.

http://www.itil.co.uk/
Official ITIL site.

http://www.helpdeskinst.com/
Help Desk Institute.

http://www1.bcs.org.uk/bm.asp!sectionID=514
British Computer Society (BCS) Industry Structure Model (ISM) site.

Engagements

MOF overview presentation
Contact your Microsoft Account Team (Account Executive, Service Executive, or TAM) to request an overview presentation on MOF. They can engage a resource to provide in-depth answers to your questions.

MOF Operations Assessment
The MOF Operations Assessment offers an in-depth look at particular processes within your organization. Certified consultants will assess your needs and deliver in-depth recommendations to assist your organization in streamlining IT operations.

MOF Implementation
This engagement typically follows a MOF Operations Assessment and is used to implement its recommendations. The length of this engagement varies depending on the needs of the organization.

Service Improvement Project
Microsoft Worldwide Services and all MOF partners can work with you on a structured service improvement project.
This typically will follow a MOF Operations Assessment to identify your priorities and is an MSF-based project specifically designed to achieve the desired maturity levels in the priority processes identified in the assessment report.

MOF Documentation

MOF Executive Overview
Business value of MOF, why organizations should adopt MOF, road map to adoption.
Audience: CEO/CTO/CIO, senior IT managers, implementation teams.

MOF Process Model for Operations
High-level discussion of the MOF Process Model, how it relates to the other models and disciplines, and implementation scenarios.
Audience: CIO, IT managers, implementation teams.

MOF Team Model for Operations
High-level discussion of the MOF Team Model, how it relates to the other models and disciplines, and implementation scenarios.
Audience: CIO, IT managers, implementation teams.

MOF Risk Management Discipline for Operations
Detailed discussion of the MOF Risk Management Discipline and its importance to an IT organization.
Audience: All IT staff.

MOF service management function (SMF) guides
Detailed discussion of each service management function (SMF) as described in the MOF Process Model for Operations paper. Also discusses in detail the relationship between the SMFs, the other MOF models, and the Information Technology Infrastructure Library (ITIL) disciplines.
Audience: IT managers, practitioner staff, implementation teams.

Operations management review (OMR) guides
Detailed discussion about the four operational management reviews, what they consist of, and how they are conducted.
Audience: IT managers, practitioner staff, implementation teams.

Product operations guides
Detailed discussions regarding the operation of Microsoft products. Starting with the Windows® 2000 operations guides, all major Microsoft product releases will be accompanied by detailed product operations guides that may be freely modified for an organization’s internal use.
Audience: Technical staff, implementation teams.

Best practices/patterns and practices guides
All best practices guides designed to impact operational processes should be MOF-compliant. Level of depth with references to MOF will vary based on the document’s target audience.
Audience: Varies based on document.

Solution accelerators
There are many solution accelerators available through MSDN® and TechNet. All Microsoft Solutions for Management solution accelerators are built on MOF. Examples of these include Microsoft Solution for Patch Management, Service Monitoring and Control, Business Desktop Deployment, and Windows Server™ 2003 Deployment.
Audience: Varies based on document.

Microsoft Solutions Framework (MSF) documentation
Varies by document. Listed here to ensure that those interested in MOF understand that there is a connection between MOF and MSF and references between the two frameworks appear throughout both sets of documentation.
Audience: Varies by document.

ITIL Documentation*

Planning to Implement Service Management (ISBN: 0113308779), 2002
This title explains the steps necessary to identify how an organization might expect to benefit from ITIL, and how to set about reaping those benefits.

Service Support (ISBN: 0113300158), 2000
Focuses on ensuring that the customer has access to appropriate services to support the business functions. Issues covered include service desk, incident management, problem management, configuration management, change management, and release management. It expands the necessary interactions between these and other core IT service management disciplines, and updates best practice to reflect recent changes in technology and business practices. Service Support is published in print and CD-ROM formats and a demo version of the CD-ROM has also been made available. For more information, follow the links below.

Service Delivery (ISBN: 0113300174), 2001
The second element in the new ITIL series.
Service providers need to offer business users adequate support. Service Delivery covers all aspects that must be taken into consideration. Issues covered include service level management, financial management for IT services, IT service continuity management, availability management, contingency planning, and capacity management. The purpose of Service Delivery is to show the links and the principal relationships between all the service management and other infrastructure management processes. Service Delivery is published in print and CD-ROM formats and a demo version of the CD-ROM has also been made available. For more information, follow the links below.

Security Management (ISBN: 011330014X), 1999
This is a recent ITIL guide that explains the process of security management with IT service management. The guide focuses on the process of implementing security requirements identified in the IT service level agreement, rather than considering business issues of security policy. The book was developed taking into consideration the plans for consolidating and interlinking the ITIL Service Support and Service Delivery core guides.

ICT Infrastructure Management (ISBN: 0113308655), 2002
Covering network service management, operations management, management of local processors, computer installation and acceptance and, for the first time, systems management.

Application Management (ISBN: 0113308663), 2002
Embracing the software development life cycle, expanding the issues touched on in Software Lifecycle Support and Testing of IT Services. Application Management also provides more detail on business change, placing emphasis on clear requirement definition and implementation of solutions.

The Business Perspective (ISBN: 0113308949), available in 2004
Concerning the understanding and provision of IT service provision.
Issues covered include business continuity management, partnerships and outsourcing, surviving change, and transformation of business practices through radical change.

The ITIL Back Catalogue
The ITIL Back Catalogue represents an historical repository of knowledge on IT service management. Given its age, however, it is not part of OGC's current Best Practice Portfolio, and may not be updated in the future. The back catalogue is available as PDF downloads in three discrete sets, each of which can be purchased on a single user basis or on a network user basis. It is available at www.tso.co.uk/ITIL/.

* The above ITIL publications are available from The Stationery Office Online Bookshop at www.tso.co.uk.

Contributors

Program Manager
Kathryn Pizzo (Rupchock), Microsoft Corporation

Writers
Karri Alexion-Tiernan, Microsoft Corporation
Bret Clark, Microsoft Corporation
Mike Lubrecht, Microsoft Corporation
Steven McReynolds, Microsoft Corporation
Joe Michelotti, Microsoft Corporation
Thierry Paquay, Microsoft Corporation
David Pultorak, Fox IT
Jeff Yuhas, Microsoft Corporation

Editor
Patricia Rytkonen, Volt Technical Services

Production Editor
Kevin Klein, Volt Technical Services

Additional Contributors
Vic Auletta, Merrill Lynch
Dave Backham, Microsoft Corporation
Marcel Burghoorn, Microsoft Corporation
David Cannon, ManageOne
Angel Del Gaudio, Microsoft Corporation
Laurie Dunham, Microsoft Corporation
Holly Dyas, Microsoft Corporation
Jerry Dyer, Microsoft Corporation
Neil Fairhead, Microsoft Corporation
Ken Hamilton, ManageOne
Graham Hammonds, FoxIT
Paulo Henrique Leocadio, Microsoft Corporation
Steve Ingall, FoxIT
Sergio Klarreich, Microsoft Corporation
Morten Lauridsen, Microsoft Corporation
Volker Leitzgen, Microsoft Corporation
Charles Levy, Microsoft Corporation
Carroll Moon, Microsoft Corporation
Neil Pinkerton, Microsoft Corporation
Andy Savvides, Consultant
John Shinner, Microsoft Corporation
Wallace Simpson, Microsoft Corporation
Marvin Stein, Microsoft Corporation
Shane van Jaarsveldt, Microsoft Corporation
Gijs van Weijen, Microsoft Corporation
Hans Vriends, Pink Roccade
Suzana Vukcevic, Microsoft Corporation
Randy Young, Microsoft Corporation