Technical Case Study - Center

Security Patch Management Evolution
for Data-Center Servers at Microsoft
Published August 2013
The following content may no longer reflect Microsoft’s current position or infrastructure. This content
should be viewed as reference documentation only, to inform IT business decisions within your own
company or organization.
Assessing and maintaining the integrity of software in a networked environment through a
well-defined patch management program is a key first step toward successful information
security. By focusing on policies, technologies, and processes, Microsoft Information
Technology (MSIT) was able to reduce risk, improve performance, and improve availability
of software resources at Microsoft.
Situation
Without a standardized tool or process, Microsoft IT struggled to manage data center server
patching. This left Microsoft's server environment unacceptably vulnerable.
Solution
Microsoft IT chose a multi-pronged approach to address this situation. Focusing on policy
changes, technology solutions, and well-defined processes enabled MSIT to achieve its goals.
Security patch management is a process that gives organizations control over the deployment and
maintenance of interim software patches into their production environments. It helps
organizations maintain the security and stability of the production environment.
At Microsoft, the configuration management program of today evolved from a program that
initially used Microsoft Systems Management Server (SMS) to address only security patch
management. When System Center Configuration Manager was released, MSIT began to use the
product as a discovery mechanism for asset inventory information and security patch
management.
From the perspective of managing security patches, not much has changed from the core activities
of earlier efforts. The number of servers whose configuration MSIT manages continues to
grow, up from 24,000 servers in 2010 to 34,000 servers in 2013. The integration and
enhancement of the features available in Configuration Manager have helped MSIT keep up with
the ever-increasing number of threats and the volume of security patches now regularly released.
Benefits
• Patch compliance increased from 70% to 96%
• Patch variability decreased from 40% to 5%
• The patch cycle improved from 30 to 19 days
Situation
In 2010, MSIT continued, with renewed rigor, its journey to improve the security of data center servers.
The primary driver of this effort was server security.
Server security is as important as network security because servers often hold a great deal of an
organization's vital information. If a server is compromised, all of its contents may become available
to steal or manipulate at will. Applying security patches in a timely fashion greatly reduces the risk of
a security breach and the problems that come with it, such as data theft, data loss, or
even legal penalties. Patches were being applied to Microsoft servers on average 30 days after patch
release. That delay left servers exposed to exploits of publicly disclosed vulnerabilities for weeks, in
addition to the zero-day window between the first exploitation of a vulnerability and the publication
of a patch to counter that threat.
Contributing factors included many instances of Microsoft System Center
Configuration Manager spread across IT, which added cost and complexity to operational management of
the environment. In addition, patch compliance was running at 60% with variability of 20-40%. As
patches lagged, vulnerabilities compounded month over month.
Long patching cycles, low patch compliance, and high variability left Microsoft vulnerable to
well-publicized exploits as well as emergency situations. This necessitated emergency scrambles and
out-of-band patching, which increased costs as large teams of people rallied to address
each issue. An additional negative outcome was outages for users as patching took line-of-business
applications and operations offline.
Solution
MSIT's solution approach included policy changes, use of new technologies and process changes.
This multi-pronged approach supported the increasing need Microsoft had to ensure a secure
environment.
Policies
One of the foundational policies required to improve patching at Microsoft was the implementation
of compliance deadlines. The organization was serious about limiting risk and meeting its risk
obligation to the board of directors, which required senior leadership to uphold compliance deadlines.
MSIT adopted the policy that a patch not installed by a server owner prior to a compliance
deadline would be installed for them. Executive sponsorship was key to getting server owners to
participate and adhere to deadlines.
Technologies
In addition to policy implementation, technology was also adopted to support the goals. In 2010,
the focus was on Configuration Manager server agent health. Instead of continually reviewing issues
server by server, MSIT started grouping issues by symptom and doing root cause analysis on the
largest buckets of issues. Once root cause was determined and the fix implemented, a new baseline
would be measured and the process repeated until that bucket of symptoms was at zero. This focus
was responsible for the jump in patch compliance in 2011 from 70% to 90%.
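The bucketing step described above can be sketched in a few lines of Python. This is an illustration only, not MSIT's tooling: the issue records, server names, and symptom labels are hypothetical, and the only point is that counting issues by symptom shows which bucket to investigate first.

```python
from collections import Counter


def triage_order(issues):
    """Return (symptom, count) pairs, largest bucket first."""
    counts = Counter(issue["symptom"] for issue in issues)
    return counts.most_common()


def without_symptom(issues, symptom):
    """Return the issues that remain once one symptom's root cause is fixed."""
    return [issue for issue in issues if issue["symptom"] != symptom]


# Hypothetical agent-health issues; names and symptom labels are invented.
issues = [
    {"server": "srv-001", "symptom": "agent_not_reporting"},
    {"server": "srv-002", "symptom": "wmi_repository_corrupt"},
    {"server": "srv-003", "symptom": "agent_not_reporting"},
    {"server": "srv-004", "symptom": "disk_full"},
    {"server": "srv-005", "symptom": "agent_not_reporting"},
]

buckets = triage_order(issues)              # largest bucket comes first
top_symptom, top_count = buckets[0]
remaining = without_symptom(issues, top_symptom)  # the new baseline to measure
```

Repeating the two steps on `remaining` mirrors the measure, fix, and re-baseline loop until a bucket reaches zero.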
In 2013 MSIT expanded their automation tool set to include System Center Orchestrator, a
component of the System Center suite.
The first scenario targeted was patching servers in a clustered environment. In this complex scenario,
the goal is to patch and reboot each server participating in a cluster in sequenced fashion, ensuring
the end-user experience is not compromised. Traditionally an operator running scripts and validation
steps tailored to an application would perform these steps until each server in the cluster was
compliant. Using Orchestrator, these scripts and business logic were transformed into a workflow and
programmatically executed across the entire cluster. The result was improved predictability by
reducing error-prone manual activities.
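A minimal Python sketch of the sequencing logic follows. It is not an Orchestrator runbook; the four callables (`drain`, `patch_and_reboot`, `is_healthy`, `resume`) are hypothetical stand-ins for the application-specific scripts and validation steps mentioned above.

```python
def patch_cluster(nodes, drain, patch_and_reboot, is_healthy, resume):
    """Patch cluster nodes one at a time and stop at the first failed validation.

    Returns the nodes that were patched and returned to service.
    """
    patched = []
    for node in nodes:
        drain(node)               # move workload off the node before touching it
        patch_and_reboot(node)
        if not is_healthy(node):  # application-specific validation step
            break                 # failed node stays drained; later nodes are not touched
        resume(node)
        patched.append(node)
    return patched


# Usage with stub actions that only record what would have happened.
log = []
result = patch_cluster(
    ["node-a", "node-b", "node-c"],
    drain=lambda n: log.append(("drain", n)),
    patch_and_reboot=lambda n: log.append(("patch", n)),
    is_healthy=lambda n: n != "node-b",   # simulate a validation failure on node-b
    resume=lambda n: log.append(("resume", n)),
)
```

Stopping at the first unhealthy node is one possible design choice: it keeps the unpatched nodes serving users while an operator investigates.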
Orchestrator is also used as the “suspenders” to the “belt” provided by System Center Configuration
Manager. In situations where Configuration Manager logs a failed attempt to patch a server, a signal
is passed to Orchestrator to initiate a standard patch workflow. The workflow repeats until the server
is successfully patched or the service window expires. In this scenario, transient infrastructure issues
and SCCM health issues are mitigated.
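The retry behavior can be sketched as a bounded loop. This Python example is an assumption-laden illustration, not the actual workflow: `run_patch_workflow` is a hypothetical callable that returns True on success, and the five-minute delay between attempts is invented.

```python
import time


def retry_patch(run_patch_workflow, window_end, clock=time.monotonic,
                sleep=time.sleep, delay_seconds=300):
    """Re-run the patch workflow until it succeeds or the service window closes.

    window_end is expressed in the same units as clock(). Returns a tuple
    (succeeded, attempts).
    """
    attempts = 0
    while clock() < window_end:
        attempts += 1
        if run_patch_workflow():
            return True, attempts
        sleep(delay_seconds)      # wait before the next attempt
    return False, attempts


# Usage with a simulated clock so the example finishes instantly.
now = [0.0]
outcomes = iter([False, False, True])   # two transient failures, then success
ok, attempts = retry_patch(
    run_patch_workflow=lambda: next(outcomes),
    window_end=3600,                     # hypothetical one-hour service window
    clock=lambda: now[0],
    sleep=lambda seconds: now.__setitem__(0, now[0] + seconds),
)
```

Checking the clock before every attempt ensures that no new attempt starts after the window has closed.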
With the addition of System Center Orchestrator, MSIT has improved patch compliance from 90% to
96%, and done so with a smaller labor footprint.
Processes
Along with policy and technology efforts, there was a significant focus on processes. This included
reengineering current processes as well as implementing new ones. One of the initial wins in 2010 was
consolidating System Center Configuration Manager server instances into one operational group in
MSIT. Approximately 150 instances were consolidated into a handful of centrally managed servers.
This resulted in decreased operational and maintenance costs as the footprint to manage became
much smaller.
Also in 2010, MSIT implemented a new role called Service Transition Manager to be the interface
between IT operations and internal IT groups. This provided an opportunity to onboard internal
clients to more automated processes and tools, further decreasing variability and the need
for manual patching across the company. The priority was driving adoption of the automated
patching service by internal MSIT groups. Service Transition Managers also collected requirements for
additional service features to increase adoption.
In 2012, MSIT instituted the Patch Cycle Triage process. This included weekly instead of monthly
reviews of agent health issues and the publishing of metrics and reports to all patching stakeholders. This
process change increased the visibility of patching efforts, and the resulting clear accountability led to
more complete and rapid resolution of issues.
The following table lists example metrics that MSIT gathers for review and to ensure visibility into the
overall performance of the area.
Table 1. Patch Management Metrics
| Metric | Description |
| --- | --- |
| Number of patches released | Number of released patches per month, provides a baseline for month-over-month comparison. |
| Overall compliance per patch cycle | Overall compliance metric for all patched servers in the environment against the successful deployment of all updates during a patch cycle. |
| Patch success ratio (per patch) | This metric can be used to determine whether a single patch failure negatively impacted overall compliance metrics. |
| Patch success ratio (per server) | Can be used to determine whether a specific type of server or configuration is the common factor in patch success or failure. |
| Number of support incidents (per patch) | Number of support engagements that are initiated during a patch deployment per patch. |
| Agent health – 98% healthy (daily measurement) | Number of systems with a CM agent installed which have successfully returned inventory data and patch results within configured refresh schedule. |
| Time from smoke test success to 60% saturation deployment | This measurement establishes an ongoing baseline comparison that helps validate each milestone success of the patch process in meeting overall compliance goals for each patch cycle. |
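As an illustration of how the compliance and success-ratio metrics might be derived from raw deployment results, the following Python sketch computes them from a list of (server, patch, succeeded) records. The record format and sample data are hypothetical, and the definition of overall compliance used here (the share of servers with every update installed) is an assumption rather than MSIT's documented formula.

```python
def compliance_metrics(results):
    """Summarize one patch cycle from (server, patch, succeeded) records."""
    per_patch, per_server = {}, {}
    for server, patch, ok in results:
        for table, key in ((per_patch, patch), (per_server, server)):
            done, total = table.get(key, (0, 0))
            table[key] = (done + int(ok), total + 1)

    def ratios(table):
        return {key: done / total for key, (done, total) in table.items()}

    # Assumed definition: a server is compliant when all of its updates succeeded.
    compliant = sum(1 for done, total in per_server.values() if done == total)
    return {
        "overall_compliance": compliant / len(per_server) if per_server else 0.0,
        "success_per_patch": ratios(per_patch),
        "success_per_server": ratios(per_server),
    }


# Hypothetical results: patch P2 failed on one of four servers.
results = [
    ("srv-1", "P1", True), ("srv-1", "P2", True),
    ("srv-2", "P1", True), ("srv-2", "P2", False),
    ("srv-3", "P1", True), ("srv-3", "P2", True),
    ("srv-4", "P1", True), ("srv-4", "P2", True),
]
metrics = compliance_metrics(results)
```

In this sample the per-patch ratios show that a single patch (P2) accounts for the drop in overall compliance, which is the kind of question the per-patch metric is meant to answer.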
MSIT has formalized the security patch process. Patches are released on the second Tuesday of every
month. MSIT has adopted a 19-day cycle to complete patch and software updates. The 19-day cycle,
developed in cooperation with executive leadership, operations, server and application owners, and
Information Security, balances the desire to reduce risk against the need to give the business time to prepare
and orchestrate updates across test and production servers. The process drives the activities of the
teams that are accountable for security patching. This 19-day cycle is a significant improvement over
the 30-day cycle followed in 2010.
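The calendar arithmetic behind the cycle can be sketched in Python. The function names are hypothetical, and treating the release day as day 0, so that the cycle ends 19 calendar days later, is an assumption; the source does not say how MSIT counts the days.

```python
import datetime


def second_tuesday(year, month):
    """Return the date of the second Tuesday of the given month."""
    first = datetime.date(year, month, 1)
    # date.weekday() numbers Monday as 0, so Tuesday is 1.
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + datetime.timedelta(days=days_to_first_tuesday + 7)


def cycle_deadline(year, month, cycle_days=19):
    """Return the assumed end of the patch cycle: release day plus cycle_days."""
    return second_tuesday(year, month) + datetime.timedelta(days=cycle_days)


release = second_tuesday(2013, 8)    # 2013-08-13
deadline = cycle_deadline(2013, 8)   # 2013-09-01
```

For the August 2013 release, a 30-day cycle under the same counting assumption would end on 2013-09-12, after the September release on 2013-09-10, whereas the 19-day cycle completes before the next release arrives.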
To provide context for this process, the below diagram outlines the architecture that Microsoft uses
for server configuration management.
Architecture for server configuration management at Microsoft
Conclusion
The biggest change that has occurred since Microsoft first employed a patch management process
has been the cadence and consistency in which patches are applied. The established process of patch
management allows predictability for a server or application owner, resulting in the ability to meet
compliance expectations. Patching compliance increased from 70% in 2010 to 96% in 2013 and patch
variability decreased from 20-40% in 2010 to 3-5% in 2013.
Improved processes and the use of System Center Configuration Manager and System Center
Orchestrator have reduced the patch cycle from 30 to 19 days, despite a steady increase in the
number of released patches, the inclusion of non-security software updates, software distributions,
and growth in the number of servers in the environment. Successfully deploying System Center
Configuration Manager and the Orchestrator-based solutions has automated patching
and significantly reduced manual patching efforts.
The security patch management service was designed to proactively narrow risk by shortening the
amount of time that a security or configuration vulnerability can affect servers on the network. This
has been achieved through the creation of a predictable global process, centralized reporting and
administration, and policy support to ensure compliance.
Resources
Server Configuration Management at Microsoft
Microsoft IT was able to improve performance and server availability and reduce risks by shortening
the cycle time to deliver security and non-security updates. Desired configuration management has
enabled IT administrators to identify configuration drift across platforms, services, and line-of-business
applications.
Technical White Paper
Related videos
Delivering Results - Using System Center Orchestrator to Patch Complex Data Center Scenarios (Level
200)
Learn how Microsoft achieves immediate and greater than 95% patch management compliance
including remediation within maintenance windows for complex automation scenarios using System
Center 2012 - Orchestrator.
Listen
How Microsoft IT Implements Server Patch Management
Minimizing the threat of vulnerabilities requires organizations to have properly configured systems, to
use the latest software, and to install the recommended software updates. Assessing and maintaining
the integrity of software in a networked environment through a well-defined patch management
program is a key first step toward successful information security. Microsoft IT uses the System
Center suite as the primary solution in its server patch management process.
Watch video
For More Information
For more information about Microsoft products or services, call the Microsoft Sales Information
Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750.
Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access
information via the World Wide Web, go to:
http://www.microsoft.com
http://www.microsoft.com/microsoft-IT
© 2013 Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered
trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The
names of actual companies and products mentioned herein may be the trademarks of their respective
owners. This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES,
EXPRESS OR IMPLIED, IN THIS SUMMARY.