KPI and Analytics Workshop

itSMF – May 2014
This deck was designed to be used in small group formats to
discuss, develop and improve KPIs and Analytics.
The examples within are common ITIL KPIs and contain both
lagging and leading indicators.
Each group should select 2-3 examples to work through and
present results back to the class.
Introduction to Performance Analytics
• Performance Analytics supports Performance
Management
• Performance Management
— Wikipedia:
Performance Management includes activities which ensure that goals are
consistently being met in an effective and efficient manner. Performance
management can focus on the performance of an organization, a department,
employee, or even the processes to build a product or service, as well as many
other areas.
— Allows organizations to align their resources, systems, and employees
to strategic objectives and priorities
• Performance Analytics supports Performance Management by
delivering dashboards, scorecards and analytics to define, measure
and improve Key Performance Indicators (KPIs)
Introduction to Performance Analytics
Steps for successful Performance Management are:
• Identify the processes/activities/domains of what needs
to be managed
• Articulate the goals to be met
• Define the metrics/indicators that measure whether goals
are being met
• Collect the data for the indicators
• Take improvement actions
KPI Example #1:
% of closed incidents without a Configuration Item (CI)
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• An increasing trend of incidents being closed with
a related CI, whether component or Business
Service, is critical for building a history of
‘what’ the incidents are associated with
• As a result of improving on this KPI, an IT
organization will be capable of understanding and
reporting on trends as they relate to CIs, which is
a key input into the Problem Management process
Define the metrics/indicators that measure
if goals are being met
• Number of closed incidents without a CI /
Number of closed incidents * 100
Suggested improvement actions
• Although CIs may not always be identifiable at
ticket creation, set the configuration item field to
be mandatory upon setting the incident state
to resolved or closed
• Consider using coaching loops or weekly
exception reporting to identify undesired
behavior in assigning CIs to incidents
What do the ITIL books say?
• Relating incidents to CIs is essential for quality
reporting
• Measuring MTBF, MTTR, and MTBSI all depends
on being able to correlate incidents
logged against devices/services (CIs)
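As a rough illustration of the formula on this slide, the following Python sketch computes the percentage from a list of incident records. The record layout (dicts with `state` and `cmdb_ci` keys) is an assumption made for this example and is not taken from any particular tool.

```python
def pct_closed_without_ci(incidents):
    """Return the % of closed incidents with no CI, or None if none are closed.

    The 'state' and 'cmdb_ci' keys are hypothetical field names.
    """
    closed = [i for i in incidents if i["state"] == "closed"]
    if not closed:
        return None  # avoid dividing by zero when nothing has been closed
    missing = sum(1 for i in closed if not i.get("cmdb_ci"))
    return missing / len(closed) * 100


sample = [
    {"state": "closed", "cmdb_ci": "mail-server-01"},
    {"state": "closed", "cmdb_ci": None},   # closed without a CI
    {"state": "closed", "cmdb_ci": ""},     # empty value also counts as missing
    {"state": "open", "cmdb_ci": None},     # open incidents are out of scope
]
print(pct_closed_without_ci(sample))
```

Returning None for an empty denominator is a design choice for the sketch; a report could equally show 0 or skip the period.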
KPI Example #2:
% of new critical incidents
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Understanding the trend of when critical incidents
are logged can help a service provider to
understand important patterns such as:
– Relationships of the incidents to implemented
(and problematic) changes or projects
– Common timeframes for critical incidents
being logged, which can help managers to staff
the support teams
• This KPI is a key input to the Problem Management
process, as the patterns and trends can be used to
trigger investigations into the underlying (and
possibly recurring) issues that should be
understood and corrected
Define the metrics/indicators that measure
if goals are being met
• Number of new incidents with critical priority /
Number of new incidents * 100
Suggested improvement actions
• A weekly review by a Problem Manager to
identify possible trends that warrant root cause
investigation, and drive a reduction in major
incidents
What do the ITIL books say?
• ITIL KPI: Percentage of daily incidents logged
with a priority of ‘critical’
KPI Example #3:
% of open and overdue incidents
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• A key element of quality service delivery is
meeting and exceeding the expectations of your
customers.
– Customer expectations should be managed by
quantifying them (with customers) into
measurable SLAs.
• Service providers who cannot consistently meet
the agreed upon expectations of customers will
quickly be perceived as incapable.
• These perceptions are very difficult to change
once they have been formed.
Define the metrics/indicators that measure
if goals are being met
• Number of overdue incidents (not closed and not
solved within the established time frame) relative
to the number of open (not closed) incidents
Suggested improvement actions
• Use notifications to escalate SLA thresholds
• A daily review of the previous day's SLA
targets to identify breaches
What do the ITIL books say?
• ITIL KPI: Percentage of open incidents that have
not been resolved within agreed timelines (SLA)
• The desired direction for this KPI is for it to be
reducing
KPI Example #4:
% of incidents closed in time (within SLA)
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• SLAs are promises made to customers regarding
levels of service to be expected
• This KPI is a reflection of how well the service
provider has been able to meet the mutually
agreed upon targets, therefore reinforcing the
capability, reliability, and dependability of the
service provider
Define the metrics/indicators that measure
if goals are being met
• Number of incidents closed in time (based on SLA) /
Number of incidents closed that should have
been closed in time (based on SLA) * 100
Suggested improvement actions
• Notifications based on elapsed SLA timeframes
• Daily reports/reviews of incidents that were not
resolved within the SLA
What do the ITIL books say?
• It is essential that IT aligns incident management
activities and priorities with those of the
business
• ITIL KPI: Percentage of incidents handled within
agreed response time
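One possible reading of this formula is sketched below in Python. It assumes each incident record carries a `closed_at` timestamp and an `sla_due` deadline (both hypothetical field names), and it treats "incidents that should have been closed in time" as closed incidents that have an SLA deadline at all.

```python
from datetime import datetime


def pct_closed_within_sla(incidents):
    """Return the % of closed, SLA-bound incidents closed by their deadline.

    'closed_at' and 'sla_due' are hypothetical field names; incidents that
    are still open or have no SLA are left out of the denominator.
    """
    in_scope = [i for i in incidents if i.get("closed_at") and i.get("sla_due")]
    if not in_scope:
        return None
    on_time = sum(1 for i in in_scope if i["closed_at"] <= i["sla_due"])
    return on_time / len(in_scope) * 100


sla_sample = [
    {"closed_at": datetime(2014, 5, 1, 10), "sla_due": datetime(2014, 5, 1, 12)},  # on time
    {"closed_at": datetime(2014, 5, 2, 15), "sla_due": datetime(2014, 5, 2, 12)},  # breached
    {"closed_at": datetime(2014, 5, 3, 9), "sla_due": None},                       # no SLA
    {"closed_at": None, "sla_due": datetime(2014, 5, 4, 12)},                      # still open
]
print(pct_closed_within_sla(sla_sample))
```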
KPI Example #5:
% of open incidents not worked in last 30 days
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Focus is placed on ensuring that incidents are not
neglected and/or forgotten
• An increase in this KPI could be an indicator of:
– Incident workload is exceeding capacity.
– Staff may be lacking skills/knowledge to
resolve incidents
– Incidents may be kept open while
investigating ‘root cause’. This is a classic
process problem, as this activity should be
done within Problem Management, to avoid
affecting Incident metrics.
Define the metrics/indicators that measure
if goals are being met
• Number of open incidents not worked on in the
last 30 days / Number of open incidents * 100
Suggested improvement actions
• Regular review of aging incidents
• Use of escalations/notifications/SLAs
What do the ITIL books say?
• This KPI supports the focus of ‘resolving
incidents as quickly as possible while
minimizing impacts to the business’
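The 30-day rule can be sketched in Python as follows. The sketch assumes that "not worked on" can be approximated by a `last_updated` timestamp on each record; that field, the `state` key, and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def pct_open_not_worked(incidents, now, max_idle_days=30):
    """Return the % of open incidents with no update in the last N days.

    'state' and 'last_updated' are hypothetical field names.
    """
    open_incidents = [i for i in incidents if i["state"] != "closed"]
    if not open_incidents:
        return None
    cutoff = now - timedelta(days=max_idle_days)
    stale = sum(1 for i in open_incidents if i["last_updated"] < cutoff)
    return stale / len(open_incidents) * 100


aging_sample = [
    {"state": "open", "last_updated": datetime(2014, 4, 1)},     # stale
    {"state": "open", "last_updated": datetime(2014, 5, 20)},    # recently worked
    {"state": "open", "last_updated": datetime(2014, 3, 15)},    # stale
    {"state": "open", "last_updated": datetime(2014, 5, 30)},    # recently worked
    {"state": "closed", "last_updated": datetime(2014, 1, 1)},   # out of scope
]
print(pct_open_not_worked(aging_sample, now=datetime(2014, 5, 31)))
```

Passing `now` explicitly keeps the calculation repeatable, which helps when the same KPI is recomputed for past collection dates.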
KPI Example #6:
% of incidents closed by first assigned group
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Focus is placed on ensuring that the number of
interactions required between teams before an
incident is resolved is minimal.
• Incidents that can be resolved by the first
assigned group are a demonstration of:
– The service desk being well trained to know
which teams support which CIs / services
– The support team being knowledgeable and
capable of resolving incidents dispatched to
them
– A good knowledge base for both service desk
and support teams
Define the metrics/indicators that measure
if goals are being met
• Number of incidents closed by first assigned
group / Number of closed incidents * 100
Suggested improvement actions
• Accuracy of knowledge base
• Use of assignment groups
What do the ITIL books say?
• This KPI supports the focus of ‘resolving
incidents as quickly as possible while
minimizing impacts to the business’
• Complementary KPI to MTTR, as MTTR
highlights overall duration, while this KPI
highlights the ability to engage the right
resources early to support MTTR.
KPI Example #7:
MTTR - Mean Time To Repair
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Provides valuable insight for customers in regard
to their ability to quickly resolve service
interruptions
• MTTR trends are especially useful when reviewing
workloads and staffing needs
• Majority of SLAs between IT and their customers
are based on MTTR, or timeliness of resolution
Define the metrics/indicators that measure
if goals are being met
• Mean time (in hours) from incident start to
incident resolution.
Suggested improvement actions
• Automated assignment lookups and routing
• Accuracy of the CMDB
• Incident templates
• SLA escalation and notifications
What do the ITIL books say?
• MTTR is the average time taken to repair a
configuration item or IT Service after a Failure.
• MTTR is measured from when the CI or Service
fails until it is repaired.
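A minimal Python sketch of the MTTR calculation, taking the mean of (resolution time minus start time) in hours. The `opened_at` and `resolved_at` field names are assumptions for the example; unresolved incidents are skipped.

```python
from datetime import datetime


def mttr_hours(incidents):
    """Return the mean hours from incident start to resolution, or None.

    'opened_at' and 'resolved_at' are hypothetical field names.
    """
    durations = [
        (i["resolved_at"] - i["opened_at"]).total_seconds() / 3600
        for i in incidents
        if i.get("resolved_at")
    ]
    if not durations:
        return None
    return sum(durations) / len(durations)


mttr_sample = [
    {"opened_at": datetime(2014, 5, 1, 8), "resolved_at": datetime(2014, 5, 1, 10)},   # 2 hours
    {"opened_at": datetime(2014, 5, 2, 9), "resolved_at": datetime(2014, 5, 2, 15)},   # 6 hours
    {"opened_at": datetime(2014, 5, 3, 9), "resolved_at": None},                       # unresolved
]
print(mttr_hours(mttr_sample))
```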
KPI Example #8:
Request Backlog
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Increasing backlogs of tickets created vs. tickets
closed are indications of demand exceeding
delivery capability
• Increased backlogs can indicate a reduction in staff
productivity and increase in customer
dissatisfaction
• Backlogs should be closely monitored to ensure
that requests are managed within the expected
customer commitments (SLAs)
Define the metrics/indicators that measure
if goals are being met
• Number of new requests – number of closed
requests
Suggested improvement actions
• Trend request backlogs using the 7 day running
SUM aggregate, to manage weekly trends – from
a high level perspective
• Monitor individual SLA commitments on request
and requested item records to manage trends by
staff/day workloads – from a daily perspective
What do the ITIL books say?
• Customer requests should be closely managed to
ensure that user requests are properly
reviewed/approved and fulfilled within expected
timelines
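The backlog metric and the 7 day running SUM mentioned on this slide could be sketched in Python like this. The daily counts are made-up sample data, and the function names are hypothetical; the running sum simply uses fewer days at the start of the series.

```python
def daily_backlog_change(new_per_day, closed_per_day):
    """New requests minus closed requests, one value per day."""
    return [n - c for n, c in zip(new_per_day, closed_per_day)]


def running_sum(values, window=7):
    """Sum of the most recent `window` values at each position."""
    return [
        sum(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


# Made-up daily counts for eight consecutive days.
new_requests = [12, 15, 9, 14, 11, 3, 2, 13]
closed_requests = [10, 12, 11, 13, 12, 1, 0, 15]

delta = daily_backlog_change(new_requests, closed_requests)
print(delta)               # daily backlog change
print(running_sum(delta))  # weekly view of the same series
```

A positive running sum means more requests were opened than closed over the past week, which is the demand-exceeds-capacity signal described in the goals.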
KPI Example #9:
% open problems classified as known error
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Problem records that never get classified as a
known error are an indication that the problem is
either low priority (should it really be a problem?),
or that the organization may have a lack of
training/skills/resources to adequately investigate
logged problems.
Define the metrics/indicators that measure
if goals are being met
• Number of open problems classified as known
error / Number of open problems * 100
Suggested improvement actions
• Priority setting and resource allocations should
be done during weekly problem review meetings
• Problem managers should ensure that the most
important problem records are being worked on,
and that progress is being made towards
identifying root cause.
What do the ITIL books say?
• Known errors document the status of a problem,
its root cause and workaround.
KPI Example #10:
Related incidents in open problems
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Incident matching is an important part of the
Incident Management process.
• Matching an incident to a documented cause of
similar incidents (problem) can assist in timely
resolution of incidents
• Support staff can leverage workarounds that are
documented in the related problem record, as part
of resolving new occurrences of the problem.
Define the metrics/indicators that measure
if goals are being met
• The sum of the related incidents
(problem.u_related_incidents) to the number of
problems open at the end of the collection date.
Suggested improvement actions
• Foster and encourage a culture of relating
incidents to problem records, during the logging
(if possible) and resolution stages of the Incident
lifecycle.
What do the ITIL books say?
• Incidents related to problems are an indicator of
recurring incidents, and should be leveraged for
prioritizing problem investigations.
KPI Example #11:
% of rejected changes
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Changes take time and effort to plan/prepare for
implementation
• Rejected changes are an indication of poorly
planned changes
• An increase in this KPI can indicate the following:
– Change requestors are not planning outside of their
own technology ‘silo’ to understand the larger
potential impact of proposed changes
– Increase in wasted resources, as time spent on
planning/documenting rejected changes does not
produce value
Define the metrics/indicators that measure
if goals are being met
• Number of closed changes rejected / Number of
closed changes * 100.
Suggested improvement actions
• As part of the Change Management process, the
Change Manager should include a review of
‘rejected changes’ in the CAB agenda.
• This review will use a continual improvement
focus, so that the reason for rejection can be
understood and avoided in future.
What do the ITIL books say?
• Change authorities have a responsibility to
review proposed changes, and only authorize
activities that adhere to the risk and business
value policies of the organization
KPI Example #12:
Average % of emergency changes in last 7 days
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• Emergency changes are associated with higher
levels of risk, as they involve less planning,
coordination, and scrutiny than comprehensive
changes (aka: ITIL ‘normal’ changes).
• This KPI is aligned to the goal of optimizing overall
business risk.
Define the metrics/indicators that measure
if goals are being met
• Number of new changes classified as
emergency / Number of new changes * 100
Suggested improvement actions
• Establish a defined emergency change approval
process
• Weekly reviews of emergency changes
implemented, to determine any negative
impacts
What do the ITIL books say?
• The number of emergency changes proposed
should be kept to an absolute minimum, because
they are generally more disruptive and prone to
failure
• ITIL KPI: Reduction in the percentage of changes
that are categorized as emergency changes
KPI Example #13:
% of new incidents caused by changes
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• One of the fundamental objectives of Change
Management is to minimize disruption to the
business
• This metric assists in analyzing the extent to which
this objective is achieved
• The number of incidents resulting from change
should be minimized and decrease over time.
Define the metrics/indicators that measure
if goals are being met
• Number of new incidents caused by changes /
Number of new incidents in last week * 100
Suggested improvement actions
• Preventative: Ensure testing of changes includes
scenarios simulating the live environment, to
identify unanticipated change impacts.
• Reactive: Daily reviews of incidents caused by
changes, to determine any issues with current
projects/change efforts.
What do the ITIL books say?
• ITIL KPI: % Reduction in the number of incidents
attributed to changes
• Aligned with the ITIL goal of ‘optimizing overall
business risk’
KPI Example Bonus:
Lost production days of closed incidents
Identify the processes/activities/domains of
what needs to be managed
• Incident Management
Articulate the goals to be met
• To the Business, a loss of productivity means fewer
products shipped, fewer items sold, and more paid
resources doing less
• It is important for service providers to understand
the impact that IT services are having on the
Business
• IT needs to be efficient and effective when
delivering IT services, but must never lose visibility
of the impact the quality of services has on its
customers
Define the metrics/indicators that measure
if goals are being met
((([Closed incidents for Priority-Critical]a=0
    * [Mean time to resolve (MTTR) incidents - Cumulative AVERAGE year to date]a=6
    * [critical-incident people affected]
    * [Organization Size])
  + ([Closed incidents for Priority-Urgent]a=4
    * [Mean time to resolve (MTTR) incidents - Cumulative AVERAGE year to date]a=6
    * [urgent-incidents affected people]
    * [Organization Size])
  + ([Closed incidents for Priority-Moderate]a=18
    * [Mean time to resolve (MTTR) incidents - Cumulative AVERAGE year to date]a=6
    * [moderate-incidents affected people]
    * [Organization Size])
  + ([Closed incidents for Priority-Normal]a=8
    * [Mean time to resolve (MTTR) incidents - Cumulative AVERAGE year to date]a=6
    * [normal-incidents affected people]
    * [Organization Size])
  + ([Closed incidents for Priority-Low]a=6
    * [Mean time to resolve (MTTR) incidents - Cumulative AVERAGE year to date]a=6
    * [low-incidents affected people]
    * [Organization Size])) / 24)
* [Productivity loss]
What do the ITIL books say?
• It is critically important to measure and
understand the impact to the Business of
Service interruptions (aka: incidents)
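The structure of the lost production days formula can be sketched in Python as a sum over priorities. This is only one interpretation: it assumes MTTR is in hours (hence the division by 24), that "people affected" is the share of the organization hit by an incident of that priority, and that "Productivity loss" is a fraction between 0 and 1. The function name, parameter names, and sample values are all hypothetical.

```python
def lost_production_days(closed_by_priority, mttr_hours, affected_share,
                         org_size, productivity_loss):
    """Estimate lost production days from closed incidents.

    For each priority: closed count * MTTR (hours) * share of people
    affected * organization size gives person-hours of disruption.
    The total is converted to days and scaled by the productivity loss.
    """
    person_hours = sum(
        count * mttr_hours * affected_share[priority] * org_size
        for priority, count in closed_by_priority.items()
    )
    return person_hours / 24 * productivity_loss


# Illustrative inputs only; none of these values come from the deck.
estimate = lost_production_days(
    closed_by_priority={"critical": 1, "urgent": 2},
    mttr_hours=6,
    affected_share={"critical": 0.5, "urgent": 0.1},
    org_size=1000,
    productivity_loss=0.5,
)
print(round(estimate, 2))
```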