
IBM Tivoli Storage Productivity Center V4.2.1 –
Performance Management Best Practices
Version 2.5 (March 7, 2011)
Sergio Bonilla
Xin Wang
IBM Tivoli Storage Productivity Center development,
San Jose, CA
Second Edition (March 2011)
This edition applies to Version 4, Release 2, of IBM Tivoli Storage Productivity Center
© Copyright International Business Machines Corporation 2011. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by
GSA ADP Schedule Contract with IBM Corp.
Table of Contents
1 Notices
1.1 Legal Notice
1.2 Trademarks
1.3 Acknowledgement
1.4 Other IBM Tivoli Storage Productivity Center Publications
2 IBM Tivoli Storage Productivity Center Performance Management
2.1 Overview
2.2 Disk Performance and Fabric Performance
2.3 Performance Metrics
3 Setup and Configuration
3.1 Performance Data collection
3.1.1 Adding a Device
3.1.2 Create Threshold Alerts
3.1.3 Create Performance Monitor
3.1.4 Check Performance Monitor Status
3.2 Retention for performance data
3.3 Common Issues
3.3.1 General issues
3.3.2 ESS and DS Related Issues
3.3.3 DS4000/DS5000 Related Issues
3.3.4 HDS Related Issues
4. Top Reports and Graphs a Storage Administrator May Want to Run
4.1 Tabular Reports
4.2 Drill up and Drill down
4.3 Historic Charts
4.4 Batch Reports
4.5 Constraint Violation Reports
4.6 Top Hit Reports
5. SAN Planner and Storage Optimizer
6. Summary
7. Reference
Appendix A Available Metrics
Appendix B Available Thresholds
Appendix C DS3000, DS4000 and DS5000 Metrics
1 Notices
1.1 Legal Notice
This information was developed for products and services offered in the U.S.A. IBM may not offer
the products, services, or features discussed in this document in other countries. Consult your
local IBM representative for information on the products and services currently available in your
area. Any reference to an IBM product, program, or service is not intended to state or imply that
only that IBM product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may be used instead.
However, it is the user's responsibility to evaluate and verify the operation of any non-IBM
product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this
document. The furnishing of this document does not give you any license to these patents. You
can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES
CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties
in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are
periodically made to the information herein; these changes will be incorporated in new editions of
the publication. IBM may make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and
do not in any manner serve as an endorsement of those Web sites. The materials at those Web
sites are not part of the materials for this IBM product and use of those Web sites is at your own
risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate
without incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their
published announcements or other publicly available sources. IBM has not tested those products
and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.
This information contains examples of data and reports used in daily business operations. To
illustrate them as completely as possible, the examples include the names of individuals,
companies, brands, and products. All of these names are fictitious and any similarity to the
names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates
programming techniques on various operating platforms. You may copy, modify, and distribute
these sample programs in any form without payment to IBM, for the purposes of developing,
using, marketing or distributing application programs conforming to the application programming
interface for the operating platform for which the sample programs are written. These examples
have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply
reliability, serviceability, or function of these programs. You may copy, modify, and distribute
these sample programs in any form without payment to IBM for the purposes of developing, using,
marketing, or distributing application programs conforming to IBM's application programming
interfaces.
1.2 Trademarks
The following terms are trademarks or registered trademarks of the International Business
Machines Corporation in the United States or other countries or both:
AIX®
Passport Advantage®
Tivoli Storage®
DB2®
pSeries®
WebSphere®
DS4000, DS6000, DS8000
Redbooks (logo)
XIV®
Enterprise Storage Server®
Redbooks
zSeries®
server®
iSeries
Storwize®
Tivoli®
The following terms are trademarks or registered trademarks of other companies:
Microsoft, Windows, Windows XP and the Windows logo are trademarks of Microsoft
Corporation in the United States, other countries, or both.
UNIX is a registered trademark of the Open Group in the United States and other countries.
Java, Solaris, and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc.
in the United States, other countries, or both.
Intel is a registered trademark of the Intel Corporation or its subsidiaries in the United States and
other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
CLARiiON and Symmetrix are registered trademarks of the EMC Corporation.
HiCommand is a registered trademark of Hitachi Data Systems Corporation.
Brocade and the Brocade logo are trademarks or registered trademarks of Brocade
Communications Systems, Inc., in the United States and/or in other countries.
McDATA and Intrepid are registered trademarks of McDATA Corporation.
Cisco is a registered trademark of Cisco Systems, Inc. and/or its affiliates in the U.S. and certain
other countries.
Engenio and the Engenio logo are trademarks or registered trademarks of LSI Logic Corporation.
Other company, product, or service names may be trademarks or service marks of others.
1.3 Acknowledgement
The materials in this document have been collected from work in the IBM Tivoli Storage
Productivity Center development lab, from other labs within IBM, from experiences in the field at
customer locations, and from contributions offered by people who have discovered valuable tips
and documented their solutions.
Many people have helped with the materials that are included in this document, too many to
properly acknowledge here, but special thanks goes to Xin Wang who compiled the original
version of this document. It is a source of information for advanced configuration help and basic
best practices for users wanting to get started quickly with Tivoli Storage Productivity Center.
1.4 Other IBM Tivoli Storage Productivity Center Publications
IBM Tivoli Storage Productivity Center 4.2.1 - Performance Management Best Practices is a
supplement to the available Tivoli Storage Productivity Center publications, providing additional
information to help implementers of Tivoli Storage Productivity Center with configuration
questions and to provide guidance in the planning and implementation of Tivoli Storage
Productivity Center. It is expected that an experienced Tivoli Storage Productivity Center installer
will use this document as a supplement for installation and configuration, and use the official
Tivoli Storage Productivity Center publications for overall knowledge of the installation process,
configuration, and usage of the Tivoli Storage Productivity Center components.
This document is not intended to replace the official Tivoli Storage Productivity Center
publications, nor is it a self-standing guide to installation and configuration. You can find
the entire set of Tivoli Storage Productivity Center publications at
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp. These documents are essential to
a successful implementation of Tivoli Storage Productivity Center, and should be used to make
sure that you do all the required steps to install and configure Tivoli Storage Productivity Center.
You should have the official publications available in either softcopy or printed form, read them
and be familiar with their content.
2 IBM Tivoli Storage Productivity Center Performance Management
2.1 Overview
There are three main functions for IBM Tivoli Storage Productivity Center performance
management: performance data collection, performance thresholds/alerts, and performance
reports.
The product can collect performance data for devices - storage subsystems and fibre channel
switches - and store the data in the database for up to a user-defined retention period. The product can
collect performance data either from IBM devices using native-agent APIs, or from IBM and non-IBM
devices that are managed by CIM agents that are at least SMI-S 1.1 compliant. The product can
set thresholds for important performance metrics, and when any boundary condition is crossed,
can notify the user via email, SNMP, or other alerting mechanisms. Lastly, the product can
generate reports and historic trend charts, and can help analyze the source of a performance
bottleneck by drilling down to the components that violated thresholds and to the affected hosts.
The combination of these functions can be used to monitor a complicated storage network
environment, to detect early warning signs of trouble, and to do capacity planning as the overall
workload grows. The collected performance data may also be utilized by both the Storage
Optimizer and SAN Planner functions.
The IBM Tivoli Storage Productivity Center Standard Edition (5608-WC0) includes
performance management for both subsystems and switches, while IBM Tivoli Storage
Productivity Center for Disk (5608-WC4) is only for storage subsystems. IBM Tivoli Storage
Productivity Center Basic Edition (5608-WB1) and IBM Tivoli Storage Productivity Center for Data
(5608-WC3) do not include performance management function.
2.2 Disk Performance and Fabric Performance
Performance management for subsystems is done via the disk manager. Data collection for
subsystems can be scheduled under Disk Manager -> Monitoring -> Subsystem Performance
Monitors, and subsystem performance reports are under Disk Manager -> Reporting ->
Storage Subsystem Performance. Performance management for fibre channel switches is done
via the fabric manager. Data collection for switches can be scheduled under Fabric Manager ->
Monitoring -> Switch Performance Monitors, and switch performance reports are under Fabric
Manager -> Reporting -> Switch Performance.
Some disk performance collection and all fabric performance collection require a CTP-certified CIMOM that is at
least SMI-S 1.1 compliant. Devices that do not require a CIM agent for disk performance
collection include some DS8000 subsystems, and all SVC, Storwize V7000, and XIV Storage
subsystems. These devices use native-agent APIs introduced in IBM Tivoli Storage Productivity
Center V4.2.
2.3 Performance Metrics
IBM Tivoli Storage Productivity Center can report on various performance metrics, which indicate
the particular performance characteristics of the monitored devices.
Two very important metrics for storage subsystems are the throughput in I/Os per sec and the
response time in milliseconds per I/O. Throughput is measured and reported in several different
ways. There is throughput of an entire box (subsystem), or of each cluster (ESS) or controller
(DS6000, DS8000), or of each I/O Group (SVC, Storwize V7000). There are throughputs
measured for each volume (or LUN), throughputs measured at the Fibre Channel interfaces
(ports) on some of the storage boxes and on fibre channel switches, and throughputs measured
at the RAID array after cache hits have been filtered out.
For storage subsystems, it is common to separate the available performance statistics into two
separate domains, the front-end and the back-end of the subsystem. Front-end I/O metrics are a
measure of the traffic between the servers and the storage subsystem, and are characterized by
relatively fast hits in the cache, as well as occasional cache misses that go all the way to the
RAID arrays on the back end. Back-end I/O metrics are a measure of all traffic between the
subsystem cache and the disks in the RAID arrays in the backend of the subsystem. Most
storage subsystems give metrics for both kinds of I/O operations, front- and back-end. We need
to always be clear whether we are looking at throughput and response time at the front-end (very
close to system level response time as measured from a server), or the throughput and response
time at the back-end (just between cache and disk).
The main front-end throughput metrics are:
• Total I/O Rate (overall)
• Read I/O Rate (overall)
• Write I/O Rate (overall)
The corresponding front-end response time metrics are:
• Overall Response Time
• Read Response Time
• Write Response Time
The main back-end throughput metrics are:
• Total Backend I/O Rate (overall)
• Backend Read I/O Rate (overall)
• Backend Write I/O Rate (overall)
The corresponding back-end response time metrics are:
• Overall Backend Response Time
• Backend Read Response Time
• Backend Write Response Time
It is important to remember that response times taken in isolation from throughput rates are
not terribly useful, because it is common for components with negligible throughput rates
to exhibit large (bad) response times. Those bad response times are not significant
to the overall operation of the storage environment if they occurred for only a handful of I/O
operations. It is therefore necessary to have an understanding of which throughput and response
time combinations are significant and which can be ignored. To help in this determination, IBM
Tivoli Storage Productivity Center V4.1.1 introduced a metric called Volume Utilization
Percentage. This metric is based on both I/O Rate and Response Time of a storage volume and
is an approximate measure of the amount of time the volume was busy reading and writing data.
It is therefore safe to ignore bad average response time values for volumes with very low
utilization percentages, and conversely, those volumes with the highest utilization percentages
are the most important for the smooth operation of the storage environment and are most
important to exhibit good response times. When implementing storage tiering using 10K, 15K, or
even SSD drives, the most highly utilized volumes should be considered for being placed on the
best performing underlying media.
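To make this reasoning concrete, the short Python sketch below filters a set of volume measurements so that poor response times are only flagged when the volume's utilization percentage is also high. The volume names, measurement values, and cutoff values are hypothetical; they are not TPC output or recommended limits.

    # Hypothetical volume statistics: (name, utilization %, overall response time in ms).
    # The cutoff values are illustrative only; derive real ones from your own baseline.
    volumes = [
        ("vol001", 85.0, 22.5),   # busy and slow -> candidate for faster media
        ("vol002",  2.0, 40.0),   # slow but nearly idle -> safe to ignore
        ("vol003", 70.0,  4.0),   # busy and fast -> healthy
    ]

    UTIL_CUTOFF = 50.0   # percent of time busy
    RT_CUTOFF   = 15.0   # milliseconds per I/O

    for name, util_pct, resp_ms in volumes:
        if util_pct >= UTIL_CUTOFF and resp_ms >= RT_CUTOFF:
            print(f"{name}: {util_pct}% utilized with {resp_ms} ms response time "
                  "- consider placing on faster media")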
Furthermore, it will be advantageous to track any growth or change in the throughput rates and
response times. It frequently happens that I/O rates grow over time, and that response times
increase as the I/O rates increase. This relationship is what “capacity planning” is all about. As
I/O rates and response times increase, you can use these trends to project when additional
storage performance (as well as capacity) will be required.
Depending on the particular storage environment, it may be that throughput or response times
change drastically from hour to hour or day to day. There may be periods when the values fall
outside the expected range of values. In that case, other performance metrics can be used to
understand what is happening. Here are some additional metrics that can be used to make sense
of throughput and response times.
• Total Cache Hit percentage
• Read Cache Hit Percentage
• NVS Full Percentage
• Read Transfer Size (KB/Op)
• Write Transfer Size (KB/Op)
Low cache hit percentages can drive up response times, since a cache miss requires access to
the backend storage. Low hit percentages will also tend to increase the utilization percentage of
the backend storage, which may adversely affect the back-end throughput and response times.
High NVS Full Percentage (also known as Write-cache Delay Percentage) can drive up the write
response times. High transfer sizes usually indicate more of a batch workload, in which case the
overall data rates are more important than the I/O rates and the response times.
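The influence of cache hit percentage on front-end response time can be pictured with a simple weighted-average model. The Python sketch below is an illustrative approximation only; the assumed hit and miss service times are invented and real subsystems behave in more complex ways.

    # Approximate the front-end read response time as a weighted average of a fast
    # cache hit and a slow miss that must go to the back-end disks (invented values).
    CACHE_HIT_MS  = 0.5    # assumed service time for a read satisfied from cache
    CACHE_MISS_MS = 10.0   # assumed service time for a read that goes to the back end

    for hit_pct in (95, 80, 50):
        hit_fraction = hit_pct / 100.0
        est_ms = hit_fraction * CACHE_HIT_MS + (1.0 - hit_fraction) * CACHE_MISS_MS
        print(f"Read cache hit {hit_pct}% -> estimated read response time {est_ms:.2f} ms")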
In addition to the front-end and back-end metrics, many storage subsystems provide additional
metrics to measure the traffic between the subsystem and host computers, and to measure the
traffic between the subsystem and other subsystems when linked in remote-copy relationships.
Such fibre channel port-based metrics, primarily I/O rates, data rates, and response times are
available for ESS, DS6000, DS8000, SVC, and Storwize V7000 subsystems. ESS, DS6000, and
DS8000 subsystems provide additional break-down between FCP, FICON, and PPRC operations
at each port. SVC and Storwize V7000 subsystems provide additional breakdown between
communications with host computers, backend managed disks, and other nodes within the local
cluster, as well as remote clusters at each subsystem port. XIV subsystems do not provide port-based metrics as of IBM Tivoli Storage Productivity Center V4.2.1.
Similar to the Volume Utilization Percentage mentioned earlier, IBM Tivoli Storage Productivity
Center V4.1.1 also introduced the Port Utilization Percentage metric (available for ESS, DS6000,
and DS8000 storage subsystems). The Port Utilization Percentage is an approximate measure of
the amount of time a port was busy, and can be used to identify over-utilized and under-utilized
ports on the subsystem for potential port balancing. For subsystems where port utilizations are
not available, the simpler Port Bandwidth Percentage metrics provide a measure of the
approximate bandwidth utilization of a port, based on the port’s negotiated speed, and can be
used in a similar fashion. However, be aware that the Port Bandwidth Percentage can give a
misleading indication of port under-utilization, even when a port is not actually under-utilized, if there is
a performance bottleneck elsewhere in the fabric or at the port's communication partner.
For fibre-channel switches, the important metrics are Total Port Packet Rate and Total Port
Data Rate, which provide the traffic pattern over a particular switch port, as well as the Port
Bandwidth Percentage metrics providing indicators of bandwidth usage based on port speeds.
When there are lost frames from the host to the switch port, or from the switch port to a storage
device, the dumped frame rate on the port can be monitored.
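Because the Port Bandwidth Percentage metrics are derived from the port's negotiated speed, a rough equivalent can be computed by hand from a measured data rate. The Python sketch below assumes the common rule of thumb that an N Gbps Fibre Channel link carries roughly N x 100 MB/s of payload; it is an approximation for illustration only, not the exact calculation performed by TPC.

    def approx_port_bandwidth_pct(data_rate_mb_s, negotiated_speed_gbps):
        """Rough bandwidth utilization of a Fibre Channel port.

        Assumes roughly 100 MB/s of usable payload per 1 Gbps of negotiated link
        speed (8b/10b encoding); TPC and the devices may compute this differently.
        """
        usable_mb_s = negotiated_speed_gbps * 100.0
        return 100.0 * data_rate_mb_s / usable_mb_s

    # Example: 280 MB/s measured on a port negotiated at 4 Gbps -> about 70 percent.
    print(f"{approx_port_bandwidth_pct(280.0, 4.0):.1f}%")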
All these metrics can be monitored via reports or graphs in IBM Tivoli Storage Productivity
Center. Also there are several metrics for which you can define thresholds and receive alerts
when measured values do not fall within acceptable boundaries. Some examples of supported
thresholds are:
• Total I/O Rate and Total Data Rate Thresholds
• Total Backend I/O Rate and Total Backend Data Rate Thresholds
• Read Backend Response Time and Write Backend Response Time Thresholds
• Total Port I/O Rate (Packet Rate) and Data Rate Thresholds
• Overall Port Response Time Threshold
• Port Send Utilization Percentage and Port Receive Utilization Percentage Thresholds
• Port Send Bandwidth Percentage and Port Receive Bandwidth Percentage Thresholds
Please see Appendix A for a complete list of performance metrics that IBM Tivoli Storage
Productivity Center supports and Appendix B for a complete list of thresholds supported.
The important thing is to monitor the throughput and response time patterns over time for your
environment, to develop an understanding of normal and expected behaviors. Then you can set
threshold boundaries to alert you when anomalies to the expected behavior are detected. And
you can use the performance reports and graphs to investigate any deviations from normal
patterns or to generate the trends of workload changes.
3 Setup and Configuration
3.1 Performance Data collection
Performance data may be collected either directly from select device types using native-agent
API or from devices managed by a CIM agent (CIMOM) using SMI-S interfaces. Devices that do
not require a managing CIM agent include DS8000, SVC, Storwize V7000, and XIV subsystems.
Devices that require a CIM agent include non-IBM subsystems and switches, ESS subsystems,
DS4000 and DS5000 subsystems, and DS6000 subsystems. XIV subsystems require version
10.1 or higher to collect performance data.
For devices that require the use of a CIM agent, you need to make sure all of the following
prerequisites are met before adding the CIM agent to TPC:
• The version of the CIMOM and the firmware for the device is supported.
• A CIMOM is installed in the environment, either as a proxy on another server or
embedded on the device it manages.
• For subsystems or switches on a private network, be sure to have the CIMOM installed
on a gateway machine so the IBM Tivoli Storage Productivity Center server on a different
network can communicate with it.
• The CIMOM is configured to manage the intended device.
See IBM TotalStorage Productivity Center V3.2 Hints and Tips on how to install and configure
a CIMOM:
http://www-1.ibm.com/support/docview.wss?rs=597&context=STCRLM4&context=SSMMUP&dc=DB500&uid=swg21236488&loc=en_US&cs=utf-8&lang=en
The following steps cover setup and configuration for both subsystems and switches.
3.1.1 Adding a Device
You may add a device to TPC for performance data collection in any of the following ways:
1. Launching the IBM Tivoli Storage Productivity Center -> Configure Devices panel.
2. Clicking the Add Storage Subsystem button on the Disk Manager -> Storage
Subsystems panel.
3. Clicking the Add Fabric button on the Fabric Manager -> Fabrics panel.
4. Clicking the Add button on the Administrative Services -> Data Sources -> Storage
Subsystems panel.
5. Clicking the Add CIMOM button on the Administrative Services -> Data Sources ->
CIMOM Agents panel.
The first four options will launch a “Configure Devices” wizard, at different stages of the
configuration settings.
Adding a device using the first option allows the user to configure either a storage subsystem
or a fabric/switch to be used for future performance monitors. The second and fourth options
automatically choose the “Storage Subsystem” option and advance to a panel that allows you to
choose the subsystem type, while the third option automatically chooses the “Fabric/Switches”
option. Going through the first panel also allows the user to choose whether to add and configure a new
device or to configure a previously discovered device.
Adding a storage subsystem allows you to choose between configuring an IBM DS8000, IBM
XIV, IBM SAN Volume Controller/IBM Storwize V7000, or other storage subsystems managed by
a CIM agent. Adding fabric/switch allows you to choose between configuring a Brocade, McData,
Cisco, QLogic, or Brocade/McData device.
The “Configure Devices” panel updates depending on the device chosen. Fields with labels in
bold are required in order to connect to the chosen device type. If you are configuring “other”
devices, read the CIMOM documentation to get all the information required to connect to the
CIMOM. In order to collect performance data for fabrics, you must choose to configure a CIMOM
agent for the device, instead of configuring an out-of-band fabric agent.
Configuring a device using these wizards will perform the necessary discovery and initial probe
of the device required in order to run a performance data collection against the device.
3.1.2 Create Threshold Alerts
A performance threshold is a mechanism by which you can specify one or more boundary values
for a performance metric, and can specify to be notified if the measured performance data for this
metric violates these boundaries.
Thresholds are applied during the processing of performance data collection, so a performance
monitor for a device must be actively running for a threshold to be evaluated and a violation to be
recognized.
Tivoli Storage Productivity Center ships with several default thresholds enabled (see
Appendix B for a full list of supported thresholds) whose boundary values do not change much from
environment to environment. Metrics such as throughput and response time, however, can vary a lot
depending on the type of workload, the hardware model, the amount of cache memory, and so on,
so there are no generally recommended values to set. Boundary values for these thresholds have to be
determined in each particular environment by establishing a baseline of the normal and expected
performance behavior of the devices in the environment. After the baseline is determined, thresholds
can then be defined to trigger if the measured performance behavior falls outside the normally expected range.
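One simple way to turn such a baseline into candidate boundary values is to compute summary statistics over a representative period and place the stress boundaries some margin above the observed normal range. The Python sketch below shows one possible approach; the sample values and multipliers are assumptions to be tuned per environment, not recommendations from the product documentation.

    import statistics

    # Hypothetical hourly Total I/O Rate samples (ops/s) from a baseline period.
    baseline_io_rate = [1200, 1350, 1100, 1500, 1420, 1380, 1600, 1450, 1300, 1550]

    mean = statistics.mean(baseline_io_rate)
    stdev = statistics.pstdev(baseline_io_rate)

    # Place the stress boundaries above the normal range. The multipliers are
    # arbitrary starting points, not recommendations from the product documentation.
    warning_stress = mean + 2 * stdev
    critical_stress = mean + 3 * stdev

    print(f"Baseline mean {mean:.0f} ops/s; "
          f"suggested warning stress {warning_stress:.0f}, critical stress {critical_stress:.0f}")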
Thresholds are device type and component type specific, meaning that each threshold may
apply to only a subset of all supported device types and to only a subset of supported component
types for each device type. Every threshold is associated with a particular metric; checking that
metric’s value at each collection interval determines whether the threshold is violated or not.
To create an alert for subsystem thresholds, go to Disk Manager -> Alerting -> Storage
Subsystem Alerts, right-click to select create storage subsystem alert (see Figure 1):
• Alert tab – In the triggering condition area, select from the drop down list a triggering
condition (threshold alerts have names ending with “Threshold”), ensure that the
threshold is enabled via the checkbox at the top of the panel, and then enter the
threshold boundary values for the desired boundary conditions. Tivoli Storage
Productivity Center V4.1.1 and above allows decimal values in the threshold boundary
values, prior versions only allow integer values.
• Alert tab – Some thresholds are associated with an optional filter condition, which is
displayed in the triggering condition area. If displayed, you can enable or disable the filter,
and if it is enabled, you can set the filter boundary condition. When the filter is enabled and its
condition is met, violations of this threshold are ignored.
• Alert tab – In the alert suppression area, select whether to trigger alerts for both critical
and warning conditions or only critical conditions or not to trigger any alerts. The
suppressed alerts will not create alert log entries or cause any action defined in the triggered
action area to be taken, but they will still be visible in the constraint violation reports.
• Alert tab – In the alert suppression area, select whether to suppress repeating alerts.
You may either suppress alerts until the triggering condition has been violated
continuously for a specified length of time or to suppress subsequent violations for a
length of time after the initial violation. Alerts suppressed will still be visible in the
constraint violation reports.
• Alert tab – In the triggered action area, select one of the following actions: SNMP trap,
TEC/OMNIbus event, login notification, Windows event log, run script, or email.
• Storage subsystem tab – move the subsystem(s) you want to monitor into the right-hand panel (Selected subsystems). Make sure these are the subsystems for which you
will define performance monitors.
• Save the alert with a name.
Figure 1. Threshold alert creation panel for storage subsystems.
To create an alert for switch thresholds, go to Fabric Manager -> Alerting -> Switch Alerts,
right-click to select create switch alert, and follow the same steps as for subsystems described
above.
There are a few points that need to be addressed in order to understand threshold settings:
1. There are two types of boundaries for each threshold, the upper boundary (stress)
and lower boundary (idle). When a metric’s value exceeds the upper boundary or is
below the lower boundary, it will trigger an alert.
2. There are two levels of alerts, warning and critical. The combination of boundary type
and level type generates four different threshold settings: critical stress, warning
stress, warning idle and critical idle. Most threshold values are in descending order
(critical stress has the highest value that indicates high stress on the device, and
critical idle has the lowest value) while Cache Holding Time is the only threshold in
ascending order.
3. If the user is only interested in receiving alerts for certain boundaries, the other
boundaries should be left blank. The performance manager only checks boundary
conditions that have input values, so no alerts will be sent for the conditions that are
left blank (see the sketch after this list).
4. The storage subsystem alerts will be displayed under IBM Tivoli Storage
Productivity Center -> Alerting -> Alert logs -> All, as well as under Storage
Subsystem. Another important way to look at the exception data is through the
constraint violation reports described in section 4.5.
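The following Python sketch illustrates how a metric value might be checked against the four boundary levels, skipping any boundary that is left blank. It is a simplified model of the behavior described in the list above, not the product's actual implementation, and the boundary values used in the example are invented.

    def check_threshold(value, critical_stress=None, warning_stress=None,
                        warning_idle=None, critical_idle=None):
        """Return the alert level for a metric value, or None if no boundary is crossed.

        Boundaries left as None are skipped, mirroring a boundary left blank in the
        alert definition. Assumes a descending-order threshold (high value = stress),
        which holds for every threshold except Cache Holding Time.
        """
        if critical_stress is not None and value >= critical_stress:
            return "critical stress"
        if warning_stress is not None and value >= warning_stress:
            return "warning stress"
        if critical_idle is not None and value <= critical_idle:
            return "critical idle"
        if warning_idle is not None and value <= warning_idle:
            return "warning idle"
        return None

    # Example: only the stress boundaries are filled in, so idle conditions never alert.
    print(check_threshold(2400, critical_stress=2000, warning_stress=1500))  # critical stress
    print(check_threshold(5,    critical_stress=2000, warning_stress=1500))  # None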
3.1.3 Create Performance Monitor
A performance data collection is defined via a mechanism called a monitor job, which can be
run manually (immediately), scheduled for one-time execution, or scheduled for repeated
execution, as desired.
A monitor job can only run successfully after the device has been probed successfully. To
create a performance monitor on a subsystem, go to Disk Manager -> Monitoring ->
Subsystem Performance Monitors, right-click to select create performance monitor.
• Storage Subsystem tab - move the subsystem(s) you want to monitor into the right-hand
panel (Selected subsystems)
• Sampling and scheduling tab – enter how frequently the data should be collected and
saved (the smaller the interval length, the more granular the performance data), when the
monitor will be run and the duration for the collection
• Save the monitor with a name
To create a performance monitor for a switch, go to Fabric Manager -> Monitoring -> Switch
Performance Monitors, follow the same steps as above, substituting storage subsystem with
switch.
The monitor will start at the scheduled time, but the performance sample data will be collected
a few minutes later. For example, if your monitor is scheduled to start at 9 am to collect with an
interval length of 5 minutes, the first performance data might be inserted into the database 10-15
minutes later, and the second set of performance data will be inserted after 5 more minutes. Only after
the first sample data is inserted into the database, in this case around 9:10 or 9:15 am, will you
be able to view the performance reports.
Because of this, here are some best practices for setting up the schedule
and duration of a performance monitor:
1. Monitor duration – if a monitor is intended to run for a long time, choose to run it
indefinitely. The performance monitor is optimized such that running indefinitely will
be more efficient than running, stopping, and starting again.
2. You should only have one performance monitor defined per storage device.
3. Prior to v4.1.1, if you want to run the same monitor at different workload periods, set
the duration to be 1 hour less than the difference between the two starting points.
This gives the collection engine one hour to finish up the first collection and shutdown
properly. For example, if you want to start a monitor at 12 am and 12 pm on the same
day, the duration for the 12 am collection has to be 11 hours or less, so the monitor
can start again at 12 pm successfully.
The same is true for a repeated run. If you want to run the same monitor daily, be
sure the duration of the monitor will be 23 hours or less. If you want to run the same
monitor weekly, the duration of the monitor will need to be 7x24 -1 = 167 hours or
less.
4. Later versions of IBM Tivoli Storage Productivity Center no longer require reducing
the duration by 1 hour when running back-to-back performance monitors for the
same device. To avoid overlapping back-to-back monitors, v4.1.1 and higher will
automatically reduce the duration of the preceding run by one interval period to
allow sufficient time for the monitor to end. There is no data loss, as the subsequent
monitor will retain the previous interval’s performance data for the next delta.
During a performance sample collection, the hourly and daily summary for each performance
metric are computed based on the sample data. The summary data reflects the performance
characteristics of the component over certain time periods while the sample data shows the
performance right at that moment.
One more thing to note about a performance monitor and its sample data: the clock on the
server might differ from the clock on the device. The performance monitor always uses the device
time on the samples it collects, and then converts it into the time zone (if different) of the IBM Tivoli
Storage Productivity Center server.
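As a minimal illustration of that kind of conversion (the zone names and timestamp are placeholders, not values produced by TPC):

    from datetime import datetime
    from zoneinfo import ZoneInfo   # Python 3.9+

    # Hypothetical example: the device clock reports UTC while the TPC server
    # runs in US Pacific time; the sample timestamp is converted to server time.
    device_time = datetime(2011, 3, 7, 17, 5, 0, tzinfo=ZoneInfo("UTC"))
    server_time = device_time.astimezone(ZoneInfo("America/Los_Angeles"))

    print("Device timestamp:", device_time.isoformat())   # 2011-03-07T17:05:00+00:00
    print("Server timestamp:", server_time.isoformat())   # 2011-03-07T09:05:00-08:00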
3.1.4 Check Performance Monitor Status
When the performance monitor job starts to run, you begin to collect performance data for the
device. You should check the status of the monitor job to make sure it starts and continues running.
Expand Subsystem Performance Monitors, right-click on the monitor, select Job History, and
check the status of the job you want to view. Alternatively, you may navigate to IBM Tivoli
Storage Productivity Center -> Job Management and find the performance job in the list of
jobs. If the status is blue, the monitor is still running without issues. If the status is yellow, you can
check the warning messages; the monitor will continue to run with warnings. For
example, if there are “missing a sample data” warning messages, the monitor will continue to run;
only if the monitor misses all the data it should collect will the status turn red and the
monitor fail. If the status is green, the monitor completed successfully.
To view the job log, select the performance job from the list of scheduled jobs. The runs of the
particular job will be listed in the bottom panel. Expand the run you are interested in, select the
job, and click the View Log File(s) button. Normally the job log will have error messages logged
for a failed collection.
There are a few common issues that may lead to failed data collection. See section 3.3 for
details.
3.2 Retention for performance data
After the monitor is created, the user should configure the retention of performance data in the
database. Expand Administrative Services –> Configuration –> Resource History Retention,
under Performance Monitors there are three options: retention for collected sample data (labeled
“per performance monitoring task”), retention for aggregated hourly data, and retention for daily
data.
Sample data is the data that is collected at the specified interval length of the monitor, for
example data collected every 5 minutes. The default retention period for sample data is 14 days.
This means that by default, IBM Tivoli Storage Productivity Center keeps the individual 5 minute
samples for 14 days before they are purged. Individual samples are summarized into hourly and
daily data; for example, the summary of 12 of the 5-minute samples is saved as an hourly
performance data record, and the summary of 288 such samples is saved as a daily performance
data record. The default retention periods for hourly and daily data are 30 days and 90 days,
respectively. You can change all those values based on your need to retain historical
performance information, but please be aware of the implication to the size of IBM Tivoli Storage
Productivity Center database if performance data is kept longer, especially the sample data.
Here are a few formulas the user can use to estimate the size of performance data:
For subsystems, the biggest component is volume, and the biggest performance sample data will
be that of volumes. For switches, the performance data is proportional to the number of ports in a
switch. Assuming:
NumSS = number of subsystems
NumSW = number of switches
NumV = average number of volumes in a subsystem
NumPt = average number of ports in a switch
CR = number of samples collected per hour (for a sample interval of 5 minutes, this is 60/5 = 12 samples)
Rs = retention for sample data in days
Rh = retention for hourly summarized data in days
Rd = retention for daily summarized data in days
The estimated space required may be calculated using the following formulas:
Storage subsystem performance sample data size = NumSS * NumV * CR * 24 * Rs * 200 bytes
Storage subsystem performance aggregated data size = NumSS * NumV * (24 * Rh + Rd) * 200 bytes
Switch performance sample data size = NumSW * NumPt * CR * 24 * Rs * 150 bytes
Switch performance aggregated data size = NumSW * NumPt * (24 * Rh + Rd) * 150 bytes
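A quick way to compare different retention settings is to plug the terms above into a small script. The Python sketch below simply implements the formulas as written; the example environment values are hypothetical.

    def estimate_perf_data_bytes(num_ss, num_v, num_sw, num_pt, interval_min, rs, rh, rd):
        """Estimate database space for performance data using the formulas above."""
        cr = 60 // interval_min                       # samples collected per hour
        ss_sample = num_ss * num_v * cr * 24 * rs * 200
        ss_aggr   = num_ss * num_v * (24 * rh + rd) * 200
        sw_sample = num_sw * num_pt * cr * 24 * rs * 150
        sw_aggr   = num_sw * num_pt * (24 * rh + rd) * 150
        return ss_sample + ss_aggr + sw_sample + sw_aggr

    # Hypothetical environment: 4 subsystems averaging 500 volumes, 6 switches averaging
    # 64 ports, 5-minute samples, default retention of 14 / 30 / 90 days.
    total = estimate_perf_data_bytes(4, 500, 6, 64, 5, 14, 30, 90)
    print(f"Estimated performance data size: {total / 1024**3:.1f} GiB")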
3.3 Common Issues
There are a few known issues that may lead to failed performance data collection. Most of them
are related to the configuration of devices or the environment. Here are a few hints and tips on
those known issues:
3.3.1 General issues
Invalid data returned by either the firmware or the CIMOM of a managed device may cause a data
spike during a polling interval. Such spikes skew the results of averaged and aggregated
data, resulting in unreliable performance data that can also reduce the effectiveness
of the Storage Optimizer. To alleviate this, code was introduced in version 3.1.3 to detect data
spikes; when a spike is detected, the performance data for that polling interval is not inserted into the
database, and a message indicating that the polling interval was skipped is written to the
performance monitor’s job log.
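A simplified illustration of this kind of spike filtering is shown below. It is not the product's actual detection algorithm; the cutoff factor and the sample values are invented and only convey the general idea of skipping an interval whose value is wildly out of line with recent history.

    def is_spike(value, recent_values, factor=10.0):
        """Treat a value as a spike if it exceeds the recent average by a large factor.

        Illustrative heuristic only; this is not the detection logic used by TPC.
        """
        if not recent_values:
            return False
        avg = sum(recent_values) / len(recent_values)
        return avg > 0 and value > factor * avg

    history = [1200, 1150, 1300, 1250]
    for sample in (1280, 9999999):
        if is_spike(sample, history):
            print(f"Skipping interval: spike detected ({sample})")
        else:
            history.append(sample)
            print(f"Inserted sample {sample}")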
3.3.2 ESS and DS Related Issues
Any firewalls between the ESS CIMOM host server and the ESS subsystem should be configured
to allow LIST PERFSTATS traffic through. If this is not possible, then both the ESS CIMOM host
server and the ESS subsystem must be on the same side of any existing firewalls. In addition,
all IP ports above 1023 on the CIMOM server must be opened to receive performance data from the
ESS.
The port bandwidth usage percentage for DS6000 subsystems may be displayed as “N/A” in
the reports. This is due to the port speeds not being available from the device. The DS6000
CIMOM may be upgraded to version 5.4.0.99 to reduce the likelihood of this issue.
The storage pools of DS8000 subsystems containing space efficient volumes will have
incomplete performance data collected. The performance manager is unable to determine if the
space efficient volumes are fully allocated, making it impossible to manage the performance for
the ranks, the arrays associated with those ranks, and the device adapters associated with those
arrays, since it cannot determine the performance impact of those volumes. Rather than present
the user with inaccurate and misleading information, the performance manager will not aggregate
the volumes’ metrics to the higher level components.
3.3.3 DS4000/DS5000 Related Issues
Both clusters need to be defined to the Engenio CIMOM. If only one cluster of the DS4000 or
DS5000 is defined to the Engenio CIMOM, performance data will be collected only for the one
cluster, while volume and other component information is still collected for both clusters.
3.3.4 HDS Related Issues
Tivoli Storage Productivity Center 4.1.1 is capable of collecting some performance statistics from
HDS subsystems with HDvM 6.2, but there are currently known limitations to the performance
metrics being returned. As such, Tivoli Storage Productivity Center 4.1.1 does not claim support
for monitoring HDS subsystems with HDvM 6.2.
For more information regarding this limitation, please see:
http://www-01.ibm.com/support/docview.wss?uid=swg21406692
4. Top Reports and Graphs a Storage Administrator May Want to Run
After performance data is collected for the subsystem and the switch, there are a few ways to use
the data and interpret key metrics – via reports, via charts, drill down to problematic components,
review of constraint violation reports, or export of the data for customized reports.
The following table describes the performance reports supported for each device type in IBM
Tivoli Storage Productivity Center 4.1.1. Note that not all SMI-S BSP subsystems support each
report type.
For example, only certain versions of DS3000, DS4000, and DS5000 return performance data
that can be displayed in the By Controller and By Port reports. Please see Appendix C for a list of
metrics and reports supported by DS3000, DS4000, and DS5000.
Device Type                          Supported Performance Report Types
ESS, DS6000, DS8000                  By Subsystem, By Controller, By Array, By Volume, By Port
SVC, Storwize V7000                  By Subsystem, By Volume, By I/O Group, By Module/Node,
                                     By Managed Disk Group, By Managed Disk, By Port
XIV                                  By Subsystem, By Volume, By Module/Node
SMI-S BSP (e.g. DS4000, DS5000)      By Subsystem, By Controller, By Volume, By Port
Switch                               By Switch, By Port
4.1 Tabular Reports
The most straightforward way to view performance data is to go through the corresponding
component manager’s reporting function to view the most recent data. For subsystem
performance reports, go to Disk Manager -> Reporting -> Storage Subsystem Performance,
then select one of the options to view the data.
The report type options, as shown in Figure 2, are:
• By Subsystem – for box level aggregate/averages for ESS, DS, SVC, Storwize V7000,
XIV, and SMI-S BSP
• By Controller – for ESS clusters, and DS and select SMI-S BSP controllers
• By Array – for ESS and DS arrays
• By Volume – for ESS, DS, XIV, and SMI-S BSP volumes/LUNs, and SVC and Storwize
V7000 vdisks
• By Port – for ESS, DS, SVC, Storwize V7000, and SMI-S BSP FC ports on the storage box
• By I/O Group – for SVC and Storwize V7000 I/O groups
• By Node – for SVC and Storwize V7000 nodes
• By Managed Disk Group – for SVC and Storwize V7000 managed disk groups
• By Managed Disk – for SVC and Storwize V7000 managed disks
Figure 2. IBM Tivoli Storage Productivity Center 4.1.1 Performance Reports Options for Disk
Manager.
On the right-hand panel, the available metrics for the particular type of device are shown in the
included columns, and the user can remove any metrics that should not appear in the performance report.
The user can also use the selection button to pick components specific to the device type, and
use the filter button to define filtering criteria of their choosing.
Select the “display latest performance data” option to generate a report on the most recent
data. Historic reports can be created by choosing either the date/time range or by defining how
many days in the past to include in the report. You may display either the latest sample, hourly, or
daily data for either the latest or historic reports.
If the selection is saved, this customized report will show up under IBM Tivoli Storage
Productivity Center -> Reporting -> My Reports –> [Admin]’s Reports. [Admin] here is the
login name used to define the report. See more information regarding this topic in the IBM Tivoli
Storage Productivity Center 4.1.1 User’s Guide.
For switch performance reports, go to Fabric Manager -> Reporting -> Switch Performance.
A report may be created in a similar fashion as a subsystem report. The supported report types
are:
• By Switch
• By Port
4.2 Drill up and Drill down
Based on the latest sample data, drill up and drill down can be done between different
components for ESS, DS8000, DS6000, SVC, Storwize V7000, and XIV. For
ESS/DS8000/DS6000, the user can drill down through reports along this path by clicking the magnifying
glass icon next to a row: “By Controller” -> “By Array” -> “By Volume”. Drill up works in the
reverse direction. For example, by looking at the performance of a volume, you can drill up to the
performance of the underlying array to see if there is more information; and while you are at the
performance report for an array, you can drill down to all the corresponding volumes to see which
volume is imposing significant load on the array.
For SVC and Storwize V7000, a user can drill down to reports in this path: “by Mdisk Group” ->
“by Mdisk”.
4.3 Historic Charts
A historic chart can be created by clicking the chart icon at the top of the tabular report and
choosing “history chart” on the panel that is displayed. The history chart may be created for all the
rows from the previous tabular report or for a selected subset chosen prior to clicking the chart
icon. One or more metrics may be chosen to display in the report, as well as whether to sort the
chart by the metric(s) chosen or by component.
Once the historic chart is generated, you may modify the date and time ranges included in the
chart and click “generate chart” again. You may view the trend of the data by clicking the “show
trends” button.
The historic data in tabular form may not always be available in older versions of IBM Tivoli
Storage Productivity Center; however, in those cases the report can be exported into different
formats for analysis in other tools once the “by sample” history chart is displayed. Click on the File
menu to find various print and export options. You can print the graph to a format such as
HTML or PDF, and export the data into a CSV file for archiving or input to a spreadsheet.
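Once exported, the data can be post-processed outside of TPC. The Python sketch below reads an exported CSV file and fits a simple linear trend to a throughput column, which can help with the capacity-planning projections discussed in section 2.3; the file name and column heading are assumptions and must be adjusted to match the columns actually included in the exported report.

    import csv

    # Assumed export format: one row per sample that includes a
    # "Total I/O Rate (overall)" column; adjust the file name and column
    # heading to match the report that was actually exported.
    rows = []
    with open("ds8000_io_rate.csv", newline="") as f:
        for i, row in enumerate(csv.DictReader(f)):
            rows.append((float(i), float(row["Total I/O Rate (overall)"])))

    assert len(rows) >= 2, "need at least two samples to fit a trend"

    # Least-squares slope: average change in I/O rate per sample interval.
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
             / sum((x - mean_x) ** 2 for x, _ in rows))

    print(f"Average I/O rate {mean_y:.0f} ops/s, trend ~{slope:+.2f} ops/s per interval")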
It’s desirable to track growth in the I/O rate and response time for a particular subsystem or
switch using the historic chart. Also, retention of the performance data in the IBM Tivoli Storage
Productivity Center database is limited, and eventually it will be purged out of the database. It is
important to develop a set of graphs and reports to summarize and visualize the data, and keep
periodic snapshots of performance. The key is to monitor normal operations with key metrics,
develop an understanding of expected behaviors, and then track the behavior for either
performance anomalies or simple growth in the workload.
See Figure 3 for an example of the throughput chart (Total I/O Rate) for a DS8000. This data is
an hourly summary of the I/O rate for this DS8000 for the past day. This data can also easily be exported
into other formats for analysis.
Figure 3. Historic chart on Total I/O rate for a DS8000.
4.4 Batch Reports
Another way to back up the performance data is to use batch reports. This saves a report into a
file on a regularly scheduled basis. You can create a batch report by going to IBM Tivoli Storage
Productivity Center -> Reporting -> My Reports, right-click on the Batch Reports node, and
select Create Batch Report. In order to create batch reports, a data agent must be installed
locally or remotely. Starting with IBM Tivoli Storage Productivity Center V4.1, fix pack 1, the data
agent will only be available with the Data and Standard Edition. Prior to installing a data agent, an
agent manager must be installed, and the device and data servers must have been registered
with the agent manager. For additional information regarding installing the data agent or agent
manager see:
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp?topic=/com.ibm.tpc_V411.doc/fqz0_t_installing_agents.html
When creating a batch report, the Report tab allows you to choose the type of performance
report for the batch job (see Figure 4). Select either a Storage Subsystem Performance report or
Switch Performance report under their respective nodes in the Available list and click the “>>”
button. Only one performance report may be chosen per batch report job. Once a performance
report type has been chosen, the Selection tab will be populated with the available metric
columns for that report type. The panel is similar to the tabular report panel in section 4.1 (see
Figure 2) and features the same options.
On the Options tab, select which agent to run the report on (this will determine the location of
the output file), and choose the type of output file to generate (see Figure 5), such as a CSV file
that may be imported into a spreadsheet or an HTML file. Then choose when and how often you
want this job to run on the When to Run tab. Then save the batch report. When the batch report
is run, the file location is described in the batch job’s log.
For additional information regarding batch reports, see the IBM Tivoli Storage Productivity
Center V4.1.1 info center:
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp?topic=/com.ibm.tpc_V411.doc/fqz0_c_batch_reports.html
Figure 4. Choose a performance report for the batch reporting job.
Figure 5. Choose the agent to run the report on and the type of output to generate.
4.5 Constraint Violation Reports
Another important way to view performance data is through constraint violation reports. For each
device type, there are only certain metrics for which you can set thresholds. See Figure 6 for the
constraint violation options. Go to Disk Manager -> Reporting -> Storage Subsystem Performance ->
Constraint Violations, and all subsystems with violated thresholds will show up in the first general
report. Similar constraint violation reports exist for switches: go to Fabric Manager -> Reporting
-> Switch Performance -> Constraint Violations to get the switch constraint violation
reports.
It is very important that you set meaningful threshold values so that the constraint report can be used
to diagnose problems in your environment. The first report shows the number of violations for
each subsystem during the last 24 hours. If the normal behavior pattern is studied and the
threshold values truly reflect an abnormal condition, the number of violations will indicate the
severity of the problem on the subsystem. This can be used daily to monitor all the devices and to
analyze the trend in your environment.
You can also drill down from the subsystem level to get details on the violations on each
component from this general report. In the detailed report panel, you can click on the Affected
Volumes tab to generate a report showing details on the affected volumes. Under Volumes,
select whether you want the report to show all volumes or only the most active volumes
associated with the subsystem and component. Under Performance data, select whether you
want the report to display performance data for the volumes. The user can also click on the
Affected Hosts tab to generate a report showing details on the affected hosts for
ESS/DS6000/DS8000. The volume report here will show the user which host is affected by this
threshold violation.
In addition, a historic graph can be generated based on constraint violations by clicking on
the chart icon. All the options described in section 4.3 exist here too.
Figure 6. Constraint Violation Reports Options.
4.6 Top Hit Reports
One more way to view performance reports for subsystem devices is to look at the top 25 volumes
with the highest values for a given performance metric (for cache hit, the lowest). Here are the reports
available for subsystems under IBM Tivoli Storage Productivity Center -> Reporting ->
System Reports –> Disk:
• Top Active Volumes Cache Hit Performance
• Top Volumes Data Rate Performance
• Top Volumes Disk Performance
• Top Volumes I/O Rate Performance
• Top Volumes Response Performance
For example, the Top Volumes I/O Rate Performance report will show the 25 busiest volumes
by I/O rate. The main metrics shown in this report are:
• Overall read/write/total I/O rate
• Overall read/write/total data rate
• Read/write/overall response time
Similar top hit reports are available for switches under IBM Tivoli Storage Productivity Center -> Reporting -> System Reports -> Fabric:
• Top Switch Ports Data Rate Performance
• Top Switch Ports Packet Rate Performance
Figure 7. Top hits reporting choices.
These reports help the user quickly look up the top hit volumes/ports for bottleneck
analysis. One caveat is that these top reports are based on the latest sample data, and in some
cases, may not reflect the problem on a component over a certain period. For example, if the
daily average I/O rate is high for a volume but the last sample data is normal, this volume may not
show up on the top 25 reports. Another complication in storage performance data is data wrap,
that is, from one sample interval to the next, the metric value may appear extremely large. This
will also skew these top reports. It is also possible to see some volumes in these reports with low
or no I/O (0 or N/A values for their metrics) if fewer than 25 volumes have high I/O.
There are also other predefined reports under the same System Reports -> Disk node, such as
“Array Performance” and “Subsystem Performance”. These predefined reports are provided with
the product and also show the latest sample data.
5. SAN Planner and Storage Optimizer
The SAN Planner function is available for ESS, DS6000, DS8000, SVC and Storwize V7000
systems in IBM Tivoli Storage Productivity Center 4.2.1. It automates storage provisioning
decisions by using an expert advisor designed to make the decisions that a storage administrator
could make, given enough time and information. The goal is to give good advice using algorithms
that consider many of the same factors an administrator would when deciding where best to
allocate storage. A performance advisor must take several factors into account when
recommending volume allocations (a minimal illustration follows this list):
• Total amount of space required
• Minimum number, maximum number, and sizes of volumes
• Workload requirements
• Contention from other workloads
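The sketch below illustrates the kind of trade-off such an advisor weighs: place the requested
capacity only on pools that have enough free space and acceptable recent utilization, preferring
the least-loaded candidate. It is not the SAN Planner algorithm; the pool fields and the 70%
utilization ceiling are hypothetical.

    # Toy placement advisor -- not the actual SAN Planner algorithm.
    # Each candidate pool carries hypothetical fields: free capacity in GB and
    # recent average utilization (0.0-1.0).
    def pick_pool(pools, required_gb, util_ceiling=0.7):
        candidates = [
            p for p in pools
            if p["free_gb"] >= required_gb and p["utilization"] <= util_ceiling
        ]
        if not candidates:
            return None                  # nothing satisfies space + contention limits
        # Prefer the least-loaded pool; break ties by the most free space.
        return min(candidates, key=lambda p: (p["utilization"], -p["free_gb"]))

    pools = [
        {"name": "pool_a", "free_gb": 800,  "utilization": 0.55},
        {"name": "pool_b", "free_gb": 1500, "utilization": 0.30},
        {"name": "pool_c", "free_gb": 300,  "utilization": 0.10},
    ]
    print(pick_pool(pools, required_gb=500))   # pool_c is too small, so pool_b wins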
Only subsystems that have been discovered and probed will show up in the SAN Planner. To use
the SAN Planner, you need to define the capacity and the workload profile of the new volumes to
be allocated. A few standard workload profiles are provided. Once performance data has been
collected, you can also use historical performance data to define a profile for new volumes whose
workloads will be similar to those of existing volumes. See the IBM Tivoli Storage Productivity
Center User's Guide, Chapter 4, "Managing storage resources", under the section titled "Planning
and modifying storage configurations", for more information:
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.tpc_V411.doc/fqz0_usersguide_v411.pdf
While the SAN Planner tries to identify the RAID arrays or pools with the least workload in order
to recommend where to create new volumes, the Storage Optimizer examines existing volumes to
determine whether any performance bottlenecks exist. The Storage Optimizer then evaluates
several scenarios to determine whether those bottlenecks could be eliminated by moving the
problem volumes to other pools. The Storage Optimizer supports ESS, DS8000, DS6000, DS4000,
SVC and Storwize V7000 in IBM Tivoli Storage Productivity Center 4.2.1. Additional information
about the Storage Optimizer can be found in the IBM Tivoli Storage Productivity Center User's
Guide, Chapter 4, "Managing storage resources", under the section titled "Optimizing storage
configurations".
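The Optimizer's core question, whether moving a volume to another pool would reduce a
bottleneck, can be illustrated with the simple model below. This is only a sketch under stated
assumptions, not the Storage Optimizer's algorithm: pool load is modeled as the sum of the I/O
rates of the volumes each pool hosts, and all names and numbers are hypothetical.

    # Evaluate whether moving a hot volume's load out of its pool lowers the
    # worst-case pool load; returns the best target pool, or None if no move helps.
    def best_relocation(pools, hot_pool, volume_load):
        current_peak = max(pools.values())
        best = None
        for name in pools:
            if name == hot_pool:
                continue
            trial = dict(pools)                 # loads if the volume moved to "name"
            trial[hot_pool] -= volume_load
            trial[name] += volume_load
            peak = max(trial.values())
            if peak < current_peak and (best is None or peak < best[1]):
                best = (name, peak)
        return best

    pools = {"pool_a": 9000.0, "pool_b": 3000.0, "pool_c": 4500.0}
    print(best_relocation(pools, "pool_a", 2500.0))   # -> ('pool_b', 6500.0)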
6. Summary
This paper gives an overview of the performance monitoring and management functions that can
be achieved using IBM Tivoli Storage Productivity Center 4.2.1. It lays out the configuration steps
necessary to start a performance monitor, to set thresholds, and to generate useful reports and
charts for problem diagnosis. It also interprets a small number of performance metrics. The
reporting of those metrics can form the foundation for capacity planning and performance tuning.
7. References
IBM Tivoli Storage Productivity Center Support Site
http://www-01.ibm.com/software/sysmgmt/products/support/IBMTotalStorageProductivityCenterStandardEdition.html
IBM Tivoli Storage Productivity Center Information Center
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp
IBM Tivoli Storage Productivity Center Installation and Configuration Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.tpc_V411.doc/fqz0_t_installing_main.html
IBM Tivoli Storage Productivity Center V4.1.1 User’s Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.tpc_V411.doc/fqz0_usersguide_v411.pdf
IBM Tivoli Storage Productivity Center V4.1.1 Messages
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp?topic=/com.ibm.tpc_V411.doc/tpcmsg41122.html
IBM TotalStorage Productivity Center V3.1 Problem Determination Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.itpc.doc/tpcpdg31.htm
IBM TotalStorage Productivity Center V3.3.2/4.1 Hints and Tips Guide
http://www-01.ibm.com/support/docview.wss?rs=40&context=SSBSEX&context=SSMN28&context=SSMMUP&context=SS8JB5&context=SS8JFM&dc=DB700&dc=DA4A10&uid=swg27008254&loc=en_US&cs=utf-8&lang=en
IBM TotalStorage Productivity Center V3.3 SAN Storage Provisioning Planner White Paper
ftp://ftp.software.ibm.com/common/ssi/sa/wh/n/tsw03026usen/TSW03026USEN.PDF
IBM Tivoli Storage Productivity Center V4.1 Storage Optimizer White Paper
http://www-01.ibm.com/support/docview.wss?uid=swg21389271
SAN Storage Performance Management Using TotalStorage Productivity Center Redbook
http://www.redbooks.ibm.com/redpieces/abstracts/sg247364.html?Open
Supported Storage Products
http://www-01.ibm.com/support/docview.wss?rs=40&context=SSBSEX&q1=subsystem&uid=swg21384734&loc=en_US&cs=utf-8&lang=en
Supported Fabric Devices
http://www-01.ibm.com/support/docview.wss?rs=40&context=SSBSEX&dc=DA420&dc=DA480&dc=DA490&dc=DA430&dc=DA410&dc=DB600&dc=DA400&dc=D600&dc=D700&dc=DB520&dc=DB510&dc=DA500&dc=DA470&dc=DA4A20&dc=DA460&dc=DA440&dc=DB550&dc=DB560&dc=DB700&dc=DB530&dc=DA4A10&dc=DA4A30&dc=DB540&q1=switch&uid=swg21384219&loc=en_US&cs=utf-8&lang=en
Support for Agents, CLI, and GUI
http://www-01.ibm.com/support/docview.wss?rs=40&context=SSBSEX&uid=swg21384678&loc=en_US&cs=UTF-8&lang=en
Appendix A Available Metrics
This table lists the metric name, the types of components for which each metric is available, and
a description. The SMI-S BSP device type mentioned in the table below refers to any storage
subsystem that is managed via a CIMOM that supports SMI-S 1.1 with the Block Server
Performance (BSP) subprofile.
Metrics that require specific versions of IBM Tivoli Storage Productivity Center are noted in
parentheses.
IBM Tivoli Storage Productivity Center V4.2 and higher supports XIV version 10.2.2 and higher.
Limited metrics are supported by XIV version 10.1, including total I/O rate, total data rate, overall
response time, and overall transfer size. The read and write components of these metrics, as well
as read, write, and total cache hits, are provided by XIV version 10.2.2.
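The overall and total rates in this table decompose in the obvious way, which is useful when
reconciling the separate read/write and sequential/non-sequential values against the combined
ones. The first two relations below follow directly from the definitions; the response-time
expression is the usual I/O-weighted average and is stated here as an assumption rather than a
product-documented formula (R_r and R_w are the read and write I/O rates, t_r and t_w the read
and write response times):

    \text{Read I/O Rate (overall)} = \text{Read I/O Rate (normal)} + \text{Read I/O Rate (sequential)}
    \text{Total I/O Rate (overall)} = \text{Read I/O Rate (overall)} + \text{Write I/O Rate (overall)}
    \text{Overall Response Time} \approx \frac{R_r\,t_r + R_w\,t_w}{R_r + R_w}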
Metric | Metric Type | Device/Component Type | Description
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
Average number of I/O
operations per second for non-sequential read operations, for a
particular component over a
time interval.
Average number of I/O
operations per second for
sequential read operations, for a
particular component over a
time interval.
Average number of I/O
operations per second for both
sequential and non-sequential
read operations, for a particular
component over a time interval.
Volume Based Metrics
I/O Rates
Read I/O Rate
(normal)
801
Read I/O rate
(sequential)
802
Read I/O Rate
(overall)
803
Write I/O Rate
(normal)
804
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of I/O
operations per second for non-sequential write operations, for
a particular component over a
time interval.
Write I/O Rate
(sequential)
805
Write I/O Rate
(overall)
806
Total I/O Rate
(normal)
807
Total I/O Rate
(sequential)
808
Total I/O Rate
(overall)
809
Global Mirror Write
I/O Rate
(3.1.3)
937
Global Mirror
Overlapping Write
Percentage
(3.1.3)
938
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Average number of I/O
operations per second for
sequential write operations, for
a particular component over a
time interval.
Average number of I/O
operations per second for both
sequential and non-sequential
write operations, for a particular
component over a time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of I/O
operations per second for non-sequential read and write
operations, for a particular
component over a time interval.
Average number of I/O
operations per second for
sequential read and write
operations, for a particular
component over a time interval.
Average number of I/O
operations per second for both
sequential and non-sequential
read and write operations, for a
particular component over a
time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.1 or higher.
Average number of write
operations per second issued to
the Global Mirror secondary
site, for a particular component
over a time interval.
Average percentage of write
operations issued by the Global
Mirror primary site which were
serialized overlapping writes, for
Global Mirror
Overlapping Write
I/O Rate
(3.1.3)
939
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
HPF Read I/O Rate
(4.1.1)
943
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
HPF Write I/O Rate
(4.1.1)
944
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
Total HPF I/O Rate
(4.1.1)
945
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
HPF I/O
Percentage
(4.1.1)
946
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
PPRC Transfer
Rate
(4.1.1)
947
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
a particular component over a
time interval. For SVC 4.3.1
and later, some overlapping
writes are processed in parallel
(are not serialized), so are
excluded. For earlier SVC
versions, all overlapping writes
were serialized.
Average number of serialized
overlapping write operations per
second encountered by the
Global Mirror primary site, for a
particular component over a
time interval. For SVC 4.3.1
and later, some overlapping
writes are processed in parallel
(are not serialized), so are
excluded. For earlier SVC
versions, all overlapping writes
were serialized.
Average number of read
operations per second that were
issued via the High
Performance FICON (HPF)
feature of the storage
subsystem, for a particular
component over a particular
time interval.
Average number of write
operations per second that were
issued via the High
Performance FICON (HPF)
feature of the storage
subsystem, for a particular
component over a particular
time interval.
Average number of read and
write operations per second that
were issued via the High
Performance FICON (HPF)
feature of the storage
subsystem, for a particular
component over a particular
time interval.
The percentage of all I/O
operations that were issued via
the High Performance FICON
(HPF) feature of the storage
subsystem for a particular
component over a particular
time interval.
Average number of track
transfer operations per second
for Peer-to-Peer Remote Copy
usage, for a particular
component over a particular
Read Data Cache
Hit Percentage
998
XIV Volume
XIV Module
XIV Subsystem
time interval.
Percentage of read data that
was read from the cache, for a
particular component over a
particular time interval.
Write Data Cache
Hit Percentage
999
XIV Volume
XIV Module
XIV Subsystem
Note: Available in v4.2.1.163.
Percentage of write data that
was written to the cache, for a
particular component over a
particular time interval.
Total Data Cache
Hit Percentage
1000
XIV Volume
XIV Module
XIV Subsystem
Note: Available in v4.2.1.163.
Percentage of all data that was
read from or written to the cache, for a
particular component over a
particular time interval.
Small Transfers I/O
Percentage
1007
XIV Volume
XIV Module
XIV Subsystem
Medium Transfers
I/O Percentage
1008
XIV Volume
XIV Module
XIV Subsystem
Large Transfers I/O
Percentage
1009
XIV Volume
XIV Module
XIV Subsystem
Very Large
Transfers I/O
Percentage
1010
XIV Volume
XIV Module
XIV Subsystem
Note: Available in v4.2.1.163.
Percentage of all I/Os that were
operations with small transfer
sizes (<= 8 KB), for a particular
component over a particular
time interval.
Note: Available in v4.2.1.163.
Percentage of all I/Os that were
operations with medium transfer
sizes (> 8 KB and <= 64 KB), for
a particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Percentage of all I/Os that were
operations with large transfer
sizes (> 64 KB and <= 512 KB),
for a particular component over
a particular time interval.
Note: Available in v4.2.1.163.
Percentage of all I/Os that were
operations with very large
transfer sizes (> 512 KB), for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Cache Hit Percentages
Read Cache Hits
810
ESS/DS6K/DS8K Volume
(normal)
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
Read Cache Hits
811
ESS/DS6K/DS8K Volume
(sequential)
ESS/DS6K/DS8K Array
Percentage of cache hits for
non-sequential read operations,
for a particular component over
a time interval.
Percentage of cache hits for
sequential read operations, for a
Read Cache Hits
(overall)
812
Write Cache Hits
(normal)
813
Write Cache Hits
(sequential)
814
Write Cache Hits
(overall)
815
Total Cache Hits
(normal)
816
Total Cache Hits
(sequential)
817
Total Cache Hits
(overall)
818
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
particular component over a
time interval.
Percentage of cache hits for
both sequential and non-sequential read operations, for a
particular component over a
time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Percentage of cache hits for
non-sequential write operations,
for a particular component over
a time interval.
Percentage of cache hits for
sequential write operations, for
a particular component over a
time interval.
Percentage of cache hits for
both sequential and non-sequential write operations, for
a particular component over a
time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Percentage of cache hits for
non-sequential read and write
operations, for a particular
component over a time interval.
Percentage of cache hits for
sequential read and write
operations, for a particular
component over a time interval.
Percentage of cache hits for
both sequential and non-
Readahead
Percentage of
Cache Hits
(3.1.3)
Dirty-Write
Percentage of
Cache Hits
(3.1.3)
Data Rates
Read Data Rate
Write Data Rate
Total Data Rate
890
891
819
820
821
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
sequential read and write
operations, for a particular
component over a time interval.
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
Average number of megabytes
(2^20 bytes) per second that
were transferred for read
operations, for a particular
component over a time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Percentage of all read cache
hits which occurred on
prestaged data, for a particular
component over a time interval.
Percentage of all write cache
hits which occurred on already
dirty data in the cache, for a
particular component over a
time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of megabytes
(2^20 bytes) per second that
were transferred for write
operations, for a particular
component over a time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of megabytes
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
XIV Volume
XIV Module
XIV Subsystem
Small Transfers
Data Percentage
1011
Medium Transfers
Data Percentage
1012
XIV Volume
XIV Module
XIV Subsystem
Large Transfers
Data Percentage
1013
XIV Volume
XIV Module
XIV Subsystem
Very Large
Transfers Data
Percentage
1014
XIV Volume
XIV Module
XIV Subsystem
(2^20 bytes) per second that
were transferred for read and
write operations, for a particular
component over a time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.1 or higher.
Percentage of all data that was
transferred via I/O operations
with small transfer sizes
(<= 8 KB), for a particular
component over a particular
time interval.
Note: Available in v4.2.1.163.
Percentage of all data that was
transferred via I/O operations
with medium transfer sizes
(> 8 KB and <= 64 KB), for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Percentage of all data that was
transferred via I/O operations
with large transfer sizes
(> 64 KB and <= 512 KB), for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Percentage of all data that was
transferred via I/O operations
with very large transfer sizes
(> 512 KB), for a particular
component over a particular
time interval.
Note: Available in v4.2.1.163.
Response Times
Read Response
822
Time
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
Average number of milliseconds
that it took to service each read
operation, for a particular
component over a time interval.
Note: SVC VDisk, Node, I/O
Group, MDisk Group, and
Subsystem support requires
Write Response
Time
Overall Response
Time
823
824
Peak Read
Response Time
(3.1.3)
940
Peak Write
Response Time
(3.1.3)
941
Global Mirror
Secondary Write
Lag
(3.1.3)
942
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of milliseconds
that it took to service each write
operation, for a particular
component over a time interval.
Note: SVC VDisk, Node, I/O
Group, MDisk Group, and
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of milliseconds
that it took to service each I/O
operation (read and write), for a
particular component over a
time interval.
Note: SVC VDisk, Node, I/O
Group, MDisk Group, and
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.1 or higher.
The peak (worst) response time
among all read operations, for a
particular component over a
time interval.
The peak (worst) response time
among all write operations, for a
particular component over a
time interval.
The average number of
additional milliseconds it took to
service each secondary write
operation for Global Mirror,
beyond the time needed to
service the primary writes, for a
particular component over a
particular time interval.
Overall Host
Attributed
Response Time
Percentage
(4.1.1)
948
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Read Cache Hit
Response Time
1001
XIV Volume
XIV Module
XIV Subsystem
Write Cache Hit
Response Time
1002
XIV Volume
XIV Module
XIV Subsystem
Overall Cache Hit
Response
1003
XIV Volume
XIV Module
XIV Subsystem
Read Cache Miss
Response Time
1004
XIV Volume
XIV Module
XIV Subsystem
Write Cache Miss
Response Time
1005
XIV Volume
XIV Module
XIV Subsystem
Overall Cache Miss
Response Time
1006
XIV Volume
XIV Module
XIV Subsystem
This is the percentage of the
average response time
(read+write) which can be
attributed to delays from the
host systems. This is provided
as an aid to diagnose slow
hosts and poorly performing
fabrics. This value is based on
the time taken for hosts to
respond to transfer-ready
notifications from the SVC
nodes (for read) and the time
taken for hosts to send the write
data after the node has
responded to a transfer-ready
notification (for write).
Average number of milliseconds
that it took to service each read
cache hit operation, for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each write
cache hit operation, for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each
cache hit operation (reads and
writes), for a particular
component over a particular
time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each read
cache miss operation, for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each write
cache miss operation, for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each
cache miss operation (reads
and writes), for a particular
component over a particular
time interval.
Small Transfers
Response Time
1015
XIV Volume
XIV Module
XIV Subsystem
Medium Transfers
Response Time
1016
XIV Volume
XIV Module
XIV Subsystem
Large Transfers
Response Time
1017
XIV Volume
XIV Module
XIV Subsystem
Very Large
Transfers
Response Time
1018
XIV Volume
XIV Module
XIV Subsystem
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each I/O
with a small transfer size
(<= 8 KB), for a particular
component over a particular
time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each I/O
with a medium transfer size
(> 8 KB and <= 64 KB), for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each I/O
with a large transfer size
(> 64 KB and <= 512 KB), for a
particular component over a
particular time interval.
Note: Available in v4.2.1.163.
Average number of milliseconds
that it took to service each I/O
with a very large transfer size
(> 512 KB), for a particular
component over a particular
time interval.
Note: Available in v4.2.1.163.
Transfer Sizes
Read Transfer Size
825
Write Transfer Size
826
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
Average number of KB per I/O
for read operations, for a
particular component over a
time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of KB per I/O
for write operations, for a
Overall Transfer
Size
827
Record Mode Reads
Record Mode Read
828
I/O Rate
Record Mode Read
Cache Hits
829
Cache Transfers
Disk to Cache I/O
830
Rate
Cache to Disk I/O
Rate
831
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
XIV Volume
XIV Module
XIV Subsystem
particular component over a
time interval.
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
Average number of I/O
operations per second for
record mode read operations,
for a particular component over
a time interval.
Percentage of cache hits for
record mode read operations,
for a particular component over
a time interval.
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
SVC/Storwize VDisk
SVC/Storwize Node
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.2.2 or higher.
Average number of KB per I/O
for read and write operations,
for a particular component over
a time interval.
Note: SVC Node and SVC
Subsystem support requires
v3.1.3 or above.
Note: XIV metrics require XIV
version 10.1 or higher.
Average number of I/O
operations (track transfers) per
second for disk to cache
transfers, for a particular
component over a time interval.
Note: SVC VDisk, Node, I/O
Group, and Subsystem support
requires v3.1.3 or above.
Average number of I/O
operations (track transfers) per
second for cache to disk
transfers, for a particular
component over a time interval.
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache Constraints
Write-cache Delay
832
ESS/DS6K/DS8K Volume
Percentage
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache Delay
I/O Rate
833
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache
Overflow
Percentage
(3.1.3)
894
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache
Overflow I/O Rate
(3.1.3)
895
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache Flush-through
Percentage
(3.1.3)
896
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache Flush-through I/O Rate
(3.1.3)
897
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache Write-through
Percentage
(3.1.3)
898
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Write-cache Write-through I/O Rate
(3.1.3)
899
SVC/Storwize VDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Note: SVC VDisk, Node, I/O
Group, and Subsystem support
requires v3.1.3 or above.
Percentage of I/O operations
that were delayed due to write-cache space constraints or
other conditions, for a particular
component over a time interval.
(The ratio of delayed operations
to total I/Os.)
Note: SVC VDisk, Node, I/O
Group, and Subsystem support
requires v3.1.3 or above.
Average number of I/O
operations per second that were
delayed due to write-cache
space constraints or other
conditions, for a particular
component over a time interval.
Note: SVC VDisk, Node, I/O
Group, and Subsystem support
requires v3.1.3 or above.
Percentage of write operations
that were delayed due to lack of
write-cache space, for a
particular component over a
time interval.
Average number of tracks per
second that were delayed due
to lack of write-cache space, for
a particular component over a
time interval.
Percentage of write operations
that were processed in Flush-through write mode, for a
particular component over a
time interval.
Average number of tracks per
second that were processed in
Flush-through write mode, for a
particular component over a
time interval.
Percentage of write operations
that were processed in Write-through write mode, for a
particular component over a
time interval.
Average number of tracks per
second that were processed in
Write-through write mode, for a
particular component over a
time interval.
Miscellaneous Computed Values
Cache Holding
834
ESS/DS6K/DS8K Controller
Time
ESS/DS6K/DS8K
Subsystem
CPU Utilization
(3.1.3)
900
Non-Preferred
Node Usage
Percentage
(4.1.1)
949
Volume Utilization
(4.1.1)
978
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize VDisk
SVC/Storwize I/O Group
ESS/DS6K/DS8K Volume
SVC/Storwize VDisk
Average cache holding time, in
seconds, for I/O data in this
subsystem controller (cluster).
Shorter time periods indicate
adverse performance.
Average utilization percentage
of the CPU(s) for a particular
component over a time interval.
The overall percentage of I/O
performed or data transferred
via the non-preferred nodes of
the VDisks, for a particular
component over a particular
time interval.
The approximate utilization
percentage of a particular
volume over a time interval, i.e.
the average amount of time that
the volume was busy reading or
writing data.
Backend Based Metrics
I/O Rates
Backend Read I/O
Rate
Backend Write I/O
Rate
Total Backend I/O
Rate
835
836
837
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Average number of I/O
operations per second for read
operations, for a particular
component over a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of I/O
operations per second for write
operations, for a particular
component over a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of I/O
operations per second for read
and write operations, for a
particular component over a
time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Data Rates
Backend Read
Data Rate
Backend Write
Data Rate
Total Backend
Data Rate
838
839
840
Response Times
Backend Read
841
Response Time
Backend Write
Response Time
Overall Backend
Response Time
842
843
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Average number of megabytes
(2^20 bytes) per second that
were transferred for read
operations, for a particular
component over a time interval.
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Average number of milliseconds
that it took to service each read
operation, for a particular
component over a time interval.
For SVC, this is the external
response time of the
MDisks.
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of megabytes
(2^20 bytes) per second that
were transferred for write
operations, for a particular
component over a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of megabytes
(2^20 bytes) per second that
were transferred for read and
write operations, for a particular
component over a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of milliseconds
that it took to service each write
operation, for a particular
component over a time interval.
For SVC, this is the external
response time of the
MDisks.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of milliseconds
that it took to service each I/O
operation (read and write), for a
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Backend Read
Queue Time
Backend Write
Queue Time
Overall Backend
Queue Time
844
845
846
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Peak Backend
Read Response
Time
(4.1.1)
950
SVC/Storwize MDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
Peak Backend
Write Response
Time
(4.1.1)
951
SVC/Storwize MDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
Peak Backend
Read Queue Time
(4.1.1)
952
SVC/Storwize MDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
particular component over a
time interval.
For SVC, this is the external
response time of the
MDisks.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of milliseconds
that each read operation spent
on the queue before being
issued to the backend device,
for a particular MDisk or MDisk
Group over a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of milliseconds
that each write operation spent
on the queue before being
issued to the backend device,
for a particular MDisk or MDisk
Group over a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of milliseconds
that each read or write operation spent
on the queue before being
issued to the backend device,
for a particular MDisk or MDisk
Group over a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
The peak (worst) response time
among all read operations, for a
particular component over a
time interval. For SVC, this is
the external response time of
the MDisks.
The peak (worst) response time
among all write operations, for a
particular component over a
time interval. For SVC, this is
the external response time of
the MDisks.
The lower bound on the peak
(worst) queue time for read
operations, for a particular
component over a time interval.
The queue time is the amount of
Peak Backend
Write Queue Time
(4.1.1)
953
Transfer Sizes
Backend Read
847
Transfer Size
Backend Write
Transfer Size
Overall
Backend Transfer
Size
848
849
Disk Utilization
Disk Utilization
850
Percentage
Sequential I/O
Percentage
851
SVC/Storwize MDisk
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize MDisk Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Rank
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
ESS/DS6K/DS8K
Subsystem
SVC/Storwize MDisk
SVC/Storwize MDisk Group
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Array
time that the read operation
spent on the queue before
being issued to the backend
device.
The lower bound on the peak
(worst) queue time for write
operations, for a particular
component over a time interval.
The queue time is the amount of
time that the write operation
spent on the queue before
being issued to the backend
device.
Average number of KB per I/O
for read operations, for a
particular component over a
time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of KB per I/O
for write operations, for a
particular component over a
time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
Average number of KB per I/O
for read and write operations,
for a particular component over
a time interval.
Note: SVC Node, I/O Group,
and Subsystem support requires
v3.1.3 or above.
The approximate utilization
percentage of a particular rank
over a time interval, i.e. the
average percent of time that the
disks associated with the array
were busy.
Percent of all I/O operations
performed for a particular array
over a time interval that were
sequential operations.
Front-end and Switch Based Metrics
I/O or Packet Rates
Port Send I/O Rate
852
Port Receive I/O
Rate
Total Port I/O Rate
853
854
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Port
XIV Port
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Port
XIV Port
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Port
XIV Port
Port Send Packet
Rate
855
Switch Port
Switch
Port Receive
Packet Rate
856
Switch Port
Switch
Total Port Packet
Rate
857
Switch Port
Switch
Port to Host Send
I/O Rate
(3.1.3)
901
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
Average number of I/O
operations per second for send
operations, for a particular port
over a time interval.
Note: ESS/DS6K/DS8K
Subsystem and SVC Port,
Node, I/O Group, and
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port support requires v3.3; XIV
Port supported in v4.2.1.163.
Average number of I/O
operations per second for
receive operations, for a
particular port over a time
interval.
Note: ESS/DS6K/DS8K
Subsystem and SVC Port,
Node, I/O Group, and
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port support requires v3.3; XIV
Port supported in v4.2.1.163.
Average number of I/O
operations per second for send
and receive operations, for a
particular port over a time
interval.
Note: ESS/DS6K/DS8K
Subsystem and SVC Port,
Node, I/O Group, and
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port support requires v3.3; XIV
Port supported in v4.2.1.163.
Average number of packets per
second for send operations, for
a particular port over a time
interval.
Average number of packets per
second for receive operations,
for a particular port over a time
interval.
Average number of packets per
second for send and receive
operations, for a particular port
over a time interval.
Average number of exchanges
(I/Os) per second sent to host
computers by a particular
SVC/Storwize Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Host
Receive I/O Rate
(3.1.3)
902
Total Port to Host
I/O Rate
(3.1.3)
903
Port to Disk Send
I/O Rate
(3.1.3)
904
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Disk
Receive I/O Rate
(3.1.3)
905
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to Disk
I/O Rate
(3.1.3)
906
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Local Node
Send I/O Rate
(3.1.3)
907
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Local Node
Receive I/O Rate
(3.1.3)
908
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to Local
Node I/O Rate
(3.1.3)
909
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Remote
Node Send I/O
Rate
(3.1.3)
910
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Remote
Node Receive I/O
Rate
(3.1.3)
911
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to
Remote Node I/O
Rate
(3.1.3)
912
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
component over a time interval.
Average number of exchanges
(I/Os) per second received from
host computers by a particular
component over a time interval.
Average number of exchanges
(I/Os) per second transmitted
between host computers and a
particular component over a
time interval.
Average number of exchanges
(I/Os) per second sent to
storage subsystems by a
particular component over a
time interval.
Average number of exchanges
(I/Os) per second received from
storage subsystems by a
particular component over a
time interval.
Average number of exchanges
(I/Os) per second transmitted
between storage subsystems
and a particular component over
a time interval.
Average number of exchanges
(I/Os) per second sent to other
nodes in the local SVC cluster
by a particular component over
a time interval.
Average number of exchanges
(I/Os) per second received from
other nodes in the local SVC
cluster by a particular
component over a time interval.
Average number of exchanges
(I/Os) per second transmitted
between other nodes in the local
SVC cluster and a particular
component over a time interval.
Average number of exchanges
(I/Os) per second sent to nodes
in the remote SVC cluster by a
particular component over a
time interval.
Average number of exchanges
(I/Os) per second received from
nodes in the remote SVC cluster
by a particular component over
a time interval.
Average number of exchanges
(I/Os) per second transmitted
between nodes in the remote
SVC cluster and a particular
component over a time interval.
Port FCP Send I/O
Rate
(4.1.1)
979
ESS/DS6K/DS8K Port
Port FCP Receive
I/O Rate
(4.1.1)
980
ESS/DS6K/DS8K Port
Total Port FCP I/O
Rate
(4.1.1)
981
ESS/DS6K/DS8K Port
Port FICON Send
I/O Rate
(4.1.1)
954
ESS/DS6K/DS8K Port
Port FICON
Receive I/O Rate
(4.1.1)
955
ESS/DS6K/DS8K Port
Total Port FICON
I/O Rate
(4.1.1)
956
ESS/DS6K/DS8K Port
Port PPRC Send
I/O Rate
(4.1.1)
957
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Port PPRC
Receive I/O Rate
(4.1.1)
958
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Total Port PPRC
I/O Rate
(4.1.1)
959
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
858
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Port
Switch Port
Switch
XIV Port
Data Rates
Port Send Data
Rate
Average number of send
operations per second using the
FCP protocol, for a particular
port over a time interval.
Average number of receive
operations per second using the
FCP protocol, for a particular
port over a time interval.
Average number of send and
receive operations per second
using the FCP protocol, for a
particular port over a time
interval.
Average number of send
operations per second using the
FICON protocol, for a particular
port over a time interval.
Average number of receive
operations per second using the
FICON protocol, for a particular
port over a time interval.
Average number of send and
receive operations per second
using the FICON protocol, for a
particular port over a time
interval.
Average number of send
operations per second for Peer-to-Peer Remote Copy usage, for
a particular port over a time
interval.
Average number of receive
operations per second for Peer-to-Peer Remote Copy usage, for
a particular port over a time
interval.
Average number of send and
receive operations per second
for Peer-to-Peer Remote Copy
usage for a particular port over
a time interval.
Average number of megabytes
(2^20 bytes) per second that
were transferred for send (read)
operations, for a particular port
over a time interval.
Note: ESS/DS6K/DS8K
Subsystem and SVC Port,
Node, I/O Group, and
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port support requires v3.3; XIV
Port supported in v4.2.1.163.
Port Receive Data
Rate
Total Port Data
Rate
859
860
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Port
Switch Port
Switch
XIV Port
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SMI-S BSP Port
Switch Port
Switch
XIV Port
Port Peak Send
Data Rate
861
Switch Port
Switch
Port Peak Receive
Data Rate
862
Switch Port
Switch
Port to Host Send
Data Rate
(3.1.3)
913
Port to Host
Receive Data Rate
(3.1.3)
914
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to Host
Data Rate
(3.1.3)
915
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Disk Send
Data Rate
(3.1.3)
916
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Average number of megabytes
(2^20 bytes) per second that
were transferred for receive
(write) operations, for a
particular port over a time
interval.
Note: ESS/DS6K/DS8K
Subsystem and SVC Port,
Node, I/O Group, and
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port support requires v3.3; XIV
Port supported in v4.2.1.163.
Average number of megabytes
(2^20 bytes) per second that
were transferred for send and
receive operations, for a
particular port over a time
interval.
Note: ESS/DS6K/DS8K
Subsystem and SVC Port,
Node, I/O Group, and
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port support requires v3.3; XIV
Port supported in v4.2.1.163.
Peak number of megabytes
(2^20 bytes) per second that
were sent by a particular port
over a time interval.
Peak number of megabytes
(2^20 bytes) per second that
were received by a particular
port over a time interval.
Average number of megabytes
(2^20 bytes) per second sent to
host computers by a particular
component over a time interval.
Average number of megabytes
(2^20 bytes) per second
received from host computers
by a particular component over
a time interval.
Average number of megabytes
(2^20 bytes) per second
transmitted between host
computers and a particular
component over a time interval.
Average number of megabytes
(2^20 bytes) per second sent to
storage subsystems by a
particular component over a
time interval.
Port to Disk
Receive Data Rate
(3.1.3)
917
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to Disk
Data Rate
(3.1.3)
918
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Local Node
Send Data Rate
(3.1.3)
919
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Local Node
Receive Data Rate
(3.1.3)
920
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to Local
Node Data Rate
(3.1.3)
921
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Remote
Node Send Data
Rate
(3.1.3)
922
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Remote
Node Receive Data
Rate
(3.1.3)
923
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to
Remote Node Data
Rate
(3.1.3)
924
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port FCP Send
Data Rate
(4.1.1)
982
ESS/DS6K/DS8K Port
Port FCP Receive
Data Rate
(4.1.1)
983
ESS/DS6K/DS8K Port
Total Port FCP
Data Rate
984
ESS/DS6K/DS8K Port
Average number of megabytes
(2^20 bytes) per second
received from storage
subsystems by a particular
component over a time interval.
Average number of megabytes
(2^20 bytes) per second
transmitted between storage
subsystems and a particular
component over a time interval.
Average number of megabytes
(2^20 bytes) per second sent to
other nodes in the local SVC
cluster by a particular
component over a time interval.
Average number of megabytes
(2^20 bytes) per second
received from other nodes in the
local SVC cluster by a particular
component over a time interval.
Average number of megabytes
(2^20 bytes) per second
transmitted between other
nodes in the local SVC cluster
and a particular component over
a time interval.
Average number of megabytes
(2^20 bytes) per second sent to
nodes in the remote SVC cluster
by a particular component over
a time interval.
Average number of megabytes
(2^20 bytes) per second
received from nodes in the
remote SVC cluster by a
particular component over a
time interval.
Average number of megabytes
(2^20 bytes) per second
transmitted between nodes in
the remote SVC cluster and a
particular component over a
time interval.
Average number of megabytes
(2^20 bytes) per second sent
using the FCP protocol, for a
particular port over a time
interval.
Average number of megabytes
(2^20 bytes) per second
received using the FCP
protocol, for a particular port
over a time interval.
Average number of megabytes
(2^20 bytes) per second sent or
(4.1.1)
Port FICON Send
Data Rate
(4.1.1)
960
ESS/DS6K/DS8K Port
Port FICON
Receive Data Rate
(4.1.1)
961
ESS/DS6K/DS8K Port
Total Port FICON
Data Rate
(4.1.1)
962
ESS/DS6K/DS8K Port
Port PPRC Send
Data Rate
(4.1.1)
963
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Port PPRC
Receive Data Rate
(4.1.1)
964
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Total Port PPRC
Data Rate
(4.1.1)
965
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Response Times
Port Send
863
Response Time
Port Receive
Response Time
864
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
XIV Port
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
XIV Port
received using the FCP
protocol, for a particular port
over a time interval.
Average number of megabytes
(2^20 bytes) per second sent
using the FICON protocol, for a
particular port over a time
interval.
Average number of megabytes
(2^20 bytes) per second
received using the FICON
protocol, for a particular port
over a time interval.
Average number of megabytes
(2^20 bytes) per second sent or
received using the FICON
protocol, for a particular port
over a time interval.
Average number of megabytes
(2^20 bytes) per second sent for
Peer-to-Peer Remote Copy
usage, for a particular port over
a time interval.
Average number of megabytes
(2^20 bytes) per second
received for Peer-to-Peer
Remote Copy usage, for a
particular port over a time
interval.
Average number of megabytes
(2^20 bytes) per second
transferred for Peer-to-Peer
Remote Copy usage, for a
particular port over a time
interval.
Average number of milliseconds
that it took to service each send
(read) operation, for a particular
port over a time interval.
Note: ESS/DS6K/DS8K
Subsystem support requires
v3.1.3 or above; XIV Port
supported in v4.2.1.163.
Average number of milliseconds
that it took to service each
receive (write) operation, for a
particular port over a time
interval.
Note: ESS/DS6K/DS8K
Subsystem support requires
v3.1.3 or above; XIV Port
supported in v4.2.1.163.
Overall Port
Response Time
865
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
XIV Port
Port to Local Node
Send Response
Time
(3.1.3)
925
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Local Node
Receive Response
Time
(3.1.3)
926
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to Local
Node Response
Time
(3.1.3)
927
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Local Node
Send Queued Time
(3.1.3)
928
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Local Node
Receive Queued
Time
(3.1.3)
929
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to Local
Node Queued Time
(3.1.3)
930
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Average number of milliseconds
that it took to service each
operation (send and receive),
for a particular port over a time
interval.
Note: ESS/DS6K/DS8K
Subsystem support requires
v3.1.3 or above; XIV Port
supported in v4.2.1.163.
Average number of milliseconds
it took to service each send
operation to another node in the
local SVC cluster, for a
particular component over a
time interval. For SVC, this is
the external response time of
the transfers.
Average number of milliseconds
it took to service each receive
operation from another node in
the local SVC cluster, for a
particular component over a
time interval. For SVC, this is
the external response time of
the transfers.
Average number of milliseconds
it took to service each send or
receive operation between
another node in the local SVC
cluster and a particular
component over a time interval.
For SVC, this is the external
response time of the transfers.
Average number of milliseconds
that each send operation issued
to another node in the local SVC
cluster spent on the queue
before being issued, for a
particular component over a
time interval.
Average number of milliseconds
that each receive operation from
another node in the local SVC
cluster spent on the queue
before being issued, for a
particular component over a
time interval.
Average number of milliseconds
that each operation issued to
another node in the local SVC
cluster spent on the queue
before being issued, for a
particular component over a
time interval.
Port to Remote
Node Send
Response Time
(3.1.3)
931
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Remote
Node Receive
Response Time
(3.1.3)
932
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to
Remote Node
Response Time
(3.1.3)
933
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Remote
Node Send
Queued Time
(3.1.3)
934
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port to Remote
Node Receive
Queued Time
(3.1.3)
935
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Total Port to
Remote Node
Queued Time
(3.1.3)
936
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Port FCP Send
Response Time
(4.1.1)
985
ESS/DS6K/DS8K Port
Port FCP Receive
Response Time
(4.1.1)
986
ESS/DS6K/DS8K Port
Overall Port FCP
987
ESS/DS6K/DS8K Port
Average number of milliseconds
it took to service each send
operation to a node in the
remote SVC cluster, for a
particular component over a
time interval. For SVC, this is
the external response time of
the transfers.
Average number of milliseconds
it took to service each receive
operation from a node in the
remote SVC cluster, for a
particular component over a
time interval. For SVC, this is
the external response time of
the transfers.
Average number of milliseconds
it took to service each send or
receive operation between a
node in the remote SVC cluster
and a particular component over
a time interval. For SVC, this is
the external response time of
the transfers.
Average number of milliseconds
that each send operation issued
to a node in the remote SVC
cluster spent on the queue
before being issued, for a
particular component over a
time interval.
Average number of milliseconds
that each receive operation from
a node in the remote SVC
cluster spent on the queue
before being issued, for a
particular component over a
time interval.
Average number of milliseconds
that each operation issued to a
node in the remote SVC cluster
spent on the queue before
being issued, for a particular
component over a time interval.
Average number of milliseconds
it took to service all send
operations using the FCP
protocol, for a particular port
over a time interval.
Average number of milliseconds
it took to service all receive
operations using the FCP
protocol, for a particular port
over a time interval.
Average number of milliseconds
Response Time
(4.1.1)
Port FICON Send
Response Time
(4.1.1)
966
ESS/DS6K/DS8K Port
Port FICON
Receive Response
Time
(4.1.1)
967
ESS/DS6K/DS8K Port
Overall Port FICON
Response Time
(4.1.1)
968
ESS/DS6K/DS8K Port
Port PPRC Send
Response Time
(4.1.1)
969
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Port PPRC
Receive Response
Time
(4.1.1)
970
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Overall Port PPRC
Response Time
(4.1.1)
971
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
Transfer Sizes
Port Send Transfer
866
Size
Port Receive
Transfer Size
Overall Port
867
868
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SMI-S BSP Port
ESS/DS6K/DS8K Port
ESS/DS6K/DS8K
Subsystem
SMI-S BSP Port
ESS/DS6K/DS8K Port
it took to service all I/O
operations using the FCP
protocol, for a particular port
over a time interval.
Average number of milliseconds
it took to service all send
operations using the FICON
protocol, for a particular port
over a time interval.
Average number of milliseconds
it took to service all receive
operations using the FICON
protocol, for a particular port
over a time interval.
Average number of milliseconds
it took to service all I/O
operations using the FICON
protocol, for a particular port
over a time interval.
Average number of milliseconds
it took to service all send
operations for Peer-to-Peer
Remote Copy usage, for a
particular port over a time
interval.
Average number of milliseconds
it took to service all receive
operations for Peer-to-Peer
Remote Copy usage, for a
particular port over a time
interval.
Average number of milliseconds
it took to service all I/O
operations for Peer-to-Peer
Remote Copy usage, for a
particular port over a time
interval.
Average number of KB sent per
I/O by a particular port over a
time interval.
Note: ESS/DS6K/DS8K
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port requires v3.3.
Average number of KB received
per I/O by a particular port over
a time interval.
Note: ESS/DS6K/DS8K
Subsystem support requires
v3.1.3 or above; SMI-S BSP
Port requires v3.3.
Average number of KB
Transfer Size
ESS/DS6K/DS8K
Subsystem
SMI-S BSP Port
Port Send Packet
Size
869
Switch Port
Switch
Port Receive
Packet Size
870
Switch Port
Switch
Overall Port Packet
Size
871
Switch Port
Switch
Special Computed Values
Port Send
972
ESS/DS6K/DS8K Port
Utilization
Percentage
(4.1.1)
Port Receive
973
ESS/DS6K/DS8K Port
Utilization
Percentage
(4.1.1)
Overall Port
974
ESS/DS6K/DS8K Port
Utilization
Percentage
(4.1.1)
Port Send
975
ESS/DS8K Port
Bandwidth
SVC Port
Percentage
Switch Port
(4.1.1)
XIV Port
Port Receive
Bandwidth
Percentage
(4.1.1)
Overall Port
Bandwidth
Percentage
(4.1.1)
976
977
ESS/DS8K Port
SVC Port
Switch Port
XIV Port
ESS/DS8K Port
SVC Port
Switch Port
XIV Port
transferred per I/O by a
particular port over a time
interval.
Note: ESS/DS6K/DS8K
Subsystem support requires
v3.1.3 or above.
Average number of KB sent per
packet by a particular port over
a time interval.
Average number of KB received
per packet by a particular port
over a time interval.
Average number of KB
transferred per packet by a
particular port over a time
interval.
Average amount of time that the
port was busy sending data,
over a particular time interval.
Average amount of time that the
port was busy receiving data,
over a particular time interval.
Average amount of time that the
port was busy sending or
receiving data, over a particular
time interval.
The approximate bandwidth
utilization percentage for send
operations by this port, over a
particular time interval, based
on its current negotiated speed.
Note: XIV support available in
v4.2.1.163.
The approximate bandwidth
utilization percentage for receive
operations by this port, over a
particular time interval, based
on its current negotiated speed.
Note: XIV support available in
v4.2.1.163.
The approximate bandwidth
utilization percentage for send
and receive operations by this
port, over a particular time
interval.
Note: XIV support available in
v4.2.1.163.
Error Rates
Error Frame Rate
872
DS8K Port
DS8K Subsystem
Switch Port
Switch
Dumped Frame
Rate
873
Switch Port
Switch
Link Failure Rate
874
DS8K Port
DS8K Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Switch Port
Switch
DS8K Port
DS8K Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Switch Port
Switch
DS8K Port
DS8K Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Switch Port
Switch
DS8K Port
DS8K Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Switch Port
Switch
Loss of Sync Rate
Loss of Signal Rate
CRC Error Rate
875
876
877
Short Frame Rate
878
Switch Port
Switch
Long Frame Rate
879
Switch Port
Switch
The number of frames per
second that were received in
error by a particular port over a
time interval.
Note: DS8K support requires
TPC v4.2.1.
The number of frames per
second that were lost due to a
lack of available host buffers, for
a particular port over a time
interval.
The number of link errors per
second that were experienced
by a particular port over a time
interval.
Note: DS8K and SVC/Storwize
support requires TPC v4.2.1.
The average number of times
per second that synchronization
was lost, for a particular
component over a particular
time interval.
Note: DS8K and SVC/Storwize
support requires TPC v4.2.1.
The average number of times
per second that the signal was
lost, for a particular component
over a particular time interval.
Note: DS8K and SVC/Storwize
support requires TPC v4.2.1.
The average number of frames
received per second in which
the CRC in the frame did not
match the CRC computed by
the receiver, for a particular
component over a particular
time interval.
Note: DS8K and SVC/Storwize
support requires TPC v4.2.1.
The average number of frames
received per second that were
shorter than 28 octets (24
header + 4 CRC) not including
any SOF/EOF bytes, for a
particular component over a
particular time interval.
The average number of frames
received per second that were
Encoding Disparity
Error Rate
880
Switch Port
Switch
Discarded Class3
Frame Rate
881
Switch Port
Switch
F-BSY Frame Rate
882
Switch Port
Switch
F-RJT Frame Rate
883
Switch Port
Switch
Primitive Sequence
Protocol Error Rate
988
DS8K Port
DS8K Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Switch Port
Switch
DS8K Port
DS8K Subsystem
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Switch Port
Switch
longer than 2140 octets (24
header + 4 CRC + 2112 data)
not including any SOF/EOF
bytes, for a particular
component over a particular
time interval.
The average number of disparity
errors received per second, for
a particular component over a
particular time interval.
The average number of class-3
frames per second that were
discarded by a particular
component over a particular
time interval.
The average number of F-BSY
frames per second that were
generated by a particular
component over a particular
time interval.
The average number of F-RJT
frames per second that were
generated by a particular
component over a particular
time interval.
The average number of primitive
sequence protocol errors
detected for a particular
component over a particular
time interval.
Note: Added in TPC v4.2.1.
Invalid
Transmission Word
Rate
989
The average number of
transmission words per second
that had an 8b10b code violation
in one or more of its characters;
had a K28.5 in its second, third,
or fourth character positions;
and/or was an ordered set that
had an incorrect Beginning
Running Disparity.
Zero Buffer-Buffer
Credit Timer
990
SVC/Storwize Port
SVC/Storwize Node
SVC/Storwize I/O Group
SVC/Storwize Subsystem
Note: Added in TPC v4.2.1.
The number of microseconds for
which the port has been unable
to send frames due to lack of
buffer credit since the last node
reset.
Link Reset
Transmitted Rate
991
DS8K Port
DS8K Subsystem
Switch Port
Switch
Note: Added in TPC v4.2.1.
The average number of times
per second a port has
transitioned from an active (AC)
state to a Link Recovery (LR1)
state over a particular time
interval.
Link Reset
Received Rate
992
DS8K Port
DS8K Subsystem
Switch Port
Switch
Out of Order Data
Rate
993
DS8K Port
DS8K Subsystem
Out of Order ACK
Rate
994
DS8K Port
DS8K Subsystem
Duplicate Frame
Rate
995
DS8K Port
DS8K Subsystem
Invalid Relative
Offset Rate
996
DS8K Port
DS8K Subsystem
Sequence Timeout
Rate
997
DS8K Port
DS8K Subsystem
Note: Added in TPC v4.2.1.
The average number of times
per second a port has
transitioned from an active (AC)
state to a Link Recovery (LR2)
state over a particular time
interval.
Note: Added in TPC v4.2.1.
The average number of times
per second that an out of order
frame was detected for a
particular port over a particular
time interval.
Note: Added in TPC v4.2.1.
The average number of times
per second that an out of order
ACK frame was detected for a
particular port over a particular
time interval.
Note: Added in TPC v4.2.1.
The average number of times
per second that a received frame
was detected as having been
previously processed (a duplicate),
for a particular port over a
particular time interval.
Note: Added in TPC v4.2.1.
The average number of times
per second that a frame was
received with a bad relative
offset in the frame header, for a
particular port over a particular
time interval.
Note: Added in TPC v4.2.1.
The average number of times
per second that a timeout
condition was detected while
waiting to receive sequence
initiative for a Fibre Channel
exchange, for a particular port
over a particular time interval.
Note: Added in TPC v4.2.1.
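
Several of the port metrics in this appendix are simple derivations of the raw rate counters: a transfer size is a data rate divided by an I/O rate, and a bandwidth percentage compares a measured data rate against the port's negotiated link speed. The following Python sketch is illustrative only; the function and variable names are hypothetical, and the conversion factor of roughly 100 MB/s of payload per 1 Gbit/s of Fibre Channel link speed is an approximation, not the exact computation performed by Tivoli Storage Productivity Center.

# Illustrative sketch only; not TPC's internal implementation.
def transfer_size_kb(data_rate_mbps, io_rate):
    """Approximate KB per I/O from a data rate (MB/s, 2^20 bytes) and an I/O rate (ops/s)."""
    return 0.0 if io_rate == 0 else (data_rate_mbps * 1024.0) / io_rate

def port_bandwidth_pct(data_rate_mbps, negotiated_speed_gbps):
    """Approximate bandwidth utilization percentage, assuming roughly 100 MB/s of
    usable payload per 1 Gbit/s of negotiated Fibre Channel link speed."""
    usable_mbps = negotiated_speed_gbps * 100.0
    return min(100.0, (data_rate_mbps / usable_mbps) * 100.0)

# Example: a port moving 300 MB/s on a 4 Gbit/s link is roughly 75% utilized.
print(port_bandwidth_pct(300.0, 4.0))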
Appendix B Available Thresholds
This table lists the threshold name, the types of components for which each threshold is available,
and a description. The SMI-S BSP device type mentioned in the table below refers to any storage
subsystem that is managed via a CIMOM which supports SMI-S 1.1 with Block Server
Performance (BSP) subprofile. Thresholds that require specific versions of IBM Tivoli Storage Productivity Center are noted in
parentheses.
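
The threshold descriptions below refer to boundary values (typically four numbers, such as 80, 50, -1, -1) and to optional filters that suppress boundary violations when a related activity metric is below a specified filter value. As a rough, non-authoritative sketch (the ordering of the four boundaries as critical stress, warning stress, warning idle, and critical idle, and the use of -1 to disable a boundary, are assumptions about the usual TPC convention rather than a statement of the implementation), a single check per collection interval might look like the following Python fragment; all names in it are hypothetical.

# Hypothetical sketch of a per-interval threshold check; not TPC code.
def check_threshold(value, boundaries, filter_metric=None, filter_value=None):
    """boundaries = (critical_stress, warning_stress, warning_idle, critical_idle);
    a boundary of -1 is treated as disabled."""
    critical_stress, warning_stress, warning_idle, critical_idle = boundaries
    if filter_value is not None and filter_metric is not None and filter_metric < filter_value:
        return None  # violation suppressed by the filter
    if critical_stress != -1 and value >= critical_stress:
        return "critical (stress)"
    if warning_stress != -1 and value >= warning_stress:
        return "warning (stress)"
    if critical_idle != -1 and value <= critical_idle:
        return "critical (idle)"
    if warning_idle != -1 and value <= warning_idle:
        return "warning (idle)"
    return None

# Example: Disk Utilization of 85% against boundaries 80, 50, -1, -1 is a stress
# violation, but it is suppressed here because the Sequential I/O Percentage (60)
# is below the filter value of 80.
print(check_threshold(85.0, (80, 50, -1, -1), filter_metric=60, filter_value=80))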
Threshold
(Metric)
Type
Device/Component
Type
Description
Array Thresholds
Disk Utilization
Percentage
850
ESS/DS6K/DS8K Array
837
ESS/DS6K/DS8K Array
SVC/Storwize MDisk
SVC/Storwize MDisk
Group
Total Backend
Data Rate
(4.1.1)
840
ESS/DS6K/DS8K Array
SVC/Storwize MDisk
SVC/Storwize MDisk
Group
Backend Read
Response
Time
(4.1.1)
841
ESS/DS6K/DS8K Array
SVC/Storwize MDisk
Total Backend
I/O Rate
(4.1.1)
Sets thresholds on the approximate
utilization percentage of the arrays in a
particular subsystem, i.e. the average
percent of time that the disks associated
with the array were busy. The Disk
Utilization metric for each array is
checked against the threshold boundaries
for each collection interval. This threshold
is enabled by default for ESS
subsystems, and disabled by default for
others. The default threshold boundaries
are 80%, 50%, -1, -1. In addition, a filter
is available for this threshold which will
ignore any boundary violations if the
Sequential I/O Percentage is less than a
specified filter value. The pre-populated
filter value is 80.
Sets thresholds on the average number of
I/O operations per second for array and
MDisk read and write operations. The
Total I/O Rate metric for each array,
MDisk, or MDisk Group is checked
against the threshold boundaries for each
collection interval. This threshold is
disabled by default.
Sets thresholds on the average number of
megabytes (2^20 bytes) per second that
were transferred for array or MDisk read
and write operations. The Total Data
Rate metric for each array, MDisk, or
MDisk Group is checked against the
threshold boundaries for each collection
interval. This threshold is disabled by
default.
Sets thresholds on the average number of
milliseconds that it took to service each
array and MDisk read operation. The
Backend Read Response Time metric for
Backend Write
Response
Time
(4.1.1)
842
ESS/DS6K/DS8K Array
SVC/Storwize MDisk
Overall
Backend
Response
Time
843
SVC/Storwize MDisk
Backend Read
Queue Time
(4.1.1)
844
SVC/Storwize MDisk
each array or MDisk is checked against
the threshold boundaries for each
collection interval. Though this threshold
is disabled by default, suggested
boundary values of 35,25,-1,-1 are pre-populated. In addition, a filter is available
for this threshold which will ignore any
boundary violations if the Backend Read
I/O Rate is less than a specified filter
value. The pre-populated filter value is 5.
Sets thresholds on the average number of
milliseconds that it took to service each
array and MDisk write operation. The
Backend Write Response Time metric for
each array or MDisk is checked against
the threshold boundaries for each
collection interval. Though this threshold
is disabled by default, suggested
boundary values of 120,80,-1,-1 are pre-populated. In addition, a filter is available
for this threshold which will ignore any
boundary violations if the Backend Write
I/O Rate is less than a specified filter
value. The pre-populated filter value is 5.
Sets thresholds on the average number of
milliseconds that it took to service each
MDisk I/O operation, measured at the
MDisk level. The Total Response Time
(external) metric for each MDisk is
checked against the threshold boundaries
for each collection interval. This threshold
is disabled by default. In addition, a filter
is available for this threshold which will
ignore any boundary violations if the Total
Backend I/O Rate is less than a specified
filter value. The pre-populated filter value
is 10.
Sets thresholds on the average number of
milliseconds that each read operation
spent on the queue before being issued to
the backend device. The Backend Read
Queue Time metric for each MDisk is
checked against the threshold boundaries
for each collection interval. Though this
threshold is disabled by default,
suggested boundary values of 5,3,-1,-1
are pre-populated. In addition, a filter is
available for this threshold which will
ignore any boundary violations if the
Backend Read I/O Rate is less than a
specified filter value. The pre-populated
filter value is 5. Violation of these
Backend Write
Queue Time
(4.1.1)
845
SVC/Storwize MDisk
Peak Backend
Write
Response
Time
(4.1.1)
951
SVC/Storwize Node
threshold boundaries means that the SVC
deems the MDisk to be overloaded. A
queuing algorithm determines the number
of concurrent I/O operations that the SVC
will send to a given MDisk; if there is any
queuing (other than, for example, during a
backup process), this suggests that
performance can be improved by
resolving the queuing issue.
Sets thresholds on the average number of
milliseconds that each write operation
spent on the queue before being issued to
the backend device. The Backend Write
Queue Time metric for each MDisk is
checked against the threshold boundaries
for each collection interval. Though this
threshold is disabled by default,
suggested boundary values of 5,3,-1,-1
are pre-populated. In addition, a filter is
available for this threshold which will
ignore any boundary violations if the
Backend Write I/O Rate is less than a
specified filter value. The pre-populated
filter value is 5. Violation of these
threshold boundaries means that the SVC
deems the MDisk to be overloaded. A
queuing algorithm determines the number
of concurrent I/O operations that the SVC
will send to a given MDisk; if there is any
queuing (other than, for example, during a
backup process), this suggests that
performance can be improved by
resolving the queuing issue.
Sets thresholds on the peak (worst)
response time among all MDisk write
operations by a node. The Backend Peak
Write Response Time metric for each
Node is checked against the threshold
boundaries for each collection interval.
This threshold is enabled by default, with
default boundary values of 30000,10000,-1,-1.
Violation of these threshold boundaries
means that the SVC cache is “partition
limiting” the given MDisk group – that is,
destage data from the SVC cache for this
MDisk group is causing the cache to fill up
(writes are being received faster than they
can be destaged to disk). If delays reach 30
seconds or more, then the SVC will switch
into “short term mode” where writes are
no longer cached for the MDisk Group.
Controller Thresholds
Total I/O Rate
(overall)
809
SMI-S BSP Subsystem
ESS/DS6K/DS8K
Controller
SVC/Storwize I/O Group
XIV Subsystem
SMI-S BSP Subsystem
ESS/DS6K/DS8K
Controller
SVC/Storwize I/O Group
XIV Subsystem
Total Data
Rate
821
Write-cache
Delay
Percentage
832
ESS/DS6K/DS8K
Controller
SVC/Storwize Node
Cache Holding
Time
834
ESS/DS6K/DS8K
Controller
CPU
Utilization
(3.1.3)
900
SVC/Storwize Node
Non-Preferred Node Usage Percentage
(4.1.1)
949
SVC/Storwize I/O Group
Sets thresholds on the average number of
I/O operations per second for read and
write operations, for the subsystem
controllers (clusters) or I/O Groups. The
Total I/O Rate metric for each controller or
I/O Group is checked against the
threshold boundaries for each collection
interval. These thresholds are disabled
by default.
Sets thresholds on the average number of
MB per second for read and write
operations, for the subsystem controllers
(clusters) or I/O Groups. The Total Data
Rate metric for each controller or I/O
Group is checked against the threshold
boundaries for each collection interval.
These thresholds are disabled by default.
Sets thresholds on the percentage of I/O
operations that were delayed due to write-cache space constraints. The Write-cache Full Percentage metric for each
controller or node is checked against the
threshold boundaries for each collection
interval. This threshold is enabled by
default, with default boundaries of 10, 3, -1, -1. In addition, a filter is available for
this threshold which will ignore any
boundary violations if the Write-cache
Delay I/O Rate is less than a specified
filter value. The pre-populated filter value
is 10 I/Os per sec.
Sets thresholds on the average cache
holding time, in seconds, for I/O data in
the subsystem controllers (clusters).
Shorter time periods indicate adverse
performance. The Cache Holding Time
metric for each controller is checked
against the threshold boundaries for each
collection interval. This threshold is
enabled by default, with default
boundaries of 30, 60, -1, -1.
Sets thresholds on the average utilization
percentage of the CPU(s) in the SVC
nodes. The CPU Utilization metric for
each node is checked against the
threshold boundaries for each collection
interval. This threshold is enabled by
default, with default boundaries of 90,75,-1,-1.
Sets thresholds on the Non-Preferred
Node Usage Percentage of an I/O Group.
This metric of each I/O Group is checked
against the threshold boundaries at each
collection interval. This threshold is
disabled by default. In addition, a filter is
available for this threshold which will
ignore any boundary violations if the Total
I/O Rate of the I/O Group is less than a
specified filter value.
Port Thresholds
Total Port I/O
Rate
Total Port
Packet Rate
Total Port
Data Rate
854
ESS/DS6K/DS8K Port
SVC/Storwize Port (3.1.3)
SMI-S BSP Port
XIV Port
857
Switch Port
860
ESS/DS6K/DS8K Port
SVC/Storwize Port (3.1.3)
SMI-S BSP Port
Switch Port
XIV Port
Overall Port
Response
Time
865
ESS/DS6K/DS8K Port
Port to Local
Node Send
Response
Time
(4.1.1)
925
SVC/Storwize Node
Sets thresholds on the average number of
I/O operations per second for send and
receive operations, for the ports. The
Total I/O Rate metric for each port is
checked against the threshold boundaries
for each collection interval. This threshold
is disabled by default.
Note: XIV support available in v4.2.1.163.
Sets thresholds on the average number of
packets per second for send and receive
operations, for the ports. The Total I/O
Rate metric for each port is checked
against the threshold boundaries for each
collection interval. This threshold is
disabled by default.
Sets thresholds on the average number of
MB per second for send and receive
operations, for the ports. The Total Data
Rate metric for each port is checked
against the threshold boundaries for each
collection interval. This threshold is
disabled by default.
Note: XIV support available in v4.2.1.163.
Sets thresholds on the average number of
milliseconds that it took to service each
I/O operation (send and receive) for ports.
The Total Response Time metric for each
port is checked against the threshold
boundaries for each collection interval.
This threshold is disabled by default.
Sets thresholds on the average number of
milliseconds it took to service each send
operation to another node in the local
SVC cluster. The Port to Local Node
Send Response Time metric for each
Node is checked against the threshold
boundaries for each collection interval.
This threshold is enabled by default, with
default boundary values of 3,1.5,-1,-1.
Violation of these threshold boundaries
means that it is taking too long to send
Port to Local
Node Receive
Response
Time
(4.1.1)
926
SVC/Storwize Node
Port to Local
Node Send
Queue Time
(4.1.1)
928
SVC/Storwize Node
Port to Local
Node Receive
Queue Time
(4.1.1)
929
SVC/Storwize Node
Port Send
Utilization
Percentage
(4.1.1)
972
ESS/DS6K/DS8K Port
data between nodes (on the fabric), and
suggests either congestion around these
FC ports, or an internal SVC microcode
problem.
Sets thresholds on the average number of
milliseconds it took to service each
receive operation from another node in
the local SVC cluster. The Port to Local
Node Receive Response Time metric for
each Node is checked against the
threshold boundaries for each collection
interval. This threshold is enabled by
default, with default boundary values of
1,0.5,-1,-1. Violation of these threshold
boundaries means that it is taking too
long to send data between nodes (on the
fabric), and suggests either congestion
around these FC ports, or an internal SVC
microcode problem.
Sets thresholds on the average number of
milliseconds that each send operation
issued to another node in the local SVC
cluster spent on the queue before being
issued. The Port to Local Node Send
Queued Time metric for each node is
checked against the threshold boundaries
for each collection interval. This threshold
is enabled by default, with default
boundary values of 2,1,-1,-1. Violation of
these threshold boundaries means that
the node has to wait too long to send data
to other nodes (on the fabric), and
suggests congestion on the fabric.
Sets thresholds on the average number of
milliseconds that each receive operation
from another node in the local SVC
cluster spent on the queue before being
issued. The Port to Local Node Receive
Queued Time metric for each node is
checked against the threshold boundaries
for each collection interval. This threshold
is enabled by default, with default
boundary values of 1,0.5,-1,-1. Violation
of these threshold boundaries means that
the node has to wait too long to receive
data from other nodes (on the fabric), and
suggests congestion on the fabric.
Sets thresholds on the average amount of
time that ports are busy sending data.
The Port Send Utilization Percentage metric
for each port is checked against the
threshold boundaries for each collection
Port Receive
Utilization
Percentage
(4.1.1)
Port Send
Bandwidth
Percentage
(4.1.1)
Port Receive
Bandwidth
Percentage
(4.1.1)
Error Frame
Rate
Link Failure
Rate
CRC Error
Rate
973
ESS/DS6K/DS8K Port
975
ESS/DS8K Port
SVC/Storwize Port
Switch Port
XIV Port
976
872
874
877
ESS/DS8K Port
SVC/Storwize Port
Switch Port
XIV Port
DS8K Port
Switch Port
DS8K Port
SVC/Storwize Port
Switch Port
DS8K Port
SVC/Storwize Port
Switch Port
interval. This threshold is disabled by
default.
Sets thresholds on the average amount of
time that ports are busy receiving data.
The Port Receive Utilization Percentage metric
for each port is checked against the
threshold boundaries for each collection
interval. This threshold is disabled by
default.
Sets thresholds on the average port
bandwidth utilization percentage for send
operations. The Port Send Bandwidth
Percentage metric is checked against the
threshold boundaries for each collection
interval. This threshold is enabled by
default, with default boundaries 85,75,-1,-1.
Note: XIV support available in v4.2.1.163.
Sets thresholds on the average port
bandwidth utilization percentage for
receive operations. The Port Receive
Bandwidth Percentage metric is checked
against the threshold boundaries for each
collection interval. This threshold is
enabled by default, with default
boundaries 85,75,-1,-1.
Note: XIV support available in v4.2.1.163.
Sets thresholds on the average number of
frames per second received in error for
the switch ports. The Error Frame Rate
metric for each port is checked against
the threshold boundary for each collection
interval. This threshold is disabled by
default.
Note: DS8K support requires TPC v4.2.1.
Sets thresholds on the average number of
link errors per second experienced by the
switch ports. The Link Failure Rate metric
for each port is checked against the
threshold boundary for each collection
interval. This threshold is disabled by
default.
Note: DS8K and SVC/Storwize support
requires TPC v4.2.1.
Sets thresholds on the average number of
frames received per second in which the
CRC in the frame does not match the CRC computed
by the receiver. The CRC Error Rate
metric for each port is checked against
the threshold boundary for each collection
interval. This threshold is disabled by
default.
Invalid
Transmission
Word Rate
989
DS8K Port
SVC/Storwize Port
Switch Port
Note: Added in TPC v4.2.1.
Sets thresholds on the average number of
bit errors detected on a port. The Invalid
Transmission Word Rate metric for each
port is checked against the threshold
boundary for each collection interval. This
threshold is disabled by default.
Note: Added in TPC v4.2.1.
Zero BufferBuffer Credit
Timer
990
SVC/Storwize Port
Sets thresholds on the number of
microseconds for which the port has been
unable to send frames due to lack of
buffer credit since the last node reset. The
Zero Buffer-Buffer Credit Timer metric for
each port is checked against the
threshold boundary for each collection
interval. This threshold is disabled by
default.
Note: Added in TPC v4.2.1.
Appendix C DS3000, DS4000 and DS5000 Metrics
This table lists the metrics supported by DS3000, DS4000 and DS5000 subsystems, a
description of the metric, and the reports that will include the metrics.
Older DS3000, DS4000, and DS5000 subsystems managed by Engenio providers, e.g.
10.50.G0.04, only support a subset of the following metrics in their reports. Later levels of
DS3000, DS4000, and DS5000 subsystems managed by LSI SMI-S Provider 1.3 and above, e.g.
10.06.GG.33, support more metrics – these are denoted by an asterisk. For more information
regarding supported DS3000, DS4000, and DS5000 subsystems and their related providers,
please see:
http://www-01.ibm.com/support/docview.wss?rs=40&context=SSBSEX&q1=subsystem&uid=swg21384734&loc=en_US&cs=utf-8&lang=en
Metric
Description
Read I/O Rate
(overall)
Average number of I/O
operations per second for both
sequential and non-sequential
read operations, for a particular
component over a time interval.
Write I/O Rate
(overall)
Average number of I/O
operations per second for both
sequential and non-sequential
write operations, for a particular
component over a time interval.
Total I/O Rate
(overall)
Average number of I/O
operations per second for both
sequential and non-sequential
read and write operations, for a
particular component over a
time interval.
Read Cache Hits
(overall)
* Write Cache
Hits (overall)
Percentage of cache hits for
both sequential and non-sequential read operations, for
a particular component over a
time interval.
Percentage of cache hits for
both sequential and non-sequential write operations, for
Report Type
By Volume
By Controller*
By Subsystem
Controller Cache Performance*
Controller Performance*
Subsystem Performance
Top Active Volumes Cache Hit Performance
Top Volumes Data Rate Performance
Top Volumes I/O Rate Performance
By Volume
By Controller*
By Subsystem
Controller Cache Performance*
Controller Performance*
Subsystem Performance
Top Active Volumes Cache Hit Performance
Top Volumes Data Rate Performance
Top Volumes I/O Rate Performance
By Volume
By Controller*
By Subsystem
Controller Cache Performance*
Controller Performance*
Subsystem Performance
Top Active Volumes Cache Hit Performance
Top Volumes Data Rate Performance
Top Volumes I/O Rate Performance
By Volume
By Controller*
By Subsystem
Controller Cache Performance*
Top Active Volumes Cache Hit Performance
By Volume*
By Controller*
By Subsystem*
* Total Cache
Hits (overall)
a particular component over a
time interval.
Percentage of cache hits for
both sequential and non-sequential read and write
operations, for a particular
component over a time interval.
Controller Cache Performance*
Top Active Volumes Cache Hit Performance*
By Volume*
By Controller*
By Subsystem*
Controller Cache Performance*
Top Active Volumes Cache Hit Performance*
By Volume
By Controller*
By Subsystem
Controller Performance*
Subsystem Performance
Top Active Volumes Cache Hit Performance
Top Volumes Data Rate Performance
Top Volumes I/O Rate Performance
By Volume
By Controller*
By Subsystem
Controller Performance*
Subsystem Performance
Top Active Volumes Cache Hit Performance
Top Volumes Data Rate Performance
Top Volumes I/O Rate Performance
By Volume
By Controller*
By Subsystem
Controller Performance*
Subsystem Performance
Top Active Volumes Cache Hit Performance
Top Volumes Data Rate Performance
Top Volumes I/O Rate Performance
Read Data Rate
Average number of megabytes
(2^20 bytes) per second that
were transferred for read
operations, for a particular
component over a time interval.
Write Data Rate
Average number of megabytes
(2^20 bytes) per second that
were transferred for write
operations, for a particular
component over a time interval.
Total Data Rate
Average number of megabytes
(2^20 bytes) per second that
were transferred for read and
write operations, for a particular
component over a time interval.
Read Transfer
Size
Write Transfer
Size
Overall Transfer
Size
* Port Send I/O
Rate
* Port Receive
I/O Rate
* Total Port I/O Rate
Average number of KB per I/O
for read operations, for a
particular component over a
time interval.
Average number of KB per I/O
for write operations, for a
particular component over a
time interval.
Average number of KB per I/O
for read and write operations,
for a particular component over
a time interval.
Average number of I/O
operations per second for send
operations, for a particular port
over a time interval.
Average number of I/O
operations per second for
receive operations, for a
particular port over a time
interval.
Average number of I/O
By Volume
By Controller*
By Subsystem
By Volume
By Controller*
By Subsystem
By Volume
By Controller*
By Subsystem
By Port*
Port Performance*
By Port*
Port Performance*
By Port*
operations per second for send
and receive operations, for a
particular port over a time
interval.
Average number of megabytes
(2^20 bytes) per second that
were transferred for send (read)
operations, for a particular port
over a time interval.
Average number of megabytes
(2^20 bytes) per second that
were transferred for receive
(write) operations, for a
particular port over a time
interval.
Average number of megabytes
(2^20 bytes) per second that
were transferred for send and
receive operations, for a
particular port over a time
interval.
Average number of KB sent per
I/O by a particular port over a
time interval.
Average number of KB received
per I/O by a particular port over
a time interval.
Average number of KB
transferred per I/O by a
particular port over a time
interval.
* Port Send Data
Rate
* Port Receive
Data Rate
* Total Port Data
Rate
* Port Send
Transfer Size
* Port Receive
Transfer Size
* Overall Port
Transfer Size
Port Performance*
By Port*
Port Performance*
By Port*
Port Performance*
By Port*
Port Performance*
By Port*
Port Performance*
By Port*
Port Performance*
By Port*
Port Performance*