Monitoring System Performance and Health of i5/OS

IBM Power Systems
Monitoring System Performance and Health of IBM i
Dawn May - dmmay@us.ibm.com
© 2012 IBM Corporation
Goals
▪ Do you need to understand system performance in real time?
▪ Do you need to know who is consuming the system CPU?
▪ Do you want to notify an operator when a partition is not performing as expected?
▪ Do you want to know immediately when an inquiry message is sent or when a critical job ends?
▪ This session will show you how to automate monitoring so you can focus on other aspects of your job.
▪ By the end of this session, you will be able to:
– Create and use Management Central monitors
– Understand monitoring with IBM Systems Director
– Understand Health Indicators with IBM Systems Director Navigator
Agenda

Three Toolsets

Monitoring with Management Central

Monitoring with IBM Systems Director

Performance Health Indicators
Monitoring Interfaces
▪ Three different management tools
– System i Navigator
• Windows client application
▪ “iSeries Navigator”, “Operations Navigator”
• Management Central Monitors
– IBM Systems Director
• Cross-platform systems management solution
• Platform management, HW alerts
• Real-time performance monitoring
– IBM Systems Director Navigator
• Browser-based interface to manage a single IBM i partition
• IBM i Performance Tasks - Performance Data Investigator
• Health Indicators
Notes: Monitoring Interfaces - where to get them?
System i Navigator is included with IBM i at no additional cost. The IBM i function is integrated into the base of the operating
system. The client function is shipped as part of System i Access for Windows.
Management Central is a technology integrated into System i Navigator and is not installed separately. When installing System i
Access for Windows, choose 'Custom Install', expand the System i Navigator option tree, and select the appropriate
components, such as Monitors and Commands.
The general connectivity rule for Management Central functions is that releases N-2 through N+2 are supported. However, for
the best performance and the most available functions, it is strongly recommended that your IBM System i Navigator and your
Central System be at the highest release you have available.
Your endpoint systems can then be at a mix of previous releases.
Also note that the functions available to you are only as current as the client and Central System combination (for example, if
System i Navigator is at 6.1 and the Central System is at 5.4, only 5.4 functions are available). In almost all cases, a function
that is new in a certain release cannot be run against endpoints at older releases.
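The release rules above can be sketched in a few lines of Python. This is only an illustration: the release list and both function names are assumptions made for the example, not part of any IBM interface, and the N-2/N+2 check simply treats releases as positions in an ordered list.

```python
# Hypothetical, ordered list of releases used only for this illustration.
RELEASES = ["5.3", "5.4", "6.1", "7.1"]


def within_supported_range(release_a: str, release_b: str) -> bool:
    """Return True when the two releases are at most two steps apart (N-2 to N+2)."""
    return abs(RELEASES.index(release_a) - RELEASES.index(release_b)) <= 2


def available_function_level(client: str, central_system: str) -> str:
    """Return the older of the two releases; only its functions are available."""
    return min(client, central_system, key=RELEASES.index)


# The example from the notes: a 6.1 client with a 5.4 Central System.
print(available_function_level("6.1", "5.4"))  # 5.4
```

Keeping the client and the Central System at the same, highest release makes the second function return that release, which is the recommendation given above.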
IBM Systems Director Navigator is the Web console available beginning with IBM i 6.1. It is included with IBM i; no installation
is necessary. All you need is a Web browser.
IBM Systems Director is a browser-based management tool. You need to download and install the management server (it
does not run on IBM i) and install the required agents on each endpoint. For monitoring on IBM i, you need to install the
5770-UME licensed program product and 5733-SC1, Option 1 (SSH) on each IBM i partition you want to monitor.
IBM Systems Director can be downloaded from http://www-03.ibm.com/systems/software/director/downloads/index.html
Management Central
Monitors
Know your Tools
[Figure: topology diagram showing IBM System i Navigator for Windows (monitoring, historical trending), IBM Systems Director Navigator for i (Performance tasks), the firewall and Internet, the Web application server, the Central System, and endpoint systems in a system group.]
Notes - Terminology
IBM System i Navigator provides a graphical user interface to IBM i. It comes in two complementary
options: Windows and Web.
System i Navigator for Windows is installed on the PC so the user can have a rich graphical
interface to interact with their systems.
System i Navigator tasks on the Web perform a subset of Navigator tasks through an Internet Web
browser. These are URL-addressable links only.
IBM Systems Director Navigator for i is the browser-based console that has much of the function of
System i Navigator; monitors, however, are not available through this interface. The Performance
tasks are available only through this Web console.
IBM i integrated Web application server - The integrated Web application server (5761-SS1)
integrates an OSGi-based Web-servlet container with the IBM i operating system (5.4 and later).
Central System helps to manage your other systems (called endpoints) and stores most management
information (inventory, command, package, product, and user definitions, and so on).
Endpoints are the systems that you manage through the Central System; your PC does not need to be in
direct contact with them.
Source System is the system from which objects, files and information are sent using Management
Central's send tasks.
Model System is a system that has all, and only, the desired fixes installed, or has all system values set
properly for the targets.
Target System is where objects, files and information are sent by Management Central's send tasks.
Target Systems (and more generally, endpoint systems) are often grouped into System Groups.
Real-time performance graphs with System Monitors
System Monitors

▪ Monitor system performance
– Predefined metrics
• CPU Utilization, Disk Arm Utilization...
▪ Enabled with...
– Event logging
– Trigger/Reset notification
– CL Command Automation
– System Actions
• Job actions (hold, release...)
System Monitors
Select 'New Monitor...' and specify General properties
Notes: Select 'New Monitor...' & specify General properties
You need to name your monitor; specific names work well.
You can select one or multiple metrics in a single monitor.
Question: How many metrics do you put in a single monitor?
Answer: It depends.
1. Do you like to see all monitors on a single screen?
2. Do you prefer to have more granular monitor notifications?
There is no limit on the number of endpoint systems that a monitor can be started on. However, usability
suffers when too many systems are shown on one graph, because the graph becomes difficult to read.
General - The General page for New Monitor or Monitor Properties allows you to view and change
general information about the monitor. The general information includes the name of the monitor and a
brief description of the monitor.
Name - The unique name of the monitor. You can change the name, using up to 64 characters for the
new name. Do not use any of the following characters: asterisk (*), backslash (\), colon (:), greater
than (>), less than (<), question mark (?), quotation mark ("), slash (/), or vertical bar (|).
Description - A brief description to help you identify this monitor in a list of monitors. You can change the
description, using up to 64 characters for the new description.
System Monitors
Select the 'Metrics to monitor'; when done, press OK to create the monitor
Callouts: What (metrics), How often (collection interval), Vertical axis (maximum graphing value), Horizontal axis (display time)
Notes: Define A Monitor
The Metrics page for New Monitor or Monitor Properties allows you to select the
metrics that you want to monitor. You can view and change information about
the collection interval, the maximum graphing value, and the display time for each metric. You can also
click Threshold 1 or Threshold 2 to specify information about the thresholds for each metric.
Metrics are the pieces of information to collect. Possible values are:
CPU Utilization (Average)
CPU Utilization (Interactive Jobs)
CPU Utilization (Interactive Feature)
CPU Utilization Basic (Average)
CPU Utilization (Secondary Workloads)
CPU Utilization (Database Capability)
Interactive Response Time (Average)
Interactive Response Time (Maximum)
Transaction Rate (Average)
Transaction Rate (Interactive)
Batch Logical Database I/O
Disk Arm Utilization (Average)
Disk Arm Utilization (Maximum)
Disk Storage (Average)
Disk Storage (Maximum)
Disk IOP Utilization (Average)
Disk IOP Utilization (Maximum)
Communications IOP Utilization (Average)
Communications IOP Utilization (Maximum)
Communications Line Utilization (Average)
Communications Line Utilization (Maximum)
LAN Utilization (Average)
LAN Utilization (Maximum)
Machine Pool Faults
User Pool Faults (Average)
User Pool Faults (Maximum)
Collection Interval is the time to wait between each collection of data.
Maximum graphing value is the highest value to be displayed on the vertical axis of the graph.
Display time is how many minutes you want displayed on the horizontal axis of the graph.
System Monitors
Setting Thresholds
What
Condition
Automation
Notes: Setting Thresholds
Threshold - A threshold is a setting for a metric that is being collected by a monitor.
This setting allows you to specify actions to be taken when a specified value (called
the trigger value) is reached. You can also specify actions to be taken when a
second value (called the reset value) is reached. For example, you can specify a CL command that
stops any new jobs from starting when CPU utilization reaches 90% and another command that
allows new jobs to start when CPU utilization falls to less than 70%. You can also choose to add an
event to the Event Log whenever the trigger value or the reset value is reached. You can set up to
two thresholds for each metric that the monitor is collecting. The trigger condition is the one considered
bad (usually a high value, but it can be a low one); the reset condition is the one considered good (the
opposite of the trigger).
The two Thresholds tabs on the Metrics page provide a place for you to specify whether or not you
want to monitor this metric for a particular threshold. You must check the Enable threshold box
before you can specify the conditions to trigger and to reset this threshold. You can also specify the
action to be taken when the threshold is triggered and when it is reset. The action that you specify
must be a CL command. When you click OK, this metric will be actively monitored for this threshold
if the monitor is currently running. If the monitor is not currently running, this metric will be monitored
for this threshold the next time the monitor is started.
You can specify the following conditions and commands for Threshold trigger and for Threshold reset:
Value - Specifies the condition that must be met to trigger or to reset this threshold.
Duration - Specifies the number of consecutive collection intervals during which the value must meet the
criterion to cause a threshold trigger or reset event. Specifying a higher number of collection
intervals for Duration helps to avoid unnecessary threshold activity due to frequent spiking of values.
IBM i command - Specifies the command to be run on the IBM i endpoint system when the threshold is
triggered or reset on that endpoint. This command can be as simple as sending a message, or as
complex as submitting or calling a program.
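The interplay of Value and Duration can be illustrated with a small Python sketch. It is not how Management Central is implemented; the class name, the field names, and the comparison operators (trigger on "greater than or equal", reset on "less than") are assumptions chosen to match the 90%/70% example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    trigger_value: float   # assumed rule: trigger when value >= trigger_value
    trigger_duration: int  # consecutive intervals required to trigger
    reset_value: float     # assumed rule: reset when value < reset_value
    reset_duration: int    # consecutive intervals required to reset
    triggered: bool = False
    streak: int = 0        # consecutive intervals meeting the current criterion

    def observe(self, value: float) -> Optional[str]:
        """Feed one collection-interval sample; return "trigger", "reset" or None."""
        if self.triggered:
            met, needed = value < self.reset_value, self.reset_duration
        else:
            met, needed = value >= self.trigger_value, self.trigger_duration
        # A sample that misses the criterion breaks the run of consecutive intervals.
        self.streak = self.streak + 1 if met else 0
        if self.streak >= needed:
            self.triggered = not self.triggered
            self.streak = 0
            return "trigger" if self.triggered else "reset"
        return None


cpu = Threshold(trigger_value=90, trigger_duration=2, reset_value=70, reset_duration=1)
events = [cpu.observe(v) for v in [95, 50, 91, 92, 80, 60]]
print(events)  # [None, None, None, 'trigger', None, 'reset']
```

With a trigger duration of 2, the single spike to 95 is ignored, which is the "avoid unnecessary threshold activity" effect described for Duration. The value 80 neither triggers nor resets because it lies between the two values.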
Notes: Threshold variables
System Monitor Replacement Variables:
Parameter   Passed Data
&DATE       The Date the monitor triggered or reset
&INTVL      Collection interval: How often the monitor collected data (in seconds)
&MON        The Monitor name
&RDUR       Reset duration: How many intervals does the reset value have to be met before the monitor resets.
&RVAL       Reset value: The value that the metric was monitoring for when the monitor reset
&SEQ        Sequence number: A unique, incrementing number assigned to each collection interval. Can be used in a
            program to compare when triggers happened and in what sequence.
&TDUR       Trigger duration: How many intervals does the trigger value have to be met before the monitor triggers
&TIME       The time the monitor triggered or reset
&TVAL       Trigger value: The value that the metric was monitoring for when the monitor triggered
&VAL        Current value: The actual value of the metric when the monitor triggered (2)
Notes on system monitor replacement parameters:
- The dollar sign ($) prefix that was available in previous releases is still supported, for example, $TIME.
- The wording is a bit different on some metrics and values:
- Batch I/O is shown as I/O operations rather than transactions per second.
- Transaction rates are shown as transactions rather than transactions per second.
- Interactive response times (both average and maximum) are shown in milliseconds rather than seconds.
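To make the substitution concrete, here is a minimal Python sketch of what replacing the variables in a threshold command looks like. Management Central performs the real substitution itself; the function name, the regular expression, and the sample values are assumptions for illustration only. The sketch accepts both the & prefix and the older $ prefix.

```python
import re
from typing import Dict


def substitute(command: str, values: Dict[str, str]) -> str:
    """Replace &NAME or $NAME tokens with their values; unknown names are left unchanged."""
    def replace(match):
        return values.get(match.group(2), match.group(0))
    return re.sub(r"([&$])([A-Z]+)", replace, command)


# Hypothetical threshold command and sample values.
command = "SNDMSG MSG('&MON: CPU &VAL at $TIME') TOUSR(*SYSOPR)"
print(substitute(command, {"MON": "CPUMON", "VAL": "93", "TIME": "10:15:00"}))
# SNDMSG MSG('CPUMON: CPU 93 at 10:15:00') TOUSR(*SYSOPR)
```

A command that passes several variables to a program lets the program decide what to do, for example by comparing the trigger value with the current value.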
System Monitors
Select the monitor, then the start button
Notes: Start A Monitor
The Start Monitor dialog allows you to select the endpoint systems and system groups
on which you want to start the monitor (if no endpoint systems or system groups have
been previously selected for this monitor).
To add a system or group to the Selected systems and groups list, select it in the Available systems and
groups list, and then click Add. If a monitor is started and then a system is added, the monitor will be
started on that endpoint system automatically.
To remove a system or group from the Selected systems and groups list, select it in the list, and then
click Remove. If a monitor is started and then a system is removed, the monitor will be stopped on
that endpoint system automatically.
Available systems and groups - A list of endpoint systems and system groups from which you can select
a system or group. Click the plus sign (+) next to any group to see the systems that are included in
the group.
Monitor data is collected and stored on the endpoint system. Only a minimal amount of data is sent
back to the client; the more specific, detailed data is sent to the client only while the graphs are
open.
The PC is not required to stay connected once the monitor is started. The graph window can also be
minimized and the monitor will still be active.
The data shown in the graph is obtained from Collection Services. Collection Services houses the data
in management collection objects. This data is used by system monitors, job monitors and other
performance tools.
System Monitors
View the status
Overall status
Notes: View Status for a Monitor
The Status dialog allows you to see the current status of each endpoint system and system group
associated with a monitor. The status of each system and group is updated automatically as changes
occur. You can expand any group in the System or Group list to see the status of individual systems in
the group. By clicking the Restart button, you can restart the monitor on any systems on which it has
failed.
Overall status - The current status of the monitor. Possible values are:
x thresholds triggered - The number of thresholds that are currently active for the monitor (that is, x
represents the number of thresholds that have been triggered but have not been reset).
Started on x of y systems - The monitor is collecting data on x of y endpoint systems, where x represents
the number of systems where the monitor is running and y represents the number of systems where you
requested to start the monitor. The monitor is in the process of starting on the remaining systems.
Started - The monitor is collecting data on all endpoint systems where you requested to start the monitor.
Starting - The monitor is in the process of starting.
Stopping - The monitor is in the process of stopping.
Stopped - The monitor is no longer collecting data.
Failed - An attempt was made to start the monitor on the specified systems or groups, but the monitor was
not started on any systems. The failure may have occurred because the systems were not running when
you tried to start the monitor, or it may be because a connection was lost or a server was not started.
Click Restart to try starting this monitor again.
Failed on x of y systems - The monitor has failed to start or unexpectedly stopped working on x of y
systems (where x is the number of systems on which work has stopped and y is the total number of
systems on which the monitor is to be run). The monitor is starting or started on the remaining systems.
The failure may have occurred because the systems were not running when you tried to start the
monitor, or it may be because a connection was lost or a server was not started. Click Restart to try
starting this monitor again.
See the System or Group Status for a list of the endpoint systems and system groups associated with the
monitor and the current status of each system and group.
System Monitors
Viewing the thresholds
Threshold Indicators
Drill down with Actions
Changing Thresholds
Properties
Active Control
Menu
Notes: Changing Thresholds
You can change the thresholds in several ways:
Properties
Active graphical control
Menu items
You can change thresholds while a monitor is started; that is, you do not need to stop the monitor to change
the thresholds. The general properties of the monitor can be accessed via the toolbar or menu items to
make any changes or additions to the thresholds and values.
To change the trigger value or the reset value for a threshold using the active graphical control, place the
mouse pointer on the threshold indicator. When the ToolTip indicates Trigger, hold the mouse button down
and move up or down to change the trigger value. The changing values are shown in the ToolTip. When
the ToolTip indicates Reset, hold the mouse button down and move up or down to change the reset value.
Click any collection point on a Monitor graph line to see Details of the data associated with the collection
point.
By accessing the menu items, you will be taken directly to the thresholds page in Properties to make any
changes.
There are several visual indicators when a threshold is triggered:
Status in the toolbar area.
The upper left corner icon will change.
The line in the graph will change to red.
The metric graph title will change to red with an icon indicator.
Threshold Actions
IBM i
PC Client
Notes: Threshold Actions
The Actions page for Monitor Properties allows you to specify the actions that occur when a threshold is
triggered and when a threshold is reset. These actions apply to all metrics.
Log event - Adds an entry to the Event Log on the central system indicating that the threshold was
triggered. The entry also includes the date and time the event occurred, the endpoint system being
monitored, the metric being collected, and the monitor that logged the event.
Open Event Log - Displays the Event Log, which is a list of threshold trigger and reset events that have
occurred.
Open monitor - Displays a graphical view of the metrics as they are being collected.
Sound alarm - Sounds an alarm on the PC.
Threshold commands are run under the user profile of the monitor's owner.
When a threshold is triggered or reset, your PC client does not need to be up and running for the
Operating System command to run. However, if the PC client is not up, the corresponding PC action will not happen.
Viewing Events From Thresholds
Notes: Event Logs
The Event Log window displays a list of threshold trigger and reset events for all of your monitors. You can
specify on the Properties page for each monitor whether or not you want events added to the Event Log.
To see the Properties page for any monitor, select the monitor in the Monitors window and then select
Properties from the File menu. The list of events is arranged in order by date and time by default, but you
can change the order by clicking on any column heading. For example, to sort the list by the endpoint
system where the event occurred, click on System.
An icon to the left of each event indicates the type of event:
A red circle with a white x - Indicates that this event is a trigger event for which you did not specify a host
command to be run when the threshold was triggered.
A yellow circle with a red x - Indicates that this event is a trigger event for which you specified a host
command to be run when the threshold was triggered.
A white circle with a check mark - Indicates that this event is a threshold reset event.
You can customize the list of events to include only those that meet specific criteria by selecting Options
from the menu bar and then selecting Include. You can have more than one Event Log window open at
the same time, and you can work with other windows while the Event Log windows are open. Event Log
windows are updated continuously as events occur.
Customize Event Log Information
Notes: Options – Include
Options menu choices
Click Options on the menu bar to display the actions you can perform to change what information is
displayed. The possible choices are:
Include...
Displays the Include dialog, which allows you to specify which events you want to display in the list.
Columns...
Displays the Columns dialog, which allows you to specify which columns of information you want to display
in the list. You can also specify the order in which you want the columns to be displayed.
Event Properties - Trigger (reset similar)
Notes: Event Properties
The Trigger/Reset page for Event Properties allows you to view additional information about the event.
This information includes the value, the duration, the Operating System command and the sequence
number of the event.
Trigger/Reset value - The value specified in the monitor properties.
Actual value - The actual value that exceeded the trigger value and caused the trigger event.
Duration - The number of collection intervals specified for the duration in the monitor properties.
Operating System command - The command that was run on the endpoint system when the event
occurred.
The General page for Event Properties allows you to view general information about the event. The
general information includes the type of event (trigger or reset), the date and time the event occurred, the
endpoint system that the event occurred on, the metric that was being collected, and the name of the
monitor that logged the event.
Management Central Job Monitors

▪ Monitor the system by selecting ...
– Criteria to subset data
• Job criteria - subsystem, job name, job type or user
• Server name - web server, ftp server...
– Predefined metrics
• Job Count, Thread count, CPU Utilization, …
▪ Enabled with...
– Event logging
– Trigger/Reset notification
– CL Command Automation
– System Actions
• Job actions such as hold, release...
Job Monitors
Select 'New Monitor...' and specify General properties
Job Monitors
Select 'Metrics to monitor' and thresholds then press OK to create
Different Metrics
▪ Types
▪ Levels
Problem Condition
Automation
Notes: Define A Monitor
The Metrics page for New Monitor or Monitor Properties allows you to select
the metrics that you want to monitor. You can view and change information for
each metric. You can also click Threshold 1 or Threshold 2 to specify
information about the thresholds for each metric.
Metrics are the pieces of information to collect. Possible values are:
Job Count, Job Log Message and Job Status
Job Numeric Values:
CPU Percent Utilization
Logical I/O Rate
Disk I/O Rate
Communications I/O Rate
Transaction Rate
Transaction Time
Thread Count
Page Fault Rate
Summary Numeric Values (same as job level)
Notes: Threshold variables
Job Monitor Replacement Variables:
Parameter     Passed Data
&DATE         The Date the monitor triggered or reset
&INTVL        Collection interval: How often the monitor collected data (in seconds)
&MON          The monitor name
&TIME         The time the monitor triggered or reset
&ENDPOINT     The endpoint system name
&EVENTTYPE    Event type: The type of trigger or reset that is happening, defined as follows:
                Triggered Event    = 1
                Auto Reset Event   = 2
                Manual Reset Event = 3
&JOBNAME      The job name of the job causing the trigger/reset
&JOBNUMBER    The job number of the job causing the trigger/reset
&JOBSTATUS    The job status causing a trigger/reset
&JOBTYPE      The job type of the job causing the trigger/reset
&JOBUSER      The job user of the job causing the trigger/reset
Notes: Threshold variables
Job Monitor Replacement Variables (continued):
Parameter     Passed Data
&METRICTYPE   The category of the metric. For a Job monitor, the categories are as follows:
                Status Metric  = 10010
                Message Metric = 10020
                Numeric Metric = 10030
&METRIC       Metric that has triggered/reset, defined as follows:
                Job CPU Utilization     = 1010
                Job Logical I/O         = 1020
                Job Disk I/O            = 1030
                Job Comm I/O            = 1040
                Job Transaction Rate    = 1050
                Job Transaction Time    = 1060
                Job Thread Count        = 1070
                Job Page Faults         = 1080
                Summary CPU Utilization = 2010
                Summary Logical I/O     = 2020
                Summary Disk I/O        = 2030
                Summary Comm I/O        = 2040
                Summary Trans. Rate     = 2050
                Summary Trans. Time     = 2060
                Summary Thread Cnt      = 2070
                Summary Page Faults     = 2080
                Job Status              = 3010
                Job Log Messages        = 3020
                Summary Job Count       = 4010
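A program called from a threshold command receives these values as plain numbers, so it usually needs lookup tables to turn them back into names. The Python sketch below shows one way to do that. The codes are the ones listed in these notes; the dictionary and function names are assumptions for illustration.

```python
EVENT_TYPES = {1: "Triggered Event", 2: "Auto Reset Event", 3: "Manual Reset Event"}
METRIC_TYPES = {10010: "Status Metric", 10020: "Message Metric", 10030: "Numeric Metric"}
METRICS = {
    1010: "Job CPU Utilization", 1020: "Job Logical I/O", 1030: "Job Disk I/O",
    1040: "Job Comm I/O", 1050: "Job Transaction Rate", 1060: "Job Transaction Time",
    1070: "Job Thread Count", 1080: "Job Page Faults",
    2010: "Summary CPU Utilization", 2020: "Summary Logical I/O", 2030: "Summary Disk I/O",
    2040: "Summary Comm I/O", 2050: "Summary Trans. Rate", 2060: "Summary Trans. Time",
    2070: "Summary Thread Cnt", 2080: "Summary Page Faults",
    3010: "Job Status", 3020: "Job Log Messages", 4010: "Summary Job Count",
}


def describe_event(event_type: str, metric_type: str, metric: str) -> str:
    """Turn the substituted &EVENTTYPE, &METRICTYPE and &METRIC values into readable text."""
    event = EVENT_TYPES.get(int(event_type), "Unknown event")
    category = METRIC_TYPES.get(int(metric_type), "Unknown category")
    name = METRICS.get(int(metric), "Unknown metric")
    return f"{event}: {name} ({category})"


print(describe_event("1", "10030", "1010"))
# Triggered Event: Job CPU Utilization (Numeric Metric)
```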
Notes: Threshold variables
Job Monitor Replacement Variables (continued):
Parameter     Passed Data
&NUMCURRENT   Current numeric value
&NUMRESET     Threshold value to cause auto-reset of numeric metric
&NUMTRIGGER   Threshold value to cause trigger of a numeric metric
&OWNER        Monitor owner
&RDUR         Reset duration, in intervals, as set in the threshold
&RESETTYPE    Reset type and defined as follows:
                Manual reset    = 1
                Automatic reset = 2
&SBS          Subsystem of the job causing the trigger/reset
&SERVER       Server type of the job causing the trigger/reset.
              Note: Not supported for summary metrics.
&TDUR         Trigger duration, in intervals, as set in the threshold
&THRESHOLD    Threshold number causing the trigger
&MSGID        Message ID causing the trigger/reset
&MSGSEV       Message severity causing the trigger/reset
&MSGTYPE      Message type causing the trigger/reset
Notes: Threshold variables
Invalid Job Monitor Replacement Variable Combinations:
Job Monitor substitution parameter notes:
• If a monitor is triggered and the user performs a manual reset ("Reset with Commands" or "Reset Only"), there is no
substitution value for &NUMRESET or &RDUR. These have a value only if the reset is automated.
• To use &MSGID, &MSGSEV, or &MSGTYPE, you need to be monitoring the 'Job Log Message' metric; otherwise there is no
substitution value for these. Additionally, these are valid only in the trigger and reset commands of Job Log Messages
thresholds.
• &RESETTYPE has a valid substitution value only in a reset command. Constant values are used to determine
whether the reset type is manual or automated.
• &EVENTTYPE is valid for all substitutions and has constant values that are used to determine the type of monitor event
that occurred (automated trigger, automated reset, or manual reset). In a trigger command, the value is always the
trigger constant; in a reset command, it can be either the automated reset or the manual reset constant.
• &TDUR, &NUMTRIGGER, and &NUMCURRENT have valid substitution values only when a trigger occurs, in the trigger
command.
• &NUMTRIGGER, &NUMCURRENT, and &NUMRESET have valid substitution values only when a "numeric" metric is being
monitored, in the trigger and reset commands of numeric metric thresholds.
• &JOBSTATUS has a valid substitution value only when the Job Status metric is monitored, in the trigger and reset commands
of Job Status thresholds.
• Job Count metric not valid with: &JOBNAME, &JOBUSER, &JOBNUMBER, &JOBTYPE, &SBS, &SERVER, &MSGID,
&MSGSEV, &MSGTYPE, and &JOBSTATUS
• Job Log Message metric not valid with: &RDUR, &NUMRESET, &TDUR, &NUMTRIGGER, &NUMCURRENT, and
&JOBSTATUS
• Job Status metric not valid with: &NUMRESET, &NUMTRIGGER, &NUMCURRENT, &MSGID, &MSGSEV, and
&MSGTYPE
• The 'Job Numeric Values' metrics of CPU Percent Utilization, Logical I/O Rate, Disk I/O Rate, Communications I/O Rate,
Transaction Rate, Transaction Time, Thread Count, and Page Fault Rate are not valid with: &MSGID, &MSGSEV,
&MSGTYPE, and &JOBSTATUS
• The 'Summary Numeric Values' metrics of CPU Percent Utilization, Logical I/O Rate, Disk I/O Rate, Communications I/O
Rate, Transaction Rate, Transaction Time, Thread Count, and Page Fault Rate are not valid with: &JOBNAME,
&JOBUSER, &JOBNUMBER, &JOBTYPE, &SBS, &SERVER, &MSGID, &MSGSEV, &MSGTYPE, and &JOBSTATUS
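The per-metric restrictions in the last five bullets lend themselves to a quick check before a command is saved. The following Python sketch encodes only those five rules; it does not cover the trigger-versus-reset or manual-reset rules from the earlier bullets. The table and function names are assumptions for illustration, not part of Management Central.

```python
import re

MSG_VARS = {"MSGID", "MSGSEV", "MSGTYPE"}
JOB_VARS = {"JOBNAME", "JOBUSER", "JOBNUMBER", "JOBTYPE", "SBS", "SERVER"}
NUM_VARS = {"NUMRESET", "NUMTRIGGER", "NUMCURRENT"}

# Variables that have no valid substitution value for each kind of metric.
INVALID = {
    "Job Count": JOB_VARS | MSG_VARS | {"JOBSTATUS"},
    "Job Log Message": NUM_VARS | {"RDUR", "TDUR", "JOBSTATUS"},
    "Job Status": NUM_VARS | MSG_VARS,
    "Job Numeric Values": MSG_VARS | {"JOBSTATUS"},
    "Summary Numeric Values": JOB_VARS | MSG_VARS | {"JOBSTATUS"},
}


def invalid_variables(metric_kind: str, command: str) -> list:
    """Return the sorted variable names used in command that are not valid for metric_kind."""
    used = set(re.findall(r"&([A-Z]+)", command))
    return sorted(used & INVALID[metric_kind])


# Hypothetical threshold command on a Job Count metric.
print(invalid_variables("Job Count", "SNDMSG MSG('&JOBNAME on &ENDPOINT') TOUSR(*SYSOPR)"))
# ['JOBNAME']
```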
Job Monitors
Setting collection interval
How often?
5, 15 or 30 minutes, or 1 hour
Tuning options
Caution
Uses system resources!
Notes: Job Monitors and System Resources
Job monitors connect to a QZRCSRVS job for each job that is being monitored for the Job Log Messages
and the Job Status metrics.
QZRCSRVS jobs are not Management Central jobs. They are IBM i TCP Remote Command Server jobs
that the Management Central Java server uses for calling commands and APIs. In order to process the
API calls for the Job Log Messages and Job Status metrics in a timely fashion within the job monitor’s
interval length, the APIs are called for each job concurrently at interval time.
When both metrics are specified on the same monitor, two QZRCSRVS jobs are started for each job. For
example, if 5 jobs are monitored for Job Log Messages, 5 QZRCSRVS jobs are started to support the
monitor. If 5 jobs are monitored for Job Log Messages and Job Status, then 10 QZRCSRVS jobs are
started.
Thus, when you are using the Job Log Message and Job Status metrics, it is recommended that you limit
the number of monitored jobs to 40 or fewer on a small system.
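The arithmetic behind this recommendation is simple enough to sketch in Python. The function name is an assumption for illustration, and the sketch counts only the QZRCSRVS jobs attributable to the two per-job metrics described above.

```python
# The two metrics that need one QZRCSRVS job per monitored job.
PER_JOB_METRICS = {"Job Log Messages", "Job Status"}


def qzrcsrvs_jobs(monitored_jobs: int, metrics: set) -> int:
    """Estimate the QZRCSRVS jobs started for a job monitor."""
    return monitored_jobs * len(PER_JOB_METRICS & set(metrics))


print(qzrcsrvs_jobs(5, {"Job Log Messages"}))                # 5
print(qzrcsrvs_jobs(5, {"Job Log Messages", "Job Status"}))  # 10
```

At the suggested limit of 40 monitored jobs with both metrics selected, the same calculation gives 80 server jobs, which shows why the limit matters on a small system.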
Job Monitors
Actions
Server
PC Client
Job, Message and File Monitors
Data will collect without thresholds and actions
Job Monitors
Select the monitor, then the start button
Job Monitors
View the status
Overall status
Restart on failed systems
Job Monitors
Viewing the system and job information
Status
Detailed Information
Actions
Message Monitors
Message monitors can be used to view messages across systems that match monitor
criteria. You can work with the messages listed in the monitor (display details, reply, and
delete).

▪ Monitor the system by selecting ...
– A single message queue
– Message criteria
• ID, severity or type
– Predefined metrics
• Count...
▪ Enabled with...
– Event logging
– Trigger/Reset notification
– CL Command Automation
– System Actions
• Message reply, delete...
Message Monitors
Select 'New Monitor...' and specify General properties
Message Monitors
Select 'Metrics to monitor' and thresholds then press OK to create
Messages
Condition
Automation
Notes: Threshold variables
Message Monitor Replacement Variables:

Parameter       Passed Data
&DATE           Date
&MON            Monitor name
&INTVL          Collection interval length in seconds
&TIME           Time
&ENDPOINT       Endpoint system name
&EVENTTYPE      Event type, defined as follows:
                  Triggered Event = 1
                  Manual Reset Event = 3
&FRMJOBNUMBER   Job number for the job causing the triggering message
&FRMJOBNAME     Job name for the job causing the triggering message
&FRMUSER        User owning the job causing the triggering message
&FRMPROGRAM     Name of the program causing the triggering message
&MSGKEY         4-byte message key for the message causing the trigger (as a hex string)
&MSGID          Message ID causing the trigger
&MSGSEV         Message severity causing the trigger
&MSGTYPE        Message type causing the trigger
&MSGCOUNT       Current message count (that caused the trigger)
&OWNER          Monitor owner
&THRESHOLD      Threshold number causing the trigger
&TOLIB          Library of the message queue to which this message was sent (the library of
                the queue being monitored)
&TOMSGQ         Name of the message queue to which this message was sent (the queue being
                monitored)
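To illustrate how replacement variables behave in a trigger command, here is a minimal Python sketch of the substitution. Management Central performs the real substitution itself; the function name, the sample command, and the sample values below are hypothetical and exist only for this illustration.

```python
import re


def expand_variables(command, values):
    """Replace each &NAME token with its value; unknown tokens are left untouched."""
    def substitute(match):
        return str(values.get(match.group(1), match.group(0)))
    return re.sub(r"&([A-Z]+)", substitute, command)


# Hypothetical trigger command and sample values, for illustration only.
trigger_command = "SNDMSG MSG('&MON on &ENDPOINT: &MSGID sev &MSGSEV') TOUSR(*SYSOPR)"
sample_values = {
    "MON": "MSGMON1",
    "ENDPOINT": "SYSTEMA",
    "MSGID": "CPF0907",
    "MSGSEV": 80,
}

print(expand_variables(trigger_command, sample_values))
```

Tokens that are not in the sample dictionary, such as &OWNER, would pass through unchanged in this sketch.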
Message Monitors
Viewing the system and message information
Status
Actions
Monitoring with Watches
 Watches can be used to automate the actions taken when the
following occur:
– Message
– Licensed Internal Code Log (LIC Log)
– Problem Activity Log Entry (PAL entry)
 Start Watch (STRWCH) command or API (QSCSWCH)
 End Watch (ENDWCH) command or API (QSCEWCH)
 When the condition being watched occurs, your program gets control and
you can take any action you want
http://ibmsystemsmag.blogs.com/i_can/2010/01/i-can-automate-monitoring-with-watches.html
http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topic/rzahb/rzahb_eventfunction.htm
Watches
 Low Overhead
– Watches are an exit
– Almost no overhead until the watched condition occurs
– Your program gets control to determine what action to take
– For message watches
• Can watch for messages sent to any message queue, including
 QSYSOPR, History Log
• Can watch for messages sent to any job log
 Can specify generic job name
 Can specify *ALL to watch for a message to all job logs
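The job-name selection for message watches (a specific name, a generic name, or *ALL) can be pictured with a short Python sketch. It relies on the usual IBM i convention that a generic name is a prefix followed by an asterisk; the function name is hypothetical and this is not how the operating system implements the check.

```python
def job_name_matches(pattern, job_name):
    """Match a job name against a specific name, a generic name (prefix*), or *ALL."""
    if pattern == "*ALL":
        return True                               # watch every job log
    if pattern.endswith("*"):
        return job_name.startswith(pattern[:-1])  # generic name: compare the prefix
    return job_name == pattern                    # specific name: exact match


print(job_name_matches("QZRC*", "QZRCSRVS"))
print(job_name_matches("*ALL", "PAYROLL01"))
print(job_name_matches("QZDASOINIT", "QZRCSRVS"))
```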
File Monitors
You can use a file monitor to notify you whenever a selected file has changed,
has reached a specified size, or contains specified text strings.
 Monitor the system by selecting ...
– History Log (QHST) or specific files
– File criteria
  ● File location
– Predefined metrics
  ● Status, size and text
 Enabled with...
– Event logging
– Trigger/Reset notification
– CL Command Automation
File Monitors
You can select to monitor all system log files or selected files.
File Monitors
 Metrics
– Text
– File Status
– Size
Automatic Reset for Message and File Monitor Triggers
 Some metrics can be reset automatically after a trigger command runs
– Only available for Message and File monitors
Sharing of Monitors
 Monitors can be shared
 Owner
– User that created the monitor
 None
– No one else can see it
 Read-only
– Others can see it
  ● i.e. view properties & copy it
 Controlled
– Others can perform actions
  ● i.e. start and stop
Notes: Sharing
The owner has specified one of the following levels of sharing:
None
Other users cannot view this item.
Read-Only
Other users can view this item and use it (but cannot start or stop it). Other users can create a
new item based on this one and make changes to the new one as needed. However, other
users cannot delete or change this item in any way. If you are the owner of a monitor and
have specified actions (such as opening the event log window or sounding an alarm on the
PC), these actions occur for all users of the monitor whenever a threshold is triggered or reset.
The other users cannot change these actions.
Controlled
Other users can start and stop this item. Only the owner can change the level of sharing.
Other users can also view this item and use it to create a new item based on this one. If you
are the owner of a monitor and have specified actions (such as opening the event log window
or sounding an alarm on the PC), these actions occur for all users of the monitor whenever a
threshold is triggered or reset. The other users cannot change these actions. Actions are run
under the profile of the owner.
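The three sharing levels can be summarized as a small permission table. The Python sketch below is only a reading aid for these notes: the names are hypothetical, and it assumes the owner may perform every operation.

```python
from enum import Enum


class Sharing(Enum):
    NONE = "None"
    READ_ONLY = "Read-Only"
    CONTROLLED = "Controlled"


# Operations that users other than the owner may perform at each level.
ALLOWED_FOR_OTHERS = {
    Sharing.NONE: set(),
    Sharing.READ_ONLY: {"view", "copy"},
    Sharing.CONTROLLED: {"view", "copy", "start", "stop"},
}


def can_perform(sharing, operation, is_owner=False):
    """Return True when the user may perform the operation on the shared monitor."""
    if is_owner:
        return True  # assumption: the owner is never restricted
    return operation in ALLOWED_FOR_OTHERS[sharing]


print(can_perform(Sharing.READ_ONLY, "start"))
print(can_perform(Sharing.CONTROLLED, "start"))
print(can_perform(Sharing.CONTROLLED, "delete"))
```

Changing or deleting the monitor, and changing its sharing level, stay with the owner at every level.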
Monitoring with IBM
Systems Director
Cross-platform Management
Upward Integration modules supporting
Tivoli, Computer Associates, Hewlett Packard, Microsoft
[Diagram: Windows, Linux, AIX and IBM i administrators managing Microsoft Windows VMs,
VMware ESX VMs, other VMs and VIO servers]
Managed Systems
Common agent, Platform agent, No agent
IBM i Prerequisites for IBM Systems Director 6.3
 Agentless - SSH is required to discover IBM i
– SSH, 5733-SC1 Option 1
 Platform agent – IBM Universal Management Enablement
– For CIM capabilities (for example, IBM i Monitors)
– 5770-UME V1R3
– Operating System and UME fixes are required
http://www-912.ibm.com/s_dir/slkbase.NSF/DocNumber/618224537
 Common Agent (CAS)
– Download from IBM Systems Director web site
– Install manually on i or through the IBM Systems Director UI
http://publib.boulder.ibm.com/infocenter/director/pubs/topic/com.ibm.director.tbs.helps.doc/fqm0_tbs_ibm_i_endpoints.html
Collection Services Prerequisites for Monitoring
 Collection Services must be started
 Consider the collection interval for frequency of updates through
Systems Director
– Default is every 15 minutes
 Data must be in Collection Services DB2 files to be displayed on
IBM i Monitors in Systems Director
– CFGPFRCOL – CRTDBF parameter must be *YES
 “No Data Available”
– Symptom of possible Collection Services problem when viewing IBM i Monitors
Performance Summary
Monitor Your i
Common CIM Monitors
IBM i Monitors
IBM i Monitors
 34 Metrics for common monitoring scenarios
Create Your Own Monitor
Director Agent Monitors (requires CAS agent)
CIM Monitors
There are many metrics that you can monitor
with CIM, but it can be difficult to figure out
what's available.
Monitor Your IOA Cache Batteries
http://ibmsystemsmag.blogs.com/i_can/2012/04/monitoring-cache-battery-status-with-ibm-systems-director-63.html
IOA Cache Battery Monitor Metrics
Manage Processes … aka, Work with Active Jobs
(Requires CAS Agent)
Monitoring IBM i Processes
 Create a process monitor from the Manage Processes task
 Once a process monitor is created, additional metrics can be monitored
for that process using the Create monitor task
 Requires CAS agent
Monitor Thresholds
 The process of setting thresholds and enabling automation is the
same regardless of the metric type
 For any metric that you want to monitor, you can set thresholds
 Once thresholds are set, you then can create an event filter for the
event that occurs when the threshold setting is hit
 Automation plans complete the set up to enable automatic
notification that the threshold was hit
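The flow described on this slide (threshold, then event, then event filter, then automation plan) can be sketched conceptually in Python. None of the names below come from IBM Systems Director; the classes, functions, and threshold values are hypothetical and only show how the pieces relate to each other.

```python
from dataclasses import dataclass


@dataclass
class ThresholdEvent:
    metric: str
    value: float
    severity: str


def check_threshold(metric, value, warning, critical):
    """Return an event when the value reaches a threshold, otherwise None."""
    if value >= critical:
        return ThresholdEvent(metric, value, "critical")
    if value >= warning:
        return ThresholdEvent(metric, value, "warning")
    return None


def run_automation_plan(event, event_filter, action):
    """Run the action only for events that pass the event filter."""
    if event is not None and event_filter(event):
        return action(event)
    return None


def only_critical(event):
    return event.severity == "critical"


def notify_operator(event):
    return f"notify operator: {event.metric}={event.value} ({event.severity})"


event = check_threshold("CPU utilization", 97.0, warning=80.0, critical=95.0)
print(run_automation_plan(event, only_critical, notify_operator))
```

In this sketch a warning-level event is created but filtered out, which mirrors the point that an event produces no notification until a filter and an automation plan select it.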
Monitor Thresholds
Monitor Thresholds
Create Event Filters
Event Automation Plans
Event Actions
Graphs and Dashboard
Health Summary
Monitoring Messages with IBM Systems Director Events
Event Filters for QSYSOPR messages
Event Filters for QSYSOPR message
Event Filters for QSYSOPR message
Monitoring QSYSOPR .... Event Automation Plan
 After you establish what you want to monitor by creating the event
filter for IBM i messages ...
 You must also create an event automation plan for the action
Director is to take when those events occur
Even though these events are generated by their respective operating systems (or an optional layer
that is installed on the operating system), IBM Systems Director does not process these events unless
you create an event automation plan to do so.
Monitoring System Performance with IBM Systems
Director Navigator
IBM Systems Director Navigator for i
Performance Tasks
Health Indicators
Manually Monitor your System Performance
Health Indicators
Customize Health Indicator Thresholds
Continuing Education
A Few More Things....
Monitor Restart Options
Notes: Changing the Monitor Restart Options
Restarting Monitors
Monitor Restart was added to provide a way to automatically restart monitors when the Management
Central servers have been interrupted. These interruptions could be as simple as the MC Central
Server or MC Endpoint Server being restarted, or something more dramatic such as the temporary
loss of communications between the Central Server and an Endpoint Server or a system being IPLed.
If you select to have the system automatically attempt to restart your monitors, you may also specify
how long you want the central system to keep trying to restart the monitors and how often you want
the system to try during that time period.
For example, if you want the system to try to restart monitors every five minutes for a period of 3
hours, you select 'Automatically restart monitors on failed systems' and then specify 180 minutes for
'How long to attempt restart' and 5 minutes for 'How often to attempt restart'.
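As a rough illustration of that example, the Python sketch below lists when restart attempts would fall. It assumes the first attempt happens one interval after the failure and the last at the end of the window; the actual timing used by Management Central is not described in these notes, and the function name is hypothetical.

```python
def restart_attempt_offsets(how_long_minutes, how_often_minutes):
    """Minutes after the failure at which a restart attempt would be made.

    Assumption: attempts fall on every multiple of the interval, up to and
    including the end of the 'How long to attempt restart' window.
    """
    return list(range(how_often_minutes, how_long_minutes + 1, how_often_minutes))


offsets = restart_attempt_offsets(180, 5)
print(len(offsets), offsets[0], offsets[-1])
```

Under these assumptions, 180 minutes at 5-minute intervals gives 36 attempts.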
A change to this setting takes effect the next time the Management Central servers are restarted.
All Monitors support the restart option.
Default behavior is OFF.
Documentation on how to automatically restart Management Central Monitors:
https://www-304.ibm.com/support/entdocview.wss?uid=nas16e5b0871315547a68625729e004737ce
References
 IBM i Home Page
– http://www-03.ibm.com/systems/power/software/i/
 Information Center
– http://publib.boulder.ibm.com/iseries/
 IBM i Systems Management
– www.ibm.com/systems/i/solutions/management/
– www.ibm.com/systems/i/software/navigator/index.html
http://www.redbooks.ibm.com/redbooks/pdfs/sg246226.pdf
Special notices
This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in
other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM
offerings available in your area.
Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions
on the capabilities of non-IBM products should be addressed to the suppliers of those products.
IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give
you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY
10504-1785 USA.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives
only.
The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or
guarantees either expressed or implied.
All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the
results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations
and conditions.
IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions
worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment
type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal
without notice.
IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.
All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are
dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this
document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document
should verify the applicable data for their specific environment.
Revised September 26, 2006
Special notices (cont.)
IBM, the IBM logo, ibm.com, AIX, AIX (logo), AIX 5L, AIX 6 (logo), AS/400, BladeCenter, Blue Gene, ClusterProven, DB2, ESCON, i5/OS, i5/OS (logo), IBM Business
Partner (logo), IntelliStation, LoadLeveler, Lotus, Lotus Notes, Notes, Operating System/400, OS/400, PartnerLink, PartnerWorld, PowerPC, pSeries, Rational, RISC
System/6000, RS/6000, THINK, Tivoli, Tivoli (logo), Tivoli Management Environment, WebSphere, xSeries, z/OS, zSeries, Active Memory, Balanced Warehouse,
CacheFlow, Cool Blue, IBM Systems Director VMControl, pureScale, TurboCore, Chiphopper, Cloudscape, DB2 Universal Database, DS4000, DS6000, DS8000,
EnergyScale, Enterprise Workload Manager, General Parallel File System, GPFS, HACMP, HACMP/6000, HASM, IBM Systems Director Active Energy Manager,
iSeries, Micro-Partitioning, POWER, PowerExecutive, PowerVM, PowerVM (logo), PowerHA, Power Architecture, Power Everywhere, Power Family, POWER Hypervisor,
Power Systems, Power Systems (logo), Power Systems Software, Power Systems Software (logo), POWER2, POWER3, POWER4, POWER4+, POWER5, POWER5+,
POWER6, POWER6+, POWER7, System i, System p, System p5, System Storage, System z, TME 10, Workload Partitions Manager and X-Architecture are trademarks
or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are
marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at
the time this information was published. Such trademarks may also be registered or common law trademarks in other countries.
A full list of U.S. trademarks owned by IBM may be found at: http://www.ibm.com/legal/copytrade.shtml.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or
other countries.
AltiVec is a trademark of Freescale Semiconductor, Inc.
AMD Opteron is a trademark of Advanced Micro Devices, Inc.
InfiniBand, InfiniBand Trade Association and the InfiniBand design marks are trademarks and/or service marks of the InfiniBand Trade Association.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered
trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries or both.
Microsoft, Windows and the Windows logo are registered trademarks of Microsoft Corporation in the United States, other countries or both.
NetBench is a registered trademark of Ziff Davis Media in the United States, other countries or both.
SPECint, SPECfp, SPECjbb, SPECweb, SPECjAppServer, SPEC OMP, SPECviewperf, SPECapc, SPEChpc, SPECjvm, SPECmail, SPECimap and SPECsfs are
trademarks of the Standard Performance Evaluation Corp (SPEC).
The Power Architecture and Power.org wordmarks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org.
TPC-C and TPC-H are trademarks of the Transaction Processing Performance Council (TPC).
UNIX is a registered trademark of The Open Group in the United States, other countries or both.
Other company, product and service names may be trademarks or service marks of others.
Revised December 2, 2010